-
-
Notifications
You must be signed in to change notification settings - Fork 33.9k
gh-144127: Counter.most_common() Optimization for Small n + Test Cases
#144129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-144127: Counter.most_common() Optimization for Small n + Test Cases
#144129
Conversation
…ts for coverage; add NEWS file
417da82 to
4a9e2ec
Compare
|
Is this faster? What are the time comparisons like? |
| def test_most_common(self): | ||
| c = Counter(a=5, b=3, c=5, d=2, e=0, f=-1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a particular reason for checking most_common again here, separately from the basic tests above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, there's not really anything to test against in this PR.
If this does increase test coverage, we can accept it in a separate PR, since this one won't be backported.
| Optimize :meth:`collections.Counter.most_common` for ``n=1`` to use | ||
| :func:`max` instead of :func:`heapq.nlargest`, improving performance for | ||
| the common case of finding the single most common element. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please note the % increase in performance somewhere here.
| # Optimization: use max() instead of heapq for single element | ||
| # max() raises ValueError on empty sequence, so check first |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the second comment is particularly helpful:
| # Optimization: use max() instead of heapq for single element | |
| # max() raises ValueError on empty sequence, so check first | |
| # Optimization: use max() instead of heapq for single element |
| def test_most_common(self): | ||
| c = Counter(a=5, b=3, c=5, d=2, e=0, f=-1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, there's not really anything to test against in this PR.
If this does increase test coverage, we can accept it in a separate PR, since this one won't be backported.
|
Sorry, but this doesn't make any sense. The code to |
All good! Found this as well while digging through the function calls and was going to close the PR + Issue. Will continue to look for any potential QOL improvements in the Cheers |
Description
For the
most_common()function, add a specific path to handlen == 1which usesmax()and isO(n)vs.O(nlog(n)).Add tests to increase code coverage.
Counter.most_common()Performance Improvement for Smalln#144127