Skip to content

Conversation

@udaypatel1
Copy link

@udaypatel1 udaypatel1 commented Jan 21, 2026

Description

For the most_common() function, add a specific path to handle n == 1 which uses max() and is O(n) vs. O(nlog(n)).

Add tests to increase code coverage.

@python-cla-bot
Copy link

python-cla-bot bot commented Jan 21, 2026

All commit authors signed the Contributor License Agreement.

CLA signed

@udaypatel1 udaypatel1 force-pushed the feature/optimize-most-common-func branch from 417da82 to 4a9e2ec Compare January 21, 2026 21:27
@udaypatel1 udaypatel1 marked this pull request as ready for review January 21, 2026 21:55
@davidlowryduda
Copy link
Contributor

Is this faster? What are the time comparisons like?

Comment on lines +2282 to +2283
def test_most_common(self):
c = Counter(a=5, b=3, c=5, d=2, e=0, f=-1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a particular reason for checking most_common again here, separately from the basic tests above?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, there's not really anything to test against in this PR.

If this does increase test coverage, we can accept it in a separate PR, since this one won't be backported.

Comment on lines +1 to +3
Optimize :meth:`collections.Counter.most_common` for ``n=1`` to use
:func:`max` instead of :func:`heapq.nlargest`, improving performance for
the common case of finding the single most common element.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note the % increase in performance somewhere here.

Comment on lines +638 to +639
# Optimization: use max() instead of heapq for single element
# max() raises ValueError on empty sequence, so check first
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the second comment is particularly helpful:

Suggested change
# Optimization: use max() instead of heapq for single element
# max() raises ValueError on empty sequence, so check first
# Optimization: use max() instead of heapq for single element

Comment on lines +2282 to +2283
def test_most_common(self):
c = Counter(a=5, b=3, c=5, d=2, e=0, f=-1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, there's not really anything to test against in this PR.

If this does increase test coverage, we can accept it in a separate PR, since this one won't be backported.

@rhettinger rhettinger self-assigned this Jan 22, 2026
@rhettinger
Copy link
Contributor

rhettinger commented Jan 22, 2026

Sorry, but this doesn't make any sense. The code to heapq.nlargest already has a fast path that calls max for the case `n == 1'. All this PR does is duplicate and inline that code.

@rhettinger rhettinger closed this Jan 22, 2026
@udaypatel1
Copy link
Author

Sorry, but this doesn't make any sense. The code to heapq.nlargest already has a fast path that calls max for the case `n == 1'. All this PR does is duplicate and inline that code.

@rhettinger -

All good! Found this as well while digging through the function calls and was going to close the PR + Issue.

Will continue to look for any potential QOL improvements in the collections folder.

Cheers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants