Skip to content

Comments

bpo-24263: Fix unittest to discover tests named with non-ascii characters#13149

Open
abadger wants to merge 3 commits intopython:mainfrom
abadger:rebase-unittest-unicode-pattern
Open

bpo-24263: Fix unittest to discover tests named with non-ascii characters#13149
abadger wants to merge 3 commits intopython:mainfrom
abadger:rebase-unittest-unicode-pattern

Conversation

@abadger
Copy link
Contributor

@abadger abadger commented May 7, 2019

This is a continuation of #1338 from @louisom which is itself based on a patch in the bpo issue tracker fromsih4sing5hong5.

In addition to rebasing louisom's work to current master, I've done some work to address comments on the earlier code:

it looks for isidentifier on $thing.py, but not on just $thing (which we need to do to handle packages, vs modules).

I've addressed that by adding a test using isidentifier() in the isdirectory section of the conditional: https://github.com/python/cpython/pull/13149/files#diff-c1c31e722f730423a5e6dcc3eb6eef67R471

and adding directory names which are not valid identifiers (but have __init__.py inside of them) to the test cases.

https://bugs.python.org/issue24263

path_lists = [['test2.py', 'test1.py', 'not_a_test.py', 'test_dir',
'test.foo', 'test-not-a-module.py', 'another_dir'],
path_lists = [ # Valid module and package names
['test2.py', 'test1.py', 'not_a_test.py', 'test_dir',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move the [ before the comment, otherwise it seems like the comment applies to the whole list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

# of a package directory
['測試4.py', '測試3.py', ]]
os.listdir = lambda path: path_lists.pop(0)
self.addCleanup(restore_listdir)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of defining all the restore_* functions, I would just do something like self.addCleanup(setattr, os, 'listdir', original_listdir).
Another option is using mock.patch, but maybe here it's overkill.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, for the remainder of these comments.... It appears that this test_find_tests_with_unicode() function is a copy of the pre-existing test, test_find_tests(), adjusted to use new data.

If you'd like me to make these changes, I can, but I think I'd pair it with merging the two tests.


def isfile(path):
# '另外的_資料夾' is not a package and so shouldn't be recursed into
return not path.endswith('資料夾') and not '另外的_資料夾' in path
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return not path.endswith('資料夾') and '另外的_資料夾' not in path


def isdir(path):
return path.endswith('資料夾')
os.path.isdir = isdir
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why here you defined an isdir function instead of using a lambda?
(FWIW I think for the sake of brevity lambdas are fine in this context.)

# execution order.
expected = [[name + ' module tests'] for name in
('測試1', '測試2', '測試_資料夾')]
expected.extend([[('測試_資料夾.%s' % name) + ' module tests'] for name in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected.extend([['測試_資料夾.%s module tests' % name] for name in

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@abadger
Copy link
Contributor Author

abadger commented May 12, 2019

@rbtcollins, it seems like the people presently interested in this PR are leaning towards removing the Python2 compatible code so I'm going to go ahead and remove that.

@abadger
Copy link
Contributor Author

abadger commented May 12, 2019

@ezio-melotti Updated. I believe with all of the changes that you and @vstinner asked for.

@abadger abadger force-pushed the rebase-unittest-unicode-pattern branch 2 times, most recently from 883876f to b4808be Compare May 12, 2019 17:03
@abadger
Copy link
Contributor Author

abadger commented May 12, 2019

I have made the requested changes; please review again

@bedevere-bot
Copy link

Thanks for making the requested changes!

@ezio-melotti: please review the changes made to this pull request.

@csabella csabella requested a review from ezio-melotti January 25, 2020 20:12
# what about .pyc (etc)
# we would need to avoid loading the same tests multiple times
# from '.py', *and* '.pyc'
VALID_MODULE_NAME = re.compile(r'[_a-z]\w*\.py$', re.IGNORECASE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to keep this global and update it to use r'[^\W\d]\w*\.py$', but as @serhiy-storchaka pointed out in #68451 (comment) this doesn't match all identifiers.

Perhaps a is_valid_module_name() global function could be added instead -- this will still provide a way to control what is valid, instead of hardcoding the check in _find_test_path().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added this as a private function in loader: _is_valid_module_filename(). The code that makes use of the function makes use of it to tell if a filename is a valid python module name. A different code path operates on directory names. I think I put the amount of logic into the function that you were thinking (the equivalent of what the regex replaced). Let me know if you were thinking of something with a larger scope.

@abadger abadger force-pushed the rebase-unittest-unicode-pattern branch 3 times, most recently from 3ad3161 to 9273219 Compare April 24, 2023 18:35
louisom and others added 3 commits April 24, 2023 11:35
* The unittest test discovery was incorrectly looking for tests in
  directories which had non-identifiers in the directory name.  This
  change makes the discovery only enter directories which can be valid
  python identifiers

* Add a changelog entry for the unittest test discovery fix

* Remove python2-straddling code

* Merge test_find_tests_with_unicode with test_find_tests
  as they are really just variants on the data provided.

* Simplify a few sections of the code as suggested by ezio melotti
@abadger abadger force-pushed the rebase-unittest-unicode-pattern branch from 9273219 to 688e2ef Compare April 24, 2023 18:36
@abadger abadger requested a review from ezio-melotti April 24, 2023 22:05
@abadger
Copy link
Contributor Author

abadger commented Apr 24, 2023

I have made the requested changes; please review again

@bedevere-bot
Copy link

Thanks for making the requested changes!

@ezio-melotti: please review the changes made to this pull request.

@ziima
Copy link
Contributor

ziima commented Jul 13, 2024

@ezio-melotti please review the changes.

I found the issue and PR as part of EuroPython 24 sprints.

@github-actions
Copy link

github-actions bot commented Feb 1, 2026

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Feb 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting change review stale Stale PR or inactive for long period of time.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants