gh-146492: Improve SyntaxError for missing comma between import clauses#146493
gh-146492: Improve SyntaxError for missing comma between import clauses#146493brianschubert wants to merge 6 commits intopython:mainfrom
Conversation
| RAISE_SYNTAX_ERROR_STARTING_FROM(token, "Expected one or more names after 'import'") } | ||
| invalid_dotted_as_name: | ||
| | a=dotted_name b=['as' NAME] c=dotted_name { | ||
| RAISE_SYNTAX_ERROR_KNOWN_RANGE(b ? (expr_ty) b : a, c, "expected comma between import clauses") } |
There was a problem hiding this comment.
Unless I'm missing something, is this rule a bit too broad now?
compile("import a b()", "<repro>", "exec")
compile("import a b + c", "<repro>", "exec")
compile("from x import a b[c]", "<repro>", "exec")This prints:
expected comma between import clauses | line=1 col=8 end_line=1 end_col=11
expected comma between import clauses | line=1 col=8 end_line=1 end_col=11
expected comma between import clauses | line=1 col=15 end_line=1 end_col=18
In all three cases, adding a comma still wouldn't make the code valid, so this doesn't seem like a real missing comma between import clauses situation. Should this only trigger when the would-be second clause is actually a valid import target?
There was a problem hiding this comment.
Hmm, I see what you mean. I might argue that the missing comma is the "first" error in each of those cases, and that the rest is more-or-less an unrelated additional syntax error. Adding a comma would at least make those cases more correct and move the syntax error location start of the "real" problem. But I'm happy to exclude those if you prefer.
I pushed a commit that prevents this from triggering for those cases. It still triggers for this case:
>>> import a b as c()
File "<python-input-0>", line 1
import a b as c()
^^^
SyntaxError: expected comma between import clausesbut I'm not sure there's a good way to exclude this too without making it much more complicated.
| Traceback (most recent call last): | ||
| SyntaxError: Expected one or more names after 'import' | ||
|
|
||
| >>> import a b |
There was a problem hiding this comment.
Are we only pinning the message text for a few simple one-line cases here? For example, these both reproduce the new behavior:
compile("import a, b c", "<repro>", "exec")
compile("from x import (a\n b)", "<repro>", "exec")with:
expected comma between import clauses | line=1 col=11 end_line=1 end_col=14
expected comma between import clauses | line=1 col=16 end_line=2 end_col=3
But from what I can see, the new tests here don't assert the caret/range at all, and they also don't cover either a later-clause case or the parenthesized multiline form. Would it make sense to add at least one range-aware assertion and one nontrivial shape, so the main user-visible part of the change is actually locked down?
There was a problem hiding this comment.
Added some more tests to exercise some other forms and to assert the expected error range. Let me know if there are any other forms you think would be worth covering
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
|
While playing with this locally, I noticed some other related forms that produce the same confusing error message on main: >>> import a as b 1
File "<python-input-2>", line 1
import a as b 1
^
SyntaxError: cannot use name as import target
>>> import a as b as c
File "<python-input-3>", line 1
import a as b as c
^
SyntaxError: cannot use name as import target
>>> import a as b import foo
File "<python-input-2>", line 1
import a as b import foo
^
SyntaxError: cannot use name as import targetI think these could be fixed with this patch: diff --git a/Grammar/python.gram b/Grammar/python.gram
index 3acf8db9af3..2e1b1982fe2 100644
--- a/Grammar/python.gram
+++ b/Grammar/python.gram
@@ -1434,13 +1434,13 @@ invalid_import:
invalid_dotted_as_name:
| a=dotted_name b=['as' NAME] c=dotted_name ('as' | ',' | ')' | ';' | NEWLINE) {
RAISE_SYNTAX_ERROR_KNOWN_RANGE(b ? (expr_ty) b : a, c, "expected comma between import clauses") }
- | dotted_name 'as' !(NAME (',' | ')' | ';' | NEWLINE)) a=expression {
+ | dotted_name 'as' !(NAME (',' | ')' | ';' | NEWLINE)) a=expression (',' | ')' | ';' | NEWLINE) {
RAISE_SYNTAX_ERROR_KNOWN_LOCATION(a,
"cannot use %s as import target", _PyPegen_get_expr_name(a)) }
invalid_import_from_as_name:
| [NAME 'as'] a=NAME b=NAME ('as' | ',' | ')' | ';' | NEWLINE) {
RAISE_SYNTAX_ERROR_KNOWN_RANGE(a, b, "expected comma between import clauses") }
- | NAME 'as' !(NAME (',' | ')' | ';' | NEWLINE)) a=expression {
+ | NAME 'as' !(NAME (',' | ')' | ';' | NEWLINE)) a=expression (',' | ')' | ';' | NEWLINE) {
RAISE_SYNTAX_ERROR_KNOWN_LOCATION(a,
"cannot use %s as import target", _PyPegen_get_expr_name(a)) }Would that be better to include in this PR or a separate PR? (Reason I ask is that I'm not sure if this PR is something we'd generally backport, but a fix like the above is something we presumably would want to backport). |
In this PR is fine. We normally don't backport unless is pretty clear that's a bug and that is fundamentally broken. |
Demo: