Tree-sitter rolling fixes, 1.125 (or 1.124.1) edition (#1172)
* [language-python] Fix some indentation corner cases…

…like indentation after a comment…

  if foo: # something

…and not indenting on one-liners…

  if foo: pass

…and not indenting after list slice syntax:

  x[1:2]


Also added specs for Python indentation, as it's getting somewhat complex.

* (whoops)

* Un-focus spec

* [language-css] Update `tree-sitter-css` to latest…

…and improve highlighting of selectors while typing at end of file.

* Bump `parserSource` for CSS grammar

* [language-css] Add scope tests `parentOfType` and `childOfType`…

…to support scenarios where you need to check parent/child relationship to an `ERROR` node, or to several different kinds of nodes at once.

* Fix grammar specs (don't trim the start of the line!)
savetheclocktower authored Jan 17, 2025
1 parent ebed0ba commit ec55082
Showing 12 changed files with 547 additions and 43 deletions.
2 changes: 1 addition & 1 deletion packages/language-css/grammars/modern-tree-sitter-css.cson
@@ -8,7 +8,7 @@ fileTypes: [
]

treeSitter:
-parserSource: 'github:tree-sitter/tree-sitter-css#9af0bdd9d225edee12f489cfa8284e248321959b'
+parserSource: 'github:tree-sitter/tree-sitter-css#5b24cbe3301a81b00c85d6c38db3bf2561e9ca41'
grammar: 'tree-sitter/tree-sitter-css.wasm'
highlightsQuery: 'tree-sitter/queries/highlights.scm'
foldsQuery: 'tree-sitter/queries/folds.scm'
51 changes: 36 additions & 15 deletions packages/language-css/grammars/tree-sitter/queries/highlights.scm
@@ -1,23 +1,46 @@


-; NOTE: `tree-sitter-css` recovers poorly from invalidity inside a block when
-; you're adding a new property-value pair above others in a list. When the user
-; is typing and the file is temporarily invalid, it will make incorrect guesses
-; about tokens that occur between the cursor and the end of the block.
+; WORKAROUNDS
+; ===========

; Mark `ERROR` nodes that occur inside blocks. We are much more cautious about
; inferences inside blocks because we're often wrong.
(
(ERROR) @_IGNORE_
(#is? test.childOfType "block")
(#set! isErrorInsideBlock true)
)

; (stylesheet (ERROR)) can't be queried directly, but it's important that we be
; able to detect it.
(
(ERROR) @_IGNORE_
(#is? test.childOfType "stylesheet")
(#set! isErrorAtTopLevel true)
)

; This selector captures an empty file with (e.g.) the word `div` typed.
(ERROR
(identifier) @entity.name.tag.css
(#set! capture.final)
(#is? test.descendantOfNodeWithData isErrorAtTopLevel)
)

; When there's a parsing error inside a `block` node, too many things get
; incorrectly interpreted as `tag_name`s. Ignore all `tag_name` nodes until the
; error is resolved.
;
-; The fix here is for `tree-sitter-css` to get better at recovering from its
-; parsing error, but parser authors don't currently have much control over
-; that. In the meantime, this query is a decent mitigation: it colors the
-; affected tokens like plain text instead of assuming (nearly always
-; incorrectly) them to be tag names.
+; This should be fixed upstream because it has undesirable effects on nested
+; selectors, but in the meantime this workaround is better than doing nothing.
;
-; Ideally, this is temporary, and we can remove it soon. Until then, it makes
-; syntax highlighting less obnoxious.
+; Keep an eye on https://github.com/tree-sitter/tree-sitter-css/issues/65 to
+; know when this workaround might no longer be necessary.

((tag_name) @_IGNORE_
-(#is? test.descendantOfType "ERROR")
+(#is? test.descendantOfNodeWithData isErrorInsideBlock)
(#set! capture.final))


(ERROR
(attribute_name) @_IGNORE_
(#set! capture.final))
@@ -26,8 +49,6 @@
(attribute_name) @invalid.illegal)
(#set! capture.final))

; WORKAROUND:
;
; In `::after`, the "after" has a node type of `tag_name`. Unclear whether this
; is a bug or intended behavior. We want to catch it here so that it doesn't
; get scoped like an HTML tag name in a selector.
@@ -38,7 +59,7 @@
(#set! adjust.startAt lastChild.previousSibling.startPosition)
(#set! adjust.endAt lastChild.endPosition))

-; Claim this range and block it from being scoped as a tag name.
+; Claim the `tag_name` range and block it from being scoped as a tag name.
(pseudo_element_selector
(tag_name) @_IGNORE_
(#is? test.last true)
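
The workaround queries in this file combine two ideas: tag an `ERROR` node based on what its parent is (`test.childOfType`), then let later patterns ask whether a node sits underneath a tagged node (`test.descendantOfNodeWithData`). The following is a toy model of that flow, not Pulsar's implementation. It assumes plain objects with `type`, `parent`, and a `data` bag, and it assumes that several types are passed as one space-separated string; `makeNode`, `childOfType`, and `descendantOfNodeWithData` are hypothetical names chosen for illustration.

```javascript
// Toy model only: plain objects stand in for syntax nodes.
// None of these helpers are Pulsar or web-tree-sitter APIs.
function makeNode(type, parent = null) {
  return { type, parent, data: {} };
}

// True when the node's immediate parent has one of the given types.
// Assumption: several types are passed as one space-separated string.
function childOfType(node, types) {
  const wanted = types.split(/\s+/).filter(Boolean);
  return node.parent !== null && wanted.includes(node.parent.type);
}

// True when any ancestor of the node was tagged with a truthy `key`.
function descendantOfNodeWithData(node, key) {
  for (let current = node.parent; current; current = current.parent) {
    if (current.data[key]) return true;
  }
  return false;
}

// An otherwise empty file containing only `div`: (stylesheet (ERROR (identifier)))
const stylesheet = makeNode('stylesheet');
const error = makeNode('ERROR', stylesheet);
const identifier = makeNode('identifier', error);

// First pattern: mark the top-level ERROR node.
if (childOfType(error, 'stylesheet')) {
  error.data.isErrorAtTopLevel = true;
}

// Later pattern: the identifier qualifies because an ancestor carries the mark.
console.log(descendantOfNodeWithData(identifier, 'isErrorAtTopLevel'));
```

Splitting the check into a tagging pass and a lookup pass is what lets the `tag_name` pattern react only to errors inside a `block`, rather than to every `ERROR` ancestor.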
Binary file modified packages/language-css/grammars/tree-sitter/tree-sitter-css.wasm
Binary file not shown.
14 changes: 14 additions & 0 deletions packages/language-css/spec/.eslintrc.js
@@ -0,0 +1,14 @@
module.exports = {
  env: { jasmine: true },
  globals: {
    waitsForPromise: true,
    runGrammarTests: true,
    runFoldsTests: true
  },
  rules: {
    "node/no-unpublished-require": "off",
    "node/no-extraneous-require": "off",
    "no-unused-vars": "off",
    "no-empty": "off"
  }
};
5 changes: 5 additions & 0 deletions packages/language-css/spec/fixtures/ends-in-tag-name.css
@@ -0,0 +1,5 @@

/* Ensure that a file that ends in a bare tag name scopes the tag name properly. */

div
/* <- entity.name.tag.css */
10 changes: 10 additions & 0 deletions packages/language-css/spec/fixtures/sample.css
@@ -0,0 +1,10 @@
div {
/* <- entity.name.tag.css */
/* ^ punctuation.section.property-list.begin.bracket.curly.css */
}

/* Next test verifies that selectors are properly inferred at the end of a file before the `{` is typed. */

div.foo
/* <- entity.name.tag.css */
/* ^ entity.other.attribute-name.class.css */
21 changes: 21 additions & 0 deletions packages/language-css/spec/tree-sitter-grammar-spec.js
@@ -0,0 +1,21 @@
const path = require('path');

describe('WASM Tree-sitter CSS grammar', () => {

  beforeEach(async () => {
    await atom.packages.activatePackage('language-css');
  });

  it('passes grammar tests', async () => {
    await runGrammarTests(
      path.join(__dirname, 'fixtures', 'sample.css'),
      /\/\*/,
      /\*\//
    );
    await runGrammarTests(
      path.join(__dirname, 'fixtures', 'ends-in-tag-name.css'),
      /\/\*/,
      /\*\//
    );
  });
});
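
The fixtures above express expectations as comments, and the two regexes passed to `runGrammarTests` describe how a comment opens and closes. As a rough illustration of why column positions matter (and, presumably, why the commit message warns against trimming the start of the line), here is a simplified parser for a single assertion comment. It is a hypothetical helper, not the real `runGrammarTests` code, and it assumes the common convention that `<-` points at the column where the comment begins while `^` points at its own column.

```javascript
// Hypothetical helper, not Pulsar's actual `runGrammarTests` implementation.
// Assumed convention: `<-` refers to the column where the comment starts,
// `^` refers to its own column; both describe the preceding line of code.
function parseAssertion(line, commentStart = /\/\*/, commentEnd = /\*\//) {
  const open = commentStart.exec(line);
  if (!open) return null;
  const close = commentEnd.exec(line);
  const bodyStart = open.index + open[0].length;
  const body = line.slice(bodyStart, close ? close.index : line.length);
  const marker = /(<-|\^)\s*(\S+)/.exec(body);
  if (!marker) return null;
  // Columns are measured against the untrimmed line.
  const column = marker[1] === '<-' ? open.index : bodyStart + marker.index;
  return { column, scope: marker[2] };
}

console.log(parseAssertion('/* <- entity.name.tag.css */'));
console.log(parseAssertion('  /* <- entity.name.tag.css */'));
```

In this sketch the first call reports column 0 and the second reports column 2, so trimming leading whitespace before parsing would silently move the assertion to a different token.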
176 changes: 172 additions & 4 deletions packages/language-python/grammars/ts/indents.scm
@@ -1,4 +1,8 @@
-; Excluding dictionary key/value separators…
+
+; IGNORE NON-BLOCK-STARTING COLONS
+; ================================
+
+; First, exclude dictionary key/value separators…
(dictionary
(pair ":" @_IGNORE_
(#set! capture.final)))
@@ -7,22 +11,186 @@
((lambda ":" @_IGNORE_)
(#set! capture.final))

-; …and type annotations on function parameters/class members…
+; …list subscript syntax…
+(slice ":" @_IGNORE_
+(#set! capture.final))
+
+; …and type annotations on function parameters/class members.
(":" @_IGNORE_ . (type) (#set! capture.final))

-; …all other colons we encounter hint at upcoming indents.
+; IGNORE BLOCK-STARTING COLONS BEFORE ONE-LINERS
+; ==============================================

; Now that we've done that, all block-starting colons that have their
; consequence block start and end on the same line should be filtered out.
;
; We also test for `lastTextOnRow` to ensure we're not followed by an _empty_
; consequence block, which is surprisingly common. Probably a bug, but it's got
; to be worked around in the meantime.
;
; We check for adjacency between the `:` and the `block` because otherwise we
; might incorrectly match cases like
;
; if 2 > 1: # some comment
;
; since those comments can also be followed by an empty `block` node on the same
; line.
;
(if_statement
":" @_IGNORE_
.
consequence: (block)
(#is-not? test.lastTextOnRow)
(#is? test.startsOnSameRowAs "nextSibling.endPosition")
(#set! capture.final)
)

(elif_clause
":" @_IGNORE_
.
consequence: (block)
(#is-not? test.lastTextOnRow)
(#is? test.startsOnSameRowAs "nextSibling.endPosition")
(#set! capture.final)
)

(else_clause
":" @_IGNORE_
.
body: (block)
(#is-not? test.lastTextOnRow)
(#is? test.startsOnSameRowAs "nextSibling.endPosition")
(#set! capture.final)
)

(match_statement
":" @_IGNORE_
.
body: (block)
(#is-not? test.lastTextOnRow)
(#is? test.startsOnSameRowAs "nextSibling.endPosition")
(#set! capture.final)
)

(case_clause
":" @_IGNORE_
.
consequence: (block)
(#is-not? test.lastTextOnRow)
(#is? test.startsOnSameRowAs "nextSibling.endPosition")
(#set! capture.final)
)

(while_statement
":" @_IGNORE_
.
body: (block)
(#is-not? test.lastTextOnRow)
(#is? test.startsOnSameRowAs "nextSibling.endPosition")
(#set! capture.final)
)

(for_statement
":" @_IGNORE_
.
body: (block)
(#is-not? test.lastTextOnRow)
(#is? test.startsOnSameRowAs "nextSibling.endPosition")
(#set! capture.final)
)

(try_statement
":" @_IGNORE_
.
body: (block)
(#is-not? test.lastTextOnRow)
(#is? test.startsOnSameRowAs "nextSibling.endPosition")
(#set! capture.final)
)

(except_clause
":" @_IGNORE_
.
(block)
(#is-not? test.lastTextOnRow)
(#is? test.startsOnSameRowAs "nextSibling.endPosition")
(#set! capture.final)
)

; Special case for try/except statements, since they don't seem to be valid
; until they're fully intact. If we don't do this, `except` doesn't dedent.
;
; This is like the `elif`/`else` problem below, but it's trickier because an
; identifier could plausibly begin with the string `except` and we don't want
; to make an across-the-board assumption.
(ERROR
"try"
":" @indent
(block
(expression_statement
(identifier) @dedent
(#match? @dedent "except")
)
)
)

(function_definition
":" @_IGNORE_
.
body: (block)
(#is-not? test.lastTextOnRow)
(#is? test.startsOnSameRowAs "nextSibling.endPosition")
(#set! capture.final)
)

(class_definition
":" @_IGNORE_
.
body: (block)
(#is-not? test.lastTextOnRow)
(#is? test.startsOnSameRowAs "nextSibling.endPosition")
(#set! capture.final)
)
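
The one-liner patterns above all apply the same three conditions: the `block` must directly follow the colon, the colon must not be the last text on its row, and the block must end on the colon's row. A toy model of that decision, using hypothetical names and plain objects rather than real syntax nodes, might look like this. It assumes zero-based rows and columns and is only a sketch of the logic the comments describe.

```javascript
// Toy model with hypothetical names; rows and columns are zero-based.
// `colon.nextSibling` stands in for whatever node follows the colon.
function isOneLinerColon(colon, lineText) {
  const next = colon.nextSibling;
  // Mirrors the `.` anchor: the block must come directly after the colon,
  // so a trailing comment between them disqualifies the match.
  if (!next || next.type !== 'block') return false;
  // Mirrors `test.lastTextOnRow`: is anything but whitespace after the colon?
  const lastTextOnRow = lineText.slice(colon.column + 1).trim() === '';
  // Mirrors `test.startsOnSameRowAs "nextSibling.endPosition"`.
  const blockEndsOnSameRow = next.endPosition.row === colon.row;
  return !lastTextOnRow && blockEndsOnSameRow;
}

// `if foo: pass` is a one-liner, so its colon should not trigger an indent.
console.log(isOneLinerColon(
  { row: 0, column: 6, nextSibling: { type: 'block', endPosition: { row: 0, column: 12 } } },
  'if foo: pass'
));

// `if foo: # something` has a comment after the colon, so indentation proceeds.
console.log(isOneLinerColon(
  { row: 0, column: 6, nextSibling: { type: 'comment', endPosition: { row: 0, column: 19 } } },
  'if foo: # something'
));
```

A colon that fails this check falls through to the generic `":" @indent` rule.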


; REMAINING COLONS
; ================

; Now that we've done this work, all other colons we encounter hint at upcoming
; indents.
;
; TODO: Based on the stuff we're doing above, it's arguable that the
; exclude-all-counterexamples approach is no longer useful and we should
; instead be opting into indentation. Revisit this!
":" @indent

; MISCELLANEOUS
; =============

; When typing out "else" after an "if" statement, tree-sitter-python won't
-; acknowlege it as an `else` statement until it's indented properly, which is
+; acknowledge it as an `else` statement until it's indented properly, which is
; quite the dilemma for us. Before that happens, it's an identifier named
; "else". This has a chance of spuriously dedenting if you're typing out a
; variable called `elsewhere` or something, but I'm OK with that.
;
; This also means that we _should not_ mark an actual `else` keyword with
; `@dedent`, because if it's recognized as such, that's a sign that it's
; already indented correctly and we shouldn't touch it.
;
; All this also applies to `elif`.
((identifier) @dedent (#match? @dedent "^(elif|else)$"))

; Likewise, typing `case` at the beginning of a line within a match block — in
; cases where it's interpreted as an identifier — strongly suggests that we
; should dedent one level so it's properly recognized as a new `case` keyword.
(
(identifier) @dedent
(#equals? @dedent "case")
(#is? test.descendantOfType "case_clause")
)


; All instances of brackets/braces should be indented if they span multiple
; lines.
["(" "[" "{"] @indent
[")" "]" "}"] @dedent
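
The comments in this file mention two typing hazards: `elsewhere` can briefly trigger the `else` dedent, and an identifier that begins with `except` could trigger the `except` dedent. The sketch below replays both cases one keystroke at a time. It uses plain JavaScript regexes as stand-ins for the `#match?` predicates, on the assumption that the query engine's regex flavor behaves the same for these simple patterns; `prefixesThatMatch` is a hypothetical helper.

```javascript
// Stand-ins for the two `#match?` patterns used in the queries.
const elseLike = /^(elif|else)$/; // anchored: whole identifier must match
const exceptLike = /except/;      // unanchored: any identifier containing it

// Simulate typing a word one character at a time and collect every prefix
// at which the pattern would match the identifier typed so far.
function prefixesThatMatch(word, pattern) {
  const hits = [];
  for (let i = 1; i <= word.length; i++) {
    const prefix = word.slice(0, i);
    if (pattern.test(prefix)) hits.push(prefix);
  }
  return hits;
}

console.log(prefixesThatMatch('elsewhere', elseLike));
console.log(prefixesThatMatch('exception', exceptLike));
```

The anchored pattern matches only at the moment the typed text is exactly `else`, which is the brief spurious dedent the comment accepts. The unanchored `except` pattern keeps matching for every longer prefix, which is consistent with the query restricting it to an `ERROR` node that contains `try` instead of applying it everywhere.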
14 changes: 14 additions & 0 deletions packages/language-python/spec/.eslintrc.js
@@ -0,0 +1,14 @@
module.exports = {
  env: { jasmine: true },
  globals: {
    waitsForPromise: true,
    runGrammarTests: true,
    runFoldsTests: true
  },
  rules: {
    "node/no-unpublished-require": "off",
    "node/no-extraneous-require": "off",
    "no-unused-vars": "off",
    "no-empty": "off"
  }
};