-
Notifications
You must be signed in to change notification settings - Fork 816
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Provide range or bias direction for go-to-definition etc #1029
Comments
(Just noticed that the highlighting in the second screenshot shows the problem nicely. |
Also see microsoft/vscode-languageserver-node#593. |
As mentioned in microsoft/vscode-languageserver-node#593 we need to change the protocol if we need to let servers know about this. Adding a selection to the params might not help since VS Code doesn't have the information right now in the API and other editors might behave the same. So having a capability looks like a good approach to have more control over this. |
That's a VSCode issue, not a protocol issue, though right ? The protocol is explicit about which "character" is being requested (it's a row/column position, not a position bettween two columns). Why would we need to change the protocol for this? Surely this change can be made in VSCode only. Or am i missing something important? |
Makes sense to me. A capability seems less complex overall, with most of the complexity lying in the server rather than client or protocol.
I think it's the opposite:
Whatever the spec text currently says, both "line-cursor" and "block-cursor" editors exist, and it would be nice to specify something here to improve interop. |
Quite right, ugh.
Oh boy, i had not noticed that. So, how do you specify the start of the line ? On reflection, this model seems kinda wonky. For completion requests the character position is usually requested as the position immediately following the last char on the line, but this would indicate that the server should interpret is therefore as to between the "one-past-the-last-char-in-line" and the "two-past-the-last-char-in-line". Fortunately, either we (and most other clients) are not actually not sending the 'one-past-the-last-char-in-line" position or servers are already interpreting the completion position as if it were an actual position (not a virtual thing between 2 positions). I suspect the later.
I see your point. The "between characters" model works for an insert-mode-like situation, but doesn't exist in an equivalent of "normal-mode", or where the cursor has an explicit location. |
@puremourning Index 0 is the left of the first character |
But offset 0 means the first character in any |
@puremourning In this case the character does not matter because Or perhaps I'm misinterpreting something here? |
The
The
Whereas you're saying it should be interpreted as here:
Which i would largely agree is the only reasonable way this could work. But it makes the specification itself ambiguous. Agree with @sam-mccall that editors where the cursor position is an integral actual-character offset don't have any ambiguity here. in the above cases, position
|
Ah, that might be other experience leaking. I've only used line cursors, so think of it in terms of cursor position. The character and the cursor position to the left of it are interchangeable with this interpretation, giving a right bias on the cursor -> character transform. I agree, there doesn't seem to be any issue here then; Position's are specified as being character offsets, so it's up to the client to decide which character (as a Position) it wants to request for. It also makes sense for the client to decide this, to make the experience consistent across servers. |
We're drifting a little bit from the original issue :-) |
@sam-mccall I'm not sure what's off topic here: the spec says that a Position is the character offset, i.e., the character itself, and that the location for Goto Definition, etc., is given in terms of such a Position. Your original question about ambiguity is resolved with this understanding. I think the question of how to resolve a given kind of cursor to a character is a concern for the LSP client, not the protocol or the server. IMO the protocol should be view / edit model agnostic as much as possible. |
In microsoft#1029 it was pointed out that "between `character` and `character` + 1" might be off-by-one. Given line `abc`, and `Position { character: 0 }` - The offset of `a` is 0, and the offset of `b` is 1 (see previous section). - "between `character` and `character + 1`" may be read as "between a and b" - but this `Position` actually (in VSCode and other implementations) indicates start-of-line, i.e. before "a". This patch *doesn't* change the model (that a position is always between two characters), an microsoft#1029 suggests. However editors where the cursor lies on a character have to ignore that part, and this definition happens to specify the de-facto standard behavior for them too.
@sam-mccall you're almost certainly right. i do think this interpretation is material to this issue though. The reason i say that is that if the specification states that a position lies between characters, even for ad-hoc requests like go-to, etc. then it's material to the discussion how that position should be interpreted. I think this issue is about helping servers 'guess' what the user's intention is in the model between-characters model, whereas actually if the model is applied consistently, then it should be possible for all clients and servers to interop without new capabilities or protocol changes. I guess i'm exploring if that's possible. I realise that in the model where you specify the actual character position, there is no ambiguity. i'm thinking that it's the clients responsibility to remove the ambiguity from the user's perspective and the specification's responsibility to remove it from the client/server interpretation perspective. |
No, as previously quoted, the spec says: "A position is between two characters like an ‘insert’ cursor in a editor". This is in the section defining the Position structure specifically.
That's a perfectly self-consistent theory, but it doesn't match VSCode's selection behavior (see the OP). In practice this basically means the interpretation is wrong - the spec ~always follows VSCode's model, which is also perfectly self-consistent.
Concretely this might look like "Position now represents a character" and VSCode has to sometimes subtract 1 compared to what it currently sends (assuming no dramatic changes to its internal model, which I think are unlikely). Apart from the question of whether VSCode is willing to make such a potentially-breaking behavior change, this necessarily involves the editor guessing about the significance of the characters on the left vs right of the cursor. This is heuristic, and I've had nothing but trouble where editors have needed to do this (can dig up examples if needed). So my preference is to avoid trying to patch this up. |
Perhaps I'm missing something, but can't this be solved in a general way by having the protocol take a range rather than a position? A between-characters cursor can be modelled as a zero-length range, and an on-a-character cursor as a length-one range. Using ranges instead of positions also makes the protocol richer, e.g. see the existing discussion in #377 about hovers showing the type of a selected expression that's larger than just an identifier. |
OK, to be clear, I was basing my reasoning on the spec. I searched "position" in the specification, and the first result was as follows:
followed by an example explicitly using "character offsets" to refer to characters
The leading sentence of Position confirms this
... and then the second sentence throws everything established so far out completely by saying they measure the space between characters
...but wait, there's more: in the Range specification:
it says that an end position (represented with a Position) is exclusive. Consider the range [[0, 0], [0, 1]]; do we agree that this is the first character of the first line? If so, then it is exclusive under the character model (doesn't touch the index 1 character), but not exclusive under the between-character model (does touch the index 1 space between characters). So I hope you understand why I was quite confident about Positions being specified as character offsets, and not spaces between characters 🙂. Though I do feel better about my original comment now. It seems then that it is the spec itself that is inconsistent then, probably because it is natural to think about Positions in terms of between characters (to represent an I cursor location / inclusive selection range), and also because they are used to represent characters for things like Goto Definition. I feel the spec is still overwhelmingly in favour of them being character offsets, so I feel my conclusion above is still relevant, but there is definitely inconsistency there. However your point about needing a heuristic in the editor to decide which position to ask for is interesting; a simpler example might be
If using an I beam cursor, the editor would need to pick which way to bias it to a character offset. The user would expect a particular one to be resolved, but there is no way to communicate which to the server without an editor heuristic (e.g., favour word characters over punctuation). Otherwise, if the editor always biases the I beam cursor in a particular direction, it may ask for the |
Any update here? Seems like we could update the protocol to optionally include a range for selections? |
b56a40f seems to imply it's up to the client to decide what to send back if there's a selection. |
Right, but if the protocol only specifies a point, then the information channel is lossy: the client can decide to send the starting point of the selection or the ending point, but not both. For some use cases, it would be useful if the protocol allowed the client to send both endpoints of the selection. |
b56a40f is to clarify the current situation. That doesn't mean that we can't extend the protocol with the selection if we think that is necessary. |
This change adds an optional range field to textDocumentPositionParams so that clients can, for example, query a subexpression such as x.f within a larger expression g(x.f[i]). Also, clarify the text about cursor bias. Fixes microsoft#377 Fixes microsoft#1029
This change adds an optional range field to textDocumentPositionParams so that clients can, for example, query a subexpression such as x.f within a larger expression g(x.f[i]). Also, clarify the text about cursor bias. Fixes microsoft#377 Fixes microsoft#1029
Could one of the maintainers please review this PR: #2027? Thanks. |
Go-to-definition (and others) is triggered on a cursor position. In VSCode, the cursor lies between characters, so ambiguity: the thing on the left or the thing on the right?
A couple of factors make this worse:
To illustrate, these screenshots produce the same
textDocument/definition
request, but want different results.In vim. This should go to
Foo::operator->
.In VSCode. This should go to
object
.It'd be nice to have some more information so language servers can make better decisions. Some ideas:
I guess this is mostly an issue with languages that have overloaded operators. constructor and functor calls are common examples in C++.
The text was updated successfully, but these errors were encountered: