Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(python) fix or conflicts with string highlighting #3897

Merged

Conversation

MohamedAli00949
Copy link
Contributor

I have fixed the error of highlighting the reserved words such as 'or' in the below code.

print(test()or'string')

That should be in the below image.
image
Not as this highlight.
image

That happens because of the string prefixes (f, r, u, and b) or (F, R, U, and B).

This issue has been described at #3798

Changes

I have found that happens from string variants begin RegExes with the prefix (\W|^).

Checklist

  • Update the string variants begin RegExes with the prefix (\W|^).

@github-actions
Copy link

Build Size Report

Changes to minified artifacts in /build, after gzip compression.

4 files changed

Total change +17 B

View Changes
file base pr diff
es/core.min.js 8.14 KB 8.14 KB -1 B
es/highlight.min.js 8.14 KB 8.14 KB -1 B
es/languages/python.min.js 1.53 KB 1.54 KB +9 B
languages/python.min.js 1.54 KB 1.55 KB +10 B

@@ -176,7 +176,7 @@ export default function(hljs) {
contains: [ hljs.BACKSLASH_ESCAPE ],
variants: [
{
begin: /([uU]|[bB]|[rR]|[bB][rR]|[rR][bB])?'''/,
begin: /(\W|^)([uU]|[bB]|[rR]|[bB][rR]|[rR][bB])?'''/,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't you really just expressing "word boundary" here... and can't that be done much simpler using \b?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's doing the same thing here.
But it didn't come first to my mind.
Thanks for your advice and I will replace (\W|^) with it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joshgoebel please take another look at my changes.
Many thanks for considering my request.

@github-actions
Copy link

Build Size Report

Changes to minified artifacts in /build, after gzip compression.

3 files changed

Total change +4 B

View Changes
file base pr diff
es/core.min.js 8.14 KB 8.14 KB +1 B
es/highlight.min.js 8.14 KB 8.14 KB +1 B
highlight.min.js 8.17 KB 8.17 KB +2 B

@joshgoebel
Copy link
Member

We need to add some markup tests as well to prove this actually works as intended.

@MohamedAli00949
Copy link
Contributor Author

MohamedAli00949 commented Oct 24, 2023

I have tested the code with (\W|^) and passed all checks
But after I replaced it with \b it didn't pass the python and I didn't discover that at the same time.
I have tried to fix the tests but I find it difficult to understand them now so I want any resource (eg: docs, or videos) to understand them easily and fix the tests.

@MohamedAli00949
Copy link
Contributor Author

Now, I have discovered the issue with this code.
The issue is that the word bounding expression selects everything from the beginning to the end as a string and the contains also.
In this sample, you will see that the prompt '...' has been selected as a string not as meta.
image

That's the right show
image

I tried to solve it until now it didn't solve it.
My suggested solution is to contact the contains expirations with the beginning expression or go back to use the (\W|^) expression.

@MohamedAli00949
Copy link
Contributor Author

Hey @joshgoebel, what is your opinion I understand the test mockups and I think we should put some of them such as print(test()or'text') and try that with every string prefix.
I want your opinion on fixing and testing my changes bugs.

Thanks before any thing

@joshgoebel
Copy link
Member

An example of every string type could work... I assume we'd put that in a test called strings or some such if we don't have that already. As far as the exact regex I'll take another look at the smyself once we have all the test cases.

@MohamedAli00949 MohamedAli00949 force-pushed the fix-python-string-highlighting branch from b4e0be8 to 3e0f4ce Compare October 27, 2023 04:19
@github-actions
Copy link

Build Size Report

Changes to minified artifacts in /build, after gzip compression.

4 files changed

Total change +27 B

View Changes
file base pr diff
es/core.min.js 8.14 KB 8.14 KB +1 B
es/highlight.min.js 8.14 KB 8.14 KB +1 B
es/languages/python.min.js 1.53 KB 1.55 KB +13 B
languages/python.min.js 1.54 KB 1.55 KB +12 B

@MohamedAli00949
Copy link
Contributor Author

Hi @joshgoebel, I have added some markup tests to test my changes and fixed selecting the char before the string prefix as a car such as highlighting the 'plus operator(+)' at the following example print('kj'+f'kj').
Take a look and tell me if you require any other changes.

Many thanks for considering my request.

@joshgoebel
Copy link
Member

In your test you have:

orbr"https://"

Is this valid, an or followed by a string or invalid?

@MohamedAli00949
Copy link
Contributor Author

In your test you have:

orbr"https://"

Is this valid, an or followed by a string or invalid?

I think it's invalid, and orrb"https://" also.
I want to test or with br or rb prefixes and I have tested them at or rb"https://" and or br"https://".
Thanks for the good note

@github-actions
Copy link

Build Size Report

Changes to minified artifacts in /build, after gzip compression.

5 files changed

Total change +30 B

View Changes
file base pr diff
es/core.min.js 8.14 KB 8.14 KB +2 B
es/highlight.min.js 8.14 KB 8.14 KB +2 B
es/languages/python.min.js 1.53 KB 1.55 KB +13 B
highlight.min.js 8.17 KB 8.17 KB +1 B
languages/python.min.js 1.54 KB 1.55 KB +12 B

@joshgoebel
Copy link
Member

What is an rrf string?

@MohamedAli00949
Copy link
Contributor Author

MohamedAli00949 commented Oct 28, 2023

What is an rrf string?

i think its was for testing if the three letters prefix will be highlighted or highlighting the right string.

@joshgoebel
Copy link
Member

joshgoebel commented Oct 28, 2023

I meant is rrf valid python? Orr rff

@MohamedAli00949
Copy link
Contributor Author

MohamedAli00949 commented Oct 28, 2023

I meant is rrf valid python? Orr rff

The rrf and rff are not valid python and I have used them to test the regex of begin.
The result does not highlight any of them and starts highlighting the string from the quotes only.

If these tests weren't important or out of test context, I will delete them now.

@MohamedAli00949
Copy link
Contributor Author

Hi @joshgoebel,
I know that you have done hard work with my PRs and all project PRs and I appreciate that.
But I don't know what I can do now to merge my PR, fix more issues, and do more on this nice project.

Thanks in advance

@joshgoebel
Copy link
Member

The result does not highlight any of them and starts highlighting the string from the quotes only.

It's totally OK to include some false positives in the tests (with a comment to explain what purpose they are serving) - but any false positives should still be valid Python code. We only care about how well we highlight actual real-life valid Python.

@MohamedAli00949
Copy link
Contributor Author

The result does not highlight any of them and starts highlighting the string from the quotes only.

It's totally OK to include some false positives in the tests (with a comment to explain what purpose they are serving) - but any false positives should still be valid Python code. We only care about how well we highlight actual real-life valid Python.

You convinced me, thanks ❤

@github-actions
Copy link

Build Size Report

Changes to minified artifacts in /build, after gzip compression.

5 files changed

Total change +30 B

View Changes
file base pr diff
es/core.min.js 8.13 KB 8.14 KB +2 B
es/highlight.min.js 8.13 KB 8.14 KB +2 B
es/languages/python.min.js 1.53 KB 1.55 KB +13 B
highlight.min.js 8.17 KB 8.17 KB +1 B
languages/python.min.js 1.54 KB 1.55 KB +12 B

@MohamedAli00949 MohamedAli00949 force-pushed the fix-python-string-highlighting branch from 603e1e6 to 2028286 Compare October 28, 2023 23:42
@github-actions
Copy link

Build Size Report

Changes to minified artifacts in /build, after gzip compression.

2 files changed

Total change +25 B

View Changes
file base pr diff
es/languages/python.min.js 1.53 KB 1.55 KB +13 B
languages/python.min.js 1.54 KB 1.55 KB +12 B

@MohamedAli00949
Copy link
Contributor Author

The result does not highlight any of them and starts highlighting the string from the quotes only.

It's totally OK to include some false positives in the tests (with a comment to explain what purpose they are serving) - but any false positives should still be valid Python code. We only care about how well we highlight actual real-life valid Python.

I have committed the right tests now, feel free to review and give me your good advice.

@MohamedAli00949
Copy link
Contributor Author

Hi @joshgoebel, Any news about reviewing the changes?

@MohamedAli00949
Copy link
Contributor Author

Hi @joshgoebel, If I have removed some of the test commits and collected the testing at one commit would that be best?

@joshgoebel
Copy link
Member

If we just match or first - all your tests still pass... do you have additional failing tests you'd like to come up with?

Copy link

github-actions bot commented Nov 4, 2023

Build Size Report

Changes to minified artifacts in /build, after gzip compression.

2 files changed

Total change +19 B

View Changes
file base pr diff
es/languages/python.min.js 1.53 KB 1.54 KB +10 B
languages/python.min.js 1.54 KB 1.55 KB +9 B

@MohamedAli00949
Copy link
Contributor Author

If we just match or first - all your tests still pass... do you have additional failing tests you'd like to come up with?

Firstly, I am sorry for replying late, I didn't see your comment until now, and thanks for your changes.
Secondly, I think I didn't have any other tests.

src/languages/python.js Outdated Show resolved Hide resolved
@joshgoebel joshgoebel force-pushed the fix-python-string-highlighting branch from 658a345 to 4a1eddd Compare November 8, 2023 00:16
Copy link

github-actions bot commented Nov 8, 2023

Build Size Report

Changes to minified artifacts in /build, after gzip compression.

5 files changed

Total change +17 B

View Changes
file base pr diff
es/core.min.js 8.14 KB 8.14 KB +1 B
es/highlight.min.js 8.14 KB 8.14 KB +1 B
es/languages/python.min.js 1.53 KB 1.54 KB +8 B
highlight.min.js 8.17 KB 8.17 KB +1 B
languages/python.min.js 1.54 KB 1.55 KB +6 B

@joshgoebel
Copy link
Member

Please add a changelog entry then and we should be good to go.

Copy link

github-actions bot commented Nov 8, 2023

Build Size Report

Changes to minified artifacts in /build, after gzip compression.

5 files changed

Total change +17 B

View Changes
file base pr diff
es/core.min.js 8.14 KB 8.14 KB +1 B
es/highlight.min.js 8.14 KB 8.14 KB +1 B
es/languages/python.min.js 1.53 KB 1.54 KB +8 B
highlight.min.js 8.17 KB 8.17 KB +1 B
languages/python.min.js 1.54 KB 1.55 KB +6 B

@joshgoebel joshgoebel changed the title Fix python string highlighting fix(python) fix or conflicts with string highlighting Nov 8, 2023
Copy link

github-actions bot commented Nov 8, 2023

Build Size Report

Changes to minified artifacts in /build, after gzip compression.

5 files changed

Total change +23 B

View Changes
file base pr diff
es/core.min.js 8.14 KB 8.14 KB +3 B
es/highlight.min.js 8.14 KB 8.14 KB +3 B
es/languages/python.min.js 1.53 KB 1.54 KB +8 B
highlight.min.js 8.17 KB 8.17 KB +3 B
languages/python.min.js 1.54 KB 1.55 KB +6 B

@joshgoebel joshgoebel merged commit 5bb7114 into highlightjs:main Nov 8, 2023
13 of 14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants