fix(rust): add support for r# raw identifiers #3947

Merged
joshgoebel merged 8 commits into highlightjs:main
Dec 26, 2023

Conversation

Jaebaek-Lee
Contributor

@Jaebaek-Lee Jaebaek-Lee commented Dec 24, 2023

#3924

Changes

To support raw identifiers in Rust, the regular expressions UNDERSCORE_IDENT_RE and IDENT_RE need to accept an optional r# prefix. However, since modes.js is shared with the other languages, the change must apply to Rust only.

Original (modes.js)

const UNDERSCORE_IDENT_RE = '[a-zA-Z_]\\w*';
const IDENT_RE = '[a-zA-Z]\\w*';

Change (rust.js)

So, add the following definitions to rust.js so that the r# prefix is supported only in Rust.

const UNDERSCORE_IDENT_RE = '(r#)?[a-zA-Z_]\\w*';
const IDENT_RE = '(r#)?[a-zA-Z]\\w*';
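The effect of the optional prefix can be checked outside highlight.js with plain regular expressions. The following is a minimal sketch: the two pattern strings are taken from this description, while the helper name `allMatches` and the sample line are assumptions made only for illustration.

```javascript
// Patterns copied from the PR description.
const PLAIN_IDENT_RE = '[a-zA-Z]\\w*';      // original (modes.js)
const RAW_IDENT_RE = '(r#)?[a-zA-Z]\\w*';   // change (rust.js)

// Hypothetical helper: collect every match of a pattern string in some text.
function allMatches(pattern, text) {
  return Array.from(text.matchAll(new RegExp(pattern, 'g')), (m) => m[0]);
}

const sample = 'let r#try = 1;';

// Without the prefix, "r" and "try" are matched as two separate identifiers.
const plainTokens = allMatches(PLAIN_IDENT_RE, sample);

// With the prefix, "r#try" is matched as a single identifier.
const rawTokens = allMatches(RAW_IDENT_RE, sample);

console.log(plainTokens); // [ 'let', 'r', 'try' ]
console.log(rawTokens);   // [ 'let', 'r#try' ]
```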

Checklist

  • Added markup tests, or they don't apply here because...
  • Updated the changelog at CHANGES.md

Comment on lines 179 to 186
-      $pattern: hljs.IDENT_RE + '!?',
+      $pattern: IDENT_RE + '!?',
Member

I don't think any entry in our literal keyword lists starts with r# (since I don't see that you've added any keywords), so if that's the case this should be reverted to reduce the work the keyword engine needs to do.

"overmatching" with $pattern for things that will never be found in the keyword lists only slows the engine down.

Contributor Author

@Jaebaek-Lee Jaebaek-Lee Dec 25, 2023

Thanks for the comment!

You're right, there are no keywords starting with r#. I changed that line without clearly understanding the keywords object.

I'll apply this and commit!

Comment on lines 13 to 14
const UNDERSCORE_IDENT_RE = '(r#)?[a-zA-Z_]\\w*';
const IDENT_RE = '(r#)?[a-zA-Z]\\w*';

@injae-kim injae-kim Dec 25, 2023

const RAW_IDENTIFIER = '(r#)?';
const UNDERSCORE_IDENT_RE = RAW_IDENTIFIER + hljs.UNDERSCORE_IDENT_RE;
const IDENT_RE = RAW_IDENTIFIER + hljs.IDENT_RE;

Maybe the code above looks better? (Just a suggestion 😄)

Contributor Author

Thanks for the comment!

That's a great suggestion. I think it looks much better! I'll apply it and commit!

export default function(hljs) {
// ============================================
// Added to support the r# keyword, which is a raw identifier in Rust.
const RAW_IDENTIFIER = '(r#)?';
Member

@joshgoebel joshgoebel Dec 25, 2023

Let's use a // regex literal here (not a string)... and pull in our regex utils and use concat for this... see abnf for an example.

https://github.com/highlightjs/highlight.js/blob/main/src/languages/abnf.js#L57
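A rough sketch of what this request amounts to: the prefix is written as a regex literal and joined to the identifier pattern by a concat helper instead of by string addition. The `concat` and `source` functions below are local stand-ins written for this example, not highlight.js's own utilities, whose exact behavior may differ. The identifier pattern is the one quoted from modes.js in the description.

```javascript
// Stand-in helper (assumption): return the pattern source of a RegExp or string.
function source(re) {
  return typeof re === 'string' ? re : re.source;
}

// Stand-in helper (assumption): join several patterns into one pattern string.
function concat(...parts) {
  return parts.map(source).join('');
}

// The optional raw-identifier prefix as a regex literal rather than a string.
const RAW_IDENTIFIER = /(r#)?/;

// Identifier pattern as quoted from modes.js in the PR description.
const BASE_IDENT_RE = '[a-zA-Z]\\w*';

const IDENT_RE = concat(RAW_IDENTIFIER, BASE_IDENT_RE);

// Anchor the combined pattern to check whole-string matches.
const wholeIdent = new RegExp('^' + IDENT_RE + '$');

console.log(IDENT_RE);
console.log(wholeIdent.test('r#match'), wholeIdent.test('value'), wholeIdent.test('#bad'));
```

Writing the prefix as a literal avoids double escaping and makes it obvious that the fragment is a pattern, while the helper keeps the shared base pattern in one place.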

Contributor Author

@Jaebaek-Lee Jaebaek-Lee Dec 25, 2023

I've committed the change you requested.
However, I found one more issue.

Screenshot 2023-12-26 035347
In the test above, r# is highlighted as desired where the identifier is declared, but not in this line:

println!("{}", r#try);

When the identifier is referenced again, try appears as a reserved keyword again in r#try.

This issue relates to the discussion in
#3947 (comment)
I found that restoring the change I had reverted there fixes it.

Screenshot 2023-12-26 033717

  • Code where the reference to the r#try identifier is not recognized
    name: 'Rust',
    aliases: [ 'rs' ],
    keywords: {
      $pattern: hljs.IDENT_RE + '!?',
      type: TYPES,
      keyword: KEYWORDS,
      literal: LITERALS,
      built_in: BUILTINS
    },
  • Code where the reference to the r#try identifier is recognized
    name: 'Rust',
    aliases: [ 'rs' ],
    keywords: {
      $pattern: IDENT_RE + '!?',
      type: TYPES,
      keyword: KEYWORDS,
      literal: LITERALS,
      built_in: BUILTINS
    },
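The difference between the two configurations can be illustrated with a simplified model of keyword lookup. This is a sketch under stated assumptions, not highlight.js's actual engine: every `$pattern` match is looked up in a keyword set, and only exact hits are highlighted. The `KEYWORDS` set is a tiny hypothetical subset, and `highlightedKeywords` is a made-up helper.

```javascript
// Hypothetical subset of Rust keywords, for illustration only.
const KEYWORDS = new Set(['fn', 'let', 'try']);

// Simplified model (assumption): each $pattern match is a candidate token,
// and a token is highlighted only if it appears in the keyword set.
function highlightedKeywords(pattern, code) {
  const hits = [];
  for (const m of code.matchAll(new RegExp(pattern, 'g'))) {
    if (KEYWORDS.has(m[0])) hits.push(m[0]);
  }
  return hits;
}

const line = 'println!("{}", r#try);';

// $pattern without the prefix: "try" is a token of its own and hits the list.
const plainHits = highlightedKeywords('[a-zA-Z]\\w*!?', line);

// $pattern with the prefix: "r#try" is consumed whole and is not in the list.
const rawHits = highlightedKeywords('(r#)?[a-zA-Z]\\w*!?', line);

console.log(plainHits); // [ 'try' ]
console.log(rawHits);   // []
```

In this model the wider pattern fixes the false positive only because it consumes r#try as one token that matches nothing in the keyword lists.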

Member

@joshgoebel joshgoebel left a comment

One last small thing and this is good to merge I think.

Thanks so much!


Build Size Report

Changes to minified artifacts in /build, after gzip compression.

3 files changed

Total change +35 B

file                        base      pr        diff
es/languages/rust.min.js    1.46 KB   1.48 KB   +18 B
highlight.min.js            8.21 KB   8.21 KB   -1 B
languages/rust.min.js       1.47 KB   1.49 KB   +18 B

@joshgoebel
Member

When the identifier is referenced again, try appears as a reserved keyword again in r#try.

Ok, now we're talking about false positives, which is a different situation entirely. I'm not sure we should use the keyword engine to consume these (but not match, and hence not apply scoping)... can we add a few failing tests for this case and I'll take another look?

@Jaebaek-Lee
Contributor Author

Jaebaek-Lee commented Dec 26, 2023

When the identifier is referenced again, try appears as a reserved keyword again in r#try.

Ok, now we're talking about false positives, which is a different situation entirely. I'm not sure we should use the keyword engine to consume these (but not match, and hence not apply scoping)... can we add a few failing tests for this case and I'll take another look?

Good! I'll create a new issue for the new problem and figure out how to fix it.

Please feel free to provide any additional feedback or suggestions for improvement in this PR. I'm open to making further corrections if needed.

Thanks so much!

@joshgoebel joshgoebel merged commit 8b88a7d into highlightjs:main Dec 26, 2023
15 checks passed
@Jaebaek-Lee Jaebaek-Lee deleted the my-branch branch December 26, 2023 15:33