fixed uri regex issue #3815
base: main
pkg/detectors/uri/uri.go
Outdated
@@ -23,7 +23,7 @@ var _ detectors.Detector = (*Scanner)(nil)
 var _ detectors.CustomFalsePositiveChecker = (*Scanner)(nil)

 var (
-	keyPat = regexp.MustCompile(`\b(?:https?:)?\/\/[\S]{3,50}:([\S]{3,50})@[-.%\w\/:]+\b`)
+	keyPat = regexp.MustCompile(`\b(?:https?:)?\/\/[\w-\.]{3,50}:([\w-\.]{3,50})@[-.%\w\/:]+\b`)
\S matches any non-whitespace character, which is very broad. Instead, we are now using \w, which matches [A-Za-z0-9_], and extending it with a few special characters to suit our needs.
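The difference can be checked with a small standalone program. This is only a sketch: the two patterns are copied from the diff above, while the `password` helper and the sample inputs are made up for illustration and are not part of the detector.

```go
package main

import (
	"fmt"
	"regexp"
)

// Patterns copied from the diff: the original keyPat and the first revision.
var (
	oldPat = regexp.MustCompile(`\b(?:https?:)?\/\/[\S]{3,50}:([\S]{3,50})@[-.%\w\/:]+\b`)
	newPat = regexp.MustCompile(`\b(?:https?:)?\/\/[\w-\.]{3,50}:([\w-\.]{3,50})@[-.%\w\/:]+\b`)
)

// password is a hypothetical helper: it returns the captured password,
// or "" when the pattern does not match at all.
func password(re *regexp.Regexp, s string) string {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	// Illustrative inputs only, not taken from the test suite.
	inputs := []string{
		"https://admin:s3cr3t@example.com",
		`https://us"er:pa<ss>word@example.com`,
	}
	for _, in := range inputs {
		fmt.Printf("%q old=%q new=%q\n", in, password(oldPat, in), password(newPat, in))
	}
}
```

Both patterns capture the password in the first input. Only the \S version matches the second input, because quotes and angle brackets are non-whitespace but fall outside the narrower class.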
@@ -23,7 +23,7 @@ var _ detectors.Detector = (*Scanner)(nil)
 var _ detectors.CustomFalsePositiveChecker = (*Scanner)(nil)

 var (
-	keyPat = regexp.MustCompile(`\b(?:https?:)?\/\/[\S]{3,50}:([\S]{3,50})@[-.%\w\/:]+\b`)
+	keyPat = regexp.MustCompile(`\b(?:https?:\/\/)?[\w-\.$~!]{3,50}:([\w-\.%$^&#]{3,50})@[-.\w]+\b`)
This is missing a large number of valid characters for usernames and passwords. The host pattern is also still fairly permissive and would match things that could never be valid, e.g. @----__-2as-2.
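Both concerns can be reproduced with a short check. The pattern below is copied from the revised diff; the `matches` helper and the two sample URIs are assumptions made up for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// revisedPat is the second revision of keyPat, copied from the diff.
var revisedPat = regexp.MustCompile(`\b(?:https?:\/\/)?[\w-\.$~!]{3,50}:([\w-\.%$^&#]{3,50})@[-.\w]+\b`)

// matches is a hypothetical helper reporting whether revisedPat
// finds a credential-bearing URI anywhere in s.
func matches(s string) bool {
	return revisedPat.MatchString(s)
}

func main() {
	// A host that could never be valid is still accepted.
	fmt.Println(matches("https://user:secret@----__-2as-2")) // true

	// '*' and '!' are legal in a URI password, yet this one is missed.
	fmt.Println(matches("https://user:pa*ss!word@example.com")) // false
}
```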
How about something like:
\b(?:https?:\/\/)?[\w-\.$~!&'()*+,;=:%-]{3,50}:([\w-\.%$^#&'()*+,;=:%-]{3,50})@[a-zA-Z0-9.-]+(?:\.[a-zA-Z]{2,})?\b
Added additional valid characters and fixed the host pattern too.
Example: https://regex101.com/r/qBF4nS/1
That's looking a bit better. Some notes:
- The scheme prefix shouldn't be optional
- : isn't a valid username character
- Username isn't always required (example)
- The username and password patterns are both missing special characters. It would be pragmatic to add all applicable special characters from [[:graph:]], and remove them later if they're causing issues. e.g.,
  \bhttps?:\/\/[\w!#$%&()*+,\-./;<=>?@[\\\]^_{|}~]{0,50}:([\w!#$%&()*+,\-./:;<=>?[\\\]^_{|}~]{3,50})@[a-zA-Z0-9.-]+(?:\.[a-zA-Z]{2,})?\b
- The pattern needs to be able to detect port as well as path
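One way these notes could fit together is sketched below. The `base` pattern is the suggestion above with its trailing \b dropped. The `suffix` (optional port, then optional path), the `find` helper, and the sample inputs are assumptions for illustration, not the pattern that was eventually merged.

```go
package main

import (
	"fmt"
	"regexp"
)

// base is the suggested pattern from the review, minus the trailing \b.
const base = `\bhttps?:\/\/[\w!#$%&()*+,\-./;<=>?@[\\\]^_{|}~]{0,50}:([\w!#$%&()*+,\-./:;<=>?[\\\]^_{|}~]{3,50})@[a-zA-Z0-9.-]+(?:\.[a-zA-Z]{2,})?`

// suffix is hypothetical: an optional port followed by an optional path.
const suffix = `(?::\d{1,5})?(?:\/[\w\-./%]*)?`

var sketchPat = regexp.MustCompile(base + suffix)

// find is a hypothetical helper returning the full match and the captured
// password, or two empty strings when nothing matches.
func find(s string) (match, secret string) {
	m := sketchPat.FindStringSubmatch(s)
	if m == nil {
		return "", ""
	}
	return m[0], m[1]
}

func main() {
	// Illustrative inputs: port and path, empty username, missing scheme.
	for _, in := range []string{
		"https://user:pass@example.com:8080/db",
		"https://:token123@example.com/path",
		"user:pass@example.com",
	} {
		match, secret := find(in)
		fmt.Printf("%q -> match=%q secret=%q\n", in, match, secret)
	}
}
```

With the scheme required, the last input is rejected. The {0,50} quantifier on the username lets the second input through with an empty username.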
Description:
This PR fixes GitHub issue #3686
Checklist:
- Tests passing (make test-community)?
- Lint passing (make lint this requires golangci-lint)?