fixed uri regex issue #3815

Open
wants to merge 6 commits into
base: main
3 changes: 2 additions & 1 deletion pkg/detectors/privacy/privacy.go
@@ -3,10 +3,11 @@ package privacy
import (
"context"
"fmt"
- regexp "github.com/wasilibs/go-re2"
"net/http"
"strings"

+ regexp "github.com/wasilibs/go-re2"

"github.com/trufflesecurity/trufflehog/v3/pkg/common"
"github.com/trufflesecurity/trufflehog/v3/pkg/detectors"
"github.com/trufflesecurity/trufflehog/v3/pkg/pb/detectorspb"
3 changes: 2 additions & 1 deletion pkg/detectors/uri/uri.go
@@ -30,7 +30,7 @@ var _ interface {
} = (*Scanner)(nil)

var (
- keyPat = regexp.MustCompile(`\b(?:https?:)?\/\/[\S]{3,50}:([\S]{3,50})@[-.%\w\/:]+\b`)
+ keyPat = regexp.MustCompile(`\b(?:https?:\/\/)?[\w-\.$~!]{3,50}:([\w-\.%$^&#]{3,50})@[-.\w]+\b`)
Contributor

This is missing a large number of valid characters for usernames and passwords. The host pattern is also still fairly permissive and would match things that could never be valid, e.g. @----__-2as-2.
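
The host-pattern concern can be reproduced with a minimal sketch. It assumes the standard library `regexp` package behaves like go-re2 for this pattern; `prPattern`, `matchURI`, and the sample input are hypothetical names and data introduced only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// prPattern is the keyPat proposed in this PR, copied verbatim.
// Assumption: the standard library regexp stands in for go-re2 here.
var prPattern = regexp.MustCompile(`\b(?:https?:\/\/)?[\w-\.$~!]{3,50}:([\w-\.%$^&#]{3,50})@[-.\w]+\b`)

// matchURI returns the full match and the captured password,
// or two empty strings when the input does not match.
func matchURI(input string) (full, password string) {
	m := prPattern.FindStringSubmatch(input)
	if m == nil {
		return "", ""
	}
	return m[0], m[1]
}

func main() {
	// The host part below is taken from the review comment; it can never be a valid host.
	full, password := matchURI("https://user:pass123@----__-2as-2")
	fmt.Printf("match=%q password=%q\n", full, password)
}
```

Because `[-.\w]+` accepts any run of hyphens, dots, and word characters, the whole sample string is reported as a match.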

Contributor Author

How about something like:

\b(?:https?:\/\/)?[\w-\.$~!&'()*+,;=:%-]{3,50}:([\w-\.%$^#&'()*+,;=:%-]{3,50})@[a-zA-Z0-9.-]+(?:\.[a-zA-Z]{2,})?\b

Added additional valid characters and fixed the host pattern too.

Contributor

That's looking a bit better. Some notes:

  1. The scheme prefix shouldn't be optional.
  2. : isn't a valid username character.
  3. A username isn't always required (example).
  4. The username and password patterns are both missing special characters. It would be pragmatic to add all applicable special characters from [[:graph:]] and remove them later if they cause issues, e.g.:

\bhttps?:\/\/[\w!#$%&()*+,\-./;<=>?@[\\\]^_{|}~]{0,50}:([\w!#$%&()*+,\-./:;<=>?[\\\]^_{|}~]{3,50})@[a-zA-Z0-9.-]+(?:\.[a-zA-Z]{2,})?\b

  5. The pattern needs to be able to detect a port as well as a path.
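
One possible way to cover the port and path note is sketched below. This is not a pattern agreed on in this thread: the `(?::\d{1,5})?(?:\/[-.%\w\/]*)?` suffix, the names `credsAndHost`, `reviewerPattern`, and `withPortAndPath`, and the sample URI are all assumptions for illustration, and the standard library `regexp` is assumed to behave like go-re2 here.

```go
package main

import (
	"fmt"
	"regexp"
)

// credsAndHost is the pattern suggested in the review above, copied verbatim
// except that the trailing \b is appended separately below.
const credsAndHost = `\bhttps?:\/\/[\w!#$%&()*+,\-./;<=>?@[\\\]^_{|}~]{0,50}:([\w!#$%&()*+,\-./:;<=>?[\\\]^_{|}~]{3,50})@[a-zA-Z0-9.-]+(?:\.[a-zA-Z]{2,})?`

var (
	// reviewerPattern is the suggestion as written in the review.
	reviewerPattern = regexp.MustCompile(credsAndHost + `\b`)

	// withPortAndPath is a hypothetical extension: an optional ":port"
	// followed by an optional path built from the characters the
	// pre-PR pattern allowed after the "@".
	withPortAndPath = regexp.MustCompile(credsAndHost + `(?::\d{1,5})?(?:\/[-.%\w\/]*)?\b`)
)

func main() {
	sample := "https://admin:s3cr3t!@db.example.com:5432/app"

	// The suggestion as written stops at the host.
	fmt.Println(reviewerPattern.FindString(sample))

	// The extended sketch also keeps the port and path.
	fmt.Println(withPortAndPath.FindString(sample))

	// An empty username is accepted because of the {0,50} quantifier.
	fmt.Println(reviewerPattern.FindStringSubmatch("https://:token123@host.example.org")[1])
}
```

Keeping the shared prefix in one constant makes it easier to compare the two variants in table-driven tests before settling on a final pattern.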


// TODO: make local addr opt-out
defaultClient = detectors.DetectorHttpClientWithNoLocalAddresses
@@ -131,6 +131,7 @@ func (s Scanner) FromData(ctx context.Context, verify bool, data []byte) (result
continue
}
}

results = append(results, r)
}

6 changes: 6 additions & 0 deletions pkg/detectors/uri/uri_test.go
@@ -13,6 +13,7 @@ import (

var (
validPattern = "https://kaNydBSAodo87dsm9asuiSAFtsd7.com:1234@qYY3SylY7fHP"
+ validPattern2 = `<p><a href="http://username:[email protected]">http://username:[email protected]</a></p>`
invalidPattern = "https://kaNydBSAodo87dsm9asuiSAFtsd7.com.1234@qYY3SylY7fHP"
keyword = "uri"
)
@@ -30,6 +31,11 @@ func TestURI_Pattern(t *testing.T) {
input: fmt.Sprintf("%s token = '%s'", keyword, validPattern),
want: []string{validPattern},
},
+ {
+ name: "valid pattern - do not process duplicate",
+ input: fmt.Sprintf("%s token = '%s'", keyword, validPattern2),
+ want: []string{"http://username:[email protected]"},
+ },
{
name: "invalid pattern",
input: fmt.Sprintf("%s = '%s'", keyword, invalidPattern),