Python log parser - handle limited size log content #213
// Since the log parser parse line by line, we need to create a parser that can capture group content that may span multiple lines.
func getMultilineSplitCaptureOutputPattern(startCollectingPattern, captureGroup, endCollectingPattern string, handler func(pattern *gofrogcmd.CmdOutputPattern) (string, error)) (parsers []*gofrogcmd.CmdOutputPattern) {
	// Prepare regex patterns.
	oneLineRegex := regexp.MustCompile(startCollectingPattern + `(` + captureGroup + `)` + endCollectingPattern)
regexp.MustCompile is expensive.
Let's make the effort to compile regex patterns only once, at package initialization, by putting them in variables outside the functions.
These are dynamic patterns, constructed from the arguments passed to the function. I don't see any benefit in moving them outside the function, because I need those argument values to build them.
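To illustrate the distinction raised in this thread, here is a minimal sketch, not code from this PR. `anyLine` and `buildCapturePattern` are hypothetical names: a fixed pattern can be compiled once in a package-level variable, while a pattern assembled from function arguments can only be compiled once those arguments are known.

```go
package main

import (
	"fmt"
	"regexp"
)

// anyLine is a fixed pattern, so it can be compiled once when the package
// is initialized (hypothetical name, for illustration only).
var anyLine = regexp.MustCompile(`.*`)

// buildCapturePattern is a hypothetical helper. The pattern depends on its
// arguments, so it has to be compiled at call time. It uses regexp.Compile
// and returns the error instead of panicking on an invalid pattern.
func buildCapturePattern(start, capture, end string) (*regexp.Regexp, error) {
	return regexp.Compile(start + `(` + capture + `)` + end)
}

func main() {
	fmt.Println(anyLine.MatchString("some log line"))

	re, err := buildCapturePattern(`a`, `b+`, `c`)
	if err != nil {
		fmt.Println("invalid pattern:", err)
		return
	}
	// Index 1 holds the content of the single capture group.
	fmt.Println(re.FindStringSubmatch("abbc")[1])
}
```

If the same argument combinations were compiled repeatedly, caching the compiled patterns would be an option, but whether that is worthwhile here is not established by this discussion.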
	// Create a parser for multi line pattern matches.
	lineBuffer := ""
	collectingMultiLineValue := false
	parsers = append(parsers, &gofrogcmd.CmdOutputPattern{RegExp: regexp.MustCompile(".*"), ExecFunc: func(pattern *gofrogcmd.CmdOutputPattern) (string, error) {
`.*` does not match newline characters. Is that intended?
The parser we use goes over the content line by line, so the input never includes newlines anyway: the newline is the delimiter used to split tokens.
The purpose of this parser is to concatenate the text, removing the newlines that were added where the content was split.
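For reference, the behavior of `.` discussed in this thread can be checked with the standard `regexp` package alone. This is a small sketch; `firstMatch` is a hypothetical helper and is not part of the PR. By default `.` does not match `\n`, and the `(?s)` flag changes that.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in s
// (hypothetical helper, for illustration only).
func firstMatch(pattern, s string) string {
	return regexp.MustCompile(pattern).FindString(s)
}

func main() {
	text := "first line\nsecond line"

	// Without flags, `.` stops at the newline.
	fmt.Printf("%q\n", firstMatch(`.*`, text)) // "first line"

	// With the (?s) flag, `.` also matches the newline.
	fmt.Printf("%q\n", firstMatch(`(?s).*`, text)) // "first line\nsecond line"
}
```

Since the parser here receives one line at a time, the default behavior is sufficient, as the reply above points out.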
In some cases during installation, the log content is limited in size and a single entry is split across multiple lines, for example with `pipenv`. The split lines cause our code not to match the expected regex, so we could not collect the dependencies.
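The idea can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the implementation in this PR: `captureAcrossLines` is a hypothetical function, it does not use `gofrogcmd`, and the `dep=<name>;` input format is made up for the example and is not real `pipenv` output.

```go
package main

import (
	"fmt"
	"regexp"
)

// captureAcrossLines is a hypothetical helper. It returns the capture group
// of start+(capture)+end for every entry, including entries that were split
// across several consecutive lines.
func captureAcrossLines(lines []string, start, capture, end string) []string {
	oneLine := regexp.MustCompile(start + `(` + capture + `)` + end)
	startOnly := regexp.MustCompile(start)

	var found []string
	buffer := ""
	collecting := false
	for _, line := range lines {
		if !collecting {
			// Common case: the whole entry fits on one line.
			if m := oneLine.FindStringSubmatch(line); m != nil {
				found = append(found, m[1])
				continue
			}
			// The start marker is present but the entry is incomplete:
			// begin buffering from the start marker.
			if loc := startOnly.FindStringIndex(line); loc != nil {
				buffer = line[loc[0]:]
				collecting = true
			}
			continue
		}
		// Join the continuation line without a newline, then retry.
		buffer += line
		if m := oneLine.FindStringSubmatch(buffer); m != nil {
			found = append(found, m[1])
			buffer = ""
			collecting = false
		}
	}
	return found
}

func main() {
	// Made-up input: the second entry is split over two lines.
	lines := []string{"dep=requests;", "dep=urll", "ib3;", "unrelated"}
	fmt.Println(captureAcrossLines(lines, `dep=`, `[a-z0-9]+`, `;`)) // [requests urllib3]
}
```

This sketch keeps buffering until the end pattern appears, so a real parser would likely need a limit or a reset condition for entries that never complete.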