Merge branch 'master' into output-format
iBug committed Oct 2, 2024
2 parents db202f6 + 948fe54 commit 8ae4772
Showing 4 changed files with 124 additions and 108 deletions.
55 changes: 31 additions & 24 deletions README.md
@@ -1,6 +1,6 @@
# ayano

Follow nginx log, and find out bad guys! Ayano parses nginx log and shows clients eating most bandwidth every few seconds.
Follow nginx log, and find out bad guys! Ayano parses web server log and shows clients eating most bandwidth every few seconds.

## Build

@@ -15,31 +15,44 @@ $ ./ayano
A simple log analysis tool for Nginx, Apache, or other web server logs

Usage:
ayano [flags]
ayano [command]

Available Commands:
analyze Log analyse mode (no tail following, only show top N at the end, and implies --whole)
completion Generate the autocompletion script for the specified shell
daemon Daemon mode, prints out IP CIDR and total size every 1 GiB
run Run and follow the log file
help Help about any command
list List various items
run Run and follow the log file(s)

Flags:
-h, --help help for ayano

Use "ayano [command] --help" for more information about a command.
$ ./ayano run --help
Run and follow the log file
Run and follow the log file(s)

Usage:
ayano run [filename] [flags]
ayano run [filename...] [flags]

Flags:
-a, --absolute Show absolute time for each item
-h, --help help for run
--no-netstat Do not detect active connections
-o, --outlog string Change log output file
-p, --parser string Log parser (nginx-combined|nginx-json|caddy-json|goaccess) (default "nginx-json")
-r, --refresh int Refresh interval in seconds (default 5)
-s, --server string Server IP to filter (nginx-json only)
-S, --sort-by string Sort result by (size|requests) (default "size")
-t, --threshold size Threshold size for request (only requests at least this large will be counted) (default 10 MB)
-n, --top int Number of top items to show (default 10)
-w, --whole Analyze whole log file and then tail it
-a, --absolute Show absolute time for each item
-g, --group Try to group CIDRs
-h, --help help for run
--no-netstat Do not detect active connections
-o, --outlog string Change log output file
-p, --parser string Log parser (see "ayano list parsers") (default "nginx-json")
--prefixv4 int Group IPv4 by prefix (default 24)
--prefixv6 int Group IPv6 by prefix (default 48)
-r, --refresh int Refresh interval in seconds (default 5)
-s, --server string Server IP to filter (nginx-json only)
-S, --sort-by string Sort result by (size|requests) (default "size")
-t, --threshold size Threshold size for request (only requests at least this large will be counted) (default 10 MB)
-n, --top int Number of top items to show (default 10)
--truncate Truncate long URLs from output
--truncate-to int Truncate URLs to given length, overrides --truncate
-w, --whole Analyze whole log file and then tail it

# Example 1
$ ./ayano run -n 20 --threshold 50M /var/log/nginx/access_json.log
@@ -49,14 +62,7 @@ $ ./ayano run -n 50 --whole --parser nginx-combined /var/log/nginx/access.log
$ ./ayano analyze -n 100 /var/log/nginx/access_json.log
```

By default, it would output like this every 5 seconds:

```log
2024/07/10 00:13:48 2222:222:2222::/48 (active, 1): 457 MiB 2 228 MiB /some/big/file (from 6 seconds ago, last accessed 6 seconds ago)
2024/07/10 00:13:48 111.11.111.0/24: 268 MiB 1 268 MiB /another/big/file (from 13 seconds ago, last accessed 13 seconds ago)
```

`457 MiB 2 228 MiB` means it downloads 457 MiB large files in total, with 2 requests and 228 MiB on average.
Ayano would output a table which is easy for humans to read.

### Daemon mode (experimental)

@@ -88,7 +94,7 @@ which means that "114.5.14.0/24" takes at least 36GiB bandwidth, and "191.9.81.0

## Format support

Ayano supports two types of nginx log:
Ayano supports following types of log format. You could also use `ayano list parsers` to check.

1. Standard "combined" format access log.
2. JSON format access log configured as:
@@ -118,6 +124,7 @@ Ayano supports two types of nginx log:
**Note**: If you are using Caddy behind a reverse proxy, please upgrade Caddy to 2.7.0+ and set `trusted_proxies` (and `client_ip_headers`) in configuration file to let log have `client_ip` field outputted.
4. GoAccess format string. You shall set `GOACCESS_CONFIG` env to a goaccess config file beforehand ([format recognized](https://github.com/taoky/goaccessfmt?tab=readme-ov-file#config-file-format), [example](assets/goaccess.conf)).
5. Tencent CDN log format.
## Note
25 changes: 17 additions & 8 deletions pkg/parser/common.go
@@ -2,7 +2,7 @@ package parser

import (
"bytes"
"errors"
"fmt"
"time"
)

@@ -37,25 +37,34 @@ func findEndingDoubleQuote(data []byte) int {

func splitFields(line []byte) ([][]byte, error) {
res := make([][]byte, 0, 16)
loop:
for baseIdx := 0; baseIdx < len(line); {
if line[baseIdx] == '"' {
switch line[baseIdx] {
case '"':
quoteIdx := findEndingDoubleQuote(line[baseIdx+1:])
if quoteIdx == -1 {
return res, errors.New("unexpected format: unbalanced quotes")
return res, fmt.Errorf("unexpected format: unbalanced quotes at %d", baseIdx)
}
res = append(res, line[baseIdx+1:baseIdx+quoteIdx+1])
baseIdx += quoteIdx + 2
if line[baseIdx] == ' ' {
baseIdx++
case '[':
closingIdx := bytes.IndexByte(line[baseIdx+1:], ']')
if closingIdx == -1 {
return res, fmt.Errorf("unexpected format: unmatched [ at %d", baseIdx)
}
} else {
res = append(res, line[baseIdx+1:baseIdx+closingIdx+1])
baseIdx += closingIdx + 2
default:
spaceIdx := bytes.IndexByte(line[baseIdx:], ' ')
if spaceIdx == -1 {
res = append(res, line[baseIdx:])
break
break loop
}
res = append(res, line[baseIdx:baseIdx+spaceIdx])
baseIdx += spaceIdx + 1
baseIdx += spaceIdx
}
if baseIdx < len(line) && line[baseIdx] == ' ' {
baseIdx++
}
}
return res, nil
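The reworked splitter in this hunk can be read as a small tokenizer: quoted and bracketed fields are taken whole with their delimiters removed, bare fields run up to the next space, and one separating space is skipped after every field. Below is a minimal standalone sketch of that idea, not the repository's code. The names `splitLogFields` and `findClosingQuote` are hypothetical, and `findClosingQuote` only assumes the behaviour implied by the test cases in `common_test.go` (backslash escapes are skipped), since the body of `findEndingDoubleQuote` is not part of this diff.

```go
package main

import (
	"bytes"
	"fmt"
)

// findClosingQuote is a hypothetical stand-in for findEndingDoubleQuote.
// It returns the index of the first double quote that is not escaped by a
// backslash, or -1 if there is none.
func findClosingQuote(data []byte) int {
	for i := 0; i < len(data); i++ {
		switch data[i] {
		case '\\':
			i++ // skip the escaped byte
		case '"':
			return i
		}
	}
	return -1
}

// splitLogFields splits a log line on single spaces, treating "..." and
// [...] as single fields with their delimiters stripped.
func splitLogFields(line []byte) ([][]byte, error) {
	fields := make([][]byte, 0, 16)
	for i := 0; i < len(line); {
		switch line[i] {
		case '"':
			end := findClosingQuote(line[i+1:])
			if end == -1 {
				return fields, fmt.Errorf("unbalanced quotes at %d", i)
			}
			fields = append(fields, line[i+1:i+1+end])
			i += end + 2 // move past the closing quote
		case '[':
			end := bytes.IndexByte(line[i+1:], ']')
			if end == -1 {
				return fields, fmt.Errorf("unmatched [ at %d", i)
			}
			fields = append(fields, line[i+1:i+1+end])
			i += end + 2 // move past the closing bracket
		default:
			end := bytes.IndexByte(line[i:], ' ')
			if end == -1 {
				// Last bare field: it runs to the end of the line.
				return append(fields, line[i:]), nil
			}
			fields = append(fields, line[i:i+end])
			i += end
		}
		// Skip the single space that separates this field from the next.
		if i < len(line) && line[i] == ' ' {
			i++
		}
	}
	return fields, nil
}

func main() {
	line := []byte(`127.0.0.1 - - [02/Jan/2006:15:04:05 -0700] "GET /index.html HTTP/1.1" 200 512`)
	fields, err := splitLogFields(line)
	if err != nil {
		panic(err)
	}
	for i, f := range fields {
		fmt.Printf("%d: %s\n", i, f)
	}
}
```

Handling the separator once after the `switch`, rather than inside each branch, is what lets the quoted, bracketed, and bare cases share the same trailing-space logic.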
57 changes: 57 additions & 0 deletions pkg/parser/common_test.go
@@ -0,0 +1,57 @@
package parser

import (
"testing"
"time"

"github.com/stretchr/testify/assert"
)

func TestClfDateParse(t *testing.T) {
expected := time.Date(2006, time.January, 2, 15, 4, 5, 0, time.FixedZone("", -7*60*60))
assert.Equal(t, expected, clfDateParse([]byte(CommonLogFormat)))
assert.Equal(t, expected, clfDateParseString(CommonLogFormat))
}

func TestFindEndingDoubleQuote(t *testing.T) {
type testCase struct {
input []byte
expected int
}
testCases := []testCase{
{[]byte(`abc"`), 3},
{[]byte(`ab\"c"`), 5},
{[]byte(`ab\\c"`), 5},
{[]byte(`ab`), -1},
}
for _, c := range testCases {
assert.Equal(t, c.expected, findEndingDoubleQuote(c.input))
}
}

func TestSplitFields(t *testing.T) {
type testCase struct {
line []byte
expected [][]byte
}
testCases := []testCase{
{
[]byte(`127.0.0.1 - - [2/Jan/2006:15:04:05 -0700] "GET /blog/2021/01/hello-world HTTP/1.1" 200 512`),
[][]byte{
[]byte(`127.0.0.1`),
[]byte(`-`),
[]byte(`-`),
[]byte(`2/Jan/2006:15:04:05 -0700`),
[]byte(`GET /blog/2021/01/hello-world HTTP/1.1`),
[]byte(`200`),
[]byte(`512`),
},
},
}
for _, c := range testCases {
res, err := splitFields(c.line)
if assert.NoError(t, err) {
assert.Equal(t, c.expected, res)
}
}
}
95 changes: 19 additions & 76 deletions pkg/parser/nginx-combined.go
@@ -36,46 +36,24 @@ func init() {
})
}

func ParseNginxCombined(line []byte) (LogItem, error) {
baseIdx := 0
// get the first -
delimIndex := bytes.IndexByte(line, '-')
if delimIndex == -1 {
return LogItem{}, errors.New("unexpected format: no - (empty identity)")
}

clientIP := line[:delimIndex-1]
baseIdx = delimIndex + 1
// get time within [$time_local]
leftBracketIndex := bytes.IndexByte(line[baseIdx:], '[')
if leftBracketIndex == -1 {
return LogItem{}, errors.New("unexpected format: no [ (datetime)")
func ParseNginxCombined(line []byte) (logItem LogItem, err error) {
fields, err := splitFields(line)
if err != nil {
return logItem, err
}
rightBracketIndex := bytes.IndexByte(line[baseIdx+leftBracketIndex+1:], ']')
if rightBracketIndex == -1 {
return LogItem{}, errors.New("unexpected format: no ] (datetime)")
if len(fields) != 9 {
return logItem, fmt.Errorf("invalid format: expected 9 fields, got %d", len(fields))
}

localTimeByte := line[baseIdx+leftBracketIndex+1 : baseIdx+leftBracketIndex+rightBracketIndex+1]
// localTime, err := time.Parse("02/Jan/2006:15:04:05 -0700", string(localTimeByte))
// if err != nil {
// return LogItem{}, err
// }
localTime := clfDateParse(localTimeByte)
baseIdx += leftBracketIndex + rightBracketIndex + 2

// get URL within first "$request"
leftQuoteIndex := bytes.IndexByte(line[baseIdx:], '"')
if leftQuoteIndex == -1 {
return LogItem{}, errors.New("unexpected format: no \" (request)")
}
rightQuoteIndex := findEndingDoubleQuote(line[baseIdx+leftQuoteIndex+1:])
if rightQuoteIndex == -1 {
return LogItem{}, errors.New("unexpected format: no \" after first \" (request)")
if string(fields[1]) != "-" {
return logItem, errors.New("unexpected format: no - (empty identity)")
}

url := line[baseIdx+leftQuoteIndex+1 : baseIdx+leftQuoteIndex+rightQuoteIndex+1]
baseIdx += leftQuoteIndex + rightQuoteIndex + 2
logItem.Client = string(fields[0])
logItem.Time = clfDateParse(fields[3])

requestLine := fields[4]
url := requestLine
// strip HTTP method in url
spaceIndex := bytes.IndexByte(url, ' ')
if spaceIndex == -1 {
@@ -91,51 +69,16 @@ func ParseNginxCombined(line []byte) (LogItem, error) {
} else {
url = url[:spaceIndex]
}
logItem.URL = string(url)

// get size ($body_bytes_sent)
baseIdx += 1
leftSpaceIndex := bytes.IndexByte(line[baseIdx:], ' ')
if leftSpaceIndex == -1 {
return LogItem{}, errors.New("unexpected format: no space after $request (code)")
}
rightSpaceIndex := bytes.IndexByte(line[baseIdx+leftSpaceIndex+1:], ' ')
if rightSpaceIndex == -1 {
return LogItem{}, errors.New("unexpected format: no space after $body_bytes_sent (size)")
}
sizeBytes := line[baseIdx+leftSpaceIndex+1 : baseIdx+leftSpaceIndex+rightSpaceIndex+1]
size, err := strconv.ParseUint(string(sizeBytes), 10, 64)
sizeBytes := fields[6]
logItem.Size, err = strconv.ParseUint(string(sizeBytes), 10, 64)
if err != nil {
return LogItem{}, err
return logItem, err
}
baseIdx += leftSpaceIndex + rightSpaceIndex + 2

// skip referer
leftQuoteIndex = bytes.IndexByte(line[baseIdx:], '"')
if leftQuoteIndex == -1 {
return LogItem{}, errors.New("unexpected format: no \" (referer)")
}
rightQuoteIndex = findEndingDoubleQuote(line[baseIdx+leftQuoteIndex+1:])
if rightQuoteIndex == -1 {
return LogItem{}, errors.New("unexpected format: no \" after first \" (referer)")
}
baseIdx += 1 + leftQuoteIndex + rightQuoteIndex + 2
// get UA
leftQuoteIndex = bytes.IndexByte(line[baseIdx:], '"')
if leftQuoteIndex == -1 {
return LogItem{}, errors.New("unexpected format: no \" (user-agent)")
}
rightQuoteIndex = findEndingDoubleQuote(line[baseIdx+leftQuoteIndex+1:])
if rightQuoteIndex == -1 {
return LogItem{}, errors.New("unexpected format: no \" after first \" (user-agent)")
}
userAgent := line[baseIdx+leftQuoteIndex+1 : baseIdx+leftQuoteIndex+rightQuoteIndex+1]
return LogItem{
Size: size,
Client: string(clientIP),
Time: localTime,
URL: string(url),
Useragent: string(userAgent),
}, nil
logItem.Useragent = string(fields[8])
return
}

var nginxCombinedRe = regexp.MustCompile(
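After this change the parser relies on fixed field positions instead of scanning for delimiters. As a rough illustration of that mapping, the sketch below names the indices used in the hunk (0, 1, 3, 4, 6, 8) according to nginx's `combined` layout and shows one way to reduce a request line to its URL. The constant names and the `requestPath` helper are hypothetical, and because the middle of the URL-stripping logic is collapsed in this diff, the helper is an assumption about the intent and not a copy of the actual code.

```go
package main

import (
	"bytes"
	"fmt"
)

// Field positions after splitting a line in nginx's "combined" format:
// $remote_addr - $remote_user [$time_local] "$request" $status
// $body_bytes_sent "$http_referer" "$http_user_agent"
// The constant names are illustrative only.
const (
	fieldClient    = 0
	fieldIdentity  = 1
	fieldTime      = 3
	fieldRequest   = 4
	fieldSize      = 6
	fieldUserAgent = 8
	fieldCount     = 9
)

// requestPath is a hypothetical helper. Given a request line such as
// "GET /path HTTP/1.1" it returns "/path". A line without spaces is
// returned unchanged.
func requestPath(requestLine []byte) string {
	url := requestLine
	if i := bytes.IndexByte(url, ' '); i != -1 {
		url = url[i+1:] // drop the HTTP method
	}
	if i := bytes.LastIndexByte(url, ' '); i != -1 {
		url = url[:i] // drop the protocol version
	}
	return string(url)
}

func main() {
	fields := [][]byte{
		[]byte("127.0.0.1"),
		[]byte("-"),
		[]byte("-"),
		[]byte("02/Jan/2006:15:04:05 -0700"),
		[]byte("GET /blog/2021/01/hello-world HTTP/1.1"),
		[]byte("200"),
		[]byte("512"),
		[]byte("-"),
		[]byte("curl/8.0"),
	}
	if len(fields) != fieldCount {
		panic("unexpected number of fields")
	}
	fmt.Println(string(fields[fieldClient]), requestPath(fields[fieldRequest]), string(fields[fieldSize]))
}
```

Checking the field count up front, as the new `ParseNginxCombined` does, is what makes the later fixed-index accesses safe.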
