Goal
To derive various metrics from the Nginx access log via ngxtop.
Assumptions
To complete this, you will need:
- The Platform.sh CLI tool installed
- The Python pip tool installed
Problems
Sometimes you need to derive metrics from the access log files to identify bad traffic. Doing that analysis with raw Linux commands can be slow and cumbersome.
Steps
1. Check ngxtop installation
Make sure ngxtop is installed locally by running the command below:
pip install --user ngxtop
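A quick way to confirm the install is to check that an ngxtop executable can be found on the PATH. The snippet below is a minimal sketch of that check; the helper name `ngxtop_available` is hypothetical and not part of ngxtop or the Platform.sh CLI.

```python
import shutil


def ngxtop_available() -> bool:
    """Return True if an `ngxtop` executable is found on the current PATH."""
    return shutil.which("ngxtop") is not None


if __name__ == "__main__":
    if ngxtop_available():
        print("ngxtop found on PATH")
    else:
        # `pip install --user` places scripts in the user script directory,
        # which is not always on PATH by default.
        print("ngxtop not found; check that the pip user script directory is on PATH")
```

If the check fails right after a successful install, the usual cause is that the user script directory is missing from PATH.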
2. Get web traffic overview
To get an overview of recent web traffic, use the command below:
platform log access -p <project id> -e master --lines 102400 | ngxtop --no-follow
< skipped >
Summary:
| count | avg_bytes_sent | 2xx | 3xx | 4xx | 5xx |
|---------+------------------+-------+-------+-------+-------|
| 41553 | 6342.284 | 27380 | 6474 | 100 | 7599 |
Detailed:
| request_path | count | avg_bytes_sent | 2xx | 3xx | 4xx | 5xx |
|-----------------------------------------------+---------+------------------+-------+-------+-------+-------|
| /scripts/user-widget/dist/user-widget.min.js | 1190 | 29123.529 | 962 | 228 | 0 | 0 |
| /gitbook/gitbook-plugin-codetabs/codetabs.js | 1185 | 186.957 | 961 | 224 | 0 | 0 |
| /gitbook/style.css | 1185 | 8088.093 | 964 | 221 | 0 | 0 |
| /gitbook/gitbook-plugin-edit-link/plugin.js | 1184 | 340.759 | 961 | 223 | 0 | 0 |
| /gitbook/gitbook-plugin-gtm/plugin.js | 1184 | 199.188 | 960 | 224 | 0 | 0 |
| /gitbook/gitbook-plugin-atoc/atoc.js | 1183 | 323.057 | 961 | 222 | 0 | 0 |
| /gitbook/gitbook-plugin-highlight/website.css | 1181 | 2065.165 | 961 | 220 | 0 | 0 |
| /gitbook/gitbook-plugin-reveal/reveal.js | 1181 | 229.190 | 958 | 223 | 0 | 0 |
| /gitbook/gitbook.js | 1181 | 23890.862 | 959 | 222 | 0 | 0 |
| /gitbook/theme.js | 1179 | 25528.468 | 957 | 222 | 0 | 0 |
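To make the Summary columns concrete, here is a rough sketch of how a request count, an average byte size, and per-class status counts could be computed from raw log lines. It assumes the log follows the Nginx "combined" format; the regular expression, the `summarize` helper, and the sample lines (which use documentation-only IP addresses) are illustrative assumptions, not ngxtop's actual implementation.

```python
import re
from collections import Counter

# Assumed layout: Nginx "combined" log format.
LOG_PATTERN = re.compile(
    r'(?P<remote_addr>\S+) \S+ \S+ \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes_sent>\d+|-)'
)


def summarize(lines):
    """Count requests, average bytes sent, and responses per status class."""
    classes = Counter()
    total_bytes = 0
    count = 0
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not look like access log entries
        count += 1
        sent = match.group("bytes_sent")
        total_bytes += 0 if sent == "-" else int(sent)
        classes[match.group("status")[0] + "xx"] += 1
    summary = {"count": count, "avg_bytes_sent": total_bytes / count if count else 0.0}
    for status_class in ("2xx", "3xx", "4xx", "5xx"):
        summary[status_class] = classes.get(status_class, 0)
    return summary


# Hypothetical sample lines, not taken from a real log.
SAMPLE = [
    '203.0.113.5 - - [10/Mar/2019:10:00:00 +0000] "GET /gitbook/style.css HTTP/1.1" 200 8000 "-" "ExampleAgent/1.0"',
    '203.0.113.5 - - [10/Mar/2019:10:00:01 +0000] "GET /gitbook/style.css HTTP/1.1" 304 0 "-" "ExampleAgent/1.0"',
    '198.51.100.7 - - [10/Mar/2019:10:00:02 +0000] "POST //xmlrpc.php HTTP/1.1" 502 1000 "-" "ExampleAgent/2.0"',
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

A high 5xx column relative to the total count, as in the output above, is a sign that the traffic deserves a closer look.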
3. View top visitors
You can also list the top visitor IP addresses and HTTP user agents:
$> platform log access -p <project id> -e master --lines 102400 | ngxtop --no-follow top remote_addr http_user_agent
< skipped >
top remote_addr
| remote_addr | count |
|----------------+---------|
| 92.60.188.221 | 6675 |
| 179.33.202.58 | 586 |
| 94.143.189.241 | 547 |
| 81.200.189.9 | 461 |
| 35.193.89.58 | 444 |
| 82.255.18.24 | 313 |
| 78.246.179.170 | 305 |
| 151.16.42.51 | 302 |
| 89.188.6.127 | 292 |
| 84.43.189.112 | 285 |
top http_user_agent
| http_user_agent | count |
|---------------------------------------------------------------------------------------------------------------------------+---------|
| Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.21 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.21 | 6623 |
| Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36 | 4195 |
| Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36 | 3563 |
| Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36 | 2242 |
| Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0 | 1799 |
| Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36 | 1358 |
| Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0 | 1189 |
| Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36 | 1136 |
| Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:65.0) Gecko/20100101 Firefox/65.0 | 1012 |
| Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36 | 973 |
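Conceptually, a "top" report is a frequency count over one field of each log record. The sketch below illustrates the idea with `collections.Counter`; the `top` helper and the sample records are hypothetical and only mirror the field names shown in the tables above.

```python
from collections import Counter


def top(records, field, limit=10):
    """Return the `limit` most frequent values of `field` as (value, count) pairs."""
    values = (record[field] for record in records if field in record)
    return Counter(values).most_common(limit)


# Hypothetical parsed records using documentation-only IP addresses.
records = [
    {"remote_addr": "203.0.113.5", "http_user_agent": "ExampleAgent/1.0"},
    {"remote_addr": "203.0.113.5", "http_user_agent": "ExampleAgent/1.0"},
    {"remote_addr": "198.51.100.7", "http_user_agent": "ExampleAgent/2.0"},
]

if __name__ == "__main__":
    for field in ("remote_addr", "http_user_agent"):
        print("top", field)
        for value, count in top(records, field):
            print(f"  {value}: {count}")
```

A single address or user agent that dominates the ranking, like the first row of each table above, is a natural candidate for further filtering.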
4. Filter by IP address
You can then narrow the results to visits from a single IP address, here 92.60.188.221:
$> platform log access -p <project id> -e master --lines 102400 | ngxtop --no-follow -i 'remote_addr == "92.60.188.221"' top request status http_user_agent
< skipped >
top request
| request | count |
|--------------------------------------------------+---------|
| GET /development/ HTTP/1.1 | 6 |
| POST //index.php/api/xmlrpc HTTP/1.1 | 5 |
| POST //xmlrpc HTTP/1.1 | 5 |
| POST //xmlrpc.php HTTP/1.1 | 5 |
| POST /gitbook/gitbook-plugin-edit-link/ HTTP/1.1 | 5 |
| POST /gitbook/gitbook-plugin-gtm/ HTTP/1.1 | 5 |
| GET /development/logs.html HTTP/1.1 | 4 |
| GET /gettingstarted/tools.html HTTP/1.1 | 4 |
| GET /styles/styles.css HTTP/1.1 | 4 |
| POST //soap.php HTTP/1.1 | 4 |
top status
| status | count |
|----------+---------|
| 502 | 6338 |
| 200 | 223 |
| 403 | 69 |
| 304 | 25 |
| 405 | 20 |
top http_user_agent
| http_user_agent | count |
|-------------------------------------------------------------------------------------------------------------+---------|
| Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.21 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.21 | 6623 |
| Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0 | 51 |
| Googlebot/2.1 (+http://www.googlebot.com/bot.html) | 1 |
From the above output, 92.60.188.221 is generating a lot of bad traffic (6338 of its requests returned 502, and it probes paths such as //xmlrpc.php) and should be blacklisted.
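To quantify how bad this traffic is, the status counts from the "top status" table can be turned into an error share. The snippet below is a small worked calculation using the counts reported above; the `error_share` helper is a hypothetical name introduced for illustration.

```python
# Status counts for 92.60.188.221, copied from the "top status" table.
status_counts = {502: 6338, 200: 223, 403: 69, 304: 25, 405: 20}


def error_share(counts, low=500, high=599):
    """Return the fraction of requests whose status falls within [low, high]."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    errors = sum(n for status, n in counts.items() if low <= status <= high)
    return errors / total


total = sum(status_counts.values())  # 6675, matching the count in "top remote_addr"
share = error_share(status_counts)   # roughly 0.95 for the counts above

if __name__ == "__main__":
    print(f"{total} requests, {share:.1%} returned 5xx")
```

Roughly 95% of the requests from this address ended in a 5xx response, which supports treating it as bad traffic.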
Conclusion
With the help of ngxtop and the Platform.sh CLI, it is easy to perform simple analysis on the Nginx access log and identify bad traffic.