How to derive metrics from the access log via ngxtop

Goal

To derive traffic metrics, such as request counts, status codes, and top visitors, from the Nginx access log via ngxtop.

Assumptions

To complete this, you will need the Platform CLI (the platform command) and pip, which is used to install ngxtop.

Problems

You may need to derive metrics from the access log files in order to identify bad traffic. Analyzing the logs with raw Linux commands alone can be slow and cumbersome.

Steps

1. Check ngxtop installation

Make sure ngxtop is installed locally by running the following command:

pip install --user ngxtop
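
A user-level pip install usually places the ngxtop script in a per-user directory that may not be on your PATH. The following is a minimal sketch for checking this, assuming a POSIX shell and that python3 is the interpreter pip installed into; the helper names check_tool and user_bin_dir are hypothetical and not part of ngxtop or pip.

```shell
# Report whether a command can be found on the current PATH.
# Returns 0 if found, 1 otherwise.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found on PATH" >&2
        return 1
    fi
}

# Print the directory where "pip install --user" normally puts scripts
# on Linux and macOS (assumption: python3 is the interpreter in use).
user_bin_dir() {
    echo "$(python3 -m site --user-base)/bin"
}

# Example:
#   check_tool ngxtop || export PATH="$(user_bin_dir):$PATH"
```

If ngxtop is still not found after adjusting PATH, rerun the pip command above and check its output for the install location.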

2. Get web traffic overview

To get an overview of recent web traffic, use the following command:

platform log access -p <project id> -e master --lines 102400 | ngxtop --no-follow

< skipped >

Summary:
|   count |   avg_bytes_sent |   2xx |   3xx |   4xx |   5xx |
|---------+------------------+-------+-------+-------+-------|
|   41553 |         6342.284 | 27380 |  6474 |   100 |  7599 |

Detailed:
| request_path                                  |   count |   avg_bytes_sent |   2xx |   3xx |   4xx |   5xx |
|-----------------------------------------------+---------+------------------+-------+-------+-------+-------|
| /scripts/user-widget/dist/user-widget.min.js  |    1190 |        29123.529 |   962 |   228 |     0 |     0 |
| /gitbook/gitbook-plugin-codetabs/codetabs.js  |    1185 |          186.957 |   961 |   224 |     0 |     0 |
| /gitbook/style.css                            |    1185 |         8088.093 |   964 |   221 |     0 |     0 |
| /gitbook/gitbook-plugin-edit-link/plugin.js   |    1184 |          340.759 |   961 |   223 |     0 |     0 |
| /gitbook/gitbook-plugin-gtm/plugin.js         |    1184 |          199.188 |   960 |   224 |     0 |     0 |
| /gitbook/gitbook-plugin-atoc/atoc.js          |    1183 |          323.057 |   961 |   222 |     0 |     0 |
| /gitbook/gitbook-plugin-highlight/website.css |    1181 |         2065.165 |   961 |   220 |     0 |     0 |
| /gitbook/gitbook-plugin-reveal/reveal.js      |    1181 |          229.190 |   958 |   223 |     0 |     0 |
| /gitbook/gitbook.js                           |    1181 |        23890.862 |   959 |   222 |     0 |     0 |
| /gitbook/theme.js                             |    1179 |        25528.468 |   957 |   222 |     0 |     0 |
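
The Summary table reports absolute counts, so it can help to turn them into a rate. As a small worked example, the sketch below computes the share of 5xx responses from the figures in the Summary table above; the helper name share is hypothetical.

```shell
# Print the percentage that "part" represents of "total", to two decimals.
share() {
    awk -v part="$1" -v total="$2" 'BEGIN { printf "%.2f\n", 100 * part / total }'
}

# Figures copied from the Summary table above:
# 7599 responses with a 5xx status out of 41553 requests in total.
share 7599 41553   # prints 18.29
```

Roughly 18% of the sampled requests ended in a server error, which is a good reason to look at who is sending them.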

3. View top visitors

You can also list the top visitor IP addresses and HTTP user agents:

platform log access -p <project id> -e master --lines 102400 | ngxtop --no-follow top remote_addr http_user_agent

< skipped >

top remote_addr
| remote_addr    |   count |
|----------------+---------|
| 92.60.188.221  |    6675 |
| 179.33.202.58  |     586 |
| 94.143.189.241 |     547 |
| 81.200.189.9   |     461 |
| 35.193.89.58   |     444 |
| 82.255.18.24   |     313 |
| 78.246.179.170 |     305 |
| 151.16.42.51   |     302 |
| 89.188.6.127   |     292 |
| 84.43.189.112  |     285 |

top http_user_agent
| http_user_agent                                                                                                           |   count |
|---------------------------------------------------------------------------------------------------------------------------+---------|
| Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.21 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.21               |    6623 |
| Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36 |    4195 |
| Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36       |    3563 |
| Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36                 |    2242 |
| Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0                                              |    1799 |
| Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36 |    1358 |
| Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0                                        |    1189 |
| Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36 |    1136 |
| Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:65.0) Gecko/20100101 Firefox/65.0                                            |    1012 |
| Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36 |     973 |
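
Each command in this article downloads the same log lines again. One way to avoid that is to save the lines to a file once and point ngxtop at the file with its -l option. This is only a sketch: the function names snapshot_log and report_from_file and the file name access.log are hypothetical.

```shell
# Save the most recent access log lines to a local file.
# Arguments: project ID, environment, number of lines, output file.
snapshot_log() {
    platform log access -p "$1" -e "$2" --lines "$3" > "$4"
}

# Run an ngxtop report against a saved file instead of a pipe.
# Arguments: log file, then any ngxtop command (for example: top remote_addr).
report_from_file() {
    file="$1"
    shift
    ngxtop -l "$file" --no-follow "$@"
}

# Example (replace the placeholder with your own project ID):
#   snapshot_log "<project id>" master 102400 access.log
#   report_from_file access.log top remote_addr http_user_agent
```

Working from a saved file also guarantees that every report is computed over exactly the same set of requests.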

4. Filter by IP address

You can then narrow the analysis to the requests coming from the busiest IP address, 92.60.188.221:

platform log access -p <project id> -e master --lines 102400 | ngxtop --no-follow -i 'remote_addr == "92.60.188.221"' top request status http_user_agent

< skipped >

top request
| request                                          |   count |
|--------------------------------------------------+---------|
| GET /development/ HTTP/1.1                       |       6 |
| POST //index.php/api/xmlrpc HTTP/1.1             |       5 |
| POST //xmlrpc HTTP/1.1                           |       5 |
| POST //xmlrpc.php HTTP/1.1                       |       5 |
| POST /gitbook/gitbook-plugin-edit-link/ HTTP/1.1 |       5 |
| POST /gitbook/gitbook-plugin-gtm/ HTTP/1.1       |       5 |
| GET /development/logs.html HTTP/1.1              |       4 |
| GET /gettingstarted/tools.html HTTP/1.1          |       4 |
| GET /styles/styles.css HTTP/1.1                  |       4 |
| POST //soap.php HTTP/1.1                         |       4 |

top status
|   status |   count |
|----------+---------|
|      502 |    6338 |
|      200 |     223 |
|      403 |      69 |
|      304 |      25 |
|      405 |      20 |

top http_user_agent
| http_user_agent                                                                                             |   count |
|-------------------------------------------------------------------------------------------------------------+---------|
| Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.21 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.21 |    6623 |
| Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0                                |      51 |
| Googlebot/2.1 (+http://www.googlebot.com/bot.html)                                                          |       1 |

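From the above output, 92.60.188.221 sent 6675 requests, of which 6338 returned status 502, and it probed endpoints such as //xmlrpc.php and //soap.php. This address is generating a lot of bad traffic and should be blacklisted.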
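
To see which paths account for the server errors before blocking an address, the filter expression can combine the IP address with a status condition. The sketch below assumes that ngxtop filter expressions accept a Python-style "and", as in the examples in the ngxtop README; the function name failing_paths_for_ip is hypothetical.

```shell
# Read access log lines on stdin and list the request paths that returned
# a 5xx status to the given client IP address.
failing_paths_for_ip() {
    ngxtop --no-follow -i "remote_addr == \"$1\" and status >= 500" top request_path
}

# Example (replace the placeholder with your own project ID):
#   platform log access -p "<project id>" -e master --lines 102400 | failing_paths_for_ip 92.60.188.221
```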

Conclusion

With the help of ngxtop and the Platform CLI, it is easy to perform a simple analysis of the Nginx access log and identify bad traffic.