|
|
- # Apache HTTPD log parser
-
- Apache/HTTPD command-line log parser for Linux web server administrators.
-
- ## Motivation
-
- Keep it simple. Very simple.
-
- Although advanced and nice-looking log analytic tools such as [Elastic Stack](https://www.elastic.co/products/) exists, I wanted something far more simple and with far less overhead for weekly tasks and for configuring an Apache web server. Therefore, I wrote this simple Python script to parse Apache web server logs.
-
- **Advantages** of this tool are little overhead, piping output to other Unix tools and doing some quick log checks. The main idea is to give desired output for short analysis so that you can properly configure your web server protection mechanisms and network environment based on the actual server data.
-
- This tool is not for intrusion detection/prevention or does not alert administration about hostile penetration attempts. However, it may reveal simple underlying misconfigurations such as invalid URL references on your website.
-
- ## Requirements
-
- Following Python packages (Arch Linux):
-
- ```
- python
- python-apachelogs
- ```
-
- [python-apachelogs](https://github.com/jwodder/apachelogs/) is not available either on Arch Linux repositories or AUR repositories. Therefore, I provide a PKGBUILD file to install it. [python-apachelogs - PKGBUILD](python-apachelogs/PKGBUILD)
-
- `python-apachelogs` has a sub-dependency of [python-pydicti](python-apachelogs/python-pydicti/PKGBUILD) package.
-
- Recommended packages for IP address geo-location:
-
- ```
- geoip
- geoip-database
- ```
-
- ## Installation
-
- Arch Linux:
-
- run `updpkgsums && makepkg -Cfi` in [apache-logparser](apache-logparser/) directory. The command installs `httpd-logparser` executable file in `/usr/bin/` folder.
-
- ## Features
-
- - Multiple Linux distributions supported
- - Supported output formats: `table` and `csv`
- - Use output log entry field ordering
- - Include and exclude log entry fields
- - Date ranges
- - Geo IP lookup for log entries
- - Get origin countries and cities
- - Unknown cities: give coordinates instead
- - Check also: [MaxMind DB Apache Module](https://github.com/maxmind/mod_maxminddb)
- - Output field filters
- - Limit processed log entries with `--head` and `--tail` parameters
- - Get only interesting HTTP response codes
- - Get only interesting countries of origin
- - Process multiple log files at once, either by providing a list of files or matching regex
- - Show processing status
- - Show processing summary
- - List invalid log entries that couldn't be processed
-
- ## Examples
-
- **Q: List unique connections (IP addresses) associated with country and city location data, using the last Apache log file?**
-
- ```
- httpd-logparser --files-list /var/log/httpd/access_log --included-fields time,remote_host,country,city | sort -k 2 -u | sort -k 3
-
- 103.102.153.XXX Indonesia Unknown: -6.175000, 106.828598 2022-06-12 10:33:58
- 103.102.153.XXX Indonesia Unknown: -6.175000, 106.828598 2022-06-12 10:33:59
- 103.102.153.XXX Indonesia Unknown: -6.175000, 106.828598 2022-06-12 10:34:00
- 103.102.153.XXX Indonesia Unknown: -6.175000, 106.828598 2022-06-12 10:34:01
- 103.144.178.XXX Indonesia Unknown: -6.175000, 106.828598 2022-06-16 06:34:19
- 62.214.113.XXX Germany Unterhaching 2022-06-10 14:39:16
- 62.214.113.XXX Germany Unterhaching 2022-06-10 16:34:15
- 62.214.113.XXX Germany Unterhaching 2022-06-10 16:40:03
- 62.214.113.XXX Germany Unterhaching 2022-06-10 16:40:04
- 62.214.113.XXX Germany Unterhaching 2022-06-10 16:40:05
- 84.234.169.XXX Norway Valderoy 2022-06-06 00:20:18
- 194.137.241.XXX Finland Vantaa 2022-06-07 12:20:42
- 194.137.241.XXX Finland Vantaa 2022-06-07 12:20:43
- 194.137.241.XXX Finland Vantaa 2022-06-07 12:20:44
- ...
- 176.108.111.XXX Ukraine Vyzhnytsya 2022-06-07 21:25:38
- 176.108.111.XXX Ukraine Vyzhnytsya 2022-06-07 21:25:39
- 176.108.111.XXX Ukraine Vyzhnytsya 2022-06-07 21:25:40
- 176.108.111.XXX Ukraine Vyzhnytsya 2022-06-07 21:25:41
- 176.108.111.XXX Ukraine Vyzhnytsya 2022-06-07 21:25:42
- 176.108.111.XXX Ukraine Vyzhnytsya 2022-06-08 23:35:25
- 176.108.111.XXX Ukraine Vyzhnytsya 2022-06-11 19:52:42
- 82.207.245.XXX Germany Wachtberg 2022-06-03 02:26:58
- 82.207.245.XXX Germany Wachtberg 2022-06-03 02:27:08
- 82.207.245.XXX Germany Wachtberg 2022-06-03 02:27:09
- 82.207.245.XXX Germany Wachtberg 2022-06-03 02:27:10
- 79.191.159.XXX Poland Warsaw 2022-06-11 18:05:13
- 49.7.20.XXX China Wenzhou 2022-06-09 15:26:26
- 49.7.21.XXX China Wenzhou 2022-06-09 23:25:57
- 49.7.20.XXX China Wenzhou 2022-06-19 01:41:41
- 81.82.244.XXX Belgium Wetteren 2022-06-13 13:45:21
- 81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:10
- 81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:11
- 81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:12
- 81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:13
- 81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:14
- 81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:41
- 81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:46
- 95.223.231.XXX Germany Wiesbaden 2022-06-04 21:42:20
- 95.223.231.XXX Germany Wiesbaden 2022-06-04 21:42:21
- 95.223.231.XXX Germany Wiesbaden 2022-06-04 21:42:23
- 95.223.231.XXX Germany Wiesbaden 2022-06-04 21:42:28
- 37.201.116.XXX Germany Wiesbaden 2022-06-10 19:55:50
- 113.57.152.XXX China Wuhan 2022-06-14 15:51:21
- 113.57.152.XXX China Wuhan 2022-06-14 15:51:22
- 113.57.152.XXX China Wuhan 2022-06-14 15:51:23
- 113.57.152.XXX China Wuhan 2022-06-14 15:51:25
- 113.57.152.XXX China Wuhan 2022-06-14 15:51:26
- 113.57.152.XXX China Wuhan 2022-06-14 15:51:57
- 113.57.152.XXX China Wuhan 2022-06-14 15:51:58
- 113.57.152.XXX China Wuhan 2022-06-14 15:52:01
- 89.164.183.XXX Croatia Zagreb 2022-06-04 11:44:22
- 89.164.183.XXX Croatia Zagreb 2022-06-04 11:44:23
- 89.164.183.XXX Croatia Zagreb 2022-06-04 11:44:24
- 89.164.183.XXX Croatia Zagreb 2022-06-04 11:44:25
- 89.164.183.XXX Croatia Zagreb 2022-06-04 11:44:26
- 86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:49
- 86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:51
- 86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:52
- 86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:53
- 86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:55
- 86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:56
- 86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:59
- 86.32.46.XXX Croatia Zagreb 2022-06-04 16:46:00
- 85.10.56.XXX Croatia Zagreb 2022-06-09 19:39:55
- 85.10.56.XXX Croatia Zagreb 2022-06-17 19:57:56
- 122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:41
- 122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:42
- 122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:43
- 122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:44
- 122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:45
- 122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:46
- 122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:47
- 122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:48
- 122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:49
- 121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:22
- 121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:23
- 121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:24
- 121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:25
- 121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:26
- 121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:27
- 121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:28
- 121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:29
- 121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:30
- 185.113.213.XXX Netherlands Zennewijnen 2022-06-15 11:54:36
- 185.113.213.XXX Netherlands Zennewijnen 2022-06-15 11:54:37
- 185.113.213.XXX Netherlands Zennewijnen 2022-06-15 11:54:39
- ```
-
- NOTE: The last numerical part of all ip addresses are anonymized with `XXX` string.
-
- **Q: How many valid requests from Finland and Sweden occured between 15th - 24th April 2022?**
-
- ```
- httpd-logparser --files-regex "/var/log/httpd/access*log" --included-fields time,http_status,country --sort-by time --status-codes ^20* --day-lower "15-04-2022" --day-upper "24-04-2022" --countries Finland,Sweden --show-stats --show-progress
-
- File count: 5
- Lines in total: 86876
- Processing file: /var/log/httpd/access_log (lines: 23116)
- Processing file: /var/log/httpd/access_log.1 (lines: 21566)
- Processing file: /var/log/httpd/access_log.2 (lines: 13490)
- Processing file: /var/log/httpd/access_log.3 (lines: 13822)
- Processing file: /var/log/httpd/access_log.4 (lines: 14882)
- Processing log entry: 81924 (94.30%)
-
- ...
- 200 Sweden 2022-04-17 21:51:09
- 200 Sweden 2022-04-17 21:51:10
- 200 Sweden 2022-04-17 21:51:10
- 200 Sweden 2022-04-17 23:41:35
- 200 Sweden 2022-04-17 23:41:36
- 200 Sweden 2022-04-17 23:41:36
- 200 Sweden 2022-04-17 23:41:39
- 200 Sweden 2022-04-18 11:23:18
- 200 Sweden 2022-04-19 07:16:25
- 200 Sweden 2022-04-19 07:16:34
- 200 Finland 2022-04-19 11:47:51
- 200 Finland 2022-04-19 11:47:52
- 200 Finland 2022-04-19 11:47:52
- 200 Finland 2022-04-19 11:47:52
- ...
- 200 Finland 2022-04-22 09:51:16
- 200 Finland 2022-04-22 09:51:16
- 200 Finland 2022-04-22 09:51:16
- 200 Finland 2022-04-22 09:51:16
- 200 Finland 2022-04-22 09:51:16
- 200 Finland 2022-04-22 09:51:16
- 200 Finland 2022-04-22 12:38:49
- 200 Finland 2022-04-22 16:53:11
- ...
-
- Processed files: /var/log/httpd/access_log, /var/log/httpd/access_log.1, /var/log/httpd/access_log.2, /var/log/httpd/access_log.3, /var/log/httpd/access_log.4
- Processed log entries: 86876
- Matched log entries: 533
- ```
-
- Answer: 533
-
- **Q: How many redirects have occured since the 1st April 2022 according to two selected log files?**
-
- ```
- httpd-logparser --outfields time http_status country -d /var/log/httpd/ -c ^30* -f access_log* -dl "01-04-2020" --sortby time --stats
-
- httpd-logparser --files-regex /var/log/httpd/access_log.\[2-3\] --included-fields time,http_status,country --sort-by time --status-codes ^30* --day-lower "01-04-2022" --show-stats
-
- ...
- 304 Canada 2022-05-23 01:52:45
- 302 Canada 2022-05-23 01:53:33
- 302 Europe 2022-05-23 01:56:03
- 302 Poland 2022-05-23 02:00:31
- 302 Russian Federation 2022-05-23 02:52:50
- 302 United States 2022-05-23 04:34:30
- 302 France 2022-05-23 04:51:31
- 302 Germany 2022-05-23 05:02:16
- 302 Russian Federation 2022-05-23 05:04:13
- 302 Russian Federation 2022-05-23 05:04:14
- 302 Russian Federation 2022-05-23 05:04:14
- 302 United States 2022-05-23 05:11:10
- 302 United States 2022-05-23 05:11:11
- 302 Russian Federation 2022-05-23 05:23:09
- 302 China 2022-05-23 05:54:41
- ...
- 302 Germany 2022-05-31 19:53:18
- 302 Germany 2022-05-31 19:53:18
- 302 Germany 2022-05-31 19:53:18
- 302 Germany 2022-05-31 19:53:19
- 302 Germany 2022-05-31 19:53:19
- 304 Finland 2022-05-31 20:06:55
- 304 Finland 2022-05-31 20:16:02
- 304 Finland 2022-05-31 20:16:03
- 304 Finland 2022-05-31 20:16:06
- 302 Russian Federation 2022-05-31 20:40:33
- 302 United Kingdom 2022-05-31 21:09:32
- 302 China 2022-05-31 21:13:38
- 302 Russian Federation 2022-05-31 21:20:09
- 302 Romania 2022-05-31 22:01:31
- 304 United States 2022-05-31 22:11:30
- 302 Russian Federation 2022-05-31 22:59:23
- 302 United States 2022-05-31 23:16:52
- 304 Ukraine 2022-05-31 23:22:50
- 302 Russian Federation 2022-05-31 23:30:51
- 302 Netherlands 2022-05-31 23:37:10
- 302 Netherlands 2022-05-31 23:37:11
- 302 Netherlands 2022-05-31 23:37:12
-
- Processed files: /var/log/httpd/access_log.2, /var/log/httpd/access_log.3
- Processed log entries: 77730
- Matched log entries: 6788
-
- Invalid lines:
- File: /var/log/httpd/access_log.2, line: 24668
-
- ```
-
- Answer: 6788
-
- You should also check any invalid log lines detected by the tool.
-
- **Q: How many `4XX` codes have connected clients from China and United States produced?**
-
- ```
- httpd-logparser --files-regex "/var/log/httpd/access*log" --included-fields time,country,http_status,http_request --countries "United States",China --sort-by time --status-codes ^4 --show-progress --show-stats
-
- File count: 2
- Lines in total: 23614
- Processing file: /var/log/httpd/access_log (lines: 12021)
- Processing file: /var/log/httpd/access_log.1 (lines: 11593)
- Processing log entry: 18423 (78.01%)
- ...
-
- 408 United States 2022-06-01 03:45:18 None
- 408 United States 2022-06-01 03:45:18 None
- 408 United States 2022-06-01 09:11:15 None
- 408 United States 2022-06-01 11:36:05 None
- 408 United States 2022-06-01 11:36:05 None
- 421 United States 2022-06-01 13:08:29 GET / HTTP/1.1
- 408 United States 2022-06-01 19:44:42 None
- 408 United States 2022-06-01 19:44:42 None
- 408 China 2022-06-02 06:30:51 None
- 408 China 2022-06-02 06:30:51 None
- 408 China 2022-06-02 06:30:51 None
- 408 United States 2022-06-02 11:45:57 None
- 408 United States 2022-06-02 11:46:05 None
- 408 United States 2022-06-02 11:46:18 None
- 408 United States 2022-06-02 20:53:49 None
- 408 United States 2022-06-02 20:53:49 None
- 408 United States 2022-06-03 00:01:39 None
- 408 United States 2022-06-03 00:02:04 None
- 408 United States 2022-06-03 00:02:37 None
- 408 United States 2022-06-03 00:21:26 None
- 408 China 2022-06-03 11:39:22 None
- 408 United States 2022-06-03 15:41:34 None
- 408 United States 2022-06-04 01:28:08 None
- 408 United States 2022-06-04 07:29:53 None
- 408 United States 2022-06-04 07:29:56 None
- 408 United States 2022-06-04 07:29:56 None
- 408 United States 2022-06-04 11:25:10 None
- 408 United States 2022-06-04 11:25:10 None
- 408 China 2022-06-04 11:37:11 None
- 408 United States 2022-06-04 17:36:35 None
- 408 China 2022-06-05 15:56:35 None
- 408 China 2022-06-05 15:56:45 None
- 408 United States 2022-06-06 01:32:25 None
- 408 United States 2022-06-06 01:32:25 None
- 408 United States 2022-06-06 01:32:29 None
- ...
-
- Processed files: /var/log/httpd/access_log, /var/log/httpd/access_log.1
- Processed log entries: 23614
- Matched log entries: 112
- ```
-
- Answer: 112
-
- **Q: Which user agents clients have used recently?**
-
- ```
- httpd-logparser --files-list /var/log/httpd/access_log --included-fields user_agent | sort -u
-
- facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
- fasthttp
- Go-http-client/1.1
- HTTP Banner Detection (https://security.ipip.net)
- kubectl/v1.12.0 (linux/amd64) kubernetes/0ed3388
- libwww-perl/5.833
- libwww-perl/6.06
- libwww-perl/6.43
- Microsoft Office Word 2014
- Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
- Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
- Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50728)
- Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 10.0; Win64; x64; Trident/7.0; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729; Tablet PC 2.0)
- Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.2)
- ...
- ...
- Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0
- Mozilla/5.0 (X11; Linux x86_64; rv:73.0) Gecko/20100101 Firefox/73.0
- Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:58.0) Gecko/20100101 Firefox/58.0
- Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0
- Mozilla/5.0 zgrab/0.x
- Mozilla/5.0 zgrab/0.x (compatible; Researchscan/t12sns; +http://researchscan.comsys.rwth-aachen.de)
- Mozilla/5.0 zgrab/0.x (compatible; Researchscan/t13rl; +http://researchscan.comsys.rwth-aachen.de)
- NetSystemsResearch studies the availability of various services across the internet. Our website is netsystemsresearch.com
- None
- python-requests/1.2.3 CPython/2.7.16 Linux/4.14.165-102.185.amzn1.x86_64
- python-requests/2.10.0
- python-requests/2.19.1
- python-requests/2.22.0
- python-requests/2.23.0
- python-requests/2.6.0 CPython/2.7.5 Linux/3.10.0-1062.12.1.el7.x86_64
- python-requests/2.6.0 CPython/2.7.5 Linux/3.10.0-1062.18.1.el7.x86_64
- Python-urllib/3.7
- Ruby
- Wget/1.19.4 (linux-gnu)
- WinHTTP/1.1
- ```
-
- **Q: Which is time difference between single client requests? Exclude Finland. Include all access_log files.**
-
- ```
- httpd-logparser --included-fields http_status,time,time_diff,country --countries "\!Finland" --files-regex "/var/log/httpd/old/access*log"
-
- 200 Taiwan 2022-06-19 12:21:47 NEW_CONN
- 200 Taiwan 2022-06-19 12:21:48 +1
- 200 Taiwan 2022-06-19 12:21:49 +1
- 200 Taiwan 2022-06-19 12:21:49 0
- 200 Taiwan 2022-06-19 12:21:49 0
- 200 Taiwan 2022-06-19 12:21:49 0
- 200 Taiwan 2022-06-19 12:21:50 +1
- 200 Taiwan 2022-06-19 12:21:49 -1
- 200 Taiwan 2022-06-19 12:21:49 0
- 200 Taiwan 2022-06-19 12:21:50 +1
- 200 Taiwan 2022-06-19 12:21:50 0
- 200 Taiwan 2022-06-19 12:21:50 0
- 200 Taiwan 2022-06-19 12:21:51 +1
- 200 Taiwan 2022-06-19 12:21:56 +5
- 200 Taiwan 2022-06-19 12:22:04 +8
- 200 Taiwan 2022-06-19 12:22:05 +1
- 200 Taiwan 2022-06-19 12:22:06 +1
- 200 Taiwan 2022-06-19 12:22:06 0
- 200 Taiwan 2022-06-19 12:22:06 0
- 302 Taiwan 2022-06-19 12:22:07 +1
- 200 Taiwan 2022-06-19 12:22:07 0
- 200 Taiwan 2022-06-19 12:22:07 0
- 200 Taiwan 2022-06-19 12:22:07 0
- 200 Taiwan 2022-06-19 12:22:07 0
- 200 Taiwan 2022-06-19 12:22:07 0
- 200 Taiwan 2022-06-19 12:22:14 +7
- 200 Taiwan 2022-06-19 12:22:14 0
- 200 Japan 2022-06-19 12:34:49 NEW_CONN
- 200 Japan 2022-06-19 12:34:54 +5
- 200 United States 2022-06-19 12:55:44 NEW_CONN
- 200 United States 2022-06-19 12:55:44 0
- 200 United States 2022-06-19 12:55:50 +6
- 200 United States 2022-06-19 12:55:55 +5
- 302 France 2022-06-19 13:01:30 NEW_CONN
- 200 United States 2022-06-19 13:10:07 NEW_CONN
- 200 United States 2022-06-19 13:10:12 +5
- 302 China 2022-06-19 13:15:59 NEW_CONN
- 302 China 2022-06-19 13:16:10 +11
- 302 China 2022-06-19 13:16:11 +1
- 200 Germany 2022-06-19 13:27:42 NEW_CONN
- 200 Hong Kong 2022-06-19 13:40:02 NEW_CONN
- 200 Hong Kong 2022-06-19 13:40:02 0
- 200 Hong Kong 2022-06-19 13:40:02 0
- ...
- 200 India 2022-06-19 13:45:03 NEW_CONN
- 200 India 2022-06-19 13:45:04 +1
- 200 India 2022-06-19 13:45:04 0
- 200 India 2022-06-19 13:45:04 0
- 200 India 2022-06-19 13:45:04 0
- 200 India 2022-06-19 13:45:05 +1
- 200 India 2022-06-19 13:45:05 0
- 200 India 2022-06-19 13:45:05 0
- ...
- ```
-
- **Get CSV formatted output, selected fields only, use day limit, process last 100 server log entries. Print header information.**
-
- ```
- httpd-logparser --files-list /var/log/httpd/access_log --geo-location --sort-by time --included-fields time,country,city,http_request --day-lower 27-06-2022 --verbose --tail 100 --output csv --print-header
-
- Date/Time,Country,City,Request
- ...
- 2022-06-27 23:33:14,United States,Unknown: 37.750999, -97.821999,GET /git/explore/repos?sort=recentupdate&q=dds-format&tab= HTTP/1.1
- 2022-06-27 23:33:16,United States,Unknown: 37.750999, -97.821999,GET /git/explore/repos?sort=reversealphabetically&q=transmission&tab= HTTP/1.1
- 2022-06-27 23:33:19,United States,Unknown: 37.750999, -97.821999,GET /git/explore/repos?sort=feweststars&q=real-time-strategy&tab= HTTP/1.1
- 2022-06-27 23:33:21,United States,Unknown: 37.750999, -97.821999,GET /git/explore/repos?sort=feweststars&q=shell-script&tab= HTTP/1.1
- 2022-06-27 23:34:28,United States,Austin,GET /XXX HTTP/1.1
- 2022-06-27 23:34:28,United States,Austin,GET /css/XXX HTTP/1.1
- 2022-06-27 23:34:28,United States,Austin,GET /css/XXX HTTP/1.1
- 2022-06-27 23:34:28,United States,Austin,GET /js/XXX HTTP/1.1
- 2022-06-27 23:34:29,United States,Austin,GET /js/XXX HTTP/1.1
- 2022-06-27 23:34:29,United States,Austin,GET /js/XXX HTTP/1.1
- 2022-06-27 23:34:29,United States,Austin,GET /images/XXX HTTP/1.1
- 2022-06-27 23:34:29,United States,Austin,GET /js/XXX HTTP/1.1
- 2022-06-27 23:34:30,United States,Austin,GET /images/XXX HTTP/1.1
- 2022-06-27 23:34:30,United States,Austin,GET /images/XXX HTTP/1.1
- 2022-06-27 23:34:30,United States,Austin,GET /images/XXX HTTP/1.1
- 2022-06-27 23:34:30,United States,Austin,GET /images/XXX HTTP/1.1
- 2022-06-27 23:34:30,United States,Austin,GET /images/XXX HTTP/1.1
- 2022-06-27 23:34:30,United States,Austin,GET /images/XXX HTTP/1.1
- 2022-06-27 23:34:31,United States,Austin,GET /images/XXX HTTP/1.1
- 2022-06-27 23:34:31,United States,Austin,GET /images/XXX HTTP/1.1
- 2022-06-27 23:34:31,United States,Austin,GET /images/XXX HTTP/1.1
- 2022-06-27 23:34:31,United States,Austin,GET /images/XXX HTTP/1.1
- 2022-06-27 23:34:31,United States,Austin,GET /images/XXX HTTP/1.1
- 2022-06-27 23:34:31,United States,Austin,GET /images/XXX HTTP/1.1
- 2022-06-27 23:34:31,United States,Austin,GET /images/XXX HTTP/1.1
- 2022-06-27 23:34:31,United States,Austin,GET /webfonts/XXX HTTP/1.1
- 2022-06-27 23:34:31,United States,Austin,GET /webfonts/XXX HTTP/1.1
- 2022-06-27 23:34:31,United States,Austin,GET /images/XXX HTTP/1.1
- 2022-06-27 23:34:31,United States,Austin,GET /webfonts/XXX HTTP/1.1
- 2022-06-27 23:34:32,United States,Austin,GET /images/XXX HTTP/1.1
- 2022-06-27 23:34:32,United States,Austin,GET / HTTP/1.1
- 2022-06-27 23:34:32,United States,Austin,GET /images/favicon-32x32.png HTTP/1.1
- 2022-06-27 23:34:32,United States,Austin,GET /XXX HTTP/1.1
- 2022-06-27 23:34:37,United States,Austin,GET /images/favicon-32x32.png HTTP/1.1
- 2022-06-27 23:34:59,United States,Austin,None
- 2022-06-27 23:35:02,Germany,Unknown: 51.299301, 9.490900,GET /git/ HTTP/1.1
- 2022-06-27 23:35:04,United States,Austin,None
- ```
-
- ## Usage
-
- ```
- usage: httpd-logparser [-h] [-fr [FILES_REGEX]] [-f [FILES_LIST]] [-c CODES [CODES ...]] [-cf [COUNTRIES]] [-tf [TIME_FORMAT]] [-if [INCL_FIELDS]]
- [-ef [EXCL_FIELDS]] [-gl] [-ge [GEOTOOL_EXEC]] [-gd [GEO_DATABASE_LOCATION]] [-dl [DATE_LOWER]] [-du [DATE_UPPER]] [-sb [SORTBY_FIELD]]
- [-ro] [-st] [-p] [--httpd-conf-file] [--httpd-log-nickname] [-lf LOG_FORMAT] [-ph] [--output-format {table,csv}]
- [--head [READ_FIRST_LINES_NUM]] [--tail [READ_LAST_LINES_NUM]] [--sort-logs-by {date,size,name}] [--verbose]
-
- Apache HTTPD server log parser
-
- optional arguments:
- -h, --help show this help message and exit
- -fr [FILES_REGEX], --files-regex [FILES_REGEX]
- Apache log files matching input regular expression. (default: None)
- -f [FILES_LIST], --files-list [FILES_LIST]
- Apache log files. Regular expressions supported. (default: None)
- -c CODES [CODES ...], --status-codes CODES [CODES ...]
- Print only these numerical status codes. Regular expressions supported. (default: None)
- -cf [COUNTRIES], --countries [COUNTRIES]
- Include only these countries. Negative match (exclude): "\!Country" (default: None)
- -tf [TIME_FORMAT], --time-format [TIME_FORMAT]
- Output time format. (default: %d-%m-%Y %H:%M:%S)
- -if [INCL_FIELDS], --included-fields [INCL_FIELDS]
- Included fields. All fields: all, log_file_name, http_status, remote_host, country, city, time, time_diff, user_agent, http_request
- (default: http_status,remote_host,time,time_diff,user_agent,http_request)
- -ef [EXCL_FIELDS], --excluded-fields [EXCL_FIELDS]
- Excluded fields. (default: None)
- -gl, --geo-location Check origin countries with external "geoiplookup" tool. NOTE: Automatically includes "country" and "city" fields. (default: False)
- -ge [GEOTOOL_EXEC], --geotool-exec [GEOTOOL_EXEC]
- "geoiplookup" tool executable found in PATH. (default: geoiplookup)
- -gd [GEO_DATABASE_LOCATION], --geo-database-dir [GEO_DATABASE_LOCATION]
- Database file directory for "geoiplookup" tool. (default: /usr/share/GeoIP/)
- -dl [DATE_LOWER], --day-lower [DATE_LOWER]
- Do not check log entries older than this day. Day syntax: 31-12-2020 (default: None)
- -du [DATE_UPPER], --day-upper [DATE_UPPER]
- Do not check log entries newer than this day. Day syntax: 31-12-2020 (default: None)
- -sb [SORTBY_FIELD], --sort-by [SORTBY_FIELD]
- Sort by an output field. (default: None)
- -ro, --reverse Sort in reverse order. (default: False)
- -st, --show-stats Show short statistics at the end. (default: False)
- -p, --show-progress Show progress information. (default: False)
- --httpd-conf-file Apache HTTPD configuration file with LogFormat directive. (default: /etc/httpd/conf/httpd.conf)
- --httpd-log-nickname LogFormat directive nickname (default: combinedio)
- -lf LOG_FORMAT, --log-format LOG_FORMAT
- Log format, manually defined. (default: None)
- -ph, --print-headers Print column headers. (default: False)
- --output-format {table,csv}
- Output format for results. (default: table)
- --head [READ_FIRST_LINES_NUM]
- Read first N lines from all log entries. (default: None)
- --tail [READ_LAST_LINES_NUM]
- Read last N lines from all log entries. (default: None)
- --sort-logs-by {date,size,name}
- Sorting order for input log files. (default: name)
- --verbose Verbose output. (default: False)
- ```
-
- ## License
-
- GPLv3.
|