@ -8,11 +8,11 @@ Unix-alike systems only.
Keep it simple. Very simple.
Keep it simple. Very simple.
Although advanced and nice-looking log analytics tools such as [Elastic Stack](https://www.elastic.co/products/) exist (I have used it), I wanted something far simpler and with far less overhead for weekly tasks and for configuring an Apache web server. Therefore, I wrote this simple Python script to parse Apache web server logs.
Although advanced and nice-looking log analytics tools such as [Elastic Stack](https://www.elastic.co/products/) exist, I wanted something far simpler and with far less overhead for weekly tasks and for configuring an Apache web server. Therefore, I wrote this simple Python script to parse Apache web server logs.
**Advantages** of this tool are low overhead, output that can be piped to other Unix tools, and quick log checks. The main idea is to produce the desired output for a short analysis so that you can properly configure your web server protection mechanisms and network environment based on actual server data.
**Advantages** of this tool are low overhead, output that can be piped to other Unix tools, and quick log checks. The main idea is to produce the desired output for a short analysis so that you can properly configure your web server protection mechanisms and network environment based on actual server data.
This tool is not intended for intrusion detection or prevention, and it does not alert administrators about hostile penetration attempts. However, it may reveal simple underlying misconfigurations such as invalid URL references on your site.
This tool is not intended for intrusion detection or prevention, and it does not alert administrators about hostile penetration attempts. However, it may reveal simple underlying misconfigurations such as invalid URL references on your website.
## Requirements
## Requirements
@ -38,209 +38,274 @@ geoip-database
Arch Linux:
Arch Linux:
Run `updpkgsums && makepkg -Cfi` in the [apache-logparser](apache-logparser/) directory. Installs the `httpd-logparser` executable file in the `/usr/bin/` folder.
Run `updpkgsums && makepkg -Cfi` in the [apache-logparser](apache-logparser/) directory. The command installs the `httpd-logparser` executable file in the `/usr/bin/` folder.
## Supported output formats
- `table` and `csv`
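
The `csv` format is convenient when the results are post-processed in a script rather than read in a terminal. The following is a minimal sketch only: it assumes the CSV output is comma-separated and that the first row names the columns (for example when `--print-headers` is used). The sample rows, the column order, and the helper `count_by_field` are hypothetical and are not captured tool output.

```python
import csv
import io

# Hypothetical sample of CSV output. The real delimiter and column order
# may differ, so check the actual output of the tool first.
SAMPLE = """http_status,country,time
200,Finland,2022-04-17 08:47:05
302,Germany,2022-05-31 19:53:18
200,Finland,2022-04-17 08:47:06
"""


def count_by_field(csv_text, field):
    """Count rows per distinct value of one named column."""
    counts = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[row[field]] = counts.get(row[field], 0) + 1
    return counts


if __name__ == "__main__":
    print(count_by_field(SAMPLE, "country"))
```

In practice the text would come from a file or from standard input instead of the inline `SAMPLE` string.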
## Examples
## Examples
**Q: Can you list unique connections (IP addresses) together with their country and city location data, using the most recent Apache log file?**
**Q: How can I list unique connections (IP addresses) together with their country and city location data, using the most recent Apache log file?**
```
```
httpd-logparser --outfields time remote_host country city -d /var/log/httpd/ -f access_log$ -np --stats | sort -k 3 -u | sort -k 4
httpd-logparser --files-list /var/log/httpd/access_log --included-fields time,remote_host,country,city | sort -k 2 -u | sort -k 3
103.102.153.XXX Indonesia Unknown: -6.175000, 106.828598 2022-06-12 10:33:58
Processed files: access_log
103.102.153.XXX Indonesia Unknown: -6.175000, 106.828598 2022-06-12 10:33:59
Matched log entries: 724
103.102.153.XXX Indonesia Unknown: -6.175000, 106.828598 2022-06-12 10:34:00
Processed log entries: 724
103.102.153.XXX Indonesia Unknown: -6.175000, 106.828598 2022-06-12 10:34:01
2021-06-06 10:00:57 135.23.195.XXX Canada Quebec
103.144.178.XXX Indonesia Unknown: -6.175000, 106.828598 2022-06-16 06:34:19
2021-06-06 04:58:58 8.210.233.XXX China Guangzhou
62.214.113.XXX Germany Unterhaching 2022-06-10 14:39:16
2021-06-06 05:01:37 23.228.109.XXX China Shanghai
62.214.113.XXX Germany Unterhaching 2022-06-10 16:34:15
2021-06-06 04:49:57 8.210.71.XXX China Unknown: 34.772499, 113.726601
62.214.113.XXX Germany Unterhaching 2022-06-10 16:40:03
2021-06-06 09:47:32 92.151.100.XXX France Boulogne-Billancourt
62.214.113.XXX Germany Unterhaching 2022-06-10 16:40:04
2021-06-06 02:05:38 195.154.122.XXX France Ivry-sur-Seine
62.214.113.XXX Germany Unterhaching 2022-06-10 16:40:05
2021-06-06 03:24:22 92.116.45.XXX Germany Bielefeld
84.234.169.XXX Norway Valderoy 2022-06-06 00:20:18
2021-06-06 06:06:58 207.154.218.XXX Germany Frankfurt am Main
194.137.241.XXX Finland Vantaa 2022-06-07 12:20:42
2021-06-06 10:45:40 172.105.77.XXX Germany Frankfurt am Main
194.137.241.XXX Finland Vantaa 2022-06-07 12:20:43
2021-06-06 00:25:20 92.116.52.XXX Germany Hamm
194.137.241.XXX Finland Vantaa 2022-06-07 12:20:44
2021-06-06 05:02:54 159.69.10.XXX Germany Mannheim
...
2021-06-06 06:24:55 89.246.127.XXX Germany Schloss Holte-Stukenbrock
176.108.111.XXX Ukraine Vyzhnytsya 2022-06-07 21:25:38
2021-06-06 10:08:21 138.201.56.XXX Germany Unknown: 51.299301, 9.490900
176.108.111.XXX Ukraine Vyzhnytsya 2022-06-07 21:25:39
2021-06-06 03:42:02 47.31.198.XXX India Delhi
176.108.111.XXX Ukraine Vyzhnytsya 2022-06-07 21:25:40
2021-06-06 00:15:16 92.118.160.XXX Lithuania Unknown: 56.000000, 24.000000
176.108.111.XXX Ukraine Vyzhnytsya 2022-06-07 21:25:41
2021-06-06 02:10:21 92.118.160.XXX Lithuania Unknown: 56.000000, 24.000000
176.108.111.XXX Ukraine Vyzhnytsya 2022-06-07 21:25:42
2021-06-06 02:32:48 92.118.160.XXX Lithuania Unknown: 56.000000, 24.000000
176.108.111.XXX Ukraine Vyzhnytsya 2022-06-08 23:35:25
2021-06-06 03:26:22 92.118.160.XXX Lithuania Unknown: 56.000000, 24.000000
176.108.111.XXX Ukraine Vyzhnytsya 2022-06-11 19:52:42
2021-06-06 06:52:23 92.118.160.XXX Lithuania Unknown: 56.000000, 24.000000
82.207.245.XXX Germany Wachtberg 2022-06-03 02:26:58
2021-06-06 07:00:48 92.118.160.XXX Lithuania Unknown: 56.000000, 24.000000
82.207.245.XXX Germany Wachtberg 2022-06-03 02:27:08
2021-06-06 11:10:59 92.118.160.XXX Lithuania Unknown: 56.000000, 24.000000
82.207.245.XXX Germany Wachtberg 2022-06-03 02:27:09
2021-06-06 00:23:05 92.118.160.XXX Lithuania Unknown: 56.000000, 24.000000
82.207.245.XXX Germany Wachtberg 2022-06-03 02:27:10
2021-06-06 02:46:33 92.118.160.XXX Lithuania Unknown: 56.000000, 24.000000
79.191.159.XXX Poland Warsaw 2022-06-11 18:05:13
2021-06-06 05:11:20 45.131.212.XXX Netherlands Amsterdam
49.7.20.XXX China Wenzhou 2022-06-09 15:26:26
2021-06-06 05:12:40 185.180.143.XXX Portugal Unknown: 38.705700, -9.135900
49.7.21.XXX China Wenzhou 2022-06-09 23:25:57
2021-06-06 07:55:47 89.137.179.XXX Romania Timisoara
49.7.20.XXX China Wenzhou 2022-06-19 01:41:41
2021-06-06 06:10:46 91.243.100.XXX Russian Federation Novocherkassk
81.82.244.XXX Belgium Wetteren 2022-06-13 13:45:21
2021-06-06 11:30:51 213.177.208.XXX Spain Palencia
81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:10
2021-06-06 01:41:48 184.22.158.XXX Thailand Thalang
81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:11
2021-06-06 08:14:41 176.88.78.XXX Turkey Ankara
81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:12
2021-06-06 08:32:04 212.82.66.XXX United Kingdom Burnham
81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:13
2021-06-06 03:53:41 45.146.164.XXX United Kingdom London
81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:14
2021-06-06 04:33:42 185.158.250.XXX United Kingdom Manchester
81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:41
2021-06-06 10:16:19 82.10.88.XXX United Kingdom Shrewsbury
81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:46
2021-06-06 10:14:28 40.77.189.XXX United States Chicago
95.223.231.XXX Germany Wiesbaden 2022-06-04 21:42:20
2021-06-06 08:16:07 69.170.221.XXX United States Colorado Springs
95.223.231.XXX Germany Wiesbaden 2022-06-04 21:42:21
2021-06-06 10:57:25 192.241.206.XXX United States San Francisco
95.223.231.XXX Germany Wiesbaden 2022-06-04 21:42:23
2021-06-06 01:09:16 128.14.209.XXX United States Unknown: 37.750999, -97.821999
95.223.231.XXX Germany Wiesbaden 2022-06-04 21:42:28
2021-06-06 06:44:49 47.243.113.XXX United States Unknown: 37.750999, -97.821999
37.201.116.XXX Germany Wiesbaden 2022-06-10 19:55:50
2021-06-06 06:45:48 47.243.116.XXX United States Unknown: 37.750999, -97.821999
113.57.152.XXX China Wuhan 2022-06-14 15:51:21
2021-06-06 08:00:40 162.244.34.XXX United States Unknown: 37.750999, -97.821999
113.57.152.XXX China Wuhan 2022-06-14 15:51:22
2021-06-06 10:30:53 47.242.214.XXX United States Unknown: 37.750999, -97.821999
113.57.152.XXX China Wuhan 2022-06-14 15:51:23
2021-06-06 04:22:27 162.244.33.XXX United States Unknown: 37.750999, -97.821999
113.57.152.XXX China Wuhan 2022-06-14 15:51:25
2021-06-06 04:34:47 47.243.48.XXX United States Unknown: 37.750999, -97.821999
113.57.152.XXX China Wuhan 2022-06-14 15:51:26
2021-06-06 06:37:16 47.243.109.XXX United States Unknown: 37.750999, -97.821999
113.57.152.XXX China Wuhan 2022-06-14 15:51:57
2021-06-06 06:42:37 162.244.33.XXX United States Unknown: 37.750999, -97.821999
113.57.152.XXX China Wuhan 2022-06-14 15:51:58
2021-06-06 06:44:49 47.243.109.XXX United States Unknown: 37.750999, -97.821999
113.57.152.XXX China Wuhan 2022-06-14 15:52:01
2021-06-06 07:04:20 47.243.113.XXX United States Unknown: 37.750999, -97.821999
89.164.183.XXX Croatia Zagreb 2022-06-04 11:44:22
2021-06-06 07:44:23 47.243.110.XXX United States Unknown: 37.750999, -97.821999
89.164.183.XXX Croatia Zagreb 2022-06-04 11:44:23
2021-06-06 08:29:33 47.242.12.XXX United States Unknown: 37.750999, -97.821999
89.164.183.XXX Croatia Zagreb 2022-06-04 11:44:24
2021-06-06 10:38:15 128.14.133.XXX United States Unknown: 37.750999, -97.821999
89.164.183.XXX Croatia Zagreb 2022-06-04 11:44:25
2021-06-06 03:18:25 23.95.132.XXX United States Unknown: 37.750999, -97.821999
89.164.183.XXX Croatia Zagreb 2022-06-04 11:44:26
2021-06-06 04:13:55 128.1.248.XXX United States Unknown: 37.750999, -97.821999
86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:49
2021-06-06 08:21:11 64.62.197.XXX United States Unknown: 37.750999, -97.821999
86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:51
2021-06-06 11:17:33 47.243.95.XXX United States Unknown: 37.750999, -97.821999
86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:52
2021-06-06 08:03:24 167.56.236.XXX Uruguay Castillos
86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:53
86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:55
86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:56
86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:59
86.32.46.XXX Croatia Zagreb 2022-06-04 16:46:00
85.10.56.XXX Croatia Zagreb 2022-06-09 19:39:55
85.10.56.XXX Croatia Zagreb 2022-06-17 19:57:56
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:41
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:42
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:43
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:44
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:45
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:46
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:47
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:48
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:49
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:22
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:23
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:24
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:25
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:26
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:27
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:28
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:29
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:30
185.113.213.XXX Netherlands Zennewijnen 2022-06-15 11:54:36
185.113.213.XXX Netherlands Zennewijnen 2022-06-15 11:54:37
185.113.213.XXX Netherlands Zennewijnen 2022-06-15 11:54:39
```
```
NOTE: The last octet of every IP address is anonymized with the string `XXX`.
NOTE: The last octet of every IP address is anonymized with the string `XXX`.
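
The README does not say how the addresses were anonymized. One possible way to do it as a post-processing step is sketched below; the function name `anonymize_ipv4` and the regular expression are assumptions made for illustration, and only IPv4 addresses are handled.

```python
import re

# Matches a dotted IPv4 address and captures the first three octets
# (including their trailing dots) in group 1.
IPV4_RE = re.compile(r"\b((?:\d{1,3}\.){3})\d{1,3}\b")


def anonymize_ipv4(text):
    """Replace the last octet of every IPv4 address in text with XXX."""
    return IPV4_RE.sub(r"\1XXX", text)


if __name__ == "__main__":
    # 192.0.2.0/24 is a documentation-only address range.
    print(anonymize_ipv4("2021-06-06 10:00:57 192.0.2.17 Canada Quebec"))
```

Coordinates such as `-6.175000, 106.828598` are left untouched because they do not consist of four dot-separated groups of one to three digits.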
**Q: How many valid requests from Finland and Sweden occurred between 15th and 24th April 2020?**
**Q: How many valid requests from Finland and Sweden occurred between 15th and 24th April 2022?**
```
```
httpd-logparser --outfields time http_status country -d /var/log/httpd/ -c ^20* -f access_log* -cf Finland Sweden -dl "15-04-2020" -du "24-04-2020" --sortby time --stats
httpd-logparser --files-regex /var/log/httpd/access_log --included-fields time,http_status,country --sort-by time --status-codes ^20* --day-lower "15-04-2022" --day-upper "24-04-2022" --show-stats --show-progress
File count: 5
Lines in total: 151134
Processing file: access_log
Processing file: /var/log/httpd/access_log (lines: 40213)
Processing file: access_log.1
Processing file: /var/log/httpd/access_log.1 (lines: 37518)
Processing file: access_log.2
Processing file: /var/log/httpd/access_log.2 (lines: 23468)
Processing file: access_log.3
Processing file: /var/log/httpd/access_log.3 (lines: 24045)
Processing file: access_log.4
Processing file: /var/log/httpd/access_log.4 (lines: 25890)
Processing log entry: 883
Processing log entry: 142524 (94.30%)
2020-04-17 08:47:05 200 Finland
2020-04-17 08:47:05 200 Finland
2020-04-17 08:47:05 200 Finland
2020-04-17 08:47:05 200 Finland
2020-04-17 08:47:05 200 Finland
2020-04-17 08:47:05 200 Finland
2020-04-17 08:47:05 200 Finland
...
...
200 Italy 2022-04-22 20:54:39
200 Italy 2022-04-22 20:54:39
200 Italy 2022-04-22 20:54:39
200 Italy 2022-04-22 20:54:39
200 Italy 2022-04-22 20:54:39
200 Taiwan 2022-04-22 21:00:27
200 Taiwan 2022-04-22 21:00:28
200 Taiwan 2022-04-22 21:00:29
200 Taiwan 2022-04-22 21:00:29
...
...
2020-04-23 18:04:07 200 Finland
200 United States 2022-04-23 22:29:36
2020-04-23 18:04:07 200 Finland
200 United States 2022-04-23 22:38:50
2020-04-23 18:04:07 200 Finland
200 United States 2022-04-23 23:06:15
2020-04-23 18:04:07 200 Finland
200 United States 2022-04-23 23:14:08
2020-04-23 18:04:07 200 Finland
200 United States 2022-04-23 23:14:09
2020-04-23 18:04:07 200 Finland
200 United States 2022-04-23 23:24:21
2020-04-23 18:04:08 200 Finland
200 United States 2022-04-23 23:24:22
200 Canada 2022-04-23 23:31:52
Processed files: access_log, access_log.1, access_log.2, access_log.3, access_log.4
200 Ireland 2022-04-23 23:32:05
Processed log entries: 883
200 United States 2022-04-23 23:33:36
Matched log entries: 211
200 Vietnam 2022-04-23 23:53:37
Processed files: /var/log/httpd/access_log, /var/log/httpd/access_log.1, /var/log/httpd/access_log.2, /var/log/httpd/access_log.3, /var/log/httpd/access_log.4
Processed log entries: 151134
Matched log entries: 9012
```
```
Answer: 9012
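
The day limits use the `31-12-2020` syntax, which corresponds to the `strptime` pattern `%d-%m-%Y`. The sketch below shows how such a day-range filter can be expressed in Python. It is not taken from the tool's source: the function `in_day_range` is hypothetical, and treating both limits as inclusive is an assumption.

```python
from datetime import datetime

DAY_FORMAT = "%d-%m-%Y"  # matches the documented day syntax, e.g. 31-12-2020


def in_day_range(entry_time, day_lower=None, day_upper=None):
    """Return True if entry_time falls inside the day range.

    Both limits are optional and are treated as inclusive here,
    which is an assumption rather than documented tool behavior.
    """
    day = entry_time.date()
    if day_lower is not None:
        if day < datetime.strptime(day_lower, DAY_FORMAT).date():
            return False
    if day_upper is not None:
        if day > datetime.strptime(day_upper, DAY_FORMAT).date():
            return False
    return True


if __name__ == "__main__":
    entry = datetime(2022, 4, 17, 8, 47, 5)
    print(in_day_range(entry, "15-04-2022", "24-04-2022"))
```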
**Q: How many redirects have occurred since 1st April 2020?**
**Q: How many redirects have occurred since 1st April 2022, according to two selected log files?**
```
```
httpd-logparser --outfields time http_status country -d /var/log/httpd/ -c ^30* -f access_log* -dl "01-04-2020" --sortby time --stats
httpd-logparser --outfields time http_status country -d /var/log/httpd/ -c ^30* -f access_log* -dl "01-04-2020" --sortby time --stats
Processing file: access_log
httpd-logparser --files-regex /var/log/httpd/access_log.\[2-3\] --included-fields time,http_status,country --sort-by time --status-codes ^30* --day-lower "01-04-2022" --show-stats
Processing file: access_log.1
Processing file: access_log.2
Processing file: access_log.3
Processing file: access_log.4
Processing log entry: 8993
2020-04-01 02:13:12 302 United States
2020-04-01 02:13:12 302 United States
2020-04-01 02:13:13 301 United States
2020-04-01 02:13:13 302 United States
2020-04-01 02:13:14 302 United States
2020-04-01 02:13:14 302 United States
2020-04-01 02:13:14 302 United States
2020-04-01 02:13:15 302 United States
2020-04-01 02:13:15 302 United States
2020-04-01 03:25:06 302 United States
2020-04-01 04:03:39 302 Russian Federation
2020-04-01 04:03:44 302 Russian Federation
...
...
304 Canada 2022-05-23 01:52:45
302 Canada 2022-05-23 01:53:33
302 Europe 2022-05-23 01:56:03
302 Poland 2022-05-23 02:00:31
302 Russian Federation 2022-05-23 02:52:50
302 United States 2022-05-23 04:34:30
302 France 2022-05-23 04:51:31
302 Germany 2022-05-23 05:02:16
302 Russian Federation 2022-05-23 05:04:13
302 Russian Federation 2022-05-23 05:04:14
302 Russian Federation 2022-05-23 05:04:14
302 United States 2022-05-23 05:11:10
302 United States 2022-05-23 05:11:11
302 Russian Federation 2022-05-23 05:23:09
302 China 2022-05-23 05:54:41
...
...
2020-05-01 18:53:05 302 Italy
302 Germany 2022-05-31 19:53:18
2020-05-01 18:53:21 301 Italy
302 Germany 2022-05-31 19:53:18
2020-05-01 18:53:22 301 Italy
302 Germany 2022-05-31 19:53:18
2020-05-01 18:53:24 302 Italy
302 Germany 2022-05-31 19:53:19
2020-05-01 18:53:25 302 Italy
302 Germany 2022-05-31 19:53:19
2020-05-01 18:53:26 302 Italy
304 Finland 2022-05-31 20:06:55
2020-05-01 18:53:26 302 Italy
304 Finland 2022-05-31 20:16:02
2020-05-01 18:54:20 302 Italy
304 Finland 2022-05-31 20:16:03
2020-05-01 19:18:15 301 Russian Federation
304 Finland 2022-05-31 20:16:06
2020-05-01 19:18:15 301 Russian Federation
302 Russian Federation 2022-05-31 20:40:33
2020-05-01 19:18:15 301 Russian Federation
302 United Kingdom 2022-05-31 21:09:32
2020-05-01 19:18:17 301 Russian Federation
302 China 2022-05-31 21:13:38
2020-05-01 19:21:19 302 France
302 Russian Federation 2022-05-31 21:20:09
302 Romania 2022-05-31 22:01:31
Processed files: access_log, access_log.1, access_log.2, access_log.3, access_log.4
304 United States 2022-05-31 22:11:30
Processed log entries: 8994
302 Russian Federation 2022-05-31 22:59:23
Matched log entries: 3207
302 United States 2022-05-31 23:16:52
304 Ukraine 2022-05-31 23:22:50
302 Russian Federation 2022-05-31 23:30:51
302 Netherlands 2022-05-31 23:37:10
302 Netherlands 2022-05-31 23:37:11
302 Netherlands 2022-05-31 23:37:12
Processed files: /var/log/httpd/access_log.2, /var/log/httpd/access_log.3
Processed log entries: 77730
Matched log entries: 6788
Invalid lines:
File: /var/log/httpd/access_log.2, line: 24668
```
```
**Q: How many `4XX` codes have connected clients from China and the United States produced over all time?**
Answer: 6788
You should also check any invalid log lines detected by the tool.
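
In the output above, the pattern `^30*` selects both `302` and `304`. That is consistent with the pattern being matched against the beginning of the status code rather than against the whole code. The sketch below illustrates this kind of prefix matching with Python's `re.match`; whether the tool works exactly this way is an assumption, and `status_matches` is a hypothetical helper.

```python
import re


def status_matches(status, patterns):
    """Return True if any pattern matches at the start of the status code."""
    return any(re.match(pattern, str(status)) for pattern in patterns)


if __name__ == "__main__":
    for code in (200, 302, 304, 404):
        print(code, status_matches(code, ["^30*"]))
```

Under prefix matching, `^30*` behaves like `^3`, because `0*` may also match zero characters. To select a single code, an anchored pattern such as `^302$` is stricter.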
**Q: How many `4XX` codes have connected clients from China and the United States produced?**
```
```
httpd-logparser --outfields time country http_status http_request -d /var/log/httpd/ -c ^4 -f access_log* -cf "United States" China --sortby time --stats
httpd-logparser --files-regex /var/log/httpd/access_log --included-fields time,country,http_status,http_request --countries "United States",China --sort-by time --status-codes ^4 --show-progress --show-stats
File count: 2
Processing file: access_log
Lines in total: 23614
Processing file: access_log.1
Processing file: /var/log/httpd/access_log (lines: 12021)
Processing file: access_log.2
Processing file: /var/log/httpd/access_log.1 (lines: 11593)
Processing file: access_log.3
Processing log entry: 18423 (78.01%)
Processing file: access_log.4
Processing log entry: 10221
2020-03-29 18:49:34 United States 408 None
2020-03-29 18:49:34 United States 408 None
2020-03-29 19:28:02 China 408 None
2020-04-08 06:14:48 China 400 GET /phpMyAdmin/scripts/setup.php HTTP/1.1
2020-04-08 06:14:53 China 400 GET /horde/imp/test.php HTTP/1.1
2020-04-08 06:14:54 China 400 GET /login?from=0.000000 HTTP/1.1
...
...
408 United States 2022-06-01 03:45:18 None
408 United States 2022-06-01 03:45:18 None
408 United States 2022-06-01 09:11:15 None
408 United States 2022-06-01 11:36:05 None
408 United States 2022-06-01 11:36:05 None
421 United States 2022-06-01 13:08:29 GET / HTTP/1.1
408 United States 2022-06-01 19:44:42 None
408 United States 2022-06-01 19:44:42 None
408 China 2022-06-02 06:30:51 None
408 China 2022-06-02 06:30:51 None
408 China 2022-06-02 06:30:51 None
408 United States 2022-06-02 11:45:57 None
408 United States 2022-06-02 11:46:05 None
408 United States 2022-06-02 11:46:18 None
408 United States 2022-06-02 20:53:49 None
408 United States 2022-06-02 20:53:49 None
408 United States 2022-06-03 00:01:39 None
408 United States 2022-06-03 00:02:04 None
408 United States 2022-06-03 00:02:37 None
408 United States 2022-06-03 00:21:26 None
408 China 2022-06-03 11:39:22 None
408 United States 2022-06-03 15:41:34 None
408 United States 2022-06-04 01:28:08 None
408 United States 2022-06-04 07:29:53 None
408 United States 2022-06-04 07:29:56 None
408 United States 2022-06-04 07:29:56 None
408 United States 2022-06-04 11:25:10 None
408 United States 2022-06-04 11:25:10 None
408 China 2022-06-04 11:37:11 None
408 United States 2022-06-04 17:36:35 None
408 China 2022-06-05 15:56:35 None
408 China 2022-06-05 15:56:45 None
408 United States 2022-06-06 01:32:25 None
408 United States 2022-06-06 01:32:25 None
408 United States 2022-06-06 01:32:29 None
...
...
2020-04-24 10:40:16 United States 403 GET /MAPI/API HTTP/1.1
Processed files: /var/log/httpd/access_log, /var/log/httpd/access_log.1
2020-04-24 11:33:16 United States 403 GET /owa/auth/logon.aspx?url=https%3a%2f%2f1%2fecp%2f HTTP/1.1
Processed log entries: 23614
2020-04-24 13:00:12 United States 403 GET /cgi-bin/luci HTTP/1.1
Matched log entries: 112
2020-04-24 13:00:13 United States 403 GET /dana-na/auth/url_default/welcome.cgi HTTP/1.1
2020-04-24 13:00:15 United States 403 GET /remote/login?lang=en HTTP/1.1
2020-04-24 13:00:17 United States 403 GET /index.asp HTTP/1.1
2020-04-24 13:00:18 United States 403 GET /htmlV/welcomeMain.htm HTTP/1.1
2020-04-24 20:08:20 United States 403 GET /dana-na/auth/url_default/welcome.cgi HTTP/1.1
2020-04-24 20:08:22 United States 403 GET /remote/login?lang=en HTTP/1.1
2020-04-25 03:57:39 United States 403 GET /home.asp HTTP/1.1
2020-04-25 03:57:39 United States 403 GET /login.cgi?uri= HTTP/1.1
2020-04-25 03:57:39 United States 403 GET /vpn/index.html HTTP/1.1
2020-04-25 03:57:39 United States 403 GET /cgi-bin/luci HTTP/1.1
2020-04-25 03:57:40 United States 403 GET /dana-na/auth/url_default/welcome.cgi HTTP/1.1
2020-04-25 03:57:40 United States 403 GET /remote/login?lang=en HTTP/1.1
2020-04-25 03:57:40 United States 403 GET /index.asp HTTP/1.1
2020-04-25 03:57:40 United States 403 GET /htmlV/welcomeMain.htm HTTP/1.1
2020-04-25 11:56:32 United States 403 GET /owa/auth/logon.aspx?url=https%3a%2f%2f1%2fecp%2f HTTP/1.1
2020-04-25 21:29:50 United States 403 GET /images/favicon-32x32.png HTTP/1.1
2020-04-25 21:30:08 United States 408 None
Processed files: access_log, access_log.1, access_log.2, access_log.3, access_log.4
Processed log entries: 10222
Matched log entries: 90
```
```
**Q: Which user agents have been used by all clients over all time?**
Answer: 112
**Q: Which user agents have clients used recently?**
```
```
httpd-logparser --outfields user_agent -d /var/log/httpd/ -f access_log* --noprogress | sort -u
httpd-logparser --files-list /var/log/httpd/access_log --included-fields user_agent | sort -u
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
fasthttp
fasthttp
@ -280,84 +345,115 @@ Wget/1.19.4 (linux-gnu)
WinHTTP/1.1
WinHTTP/1.1
```
```
**Q: What is the time difference between a single client's requests? Exclude Finland! Include only the most recent access_log file.**
**Q: What is the time difference between a single client's requests? Exclude Finland. Include all access_log files.**
```
```
httpd-logparser --outfields http_status time time_diff country -d /var/log/httpd/ -cf "\!Finland" -f access_log$
httpd-logparser --included-fields http_status,time,time_diff,country --countries "\!Finland" --files-regex /var/log/httpd/old/access_log
200 Taiwan 2022-06-19 12:21:47 NEW_CONN
200 2020-05-01 18:53:07 +2.0 Italy
200 Taiwan 2022-06-19 12:21:48 +1
200 2020-05-01 18:53:19 +12.0 Italy
200 Taiwan 2022-06-19 12:21:49 +1
200 2020-05-01 18:53:20 +1.0 Italy
200 Taiwan 2022-06-19 12:21:49 0
200 2020-05-01 18:53:20 0.0 Italy
200 Taiwan 2022-06-19 12:21:49 0
200 2020-05-01 18:53:21 +1.0 Italy
200 Taiwan 2022-06-19 12:21:49 0
200 2020-05-01 18:53:20 -1.0 Italy
200 Taiwan 2022-06-19 12:21:50 +1
200 2020-05-01 18:53:21 +1.0 Italy
200 Taiwan 2022-06-19 12:21:49 -1
200 2020-05-01 18:53:21 0.0 Italy
200 Taiwan 2022-06-19 12:21:49 0
301 2020-05-01 18:53:21 0.0 Italy
200 Taiwan 2022-06-19 12:21:50 +1
301 2020-05-01 18:53:22 +1.0 Italy
200 Taiwan 2022-06-19 12:21:50 0
200 2020-05-01 18:53:22 0.0 Italy
200 Taiwan 2022-06-19 12:21:50 0
200 2020-05-01 18:53:22 0.0 Italy
200 Taiwan 2022-06-19 12:21:51 +1
200 2020-05-01 18:53:23 +1.0 Italy
200 Taiwan 2022-06-19 12:21:56 +5
200 2020-05-01 18:53:23 0.0 Italy
200 Taiwan 2022-06-19 12:22:04 +8
302 2020-05-01 18:53:24 +1.0 Italy
200 Taiwan 2022-06-19 12:22:05 +1
200 2020-05-01 18:53:24 0.0 Italy
200 Taiwan 2022-06-19 12:22:06 +1
200 2020-05-01 18:53:25 +1.0 Italy
200 Taiwan 2022-06-19 12:22:06 0
302 2020-05-01 18:53:25 0.0 Italy
200 Taiwan 2022-06-19 12:22:06 0
302 2020-05-01 18:53:26 +1.0 Italy
302 Taiwan 2022-06-19 12:22:07 +1
302 2020-05-01 18:53:26 0.0 Italy
200 Taiwan 2022-06-19 12:22:07 0
200 2020-05-01 18:53:26 0.0 Italy
200 Taiwan 2022-06-19 12:22:07 0
200 2020-05-01 18:53:27 +1.0 Italy
200 Taiwan 2022-06-19 12:22:07 0
200 2020-05-01 18:53:32 +5.0 Italy
200 Taiwan 2022-06-19 12:22:07 0
302 2020-05-01 18:54:20 +48.0 Italy
200 Taiwan 2022-06-19 12:22:07 0
408 2020-05-01 18:54:40 +20.0 Italy
200 Taiwan 2022-06-19 12:22:14 +7
...
200 Taiwan 2022-06-19 12:22:14 0
...
200 Japan 2022-06-19 12:34:49 NEW_CONN
200 2020-05-01 22:14:36 NEW_CONN Russian Federation
200 Japan 2022-06-19 12:34:54 +5
200 2020-05-01 22:30:40 +964.0 Russian Federation
200 United States 2022-06-19 12:55:44 NEW_CONN
500 2020-05-01 22:35:01 NEW_CONN Singapore
200 United States 2022-06-19 12:55:44 0
500 2020-05-01 22:35:06 +5.0 Singapore
200 United States 2022-06-19 12:55:50 +6
500 2020-05-01 22:35:09 +3.0 Singapore
200 United States 2022-06-19 12:55:55 +5
500 2020-05-01 22:35:14 +5.0 Singapore
302 France 2022-06-19 13:01:30 NEW_CONN
200 2020-05-01 22:37:47 NEW_CONN Russian Federation
200 United States 2022-06-19 13:10:07 NEW_CONN
200 United States 2022-06-19 13:10:12 +5
302 China 2022-06-19 13:15:59 NEW_CONN
302 China 2022-06-19 13:16:10 +11
302 China 2022-06-19 13:16:11 +1
200 Germany 2022-06-19 13:27:42 NEW_CONN
200 Hong Kong 2022-06-19 13:40:02 NEW_CONN
200 Hong Kong 2022-06-19 13:40:02 0
200 Hong Kong 2022-06-19 13:40:02 0
...
...
200 India 2022-06-19 13:45:03 NEW_CONN
200 India 2022-06-19 13:45:04 +1
200 India 2022-06-19 13:45:04 0
200 India 2022-06-19 13:45:04 0
200 India 2022-06-19 13:45:04 0
200 India 2022-06-19 13:45:05 +1
200 India 2022-06-19 13:45:05 0
200 India 2022-06-19 13:45:05 0
...
...
```
```
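
The `time_diff` column can be read as the number of seconds since the previous request of the same client, with `NEW_CONN` marking the first request of a client. The sketch below reproduces that idea under stated assumptions: entries are processed in log order, a client is identified by its remote host, and `NEW_CONN` is emitted whenever the host differs from the host of the previous entry. The function `time_diffs` is hypothetical, and the tool's actual logic may differ.

```python
from datetime import datetime


def time_diffs(entries):
    """Yield (host, time, diff) for (host, time) pairs given in log order.

    diff is the number of seconds since the previous entry when the host
    is unchanged, or the string "NEW_CONN" when the host changes.
    """
    prev_host, prev_time = None, None
    for host, time in entries:
        if host != prev_host:
            diff = "NEW_CONN"
        else:
            diff = int((time - prev_time).total_seconds())
        yield host, time, diff
        prev_host, prev_time = host, time


if __name__ == "__main__":
    # 192.0.2.0/24 is a documentation-only address range.
    sample = [
        ("192.0.2.1", datetime(2022, 6, 19, 12, 21, 47)),
        ("192.0.2.1", datetime(2022, 6, 19, 12, 21, 48)),
        ("192.0.2.1", datetime(2022, 6, 19, 12, 21, 50)),
        ("192.0.2.1", datetime(2022, 6, 19, 12, 21, 49)),
        ("192.0.2.2", datetime(2022, 6, 19, 12, 34, 49)),
    ]
    for host, time, diff in time_diffs(sample):
        print(host, time, diff)
```

Because the entries are compared in log order and not sorted by time, a negative difference such as `-1` can appear when a later line carries an earlier timestamp.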
## Usage
## Usage
```
```
usage: httpd-logparser [-h] -d [LOG_DIR] -f LOG_FILE [LOG_FILE ...] [-s [LOG_SYNTAX]] [-c STATUS_CODE [STATUS_CODE ...]] [-cf COUNTRY [COUNTRY ...]] [-ot [OUT_TIMEFORMAT]] [-of OUT_FIELD [OUT_FIELD ...]] [-ng] [-gd [GEODB]] [-dl [DAY_LOWER]] [-du [DAY_UPPER]]
usage: httpd-logparser [-h] [-fr [FILES_REGEX]] [-f [FILES_LIST]] [-c CODES [CODES ...]] [-cf [COUNTRIES]] [-tf [TIME_FORMAT]] [-if [INCL_FIELDS]]
[-sb [SORTBY_FIELD]] [-sbr [SORTBY_FIELD_REVERSE]] [-st] [-np]
[-ef [EXCL_FIELDS]] [-gl] [-ge [GEOTOOL_EXEC]] [-gd [GEO_DATABASE_LOCATION]] [-dl [DATE_LOWER]] [-du [DATE_UPPER]]
[-sb [SORTBY_FIELD]] [-ro] [-st] [-p] [--httpd-conf-file] [--httpd-log-nickname] [-lf LOG_FORMAT] [-ph]
[--output-format {table,csv}]
Apache HTTPD server log parser
optional arguments:
optional arguments:
-h, --help show this help message and exit
-h, --help show this help message and exit
-d [LOG_DIR], --dir [LOG_DIR]
-fr [FILES_REGEX], --files-regex [FILES_REGEX]
Apache log file directory.
Apache log files matching input regular expression. (default: None)
-f LOG_FILE [LOG_FILE ...], --files LOG_FILE [LOG_FILE ...]
-f [FILES_LIST], --files-list [FILES_LIST]
Apache log files. Regular expressions supported.
Apache log files. Regular expressions supported. (default: None)
-s [LOG_SYNTAX], --logsyntax [LOG_SYNTAX]
-c CODES [CODES ...], --status-codes CODES [CODES ...]
Apache log files syntax, defined as "LogFormat" directive in Apache configuration.
Print only these numerical status codes. Regular expressions supported. (default: None)
-c STATUS_CODE [STATUS_CODE ...], --statuscodes STATUS_CODE [STATUS_CODE ...]
-cf [COUNTRIES], --countries [COUNTRIES]
Print only these status codes. Regular expressions supported.
Include only these countries. Negative match (exclude): "\!Country" (default: None)
-cf COUNTRY [COUNTRY ...], --countryfilter COUNTRY [COUNTRY ...]
-tf [TIME_FORMAT], --time-format [TIME_FORMAT]
Include only these countries. Negative match (exclude): "\!Country"
Output time format. (default: %d-%m-%Y %H:%M:%S)
-ot [OUT_TIMEFORMAT], --outtimeformat [OUT_TIMEFORMAT]
-if [INCL_FIELDS], --included-fields [INCL_FIELDS]
Output time format. Default: "%d-%m-%Y %H:%M:%S"
Included fields. All fields: all, log_file_name, http_status, remote_host, country, city, time, time_diff, user_agent,
-of OUT_FIELD [OUT_FIELD ...], --outfields OUT_FIELD [OUT_FIELD ...]
http_request (default: http_status, remote_host, time, time_diff, user_agent, http_request)
Output fields. Default: log_file_name, http_status, remote_host, country, city, time, time_diff, user_agent, http_request
-ef [EXCL_FIELDS], --excluded-fields [EXCL_FIELDS]
-ng, --nogeo Skip country check with external "geoiplookup" tool.
Excluded fields. (default: None)
-gd [GEODB], --geodir [GEODB]
-gl, --geo-location Check origin countries with external "geoiplookup" tool. NOTE: Automatically includes "country" and "city" fields. (default:
Database file directory for "geoiplookup" tool. Default: /usr/share/GeoIP/
False)
-dl [DAY_LOWER], --daylower [DAY_LOWER]
-ge [GEOTOOL_EXEC], --geotool-exec [GEOTOOL_EXEC]
Do not check log entries older than this day. Day syntax: 31-12-2020
"geoiplookup" tool executable found in PATH. (default: geoiplookup)
-du [DAY_UPPER], --dayupper [DAY_UPPER]
-gd [GEO_DATABASE_LOCATION], --geo-database-dir [GEO_DATABASE_LOCATION]
Do not check log entries newer than this day. Day syntax: 31-12-2020
Database file directory for "geoiplookup" tool. (default: /usr/share/GeoIP/)
-sb [SORTBY_FIELD], --sortby [SORTBY_FIELD]
-dl [DATE_LOWER], --day-lower [DATE_LOWER]
Sort by an output field.
Do not check log entries older than this day. Day syntax: 31-12-2020 (default: None)
-sbr [SORTBY_FIELD_REVERSE], --sortbyreverse [SORTBY_FIELD_REVERSE]
-du [DATE_UPPER], --day-upper [DATE_UPPER]
Sort by an output field, reverse order.
Do not check log entries newer than this day. Day syntax: 31-12-2020 (default: None)
-st, --stats Show short statistics at the end.
-sb [SORTBY_FIELD], --sort-by [SORTBY_FIELD]
-np, --noprogress Do not show progress information.
Sort by an output field. (default: None)
-ro, --reverse-order Sort in reverse order. (default: False)
-st, --show-stats Show short statistics at the end. (default: False)
-p, --show-progress Show progress information. (default: False)
--httpd-conf-file Apache HTTPD configuration file with LogFormat directive. (default: /etc/httpd/conf/httpd.conf)
--httpd-log-nickname LogFormat directive nickname (default: combinedio)
-lf LOG_FORMAT, --log-format LOG_FORMAT
Log format, manually defined. (default: None)
-ph, --print-headers Print column headers. (default: False)
--output-format {table,csv}
Output format for results. (default: table)
```
```
## License
## License