Simple Apache/HTTPD log parser for administrative analysis
# Apache log parser

Simple command-line log parser for Apache/HTTPD, intended for quick analysis during web server administration tasks.

Unix-like systems only.

## Motivation

Keep it simple. Very simple.

Although advanced, nice-looking log analytics tools such as [Elastic Stack](https://www.elastic.co/products/) exist, I wanted something far simpler, with far less overhead, for weekly tasks and for configuring an Apache web server. Therefore, I wrote this simple Python script to parse Apache web server logs.

The **advantages** of this tool are low overhead, output that can be piped to other Unix tools, and quick log checks. The main idea is to produce the output you need for a short analysis, so that you can configure your web server protection mechanisms and network environment based on actual server data.

This tool is not meant for intrusion detection or prevention, and it does not alert administrators about hostile penetration attempts. However, it may reveal simple underlying misconfigurations, such as invalid URL references on your website.

## Requirements

The following Arch Linux packages are required. If you use another distribution, refer to its corresponding packages:

```
python
python-apachelogs
```

[python-apachelogs](https://github.com/jwodder/apachelogs/) is not available in the Arch Linux repositories or in the AUR. Therefore, I provide a PKGBUILD file to install it: [python-apachelogs - PKGBUILD](python-apachelogs/PKGBUILD)

`python-apachelogs` depends on the [python-pydicti](python-apachelogs/python-pydicti/PKGBUILD) package.

Recommended packages for IP address geolocation:

```
geoip
geoip-database
```

## Installation

Arch Linux:

Run `updpkgsums && makepkg -Cfi` in the [apache-logparser](apache-logparser/) directory. The command installs the `httpd-logparser` executable in `/usr/bin/`.

## Supported output formats

- `table` and `csv`

## Examples

**Q: Which unique connections (IP addresses), together with their country and city location data, appear in the latest Apache log file?**

```
httpd-logparser --files-list /var/log/httpd/access_log --included-fields time,remote_host,country,city | sort -k 2 -u | sort -k 3
103.102.153.XXX Indonesia Unknown: -6.175000, 106.828598 2022-06-12 10:33:58
103.102.153.XXX Indonesia Unknown: -6.175000, 106.828598 2022-06-12 10:33:59
103.102.153.XXX Indonesia Unknown: -6.175000, 106.828598 2022-06-12 10:34:00
103.102.153.XXX Indonesia Unknown: -6.175000, 106.828598 2022-06-12 10:34:01
103.144.178.XXX Indonesia Unknown: -6.175000, 106.828598 2022-06-16 06:34:19
62.214.113.XXX Germany Unterhaching 2022-06-10 14:39:16
62.214.113.XXX Germany Unterhaching 2022-06-10 16:34:15
62.214.113.XXX Germany Unterhaching 2022-06-10 16:40:03
62.214.113.XXX Germany Unterhaching 2022-06-10 16:40:04
62.214.113.XXX Germany Unterhaching 2022-06-10 16:40:05
84.234.169.XXX Norway Valderoy 2022-06-06 00:20:18
194.137.241.XXX Finland Vantaa 2022-06-07 12:20:42
194.137.241.XXX Finland Vantaa 2022-06-07 12:20:43
194.137.241.XXX Finland Vantaa 2022-06-07 12:20:44
...
176.108.111.XXX Ukraine Vyzhnytsya 2022-06-07 21:25:38
176.108.111.XXX Ukraine Vyzhnytsya 2022-06-07 21:25:39
176.108.111.XXX Ukraine Vyzhnytsya 2022-06-07 21:25:40
176.108.111.XXX Ukraine Vyzhnytsya 2022-06-07 21:25:41
176.108.111.XXX Ukraine Vyzhnytsya 2022-06-07 21:25:42
176.108.111.XXX Ukraine Vyzhnytsya 2022-06-08 23:35:25
176.108.111.XXX Ukraine Vyzhnytsya 2022-06-11 19:52:42
82.207.245.XXX Germany Wachtberg 2022-06-03 02:26:58
82.207.245.XXX Germany Wachtberg 2022-06-03 02:27:08
82.207.245.XXX Germany Wachtberg 2022-06-03 02:27:09
82.207.245.XXX Germany Wachtberg 2022-06-03 02:27:10
79.191.159.XXX Poland Warsaw 2022-06-11 18:05:13
49.7.20.XXX China Wenzhou 2022-06-09 15:26:26
49.7.21.XXX China Wenzhou 2022-06-09 23:25:57
49.7.20.XXX China Wenzhou 2022-06-19 01:41:41
81.82.244.XXX Belgium Wetteren 2022-06-13 13:45:21
81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:10
81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:11
81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:12
81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:13
81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:14
81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:41
81.82.244.XXX Belgium Wetteren 2022-06-13 13:49:46
95.223.231.XXX Germany Wiesbaden 2022-06-04 21:42:20
95.223.231.XXX Germany Wiesbaden 2022-06-04 21:42:21
95.223.231.XXX Germany Wiesbaden 2022-06-04 21:42:23
95.223.231.XXX Germany Wiesbaden 2022-06-04 21:42:28
37.201.116.XXX Germany Wiesbaden 2022-06-10 19:55:50
113.57.152.XXX China Wuhan 2022-06-14 15:51:21
113.57.152.XXX China Wuhan 2022-06-14 15:51:22
113.57.152.XXX China Wuhan 2022-06-14 15:51:23
113.57.152.XXX China Wuhan 2022-06-14 15:51:25
113.57.152.XXX China Wuhan 2022-06-14 15:51:26
113.57.152.XXX China Wuhan 2022-06-14 15:51:57
113.57.152.XXX China Wuhan 2022-06-14 15:51:58
113.57.152.XXX China Wuhan 2022-06-14 15:52:01
89.164.183.XXX Croatia Zagreb 2022-06-04 11:44:22
89.164.183.XXX Croatia Zagreb 2022-06-04 11:44:23
89.164.183.XXX Croatia Zagreb 2022-06-04 11:44:24
89.164.183.XXX Croatia Zagreb 2022-06-04 11:44:25
89.164.183.XXX Croatia Zagreb 2022-06-04 11:44:26
86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:49
86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:51
86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:52
86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:53
86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:55
86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:56
86.32.46.XXX Croatia Zagreb 2022-06-04 16:45:59
86.32.46.XXX Croatia Zagreb 2022-06-04 16:46:00
85.10.56.XXX Croatia Zagreb 2022-06-09 19:39:55
85.10.56.XXX Croatia Zagreb 2022-06-17 19:57:56
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:41
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:42
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:43
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:44
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:45
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:46
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:47
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:48
122.56.232.XXX New Zealand Auckland 2022-06-02 08:46:49
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:22
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:23
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:24
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:25
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:26
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:27
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:28
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:29
121.98.28.XXX New Zealand Dunedin 2022-06-08 14:32:30
185.113.213.XXX Netherlands Zennewijnen 2022-06-15 11:54:36
185.113.213.XXX Netherlands Zennewijnen 2022-06-15 11:54:37
185.113.213.XXX Netherlands Zennewijnen 2022-06-15 11:54:39
```

NOTE: The last octet of every IP address is anonymized with the string `XXX`.

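
The anonymization used in the sample output above can be reproduced with a few lines of Python. This is only a sketch: the helper name `anonymize_ipv4` is hypothetical and is not part of `httpd-logparser`; it simply masks the last octet of a dotted IPv4 address.

```python
import re

# Hypothetical helper (not part of httpd-logparser): mask the last octet
# of a dotted-quad IPv4 address, as done for the sample output above.
_LAST_OCTET = re.compile(r"^(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}$")


def anonymize_ipv4(address: str) -> str:
    """Return the address with its last octet replaced by 'XXX'.

    Anything that does not look like an IPv4 address is returned unchanged.
    """
    return _LAST_OCTET.sub(r"\1.XXX", address)


if __name__ == "__main__":
    for ip in ("192.0.2.17", "198.51.100.254", "not-an-ip"):
        print(anonymize_ipv4(ip))
```

The addresses in the demo come from the documentation ranges, not from any real log.
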
**Q: How many successful (`2XX`) requests occurred between 15th and 24th April 2022?**

```
httpd-logparser --files-regex /var/log/httpd/access_log --included-fields time,http_status,country --sort-by time --status-codes ^20* --day-lower "15-04-2022" --day-upper "24-04-2022" --show-stats --show-progress
File count: 5
Lines in total: 151134
Processing file: /var/log/httpd/access_log (lines: 40213)
Processing file: /var/log/httpd/access_log.1 (lines: 37518)
Processing file: /var/log/httpd/access_log.2 (lines: 23468)
Processing file: /var/log/httpd/access_log.3 (lines: 24045)
Processing file: /var/log/httpd/access_log.4 (lines: 25890)
Processing log entry: 142524 (94.30%)
...
200 Italy 2022-04-22 20:54:39
200 Italy 2022-04-22 20:54:39
200 Italy 2022-04-22 20:54:39
200 Italy 2022-04-22 20:54:39
200 Italy 2022-04-22 20:54:39
200 Taiwan 2022-04-22 21:00:27
200 Taiwan 2022-04-22 21:00:28
200 Taiwan 2022-04-22 21:00:29
200 Taiwan 2022-04-22 21:00:29
...
200 United States 2022-04-23 22:29:36
200 United States 2022-04-23 22:38:50
200 United States 2022-04-23 23:06:15
200 United States 2022-04-23 23:14:08
200 United States 2022-04-23 23:14:09
200 United States 2022-04-23 23:24:21
200 United States 2022-04-23 23:24:22
200 Canada 2022-04-23 23:31:52
200 Ireland 2022-04-23 23:32:05
200 United States 2022-04-23 23:33:36
200 Vietnam 2022-04-23 23:53:37
Processed files: /var/log/httpd/access_log, /var/log/httpd/access_log.1, /var/log/httpd/access_log.2, /var/log/httpd/access_log.3, /var/log/httpd/access_log.4
Processed log entries: 151134
Matched log entries: 9012
```

Answer: 9012

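
To see why `^20*` selects every status code beginning with `2`, and how the day bounds limit the result, here is a minimal Python sketch. It assumes that status patterns are applied with `re.match` and that both day bounds are inclusive; neither is confirmed here as the tool's actual behavior, and the function names `status_matches` and `in_day_range` are made up for illustration.

```python
import re
from datetime import datetime

DAY_SYNTAX = "%d-%m-%Y"  # day syntax from the usage text, e.g. 31-12-2020


def status_matches(code, patterns):
    """True if the status code matches any pattern (assumed re.match semantics)."""
    return any(re.match(pattern, str(code)) for pattern in patterns)


def in_day_range(when, day_lower=None, day_upper=None):
    """True if the entry's day lies within the optional, inclusive bounds."""
    day = when.date()
    if day_lower and day < datetime.strptime(day_lower, DAY_SYNTAX).date():
        return False
    if day_upper and day > datetime.strptime(day_upper, DAY_SYNTAX).date():
        return False
    return True


if __name__ == "__main__":
    # '^20*' means '2' followed by zero or more '0' characters,
    # so it matches any code that starts with 2.
    for code in (200, 204, 302, 404):
        print(code, status_matches(code, ["^20*"]))
    entry_time = datetime(2022, 4, 22, 20, 54, 39)
    print(in_day_range(entry_time, "15-04-2022", "24-04-2022"))
```

A stricter pattern such as `^2\d\d$` would express the same intent more explicitly, provided the tool accepts it.
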
**Q: How many redirects have occurred since 1st April 2022, according to two selected log files?**

```
httpd-logparser --files-regex /var/log/httpd/access_log.\[2-3\] --included-fields time,http_status,country --sort-by time --status-codes ^30* --day-lower "01-04-2022" --show-stats
...
304 Canada 2022-05-23 01:52:45
302 Canada 2022-05-23 01:53:33
302 Europe 2022-05-23 01:56:03
302 Poland 2022-05-23 02:00:31
302 Russian Federation 2022-05-23 02:52:50
302 United States 2022-05-23 04:34:30
302 France 2022-05-23 04:51:31
302 Germany 2022-05-23 05:02:16
302 Russian Federation 2022-05-23 05:04:13
302 Russian Federation 2022-05-23 05:04:14
302 Russian Federation 2022-05-23 05:04:14
302 United States 2022-05-23 05:11:10
302 United States 2022-05-23 05:11:11
302 Russian Federation 2022-05-23 05:23:09
302 China 2022-05-23 05:54:41
...
302 Germany 2022-05-31 19:53:18
302 Germany 2022-05-31 19:53:18
302 Germany 2022-05-31 19:53:18
302 Germany 2022-05-31 19:53:19
302 Germany 2022-05-31 19:53:19
304 Finland 2022-05-31 20:06:55
304 Finland 2022-05-31 20:16:02
304 Finland 2022-05-31 20:16:03
304 Finland 2022-05-31 20:16:06
302 Russian Federation 2022-05-31 20:40:33
302 United Kingdom 2022-05-31 21:09:32
302 China 2022-05-31 21:13:38
302 Russian Federation 2022-05-31 21:20:09
302 Romania 2022-05-31 22:01:31
304 United States 2022-05-31 22:11:30
302 Russian Federation 2022-05-31 22:59:23
302 United States 2022-05-31 23:16:52
304 Ukraine 2022-05-31 23:22:50
302 Russian Federation 2022-05-31 23:30:51
302 Netherlands 2022-05-31 23:37:10
302 Netherlands 2022-05-31 23:37:11
302 Netherlands 2022-05-31 23:37:12
Processed files: /var/log/httpd/access_log.2, /var/log/httpd/access_log.3
Processed log entries: 77730
Matched log entries: 6788
Invalid lines:
File: /var/log/httpd/access_log.2, line: 24668
```

Answer: 6788

You should also check any invalid log lines reported by the tool.

**Q: How many `4XX` responses have clients from China and the United States produced?**

```
httpd-logparser --files-regex /var/log/httpd/access_log --included-fields time,country,http_status,http_request --countries "United States",China --sort-by time --status-codes ^4 --show-progress --show-stats
File count: 2
Lines in total: 23614
Processing file: /var/log/httpd/access_log (lines: 12021)
Processing file: /var/log/httpd/access_log.1 (lines: 11593)
Processing log entry: 18423 (78.01%)
...
408 United States 2022-06-01 03:45:18 None
408 United States 2022-06-01 03:45:18 None
408 United States 2022-06-01 09:11:15 None
408 United States 2022-06-01 11:36:05 None
408 United States 2022-06-01 11:36:05 None
421 United States 2022-06-01 13:08:29 GET / HTTP/1.1
408 United States 2022-06-01 19:44:42 None
408 United States 2022-06-01 19:44:42 None
408 China 2022-06-02 06:30:51 None
408 China 2022-06-02 06:30:51 None
408 China 2022-06-02 06:30:51 None
408 United States 2022-06-02 11:45:57 None
408 United States 2022-06-02 11:46:05 None
408 United States 2022-06-02 11:46:18 None
408 United States 2022-06-02 20:53:49 None
408 United States 2022-06-02 20:53:49 None
408 United States 2022-06-03 00:01:39 None
408 United States 2022-06-03 00:02:04 None
408 United States 2022-06-03 00:02:37 None
408 United States 2022-06-03 00:21:26 None
408 China 2022-06-03 11:39:22 None
408 United States 2022-06-03 15:41:34 None
408 United States 2022-06-04 01:28:08 None
408 United States 2022-06-04 07:29:53 None
408 United States 2022-06-04 07:29:56 None
408 United States 2022-06-04 07:29:56 None
408 United States 2022-06-04 11:25:10 None
408 United States 2022-06-04 11:25:10 None
408 China 2022-06-04 11:37:11 None
408 United States 2022-06-04 17:36:35 None
408 China 2022-06-05 15:56:35 None
408 China 2022-06-05 15:56:45 None
408 United States 2022-06-06 01:32:25 None
408 United States 2022-06-06 01:32:25 None
408 United States 2022-06-06 01:32:29 None
...
Processed files: /var/log/httpd/access_log, /var/log/httpd/access_log.1
Processed log entries: 23614
Matched log entries: 112
```

Answer: 112

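
Because country names can contain spaces, splitting the table output on whitespace is unreliable. The following sketch tallies matches per status code and country with a regular expression instead. It assumes the column order shown in the sample output above (status, country, timestamp, request) and a `YYYY-MM-DD HH:MM:SS` timestamp; `ROW` and `tally` are illustrative names, not part of the tool.

```python
import re
from collections import Counter

# Assumed row layout, taken from the sample output above:
#   <status> <country> <YYYY-MM-DD HH:MM:SS> <request or None>
ROW = re.compile(
    r"^(?P<status>\d{3})\s+(?P<country>.+?)\s+"
    r"(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(?P<request>.*)$"
)


def tally(lines):
    """Count rows per (status, country); rows that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = ROW.match(line.strip())
        if match:
            counts[(match["status"], match["country"])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "408 United States 2022-06-01 03:45:18 None",
        "421 United States 2022-06-01 13:08:29 GET / HTTP/1.1",
        "408 China 2022-06-02 06:30:51 None",
        "408 China 2022-06-02 06:30:51 None",
    ]
    for (status, country), count in sorted(tally(sample).items()):
        print(status, country, count)
```

Progress and statistics lines do not fit the assumed row layout, so the sketch ignores them.
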
**Q: Which user agents have clients used recently?**

```
httpd-logparser --files-list /var/log/httpd/access_log --included-fields user_agent | sort -u
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
fasthttp
Go-http-client/1.1
HTTP Banner Detection (https://security.ipip.net)
kubectl/v1.12.0 (linux/amd64) kubernetes/0ed3388
libwww-perl/5.833
libwww-perl/6.06
libwww-perl/6.43
Microsoft Office Word 2014
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50728)
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 10.0; Win64; x64; Trident/7.0; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729; Tablet PC 2.0)
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.2)
...
...
Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0
Mozilla/5.0 (X11; Linux x86_64; rv:73.0) Gecko/20100101 Firefox/73.0
Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:58.0) Gecko/20100101 Firefox/58.0
Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0
Mozilla/5.0 zgrab/0.x
Mozilla/5.0 zgrab/0.x (compatible; Researchscan/t12sns; +http://researchscan.comsys.rwth-aachen.de)
Mozilla/5.0 zgrab/0.x (compatible; Researchscan/t13rl; +http://researchscan.comsys.rwth-aachen.de)
NetSystemsResearch studies the availability of various services across the internet. Our website is netsystemsresearch.com
None
python-requests/1.2.3 CPython/2.7.16 Linux/4.14.165-102.185.amzn1.x86_64
python-requests/2.10.0
python-requests/2.19.1
python-requests/2.22.0
python-requests/2.23.0
python-requests/2.6.0 CPython/2.7.5 Linux/3.10.0-1062.12.1.el7.x86_64
python-requests/2.6.0 CPython/2.7.5 Linux/3.10.0-1062.18.1.el7.x86_64
Python-urllib/3.7
Ruby
Wget/1.19.4 (linux-gnu)
WinHTTP/1.1
```

**Q: What is the time difference between consecutive requests from a single client? Exclude Finland. Include all access_log files.**

```
httpd-logparser --included-fields http_status,time,time_diff,country --countries "\!Finland" --files-regex /var/log/httpd/old/access_log
200 Taiwan 2022-06-19 12:21:47 NEW_CONN
200 Taiwan 2022-06-19 12:21:48 +1
200 Taiwan 2022-06-19 12:21:49 +1
200 Taiwan 2022-06-19 12:21:49 0
200 Taiwan 2022-06-19 12:21:49 0
200 Taiwan 2022-06-19 12:21:49 0
200 Taiwan 2022-06-19 12:21:50 +1
200 Taiwan 2022-06-19 12:21:49 -1
200 Taiwan 2022-06-19 12:21:49 0
200 Taiwan 2022-06-19 12:21:50 +1
200 Taiwan 2022-06-19 12:21:50 0
200 Taiwan 2022-06-19 12:21:50 0
200 Taiwan 2022-06-19 12:21:51 +1
200 Taiwan 2022-06-19 12:21:56 +5
200 Taiwan 2022-06-19 12:22:04 +8
200 Taiwan 2022-06-19 12:22:05 +1
200 Taiwan 2022-06-19 12:22:06 +1
200 Taiwan 2022-06-19 12:22:06 0
200 Taiwan 2022-06-19 12:22:06 0
302 Taiwan 2022-06-19 12:22:07 +1
200 Taiwan 2022-06-19 12:22:07 0
200 Taiwan 2022-06-19 12:22:07 0
200 Taiwan 2022-06-19 12:22:07 0
200 Taiwan 2022-06-19 12:22:07 0
200 Taiwan 2022-06-19 12:22:07 0
200 Taiwan 2022-06-19 12:22:14 +7
200 Taiwan 2022-06-19 12:22:14 0
200 Japan 2022-06-19 12:34:49 NEW_CONN
200 Japan 2022-06-19 12:34:54 +5
200 United States 2022-06-19 12:55:44 NEW_CONN
200 United States 2022-06-19 12:55:44 0
200 United States 2022-06-19 12:55:50 +6
200 United States 2022-06-19 12:55:55 +5
302 France 2022-06-19 13:01:30 NEW_CONN
200 United States 2022-06-19 13:10:07 NEW_CONN
200 United States 2022-06-19 13:10:12 +5
302 China 2022-06-19 13:15:59 NEW_CONN
302 China 2022-06-19 13:16:10 +11
302 China 2022-06-19 13:16:11 +1
200 Germany 2022-06-19 13:27:42 NEW_CONN
200 Hong Kong 2022-06-19 13:40:02 NEW_CONN
200 Hong Kong 2022-06-19 13:40:02 0
200 Hong Kong 2022-06-19 13:40:02 0
...
200 India 2022-06-19 13:45:03 NEW_CONN
200 India 2022-06-19 13:45:04 +1
200 India 2022-06-19 13:45:04 0
200 India 2022-06-19 13:45:04 0
200 India 2022-06-19 13:45:04 0
200 India 2022-06-19 13:45:05 +1
200 India 2022-06-19 13:45:05 0
200 India 2022-06-19 13:45:05 0
...
```
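
The `time_diff` column in the output above can be read as the number of seconds since the previous request. The sketch below shows one way such a column could be computed; it is not the tool's actual implementation. It assumes that requests are grouped by remote host, that the first request seen from a host is labelled `NEW_CONN`, and that positive differences carry a `+` sign. `time_diffs` is a hypothetical function, and the addresses are documentation examples.

```python
from datetime import datetime


def time_diffs(entries):
    """Label each (remote_host, timestamp) pair with the seconds elapsed
    since the previous request from the same host.

    Assumption: the first request seen from a host is labelled 'NEW_CONN'.
    """
    last_seen = {}
    labels = []
    for host, when in entries:
        previous = last_seen.get(host)
        if previous is None:
            labels.append("NEW_CONN")
        else:
            seconds = int((when - previous).total_seconds())
            # Positive gaps get an explicit '+', zero and negative gaps do not.
            labels.append(f"+{seconds}" if seconds > 0 else str(seconds))
        last_seen[host] = when
    return labels


if __name__ == "__main__":
    fmt = "%Y-%m-%d %H:%M:%S"
    entries = [
        ("198.51.100.7", datetime.strptime("2022-06-19 12:21:47", fmt)),
        ("198.51.100.7", datetime.strptime("2022-06-19 12:21:48", fmt)),
        ("198.51.100.7", datetime.strptime("2022-06-19 12:21:48", fmt)),
        ("203.0.113.9", datetime.strptime("2022-06-19 12:34:49", fmt)),
        ("203.0.113.9", datetime.strptime("2022-06-19 12:34:54", fmt)),
    ]
    print(time_diffs(entries))
```

Negative values, such as the `-1` in the output above, appear when log entries are not written in strict chronological order.
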

## Usage

```
usage: httpd-logparser [-h] [-fr [FILES_REGEX]] [-f [FILES_LIST]] [-c CODES [CODES ...]] [-cf [COUNTRIES]] [-tf [TIME_FORMAT]] [-if [INCL_FIELDS]]
                       [-ef [EXCL_FIELDS]] [-gl] [-ge [GEOTOOL_EXEC]] [-gd [GEO_DATABASE_LOCATION]] [-dl [DATE_LOWER]] [-du [DATE_UPPER]]
                       [-sb [SORTBY_FIELD]] [-ro] [-st] [-p] [--httpd-conf-file] [--httpd-log-nickname] [-lf LOG_FORMAT] [-ph]
                       [--output-format {table,csv}]

Apache HTTPD server log parser

optional arguments:
  -h, --help            show this help message and exit
  -fr [FILES_REGEX], --files-regex [FILES_REGEX]
                        Apache log files matching input regular expression. (default: None)
  -f [FILES_LIST], --files-list [FILES_LIST]
                        Apache log files. Regular expressions supported. (default: None)
  -c CODES [CODES ...], --status-codes CODES [CODES ...]
                        Print only these numerical status codes. Regular expressions supported. (default: None)
  -cf [COUNTRIES], --countries [COUNTRIES]
                        Include only these countries. Negative match (exclude): "\!Country" (default: None)
  -tf [TIME_FORMAT], --time-format [TIME_FORMAT]
                        Output time format. (default: %d-%m-%Y %H:%M:%S)
  -if [INCL_FIELDS], --included-fields [INCL_FIELDS]
                        Included fields. All fields: all, log_file_name, http_status, remote_host, country, city, time, time_diff, user_agent,
                        http_request (default: http_status, remote_host, time, time_diff, user_agent, http_request)
  -ef [EXCL_FIELDS], --excluded-fields [EXCL_FIELDS]
                        Excluded fields. (default: None)
  -gl, --geo-location   Check origin countries with external "geoiplookup" tool. NOTE: Automatically includes "country" and "city" fields. (default:
                        False)
  -ge [GEOTOOL_EXEC], --geotool-exec [GEOTOOL_EXEC]
                        "geoiplookup" tool executable found in PATH. (default: geoiplookup)
  -gd [GEO_DATABASE_LOCATION], --geo-database-dir [GEO_DATABASE_LOCATION]
                        Database file directory for "geoiplookup" tool. (default: /usr/share/GeoIP/)
  -dl [DATE_LOWER], --day-lower [DATE_LOWER]
                        Do not check log entries older than this day. Day syntax: 31-12-2020 (default: None)
  -du [DATE_UPPER], --day-upper [DATE_UPPER]
                        Do not check log entries newer than this day. Day syntax: 31-12-2020 (default: None)
  -sb [SORTBY_FIELD], --sort-by [SORTBY_FIELD]
                        Sort by an output field. (default: None)
  -ro, --reverse-order  Sort in reverse order. (default: False)
  -st, --show-stats     Show short statistics at the end. (default: False)
  -p, --show-progress   Show progress information. (default: False)
  --httpd-conf-file     Apache HTTPD configuration file with LogFormat directive. (default: /etc/httpd/conf/httpd.conf)
  --httpd-log-nickname  LogFormat directive nickname (default: combinedio)
  -lf LOG_FORMAT, --log-format LOG_FORMAT
                        Log format, manually defined. (default: None)
  -ph, --print-headers  Print column headers. (default: False)
  --output-format {table,csv}
                        Output format for results. (default: table)
```
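
The default value of `--time-format` reads like a `strftime` pattern. Assuming the option is handed to Python's `datetime.strftime` (the usage text does not say so explicitly), the following sketch shows what the default pattern and a year-first alternative produce. The names `render` and `ISO_LIKE_FORMAT` are illustrative only.

```python
from datetime import datetime

DEFAULT_TIME_FORMAT = "%d-%m-%Y %H:%M:%S"  # default shown in the usage text
ISO_LIKE_FORMAT = "%Y-%m-%d %H:%M:%S"      # alternative, year first


def render(when, time_format=DEFAULT_TIME_FORMAT):
    """Format a timestamp with a strftime pattern."""
    return when.strftime(time_format)


if __name__ == "__main__":
    when = datetime(2022, 6, 12, 10, 33, 58)
    print(render(when))                   # 12-06-2022 10:33:58
    print(render(when, ISO_LIKE_FORMAT))  # 2022-06-12 10:33:58
```

A year-first pattern has the practical advantage that plain text sorting, for example with `sort`, also orders the entries chronologically.
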

## License

GPLv3.