From 49b6a1df6e277863677e590f5aa89ae974bd5d2e Mon Sep 17 00:00:00 2001
From: Pekka Helenius
Date: Thu, 11 Mar 2021 21:25:50 +0200
Subject: [PATCH] Update README

---
 README.md | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 5b6f3bb..a95d9e6 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,9 @@ This program extract various website information based on URL addresses. This da
 
 ### Features
 
-The program does the following procedures:
+**NOTE**: See sample JSON data: [Get file](sample_dataset.json)
+
+To summarize, the program does the following procedures for listed URLs:
 
 - Gets domain registrar
 - Gets webpage title and automatically compares it to the domain registrar name
@@ -62,9 +64,17 @@ The following screenshots are generated with `matplotlib`
 
 ![](screenshots/domain_figure_tsfi.png)
 
-## Sample data
+## Known bugs, issues and missing features
+
+- Non-UTF-8 character decoding not implemented
+
+- If multiple JSON data files exist, a wrong JSON data file is likely selected
+
+- Get URLs and other parameters from command line
+
+- More data visualization and comprehensive analysis
 
-- `JSON sample data`: [Get file](sample_dataset.json)
+- Null data may be generated in some cases
 
 ## License