diff --git a/README.md b/README.md
index 5b6f3bb..a95d9e6 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,9 @@ This program extract various website information based on URL addresses. This da
 
 ### Features
 
-The program does the following procedures:
+**NOTE**: See sample JSON data: [Get file](sample_dataset.json)
+
+To summarize, the program does the following procedures for listed URLs:
 
 - Gets domain registrar
 - Gets webpage title and automatically compares it to the domain registrar name
@@ -62,9 +64,17 @@ The following screenshots are generated with `matplotlib`
 
 ![](screenshots/domain_figure_tsfi.png)
 
-## Sample data
+## Known bugs, issues, and missing features
+
+- Non-UTF-8 character decoding not implemented
+
+- If multiple JSON data files exist, a wrong JSON data file is likely selected
+
+- Get URLs and other parameters from command line
+
+- More data visualization and comprehensive analysis
 
-- `JSON sample data`: [Get file](sample_dataset.json)
+- Null data may be generated in some cases
 
 ## License