|
|
@@ -8,7 +8,9 @@ This program extract various website information based on URL addresses. This da |
|
|
|
|
|
|
|
### Features |
|
|
|
|
|
|
|
The program performs the following steps: |
|
|
|
**NOTE**: See sample JSON data: [Get file](sample_dataset.json) |
|
|
|
|
|
|
|
To summarize, the program performs the following steps for each listed URL: |
|
|
|
|
|
|
|
- Gets domain registrar |
|
|
|
- Gets webpage title and automatically compares it to the domain registrar name |
|
|
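The title-to-registrar comparison listed above can be illustrated with a minimal sketch. The README does not say how the program performs the comparison, so everything here is an assumption: the helper names `extract_title` and `title_similarity` are hypothetical, and the similarity measure (`difflib.SequenceMatcher`) is one possible choice, not necessarily the one the program uses.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        # Only the first <title> element is considered.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Hypothetical helper: return the page title with outer whitespace removed."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def title_similarity(title, registrar):
    """Hypothetical helper: case-insensitive similarity ratio between 0 and 1."""
    return SequenceMatcher(None, title.lower(), registrar.lower()).ratio()


if __name__ == "__main__":
    page = "<html><head><title> Example Registrar, Inc. </title></head></html>"
    title = extract_title(page)
    print(title, title_similarity(title, "Example Registrar, Inc."))
```

A ratio close to 1 suggests that the page title and the registrar name are nearly identical. Any cut-off for treating the two as a match would have to be chosen empirically.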
@@ -62,9 +64,17 @@ The following screenshots are generated with `matplotlib` |
|
|
|
|
|
|
|
![](screenshots/domain_figure_tsfi.png) |
|
|
|
|
|
|
|
## Sample data |
|
|
|
## Known bugs, issues, and missing features |
|
|
|
|
|
|
|
- Non-UTF-8 character decoding not implemented |
|
|
|
|
|
|
|
- If multiple JSON data files exist, the wrong one is likely to be selected |
|
|
|
|
|
|
|
- Get URLs and other parameters from command line |
|
|
|
|
|
|
|
- More data visualization and comprehensive analysis |
|
|
|
|
|
|
|
- `JSON sample data`: [Get file](sample_dataset.json) |
|
|
|
- Null data may be generated in some cases |
|
|
|
|
|
|
|
## License |
|
|
|
|
|
|
|