Update README

master
Pekka Helenius 3 years ago
parent
commit
e3d013852b
1 changed files with 70 additions and 2 deletions
README.md

@@ -1,3 +1,71 @@
# URL Analyzer
URL data analyzer and extractor. Detect malicious signs and other useful data associated with URLs.
## About
This program extracts various website information based on URL addresses. The data can be used to assess whether a given URL is malicious.
### Features
The program performs the following steps:
- Gets domain registrar
- Gets webpage title and automatically compares it to the domain registrar name
- Gets initial and final destination of a given URL
- Analyzes whether the final destination domain is the same as the initial one
- Gets URL redirects and HTTP response status codes
- Fetches WHOIS data
- Gets domain timestamps such as creation, update and expiry dates
  - Exact dates and the number of days relative to the current day
- Gets the content and number of iframes (for detecting possible cross-site scripting, XSS)
- Gets URL references on a webpage
  - **Local** domain referrals
  - **External** URL referrals
  - **Multidot** URLs (ones with `../` in the URL path)
- Gets domain registrars for each URL
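
A few of the checks listed above can be sketched with the Python standard library alone. This is an illustrative sketch, not the code in `code/url-analyzer.py`: the names `same_domain`, `days_relative` and `classify_links` are hypothetical, `html.parser` stands in for BeautifulSoup4 to keep the example dependency-free, and domains are compared by exact hostname (so `www.example.com` and `example.com` count as different).

```python
from datetime import date
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


def same_domain(initial_url: str, final_url: str) -> bool:
    """True when both URLs have the same hostname (hostnames are lowercased)."""
    return urlsplit(initial_url).hostname == urlsplit(final_url).hostname


def days_relative(day: date, today: date) -> int:
    """Days from `today` to `day`: negative for past dates, positive for future ones."""
    return (day - today).days


class _LinkCollector(HTMLParser):
    """Collects href/src attribute values and counts iframe tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.iframes = 0

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            self.iframes += 1
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def classify_links(html: str, page_url: str) -> dict:
    """Split the URL references of a page into local, external and multidot."""
    collector = _LinkCollector()
    collector.feed(html)
    page_host = urlsplit(page_url).hostname
    result = {"local": [], "external": [], "multidot": [],
              "iframes": collector.iframes}
    for raw in collector.links:
        # Check the raw value: urljoin() below resolves "../" segments away.
        if "../" in raw:
            result["multidot"].append(raw)
        absolute = urljoin(page_url, raw)
        host = urlsplit(absolute).hostname
        if host is None:
            continue  # no hostname, e.g. mailto: or javascript: references
        key = "local" if host == page_host else "external"
        result[key].append(absolute)
    return result
```

In this sketch a multidot reference is recorded in its raw form and is also counted as local or external after resolution; whether the program itself treats the categories as exclusive is not stated here.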
## Requirements
```
Python 3
Python 3    BeautifulSoup4    python-beautifulsoup4
Python 3    whois <= 0.7.3    python-whois; PyPI
Python 3    JSON Schema       python-jsonschema
Python 3    NumPy             python-numpy
Python 3    matplotlib        python-matplotlib
```
**NOTE**: Some Linux distributions may use `python3` executable instead of `python` for Python 3.
### Other requirements
- Jupyter (recommended)
- Working DNS name resolution
- Internet connection
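
Since the program depends on working DNS resolution, a quick pre-flight check can save a confusing failure later. The following is a minimal sketch using only the standard `socket` module; the function name `can_resolve` is hypothetical and not part of the program.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if the system resolver can look up `hostname`."""
    try:
        socket.getaddrinfo(hostname, None)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return True


if __name__ == "__main__":
    # example.com is only an illustrative hostname.
    print("DNS resolution works:", can_resolve("example.com"))
```

A `False` result only shows that this one lookup failed; it does not distinguish a missing internet connection from a misconfigured resolver.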
## Code
- `jupyter notebook (python 3)`: [Get file](code/url-analyzer.ipynb)
- `python 3`: [Get file](code/url-analyzer.py)
## Screenshots
The following screenshots were generated with `matplotlib`.
### Domains associated with HTML URL data
![](screenshots/domain_figure_hsfi.png)
![](screenshots/domain_figure_tsfi.png)
## Sample data
- `JSON sample data`: [Get file](sample_dataset.json)
## License
N/A
