Show HN: G-Scraper, a GUI Web Scraper, Written in Python

17 points

2 years ago

Target audience: data collectors, or anyone who wants to scrape data from websites through a GUI.

What my project does:
- Takes URLs
- Takes the elements to scrape from those webpages (optional: if you don't specify any elements, the app scrapes the entire page)
- Lets you send web parameters such as headers and payloads along with specific URLs. This means it can perform any logins that are necessary
- Logs the results in a log file, a separate one for each scrape
- Stores data as .txt files
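To make the "elements are optional" behaviour concrete, here is a minimal sketch of how that step could look. It is not G-Scraper's actual code: the names `ElementTextParser` and `scrape` are made up for illustration, and it uses only the standard library's `html.parser` rather than whatever the app uses internally.

```python
from html.parser import HTMLParser


class ElementTextParser(HTMLParser):
    """Hypothetical helper: collects the text inside every matching element."""

    def __init__(self, tag, attrs=None):
        super().__init__()
        self.tag = tag
        self.wanted = attrs or {}   # attribute filters, e.g. {"class": "x"}
        self.depth = 0              # > 0 while inside a matching element
        self.results = []
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            # Track nested tags of the same name so the closing tag lines up.
            if tag == self.tag:
                self.depth += 1
        elif tag == self.tag and all(
            dict(attrs).get(k) == v for k, v in self.wanted.items()
        ):
            self.depth = 1
            self._buf = []

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1
            if self.depth == 0:
                self.results.append("".join(self._buf).strip())

    def handle_data(self, data):
        if self.depth:
            self._buf.append(data)


def scrape(html, tag=None, attrs=None):
    """Return the text of matching elements, or the whole page if no tag is given."""
    if tag is None:
        return [html]
    parser = ElementTextParser(tag, attrs)
    parser.feed(html)
    parser.close()
    return parser.results


page = '<html><body><h1>Hello</h1><p>One</p><p class="x">Two</p></body></html>'
print(scrape(page, "p"))                  # every <p> element
print(scrape(page, "p", {"class": "x"}))  # only <p class="x">
print(len(scrape(page)))                  # no element given: one entry, the full page
```

The sketch works on an HTML string that has already been downloaded, so the fetching step (with headers and payloads) is deliberately left out.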

Some unique features of this project:
- Can scrape multiple URLs
- Can scrape multiple elements in a single URL
- Supports GET and POST requests
- Scraping runs in a separate thread from the GUI, so you can keep using the app or close it and the scraping will continue
- You can edit or delete the added variables. You can also reset the entire app's current data to start a new set of scrapes
- A unique filename for each file created
- 3 types of log files: webpage scrape log, element scrape log, and error log
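For anyone curious how features like these can fit together, below is a rough sketch under my own assumptions, not the app's real implementation. The helper names (`build_request`, `unique_filename`, `start_scrape`), the `output` folder name, and the timestamp-plus-UUID naming scheme are all hypothetical. It uses only the standard library and does not send any request.

```python
import threading
import uuid
from datetime import datetime
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import Request


def build_request(url, method="GET", headers=None, payload=None):
    """Build a GET or POST request; the payload is only attached for POST."""
    data = urlencode(payload).encode() if payload and method == "POST" else None
    return Request(url, data=data, headers=headers or {}, method=method)


def unique_filename(prefix, folder="output"):
    """Combine a timestamp with a random UUID so two files never share a name."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
    return Path(folder) / f"{prefix}_{stamp}_{uuid.uuid4().hex}.txt"


def start_scrape(job, *args):
    """Run the scrape job off the GUI thread.

    A non-daemon thread keeps the Python process alive until the job
    finishes, even if the main (GUI) thread has already returned.
    """
    worker = threading.Thread(target=job, args=args, daemon=False)
    worker.start()
    return worker


req = build_request(
    "https://example.com/login",
    method="POST",
    headers={"User-Agent": "demo"},
    payload={"user": "a"},
)
print(req.get_method(), req.data)
print(unique_filename("webpage"))
```

In a GUI, `start_scrape` would be called from the button handler so the window stays responsive while the job runs.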

Some drawbacks of the project:
- No output to the user AT ALL, so the user has to check the output folder for a scrape's status
- Probably does not log all errors, although I tried to recreate every possible error
- Once a scrape has started, there is no way to stop it
- Can only scrape textual data (text, links, etc.), so no scraping of things like images or videos
- Cannot scrape the text of <a> tags (link tags), only their links

Hope you enjoy it. Feel free to leave any suggestions.
