Python scraping libraries I use to develop crawlers:
- None other than Scrapy - a fast, high-level screen scraping and web crawling
framework, used to crawl websites and extract structured data from their
pages. It can be used for a wide range of purposes, from data mining to
monitoring and automated testing.
- urllib2 + Beautiful Soup - my first choice if I had to build a framework from scratch.
- Mechanize + Beautiful Soup - replaces urllib2 with Mechanize, which adds easy HTML form filling,
opening any URL (not just HTTP), automatic handling of HTTP-Equiv and Refresh,
and easy link parsing and following.
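As a rough illustration of the "from scratch" option, here is a minimal sketch of the fetch-then-parse pattern. It assumes Python 3, where urllib2's functionality lives in urllib.request, and the beautifulsoup4 package. The helper names fetch and extract_links, the User-Agent string and the timeout are my own placeholders, not part of either library.

```python
from urllib.parse import urljoin
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4


def fetch(url, timeout=10):
    """Download a page and return its body as text (hypothetical helper)."""
    # The User-Agent value is an arbitrary placeholder for this sketch.
    request = Request(url, headers={"User-Agent": "example-crawler/0.1"})
    with urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(html, base_url):
    """Return absolute URLs for every <a> tag that has an href attribute."""
    soup = BeautifulSoup(html, "html.parser")
    # urljoin resolves relative hrefs against the page they were found on.
    return [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]
```

Calling extract_links(fetch(url), url) on a start page gives the next set of URLs to visit. A real crawler would also need a queue, a set of already-visited URLs and some politeness delay, which this sketch leaves out.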
Please leave feedback if you are using some other library that I should list here.
BeClasp Consulting provides Python- and .NET-based website scraping services and has written thousands of parsers so far, for uses ranging from data crawling for bank account reconciliation to e-commerce stores and other data mining services. Drop us an email at mail@beclaspconsulting.net to learn more about the services we offer.