The Python scraping libraries I use to develop crawlers, in order of preference:
BeClasp Consulting provides Python- and .NET-based website scraping services and has written thousands of parsers so far, ranging from data crawling for bank account reconciliation to e-commerce stores and other data mining services. Drop us an email at mail@beclaspconsulting.net to learn more about the services we offer.
- None other than Scrapy - a fast, high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
- urllib2 (urllib.request in Python 3) + Beautiful Soup - If I had to build a framework from scratch, this would be my first choice.
- Mechanize + Beautiful Soup - Replace urllib2 with Mechanize to get easy HTML form filling, the ability to open any URL (not just HTTP), automatic handling of HTTP-Equiv and Refresh, and easy link parsing and following.
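
To give a feel for the "build it from scratch" option, here is a minimal sketch of fetching a page and pulling out its links. It assumes Python 3, where urllib2 lives on as urllib.request, and the beautifulsoup4 package. The function names `fetch` and `extract_links`, the User-Agent string and the example.com URL are all made up for illustration. If Beautiful Soup is not installed, the sketch falls back to the standard library's html.parser so it still runs.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen

try:
    from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4
except ImportError:
    BeautifulSoup = None  # fall back to the standard-library parser below


class _LinkCollector(HTMLParser):
    """Standard-library fallback: collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)


def fetch(url, timeout=10):
    """Download a page and return its body as text."""
    # The User-Agent value is a placeholder; use one that identifies your crawler.
    request = Request(url, headers={"User-Agent": "example-crawler/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(html, base_url):
    """Return absolute URLs for every <a href="..."> found in the HTML."""
    if BeautifulSoup is not None:
        soup = BeautifulSoup(html, "html.parser")
        hrefs = [a["href"] for a in soup.find_all("a", href=True)]
    else:
        collector = _LinkCollector()
        collector.feed(html)
        hrefs = collector.hrefs
    # Resolve relative links against the page they were found on.
    return [urljoin(base_url, href) for href in hrefs]


if __name__ == "__main__":
    start_url = "http://example.com/"  # hypothetical starting point
    for link in extract_links(fetch(start_url), start_url):
        print(link)
```

A real crawler would add a queue of URLs to visit, a set of already-seen URLs, and polite rate limiting on top of this. That plumbing is what Scrapy gives you out of the box.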