Warning
This project is deprecated and has been merged into Scrapy Tutorial Series: Web Scraping Using Python.
Getting help
Basic concepts
- Intro
- Introduction to this project
- Installation
- How to install and configure this project
- Read before you start
- Things you should know before you start
Advanced topics
- Enhance your browser
- How to enhance your browser to help you develop spiders.
- Enhance your terminal
- How to enhance your terminal shell.
- Troubleshoot spider
- How to troubleshoot your Scrapy spider.
- Mitmproxy
- How to inspect your HTTP requests.
Task List
- Basic extract
- Understand the spider workflow and basic XPath syntax.
- Json extract
- Learn to use the json module to extract JSON data.
- Ajax extract
- Learn to inspect Ajax requests.
- Ajax Header
- Learn to inspect the HTTP headers of Ajax requests.
- Meta StoreInfo
- Learn to pass additional data to callback functions.
- Ajax Cookie
- Learn to analyze the cookies of HTTP requests.
- Ajax Sign
- Learn to analyze minified JavaScript and debug code in the browser.
- Regex extract
- Learn to use regular expressions to extract information.
- List page and products extract
- Learn to extract products from list pages.
- List page and pagination extract
- Learn to extract information from list pages and handle pagination.
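
As a rough illustration of what the Json extract and Regex extract tasks involve, the sketch below combines the two ideas using only Python's standard json and re modules. The sample body, the `var data` pattern, and the function name `extract_embedded_json` are assumptions made for this example and are not taken from the project's tasks.

```python
import json
import re

# Hypothetical response body: an HTML fragment that embeds a JSON payload.
SAMPLE_BODY = '<script>var data = {"title": "Sample product", "price": "19.99"};</script>'


def extract_embedded_json(body):
    """Find the first `var data = {...};` assignment and parse it as JSON.

    Returns the parsed dict, or None when the pattern is not present.
    """
    # Non-greedy match so we stop at the first closing `};`.
    match = re.search(r"var data = (\{.*?\});", body, re.DOTALL)
    if match is None:
        return None
    return json.loads(match.group(1))


product = extract_embedded_json(SAMPLE_BODY)
print(product["title"], product["price"])
```

In a real spider the body would come from the response object rather than a hard-coded string, and the pattern would need to match whatever the target page actually embeds.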