Warning

This project is deprecated and has been merged into Scrapy Tutorial Series: Web Scraping Using Python.

Getting help

Basic concepts

Intro
Introduction to this project
Installation
How to install and configure this project
Read before you start
Things you should know before you start

Advanced topics

Enhance your browser
How to enhance your browser to help you develop spiders.
Enhance your terminal
How to enhance your terminal shell.
Troubleshoot spider
How to troubleshoot your Scrapy spider.
Mitmproxy
How to inspect your HTTP requests.

Task List

Basic extract
Understand the spider workflow and basic XPath syntax.
Json extract
Learn to use the json module to extract JSON data.
Ajax extract
Learn to inspect Ajax requests.
Ajax Header
Learn to inspect the HTTP headers of Ajax requests.
Meta StoreInfo
Learn to pass additional data to callback functions.
Ajax Cookie
Learn to analyze the cookies of HTTP requests.
Ajax Sign
Learn to analyze minified JavaScript and debug code in the browser.
Regex extract
Learn to use regular expressions to extract info.
List page and products extract
Learn to extract products from list pages.
List page and pagination extract
Learn to extract info from list pages and handle pagination.
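
To give a feel for what the JSON and regex tasks involve, here is a minimal sketch using only the Python standard library. The sample HTML, the `product` variable name, and the `extract_product` helper are all hypothetical and are not taken from the project's actual task pages; a real spider would apply the same idea to the response body it receives.

```python
import json
import re

# Hypothetical response body: a page that embeds product data as JSON
# inside a script tag (not taken from the project's real task pages).
SAMPLE_HTML = '<script>var product = {"title": "Demo item", "price": "12.50"};</script>'


def extract_product(html):
    """Locate the embedded JSON object with a regex, then parse it with json."""
    match = re.search(r"var product = (\{.*?\});", html, re.DOTALL)
    if match is None:
        # No embedded data found on this page.
        return None
    return json.loads(match.group(1))


product = extract_product(SAMPLE_HTML)
print(product["title"], product["price"])
```

The regex only isolates the JSON text; parsing is left to `json.loads`, which tends to be more robust than pulling individual fields out with further regexes.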