Mitmproxy¶
Intro¶
mitmproxy is an interactive, SSL-capable intercepting proxy with a console interface.
Pros and cons¶
Here is list of the popluar network tools user use to inspect http traffic.
wireshark | Fiddler | Charlesproxy | Mitmproxy |
Win, OSX, Linux | Win | OSX | Win, OSX, Linux |
GUI(Qt, GTK) | GUI(Native) | GUI(Native) | Console |
As we can see, mitmproxy has no gui interface for newbie user to inspect http request, but in my eyes this is the great advantage because we can launch mitmproxy in terminal and quickly detect http request
How to use mitmproxy to speed the development of spider¶
Terminal¶
Mitmproxy work fine in my terminal env and I can quickly switch between tools which used in spider deveopment. You can read Enhance your terminal.
In mitmproxy I can quickly check content of http requests by entering some key.
Filter¶
Sometime you know the website might use some ajax request to get the data you want to scrape, so you go to the network panel of your spider try to check the detail of the request. After you click 10+ links, your are tired and hope some tool can save your life here.
mitmproxy can really help you here.
You can write filter expression to make mitmproxy filter the request based on the expression. For example, if you want to filter http request which have content MAMA Jersey Top
, you can use the expression ~b "MAMA Jersey Top"
, or you can filter the http reqeusrt based on url, response.body, response.header
You can give it a try and I promise you will be surprised.
If you want to analyze https¶
When you start to use mitmproxy, it is stronglly recommened to install the CA certificate before you start because if you did not install the CA certificate you can not make mitmproxy inspect https request.
Take a look at this after install.