This project was built to help people, and I do not earn money from it, but you can still support my work.
Many websites now minify their JavaScript files before deployment, for example with uglify or compressor, so we should learn how to analyze minified code in the browser and, in some cases, debug it to figure out the workflow. The process is similar to disassembly in reverse engineering.
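Besides reading the code in the browser, it can help to search the downloaded file for a keyword and look at the text around each hit. The following is a minimal sketch of that idea, not a full analysis tool; the helper name find_contexts and the sample script are made up for illustration.

```python
import re


def find_contexts(source, keyword, width=40):
    """Return a snippet of `source` around every occurrence of `keyword`."""
    snippets = []
    for match in re.finditer(re.escape(keyword), source):
        # Clamp the window to the bounds of the source text.
        start = max(match.start() - width, 0)
        end = min(match.end() + width, len(source))
        snippets.append(source[start:end])
    return snippets


# Made-up one-line minified script, for illustration only.
MINIFIED = (
    'var a=function(b){return c(b+"x")};'
    'function d(e){return"/api?id="+e+"&token="+a(e)}'
)

if __name__ == "__main__":
    for snippet in find_contexts(MINIFIED, "token", width=20):
        print(snippet)
```

Each snippet shows which function builds the value, which is usually a good place to set a breakpoint in the browser debugger.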
You will see that the Ajax URL includes a parameter named sign, but it is not obvious where its value comes from, and the JS file detail_sign.js appears to be minified.
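Once the origin of sign is understood, the request can be reproduced outside the browser. The sketch below assumes, purely for illustration, that sign is the MD5 hex digest of the product id joined with a fixed salt. The real algorithm in detail_sign.js may be entirely different, and the URL, the productId parameter name, and the salt are all hypothetical placeholders.

```python
import hashlib
from urllib.parse import urlencode

# Hypothetical placeholders, not taken from the real site.
BASE_URL = "https://example.com/ajax/detail"
SALT = "hypothetical-salt"


def compute_sign(product_id, salt=SALT):
    """Assumed scheme: MD5 hex digest of product id + salt."""
    return hashlib.md5((product_id + salt).encode("utf-8")).hexdigest()


def build_ajax_url(product_id):
    """Build the Ajax URL with the computed sign appended as a query parameter."""
    params = {"productId": product_id, "sign": compute_sign(product_id)}
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_ajax_url("1"))
```

Replace compute_sign with whatever logic the minified file actually implements; the rest of the spider can then request the resulting URL like any other Ajax endpoint.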
Tips:
Pretty-print detail_sign.js in your browser. A quick web search will show you how.
Web scraping using XPath or CSS expression
Load JSON string and extract data
Crawl products and also handle pagination
Inspect Ajax requests and mimic them
Learn to inspect the fields of HTTP request
Scraping Infinite Scrolling Pages (Ajax)
Learn to scrape infinite scrolling pages
Make your spider work with cookies
Scrape data behind login form
Learn to scrape data behind a captcha
Learn how to analyze minimized or compressed javascript