I'm curious what others use to scrape modern (JavaScript-based) web applications.
The old web (HTML and links) works fine with tools like Scrapy, but for modern applications that rely on JavaScript this no longer works.
For my last project I used a Chrome plugin that controlled the browser's URL changes and clicks. Results were transmitted to a backend server, and new jobs (clicks, URL changes) were retrieved from the same server.
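To make the setup concrete, here is a rough sketch of that job/result loop. It is not the actual plugin code: `Job`, `Backend`, and `run_worker` are made-up names, the backend is an in-memory stand-in for the real server, and the executor is faked instead of driving a browser.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    # Hypothetical job shape: "navigate" carries a URL, "click" a CSS selector.
    action: str
    target: str


@dataclass
class Backend:
    """In-memory stand-in for the backend server that hands out jobs."""
    pending: deque = field(default_factory=deque)
    results: list = field(default_factory=list)

    def next_job(self):
        # The plugin polls this; None means there is nothing left to do.
        return self.pending.popleft() if self.pending else None

    def submit(self, job, payload):
        # The plugin posts back whatever it scraped after running the job.
        self.results.append((job, payload))


def run_worker(backend, execute):
    """Poll for jobs until the queue is empty; `execute` drives the browser."""
    while (job := backend.next_job()) is not None:
        backend.submit(job, execute(job))


backend = Backend(deque([Job("navigate", "https://example.com"),
                         Job("click", "#load-more")]))
# Fake executor for illustration; a real one would drive the browser.
run_worker(backend, lambda job: f"{job.action}:{job.target}")
```

In the real thing the two `Backend` methods would be HTTP calls from the plugin, and `execute` would change the tab's URL or dispatch a click.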
This worked fine but took some effort to implement. Is there an open source solution that is as helpful as Scrapy but solves the problems posed by modern JavaScript websites/applications?
With tools like headless Chrome this should now be possible, right?
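For reference, the simplest thing I can think of with headless Chrome looks roughly like this. It is only a sketch: the binary name `google-chrome` is an assumption (it differs per platform and install), and `dump_dom_command` / `fetch_rendered_html` are names I made up. `--dump-dom` prints the DOM after the page has loaded, so content that only appears after later interaction would likely still be missing.

```python
import subprocess


def dump_dom_command(url, chrome="google-chrome"):
    # `chrome` is an assumed binary name; adjust it for your system.
    return [chrome, "--headless", "--disable-gpu", "--dump-dom", url]


def fetch_rendered_html(url, chrome="google-chrome", timeout=30):
    """Return the DOM serialized by headless Chrome after scripts have run."""
    result = subprocess.run(dump_dom_command(url, chrome),
                            capture_output=True, text=True,
                            timeout=timeout, check=True)
    return result.stdout
```

This covers the "render the page" part but not the clicking and navigating that the plugin handled.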
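Have a look at Splash and its Scrapy integration: https://github.com/scrapy-plugins/scrapy-splash
It runs a small headless browser as a separate service and renders the JavaScript for you.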
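If you want to try it without Scrapy first, Splash also exposes an HTTP API. A minimal sketch, assuming a Splash instance is already running on `localhost:8050` (the usual default port); `splash_render_url` and `fetch_rendered_html` are just helper names I picked, not part of Splash:

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def splash_render_url(page_url, splash="http://localhost:8050", wait=0.5):
    # `wait` gives the page's scripts some seconds to run before rendering.
    query = urlencode({"url": page_url, "wait": wait})
    return f"{splash}/render.html?{query}"


def fetch_rendered_html(page_url, **kwargs):
    """Ask a running Splash instance for the JavaScript-rendered HTML."""
    with urlopen(splash_render_url(page_url, **kwargs)) as response:
        return response.read().decode("utf-8")
```

Inside a Scrapy project the scrapy-splash plugin wraps the same idea, so your spiders get the rendered HTML instead of the raw page source.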