Apify Discord Mirror

Updated 5 months ago

Sketchy scrolling distance (apify/web-scraper)

At a glance

The community member is using apify/web-scraper to scrape a website whose pages don't load all their content initially; more appears only after scrolling down. They set the scroll distance setting to the highest possible value, but it works only about half of the time. Other community members suggest trying the infiniteScroll option and sharing screenshots of the failed runs to help troubleshoot. The community member was unable to get it to work and hired a freelancer to build a crawler outside of the Apify platform. They note that more control over scrolling within the web-scraper actor would be a great feature for the next user.

Hi,

I'm using apify/web-scraper to scrape a website. The pages don't load all the content initially; you need to scroll down to see more.

I have set the scroll distance setting to the highest number possible, but it still works only half of the time.

Is there any way to improve the scroll behavior? Right now it works on only about 50% of the requests, and the pages it works on differ from run to run. Is there another way to control this functionality?

Thanks,
Bob
3 comments
I've had good success using the infiniteScroll option, so you might want to try that. I'm not sure how you are currently attempting it, though.
Example of how to do it:
https://blog.apify.com/how-to-scrape-the-web-with-playwright-ece1ced75f73/
We would need to see the runs where it fails. Scrolling is tricky because sometimes the loading takes too long or there is a button etc.

Taking screenshots is a good idea. At worst, you would need to rewrite it as a new actor where you have more control over the scrolling functionality: https://crawlee.dev/api/playwright-crawler/namespace/playwrightUtils#infiniteScroll
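For anyone who ends up writing their own actor, here is a rough sketch of the idea behind "more control over scrolling": instead of scrolling a fixed distance, keep scrolling until the page height stops growing, with a hard cap on the number of rounds. This is not the web-scraper's or Crawlee's implementation; the `ScrollablePage` interface, the `ScrollOptions` fields and the `scrollUntilStable` function are hypothetical names made up for illustration, and the right values for the waits depend on the site.

```typescript
// Hypothetical minimal page abstraction (not a Crawlee or Playwright type).
// It keeps the loop independent of any browser library.
interface ScrollablePage {
  getScrollHeight(): Promise<number>;
  scrollToBottom(): Promise<void>;
  wait(ms: number): Promise<void>;
}

// Hypothetical options; tune them per site.
interface ScrollOptions {
  maxRounds: number;    // hard cap so an endless feed cannot hang the run
  stableRounds: number; // consecutive rounds without growth before stopping
  waitMs: number;       // pause after each scroll so lazy content can load
}

// Scrolls to the bottom repeatedly until the page height has stayed the same
// for `stableRounds` consecutive rounds, or `maxRounds` is reached.
// Returns the number of scroll rounds performed.
async function scrollUntilStable(
  page: ScrollablePage,
  opts: ScrollOptions,
): Promise<number> {
  let lastHeight = await page.getScrollHeight();
  let unchanged = 0;
  let rounds = 0;

  while (rounds < opts.maxRounds && unchanged < opts.stableRounds) {
    await page.scrollToBottom();
    await page.wait(opts.waitMs);

    const height = await page.getScrollHeight();
    // Reset the counter whenever new content made the page taller.
    unchanged = height === lastHeight ? unchanged + 1 : 0;
    lastHeight = height;
    rounds++;
  }

  return rounds;
}
```

With Playwright, the three methods could plausibly be backed by `page.evaluate(() => document.body.scrollHeight)`, `page.evaluate(() => window.scrollTo(0, document.body.scrollHeight))` and `page.waitForTimeout(ms)`. Requiring several stable rounds rather than one is meant to cover the "loading takes too long" case mentioned above; a "load more" button would still need separate handling.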
Thanks both for the suggestions. Unfortunately, I didn't get it to work and had to hire a freelancer, who built me a crawler outside of the Apify platform. Definitely a feature that would be great to have within the web-scraper actor itself for the next guy 🙂