Apify and Crawlee Official Forum

Updated 4 months ago

AdaptivePlaywrightCrawler: programmatically deciding when to render JS

Using the new adaptive Playwright crawler, is it possible to programmatically decide when to render JS?
For example, use HTTP crawling by default, but if some condition is met (for example, finding the word 'captcha' in the loaded URL), switch to JS rendering and try to unblock the page.

A related question, for which I didn't find an answer in the docs: how does the AdaptivePlaywrightCrawler decide whether to render JS?
4 comments
No official support atm, since the related classes are private https://github.com/apify/crawlee/blob/master/packages/playwright-crawler/src/internals/adaptive-playwright-crawler.ts#L143 so you are not expected to inherit from them and add your own logic. However, you can reuse or browse the current code, e.g. see how the prediction works based on a custom ratio https://github.com/apify/crawlee/blob/master/packages/playwright-crawler/src/internals/utils/rendering-type-prediction.ts#L32
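In the meantime, the "HTTP first, browser only when needed" idea from the question can be sketched outside the adaptive crawler. This is a minimal sketch, not Crawlee's API: `fetchWithEscalation`, `looksBlocked`, `Fetcher` and `FetchResult` are hypothetical names, the two fetchers are supplied by the caller, and it assumes the condition is checked against the response body.

```typescript
// Hypothetical helper types; none of these names come from Crawlee.
type Fetcher = (url: string) => Promise<string>;

interface FetchResult {
    url: string;
    html: string;
    rendered: boolean; // true if the browser fallback was used
}

// Example condition from the question: the word "captcha" appears in the body.
function looksBlocked(html: string): boolean {
    return /captcha/i.test(html);
}

// Try the cheap HTTP fetch first; escalate to the browser fetch only
// when the condition says the plain response is not usable.
async function fetchWithEscalation(
    url: string,
    httpFetch: Fetcher,
    browserFetch: Fetcher,
    needsBrowser: (html: string) => boolean = looksBlocked,
): Promise<FetchResult> {
    const html = await httpFetch(url);
    if (!needsBrowser(html)) {
        return { url, html, rendered: false };
    }
    const renderedHtml = await browserFetch(url);
    return { url, html: renderedHtml, rendered: true };
}
```

Here `httpFetch` could wrap a plain HTTP request and `browserFetch` a Playwright page load; how you wire those up (including inside Crawlee request handlers) is up to you and not shown here.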
OK, thanks for the clarification. Any chance of adding this to the roadmap? For example, with Scrapy it's easy to mix HTTP-only and JS rendering within the same crawler; it would be great to have the same in Crawlee.
No plans. Please open a GitHub issue with a feature request; if it becomes popular and attracts feedback from other users, the feature will be considered.