Apify and Crawlee Official Forum

Updated 2 months ago

How to set concurrency, CPUs, and memory correctly

Hello, I would like to use PlaywrightCrawler for scraping, but it is not clear from the documentation how to set up concurrency, memory, CPUs, etc. correctly. Can someone help me out? What is the best practice for configuring this crawler so that scraping runs in parallel? Thanks in advance!
1 comment
Hello! The best concurrency settings depend on the context: the available resources, the use case, and the website being scraped. You can pass crawling options when creating the PlaywrightCrawler; see https://crawlee.dev/python/api/class/PlaywrightCrawler#__init__ and https://crawlee.dev/python/api/class/BasicCrawler#__init__. For instance, you can pass concurrency_settings: https://crawlee.dev/python/api/class/ConcurrencySettings.
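As a rough illustration of the concurrency_settings option mentioned above, here is a minimal sketch. The numbers are placeholders rather than recommendations, and the import paths assume a recent Crawlee for Python release installed with the Playwright extra (older releases may expose PlaywrightCrawler from a different module). The start URL and max_requests_per_crawl value are only there to keep the example small.

```python
# Illustrative values only; tune them to your machine and the target website.
CONCURRENCY_KWARGS = {
    'min_concurrency': 1,        # never run fewer parallel tasks than this
    'desired_concurrency': 2,    # starting point before autoscaling adjusts it
    'max_concurrency': 5,        # hard upper bound on parallel tasks
    'max_tasks_per_minute': 60,  # rate limit across all tasks
}


async def main() -> None:
    # Imported here so the settings above can be inspected without Crawlee installed.
    # Assumption: these import paths match your installed Crawlee version.
    from crawlee import ConcurrencySettings
    from crawlee.crawlers import PlaywrightCrawler, PlaywrightCrawlingContext

    crawler = PlaywrightCrawler(
        concurrency_settings=ConcurrencySettings(**CONCURRENCY_KWARGS),
        max_requests_per_crawl=20,  # placeholder cap for a short test run
    )

    @crawler.router.default_handler
    async def handler(context: PlaywrightCrawlingContext) -> None:
        # Log each visited page and queue the links found on it.
        context.log.info(f'Processing {context.request.url}')
        await context.enqueue_links()

    await crawler.run(['https://crawlee.dev'])


# To run: import asyncio; asyncio.run(main())
```

Within the min/max bounds, the crawler scales the number of parallel tasks up and down on its own, so a reasonable approach is to start with a low max_concurrency and raise it while watching CPU and memory usage.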