Apify

Apify and Crawlee Official Forum


Trying to optimize autoscale options

Hello,

I am running my scraper on AWS ECS with 8 vCPUs and 16 GB of memory.

Plain Text
    maxConcurrency: 200,
    maxRequestsPerCrawl: 500,
    maxRequestRetries: 2,
    requestHandlerTimeoutSecs: 185,


Right now the average CPU and memory usage are both around 88%. Is there anything I can do to optimize this further?

I also have CRAWLEE_AVAILABLE_MEMORY_RATIO=.8
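For readers following along, here is a minimal sketch of where the autoscaling knobs sit next to the settings above. The top-level values are the ones from the post. The `autoscaledPoolOptions` block is an illustrative assumption: the option names come from Crawlee's documented `AutoscaledPoolOptions`, `SnapshotterOptions`, and `SystemStatusOptions`, and the numbers are what I understand to be the defaults (except `loggingIntervalSecs`, which is arbitrary), not recommended tuning.

```typescript
// Plain options object; in a real project it would be passed to the crawler,
// e.g. `new PuppeteerCrawler({ ...crawlerOptions, requestHandler })`.
const crawlerOptions = {
    // Values from the original post.
    maxConcurrency: 200, // upper bound only; the pool may run far below it
    maxRequestsPerCrawl: 500,
    maxRequestRetries: 2,
    requestHandlerTimeoutSecs: 185,

    // Assumed, illustrative values (not taken from the post).
    autoscaledPoolOptions: {
        // How often the pool logs its state line.
        loggingIntervalSecs: 30,
        snapshotterOptions: {
            // An event loop snapshot counts as overloaded when the loop
            // was delayed by more than this many milliseconds.
            maxBlockedMillis: 50,
        },
        systemStatusOptions: {
            // Share of recent snapshots allowed to be overloaded before
            // the pool stops scaling up and starts scaling down.
            maxEventLoopOverloadedRatio: 0.6,
            maxCpuOverloadedRatio: 0.4,
        },
    },
};

console.log(JSON.stringify(crawlerOptions.autoscaledPoolOptions, null, 2));
```

Raising these thresholds only hides the pressure. If the CPU is genuinely saturated, a lower `maxConcurrency` or more CPU is usually the more honest fix.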
9 comments
Hi, this is a case-by-case thing. It depends heavily on the sites being scraped, whether you are using a browser, the browser settings, and so on.
Any guidelines?
Also, CPU seems to hit 99% no matter what.
Plain Text
{"time":"2024-04-11T04:57:31.174Z","level":"INFO","msg":"PuppeteerCrawler:AutoscaledPool: state","scraper":"web","currentConcurrency":18,"desiredConcurrency":17,"systemStatus":{"isSystemIdle":false,"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":0},"eventLoopInfo":{"isOverloaded":true,"limitRatio":0.6,"actualRatio":0.736},"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":0},"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":0}}}    
What does eventLoop overloaded mean?
And how come currentConcurrency is 18 when I have maxConcurrency set to 200 and there are plenty of requests?
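One way to read that state line, as a rough sketch: as far as I understand Crawlee's autoscaling, each `...Info` entry compares the share of recent snapshots that were overloaded (`actualRatio`) with a threshold (`limitRatio`). Here `eventLoopInfo` shows 0.736 against a limit of 0.6, so the Node.js event loop is being blocked too often. The pool treats the system as overloaded and lowers `desiredConcurrency` (17, below the current 18). `maxConcurrency` is only a ceiling, which would explain why the pool never gets near 200. The helper below is hypothetical (not part of Crawlee), and the log line is trimmed to the fields it uses.

```typescript
// One load signal as it appears under "systemStatus" in the log line.
interface LoadSignal {
    isOverloaded: boolean;
    limitRatio: number;
    actualRatio: number;
}

// Subset of the AutoscaledPool state log entry used in this sketch.
interface PoolState {
    currentConcurrency: number;
    desiredConcurrency: number;
    systemStatus: {
        isSystemIdle: boolean;
        memInfo: LoadSignal;
        eventLoopInfo: LoadSignal;
        cpuInfo: LoadSignal;
        clientInfo: LoadSignal;
    };
}

// Hypothetical helper: names of the signals reported as overloaded.
function overloadedSignals(state: PoolState): string[] {
    const { memInfo, eventLoopInfo, cpuInfo, clientInfo } = state.systemStatus;
    const signals: Record<string, LoadSignal> = { memInfo, eventLoopInfo, cpuInfo, clientInfo };
    return Object.entries(signals)
        .filter(([, signal]) => signal.isOverloaded)
        .map(([name]) => name);
}

// The log line from the post, trimmed to the fields above.
const logLine =
    '{"currentConcurrency":18,"desiredConcurrency":17,"systemStatus":{"isSystemIdle":false,' +
    '"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":0},' +
    '"eventLoopInfo":{"isOverloaded":true,"limitRatio":0.6,"actualRatio":0.736},' +
    '"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":0},' +
    '"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":0}}}';

const state = JSON.parse(logLine) as PoolState;
const overloaded = overloadedSignals(state);
const scalingDown = state.desiredConcurrency < state.currentConcurrency;

console.log({ overloaded, scalingDown });
```

For this log line, the only overloaded signal is `eventLoopInfo`, and the pool is scaling down rather than up.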