Hi All,
I have a Playwright crawler that, after a few hours, exhausts its memory and slows to a crawl. I haven't set up any custom logic to manage Crawlee's memory and concurrency, but my understanding was that AutoscaledPool should handle this on its own?
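For context, I'm running essentially the defaults. Below is a minimal sketch (illustrative values, not my actual config) of the kind of explicit caps I assumed I wouldn't need:

```ts
import { PlaywrightCrawler } from 'crawlee';

const crawler = new PlaywrightCrawler({
    // Hard ceilings so AutoscaledPool can never scale past what fits in RAM.
    minConcurrency: 1,
    maxConcurrency: 10,
    browserPoolOptions: {
        // Recycle each browser after N pages so memory leaked inside
        // Chromium is reclaimed, and cap open tabs per browser.
        retireBrowserAfterPageCount: 100,
        maxOpenPagesPerBrowser: 10,
    },
    async requestHandler({ page, enqueueLinks }) {
        // ...scraping logic...
        await enqueueLinks();
    },
});

await crawler.run(['https://example.com']);
```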
Most of the memory usage is coming from my Chromium instances: there are currently 27 of them, each taking between 50 and 100 MB. The Node process itself is taking around 500 MB.
Here is my system state message:
```json
{
  "level": "info",
  "service": "AutoscaledPool",
  "message": "state",
  "id": "5b83448e57d74571921de06df2d980f2",
  "jobId": "testPayload4",
  "currentConcurrency": 1,
  "desiredConcurrency": 1,
  "systemStatus": {
    "isSystemIdle": false,
    "memInfo": {
      "isOverloaded": true,
      "limitRatio": 0.2,
      "actualRatio": 1
    },
    "eventLoopInfo": {
      "isOverloaded": false,
      "limitRatio": 0.6,
      "actualRatio": 0.019
    },
    "cpuInfo": {
      "isOverloaded": false,
      "limitRatio": 0.4,
      "actualRatio": 0
    },
    "clientInfo": {
      "isOverloaded": false,
      "limitRatio": 0.3,
      "actualRatio": 0
    }
  }
}
```
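If I'm reading this right, the limitRatio values line up with the SystemStatus defaults (0.2 / 0.6 / 0.4 / 0.3), so memInfo.actualRatio: 1 means every recent memory snapshot was over the limit, and the pool has already dropped to desiredConcurrency: 1. As far as I can tell these thresholds would be tunable like this (a sketch with the defaults shown; I haven't changed them):

```ts
import { PlaywrightCrawler } from 'crawlee';

const crawler = new PlaywrightCrawler({
    autoscaledPoolOptions: {
        systemStatusOptions: {
            // A resource is flagged as overloaded when more than this
            // fraction of recent snapshots exceeded its limit.
            maxMemoryOverloadedRatio: 0.2,
            maxEventLoopOverloadedRatio: 0.6,
            maxCpuOverloadedRatio: 0.4,
            maxClientOverloadedRatio: 0.3,
        },
    },
    async requestHandler() { /* ... */ },
});
```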
And here is my memory warning message:
```json
{
  "level": "warning",
  "service": "Snapshotter",
  "message": "Memory is critically overloaded. Using 7164 MB of 6065 MB (118%). Consider increasing available memory.",
  "id": "5b83448e57d74571921de06df2d980f2",
  "jobId": "testPayload4"
}
```
The PC it is running on has 24 GB of RAM, so the ~6 GB limit makes sense given the default availableMemoryRatio of 0.25. The PC also has plenty of free RAM beyond what Crawlee is using; overall system memory usage is currently sitting at about 67%.
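If it comes to it, my understanding is that the ceiling can be raised with the CRAWLEE_MEMORY_MBYTES / CRAWLEE_AVAILABLE_MEMORY_RATIO environment variables, or in code (a sketch, not something I've tried yet):

```ts
import { Configuration } from 'crawlee';

// Let Crawlee use half of the machine's RAM instead of the default 25%.
Configuration.getGlobalConfig().set('availableMemoryRatio', 0.5);
// Or set an absolute cap in megabytes:
// Configuration.getGlobalConfig().set('memoryMbytes', 12_288);
```

But raising the limit feels like a workaround rather than a fix.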
Why isn't AutoscaledPool scaling down, or otherwise cleaning up Chromium instances, to get memory back under the limit?