PlaywrightCrawler actor not finishing requestQueue

At a glance

The original poster has a Playwright Actor that does not process all 10 URLs in its request queue. It handles between 1 and 7 URLs per run (typically 4 to 7), after which the log only repeats the statistics messages. This happens both locally and on the Apify platform. The original poster shared a run URL, but no cause was identified in the thread. Two other community members reported similar problems with Playwright crawlers in Crawlee; one of them saw a crawler finish with requests still pending in the queue, which a responder considered unrelated because in the original run all requests were fetched from the queue and the Actor then stalled while handling them. The original poster never found the root cause, but rebuilding the project from the base Playwright Actor provided by Apify made the problem go away.

kennysmithnanic

I have a Playwright Actor that has 10 URLs added to its queue before I kick it off with .run(). But the Actor doesn't finish all 10 URLs. It will process between 4 and 7, then the log for the run just shows the statistics message repeated every second.

Note that this happens in my local runs of this Actor as well. The total number of URLs scraped (out of 10) varies from run to run, minimum 1 URL and max 7 (of 10 total).

This is the message it shows on repeat, both locally and on the Apify platform:

2024-05-22T22:34:24.274Z INFO  Statistics: PlaywrightCrawler request statistics: {"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":35781,"requestsFinishedPerMinute":2,"requestsFailedPerMinute":0,"requestTotalDurationMillis":143124,"requestsTotal":4,"crawlerRuntimeMillis":120866,"retryHistogram":[4]}
2024-05-22T22:34:24.301Z INFO  PlaywrightCrawler:AutoscaledPool: state {"currentConcurrency":6,"desiredConcurrency":11,"systemStatus":{"isSystemIdle":true,"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":0},"eventLoopInfo":{"isOverloaded":false,"limitRatio":0.6,"actualRatio":0},"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":0},"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":0}}}

Why would it stop pulling from the requestQueue? There are no errors in the Actor prior to this.

8 comments

ondro_k

Hi, could you share the ID (or URL) of your run?

kennysmithnanic

here ya go: https://console.apify.com/actors/Ii5B19kfFomWR9Ggo/runs/nkLUqHl7YkyCfF2Ox#output

sabin9848

Hi, did you find out what was wrong? I am having the same issue while using a Playwright browser in Crawlee.

HonzaS

I have had a similar problem with a Playwright crawler: it finished while there were still pending requests in the queue.
But now this happened: pending requests = -5. How can this happen?

kennysmithnanic

I never figured out why this was happening. I wound up starting my project from scratch again from the base Playwright Actor provided by Apify, and I haven’t had this problem again.

vojtechmaslan

These two issues seem unrelated. In the run from kenny, all requests get fetched from the queue, but then the Actor stalls while handling them.

To check your issue, we would need more info.
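
The observation about the first run can be cross-checked against the two log lines quoted in the original post. The following is a minimal TypeScript sketch, not an official Crawlee utility: the helper names (`parseJsonAfter`, `summarize`, `QueueSnapshot`) are hypothetical, and it assumes that `requestsTotal` counts requests already handled (finished or failed) and that `currentConcurrency` counts requests still being handled. Only the field names and values come from the log itself.

```typescript
// Hypothetical helper types and names; only the JSON field names come from the log.
interface QueueSnapshot {
  handled: number;   // requestsTotal from the Statistics line
  inFlight: number;  // currentConcurrency from the AutoscaledPool line
  accounted: number; // handled + inFlight
}

// Extract and parse the JSON object that follows `marker` in a log line.
function parseJsonAfter(line: string, marker: string): Record<string, unknown> {
  const markerAt = line.indexOf(marker);
  if (markerAt === -1) throw new Error(`marker not found: ${marker}`);
  const jsonAt = line.indexOf("{", markerAt + marker.length);
  if (jsonAt === -1) throw new Error(`no JSON payload after: ${marker}`);
  return JSON.parse(line.slice(jsonAt));
}

// Combine the two log lines into one snapshot.
// Assumption: handled and in-flight requests do not overlap.
function summarize(statsLine: string, poolLine: string): QueueSnapshot {
  const stats = parseJsonAfter(statsLine, "request statistics:");
  const pool = parseJsonAfter(poolLine, "AutoscaledPool: state");
  const handled = Number(stats.requestsTotal);
  const inFlight = Number(pool.currentConcurrency);
  return { handled, inFlight, accounted: handled + inFlight };
}

// The two log lines exactly as quoted in the original post.
const statsLine =
  '2024-05-22T22:34:24.274Z INFO  Statistics: PlaywrightCrawler request statistics: {"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":35781,"requestsFinishedPerMinute":2,"requestsFailedPerMinute":0,"requestTotalDurationMillis":143124,"requestsTotal":4,"crawlerRuntimeMillis":120866,"retryHistogram":[4]}';
const poolLine =
  '2024-05-22T22:34:24.301Z INFO  PlaywrightCrawler:AutoscaledPool: state {"currentConcurrency":6,"desiredConcurrency":11,"systemStatus":{"isSystemIdle":true,"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":0},"eventLoopInfo":{"isOverloaded":false,"limitRatio":0.6,"actualRatio":0},"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":0},"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":0}}}';

const snapshot = summarize(statsLine, poolLine);
console.log(snapshot); // handled: 4, inFlight: 6, accounted: 10
```

Under those assumptions, 4 handled plus 6 in flight accounts for all 10 URLs, which would fit the reading that every request was fetched from the queue and the remaining six were stuck inside their handlers. The sketch does not show why the handlers stalled.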

HonzaS

I have shared the run with you via private message. Thanks.

Apify Discord Mirror