Apify and Crawlee Official Forum


WARN CheerioCrawler: Reclaiming failed request back to the list or queue. Detected a session error,

Whole error: WARN CheerioCrawler: Reclaiming failed request back to the list or queue. Detected a session error, rotating session...
What does this error mean? It appears when the website does not exist, for example http://www.cool-rent.eu/
Besides being a really confusing error message, the request is then retried even though I have set maxRequestRetries: 0.
Can anything be done about it?
I have tried useSessionPool: false, but it did not help.
Thanks
10 comments
OK, I tried an older version of Crawlee, 3.3.0, and it works as it should: it just displays the error ERROR CheerioCrawler: Request failed and reached maximum retries. RequestError: getaddrinfo ENOTFOUND www.cool-rent.eu, so I am not sure what changed in version 3.5.0.
Hi Honza, thanks for submitting this issue. We've indeed changed the way proxy errors are handled in Crawlee v3.5.0 (relevant PR here - https://github.com/apify/crawlee/pull/2002). With this new mechanism, proxy and blocking errors are retried by default without increasing the request retry count (instead, they have a separate limit of 10 session retries per request - and after that, the crawl is interrupted as this is a clear telltale sign that something is really wrong with the proxy config).
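
To make the accounting described above concrete, here is a minimal sketch of two independent counters per request. It is a simplified model, not Crawlee's actual implementation: the function `processRequest`, the `Outcome` labels, and the field names in `Limits` are all hypothetical and chosen only for illustration.

```typescript
// Simplified model of the retry accounting described above.
// NOT Crawlee's code; every name in this block is hypothetical.

type Outcome = "ok" | "session-error" | "other-error";

interface Limits {
  maxRequestRetries: number;   // regular retries allowed after the first attempt
  maxSessionRotations: number; // session rotations allowed per request
}

interface Result {
  status: "succeeded" | "failed-max-retries" | "failed-max-rotations";
  attempts: number;  // how many times the request was tried
  retries: number;   // regular retries consumed
  rotations: number; // session rotations consumed
}

// Replays a scripted list of attempt outcomes for a single request.
function processRequest(outcomes: Outcome[], limits: Limits): Result {
  let attempts = 0;
  let retries = 0;
  let rotations = 0;

  for (const outcome of outcomes) {
    attempts++;

    if (outcome === "ok") {
      return { status: "succeeded", attempts, retries, rotations };
    }

    if (outcome === "session-error") {
      // Proxy/blocking errors use their own budget and leave `retries` untouched.
      if (rotations >= limits.maxSessionRotations) {
        return { status: "failed-max-rotations", attempts, retries, rotations };
      }
      rotations++;
      continue; // request is reclaimed and tried again with a new session
    }

    // Any other error counts against the regular retry budget.
    if (retries >= limits.maxRequestRetries) {
      return { status: "failed-max-retries", attempts, retries, rotations };
    }
    retries++;
  }

  throw new Error("Scripted outcomes ran out before the request finished");
}
```

In this model, a request with `maxRequestRetries: 0` still fails after a single attempt on an ordinary error such as ENOTFOUND, but it is tried again and again if every failure is classified as a session error, which matches the behaviour reported in this thread. To my knowledge, Crawlee exposes the corresponding limit as the `maxSessionRotations` crawler option (default 10); treat that as an assumption and check the documentation for your version.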

Unfortunately, I cannot reproduce your case: http://www.cool-rent.eu/ is unreachable (I cannot even resolve the server's IP address). Crawlee v3.5.0 without proxies processes this correctly by returning the same ENOTFOUND error as 3.3.0. With proxies, I receive a 502 error (from the proxy server); however, Crawlee does not recognize this as a proxy error (which is imho correct behaviour), and it is processed as a regular 5xx error with errorHandler. Can you please share more details about the proxies (or Apify proxy groups) you have used? Did you use proxies in the 3.3.0 case as well?

Thanks!
Hi vroom, I have managed to reproduce this on the platform, so I have shared the run URL via private message.
Any news? Now the actor just crashes; here is the run: https://console.apify.com/view/runs/12EeBfXyGV3IoLUn9 There must be some issue with that PR.
After some time, I am again working on an actor that takes a lot of URLs, and some of them no longer exist.
The crawler again shows that "rotating session" error. Is this expected behaviour? I would expect a 404 error.

for example

WARN  PlaywrightCrawler: Reclaiming failed request back to the list or queue. Detected a session error, rotating session...
2024-01-13T11:13:43.450Z page.goto: net::ERR_TUNNEL_CONNECTION_FAILED at https://cudaops.com/
2024-01-13T11:13:43.451Z Call log:
2024-01-13T11:13:43.452Z   - navigating to "https://cudaops.com/", waiting until "load"
Anybody? It is really annoying.
Did you figure this out?
Nope, I think this issue is still not solved.
Hey, guys.
I will try to check it with our team.
I have also run into this several times.

Thanks for your patience.