Apify and Crawlee Official Forum

Prevent Crawler from adding failed request to default RequestQueue

Is there a way to prevent the crawler from adding a failed request to the default RequestQueue?

JavaScript
import { PuppeteerCrawler, RequestList } from 'crawlee';

const crawler = new PuppeteerCrawler({
    proxyConfiguration,
    requestHandler: router,
    maxRequestRetries: 25,
    // Static list holding only the start URL
    requestList: await RequestList.open(null, [initUrl]),
    requestHandlerTimeoutSecs: 2000,
    maxConcurrency: 1,
}, config);

I'm using the default RequestQueue to add productUrls, and they're handled inside the default request handler. When one of them fails, I purposely throw an Error, expecting the failed request (which is the initUrl) to go back to the RequestList, but it goes into the default RequestQueue instead, which is not what I want.
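
For reference, the flow described above might look roughly like this (a hypothetical reconstruction; the selectors, the isProduct flag, and the success check are assumptions, not the actual code):

JavaScript
import { createPuppeteerRouter } from 'crawlee';

export const router = createPuppeteerRouter();

router.addDefaultHandler(async ({ request, page, crawler }) => {
    if (!request.userData.isProduct) {
        // initUrl: collect product links and enqueue them into the
        // default RequestQueue (the selector is an assumption)
        const productUrls = await page.$$eval('a.product', (as) => as.map((a) => a.href));
        await crawler.addRequests(productUrls.map((url) => ({ url, userData: { isProduct: true } })));
        return;
    }
    // Product page: throwing here marks the request as failed, and the
    // crawler retries it up to maxRequestRetries times.
    const title = await page.$('h1.product-title'); // hypothetical success check
    if (!title) throw new Error('Product page did not render as expected');
});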
2 comments
Don't throw an error if you don't want the request to be retried; throwing in a handler is exactly what tells the crawler to retry, and that retry logic exists mainly to recover from temporary blocking. Also note that when a crawler is given both a RequestList and a RequestQueue, it automatically enqueues the list's requests into the queue before processing them, so retries always go through the queue; a failed request never goes back to the RequestList.
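
In practice there are two ways to achieve that (a minimal sketch, assuming a simple waitForSelector success check): either catch the error and return normally, so the request counts as handled, or set Crawlee's request.noRetry flag before rethrowing, so the request fails once without being retried or re-enqueued:

JavaScript
import { createPuppeteerRouter } from 'crawlee';

export const router = createPuppeteerRouter();

router.addDefaultHandler(async ({ request, page, log }) => {
    try {
        await page.waitForSelector('h1.product-title', { timeout: 10_000 }); // hypothetical check
    } catch (err) {
        // Option A: handle the error locally and return; a handler that
        // finishes without throwing counts as success, so the request is
        // neither retried nor re-enqueued.
        log.warning(`Giving up on ${request.url}: ${err.message}`);
        return;

        // Option B: mark the request as non-retryable and rethrow; it fails
        // exactly once (failedRequestHandler still runs) and is never retried:
        // request.noRetry = true;
        // throw err;
    }
});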