All the data I need is in the response to a specific request, which occurs before the page is loaded. What I'm trying to achieve is to close the page and move on to the next request as soon as I have what I need, so I tried doing it in preNavigationHooks:
preNavigationHooks: [
    async (crawlingContext, gotoOptions) => {
        const { page, request, log } = crawlingContext;
        gotoOptions.waitUntil = 'load';
        if (isProductUrl) {
            page.on('response', async (response) => {
                if (response.request().url().includes('productdetail')) {
                    try {
                        const data = await response.json();
                        await Actor.pushData(data);
                        await defaultQueue.markRequestHandled(request);
                        page.removeAllListeners('response');
                        await page.close();
                    } catch (err) {
                        log.error(err);
                    }
                }
            });
        }
    },
]
But I'm getting this error:
WARN PuppeteerCrawler: Reclaiming failed request back to the list or queue. Navigation failed because browser has disconnected!
When I remove the await page.close(); line, I get this error instead:
WARN PuppeteerCrawler: Reclaiming failed request back to the list or queue. requestHandler timed out after 130 seconds (o4wrbxkzgU1eP2n).
This is the default handler. It only contains code related to enqueuing URLs:
router.addDefaultHandler(async ({ request }) => {
    if (searchPageUrlPattern.test(request.url)) {
        // Enqueue links...
    }
});
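One possible way to restructure the listener, as a sketch and not a confirmed fix: turn the 'response' listener into a promise that resolves with the first matching response and detaches itself, so nothing inside the listener has to call page.close() or markRequestHandled. The names waitForMatchingResponse and demo are hypothetical, and the fake page below is a plain EventEmitter stand-in so the snippet runs without a browser. The helper only assumes the page object exposes on and off for 'response' events.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical helper: resolves with the first 'response' event that
// satisfies `predicate`, then detaches its own listener.
// Rejects if nothing matches within `timeoutMs`.
function waitForMatchingResponse(page, predicate, timeoutMs = 30000) {
    return new Promise((resolve, reject) => {
        const onResponse = (response) => {
            if (!predicate(response)) return;
            cleanup();
            resolve(response);
        };
        const timer = setTimeout(() => {
            cleanup();
            reject(new Error(`No matching response within ${timeoutMs} ms`));
        }, timeoutMs);
        function cleanup() {
            clearTimeout(timer);
            page.off('response', onResponse);
        }
        page.on('response', onResponse);
    });
}

// Stand-in for a real page, used only to exercise the helper here.
async function demo() {
    const fakePage = new EventEmitter();
    const pending = waitForMatchingResponse(
        fakePage,
        (res) => res.url().includes('productdetail'),
        1000,
    );
    // A non-matching response is ignored; the matching one resolves the promise.
    fakePage.emit('response', {
        url: () => 'https://example.com/other',
        json: async () => ({}),
    });
    fakePage.emit('response', {
        url: () => 'https://example.com/productdetail/1',
        json: async () => ({ id: 1 }),
    });
    const data = await (await pending).json();
    return { data, listenersLeft: fakePage.listenerCount('response') };
}
```

In the crawler, the promise would presumably be created in the pre-navigation hook (before navigation starts) and awaited in the request handler, which then pushes the data and returns so that the crawler marks the request handled and closes the page itself. Whether the crawling context can carry the promise from the hook to the handler is an assumption worth checking. Puppeteer also ships a built-in page.waitForResponse that accepts a URL or predicate and could replace the hand-written helper.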