Apify and Crawlee Official Forum

Updated 4 months ago

PuppeteerCrawler waitForResponse timeout issue. Seems like it skips the desired request

I'm trying to get the data from an AJAX POST call (GraphQL) on a webpage, but it does not seem to work.
I have tried running the crawler in headful mode with the network tab open: the request is being made and the response is there, but waitForResponse still times out.
Here's my code:
Plain Text
const crawler = new PuppeteerCrawler({
    proxyConfiguration,
    requestQueue,
    maxRequestRetries: 5,
    navigationTimeoutSecs: 180,
    requestHandlerTimeoutSecs: 180,
    async requestHandler({ request, page }) {
        // ...
        log.warning('GraphQL starting to wait');

        await page.waitForNetworkIdle();

        log.warning('IDLE!!!');

        await page.waitForRequest(
            (req) => req.url().includes(URL_PROPERTIES_DICTIONARY.GRAPHQL_PATH),
        );

        log.warning('GraphQL request is done');

        const response = await page.waitForResponse(
            (httpResponse) => httpResponse.status() === 200 && httpResponse.url().includes(URL_PROPERTIES_DICTIONARY.GRAPHQL_PATH),
            { timeout: 180 * 1000 },
        );

        log.warning('GraphQL response arrived');

        const data = await response.json();
        // ...
    },
});

As you can see, I also added waitForNetworkIdle for testing, and it resolves before waitForResponse, which is strange. See the logs:
Plain Text
INFO  Page opened. {"label":"vehicle","url":"https://www.autotrader.co.uk/car-details/202307270142806?sort=relevance&advertising-location=at_cars&make=Audi&model=A2&postcode=PO16%207GZ&fromsra"}
WARN  GraphQL starting to wait
WARN  IDLE!!!
WARN  PuppeteerCrawler: Reclaiming failed request back to the list or queue. Timed out after waiting 30000ms


Maybe I'm missing something?
By the way, the code was written for Apify SDK v1 and was working fine. After upgrading to v3 it either stopped working or became extremely slow.
9 comments
I would use the page.on('response') event and just add a condition for that particular link. If you keep struggling, DM me.
https://stackoverflow.com/questions/77397585/how-to-wait-for-specific-ajax-request-in-puppeteer-crawler
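The page.on('response') approach mentioned above might look roughly like this. This is a minimal sketch, not the poster's actual code: the GRAPHQL_PATH value is a placeholder standing in for URL_PROPERTIES_DICTIONARY.GRAPHQL_PATH, and the helper names are illustrative.

```javascript
// Placeholder for URL_PROPERTIES_DICTIONARY.GRAPHQL_PATH (an assumption).
const GRAPHQL_PATH = '/graphql';

// A small pure predicate keeps the matching logic testable in isolation.
const isGraphQLResponse = (url, status) =>
    status === 200 && url.includes(GRAPHQL_PATH);

// Register the listener BEFORE triggering the action that fires the
// GraphQL call, so the response cannot arrive unnoticed.
function waitForGraphQL(page, timeoutMs = 30000) {
    return new Promise((resolve, reject) => {
        const timer = setTimeout(
            () => reject(new Error('Timed out waiting for GraphQL response')),
            timeoutMs,
        );
        const onResponse = async (response) => {
            if (isGraphQLResponse(response.url(), response.status())) {
                clearTimeout(timer);
                page.off('response', onResponse);
                resolve(await response.json());
            }
        };
        page.on('response', onResponse);
    });
}
```

The key difference from waitForResponse is that the listener is attached up front, so it survives across navigations triggered inside the handler.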

A few months ago I was fixing this exact scraper and had this same issue, but I was able to solve it with waitForResponse, and it was working fine with Apify SDK v1.
Now, with Apify SDK v3, it's not working as expected.
You need to add waiting for the response in preNavigationHooks, as described here: https://docs.apify.com/academy/node-js/how_to_fix_target-closed#page-closed-solution
Thank you very much, this is exactly what I was looking for 🥹
I've tried the above example and in my case the context is always undefined:

Plain Text
preNavigationHooks: [
    async ({ page, context }) => {
        log.info('context', { context, type: typeof context });
        context.responsePromise = page
            .waitForResponse(`https://www.autotrader.co.uk${URL_PROPERTIES_DICTIONARY.GRAPHQL_PATH}`)
            .catch((e) => e);
    },
],


Plain Text
INFO  context {"type":"undefined"}
WARN  PuppeteerCrawler: Reclaiming failed request back to the list or queue. TypeError: Cannot set properties of undefined (setting 'responsePromise')
Plain Text
    preNavigationHooks: [
        async (context) => {
            context.responsePromise = context.page
                .waitForResponse((httpResponse) => httpResponse.url().includes(URL_PROPERTIES_DICTIONARY.GRAPHQL_PATH))
                .catch((e) => e);
        },
    ],
    async requestHandler({ request, page, responsePromise }) {
        // ...
    },


Basically, I did this in the end and it is working.
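For completeness, the hand-off between the hook and the handler can be exercised with plain promises. This is a minimal sketch under the assumption that the handler simply awaits the promise registered by the hook and checks for the Error value produced by the .catch((e) => e) above; the names and the simulated context are illustrative, not Crawlee's API.

```javascript
// Sketch of consuming the promise registered in preNavigationHooks.
async function handleResponse(context) {
    // The .catch((e) => e) in the hook resolves rejections to an Error
    // value, so check for it instead of treating it as a response.
    const response = await context.responsePromise;
    if (response instanceof Error) throw response;
    return response.json();
}

// Simulated hand-off: a plain promise stands in for page.waitForResponse.
const simulatedContext = {};
simulatedContext.responsePromise = Promise
    .resolve({ json: async () => ({ data: { vehicle: {} } }) })
    .catch((e) => e);
```

Because the promise is created before navigation, the response cannot be missed even if it arrives while the page is still loading.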

Thanks very much for sharing the right article.