I'm trying to get the data from an AJAX POST call (GraphQL) on a webpage, but it does not seem to work.
I ran the crawler in headful mode with the network tab open: the request is made and the response arrives, but waitForResponse does not seem to pick it up.
Here's my code:
const crawler = new PuppeteerCrawler({
    proxyConfiguration,
    requestQueue,
    maxRequestRetries: 5,
    navigationTimeoutSecs: 180,
    requestHandlerTimeoutSecs: 180,
    async requestHandler({ request, page }) {
        // ...
        log.warning('GraphQL starting to wait');
        await page.waitForNetworkIdle();
        log.warning('IDLE!!!');
        await page.waitForRequest(
            (req) => req.url().includes(URL_PROPERTIES_DICTIONARY.GRAPHQL_PATH),
        );
        log.warning('GraphQL request is done');
        const response = await page.waitForResponse(
            (httpResponse) => httpResponse.status() === 200
                && httpResponse.url().includes(URL_PROPERTIES_DICTIONARY.GRAPHQL_PATH),
            { timeout: 180 * 1000 },
        );
        log.warning('GraphQL response arrived');
        const data = await response.json();
        // ...
As you can see, I have also added waitForNetworkIdle for testing, and it finishes before waitForResponse, which is strange. See the logs:
INFO Page opened. {"label":"vehicle","url":"https://www.autotrader.co.uk/car-details/202307270142806?sort=relevance&advertising-location=at_cars&make=Audi&model=A2&postcode=PO16%207GZ&fromsra"}
WARN GraphQL starting to wait
WARN IDLE!!!
WARN PuppeteerCrawler: Reclaiming failed request back to the list or queue. Timed out after waiting 30000ms
Maybe I'm missing something?
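One thing I have not ruled out: the 30000ms in the log matches Puppeteer's default timeout rather than my 180-second one, and 'GraphQL request is done' is never logged, so the call that times out may actually be waitForRequest. If the GraphQL request fires before the network goes idle, a listener registered afterwards would never see it. Here is a minimal sketch of that suspected race, assuming it is the cause. It uses a plain EventEmitter as a stand-in for the page, and the waitForResponse helper, the example.com URL and the '/graphql' path are hypothetical, not Puppeteer's or Crawlee's real API:

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical stand-in for page.waitForResponse: resolves with the first
// 'response' event matching the predicate, rejects after timeoutMs.
function waitForResponse(page, predicate, timeoutMs) {
    return new Promise((resolve, reject) => {
        const onResponse = (res) => {
            if (!predicate(res)) return;
            clearTimeout(timer);
            page.off('response', onResponse);
            resolve(res);
        };
        const timer = setTimeout(() => {
            page.off('response', onResponse);
            reject(new Error(`Timed out after waiting ${timeoutMs}ms`));
        }, timeoutMs);
        page.on('response', onResponse);
    });
}

// Assumed matcher; '/graphql' is a placeholder for the real GraphQL path.
const isGraphql = (res) => res.status === 200 && res.url.includes('/graphql');
const fakeResponse = { status: 200, url: 'https://example.com/graphql' };

// Listener registered AFTER the response has fired: the event is missed.
function waitAfterLoad() {
    const page = new EventEmitter();
    page.emit('response', fakeResponse);
    return waitForResponse(page, isGraphql, 50)
        .then(() => 'resolved', () => 'timed out');
}

// Listener registered BEFORE the response fires: the event is caught.
function waitBeforeLoad() {
    const page = new EventEmitter();
    const pending = waitForResponse(page, isGraphql, 50);
    page.emit('response', fakeResponse);
    return pending.then(() => 'resolved', () => 'timed out');
}
```

If this is what happens in my crawler, the wait would have to be set up before the page navigates (for example in the crawler's preNavigationHooks) and awaited inside requestHandler, but I have not confirmed that yet.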
By the way, the code was written for Apify SDK version 1 and was working OK. After I upgraded to v3, it either stopped working or works really slowly.