Apify and Crawlee Official Forum

Updated 3 months ago

Puppeteer Crawler cannot open the page

Hi,
I have a Puppeteer scraper which worked just fine until this Monday. Nothing has changed, but the scraper stopped working.

The HTML markup of the page has not changed; the a[data-testid="search-listing-title"] element is still there. The Apify run log says it fails to find this HTML element:
TimeoutError: waiting for selector a[data-testid="search-listing-title"] failed: timeout 30000ms exceeded
I have tried launching the scraper from my local machine and it worked there, but it does not work on the Apify platform. I guess it has something to do with the proxy.
This is part of my code:
//...
const proxyConfiguration = await Apify.createProxyConfiguration();

const launchContext = {
  useChrome: true,
  stealth: true,
  launchOptions: {
    headless: true,
  },
};

const crawler = new Apify.PuppeteerCrawler({
  requestList,
  requestQueue,
  proxyConfiguration,
  launchContext: launchContext as any,
  maxRequestRetries: 5,
  handlePageTimeoutSecs: 180,
  navigationTimeoutSecs: 180,
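  async handlePageFunction({ page, request }): Promise<void> {
    // Save a screenshot before scraping to see what the browser actually rendered.
    await utils.puppeteer.saveSnapshot(page, { key: 'beforescrap', saveHtml: false });

    // Store the rendered HTML in the key-value store for inspection.
    const $ = load(await page.content());
    const html = $.html();
    await Apify.setValue('htmlstring', html, { contentType: 'text/html' });

    // This is the call that times out on the platform.
    await page.waitForSelector('a[data-testid="search-listing-title"]');
//...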

I have tried taking a screenshot to see what the page looks like, and it shows a blank white page.
I have also tried changing the proxy settings to use residential proxies and changing the country; that did not work either.
How can I debug this?
A screenshot of the logs is also attached.
Attachment
a.png
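One way to narrow this down is to look at the HTML that was saved to the key-value store rather than only the screenshot. The sketch below is a hypothetical helper (classifySnapshot, its verdict names, and the 200-character threshold are assumptions for illustration, not part of the Apify SDK). It separates three cases: the selector is present, the page is an almost empty shell (for example a React skeleton that never hydrated), or the page has content but not the expected element (which may point to a block or challenge page).

```typescript
// Hypothetical debugging helper, not part of the Apify SDK.
// It classifies an HTML snapshot that was saved during a failed run.
type SnapshotVerdict = 'ok' | 'empty-shell' | 'selector-missing';

function classifySnapshot(
  html: string,
  marker: string,
  minTextLength = 200, // assumed threshold; tune for the target site
): SnapshotVerdict {
  // The marker is a plain substring of the expected element, e.g. its data-testid attribute.
  if (html.includes(marker)) return 'ok';

  // Roughly estimate the visible text by dropping scripts, styles and tags.
  const visibleText = html
    .replace(/<script[\s\S]*?<\/script>/gi, ' ')
    .replace(/<style[\s\S]*?<\/style>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();

  // Very little text suggests the app never rendered; otherwise some other page was served.
  return visibleText.length < minTextLength ? 'empty-shell' : 'selector-missing';
}
```

In the handler it could be called as classifySnapshot(await page.content(), 'data-testid="search-listing-title"') and the verdict logged before waitForSelector runs. This is only a heuristic: a plain substring check can match text inside scripts, so the verdict is a hint, not proof.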
10 comments
Hi,
Can you try running https://apify.com/apify/screenshot-url on the URL that failed for you?
Just out of curiosity I tried running it on a page like https://www.autotrader.co.uk/car-search?postcode=PO16%207GZ&refresh=true and it seems all the data is there 🤔
OK, I'll try it.

The problem is not with a single URL. As you can see, it's a car listings website and basically we are scraping the vehicles.
The React skeleton can be seen here.
My guess is that maybe Puppeteer's "stealth" mode is not working or something. I'm thinking maybe I'll upgrade the Apify SDK to the latest version. I don't know...
Not a big deal. In my actor I get a completely blank screenshot.
I mean the data seems to be there for me, and I am using just DATACENTER proxies. I am trying to think about what may be causing this in your case.


Hi, I have upgraded to apify v3 (I rewrote it in pure JS).
The selector can be found now, so that part is solved. I guess Puppeteer's stealth plugin was not doing its job and I was blocked... With the new SDK it works OK.
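For anyone doing the same upgrade, here is a minimal sketch of how the option names from the original snippet map onto the v3 (Crawlee) names. migrateCrawlerOptions is a hypothetical helper written for illustration, not an SDK function. The three renames follow the Crawlee v3 upgrading guide as far as I know; dropping launchContext.stealth is an assumption (v3 relies on browser fingerprints instead, as I understand it), so please check the guide before relying on it.

```typescript
// Hypothetical helper, not part of Apify SDK or Crawlee.
type CrawlerOptions = Record<string, unknown>;

// v2 option name -> v3 option name.
const RENAMED_OPTIONS = new Map<string, string>([
  ['handlePageFunction', 'requestHandler'],
  ['handlePageTimeoutSecs', 'requestHandlerTimeoutSecs'],
  ['handleFailedRequestFunction', 'failedRequestHandler'],
]);

function migrateCrawlerOptions(v2Options: CrawlerOptions): CrawlerOptions {
  const v3Options: CrawlerOptions = {};

  // Copy every option, renaming the ones listed above and keeping the rest as-is.
  for (const [key, value] of Object.entries(v2Options)) {
    v3Options[RENAMED_OPTIONS.get(key) ?? key] = value;
  }

  // Assumption: `stealth` is no longer accepted in launchContext, so it is dropped here.
  const launchContext = v3Options.launchContext;
  if (launchContext !== null && typeof launchContext === 'object') {
    const cleaned = { ...(launchContext as Record<string, unknown>) };
    delete cleaned.stealth;
    v3Options.launchContext = cleaned;
  }

  return v3Options;
}
```

Passing the options object from the first post through this function would leave proxyConfiguration, maxRequestRetries and navigationTimeoutSecs untouched and rename the handler and its timeout. The input object is not modified.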

So the main problem is solved. Should I close the topic or something?

BTW I have a new problem (: which I've posted here: https://discord.com/channels/801163717915574323/1231166329076191274
Thank you. No, this is fine.