Apify and Crawlee Official Forum

Updated last week

Moving from Playwright to Crawlee/Playwright for Scraping

Are there any resources on building a scraper with Crawlee besides the ones in the docs?
For example, where do I set all the browser context options?

import playwright from "playwright";

const launchPlaywright = async () => {
  const browser = await playwright["chromium"].launch({
    headless: true,
    args: ["--disable-blink-features=AutomationControlled"],
  });

  const context = await browser.newContext({
    viewport: { width: 1280, height: 720 },
    userAgent:
      "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3",
    geolocation: { longitude: 7.8421, latitude: 47.9978 },
    permissions: ["geolocation"],
    locale: "en-US",
    storageState: "playwright/auth/user.json",
  });
  return await context.newPage();
};
2 comments
Or set it within a pre-navigation hook.

Something like:

import { PlaywrightCrawler } from 'crawlee';

const crawler = new PlaywrightCrawler({
  preNavigationHooks: [
    async ({ page }) => {
      // Add a cookie to the page's browser context before navigating.
      // Playwright requires `url`, or both `domain` and `path`, for each cookie.
      await page.context().addCookies([
        { name: 'session', value: '12345', domain: 'example.com', path: '/' },
      ]);

      // Playwright pages have no setUserAgent() method (that is Puppeteer's API).
      // Override the User-Agent request header instead; this does not change
      // navigator.userAgent inside the page.
      await page.setExtraHTTPHeaders({
        'User-Agent':
          'Mozilla/5.0 (iPhone; CPU iPhone OS 15_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.0 Mobile/15E148 Safari/604.1',
      });
    },
  ],
  requestHandler: async ({ page, request }) => {
    console.log(`Visiting ${request.url}`);
    const content = await page.content();
    console.log(`Content length: ${content.length}`);
  },
});

await crawler.run(['https://example.com']);
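
The same hook approach can probably cover some of the context settings from the question too. Below is a minimal sketch, not an official Crawlee recipe: `makeContextHook` and `contextHook` are hypothetical names, and the helper only calls Playwright methods that work on an already created page or context (`page.setViewportSize()`, `grantPermissions()`, `setGeolocation()`). Settings such as `locale`, `userAgent`, and `storageState` are chosen when Playwright creates the context, so this sketch leaves them out. The commented usage assumes the launch flags go under `launchContext.launchOptions`.

```javascript
// Hypothetical helper: builds a pre-navigation hook from a settings object.
// Only settings that Playwright can change after the context exists are applied.
function makeContextHook(settings) {
  return async ({ page }) => {
    const context = page.context();
    if (settings.viewport) {
      await page.setViewportSize(settings.viewport);
    }
    if (settings.permissions) {
      await context.grantPermissions(settings.permissions);
    }
    if (settings.geolocation) {
      await context.setGeolocation(settings.geolocation);
    }
  };
}

// Values copied from the question's newContext() call.
const contextHook = makeContextHook({
  viewport: { width: 1280, height: 720 },
  geolocation: { longitude: 7.8421, latitude: 47.9978 },
  permissions: ['geolocation'],
});

// Assumed usage with Crawlee (shown as a comment, not executed here):
// const crawler = new PlaywrightCrawler({
//   launchContext: {
//     launchOptions: {
//       headless: true,
//       args: ['--disable-blink-features=AutomationControlled'],
//     },
//   },
//   preNavigationHooks: [contextHook],
//   requestHandler: async ({ page, request }) => { /* ... */ },
// });
```

Keeping the settings in a plain object means the hook can be exercised with a stub page before it is wired into a crawler.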