The structure I am using does not look like the best one.
I am basically creating several routers and then doing something like:
import { PlaywrightCrawler, Dataset } from "crawlee";

const crawler = new PlaywrightCrawler({
    // proxyConfiguration: new ProxyConfiguration({ proxyUrls: ['...'] }),
    requestHandler: async (ctx) => {
        // Pick the router by checking which site the URL belongs to.
        if (ctx.request.url.includes("url1")) {
            await url1Router(ctx);
        }
        if (ctx.request.url.includes("url2")) {
            await url2Router(ctx);
        }
        if (ctx.request.url.includes("url3")) {
            await url3Router(ctx);
        }
        // Note: as written, this export runs after every handled request.
        await Dataset.exportToJSON("data.json");
    },
    // Comment this option to scrape the full website.
    // maxRequestsPerCrawl: 20,
});
This does not seem correct. Does anyone know a better way?
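For comparison, here is a rough sketch of one direction that might be tidier: replacing the chain of `if` statements with a small dispatch table. The names `Ctx`, `Handler`, and `dispatchByUrl` are made up for illustration and are not part of Crawlee; `Ctx` is only a stand-in for the real crawling context. Unlike the snippet above, this version runs only the first matching router and throws when nothing matches, which may or may not be the behaviour wanted.

```typescript
// Hypothetical stand-in for the crawling context; only the URL is needed here.
type Ctx = { request: { url: string } };
type Handler = (ctx: Ctx) => Promise<void>;

// Build one request handler from [urlFragment, handler] pairs.
// The first pair whose fragment occurs in the request URL wins.
function dispatchByUrl(routes: Array<[string, Handler]>): Handler {
    return async (ctx) => {
        const match = routes.find(([fragment]) => ctx.request.url.includes(fragment));
        if (!match) {
            // Assumption: an unmatched URL is treated as an error rather than ignored.
            throw new Error(`No router registered for ${ctx.request.url}`);
        }
        await match[1](ctx);
    };
}
```

With something like this, the crawler option would presumably become `requestHandler: dispatchByUrl([["url1", url1Router], ["url2", url2Router], ["url3", url3Router]])`, and the `Dataset.exportToJSON("data.json")` call could move to after `await crawler.run()` so it runs once instead of on every request. Crawlee's own router can also dispatch on request labels, which might remove the need for URL checks altogether.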