I created a new crawler using the
npx crawlee create project
command, which generates some folders and files, including a router.js file that contains an instance of
createPlaywrightRouter
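import { createPlaywrightRouter, Dataset } from 'crawlee';

export const router = createPlaywrightRouter();

// Default handler: runs for requests that have no matching label.
router.addDefaultHandler(async ({ enqueueLinks, log }) => {
    log.info(`enqueueing new URLs`);
    // Every matching link is queued with the 'detail' label.
    await enqueueLinks({
        globs: ['https://crawlee.dev/**'],
        label: 'detail',
    });
});

// Handler for requests labeled 'detail'.
router.addHandler('detail', async ({ request, page, log }) => {
    const title = await page.title();
    log.info(`${title}`, { url: request.loadedUrl });
    await Dataset.pushData({
        url: request.loadedUrl,
        title,
    });
});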
As I understand it, the default handler is kind of the "main" listener, and the "detail" route is later invoked through the label passed to the enqueueLinks function. This could be useful for splitting the process into more "routes"/steps, so that it is cleaner and more decoupled later.
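To make that understanding concrete, here is a toy model of label-based dispatch. It is only a sketch of how I picture the mechanism, not Crawlee's actual implementation: createToyRouter, runToyCrawl and the enqueue helper are hypothetical names invented for this illustration, and the queue is a plain in-memory array.

```javascript
// Toy stand-in for a label-based router; NOT Crawlee's real implementation.
function createToyRouter() {
    const handlers = new Map();
    let defaultHandler = null;

    // The returned function plays the role of the crawler's request handler:
    // it picks a handler by request.label and falls back to the default one.
    const route = async (ctx) => {
        const handler = handlers.get(ctx.request.label) ?? defaultHandler;
        if (!handler) {
            throw new Error(`No handler for label "${ctx.request.label}"`);
        }
        return handler(ctx);
    };
    route.addHandler = (label, handler) => handlers.set(label, handler);
    route.addDefaultHandler = (handler) => { defaultHandler = handler; };
    return route;
}

// Hypothetical in-memory queue that is processed until it is empty.
async function runToyCrawl(router, startRequests) {
    const queue = [...startRequests];
    const visited = [];
    while (queue.length > 0) {
        const request = queue.shift();
        // `enqueue` is an assumed helper standing in for enqueueLinks.
        const ctx = { request, enqueue: (next) => queue.push(next) };
        visited.push(await router(ctx));
    }
    return visited;
}

const toyRouter = createToyRouter();
toyRouter.addDefaultHandler(async ({ request, enqueue }) => {
    // Queue a follow-up request that carries the 'detail' label.
    enqueue({ url: `${request.url}/docs`, label: 'detail' });
    return `default:${request.url}`;
});
toyRouter.addHandler('detail', async ({ request }) => `detail:${request.url}`);
```

In this model a handler is never called directly. It only runs because a queued request carries its label, which is how I read the enqueueLinks call with label: 'detail' in the generated router.js. I believe a request can also be queued with a label through crawler.addRequests([{ url, label }]), but I have not verified that here.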
My question is: how can I call or invoke a route without the enqueueLinks function?
I was expecting something like:
router.addDefaultHandler(async (ctx) => {
    await ctx.invoke('extract-meta-data');
    await ctx.invoke('extract-detail');
    await ctx.invoke('download-files');
});
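For comparison, the same sequential flow can be sketched with plain async functions and no router feature at all. This is only an assumption about how the steps might be split: extractMetaData, extractDetail, downloadFiles and runSteps are hypothetical names, the step bodies are placeholders, and the state object shared between steps is invented for the example.

```javascript
// Hypothetical step functions; the names mirror the pseudocode above.
// Each step receives the handler context and a shared state object.
async function extractMetaData(ctx, state) {
    state.meta = { url: ctx.request.url };
}

async function extractDetail(ctx, state) {
    state.detail = `detail for ${ctx.request.url}`;
}

async function downloadFiles(ctx, state) {
    // Placeholder: a real step would download something here.
    state.files = [];
}

// Runs the steps one after another against the same context and
// returns the state they accumulated.
async function runSteps(ctx, steps) {
    const state = {};
    for (const step of steps) {
        await step(ctx, state);
    }
    return state;
}

// Inside a handler this could be used roughly like:
// router.addDefaultHandler(async (ctx) => {
//     await runSteps(ctx, [extractMetaData, extractDetail, downloadFiles]);
// });
```

The difference from router labels is that these steps all run within a single request, whereas each labeled route handles its own request taken from the queue.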
Where can I see which functions this ctx exposes? Or maybe I have misunderstood the router entirely.
Thanks!