duplicate_count variable (I know we already have this functionality, but this is just an example), and I'll update it if the data is already in my DB and stop the crawler if the count exceeds some threshold. Is there a way I could implement this?
crawler.run([{ url: 'someUrl', userData: { thing: 'value' } }]);
request.userData
request.userData
useState
import { createPlaywrightRouter, useState } from 'crawlee';

export const router = createPlaywrightRouter();

const state = await useState('test', { val: 12 });

router.addDefaultHandler(async ({ enqueueLinks, log }) => {
    log.info(`enqueueing new URLs`);
    await enqueueLinks({
        globs: ['https://crawlee.dev/**'],
        label: 'detail',
    });
});

router.addHandler('detail', async ({ request, page, log, pushData }) => {
    const title = await page.title();
    log.info(`${title}`, { url: request.loadedUrl });
    await pushData({
        url: request.loadedUrl,
        title,
    });
});
crawler.useState, can you clarify a bit on this?
name in the useState func vs passing it in the config parameter? In the docs, both options use it to define a custom key-value store.
const state = await useState() and then use it inside the crawler like a simple object? e.g. state.property = val
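For the duplicate counter asked about earlier in the thread, here is a minimal sketch of the "mutate it like a simple object" idea. It deliberately avoids importing Crawlee so it runs on its own: `state` is a plain object standing in for whatever useState returns, and `recordDuplicate`, `simulate`, and the threshold value are hypothetical names made up for illustration, not Crawlee APIs.

```javascript
// Hypothetical helper: bump the counter on the shared state object and
// report whether the threshold has been exceeded.
function recordDuplicate(state, threshold) {
    state.duplicateCount = (state.duplicateCount ?? 0) + 1;
    return state.duplicateCount > threshold;
}

// Stand-in for a crawl: `db` plays the role of the database lookup,
// and returning early plays the role of stopping the crawler.
function simulate(items, db, threshold) {
    const state = { duplicateCount: 0 }; // in Crawlee this would come from useState
    const processed = [];
    for (const item of items) {
        processed.push(item);
        if (db.has(item) && recordDuplicate(state, threshold)) {
            return { processed, state, stopped: true };
        }
    }
    return { processed, state, stopped: false };
}

// With a threshold of 3, the fourth duplicate (the fifth item) triggers the stop.
console.log(simulate(['a', 'c', 'b', 'a', 'a', 'd'], new Set(['a', 'b']), 3));
```

Inside a real handler, the assumption is that `state` would come from something like `await crawler.useState({ duplicateCount: 0 })`, and that the early return would be replaced by a call that stops the crawler (for example aborting its autoscaled pool). Both are worth checking against the docs for your Crawlee version.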