Hello everyone
I am using Apify and Crawlee's CheerioCrawler with the NestJS scheduler in my project, and I have run into an issue: the NestJS server process quits when Actor.exit() is called. Below is my code:
// Six-field cron expression: fires at second 0 of every 5th minute.
@Cron('0 */5 * * * *')
async handleEvery20Minutes() {
  const config = new Configuration({ purgeOnStart: true, persistStorage: false });
  const cheerioCrawler = new CheerioCrawler({
    minConcurrency: 10,
    maxConcurrency: 50,
    // On error, retry each page at most once.
    maxRequestRetries: 1,
    // Increase the timeout for processing of each page.
    requestHandlerTimeoutSecs: 30,
    // Limit each crawl to 10 requests.
    maxRequestsPerCrawl: 10,
    requestHandler: defaultRouter
  }, config);
  await Actor.init();
  const crawlingCodes = await this.codesService.findAllCodesUrl();
  for (let i = 0; i < crawlingCodes.length; i++) {
    await cheerioCrawler.addRequests([
      {
        url: crawlingCodes[i].url,
        userData: {
          code: crawlingCodes[i].name,
        },
        // A random key per request, so the same URL is not deduplicated.
        uniqueKey: uuidv4()
      },
    ]);
  }
  await cheerioCrawler.run();
  await cheerioCrawler.teardown();
  await Actor.exit(); // when the NestJS scheduler reaches this line, the process quits
}
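For reference, the request-building loop in the snippet above can be looked at in isolation. This is only a minimal sketch under a few assumptions: the `CrawlingCode` and `CrawlRequest` types and the `buildRequests` helper are hypothetical names introduced here, the record shape is inferred from the `url` and `name` fields used above, and Node's built-in `randomUUID` stands in for `uuidv4`.

```typescript
import { randomUUID } from 'node:crypto';

// Hypothetical shape of a record returned by findAllCodesUrl();
// only `url` and `name` are known from the snippet above.
interface CrawlingCode {
    url: string;
    name: string;
}

// Shape of the request objects passed to addRequests() above.
interface CrawlRequest {
    url: string;
    userData: { code: string };
    uniqueKey: string;
}

// Build one request per code. Each request gets a fresh uniqueKey,
// so two requests for the same URL still carry different keys.
function buildRequests(codes: CrawlingCode[]): CrawlRequest[] {
    return codes.map((code) => ({
        url: code.url,
        userData: { code: code.name },
        uniqueKey: randomUUID(),
    }));
}
```

The resulting array could then be passed to `cheerioCrawler.addRequests(...)` in a single call, since that method already takes an array in the code above.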
I would like to call Actor.exit() to reset the index of the dataset JSON files. I can remove Actor.exit(), but then I get this error:
[Nest] 43924 - 06/08/2024, 8:30:02 PM ERROR [Scheduler] Error: ENOENT: no such file or directory, open '/storage/datasets/default/000000001.json'
Has anyone had a similar issue when running Apify and Crawlee on the NestJS framework? Could you please help?
Thank you