Apify and Crawlee Official Forum

Updated 3 months ago

Apify in NestJS scheduler

Hello everyone,
I am using Apify with the Crawlee CheerioCrawler and the NestJS scheduler in my project. The problem is that the NestJS server process quits when Actor.exit() is called. My code is below:
@Cron('0 */5 * * * *')
async handleEvery20Minutes() {
    const config = new Configuration({ purgeOnStart: true, persistStorage: false });
    let cheerioCrawler = new CheerioCrawler({
      minConcurrency: 10,
      maxConcurrency: 50,
    
      // On error, retry each page at most once.
      maxRequestRetries: 1,
    
      // Increase the timeout for processing of each page.
      requestHandlerTimeoutSecs: 30,
    
      // Limit to 10 requests per one crawl
      maxRequestsPerCrawl: 10,
      requestHandler: defaultRouter
    }, config);
    
    await Actor.init();
    const crawlingCodes = await this.codesService.findAllCodesUrl();
    for (let i = 0; i < crawlingCodes.length; i++) {
      await cheerioCrawler.addRequests([
        {
          url: crawlingCodes[i].url,
          userData: {
            code: crawlingCodes[i].name,
          },
          uniqueKey: uuidv4()
        },
      ]);
    }
    await cheerioCrawler.run();
    
    await cheerioCrawler.teardown();
    
    await Actor.exit(); // the NestJS process quits when execution reaches this line
}

I call Actor.exit() to reset the index of the dataset JSON files. If I remove Actor.exit(), I get this error instead:
[Nest] 43924 - 06/08/2024, 8:30:02 PM ERROR [Scheduler] Error: ENOENT: no such file or directory, open '/storage/datasets/default/000000001.json'
Has anyone had a similar issue when running Apify and Crawlee on the NestJS framework? Can you please help?
Thank you.
2 comments
Hello,
Calling the exit method like this should do the trick:
await Actor.exit({ exit: false });
Thank you, it works.
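
For anyone landing here later: by default Actor.exit() ends by calling process.exit(), which is why the whole NestJS server goes down with it. Passing { exit: false } asks it to skip that last step. Below is a minimal sketch of how the scheduled job could be wrapped. The runCrawlJob helper and the ActorLike interface are hypothetical names made up for this example; they are not part of Apify, Crawlee, or NestJS, and the sketch assumes the real Actor object from the apify package fits the ActorLike shape.

```typescript
// Minimal shape of the two Actor methods this sketch relies on.
// Assumption: the real `Actor` from the `apify` package is compatible with it.
interface ActorLike {
  init(): Promise<unknown>;
  exit(options?: { exit?: boolean }): Promise<unknown>;
}

// Hypothetical helper: runs one crawl between Actor.init() and Actor.exit()
// while keeping the host process (for example a NestJS server) alive.
async function runCrawlJob(
  actor: ActorLike,
  crawl: () => Promise<void>,
): Promise<void> {
  await actor.init();
  try {
    // The crawl callback is where addRequests(), run() and teardown() would go.
    await crawl();
  } finally {
    // `exit: false` lets Actor.exit() do its cleanup without calling
    // process.exit(). The finally block runs it even if the crawl throws.
    await actor.exit({ exit: false });
  }
}
```

Inside the @Cron handler from the question, this would be called roughly as runCrawlJob(Actor, async () => { ... }), with the addRequests, run, and teardown calls moved into the callback. The try/finally is a design choice of this sketch, not something from the original answer.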