For the past few days I have been running the crawler with a high number of jobs, and I have run into a problem.
I have found that not all jobs are processed by the CheerioCrawler, despite these jobs being added to the queue through addRequest([job]).
I can't reliably reproduce it; it happens after roughly 5000 - 6000 jobs.
My code doesn't crash, it simply continues to the next jobs in the BullMQ job queue without scraping the link.
Otherwise the flow looks normal, since execution does reach the requestHandler (I can see the CheerioCrawler info log).