The community member is testing Apify's Web Crawling API, and a crawl of 11 pages using Cheerio currently takes 57 seconds. They are wondering whether the speed will increase when they subscribe and get higher memory, and whether there are any other suggestions to increase the speed. The community members in the comments suggest that the issue is not with the plan, but with the structure of the scrape. They recommend running as many URLs concurrently as possible to maximize CPU utilization, and mention that Cheerio can be very fast.
We are just testing Apify's Web Crawling API right now in a free account and will subscribe shortly. Right now it is taking 57 seconds for 11 pages using Cheerio. My question is, will this be faster when we subscribe and get higher memory? Also are there any other suggestions to increase speed? What is the fastest method you have seen and how long should it take? This is a tad slow for us in production without some user experience changes.
You need a big batch of URLs. In performance terms, scaling is only possible through concurrent requests, so approx. 50 per minute under 4 GB of RAM should be possible for most sites.
As Alexey mentioned, the problem is not your plan but the structure of your scrape. You need to run as many URLs at once as possible to maximally utilize the CPU. Cheerio can be very fast.
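To illustrate the point about running URLs at once, here is a minimal sketch in plain Python `asyncio`. It is not Apify's API: `fake_fetch`, `crawl`, and `timed_crawl` are hypothetical names, the URLs are placeholders, and the 0.05 second sleep is an assumed stand-in for the network wait of a real request. It only shows why the same 11 pages finish much sooner when many requests are in flight together.

```python
import asyncio
import time


async def fake_fetch(url: str, delay: float = 0.05) -> str:
    """Hypothetical stand-in for one HTTP request plus HTML parsing."""
    await asyncio.sleep(delay)  # assumed network wait, not a real request
    return f"parsed:{url}"


async def crawl(urls: list[str], max_concurrency: int) -> list[str]:
    """Process all URLs with at most `max_concurrency` requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(url: str) -> str:
        # The semaphore blocks here once the concurrency limit is reached.
        async with semaphore:
            return await fake_fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(worker(u) for u in urls))


def timed_crawl(urls: list[str], max_concurrency: int) -> tuple[list[str], float]:
    """Run the crawl and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(crawl(urls, max_concurrency))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(11)]
    _, one_at_a_time = timed_crawl(urls, max_concurrency=1)
    _, all_at_once = timed_crawl(urls, max_concurrency=11)
    print(f"one at a time: {one_at_a_time:.2f}s, all at once: {all_at_once:.2f}s")
```

With a concurrency of 1 the waits add up page by page, while with a concurrency of 11 they overlap, so the total is close to the time of a single page. If you are using Crawlee's `CheerioCrawler`, the comparable settings are the `minConcurrency` and `maxConcurrency` options; how high you can push them depends on available memory and on the target site.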