I noticed Apify offers a Website Content Crawler and 3 types of scrapers (Web Scraper, Cheerio, Playwright). What's the difference between the Website Content Crawler and these older scrapers? Don't they both crawl and scrape?
What specifically are you asking about? You can refer to the README section for both actors.
Website Content Crawler: This actor is designed to extract data for feeding, fine-tuning, or training large language models (LLMs) like GPT-4, ChatGPT, or LLaMA.
Web Scraper: The Web Scraper is a versatile and easy-to-use actor for crawling web pages and extracting structured data using just a few lines of JavaScript. It loads web pages in a Chromium browser to render dynamic content. You can configure and run it manually via the user interface or programmatically using the API. The extracted data is stored in a dataset, which can be exported in formats such as JSON, XML, or CSV.
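To make the "few lines of JavaScript" and "programmatically using the API" parts concrete, here is a minimal sketch in Python. It assumes the `apify-client` package is installed and that you have an API token. The input field names (`startUrls`, `pageFunction`, `maxPagesPerCrawl`) are my reading of the Web Scraper input schema, so check them against the actor's README before relying on them. The helper names are made up for illustration.

```python
# Page function that Web Scraper runs inside the browser for every page.
# It is JavaScript, passed to the actor as a plain string.
PAGE_FUNCTION = """
async function pageFunction(context) {
    return {
        url: context.request.url,
        title: document.title,
    };
}
"""


def build_web_scraper_input(start_url: str, max_pages: int = 10) -> dict:
    """Build the actor input (hypothetical helper; field names are assumptions)."""
    return {
        "startUrls": [{"url": start_url}],
        "pageFunction": PAGE_FUNCTION,
        "maxPagesPerCrawl": max_pages,
    }


def run_web_scraper(token: str, start_url: str) -> list:
    """Start the actor, wait for it to finish, and return the dataset items."""
    # Imported here so the input builder above works without the package.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor("apify/web-scraper").call(
        run_input=build_web_scraper_input(start_url)
    )
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    # Prints only the input; call run_web_scraper(...) with a real token to crawl.
    print(build_web_scraper_input("https://example.com"))
```

Website Content Crawler can be started the same way with a different actor ID and its own input. The main practical difference is that with Web Scraper you write the extraction logic yourself in the page function.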
Choose the one that best fits your purpose. If you don't plan to use the extracted data with LLMs or other AI tools, Web Scraper is the way to go.