Apify and Crawlee Official Forum

Updated 4 months ago

Website Content Crawler vs Web Scraper

I noticed Apify makes a Website Content Crawler and 3 types of scrapers (Web Scraper, Cheerio, Playwright). What's the difference between the Website Content Crawler and these older scrapers? They both seem to crawl and scrape.
3 comments
This seems to compare the 3 scrapers, but it doesn't compare Apify's "Web Scraper" to the newer "Website Content Crawler".
What specifically are you asking about? You can refer to the README section for both actors.

Website Content Crawler:
This actor is designed to extract data for feeding, fine-tuning, or training large language models (LLMs) like GPT-4, ChatGPT, or LLaMA.

Web Scraper:
The Web Scraper is a versatile and easy-to-use actor for crawling web pages and extracting structured data using just a few lines of JavaScript. It loads web pages in a Chromium browser to render dynamic content. You can configure and run it manually via the user interface or programmatically using the API. The extracted data is stored in a dataset, which can be exported in formats such as JSON, XML, or CSV.
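To make the "few lines of JavaScript" and "programmatically using the API" parts concrete, here is a rough sketch in Python. It is not taken from either README. The actor IDs, the `startUrls` and `pageFunction` input field names, and the `apify-client` calls in `run_actor` reflect my understanding of the current actors and the 1.x Python client, so check them against each actor's input schema before relying on them. The helper names are made up for illustration.

```python
# Assumed actor IDs on Apify Store; verify them on each actor's page.
WEB_SCRAPER = "apify/web-scraper"
WEBSITE_CONTENT_CRAWLER = "apify/website-content-crawler"

# Web Scraper takes JavaScript source as a string and runs it in the
# browser on each loaded page. You decide which fields to return.
PAGE_FUNCTION = """
async function pageFunction(context) {
    return {
        url: context.request.url,
        title: document.title,
    };
}
"""


def web_scraper_input(start_url: str) -> dict:
    """Input for Web Scraper: start URLs plus your own extraction code."""
    return {
        "startUrls": [{"url": start_url}],
        "pageFunction": PAGE_FUNCTION,
    }


def content_crawler_input(start_url: str) -> dict:
    """Input for Website Content Crawler: no extraction code, the actor
    decides how to turn each page into text for LLM use."""
    return {"startUrls": [{"url": start_url}]}


def run_actor(token: str, actor_id: str, run_input: dict) -> list:
    """Start an actor run, wait for it, and return the dataset items.

    Assumes the third-party package is installed (pip install apify-client)
    and that call() returns a dict containing "defaultDatasetId".
    """
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

With a valid API token, `run_actor(token, WEB_SCRAPER, web_scraper_input("https://example.com"))` would return the records your page function produced, while the same call with `WEBSITE_CONTENT_CRAWLER` and `content_crawler_input(...)` would return the page text the crawler extracted on its own.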

Depending on your needs, you can choose the scraper that best fits your purpose.
If you don't plan to use the extracted data with AI, "Web Scraper" is the way to go.