Apify and Crawlee Official Forum

Updated 4 months ago

Save HTML file using Crawlee

Has anybody tried downloading the HTML file of a URL using Crawlee? I was wondering whether Crawlee is capable of downloading a page's HTML, since I've been using Crawlee lately and am really loving the experience.

4 comments

Exp

Yes, you can download the HTML content of a web page using Crawlee.

Marco

It depends on which crawler you are using:

Cheerio: https://cheerio.js.org/docs/api/classes/Cheerio#html
Playwright: https://playwright.dev/docs/api/class-page#page-content
Puppeteer: https://pptr.dev/api/puppeteer.page.content
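As a rough sketch of how the calls linked above might look inside a request handler: the two helper names below (`htmlFromCheerio`, `htmlFromPlaywright`) are hypothetical and only there for illustration, and the context types are assumed to come from the `crawlee` package.

```typescript
import type { CheerioCrawlingContext, PlaywrightCrawlingContext } from 'crawlee';

// Hypothetical helper for CheerioCrawler: `$` is the Cheerio API loaded with
// the parsed response. Calling `$.html()` with no arguments serializes the
// whole parsed document back to a string.
function htmlFromCheerio({ $ }: CheerioCrawlingContext): string {
    return $.html();
}

// Hypothetical helper for PlaywrightCrawler: `page` is a Playwright Page and
// `page.content()` resolves to the HTML of the page as currently rendered.
// PuppeteerCrawler exposes the same `content()` call on its own `page` object.
async function htmlFromPlaywright({ page }: PlaywrightCrawlingContext): Promise<string> {
    return page.content();
}
```

Inside a crawler's `requestHandler` you would pass the handler's context straight in, e.g. `const html = htmlFromCheerio(context);` or `const html = await htmlFromPlaywright(context);`.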

Nyanmaru

Thanks for this awesome answer! I was wondering whether Crawlee has any examples of how to save it to a file?

Marco

You can use the KeyValueStore: https://crawlee.dev/api/core/class/KeyValueStore. E.g., with Cheerio:

await store.setValue('my-html', $.html('html'), { contentType: 'text/html' });
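To put that one-liner in context, here is a minimal sketch of a full CheerioCrawler run. It assumes `store` is the default store returned by `KeyValueStore.open()`, and it uses `$.html()` with no selector to get the whole document. `keyFromUrl` and `saveHtml` are hypothetical helpers, not part of Crawlee. As far as I know, record keys are limited to 256 characters from `a-zA-Z0-9!-_.'()`, so the URL is sanitized before being used as a key.

```typescript
import { CheerioCrawler, KeyValueStore } from 'crawlee';

// Hypothetical helper: turn a URL into a string that should be accepted as a
// key-value store key (assumed limits: 256 chars, only a-zA-Z0-9!-_.'()).
function keyFromUrl(url: string): string {
    return url.replace(/[^a-zA-Z0-9!\-_.'()]+/g, '-').slice(0, 256);
}

// Hypothetical wrapper: crawl the given URLs and store each page's HTML.
async function saveHtml(startUrls: string[]): Promise<void> {
    // Open the default key-value store.
    const store = await KeyValueStore.open();

    const crawler = new CheerioCrawler({
        async requestHandler({ request, $ }) {
            // One record per page, keyed by the sanitized URL.
            await store.setValue(keyFromUrl(request.url), $.html(), {
                contentType: 'text/html',
            });
        },
    });

    await crawler.run(startUrls);
}
```

Calling `await saveHtml(['https://crawlee.dev']);` should then leave one record per crawled page. With the default local storage settings, the records are expected to end up as files under `./storage/key_value_stores/default/`.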
