Apify and Crawlee Official Forum

Updated 3 months ago

Save a webpage to a PDF file using Actor.setValue()

Hi, I'm new to PuppeteerCrawler. I'm trying to create a simple script to save a webpage as a PDF. For this purpose, I created a new Actor from the Crawlee - Puppeteer - TypeScript template in Apify. This is my main.ts code:
import { Actor } from 'apify';
import { PuppeteerCrawler, Request } from 'crawlee';

await Actor.init();

interface Input {
    urls: Request[];
}

const { urls = ['https://www.google.com/'] } = await Actor.getInput<Input>() ?? {};

const crawler = new PuppeteerCrawler({
    async requestHandler({ page }) {
        const pdfFileName = 'testFile';
        const pdfBuffer = await page.pdf({ format: 'A4', printBackground: true });

        console.log('pdfFileName: ', pdfFileName);
        console.log('pdfBuffer: ', pdfBuffer);
        
        await Actor.setValue(pdfFileName, pdfBuffer, { contentType: 'application/pdf' });
    },
});

await crawler.addRequests(urls);
await crawler.run();

await Actor.exit();


It seems that Actor.setValue() doesn't accept the PDF buffer I pass to it. What am I doing wrong?
Thanks
2 comments
Hey, the page.pdf() method returns a Uint8Array, so you need to convert it to a Buffer before passing it to Actor.setValue(). Try this:
const pdfFileName = 'testFile';
const pdf = await page.pdf({ format: 'A4', printBackground: true });
const pdfBuffer = Buffer.from(pdf);

console.log('pdfFileName: ', pdfFileName);
console.log('pdfBuffer: ', pdfBuffer);
        
await Actor.setValue(pdfFileName, pdfBuffer, { contentType: 'application/pdf' });
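If you want to see the conversion on its own, here is a minimal sketch that runs in plain Node.js without Puppeteer or the Apify SDK. The toBuffer helper is a hypothetical name, not part of either library, and the five-byte array is only a stand-in for what page.pdf() returns. Unlike Buffer.from(pdf), which copies the bytes, this variant wraps the same underlying memory. Either approach should give Actor.setValue() a Buffer.

```typescript
// Hypothetical helper (not part of Puppeteer or the Apify SDK):
// turns a Uint8Array into a Node.js Buffer without copying the bytes.
function toBuffer(data: Uint8Array): Buffer {
    // Buffer is a subclass of Uint8Array, so an existing Buffer is returned as-is.
    if (Buffer.isBuffer(data)) return data;
    // Create a Buffer view over the same underlying ArrayBuffer.
    return Buffer.from(data.buffer, data.byteOffset, data.byteLength);
}

// Stand-in for the result of page.pdf(): the bytes of "%PDF-".
const pdfBytes = new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d]);
const pdfBuffer = toBuffer(pdfBytes);

console.log(Buffer.isBuffer(pdfBytes));    // false
console.log(Buffer.isBuffer(pdfBuffer));   // true
console.log(pdfBuffer.toString('latin1')); // %PDF-
```

Because the view shares memory with the original array, use Buffer.from(pdf) instead if you plan to modify one of them afterwards.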
Great, that works. Thanks