Apify and Crawlee Official Forum

Updated 4 months ago

Download CSV using playwright in an Apify Actor

So I have written some JavaScript that navigates to and downloads a CSV file using Playwright with Chromium. I am using .saveAs() and defining the filepath on my local machine, but I'm not sure how to convert this to work on Apify.
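For reference, a minimal version of that local flow might look like the following sketch. The URL, link selector, and output path are hypothetical placeholders, not values from the original post:

```javascript
// Sketch of the local download flow: trigger a download and save it
// with Playwright. downloadUrl, linkSelector, and outPath are placeholders.
async function downloadCsvLocally(downloadUrl, linkSelector, outPath) {
  const { chromium } = require('playwright');
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto(downloadUrl);

  // Start waiting for the download *before* clicking the link that triggers it.
  const [download] = await Promise.all([
    page.waitForEvent('download'),
    page.click(linkSelector),
  ]);

  // On a local machine this path is easy to choose; on Apify the question
  // is where the file should go, which is what this thread is about.
  await download.saveAs(outPath);
  await browser.close();
}
```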

I have tried various things. Everything works except the download.

It is not explicitly clear to me that Apify can even save a .csv file. I see mention that it is possible to save files to the key-value store, but I remain unsure.

Can anyone confirm whether or not I can expect to download and save a .csv file somewhere from an Apify Actor, and if so, give a hint as to the easiest way to do so?

Additionally, the ideal scenario is that the CSV file ends up in a Google Drive folder.
5 comments
On Apify you need to do a second step: read the file from the filepath you saved it to, and then put it into the KVS. It's best to save it with just the name of the file as the filepath.
If you want to put it into a Google Drive folder, that is more difficult because you need to take care of authorization.
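The "second step" this comment describes could be sketched like this with the Apify SDK's Actor.setValue(); the bare file name doubles as the KVS record key here, which is an assumption of this sketch:

```javascript
// Sketch: read the downloaded file back from disk and store it in the
// Actor's default key-value store. fileName is a bare name like 'data.csv'.
async function saveCsvToKvs(fileName) {
  const fs = require('fs');
  const { Actor } = require('apify');

  // Saved with just the file name, so it sits in the working directory.
  const buffer = fs.readFileSync(fileName);
  await Actor.setValue(fileName, buffer, { contentType: 'text/csv' });
}
```

The record then shows up under that key in the run's key-value store, where it can be downloaded or fetched over the Apify API.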
So I should be able to download and save the CSV somewhere that I can then access from within the Actor and save to the KVS?
If I save the CSV with just the file name as the filepath, how do I then access that file again from within the Actor?

Authorizing with the Google Drive API shouldn't be too tough; that is at least well documented.
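For that Drive step, a sketch using the googleapis npm package might look like the following. The folder ID is a placeholder, and credentials (a service account or OAuth client) are assumed to be configured separately, e.g. via GOOGLE_APPLICATION_CREDENTIALS:

```javascript
// Sketch: upload a local CSV into a Google Drive folder.
// folderId is a placeholder; auth setup is assumed to be done elsewhere.
async function uploadCsvToDrive(localPath, folderId) {
  const fs = require('fs');
  const { google } = require('googleapis');

  const auth = new google.auth.GoogleAuth({
    scopes: ['https://www.googleapis.com/auth/drive.file'],
  });
  const drive = google.drive({ version: 'v3', auth });

  const res = await drive.files.create({
    requestBody: { name: 'data.csv', parents: [folderId] },
    media: { mimeType: 'text/csv', body: fs.createReadStream(localPath) },
  });
  return res.data.id; // ID of the newly created Drive file
}
```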
Yes you should be able to download it and save to KVS.
If you save it under a filepath, you access it at that same filepath. That was the point of not saving it somewhere deep in the directory structure. Actors run in a virtual environment in Docker, so you can just save something there and load it back. There is, of course, a size limit.
Ah, that makes sense. I will give this a try, thank you.
After someone else suggested capturing the data directly, I instead went the route of intercepting the download request, then making that request inside Playwright and capturing the response. Now I plan to parse and clean up the data before sending it to its final destination via an API.
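That interception approach could be sketched as follows: wait for the request the export button fires, then re-issue it through the page's request context (which shares cookies with the browser session) and keep the body in memory. The button selector and the URL-matching predicate are hypothetical placeholders:

```javascript
// Sketch: capture the CSV over HTTP instead of via a browser download.
// isCsvRequest is a placeholder predicate for spotting the export request.
async function captureCsv(page, exportButtonSelector, isCsvRequest) {
  // Grab the request the export button triggers.
  const [request] = await Promise.all([
    page.waitForRequest(isCsvRequest),
    page.click(exportButtonSelector),
  ]);

  // Re-issue it through the page's APIRequestContext and return the body
  // as a string, ready for parsing and cleanup before the final API call.
  const response = await page.request.get(request.url());
  return await response.text();
}
```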