Apify and Crawlee Official Forum

Updated 3 months ago

parsing input urls from google sheet

Hi, I have tried this feature https://docs.apify.com/platform/tutorials/crawl-urls-from-a-google-sheet
It looks like there is a bug: it does not parse out the whole URL when there is a comma inside it.
I have tried it on this sheet https://docs.google.com/spreadsheets/d/14eS_kezUiZ13U1zEaDrb4s7xnmerJuHwG7wiRIPwBIM/edit#gid=0 I even tried putting the URL inside double quotes, but it did not help.
Here is the result; you can see that the requested URLs are not the same as the ones in the sheet.
https://api.apify.com/v2/datasets/vlTmoYRiFWawRdJsZ/items?clean=true&format=json
14 comments
Hello HonzaS,
I cannot really tell if this is a bug or a feature, but you may be able to encode these special characters in the URL ( https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent ).

For example, a comma (,) is encoded as %2C, so by simply replacing this character you should be able to achieve your goal.
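A minimal sketch of the replacement suggested above, written in Python rather than JavaScript. The helper name `encode_commas` and the example URL are hypothetical, and it assumes that every literal comma in the URL should become %2C.

```python
def encode_commas(url: str) -> str:
    """Replace each literal comma with its percent-encoded form, %2C."""
    return url.replace(",", "%2C")


# Hypothetical URL, used only for illustration.
raw = "https://example.com/search?tags=a,b,c"
print(encode_commas(raw))
```

Only the comma is touched here, so the rest of the URL (scheme, slashes, query separators) stays as it was in the sheet.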
Hi Pepa, thanks for the suggestion.
This means that the URLs would need to be already encoded in the spreadsheet, right? That does not help me much, as the customer just wants to fill in the spreadsheet and I do not think he can encode the URLs. I guess the only possibility is to use the Google Sheets actor to load the URLs.
He is populating the spreadsheet from Airtable.
This really sounds like something that could be improved. I will definitely raise an issue with the platform team about this.

As for a current workaround, you may use the Google Sheets actor, or pass the Google Sheet URL to your own implementation and use the Google Sheets API ( https://developers.google.com/sheets/api/quickstart/nodejs#set_up_the_sample ), but that is a little bit complicated since it requires creating OAuth2 credentials in the Google API Console.
Yes, that is what I was hoping to avoid: the hassle with Google OAuth 😄
You don't need OAuth if your sheet is read-only and fully public.
But generally if that is your own actor, you can just parse the sheet (converted to CSV) manually
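A rough sketch of the "parse the CSV manually" idea, assuming the sheet is public and that its CSV export is reachable through the commonly used `/export?format=csv` URL pattern (an assumption, not something confirmed in this thread). The helper names and sample data are hypothetical. The point is that a real CSV parser keeps a quoted cell containing commas in one piece, which naive splitting on "," does not.

```python
import csv
import io
from typing import List


def csv_export_url(spreadsheet_id: str, gid: str = "0") -> str:
    """Build the assumed CSV export URL for a public sheet (hypothetical helper)."""
    return (
        "https://docs.google.com/spreadsheets/d/"
        + spreadsheet_id
        + "/export?format=csv&gid="
        + gid
    )


def urls_from_csv(csv_text: str) -> List[str]:
    """Return first-column cells that look like URLs, keeping commas intact."""
    urls = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        cell = row[0].strip()
        if cell.startswith(("http://", "https://")):
            urls.append(cell)
    return urls


# Sample CSV text: a header row, one quoted URL with commas, one plain URL.
sample = 'url\n"https://example.com/search?tags=a,b,c"\nhttps://example.com/plain\n'
print(urls_from_csv(sample))
```

In an actor you would download the text from `csv_export_url(...)` with any HTTP client and pass it to `urls_from_csv`.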
You don't need to log in with a user, but you need to register an app in the Google Console and generate credentials to access data via the Google Sheets API.
You don't if you use the public actor
Yeah, I know, I have already implemented that and it works perfectly. Your actor is a lifesaver. 😄
How does it work? Have you registered the app that the actor uses? Now I need to access a public spreadsheet without the Apify platform, and it looks like I need the API key.
Yes, you need the API key, or you could use Puppeteer and extract the data on your own. The actor uses it under the hood.
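A hedged sketch of the API-key route, assuming the Google Sheets API v4 `spreadsheets.values.get` REST endpoint and a sheet that is publicly readable. The helper names, the range `Sheet1!A:A`, and the sample payload are illustrative assumptions. No request is made here; the code only builds the URL and reads a response-shaped dictionary.

```python
from typing import List
from urllib.parse import quote, urlencode


def values_url(spreadsheet_id: str, cell_range: str, api_key: str) -> str:
    """Build a values.get URL for the Sheets API v4 (hypothetical helper)."""
    base = "https://sheets.googleapis.com/v4/spreadsheets"
    sheet = quote(spreadsheet_id, safe="")
    rng = quote(cell_range, safe="")
    query = urlencode({"key": api_key})
    return base + "/" + sheet + "/values/" + rng + "?" + query


def urls_from_values(payload: dict) -> List[str]:
    """Pick first-column cells that look like URLs from a values.get response."""
    urls = []
    for row in payload.get("values", []):
        if row and str(row[0]).startswith(("http://", "https://")):
            urls.append(str(row[0]))
    return urls


# Sample payload shaped like a values.get response (illustrative data only).
sample = {
    "range": "Sheet1!A1:A3",
    "majorDimension": "ROWS",
    "values": [["url"], ["https://example.com/search?tags=a,b,c"], []],
}
print(urls_from_values(sample))
```

Because the API returns each cell as a separate JSON string, a comma inside a URL never has to be escaped or split.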
thanks a lot for clearing that for me