I'm looking to build an automated solution that scrapes dynamic sources and URLs. I have a few automation tools in mind, such as Zapier, to bridge the gap between the different pieces.
Currently, I can detect dynamic changes to a web page; when a change occurs, such as a new article being added, the article's URL is written to a Google Sheet.
Now I'd like to take this further: once a new row is added to the spreadsheet (meaning a new article title and its link appear in neighboring columns), I want a bot to visit that link, print the page to PDF, and save it to a local or cloud-based directory.
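For the spreadsheet side, here's roughly what I'm picturing for pulling the rows (a minimal sketch, assuming the googleapis Node client with a service account; the spreadsheet ID, range, and credentials file are all placeholders):

```ts
// Minimal sketch of reading the sheet, assuming the googleapis Node client
// and a service-account JSON key. Sheet ID, range, and key file are placeholders.
import { google } from 'googleapis';

const auth = new google.auth.GoogleAuth({
    keyFile: 'service-account.json', // hypothetical credentials file
    scopes: ['https://www.googleapis.com/auth/spreadsheets.readonly'],
});
const sheets = google.sheets({ version: 'v4', auth });

const res = await sheets.spreadsheets.values.get({
    spreadsheetId: '<SPREADSHEET_ID>', // placeholder
    range: 'Sheet1!A:B', // column A = article title, column B = URL
});

// Each row is [title, url]; keep only rows that actually have a URL.
const urls = (res.data.values ?? [])
    .map(([, url]) => url)
    .filter(Boolean);

console.log(urls);
```

In practice I'd also need to track which rows have already been processed (e.g., remember the last row index between runs), but that's the general idea.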
If anyone has experience with the PDF portion and using Apify/Crawlee for this, I'd love some pointers. I think this platform is sick, and I'd be able to accomplish a lot with it. I can't wait to see how far I can take this.
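For the PDF step itself, this is the kind of thing I'm imagining (a rough sketch, assuming Crawlee's PlaywrightCrawler; the output directory and URL are placeholders, and the hardcoded URL stands in for links pulled from the sheet):

```ts
// Rough sketch of the PDF step using Crawlee's PlaywrightCrawler.
// Note: page.pdf() is Chromium-only, which is Crawlee's default browser anyway.
import { PlaywrightCrawler } from 'crawlee';
import { mkdirSync } from 'node:fs';
import { join } from 'node:path';

const OUTPUT_DIR = './pdfs'; // placeholder local directory
mkdirSync(OUTPUT_DIR, { recursive: true });

const crawler = new PlaywrightCrawler({
    async requestHandler({ page, request, log }) {
        // Let the page settle so lazy-loaded content makes it into the PDF.
        await page.waitForLoadState('networkidle');

        // Derive a filesystem-safe file name from the URL.
        const fileName = request.url.replace(/[^a-z0-9]+/gi, '-').slice(0, 80) + '.pdf';
        await page.pdf({ path: join(OUTPUT_DIR, fileName), format: 'A4' });
        log.info(`Saved ${request.url} as ${fileName}`);
    },
});

// In the real pipeline these URLs would come from the new spreadsheet rows.
await crawler.run(['https://example.com/some-article']);
```

If this ends up running as an Apify Actor rather than locally, I assume I could write the PDF buffer to the key-value store instead of the local directory, which would cover the cloud-storage option. Does this look like a reasonable approach?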