Apify and Crawlee Official Forum

acanimal
Offline, last seen 4 months ago
Joined August 30, 2024
(Apologies for the cross-post: https://github.com/apify/crawlee/discussions/1577)

Hi, I recently discovered Crawlee and I'm trying to figure out how I can store the scraped data in a database instead of in the local directory storage.

Is there a plugin for that? If not, how should I go about implementing one? Do I need to write my own class that implements the StorageClient interface? If so, how do I inject it afterwards so that it gets used?

Thanks!
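
One way to approach this without writing a storage client at all is to skip the default dataset and write to the database directly from the request handler, in the place where Dataset.pushData would otherwise be called. Below is a minimal sketch of that idea. ScrapedRecord, RecordSink, InMemorySink and storeRecord are all hypothetical names made up for illustration, not Crawlee APIs, and the in-memory class only stands in for a real database client. It sidesteps the StorageClient question rather than answering it.

```typescript
// Hypothetical record shape; adjust to whatever the crawler extracts.
interface ScrapedRecord {
    url: string;
    title: string;
}

// Hypothetical abstraction over "some database". Not part of Crawlee.
interface RecordSink {
    save(record: ScrapedRecord): Promise<void>;
}

// In-memory stand-in so the sketch runs without a real database.
// A real implementation would issue an INSERT/UPSERT through a DB driver.
class InMemorySink implements RecordSink {
    readonly rows: Record<string, ScrapedRecord> = {};

    async save(record: ScrapedRecord): Promise<void> {
        // Keyed by URL so a retried request overwrites instead of duplicating.
        this.rows[record.url] = record;
    }
}

// The body of a request handler could delegate to something like this
// instead of pushing the record to the default dataset.
async function storeRecord(sink: RecordSink, record: ScrapedRecord): Promise<void> {
    await sink.save(record);
}
```

Keying the write by URL (an upsert) is a deliberate choice in this sketch: requests can be retried, so an append-only insert might store the same page twice.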
17 comments
Hi, I just discovered Crawlee and it seems like a great project.

I'm scraping a single URL (https://jobs.workable.com/search) that contains a list of items with a "load more" button. Each time an item is clicked, a floating modal shows the item's information.

In this scenario, Crawlee's ability to remember visited URLs, handle retries, etc. doesn't help much.

My idea is:
  • From the start page, click on each of the initial items and scrape its content
  • Click on the load more button and repeat the process.
I'd like advice on best practices for:
  • how to "remember/store" the index/id of the last scraped item
  • how to handle errors
Thanks in advance
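
To make the two questions concrete, here is a rough sketch of a resumable loop: it persists the index of the next item after every step, retries a failing item a bounded number of times, and records it as failed instead of aborting the run. Everything named here (CheckpointStore, InMemoryCheckpoint, RunResult, scrapeFrom, and the scrapeItem callback that would open the modal and extract the data) is hypothetical and not a Crawlee API. With Crawlee, the checkpoint could presumably be kept in a key-value store rather than in memory.

```typescript
// Hypothetical persistence interface for the "next item to scrape" position.
interface CheckpointStore {
    load(): Promise<number>;
    save(nextIndex: number): Promise<void>;
}

// In-memory stand-in; a real one would write somewhere that survives restarts.
class InMemoryCheckpoint implements CheckpointStore {
    private next = 0;
    async load(): Promise<number> { return this.next; }
    async save(nextIndex: number): Promise<void> { this.next = nextIndex; }
}

interface RunResult {
    scraped: number[]; // indexes scraped successfully in this run
    failed: number[];  // indexes skipped after exhausting all attempts
}

// Processes items from the stored checkpoint up to totalItems.
// scrapeItem is a hypothetical callback: click item `index`, read the modal.
async function scrapeFrom(
    checkpoint: CheckpointStore,
    totalItems: number,
    scrapeItem: (index: number) => Promise<void>,
    maxAttempts = 3,
): Promise<RunResult> {
    const result: RunResult = { scraped: [], failed: [] };
    for (let i = await checkpoint.load(); i < totalItems; i++) {
        let done = false;
        for (let attempt = 1; attempt <= maxAttempts && !done; attempt++) {
            try {
                await scrapeItem(i);
                done = true;
            } catch {
                // Swallow the error and retry; once attempts run out,
                // the item is recorded as failed below.
            }
        }
        (done ? result.scraped : result.failed).push(i);
        // Persist progress after every item so a restart resumes here.
        await checkpoint.save(i + 1);
    }
    return result;
}
```

After clicking "load more", the same function can be called again with the larger totalItems and it continues from the stored position. Note that an index is only reliable if the list order is stable between runs; if the site exposes a stable item id, storing the set of seen ids is likely the safer variant.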
3 comments