Apify and Crawlee Official Forum

Updated 5 months ago

structure for broad shallow scrape

Hi, I have a question about the best way to structure my solution to this problem so that it takes full advantage of Apify's infrastructure and services. Once per day I need to scrape a dataset from 500 distinct domains, but only 1-2 pages per domain. The item selectors are different for almost every site. The two extremes are 500 separate Actors, or one Actor with a hashmap of domains to that domain's selectors. I also want to be able to track when a domain has broken. What's the best way to structure this so it aligns with Apify's infra?
2 comments
I would make one Actor that takes the info about domains and selectors from an external source, for example a Google spreadsheet.
You could use Slack webhooks to get notified when the selectors for a website fail. Then you can update the config and restart.
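A minimal sketch of that idea with Crawlee's CheerioCrawler, assuming the selector map is loaded from Actor input or a Google Sheet at runtime (it is hard-coded here for illustration) and that a Slack incoming-webhook URL is provided via a hypothetical SLACK_WEBHOOK_URL environment variable:

```ts
import { CheerioCrawler, Dataset } from 'crawlee';

// Hypothetical selector map; in practice this would come from Actor input or a Google Sheet.
const selectorMap: Record<string, Record<string, string>> = {
    'example.com': { title: 'h1.product-title', price: 'span.price' },
    'another-site.net': { title: '#name', price: '.cost' },
};

const brokenDomains = new Set<string>();

const crawler = new CheerioCrawler({
    async requestHandler({ request, $ }) {
        const domain = new URL(request.url).hostname.replace(/^www\./, '');
        const selectors = selectorMap[domain] ?? {};
        const item: Record<string, string | null> = { url: request.url };
        for (const [field, selector] of Object.entries(selectors)) {
            const value = $(selector).first().text().trim() || null;
            if (value === null) brokenDomains.add(domain); // selector no longer matches -> site likely changed
            item[field] = value;
        }
        await Dataset.pushData(item);
    },
    failedRequestHandler({ request }) {
        // The request itself failed (blocked, timeout, etc.), so flag the domain too.
        brokenDomains.add(new URL(request.url).hostname);
    },
});

await crawler.run(Object.keys(selectorMap).map((d) => `https://${d}/`));

// Notify Slack about any domains whose selectors or requests broke.
if (brokenDomains.size > 0) {
    await fetch(process.env.SLACK_WEBHOOK_URL!, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ text: `Broken domains: ${[...brokenDomains].join(', ')}` }),
    });
}
```

With this shape you keep a single scheduled Actor run per day, and the per-domain selectors live in data rather than code, so fixing a broken site is a spreadsheet edit instead of a deploy.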