Apify Discord Mirror

Updated 2 months ago

Google Maps crawler - Increasing place limit after initial run

At a glance

The community member ran a Google Maps scraper and discovered that significantly more places are available than the first run collected. They already have a dataset from that run and ask whether the place limit can be increased on the existing run configuration. If a new run is required, they want to know the best way to import or merge the existing scraped data, avoid duplicating places, and continue from where the previous run stopped.

The comments indicate that the answer depends on the Actor implementation, but that there is generally no way to extend already scraped data or to start a new run that skips data based on previous runs. One community member points out that this means running the Actor again and paying again for data that has already been scraped, and argues that the tool should have a history feature for extending a scrape. Another community member mentions that they have written a Google Maps crawler as a Python script that can solve the problem.

Hi everyone,
I recently ran a Google Maps scraper (https://apify.com/compass/crawler-google-places) to collect place data, and I've discovered that there are many more places available than what was initially collected in my first run.
Current Situation:
  • Successfully completed an initial scrape
  • Have collected data for X places
  • Discovered there are significantly more places available
  • Already have a dataset from the first run
Questions:
Is it possible to increase the place limit on my existing run configuration?
If I need to create a new run, what's the best way to:
  • Import/merge my existing scraped data
  • Avoid duplicating places already collected
  • Continue from where the previous run stopped
Any guidance on the most efficient approach would be greatly appreciated.
Thanks in advance!
4 comments
Hi @Cuivrax, what you describe really depends on the Actor implementation.

Generally there is no way to extend already scraped data or to start a new Run that skips data based on previous Runs.
@Pepa J That's unfortunate because you have to run your actor again and pay again for data that's already been scraped. It should have a history feature to allow extending the scraping process. For now, I'll stick with my Python script.
I have a Google Maps crawler that I wrote as a Python script, and it can solve the problem you are having.
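Since there is generally no built-in way to resume a run, one workaround is to run the Actor again with a higher limit and merge the two datasets afterwards. Below is a minimal sketch of that merge step, assuming both datasets have been loaded as lists of JSON objects and that each record carries a unique `placeId` field (an assumption about the export format; change `key` to whatever identifier your data actually contains). The `merge_places` helper is hypothetical and not part of any Apify library. Note that this only removes duplicates after the fact; it does not prevent the second run from re-scraping places that were already collected.

```python
def merge_places(old_items, new_items, key="placeId"):
    """Merge two lists of place records, keeping the first record seen per key.

    `key` is an assumed unique identifier field; records without it are
    kept as-is because they cannot be compared.
    """
    merged = {}   # identifier -> record, in first-seen order
    no_key = []   # records that lack the identifier field
    for item in list(old_items) + list(new_items):
        ident = item.get(key)
        if ident is None:
            no_key.append(item)
        elif ident not in merged:
            merged[ident] = item
    return list(merged.values()) + no_key


# Example with made-up records: the old run's copy of "b" wins.
first_run = [{"placeId": "a", "title": "Cafe A"}, {"placeId": "b", "title": "Cafe B"}]
second_run = [{"placeId": "b", "title": "Cafe B (new)"}, {"placeId": "c", "title": "Cafe C"}]
combined = merge_places(first_run, second_run)
```

Swapping the argument order makes the newer run's records take precedence instead, which may be preferable if details such as opening hours or ratings have changed between runs.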