Apify and Crawlee Official Forum

Updated 5 months ago

compass/crawler-google-places optimization

I like the scraper and would like to use it more frequently and effectively, and I plan on upgrading to a paid plan soon. I understand I can use the orchestrator to execute multiple runs at the same time to best utilize my resources, but I want to make sure that I am optimizing the individual runs. I have a wide range of categories that I would like to query. Should I do multiple runs per location and query different categories on each run, or should I slim the list down so that I can do a single run per location?

Also, at what point will there be no new results? For instance, I have gone through all the categories that I can use as input and have picked out things like bar, brewery, brewpub, etc., but I noticed that most of the log entries say either that there is no data for the search term or that all the data is duplicate. Are there certain categories that are better than others or that will encompass others?

Another question: if I abort a run because I had too many categories in the input, and then I start another run in the same location but with a smaller set of categories, will data be duplicated, or will these places be passed over because I already have them stored?

My end goal is essentially to have all the places that sell alcohol (bars, pubs, nightclubs, etc.), and I want to do this as efficiently as possible. Sorry for the long message; hopefully somebody has some insight for me.
1 comment
Hi, for efficiency I would use one run per location and the most generic categories: bar, restaurant, nightclub... Using both "bar" and "pub" might find many duplicates. I'd test them on a small location (a city center?) and see whether they give you the same results.

Also, at what point will there be no new results?...

If you enter N categories in the input, then the scraper performs N searches in the specified area, one for each category. When you search for "bar", "brewery" and "brewpub", from what I observed, the scraper finds most of the breweries and brewpubs when searching for "bar", and so the "brewery" and "brewpub" searches will find mostly duplicates. Hopefully this makes sense 🙂
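To make the overlap test concrete, here is a minimal Python sketch of how you could check which categories are worth keeping after a small trial. It assumes you ran each term separately on the test area and exported the results, and that every result item carries a unique `placeId` field; the function name `marginal_gain` and the sample data are made up for illustration.

```python
def marginal_gain(results_by_term):
    """Count, per search term, the places not already found by earlier terms.

    results_by_term maps a search term to its list of result items.
    Terms are processed in insertion order, so put the most generic first.
    Assumption: each item is a dict with a unique "placeId" key.
    """
    seen = set()
    gains = {}
    for term, items in results_by_term.items():
        ids = {item["placeId"] for item in items if item.get("placeId")}
        gains[term] = len(ids - seen)  # places this term adds on top of earlier ones
        seen |= ids
    return gains


# Hypothetical trial results for a small test area.
sample = {
    "bar": [{"placeId": "a"}, {"placeId": "b"}, {"placeId": "c"}],
    "brewery": [{"placeId": "b"}, {"placeId": "c"}],
    "nightclub": [{"placeId": "c"}, {"placeId": "d"}],
}
print(marginal_gain(sample))
```

A term whose gain is zero or close to it in the trial (like "brewery" in the sample data) is probably covered by a more generic term and could be dropped from the full run.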

...will data be duplicated, or will these places be passed over because I already have them stored?

The runs are independent, so the second run will scrape places regardless of whether they were scraped in the previous run.
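Since the runs do not know about each other, any deduplication across them has to happen on your side after you export the data. A minimal sketch in Python, assuming each exported item has a unique `placeId` field (the helper name `merge_runs` and the sample items are hypothetical):

```python
def merge_runs(*runs):
    """Merge result lists from several runs, keeping one item per place.

    Assumption: each item is a dict with a unique "placeId" key.
    The first occurrence of a place wins; items without an ID are skipped.
    """
    merged = {}
    for items in runs:
        for item in items:
            place_id = item.get("placeId")
            if place_id is None:
                continue
            merged.setdefault(place_id, item)
    return list(merged.values())


# Hypothetical data: an aborted run and a later run on the same location.
aborted_run = [{"placeId": "a", "title": "Bar A"}, {"placeId": "b", "title": "Pub B"}]
second_run = [{"placeId": "b", "title": "Pub B"}, {"placeId": "c", "title": "Club C"}]

places = merge_runs(aborted_run, second_run)
print([p["placeId"] for p in places])
```

Note that this only cleans up the stored data. The second run still scrapes the overlapping places again, so the duplicates still cost run time.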