Apify and Crawlee Official Forum

Updated 4 months ago

hi guys

Hello there,

I'm reaching out for assistance with data scraping. Once completed, is there a method to structure the data according to my requirements? Additionally, is there a way to eliminate replicated data and retain only one instance? Your expertise in this matter would be greatly appreciated. Thank you.
1 comment
Hi, on the Apify platform you can store your data either in a dataset or in a key-value store. A dataset accepts any valid JSON, so you can structure your data any way you want. Does this answer your question?
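
To illustrate the point about structure, here is a minimal sketch in plain Python of reshaping a raw scraped item into a JSON-serializable record before storing it. The field names (sku, name, price, tags) and the to_record helper are hypothetical, made up for this example, and the currency is assumed to be USD. The actual push to a dataset is left out; the sketch only shows that the shape of each record is up to you.

```python
import json


def to_record(raw: dict) -> dict:
    """Reshape one raw scraped item into the structure we want to store.

    All field names here are hypothetical; adapt them to your own data.
    """
    return {
        "id": raw["sku"].strip(),
        "title": raw["name"].strip(),
        "price": {
            # Strip the currency symbol and thousands separator before parsing.
            "amount": float(raw["price"].replace("$", "").replace(",", "")),
            "currency": "USD",  # assumption for this example
        },
        # Turn a comma-separated string into a clean list of lowercase tags.
        "tags": [t.strip().lower() for t in raw.get("tags", "").split(",") if t.strip()],
    }


raw_item = {
    "sku": " A-100 ",
    "name": " Blue Widget ",
    "price": "$1,299.50",
    "tags": "Tools, Garden",
}

record = to_record(raw_item)

# Any structure that survives json.dumps can be stored as a dataset item.
print(json.dumps(record, indent=2))
```

Nested objects and lists are fine as long as the result is valid JSON.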

Regarding the elimination of duplicates, we usually keep track of the IDs of items we have already pushed to the dataset (e.g. in a hash set). Every time we are about to push a new item, we check whether its ID has already been pushed, and skip the item if it has.
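
The ID-tracking approach described above could be sketched like this in plain Python. The unique_items helper and the "id" field are assumptions for illustration, not part of any Apify API; in a real scraper you would push each yielded item to your dataset instead of collecting it in a list.

```python
from typing import Callable, Hashable, Iterable, Iterator


def unique_items(
    items: Iterable[dict],
    key: Callable[[dict], Hashable],
) -> Iterator[dict]:
    """Yield only the first item seen for each ID, preserving input order."""
    seen = set()  # IDs of items that have already been passed through
    for item in items:
        item_id = key(item)
        if item_id in seen:
            continue  # duplicate: an item with this ID was already yielded
        seen.add(item_id)
        yield item


scraped = [
    {"id": "A-100", "title": "Blue Widget"},
    {"id": "B-200", "title": "Red Widget"},
    {"id": "A-100", "title": "Blue Widget (seen again on page 2)"},
]

# Hypothetical choice: the "id" field is what identifies a duplicate.
deduped = list(unique_items(scraped, key=lambda item: item["id"]))

for item in deduped:
    print(item["id"], item["title"])
```

Note that the set lives only in memory, so this sketch forgets what it has seen if the run restarts. If that matters for your scraper, the set of IDs would need to be saved somewhere persistent and reloaded.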