Apify and Crawlee Official Forum

Updated 5 months ago

Error handling/Best Practices Python SDK

Hello,
I am using pre-built actors in my application. I use them like this to create the dataset:
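    import asyncio

    from apify_client import ApifyClientAsync

    # Runs inside an async function; settings, actor, run_input, dataset and logger come from the application.
    client = ApifyClientAsync(token=settings.APIFY_API_TOKEN)
    run = await client.actor(actor.value).start(run_input=run_input)
    processed = 0  # number of items already copied, used as the read offset
    while True:
        await asyncio.sleep(2)
        # Read the status before the items, so the pass that sees a finished run also reads its last items.
        run_status = await client.run(run["id"]).get()
        async for item in client.dataset(run["defaultDatasetId"]).iterate_items(offset=processed):
            dataset.append(item)
            processed += 1
            logger.info(f"processing item: {item.get('url')}")
        # A freshly started run can still be READY, so treat that as in progress too.
        if run_status.get("status") in ("READY", "RUNNING"):
            logger.info("Run is still running")
            continue
        logger.info("Run is finished.")
        break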

I want to improve the error handling of this approach. I am wondering which types of errors or issues I could encounter and what the best practices are. For example: what happens if the actor breaks (memory, CPU, or another issue), and which exception types would I get? What if there are errors in the dataset (400 status code, crawler blocked, etc.)? Does anyone have recommendations here? Thank you!
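For the "actor breaks" part of the question, one possible starting point is to treat the run status as the source of truth instead of checking only for RUNNING. The following is a minimal sketch, not an official recommendation. It assumes the run object returned by `client.run(run_id).get()` carries a `status` field with the platform's documented values (READY, RUNNING, SUCCEEDED, FAILED, TIMING-OUT, TIMED-OUT, ABORTING, ABORTED) and a `statusMessage` field. `ActorRunError`, `wait_for_run`, `poll_seconds`, and `max_polls` are hypothetical names introduced for illustration.

```python
import asyncio

# Run statuses as listed in the Apify platform docs (assumed complete here).
IN_PROGRESS = {"READY", "RUNNING", "TIMING-OUT", "ABORTING"}
FAILURE = {"FAILED", "TIMED-OUT", "ABORTED"}


class ActorRunError(RuntimeError):
    """Hypothetical application-level error for a run that did not succeed."""


async def wait_for_run(client, run_id, poll_seconds=2.0, max_polls=900):
    """Poll a run until it reaches a terminal status.

    `client` is expected to behave like ApifyClientAsync: client.run(id).get()
    returns the run as a dict, or None if the run does not exist.
    """
    for _ in range(max_polls):
        run = await client.run(run_id).get()
        if run is None:
            raise ActorRunError(f"run {run_id} not found")
        status = run.get("status")
        if status == "SUCCEEDED":
            return run
        if status in FAILURE:
            # statusMessage is assumed to hold the platform's explanation, if any.
            raise ActorRunError(
                f"run {run_id} ended with status {status}: {run.get('statusMessage')}"
            )
        if status not in IN_PROGRESS:
            raise ActorRunError(f"run {run_id} has unexpected status {status!r}")
        await asyncio.sleep(poll_seconds)
    # Client-side guard so a stuck run cannot keep the loop alive forever.
    raise TimeoutError(f"run {run_id} did not finish after {max_polls} polls")
```

With this shape the caller can wrap `await wait_for_run(client, run["id"])` in a `try`/`except ActorRunError` block and decide whether to retry, alert, or keep the partial dataset. HTTP-level failures raised by the client itself (network errors, API error responses) are a separate concern and are not covered by this sketch.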
5 comments
Any ideas? Thanks
Hi, can you please re-post this question to the Python forum? It should be more suitable for this question and it will be easier for you to get an answer.
Can you please link the python forum? Is it another channel/server?
I didn't see this channel at all. Thanks. Posted it there.