I have a Python requests actor that is somehow restarting itself after many hours, with the same input, and appending the results to the existing output. That's very annoying! If there was a crash, it should just abort and let me restart it myself if needed. Restarting with the same input over again just wastes my time and money. Does anyone know why this is happening?
Here's what the log shows -- there's no other info about any errors. The crawler was running just fine right up until 11:08:34, was about 21% done, and then I get:
2023-09-08T11:08:34.534Z ACTOR: Sending Docker container SIGTERM signal.
2023-09-08T11:08:48.281Z ACTOR: Pulling Docker image from repository.
2023-09-08T11:08:49.924Z ACTOR: Creating Docker container.
2023-09-08T11:08:50.307Z ACTOR: Starting Docker container.
2023-09-08T11:08:51.769Z INFO Initializing actor...
2023-09-08T11:08:51.771Z INFO System info ({"apify_sdk_version": "1.1.4", "apify_client_version": "1.4.1", "python_version": "3.11.5", "os": "linux"})
And then the crawler starts from scratch using the old input data -- it doesn't "restore" or keep any old state -- so I've wasted a lot of time (and actor $) re-crawling the same input.
The timeout was set to about 10 days, and the crawler had been running for 22 hours before I noticed. Another strange thing is that the full log only goes back 5 hours from when I stopped it -- 06:02 UTC to 11:08 UTC -- when it should go back 22 hours.
How can I protect against this?
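I assume the answer is that I need to checkpoint my own progress and skip already-finished work when the container comes back up. Below is roughly what I'm thinking of -- a minimal sketch, where the `CRAWL_STATE` key name and the checkpoint interval are just things I made up, and I'm assuming the run's default key-value store survives the restart. Is this the right approach, or is there a way to stop the restart from happening at all?

```python
import asyncio
import requests
from apify import Actor

STATE_KEY = "CRAWL_STATE"   # key name is mine, not anything official
CHECKPOINT_EVERY = 50       # arbitrary checkpoint interval I picked

async def main() -> None:
    async with Actor:
        actor_input = await Actor.get_input() or {}
        urls = actor_input.get("urls", [])

        # On (re)start, load whatever a previous attempt of this run saved;
        # falls back to empty state on a genuinely fresh start.
        state = await Actor.get_value(STATE_KEY) or {}
        done = set(state.get("done", []))

        for count, url in enumerate(urls, start=1):
            if url in done:
                continue  # already crawled before the restart, skip it

            # Plain requests call, pushed to a thread so it doesn't block the loop.
            response = await asyncio.to_thread(requests.get, url, timeout=30)
            await Actor.push_data({"url": url, "status": response.status_code})
            done.add(url)

            # Checkpoint progress every N URLs so a restarted container
            # can pick up where this one left off instead of re-crawling.
            if count % CHECKPOINT_EVERY == 0:
                await Actor.set_value(STATE_KEY, {"done": sorted(done)})

        await Actor.set_value(STATE_KEY, {"done": sorted(done)})

if __name__ == "__main__":
    asyncio.run(main())
```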