Hi, good day. I'm creating a POST API that accepts the following JSON body: { "url": "https://crawlee.dev/python/", "targets": ["html", "pdf"] }
The targets list contains the file extensions that my code should download whenever it discovers matching files.
I'm at my wit's end because I don't understand the error I'm getting: [crawlee.memory_storage_client._request_queue_client] WARN Error adding request to the queue: Request ID does not match its unique_key.
I have never explicitly set the id of a request. What is the purpose of doing that? I think it collides with an internal Crawlee mechanism that sets the id automatically. You can set just the unique key.
Hi @HonzaS, I'm building a POST endpoint with FastAPI that accepts a JSON body containing the URL to crawl and the target extensions, so that I can choose which file types get downloaded from every page I crawl. Example: in my JSON body I sent { "url": "https://crawlee.dev/python/", "targets": ["html", "pdf"] }
My program crawls the URL provided in the JSON body and downloads every file that has an html or pdf extension. I can add more extensions, such as png or jpg.
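For anyone following along, the extension check described above could look roughly like this. It is only a sketch using the standard library: `has_target_extension` is a hypothetical helper name (it is not part of Crawlee or FastAPI), and it assumes the extension can be read from the URL path, so pages without an extension in the URL (like https://crawlee.dev/python/) will not match.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def has_target_extension(url: str, targets: list[str]) -> bool:
    """Return True if the URL path ends with one of the requested extensions."""
    # Only inspect the path, so query strings and fragments are ignored.
    suffix = PurePosixPath(urlparse(url).path).suffix  # ".pdf", or "" if there is none
    extension = suffix.lower().lstrip(".")
    # Accept targets written either as "pdf" or ".pdf", in any letter case.
    wanted = {target.lower().lstrip(".") for target in targets}
    return bool(extension) and extension in wanted
```

Inside the request handler, a call like `has_target_extension(context.request.url, targets)` could then decide whether the current page should be saved.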
Hi everyone! Glad to say this finally works! I fixed the latest problem by adding the following to my enqueue_links call: await context.enqueue_links(user_data={"targets": targets})
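In case it helps someone later, here is a minimal sketch of how that line might sit inside a handler. The name `handle_page` is made up for illustration, and it assumes the first request was created with the targets already in its user_data (for example via `Request.from_url(url, user_data={"targets": targets})`); my real handler does more than this.

```python
from typing import Any


async def handle_page(context: Any) -> None:
    """Handler body; `context` is assumed to be a Crawlee crawling context."""
    # Read the targets attached to the current request (empty list if none were set).
    targets = context.request.user_data.get("targets", [])

    # Forward the same targets to every link found on this page, so the handler
    # can still read them when those pages are processed later.
    await context.enqueue_links(user_data={"targets": targets})
```

As far as I understand it, without the `user_data` argument the enqueued requests do not carry the targets, so the handler has nothing to read on the follow-up pages.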