Apify and Crawlee Official Forum

Updated last week

Simple POST-example

The tutorial on basic POST functionality appears to have a flaw:
https://crawlee.dev/python/docs/examples/fill-and-submit-web-form

The code below makes an actual POST request, but the data does not reach the server. I tried it against various endpoints.

Two questions:
1) What is broken here, and how can I fix it?
2) My biggest concern with Crawlee is that I have no clue how to troubleshoot this kind of bug.

Where can one check what goes wrong? For example, how can I check under the hood whether CURL (?), or whatever library makes the actual request, is populating the payload correctly?
The framework has many benefits, but all the abstractions make it very hard to troubleshoot. This is probably down to my own mistakes and inexperience with the framework, but any guidance on troubleshooting would be great: when simple things fail with no way to investigate, using Crawlee becomes quite cumbersome.


import asyncio
import json

from crawlee import Request
from crawlee.http_crawler import HttpCrawler, HttpCrawlingContext


async def main() -> None:
    crawler = HttpCrawler()

    # Define the default request handler, which will be called for every request.
    @crawler.router.default_handler
    async def request_handler(context: HttpCrawlingContext) -> None:
        context.log.info(f'Processing {context.request.url} ...')
        response = context.http_response.read().decode('utf-8')
        context.log.info(f'Response: {response}')  # To see the response in the logs.

    # Prepare a POST request to the form endpoint.
    request = Request.from_url(
        url='https://httpbin.org/post',
        method='POST',
        payload=json.dumps(
            {
                'custname': 'John Doe',
            }
        ).encode(),
    )

    # Run the crawler with the initial list of requests.
    await crawler.run([request])


if __name__ == '__main__':
    asyncio.run(main())
4 comments
Hi @crawleexl, you may notice that in the tutorial the form data is not passed via payload but via data.

Currently payload has no effect and is not passed to the HTTP client.

There are known problems with POST requests; they should be fixed in the next release by this pull request: https://github.com/apify/crawlee-python/pull/542.
Hi @Mantisus

I'm having the same issue when passing data via payload in the same way. When will the Crawlee team release version 0.4.0? That version seems to be working in my tests.

I'm trying to resolve this payload issue in my project.
I'm not an Apify employee, and I don't know when the next release will be.
Version 0.4.3 was released recently, so the issue should be fixed.
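
On the troubleshooting question above (how to check whether the payload is actually populated): one framework-independent approach is to point the request at a small local server that records exactly what arrives. The following is a minimal sketch using only the Python standard library. The names `EchoHandler`, `start_echo_server`, `post_json` and `received` are illustrative and not part of Crawlee, and `post_json` is only a plain `urllib` baseline for comparison, not how Crawlee sends requests.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Every POST seen by the server is stored here as (headers, raw body bytes).
received = []


class EchoHandler(BaseHTTPRequestHandler):
    """Records the body of each POST and echoes a short JSON summary back."""

    def do_POST(self):
        length = int(self.headers.get('Content-Length', 0))
        body = self.rfile.read(length)
        received.append((dict(self.headers), body))

        reply = json.dumps({
            'received_bytes': len(body),
            'body': body.decode('utf-8', 'replace'),
        }).encode()
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, format, *args):
        # Silence the default per-request logging on stderr.
        pass


def start_echo_server():
    """Start the server on a free local port in a background thread."""
    server = HTTPServer(('127.0.0.1', 0), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def post_json(url, obj):
    """Baseline client: send obj as a JSON body using plain urllib."""
    req = urllib.request.Request(
        url,
        data=json.dumps(obj).encode(),
        headers={'Content-Type': 'application/json'},
        method='POST',
    )
    # Bypass any proxy configured in the environment for this local call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(req) as resp:
        return json.loads(resp.read())


if __name__ == '__main__':
    server = start_echo_server()
    url = f'http://127.0.0.1:{server.server_address[1]}/post'
    print(post_json(url, {'custname': 'John Doe'}))
    server.shutdown()
```

To compare, replace https://httpbin.org/post in the crawler script with the local URL and look at what lands in `received`. If the baseline call delivers a body but the crawler's request arrives with zero bytes, the payload is being lost before it reaches the HTTP client rather than at the endpoint.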