Flaw in tutorial on basic POST functionality:
https://crawlee.dev/python/docs/examples/fill-and-submit-web-form
It makes an actual POST request, but the data is not reaching the server; I have tried this against various endpoints.
Two questions:
1) What is broken here and how to fix it?
2) My biggest concern using Crawlee is that I have no clue how to troubleshoot this kind of bug.
Where can one check what goes wrong? For example, how can I check under the hood whether the library that makes the actual request (curl? httpx?) is populating the payload correctly?
The framework has many benefits, but because of all the abstractions it is very hard to troubleshoot. This is probably down to my own inexperience with the framework, but any guidance on troubleshooting would be great; simple things failing without any way to dig in makes using Crawlee quite cumbersome.
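For context, the only thing I have found so far is turning on DEBUG logging before the crawler runs. Assuming Crawlee's default HTTP client is httpx (my assumption, not confirmed), the httpx/httpcore loggers should then print each outgoing request:

```python
import logging

# Turn on DEBUG globally; if httpx/httpcore is the underlying client
# (an assumption), it will log each request it actually sends.
logging.basicConfig(level=logging.DEBUG)

# Or target just the HTTP layer to keep the noise down:
for name in ('httpx', 'httpcore'):
    logging.getLogger(name).setLevel(logging.DEBUG)
```

This shows the request line and headers, but as far as I can tell not the body, which is exactly the part I need to verify.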
import asyncio
import json

from crawlee import Request
from crawlee.http_crawler import HttpCrawler, HttpCrawlingContext


async def main() -> None:
    crawler = HttpCrawler()

    # Define the default request handler, which will be called for every request.
    @crawler.router.default_handler
    async def request_handler(context: HttpCrawlingContext) -> None:
        context.log.info(f'Processing {context.request.url} ...')
        response = context.http_response.read().decode('utf-8')
        context.log.info(f'Response: {response}')  # To see the response in the logs.

    # Prepare a POST request to the form endpoint.
    request = Request.from_url(
        url='https://httpbin.org/post',
        method='POST',
        payload=json.dumps(
            {
                'custname': 'John Doe',
            }
        ).encode(),
    )

    # Run the crawler with the initial list of requests.
    await crawler.run([request])


if __name__ == '__main__':
    asyncio.run(main())
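One way I have tried to rule out the remote endpoint is pointing the request at a tiny local echo server, so I can see exactly what (if anything) arrives on the server side. This is a stdlib-only sketch with no Crawlee involved (the client here is plain urllib, just to prove the server works); the idea would be to swap the urllib call for the Crawlee request above and compare:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request as UrlRequest, urlopen


class EchoHandler(BaseHTTPRequestHandler):
    """Echoes the POST body back, so the client sees what the server received."""

    def do_POST(self) -> None:
        length = int(self.headers.get('Content-Length', 0))
        body = self.rfile.read(length)  # exactly what reached the server
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args) -> None:  # silence the default access log
        pass


server = HTTPServer(('127.0.0.1', 0), EchoHandler)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f'http://127.0.0.1:{server.server_port}/post'

# Sanity check with plain urllib; replace this with the Crawlee POST to compare.
payload = json.dumps({'custname': 'John Doe'}).encode()
with urlopen(UrlRequest(url, data=payload, method='POST')) as resp:
    echoed = resp.read()

server.shutdown()
print(echoed == payload)  # True: the payload survived the round trip
```

With plain urllib the echoed body matches the payload, so if the same server receives an empty body from the Crawlee request, the problem is on the client side.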