Dear all, I am trying to scrape data from a public IP. For some reason CheerioCrawler is not getting the data back, but in Postman I can easily get it. The proxy IP is whitelisted, because I am using the same IP for both Postman and CheerioCrawler.
Postman does add some default headers, but when I look at my request object the headers are empty. Does anyone know at which point CheerioCrawler sets the headers and generates fingerprints, and how I can see them?
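One way to see the headers a client really sends, independent of what the request object shows, is to point it at a small local server that echoes them back. The following is only a sketch using Node's built-in `http` module; `startHeaderEchoServer` is a hypothetical helper name, not part of Crawlee or Postman, and it assumes the crawler can reach `127.0.0.1` directly (without the proxy).

```typescript
import * as http from "node:http";
import type { AddressInfo } from "node:net";

// Hypothetical helper: starts a local server that prints and echoes back
// every header it receives, so two clients can be compared side by side.
function startHeaderEchoServer(
  port = 0, // 0 lets the OS pick a free port
): Promise<{ url: string; close: () => Promise<void> }> {
  return new Promise((resolve, reject) => {
    const server = http.createServer((req, res) => {
      const body = JSON.stringify(
        {
          method: req.method,
          headers: req.headers, // lower-cased names, duplicates merged
          rawHeaders: req.rawHeaders, // original casing and order: [name, value, ...]
        },
        null,
        2,
      );
      console.log(body);
      // "connection: close" ends each socket after the response,
      // so the server can shut down without waiting on keep-alive.
      res.writeHead(200, {
        "content-type": "application/json",
        connection: "close",
      });
      res.end(body);
    });
    server.once("error", reject);
    server.listen(port, "127.0.0.1", () => {
      const { port: actualPort } = server.address() as AddressInfo;
      resolve({
        url: `http://127.0.0.1:${actualPort}/`,
        close: () => new Promise<void>((done) => server.close(() => done())),
      });
    });
  });
}
```

Requesting the returned `url` once from Postman and once from the crawler should give two header dumps that can be compared line by line.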
Thanks for your response. Actually, I am more interested in what is being sent in the request headers. I have debugged it further and found that when I try to scrape the API, it does not work on the first try, but when I refresh the browser opened by Crawlee, it does work. To check what is going on, I used Playwright in headful mode and could see that there was an error, but when I refreshed the same page I got the response back. The API I am trying to scrape is very sensitive to some headers, as you can see in the picture. I think some headers are not set properly in the initial request, and on refresh the browser adds its default headers and then it works.
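To narrow down which header the API is sensitive to, it may help to diff the headers of the failing first request against those of the working refresh. Below is a minimal sketch of such a comparison; `diffHeaders` and `HeaderMap` are hypothetical names, and the header values in the usage example are made up for illustration, not taken from the actual requests.

```typescript
type HeaderMap = Record<string, string | string[] | undefined>;

interface HeaderDiff {
  onlyInA: string[];
  onlyInB: string[];
  changed: { name: string; a: string; b: string }[];
}

// Lower-case the names and join multi-value headers so that the
// comparison ignores casing differences between clients.
function normalizeHeaders(headers: HeaderMap): Map<string, string> {
  const out = new Map<string, string>();
  Object.keys(headers).forEach((name) => {
    const value = headers[name];
    if (value === undefined) return;
    out.set(name.toLowerCase(), Array.isArray(value) ? value.join(", ") : value);
  });
  return out;
}

// Reports headers present on only one side and headers whose values differ.
function diffHeaders(a: HeaderMap, b: HeaderMap): HeaderDiff {
  const na = normalizeHeaders(a);
  const nb = normalizeHeaders(b);
  const diff: HeaderDiff = { onlyInA: [], onlyInB: [], changed: [] };
  na.forEach((va, name) => {
    const vb = nb.get(name);
    if (vb === undefined) diff.onlyInA.push(name);
    else if (vb !== va) diff.changed.push({ name, a: va, b: vb });
  });
  nb.forEach((_vb, name) => {
    if (!na.has(name)) diff.onlyInB.push(name);
  });
  return diff;
}

// Usage with made-up values: A = failing first request, B = working refresh.
const firstTry: HeaderMap = { Accept: "*/*", "User-Agent": "example-agent" };
const afterRefresh: HeaderMap = {
  accept: "application/json",
  "user-agent": "example-agent",
  "cache-control": "max-age=0",
};
console.log(diffHeaders(firstTry, afterRefresh));
```

Headers that show up only on the working side, or with a different value, are the first candidates to set explicitly on the crawler's request.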