I've been playing around with deploying PlaywrightCrawler to AWS Lambda and it's working well. I've used @sparticuz/chromium for the chrome exe as per this doc:
https://crawlee.dev/docs/deployment/aws-browsers
However, upon examining the request headers it's generating, I've discovered the sec-ch-ua hint header is always as follows:
"HeadlessChrome";v="129", "Not=A?Brand";v="8", "Chromium";v="129"
I've restricted the fingerprint generation options to 'Chrome' and the User-Agent header is nicely randomized (always chrome, but with variations).
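For context, a restriction like this is usually expressed through the crawler's `browserPoolOptions`. This is only a minimal sketch, assuming the `fingerprintOptions.fingerprintGeneratorOptions` shape from the Crawlee docs. The variable name is illustrative, and the object would be passed to the `PlaywrightCrawler` constructor:

```typescript
// Illustrative only: a plain object mirroring the assumed option shape.
// In a real project the browser name would normally come from
// `BrowserName.chrome`, exported by `@crawlee/browser-pool`.
const browserPoolOptions = {
    useFingerprints: true,
    fingerprintOptions: {
        fingerprintGeneratorOptions: {
            // Only generate Chrome fingerprints (and thus Chrome User-Agents).
            browsers: [{ name: 'chrome' }],
        },
    },
};
```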
I've also observed that the Chrome version in the two headers doesn't always match, for example:
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36",
"sec-ch-ua": "HeadlessChrome";v="129", "Not=A?Brand";v="8", "Chromium";v="129"
This mismatch is surely going to make PlaywrightCrawler significantly easier for anti-bot systems to detect?
Running the same code locally (without @sparticuz/chromium), the headers look fine:
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
"sec-ch-ua": "\"Chromium\";v=\"124\", \"Google Chrome\";v=\"124\", \"Not-A.Brand\";v=\"99\""
Is there something I can do or set to get the sec-ch-ua and User-Agent headers aligned when using @sparticuz/chromium?
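For anyone wanting to reproduce the comparison, here is a minimal sketch of a consistency check between the two headers. The helper names are hypothetical (they are not part of Crawlee, Playwright, or @sparticuz/chromium), and the rule it applies (the Chromium brand version equals the User-Agent major version, and no brand contains "Headless") is my own assumption about what "aligned" should mean:

```typescript
// Hypothetical helpers for illustration only.

/** Extracts the Chrome major version from a User-Agent string, or null if absent. */
function chromeMajorFromUserAgent(userAgent: string): number | null {
    const match = /Chrome\/(\d+)\./.exec(userAgent);
    return match ? Number(match[1]) : null;
}

/** Parses a sec-ch-ua value into a brand -> major version lookup. */
function parseSecChUa(secChUa: string): Record<string, number> {
    const brands: Record<string, number> = {};
    // Each entry looks like: "Brand";v="123"
    const entry = /"([^"]+)";v="(\d+)"/g;
    let match: RegExpExecArray | null;
    while ((match = entry.exec(secChUa)) !== null) {
        brands[match[1]] = Number(match[2]);
    }
    return brands;
}

/** True when the Chromium brand matches the User-Agent version and no headless brand is advertised. */
function headersLookConsistent(userAgent: string, secChUa: string): boolean {
    const uaMajor = chromeMajorFromUserAgent(userAgent);
    const brands = parseSecChUa(secChUa);
    const hasHeadlessBrand = Object.keys(brands).some((brand) => brand.includes('Headless'));
    return uaMajor !== null && brands['Chromium'] === uaMajor && !hasHeadlessBrand;
}

// The two header pairs from this post.
const lambdaUserAgent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36';
const lambdaSecChUa = '"HeadlessChrome";v="129", "Not=A?Brand";v="8", "Chromium";v="129"';
const localUserAgent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36';
const localSecChUa = '"Chromium";v="124", "Google Chrome";v="124", "Not-A.Brand";v="99"';

console.log(headersLookConsistent(lambdaUserAgent, lambdaSecChUa)); // false
console.log(headersLookConsistent(localUserAgent, localSecChUa)); // true
```

A check like this could be run against the headers captured from each environment to confirm whether a given fix actually brings them back in line.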
Thanks