Apify

Apify and Crawlee Official Forum


Make PlaywrightCrawler less unique and avoid blocking? (canvas/fonts/plugins/permissions...)

I checked my program (PlaywrightCrawler) against https://amiunique.org/fingerprint
I used a US residential proxy and took 3 screenshots, see below.
It seems there are some areas where Crawlee could do better (be less unique, less detectable)!

Here is the list (these items are red in the screenshots):
  • User Agent (I used the fingerprint generator for this!)
  • Canvas
  • Navigator properties
  • List of fonts
  • List of plugins
  • Permissions
Some settings in my PlaywrightCrawler:
useFingerprints: true, useFingerprintCache: false, launcher: firefox
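
For reference, a rough sketch of where these settings sit in the PlaywrightCrawler options, assuming Crawlee v3's `browserPoolOptions.fingerprintOptions` layout. The helper name `buildFingerprintOptions` and the generator values (Firefox 100+, desktop, Windows) are illustrative assumptions, not the settings actually used in this thread. Narrowing the generator to the browser that is really launched might reduce mismatches between the reported User Agent and the real engine, but that is untested here.

```javascript
// Hypothetical helper: returns an options object for `new PlaywrightCrawler(...)`.
// `firefoxLauncher` is expected to be `firefox` imported from the `playwright` package.
function buildFingerprintOptions(firefoxLauncher) {
    return {
        launchContext: { launcher: firefoxLauncher },
        browserPoolOptions: {
            useFingerprints: true,
            fingerprintOptions: {
                useFingerprintCache: false,
                // Assumption: illustrative constraints so that generated
                // fingerprints describe the same browser family that is launched.
                fingerprintGeneratorOptions: {
                    browsers: [{ name: 'firefox', minVersion: 100 }],
                    devices: ['desktop'],
                    operatingSystems: ['windows'],
                },
            },
        },
    };
}
```

The returned object can be spread into the rest of the crawler options (requestHandler, preNavigationHooks and so on).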

Regarding the list of plugins: I use some JS code (the pluginContent string) taken from here: https://discord.com/channels/801163717915574323/1059483872271798333
and inject it into the page this way:
    preNavigationHooks: [
        async ({ page }) => {
            // Register the script so it runs before the page's own scripts.
            await page.addInitScript({ content: pluginContent });
        },
    ],


Well, this hack simulates the presence of some PDF plugins, but I have the impression there are better solutions for plugins/fonts/permissions...
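
To make the idea concrete, here is a minimal sketch of what such a `pluginContent` string could look like. It is a hypothetical stand-in, not the code from the linked Discord thread. The function name `buildPluginContent` is an assumption. The five plugin names are the fixed PDF viewer list from the HTML specification. Plain objects like these are not a real `PluginArray`, so a careful detector can probably still tell the difference.

```javascript
// Hypothetical stand-in for the `pluginContent` string mentioned above.
function buildPluginContent(pluginNames) {
    // JSON.stringify embeds the names safely into the generated source text.
    return `
        (() => {
            const fakePlugins = ${JSON.stringify(pluginNames)}.map((name) => ({
                name,
                filename: 'internal-pdf-viewer',
                description: 'Portable Document Format',
            }));
            // Define an own "plugins" getter on the navigator object.
            Object.defineProperty(navigator, 'plugins', {
                get: () => fakePlugins,
                configurable: true,
            });
        })();
    `;
}

const pluginContent = buildPluginContent([
    'PDF Viewer',
    'Chrome PDF Viewer',
    'Chromium PDF Viewer',
    'Microsoft Edge PDF Viewer',
    'WebKit built-in PDF',
]);
```

The resulting string is what gets passed to `page.addInitScript({ content: pluginContent })` in the hook above.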
Attachments
3.png
2.png
1.png
10 comments
Hi,
If you have ideas for improving Crawlee, I suggest you raise an issue or PR at https://github.com/apify/crawlee/issues.
Some improvements are already in progress, thanks for the suggestion. cc
Thanks!
As for fixing the "too unique User Agent": I think this should be done somewhere in the libraries, maybe in the fingerprint generator?
For other things like the list of fonts, list of plugins, etc., using puppeteer-extra-plugin-stealth is probably a good idea.

I mean this part of the documentation:
Try Puppeteer with the puppeteer-extra-plugin-stealth plugin. Generally, Crawlee's default configuration should have stronger bypassing but some features might land first in the stealth plugin.

I have two problems with puppeteer-extra-plugin-stealth:
  1. I need an example of how to use this plugin with Crawlee.
  2. What should I do if I have already built my JS crawler with PlaywrightCrawler?
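
For the first question, a minimal sketch with PuppeteerCrawler, assuming `crawlee`, `puppeteer`, `puppeteer-extra` and `puppeteer-extra-plugin-stealth` are installed. The helper names `buildPuppeteerCrawlerOptions` and `runPuppeteerStealth` are made up for illustration. The idea is to register the plugin on the puppeteer-extra instance and hand that instance to Crawlee as the launcher.

```javascript
// Hypothetical helper: options for a PuppeteerCrawler that uses a custom launcher.
function buildPuppeteerCrawlerOptions(launcher) {
    return {
        launchContext: { launcher, launchOptions: { headless: true } },
        async requestHandler({ page, request, log }) {
            log.info(`${request.url}: ${await page.title()}`);
        },
    };
}

// Dynamic imports keep this loadable from both CommonJS and ES module projects.
async function runPuppeteerStealth(startUrl) {
    const { PuppeteerCrawler } = await import('crawlee');
    const { default: puppeteerExtra } = await import('puppeteer-extra');
    const { default: stealthPlugin } = await import('puppeteer-extra-plugin-stealth');

    // Register the stealth evasions before the browser is launched.
    puppeteerExtra.use(stealthPlugin());

    const crawler = new PuppeteerCrawler(buildPuppeteerCrawlerOptions(puppeteerExtra));
    await crawler.run([startUrl]);
}
```

Calling `runPuppeteerStealth('https://amiunique.org/fingerprint')` would run one crawl against the fingerprint page. Whether the red items actually turn green is not something this sketch verifies.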
There's a playwright-extra stealth plugin too, by the same maintainers.
we are talking about this: https://github.com/berstend/puppeteer-extra/tree/master/packages/playwright-extra
right?

Any example of how to use it with PlaywrightCrawler?
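
A possible way to retrofit an existing PlaywrightCrawler setup, sketched under the assumption that `crawlee`, `playwright`, `playwright-extra` and `puppeteer-extra-plugin-stealth` are installed. The helper names `withStealthLauncher` and `createStealthPlaywrightCrawler` are hypothetical. The sketch uses `chromium` because, as far as I know, the stealth evasions mostly target Chromium; with the `firefox` launcher they may have little or no effect.

```javascript
// Hypothetical helper: keeps all existing crawler options and only swaps the launcher.
function withStealthLauncher(existingOptions, stealthLauncher) {
    return {
        ...existingOptions,
        launchContext: {
            ...existingOptions.launchContext,
            launcher: stealthLauncher,
        },
    };
}

// Dynamic imports keep this loadable from both CommonJS and ES module projects.
async function createStealthPlaywrightCrawler(existingOptions) {
    const { PlaywrightCrawler } = await import('crawlee');
    // playwright-extra exports wrapped browser types that accept plugins.
    const { chromium } = await import('playwright-extra');
    const { default: stealthPlugin } = await import('puppeteer-extra-plugin-stealth');

    // Register the stealth evasions before the browser is launched.
    chromium.use(stealthPlugin());

    return new PlaywrightCrawler(withStealthLauncher(existingOptions, chromium));
}
```

The rest of the crawler (requestHandler, preNavigationHooks, proxy configuration) stays as it was; only `launchContext.launcher` changes. Crawlee's own fingerprint injection and the stealth plugin may overlap, so it could be worth testing with `useFingerprints` both on and off.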