Louis Deconinck

Set country for datacenter proxies

Is it possible to set a country for datacenter proxies? I currently have 30 Apify datacenter IPs which I think are from the US.

I'm specifically looking to use EU proxies; the exact country doesn't matter. Is there a way to specify multiple country codes?

Here is an example of how I do it with residential proxies:

const proxyConfiguration = await Actor.createProxyConfiguration({
    groups: ['RESIDENTIAL'],
    countryCode: 'FR',
});
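
Since countryCode in the snippet above takes a single code, one possible workaround for the "multiple countries" part is to pick one EU code per configuration. This is only a sketch: the EU_COUNTRY_CODES list and the buildEuProxyOptions helper are hypothetical names, and whether datacenter groups honor countryCode at all is an assumption that still needs to be confirmed.

```javascript
// Hypothetical list of acceptable EU country codes; adjust as needed.
const EU_COUNTRY_CODES = ['FR', 'DE', 'NL', 'BE', 'ES', 'IT'];

// Builds an options object in the shape passed to Actor.createProxyConfiguration().
// `random` is injectable so the choice can be made deterministic.
function buildEuProxyOptions(groups, random = Math.random) {
    const index = Math.floor(random() * EU_COUNTRY_CODES.length);
    const options = { countryCode: EU_COUNTRY_CODES[index] };
    // Only set `groups` when one is requested (e.g. ['RESIDENTIAL']).
    if (groups && groups.length > 0) options.groups = groups;
    return options;
}

// Usage inside an Actor (not executed here; assumes the Apify SDK is imported):
// const proxyConfiguration = await Actor.createProxyConfiguration(
//     buildEuProxyOptions(['RESIDENTIAL']),
// );
```

Calling the helper once per run (or once per crawler instance) would spread traffic over several EU countries without needing a multi-country option.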

2 comments

Louis Deconinck

RESIDENTIAL5 proxies

In the past I sometimes used RESIDENTIAL5 proxies, which I believed to be even better than the regular RESIDENTIAL proxies. However, they recently stopped working. Has anything changed in that regard? My scraper no longer works, and with regular residential proxies it stays blocked.

1 comment

Louis Deconinck

Purge all datasets using Apify CLI

With --purge you can delete the default dataset, but this does not affect the other datasets. Is there a way to purge all datasets using the CLI?

1 comment

Louis Deconinck

Playwright chrome browser fails on Apify platform

I have a scraper using Playwright, which still works perfectly locally. It also used to work on Apify, but since today it no longer does.

Has anything changed about how Playwright is run on Apify? The error says that the old Chrome headless mode has been removed.

See attachment for the full logs.

2025-02-07T18:07:53.704Z browserType.launchPersistentContext: Target page, context or browser has been closed
2025-02-07T18:07:53.705Z Browser logs: <launching> /home/myuser/pw-browsers/chrome --disable-field-trial-config ...
2025-02-07T18:07:53.708Z <launched> pid=36
2025-02-07T18:07:53.709Z [pid=36][err] Old Headless mode has been removed from the Chrome binary.

I haven't changed anything in the default Dockerfile. Here it is:

FROM apify/actor-node-playwright-chrome:20 AS builder
RUN npm ls crawlee apify puppeteer playwright
COPY --chown=myuser package*.json ./
RUN npm install --include=dev --audit=false
COPY --chown=myuser . ./
RUN npm run build
FROM apify/actor-node-playwright-chrome:20
RUN npm ls crawlee apify puppeteer playwright
COPY --chown=myuser package*.json ./
RUN npm --quiet set progress=false \
    && npm install --omit=dev --omit=optional \
    && echo "Installed NPM packages:" \
    && (npm list --omit=dev --all || true) \
    && echo "Node.js version:" \
    && node --version \
    && echo "NPM version:" \
    && npm --version \
    && rm -r ~/.npm
COPY --from=builder --chown=myuser /home/myuser/dist ./dist
COPY --chown=myuser . ./
CMD ./start_xvfb_and_run_cmd.sh && npm run start:prod --silent

5 comments

Louis Deconinck

PPE charging

Is there any way to verify what the user will be charged for a run of a PPE (Pay Per Event) actor? How can I check that the charging is set up correctly on my end?

1 comment

Louis Deconinck

Increase maximum number of redirects

Louis Deconinck

Max redirects

I am getting the error message below. What is the best way to deal with it?
Reclaiming failed request back to the list or queue. Redirected 10 times. Aborting.
Can I increase the max number of redirects for my CheerioCrawler?
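
One approach that may work is sketched below. It assumes that CheerioCrawler passes the HTTP client's options object as the second argument to each function in preNavigationHooks, and that the client honors a maxRedirects option whose default of 10 matches the error message above. The createRedirectLimitHook helper is a hypothetical name introduced only for this example.

```javascript
// Returns a pre-navigation hook that raises the redirect limit for every request.
// The hook mutates the options object it receives as its second argument.
function createRedirectLimitHook(maxRedirects) {
    return (_crawlingContext, gotOptions) => {
        gotOptions.maxRedirects = maxRedirects;
    };
}

// Usage (not executed here; assumes `CheerioCrawler` is imported from crawlee):
// const crawler = new CheerioCrawler({
//     preNavigationHooks: [createRedirectLimitHook(20)],
//     requestHandler: async ({ request, $ }) => { /* ... */ },
// });
```

If a page still exceeds the raised limit, it is probably stuck in a redirect loop, and a higher number is unlikely to help.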

1 comment

Louis Deconinck

Passing fingerprint.com bot protection locally but not on Apify

The site I'm scraping uses fingerprint.com bot protection. Locally my code passes the protection 95% of the time, but when running the actor on Apify it never does. How is that possible?

To pass this protection I've implemented the following measures (complete code in the next message). This was a bit of trial and error, so all feedback is welcome:

Browser Configuration
- Using Firefox instead of Chrome/Chromium
- Using incognito pages (useIncognitoPages: true)
- Enabled fingerprint randomization (useFingerprints: true)
Random Viewport/Screen Properties
- Random window dimensions (1280-1920 x 720-1080)
- Random device scale factor (1, 1.25, 1.5, or 2)
- Random mobile/touch settings
- Random color scheme (light/dark)
Locale and Timezone Randomization
- Random locale from 8 different options
- Random timezone from 8 different global locations
Browser Property Spoofing
- Removing navigator.webdriver flag
- Random navigator.plugins array
- Random navigator.platform
- Random navigator.hardwareConcurrency (4-16)
- Random navigator.deviceMemory (2-16GB)
- Random navigator.languages
- Random navigator.maxTouchPoints
Chrome Detection Evasion
- Removing Chrome DevTools Protocol (CDP) detection properties (cdc_adoQpoasnfa76pfcZLmcfl_*)
Performance Timing Randomization
- Modifying performance.getEntries() to add random timing offsets
- Randomizing both startTime and duration of performance entries
Proxy Usage
- Using residential proxies (groups: ['residential'])
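
The viewport, locale and timezone randomization above can be sketched roughly as follows. This is not the code from the next message: the LOCALES and TIMEZONES lists are shortened placeholders (the actual 8 options are not shown here), and randomContextOptions is a hypothetical helper. The property names follow Playwright's browser context options.

```javascript
// Placeholder lists; the real setup uses 8 locales and 8 timezones.
const LOCALES = ['en-US', 'en-GB', 'fr-FR', 'de-DE'];
const TIMEZONES = ['Europe/Paris', 'America/New_York', 'Asia/Tokyo', 'Europe/Berlin'];
const SCALE_FACTORS = [1, 1.25, 1.5, 2];
const COLOR_SCHEMES = ['light', 'dark'];

// Random integer in [min, max], inclusive on both ends.
function randomInt(min, max, random) {
    return min + Math.floor(random() * (max - min + 1));
}

// Random element of a non-empty array.
function pick(items, random) {
    return items[Math.floor(random() * items.length)];
}

// Builds one randomized set of context options per browser context.
// `random` is injectable so results can be made deterministic.
function randomContextOptions(random = Math.random) {
    return {
        viewport: {
            width: randomInt(1280, 1920, random),
            height: randomInt(720, 1080, random),
        },
        deviceScaleFactor: pick(SCALE_FACTORS, random),
        hasTouch: random() < 0.5,
        colorScheme: pick(COLOR_SCHEMES, random),
        locale: pick(LOCALES, random),
        timezoneId: pick(TIMEZONES, random),
    };
}

// Usage (not executed here; assumes a Playwright `browser` instance):
// const context = await browser.newContext(randomContextOptions());
```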

6 comments

Louis Deconinck

Change latest version of actor

I introduced a new version of my actor, but how do I make it the latest version? I assume that the README is also taken from the latest version?

Louis Deconinck

published actors & proxies

I've developed an actor which I would like to publish as it is finished. However, in order to scrape sufficient data, proxies would be necessary. How does this work and who pays for the proxies when the actor is published? I'm currently doing local development on a free account.

1 comment

Apify Discord Mirror
