Apify Discord Mirror

Louis Deconinck
Joined September 25, 2024
Is it possible to set a country for datacenter proxies? I currently have 30 Apify datacenter IPs which I think are from the US.

I'm specifically looking to use EU proxies; the exact country doesn't matter. Is there a way to specify multiple country codes?

Example of how I do it with residential proxies.

const proxyConfiguration = await Actor.createProxyConfiguration({
    groups: ['RESIDENTIAL'],
    countryCode: 'FR',
});
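
Since `countryCode` in the snippet above takes a single value, one possible workaround is to pick one EU country per run (or per session) and pass that on. This is only a sketch: the country list and the `pickCountryCode` / `buildProxyOptions` helpers are hypothetical, and whether datacenter groups honor `countryCode` at all is an assumption that would need confirming with Apify.

```javascript
// Hypothetical list of EU country codes to rotate through (ISO 3166-1 alpha-2).
const EU_COUNTRY_CODES = ['FR', 'DE', 'NL', 'BE', 'ES', 'IT'];

// Pick one code from the list; `rng` is injectable so the choice can be tested.
function pickCountryCode(codes = EU_COUNTRY_CODES, rng = Math.random) {
    if (codes.length === 0) throw new Error('At least one country code is required');
    return codes[Math.floor(rng() * codes.length)];
}

// Build the options object that would be passed to Actor.createProxyConfiguration().
function buildProxyOptions(groups, codes, rng) {
    return { groups, countryCode: pickCountryCode(codes, rng) };
}

// Usage inside an Actor (not executed here):
// const proxyConfiguration = await Actor.createProxyConfiguration(
//     buildProxyOptions(['RESIDENTIAL']),
// );
```

Each run would then use a single country, so spreading traffic across several countries would take several proxy configurations or several runs.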
2 comments
In the past I sometimes used RESIDENTIAL5 proxies, which I believed to be even better than the regular RESIDENTIAL proxies. However, they recently stopped working. Has anything changed in that regard? My scraper no longer works, and with regular residential proxies it keeps getting blocked.
1 comment
With --purge you can delete the default dataset, but this does not affect the other datasets. Is there a way to purge all datasets using the CLI?
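
As a stopgap outside the CLI, a small Node.js script could clear every local dataset directory before a run. This is a sketch that assumes the default local storage layout (`./storage/datasets/<name>`); `purgeAllDatasets` is a hypothetical helper, not part of the Apify CLI, and it only touches local storage.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Delete every dataset directory under <storageDir>/datasets and return their names.
// Assumes the default local layout; adjust the path if a custom storage dir is used.
function purgeAllDatasets(storageDir = './storage') {
    const datasetsDir = path.join(storageDir, 'datasets');
    if (!fs.existsSync(datasetsDir)) return [];
    const removed = [];
    for (const entry of fs.readdirSync(datasetsDir, { withFileTypes: true })) {
        if (!entry.isDirectory()) continue;
        fs.rmSync(path.join(datasetsDir, entry.name), { recursive: true, force: true });
        removed.push(entry.name);
    }
    return removed;
}
```

Calling `purgeAllDatasets()` from an npm script that runs before the Actor starts would be one way to wire it in.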
1 comment
I have a scraper using Playwright, which still works perfectly locally. It also used to work on Apify, but since today it no longer does.

Has anything changed in how Playwright is run on Apify? The error says that the old Chrome headless mode has been removed.

See attachment for the full logs.

2025-02-07T18:07:53.704Z browserType.launchPersistentContext: Target page, context or browser has been closed
2025-02-07T18:07:53.705Z Browser logs: <launching> /home/myuser/pw-browsers/chrome --disable-field-trial-config ...
2025-02-07T18:07:53.708Z <launched> pid=36
2025-02-07T18:07:53.709Z [pid=36][err] Old Headless mode has been removed from the Chrome binary.


I haven't changed anything in the default Dockerfile. Here it is:
FROM apify/actor-node-playwright-chrome:20 AS builder
RUN npm ls crawlee apify puppeteer playwright
COPY --chown=myuser package*.json ./
RUN npm install --include=dev --audit=false
COPY --chown=myuser . ./
RUN npm run build
FROM apify/actor-node-playwright-chrome:20
RUN npm ls crawlee apify puppeteer playwright
COPY --chown=myuser package*.json ./
RUN npm --quiet set progress=false \
    && npm install --omit=dev --omit=optional \
    && echo "Installed NPM packages:" \
    && (npm list --omit=dev --all || true) \
    && echo "Node.js version:" \
    && node --version \
    && echo "NPM version:" \
    && npm --version \
    && rm -r ~/.npm
COPY --from=builder --chown=myuser /home/myuser/dist ./dist
COPY --chown=myuser . ./
CMD ./start_xvfb_and_run_cmd.sh && npm run start:prod --silent
5 comments
Is there any way to verify what the user will be charged for a run of a PPE (Pay Per Event) actor? How can I verify that the charging is set up correctly on my end?
1 comment
I am getting this error message. What is the best way to deal with it?
Reclaiming failed request back to the list or queue. Redirected 10 times. Aborting.
Can I increase the max number of redirects for my CheerioCrawler?
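
One approach that may work is a pre-navigation hook that raises the `maxRedirects` option of the underlying HTTP client (got, whose default of 10 matches the error above). The sketch below assumes Crawlee passes the got options as the second hook argument; the limit of 20 and the `raiseRedirectLimit` name are hypothetical.

```javascript
// Hypothetical limit; got's default is 10, which matches the error message.
const MAX_REDIRECTS = 20;

// Pre-navigation hook: mutates the got options used for this request.
function raiseRedirectLimit(_crawlingContext, gotOptions) {
    gotOptions.maxRedirects = MAX_REDIRECTS;
}

// Wiring (not executed here):
// const crawler = new CheerioCrawler({
//     preNavigationHooks: [raiseRedirectLimit],
//     requestHandler: async ({ $, request }) => { /* ... */ },
// });
```

A chain of ten redirects can also indicate a redirect loop or a block page, in which case a higher limit would only delay the failure.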
1 comment
The site I'm scraping uses fingerprint.com bot protection. Locally my code passes the protection 95% of the time, but when running the actor on Apify it never does. How is that possible?

To pass this protection I've implemented the following measures (complete code in the next message). This was a bit of trial and error, so all feedback is welcome:
  • Browser Configuration
    • Using Firefox instead of Chrome/Chromium
    • Using incognito pages (useIncognitoPages: true)
    • Enabled fingerprint randomization (useFingerprints: true)
  • Random Viewport/Screen Properties
    • Random window dimensions (1280-1920 x 720-1080)
    • Random device scale factor (1, 1.25, 1.5, or 2)
    • Random mobile/touch settings
    • Random color scheme (light/dark)
  • Locale and Timezone Randomization
    • Random locale from 8 different options
    • Random timezone from 8 different global locations
  • Browser Property Spoofing
    • Removing navigator.webdriver flag
    • Random navigator.plugins array
    • Random navigator.platform
    • Random navigator.hardwareConcurrency (4-16)
    • Random navigator.deviceMemory (2-16GB)
    • Random navigator.languages
    • Random navigator.maxTouchPoints
  • Chrome Detection Evasion
    • Removing Chrome DevTools Protocol (CDP) detection properties (cdc_adoQpoasnfa76pfcZLmcfl_*)
  • Performance Timing Randomization
    • Modifying performance.getEntries() to add random timing offsets
    • Randomizing both startTime and duration of performance entries
  • Proxy Usage
    • Using residential proxies (groups: ['residential'])
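
The viewport, scale factor, color scheme, locale, and timezone items above could be generated along these lines. This is a sketch, not the poster's actual code: the `LOCALES` and `TIMEZONES` values are hypothetical placeholders (the post mentions eight of each without listing them), and the returned keys are chosen to match Playwright's context options.

```javascript
// Hypothetical placeholder lists; the original post does not name its eight options.
const LOCALES = ['en-US', 'en-GB', 'fr-FR', 'de-DE'];
const TIMEZONES = ['Europe/Paris', 'Europe/Berlin', 'America/New_York', 'Asia/Tokyo'];

// Inclusive random integer; `rng` is injectable so the output can be tested.
function randomInt(min, max, rng = Math.random) {
    return Math.floor(rng() * (max - min + 1)) + min;
}

// Pick one element from a non-empty array.
function pick(items, rng = Math.random) {
    return items[Math.floor(rng() * items.length)];
}

// Build one randomized profile using the ranges listed in the post.
function randomContextProfile(rng = Math.random) {
    return {
        viewport: { width: randomInt(1280, 1920, rng), height: randomInt(720, 1080, rng) },
        deviceScaleFactor: pick([1, 1.25, 1.5, 2], rng),
        colorScheme: pick(['light', 'dark'], rng),
        locale: pick(LOCALES, rng),
        timezoneId: pick(TIMEZONES, rng),
    };
}
```

Values drawn independently like this can contradict each other or the proxy's location (for example a Tokyo timezone behind a French IP), which might itself be a signal the protection picks up on.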
6 comments
I introduced a new version of my actor, but how do I make it the latest version? I assume that the README is also taken from the latest version?
I've developed an actor that is finished and that I would like to publish. However, scraping sufficient data requires proxies. How does this work, and who pays for the proxies once the actor is published? I'm currently doing local development on a free account.
1 comment