Getting no proxies on input - expected or a bug?

Honza Javorek

I'm trying to integrate my Scrapy actor with Playwright, so I tried to figure out the actual format of the proxy input from Apify, so that I could pass it on to Playwright.

I printed out what my spider gets and this is what it prints:

APIFY_PROXY_SETTINGS: {'apifyProxyGroups': [], 'useApifyProxy': True}

Empty list. Is that expected or a bug? Does it mean that my scraper runs without proxies despite the fact I have the "Datacenter" option turned on? I'm really confused now.

The spider has some problems with getting blocked. If I thought I was using proxies but in fact there are none, then it's no surprise I'm getting blocked.

Pepa J

Hi
Yes, this is what comes from the UI component for selecting a proxy.

If the ProxyConfiguration is set, Datacenter proxies are used by default (I think you could also use the "DATACENTER" proxy group, but the result would be the same). When the No proxy tab is selected, the UI component would provide a None value instead of the object structure (I suppose; I am not that familiar with the Python SDK, so you might get a better answer from someone who is).

What you might actually be interested in is not what comes from the UI component but the actual proxy URL, which contains the proxy login, password, and so on. You can find more information at https://docs.apify.com/sdk/python/docs/concepts/proxy-management#configuring-proxy-based-on-actor-input.
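To make the distinction concrete, here is a rough sketch of how the input object could be turned into a proxy URL with the Python SDK. It assumes the `Actor.create_proxy_configuration(actor_proxy_input=...)` and `new_url()` calls described on the docs page above; the helper names `wants_apify_proxy` and `resolve_proxy_url` are made up for illustration, and the handling of the "No proxy" case is a guess.

```python
from typing import Optional


def wants_apify_proxy(proxy_input: Optional[dict]) -> bool:
    """Return True when the input asks for Apify Proxy.

    An empty 'apifyProxyGroups' list does not mean "no proxy"; it only
    means that no specific proxy group was requested.
    """
    # Assumption: "No proxy" arrives either as None or with useApifyProxy false.
    return bool(proxy_input) and bool(proxy_input.get("useApifyProxy"))


async def resolve_proxy_url(proxy_input: Optional[dict]) -> Optional[str]:
    """Ask the SDK for a concrete proxy URL (call inside `async with Actor:`)."""
    # Imported lazily so the helper above can be used without the SDK installed.
    from apify import Actor

    proxy_configuration = await Actor.create_proxy_configuration(
        actor_proxy_input=proxy_input
    )
    if proxy_configuration is None:
        # No proxy was configured, so there is no URL to hand out.
        return None
    # The returned URL carries the credentials; the input dict never does.
    return await proxy_configuration.new_url()
```

With the input from the question, `wants_apify_proxy({'apifyProxyGroups': [], 'useApifyProxy': True})` is true, so the run is expected to go through the proxy even though the group list is empty.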

Honza Javorek

Aaah, thanks! That makes sense 🙂 So the URL doesn't come from the input, it's only config, which allows the SDK to decide how to retrieve the URL from Apify's API. Cool!
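For the Playwright side, a minimal sketch of what I have in mind: once the SDK hands over a proxy URL, split it into the `server` / `username` / `password` dict that Playwright's `proxy` launch option accepts. The function name `to_playwright_proxy` and the example URL are placeholders, not real Apify values.

```python
from urllib.parse import unquote, urlsplit


def to_playwright_proxy(proxy_url: str) -> dict:
    """Split a proxy URL into the dict shape Playwright's `proxy` option takes."""
    parts = urlsplit(proxy_url)
    # Rebuild the server address without the credentials.
    server = f"{parts.scheme}://{parts.hostname}"
    if parts.port:
        server += f":{parts.port}"
    proxy = {"server": server}
    # Credentials may be percent-encoded inside the URL, so decode them.
    if parts.username:
        proxy["username"] = unquote(parts.username)
    if parts.password:
        proxy["password"] = unquote(parts.password)
    return proxy


# Placeholder URL for illustration only.
settings = to_playwright_proxy("http://auto:secret@proxy.example.com:8000")
```

The result could then be passed along the lines of `await playwright.chromium.launch(proxy=settings)`.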

Apify and Crawlee Official Forum
