Apify and Crawlee Official Forum

Updated 3 months ago

Getting no proxies on input - expected or a bug?

I'm trying to integrate my Scrapy actor with Playwright, so I tried to figure out the actual format of the proxy input from Apify so that I could pass it on to Playwright.

I printed out what my spider gets and this is what it prints:
APIFY_PROXY_SETTINGS: {'apifyProxyGroups': [], 'useApifyProxy': True}

Empty list. Is that expected or a bug? Does it mean that my scraper runs without proxies despite the fact I have the "Datacenter" option turned on? I'm really confused now.

The spider has some problems with getting blocked. If I thought I was using proxies but in fact there are none, then it's no surprise I'm getting blocked.
Attachment
Screenshot_2024-04-15_at_12.11.48.png
3 comments
Hi
Yes, this is what comes from the UI component for selecting proxy.

If the proxy configuration is set, datacenter proxies are used by default, so an empty apifyProxyGroups list does not mean you are running without proxies (I think you could also use the "DATACENTER" proxy group, but that would be the same). When the No proxy tab is selected, I suppose the UI component would provide a None value instead of the object structure, but I am not that familiar with the Python SDK, so you may want to ask someone who knows it better.

What you are probably interested in is not what comes from the UI component but the actual proxy URL (proxyUrl), which contains the login, the proxy password, and so on. You can find more information about it at https://docs.apify.com/sdk/python/docs/concepts/proxy-management#configuring-proxy-based-on-actor-input.
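As a rough sketch of how that could fit together with Playwright: the SDK turns the input object into a proxy URL, and that URL is then split into the server/username/password dict that Playwright's proxy option expects. The helper names playwright_proxy_from_url and resolve_proxy, the proxyConfiguration input field name, and the example.com URL are assumptions for illustration; the SDK calls follow the docs page linked above, so check them against the SDK version you run.

```python
from urllib.parse import unquote, urlsplit


def playwright_proxy_from_url(proxy_url: str) -> dict:
    """Split a proxy URL into the dict shape Playwright's `proxy` option takes."""
    parts = urlsplit(proxy_url)
    server = f"{parts.scheme}://{parts.hostname}"
    if parts.port is not None:
        server += f":{parts.port}"
    proxy = {"server": server}
    # Credentials are optional; percent-encoded characters are decoded.
    if parts.username:
        proxy["username"] = unquote(parts.username)
    if parts.password:
        proxy["password"] = unquote(parts.password)
    return proxy


async def resolve_proxy(actor_input: dict):
    """Hypothetical helper: ask the SDK for a proxy URL based on the Actor input."""
    from apify import Actor  # imported lazily; only needed inside an Actor run

    # Assumption: the input field is called `proxyConfiguration`.
    proxy_configuration = await Actor.create_proxy_configuration(
        actor_proxy_input=actor_input.get("proxyConfiguration"),
    )
    if proxy_configuration is None:
        return None  # no proxy was configured in the input
    proxy_url = await proxy_configuration.new_url()
    if proxy_url is None:
        return None
    return playwright_proxy_from_url(proxy_url)


if __name__ == "__main__":
    # Placeholder URL, not a real Apify proxy address.
    print(playwright_proxy_from_url("http://user:secret@proxy.example.com:8000"))
```

The resulting dict can be passed as the proxy argument when launching a browser in Playwright. With scrapy-playwright, it should be possible to put it under the proxy key of the PLAYWRIGHT_LAUNCH_OPTIONS setting.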
Aaah, thanks! That makes sense 🙂 So the URL doesn't come from the input, it's only config, which allows the SDK to decide how to retrieve the URL from Apify's API. Cool!