Apify Discord Mirror

jensmeichler
Joined February 12, 2025
I have a use case where I want a crawler running permanently. The crawler has tieredProxyUrls set up, which it iterates over in case some of the proxies don't work. For some pages I don't want to use proxies, to reduce the amount of money I spend on them (when I scrape my own page I don't want to proxy, but I do want to use the same logic and handlers). Is it possible to specify the proxy that should be used for specific requests, or maybe even the proxy tier?

Basic Setup:

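import { PlaywrightCrawler, ProxyConfiguration } from 'crawlee';

// Each inner array is one tier of proxy URLs.
const proxyConfiguration = new ProxyConfiguration({tieredProxyUrls: [['proxyTier1'], ['proxyTier2']]});

const crawler = new PlaywrightCrawler({
    keepAlive: true, // keep the crawler running while waiting for new requests
    proxyConfiguration: proxyConfiguration,
    // ...
});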

// ...

crawler.addRequests(requestsWhereWeWantProxies);
crawler.addRequests(requestsWhereWeDontWantProxies);

It would be nice to be able to do something like:

crawler.addRequests(requestsWhereWeWantProxies);
crawler.addRequests(requestsWhereWeDontWantProxies.map((request) => ({...request, proxy: null})));

or

const proxyConfiguration = new ProxyConfiguration({tieredProxyUrls: [['proxyTier1'], ['proxyTier2'], [null]]});

// ...

crawler.addRequests(requestsWhereWeWantProxies);
crawler.addRequests(requestsWhereWeDontWantProxies.map((request) => ({...request, proxyTier: 2})));
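
A minimal sketch of how the per-request choice could look, not a confirmed solution. It assumes the decision can be carried on the request itself via userData. The flag name noProxy, the helper pickProxyUrl, and the example proxy URLs are all made up for illustration. The commented-out wiring assumes Crawlee's ProxyConfiguration accepts a newUrlFunction that receives the request and may return null to skip the proxy; that should be checked against the installed Crawlee version.

```javascript
// Hypothetical tiers: each inner array is one tier of proxy URLs.
const tiers = [
    ['http://proxy-tier-1.example:8000'],
    ['http://proxy-tier-2.example:8000'],
];

// Hypothetical helper: choose the proxy URL for a single request.
// Returns null when the request is flagged with userData.noProxy,
// otherwise a random URL from the requested tier (clamped to the last tier).
function pickProxyUrl(request, tier = 0) {
    if (request?.userData?.noProxy) return null; // direct connection, no proxy
    const urls = tiers[Math.min(tier, tiers.length - 1)];
    return urls[Math.floor(Math.random() * urls.length)];
}

// Assumed wiring (unverified here): pass the helper as newUrlFunction.
// const proxyConfiguration = new ProxyConfiguration({
//     newUrlFunction: (sessionId, options) => pickProxyUrl(options?.request),
// });
//
// crawler.addRequests(requestsWhereWeDontWantProxies.map((request) => ({
//     ...request,
//     userData: { ...request.userData, noProxy: true },
// })));
```

As far as I understand, newUrlFunction cannot be combined with tieredProxyUrls in the same ProxyConfiguration, so with this approach any tier fallback would have to live in the helper itself.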