Use Apify Proxy without getting blocked
Getting blocked is the single biggest reason an Apify actor stops working in production. This guide shows you how to combine Apify Proxy with session pools, country targeting, and retry-on-block so your scraper keeps running when the target site fights back.
TL;DR
Spin up a residential ProxyConfiguration once, hand it to your crawler, and let session pools do the rotation for you:
JavaScript:
const proxyConfiguration = await Actor.createProxyConfiguration({
    groups: ['RESIDENTIAL'],
    countryCode: 'US',
});

Python:
proxy_configuration = await Actor.create_proxy_configuration(
    groups=['RESIDENTIAL'],
    country_code='US',
)

The rest of this guide is about doing that correctly, without burning $100 of residential traffic by accident.
When you actually need a proxy
Not every actor needs Apify Proxy. Reach for it when one of these is true:
- You're getting blocked. Datacenter IPs from the Apify run container land on enough blocklists that even forgiving sites will eventually start serving 403s, captchas, or empty pages.
- You need geo-targeted content. Pricing, search results, and product availability often change by country. A residential US IP will see what a US shopper sees.
- Sessions need to look real. Logged-in flows, multi-step checkouts, or anything that builds up cookies works best when every request from a “user” comes from the same IP for the duration of that session.
If you can scrape a site happily from your laptop with no proxy, you probably don't need one in production either. Start without; add it the first time you get a 403.
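To decide whether you have actually hit that first block, it helps to check for more than the status code. The sketch below is one possible heuristic, not something the Apify SDK provides: the function name, the marker strings, and the 200-character threshold are all assumptions you would tune per site.

```python
# Hypothetical heuristic for spotting a block. Marker strings and the
# minimum body size are illustrative guesses, not values from Apify.
BLOCK_STATUSES = {401, 403, 429}
CAPTCHA_MARKERS = ("captcha", "are you a robot", "unusual traffic")


def looks_blocked(status_code: int, body: str, min_body_chars: int = 200) -> bool:
    """Return True when a response looks like a block rather than real content."""
    if status_code in BLOCK_STATUSES:
        return True
    text = body.strip().lower()
    if len(text) < min_body_chars:
        # Empty or near-empty pages are a common "soft" block.
        return True
    return any(marker in text for marker in CAPTCHA_MARKERS)
```

Log the result for a few hundred requests without a proxy first. If it rarely fires, you probably don't need one yet.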
Datacenter vs residential
Apify Proxy ships two main pools. Pick the cheapest one that still works.
| Property | Datacenter | Residential |
|---|---|---|
| Speed | Fast | Slower (real consumer connections) |
| Cost | Cheap (per GB) | 5–10× datacenter |
| Block resistance | Low — known IP ranges | High — looks like real users |
| Country support | Limited / unreliable | Yes, per-country routing |
| Good for | Forgiving sites, JSON APIs | Amazon, Google, social, anti-bot vendors |
Default to DATACENTER when you're prototyping. Switch to RESIDENTIAL the moment you see consistent 403s or captchas.
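"Consistent" is easier to act on once you put a number on it. Here is a minimal sketch of that switch, assuming you keep a list of recent response status codes. The function name, the 20% threshold, and the 20-sample minimum are hypothetical defaults, not Apify recommendations.

```python
BLOCK_STATUSES = {403, 429}


def choose_proxy_groups(
    recent_statuses: list[int],
    block_threshold: float = 0.2,  # assumed cut-off for "consistent" blocking
    min_samples: int = 20,  # don't switch pools on a handful of requests
) -> list[str]:
    """Stay on the cheap pool until the recent block rate crosses the threshold."""
    if len(recent_statuses) < min_samples:
        return ["DATACENTER"]
    blocked = sum(1 for status in recent_statuses if status in BLOCK_STATUSES)
    if blocked / len(recent_statuses) >= block_threshold:
        return ["RESIDENTIAL"]
    return ["DATACENTER"]
```

The return value has the same shape as the groups option, so it can be passed straight to createProxyConfiguration / create_proxy_configuration.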
The four knobs
Almost every blocking problem comes down to tuning these four things:
- groups: which proxy pool you want (['DATACENTER'] or ['RESIDENTIAL']). Residential is your blocked-by-default escape hatch.
- countryCode: ISO country code ('US', 'DE', 'BR'). Tells the proxy to route through an IP in that country.
- Session pool: a rotating set of identities (IP + cookies) that the SDK manages and retires when they get blocked.
- Retry on 403/429: when a session is blocked, retire it and let the crawler re-queue the request with a fresh one.
Get all four right and most sites will treat you like a normal-looking pool of users.
The proxy helper
Drop this file next to your main entry point. It returns a configured ProxyConfiguration on the platform and undefined/None locally so you don't burn proxy credits during dev.
JavaScript:
import { Actor } from 'apify';

/**
 * Returns a ProxyConfiguration, or `undefined` when running locally
 * so you don't burn proxy credits during development.
 */
export async function getProxyConfiguration({
    groups = ['RESIDENTIAL'],
    countryCode = 'US',
} = {}) {
    if (!Actor.isAtHome()) return undefined;
    return await Actor.createProxyConfiguration({ groups, countryCode });
}

Python:
from apify import Actor


async def get_proxy_configuration(
    groups: list[str] | None = None,
    country_code: str = 'US',
):
    """Return a ProxyConfiguration, or None when running locally."""
    if not Actor.is_at_home():
        return None
    return await Actor.create_proxy_configuration(
        groups=groups or ['RESIDENTIAL'],
        country_code=country_code,
    )

The Apify SDK already treats an undefined proxy configuration as “no proxy”, so passing the helper's return value straight into a crawler is safe in every environment.
Using it with Crawlee
Crawlee's session pool is the cheapest unblocker you can buy. It maintains a pool of session IDs, sticks cookies to each one, rotates them across requests, and lets you retire the bad ones on demand.
import { Actor } from 'apify';
import { CheerioCrawler } from 'crawlee';
import { getProxyConfiguration } from './proxy.js';

await Actor.init();

const proxyConfiguration = await getProxyConfiguration();

const crawler = new CheerioCrawler({
    proxyConfiguration,
    useSessionPool: true,
    persistCookiesPerSession: true,
    maxRequestRetries: 5,
    async requestHandler({ request, $, session, response }) {
        // Treat auth and rate-limit responses as a block for this session.
        if ([401, 403, 429].includes(response.statusCode)) {
            session.retire();
            throw new Error(`Blocked at ${request.url}, retiring session`);
        }
        await Actor.pushData({ url: request.url, title: $('title').text() });
    },
});

await crawler.run(['https://example.com']);
await Actor.exit();
What's doing the work here:
- useSessionPool: true tells Crawlee to allocate a pool of sessions and bind each request to one.
- persistCookiesPerSession: true gives each session its own cookie jar, so logged-in flows survive across requests.
- The 401/403/429 check calls session.retire() and throws. Crawlee then re-queues the request with a different session.
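If the rotate-and-retire mechanics feel abstract, a toy model makes them concrete. The sketch below is not Crawlee's implementation. It is a simplified, dependency-free approximation with hypothetical names (Session, TinySessionPool, crawl) that shows how a blocked session gets retired and the same request is retried under a different identity.

```python
import itertools
import random
from typing import Callable

BLOCKED_STATUSES = {401, 403, 429}


class Session:
    """One identity: an ID (which would map to a sticky proxy IP) plus its own cookies."""

    def __init__(self, session_id: str) -> None:
        self.id = session_id
        self.cookies: dict[str, str] = {}
        self.usable = True

    def retire(self) -> None:
        self.usable = False


class TinySessionPool:
    def __init__(self, max_size: int = 3) -> None:
        self._max_size = max_size
        self._ids = itertools.count(1)
        self._sessions: list[Session] = []

    def get_session(self) -> Session:
        # Drop retired sessions, then top the pool up before reusing one.
        self._sessions = [s for s in self._sessions if s.usable]
        if len(self._sessions) < self._max_size:
            session = Session(f"session_{next(self._ids)}")
            self._sessions.append(session)
            return session
        return random.choice(self._sessions)


def crawl(
    urls: list[str],
    fetch: Callable[[str, Session], int],  # returns an HTTP status code
    pool: TinySessionPool,
    max_retries: int = 5,
) -> dict[str, str]:
    """Map each URL to the ID of the session that finally got through."""
    results: dict[str, str] = {}
    for url in urls:
        for _ in range(max_retries):
            session = pool.get_session()
            if fetch(url, session) in BLOCKED_STATUSES:
                session.retire()  # this identity is burned; retry with another
                continue
            results[url] = session.id
            break
    return results
```

The point to notice is that the retry never reuses the retired session. That is the same contract the requestHandler above relies on when it calls session.retire() and throws.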
Using it without Crawlee
If you're not using Crawlee — maybe you're calling a JSON API directly — you can still get an Apify Proxy URL and route a single request through it.
JavaScript:
import { Actor, log } from 'apify';
import { ProxyAgent, fetch as undiciFetch } from 'undici';

await Actor.init();

const proxyConfiguration = await Actor.createProxyConfiguration({
    groups: ['RESIDENTIAL'],
    countryCode: 'US',
});

// Sticky session: the same IP is reused for every call with this sessionId.
// Session IDs may only contain letters, digits, ".", "_" and "~", so no hyphens.
const proxyUrl = await proxyConfiguration.newUrl('user_42');
const dispatcher = new ProxyAgent(proxyUrl);

const response = await undiciFetch('https://example.com', { dispatcher });
const html = await response.text();
log.info(`Fetched ${html.length} bytes`);

await Actor.exit();

Python:
import httpx
from apify import Actor


async def main() -> None:
    async with Actor:
        proxy_configuration = await Actor.create_proxy_configuration(
            groups=['RESIDENTIAL'],
            country_code='US',
        )
        # Sticky session: the same IP is reused across requests using this session_id.
        # Session IDs may only contain letters, digits, ".", "_" and "~", so no hyphens.
        proxy_url = await proxy_configuration.new_url(session_id='user_42')
        async with httpx.AsyncClient(proxy=proxy_url) as client:
            response = await client.get('https://example.com')
            Actor.log.info(f"Fetched {len(response.text)} bytes")

On 401/403/429 with raw fetch, change the sessionId you pass to newUrl() and retry. You're managing the rotation yourself.
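A rough sketch of that manual rotation is below. To keep it runnable without network access, the proxy lookup and the HTTP call are injected as callables. fetch_with_rotation, new_proxy_url, and fetch are hypothetical names, and the backoff is an assumption rather than anything Apify prescribes. In a real actor, new_proxy_url would wrap proxy_configuration.new_url(session_id=...) and fetch would wrap your httpx call.

```python
import asyncio
import uuid
from typing import Awaitable, Callable

BLOCKED_STATUSES = {401, 403, 429}


async def fetch_with_rotation(
    url: str,
    new_proxy_url: Callable[[str], Awaitable[str]],
    fetch: Callable[[str, str], Awaitable[tuple[int, str]]],
    max_attempts: int = 5,
    backoff_seconds: float = 1.0,
) -> str:
    """Retry a request, using a brand-new session ID after every block."""
    for attempt in range(1, max_attempts + 1):
        # A fresh ID per attempt, using only letters, digits and underscores.
        session_id = f"s_{uuid.uuid4().hex[:12]}"
        proxy_url = await new_proxy_url(session_id)
        status, body = await fetch(url, proxy_url)
        if status not in BLOCKED_STATUSES:
            return body
        # Blocked: wait a little longer each time, then rotate.
        await asyncio.sleep(backoff_seconds * attempt)
    raise RuntimeError(f"Still blocked at {url} after {max_attempts} attempts")
```

Generating the ID inside the loop is the important part: reusing the previous ID would send the retry back through the IP that was just blocked.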
Sticky sessions
Passing a sessionId to newUrl() tells Apify Proxy to reuse the same upstream IP for every call that uses that ID. That's what you want for:
- Logged-in flows where switching IP mid-session triggers a re-auth.
- Multi-step checkouts that bind a cart to an IP.
- Anything that builds session state on the target site over several requests.
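Pick anything stable as the session ID: 'user_42', `cart_${userId}`, a hash of your input. Stick to letters, digits, '.', '_' and '~' (Apify Proxy rejects other characters, including hyphens) and keep it to 50 characters or fewer. How long Apify holds the IP for an ID depends on the pool, so check the session persistence window in the Apify Proxy docs rather than assuming a fixed ~10 minutes.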
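For the "hash of your input" option, one way to get a stable ID that only uses safe characters is sketched below. stable_session_id is a hypothetical helper. The character set and 50-character limit reflect my reading of Apify's session ID constraints and should be treated as assumptions.

```python
import hashlib
import re

# Assumed constraints on Apify Proxy session IDs; verify against the docs.
ALLOWED_SESSION_ID = re.compile(r"[A-Za-z0-9._~]+")
MAX_SESSION_ID_LENGTH = 50


def stable_session_id(key: str, prefix: str = "sess") -> str:
    """Derive the same session ID every time for the same input key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:24]
    session_id = f"{prefix}_{digest}"
    if len(session_id) > MAX_SESSION_ID_LENGTH or not ALLOWED_SESSION_ID.fullmatch(session_id):
        raise ValueError(f"Invalid session ID: {session_id!r}")
    return session_id
```

Because the ID is derived from the key, a restarted run that processes the same user or cart asks for the same session again, without the ID having to be stored anywhere.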
Gotchas worth knowing
- No proxy locally by default. Outside the platform, Apify Proxy needs credentials in the environment (APIFY_PROXY_PASSWORD or APIFY_TOKEN). The helper above avoids the problem by returning undefined/None locally.
- Residential is 5–10× the cost of datacenter. Use datacenter for forgiving sites; residential only when you're actually getting blocked.
- countryCode only does anything on residential. Datacenter pools don't currently honor country targeting reliably, so don't rely on it for geo-locked content.
- Sessions beat single IPs. Always use useSessionPool: true with Crawlee, which rotates and retires automatically. With raw fetch, pass a sessionId to newUrl() and rotate it yourself on 403/429.
- Don't retry the same session forever. A retired session is dead. Retry the request with a fresh session; don't retry the same session.
- Mind the budget. Apify charges by GB of proxy traffic, not by request. Heavy images/PDFs through residential is the fastest way to spend $100 on nothing, so block media requests with Crawlee's preNavigationHooks if you don't need them.
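The budget point is easiest to see with arithmetic. The sketch below estimates traffic cost from page count and average page weight. The function name is hypothetical and the $8/GB price is a made-up placeholder, so substitute the rate from your own Apify plan.

```python
def estimate_proxy_cost_usd(
    pages: int,
    avg_page_kb: float,
    price_per_gb_usd: float,
    retry_overhead: float = 0.0,  # e.g. 0.15 if about 15% of requests are retried
) -> float:
    """Estimate proxy spend for a run, billed by traffic rather than by request."""
    total_gb = pages * avg_page_kb * (1 + retry_overhead) / (1024 * 1024)
    return round(total_gb * price_per_gb_usd, 2)


# Placeholder price, not Apify's actual residential rate.
PRICE_PER_GB_USD = 8.0

with_media = estimate_proxy_cost_usd(100_000, 2048, PRICE_PER_GB_USD)
without_media = estimate_proxy_cost_usd(100_000, 150, PRICE_PER_GB_USD)
print(f"with media: ${with_media}, media blocked: ${without_media}")
```

With these placeholder numbers, 100,000 pages at 2 MB each come to about $1,562, while the same run at 150 KB per page is about $114. The gap between the two is what blocking media buys you.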
Where to go next
- Add free-tier limits to your Apify actor — cap how much proxy traffic free users can burn through.
- How to tell if an Apify user is paying — gate residential proxy access behind paid plans if it's expensive to deliver.
- Apify Pricing Calculator — factor proxy GB into your bundle pricing so you don't lose money on every run.
Spotted a bug, or want a guide on something else?
support@mail.apifyhub.com