Chain Apify actors: call one from another
Compose actors instead of stuffing every step into one. This guide shows how to call one Apify actor from another, either synchronously with Actor.call() (wait for the result) or fire-and-forget with Actor.start(), and how to read the called actor's dataset.
TL;DR
- Actor.call(actorId, input, options): starts the other actor, waits for it to finish, and returns the run record (including defaultDatasetId).
- Actor.start(actorId, input, options): starts the other actor and returns immediately with the run id. Use a webhook to react when it's done.
- Either way, the called actor's output lives in its own dataset. Open it with Actor.openDataset(run.defaultDatasetId, { forceCloud: true }).
Why chain actors
Three patterns cover almost every case:
- Orchestrator → worker. One actor decides what to scrape (e.g. shards a list of 10,000 URLs into 100 chunks) and calls a worker actor per chunk. Workers run in parallel; the orchestrator stays small and cheap.
- Preprocess → main. Call apify/website-content-crawler to get clean Markdown/HTML, then your actor parses it. You inherit the crawler's proxy handling, JS rendering, and content extraction without re-implementing any of it.
- Main → enrich. Your actor produces raw rows, then calls an enrichment actor (translation, summarization, geocoding, screenshotting…) per item. Lets you mix and match best-of-breed actors instead of vendoring everything.
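The sharding step in the orchestrator → worker pattern can be sketched as follows. This is a minimal sketch, assuming the URL list fits in memory; `shard` is a hypothetical helper, not part of the Apify SDK, and the example URLs are made up.

```python
from typing import List


def shard(urls: List[str], num_chunks: int) -> List[List[str]]:
    """Split urls into at most num_chunks contiguous chunks of near-equal size."""
    if num_chunks < 1:
        raise ValueError("num_chunks must be at least 1")
    # Ceiling division, so the chunk count never exceeds num_chunks.
    size = max(1, -(-len(urls) // num_chunks))
    return [urls[i:i + size] for i in range(0, len(urls), size)]


# Hypothetical input: 10,000 URLs split for 100 workers.
urls = [f"https://example.com/page/{n}" for n in range(10_000)]
chunks = shard(urls, 100)
```

Each chunk would then become the input of one worker run, for example as its `startUrls`.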
Actor.call(): wait for a result
Actor.call() blocks until the child run reaches a terminal status (succeeded, failed, timed out, or aborted) and returns the full run record. The fields you'll use most are run.id, run.status, and run.defaultDatasetId.
import { Actor } from 'apify';

await Actor.init();

// Calls another actor, waits for it to finish, returns the run record.
const run = await Actor.call('apify/website-content-crawler', {
    startUrls: [{ url: 'https://docs.apify.com' }],
    maxCrawlPages: 50,
    saveMarkdown: true,
}, {
    memory: 2048, // MB
    timeout: 300, // seconds
});

Actor.log.info(`Called run ${run.id} finished with status ${run.status}`);

// Read the called actor's default dataset.
const dataset = await Actor.openDataset(run.defaultDatasetId, { forceCloud: true });
const { items } = await dataset.getData();
Actor.log.info(`Got ${items.length} items from the called actor`);

// Re-emit (or transform, or enrich) into this run's dataset.
for (const item of items) {
    await Actor.pushData({
        sourceUrl: item.url,
        markdown: item.markdown,
        processedAt: new Date().toISOString(),
    });
}

await Actor.exit();
from datetime import datetime, timezone

from apify import Actor


async def main() -> None:
    async with Actor:
        # Calls another actor, waits for it to finish, returns the run record.
        run = await Actor.call(
            'apify/website-content-crawler',
            run_input={
                'startUrls': [{'url': 'https://docs.apify.com'}],
                'maxCrawlPages': 50,
                'saveMarkdown': True,
            },
            memory_mbytes=2048,
            timeout_secs=300,
        )
        Actor.log.info(
            f"Called run {run.id} finished with status {run.status}"
        )

        # Read the called actor's default dataset.
        dataset = await Actor.open_dataset(
            id=run.default_dataset_id,
            force_cloud=True,
        )
        items = (await dataset.get_data()).items
        Actor.log.info(f"Got {len(items)} items from the called actor")

        # Re-emit (or transform, or enrich) into this run's dataset.
        for item in items:
            await Actor.push_data({
                'sourceUrl': item['url'],
                'markdown': item['markdown'],
                'processedAt': datetime.now(timezone.utc).isoformat(),
            })
Reading the called actor's dataset
The called actor writes to its own default dataset. Open it by id with forceCloud: true (Python: force_cloud=True) so the SDK fetches from the Apify platform instead of looking for a local store:
const dataset = await Actor.openDataset(run.defaultDatasetId, { forceCloud: true });
const { items } = await dataset.getData();

dataset = await Actor.open_dataset(id=run.default_dataset_id, force_cloud=True)
items = (await dataset.get_data()).items

From there you can re-emit, transform, or enrich the items into your run's dataset with Actor.pushData() / Actor.push_data(). That's what your users actually receive.
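The transform step can be kept as a small pure function, which makes it easy to test without running any actor. The sketch below reuses the `url` and `markdown` field names from the example earlier in this guide; `to_output_row` is a hypothetical helper, and skipping items that lack either field is an assumption about what you want, not SDK behavior.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def to_output_row(
    item: Dict[str, Any],
    now: Optional[datetime] = None,
) -> Optional[Dict[str, Any]]:
    """Map one item from the called actor to the row this run emits."""
    url = item.get("url")
    markdown = item.get("markdown")
    # Assumption: an item without a URL or Markdown body is not worth emitting.
    if not url or not markdown:
        return None
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    return {"sourceUrl": url, "markdown": markdown, "processedAt": stamp}
```

Inside the actor you would call it once per item and pass every non-None result to Actor.push_data().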
Actor.start(): fire and forget
When the child run is long, or you want fan-out parallelism, don't sit and wait. Actor.start() returns the run id the moment the child is queued — your actor can move on (or exit) and react later via a webhook.
import { Actor } from 'apify';

await Actor.init();

const { id } = await Actor.start('username/your-worker', {
    chunkId: 42,
    startUrls: [{ url: 'https://example.com' }],
}, {
    memory: 1024,
    // Optional: notify a webhook when the worker finishes.
    webhooks: [{
        eventTypes: ['ACTOR.RUN.SUCCEEDED', 'ACTOR.RUN.FAILED'],
        requestUrl: 'https://your-server.example.com/apify-webhook',
    }],
});

Actor.log.info(`Started worker run ${id} — returning immediately`);

await Actor.exit();
from apify import Actor


async def main() -> None:
    async with Actor:
        run = await Actor.start(
            'username/your-worker',
            run_input={
                'chunkId': 42,
                'startUrls': [{'url': 'https://example.com'}],
            },
            memory_mbytes=1024,
            webhooks=[{
                'event_types': ['ACTOR.RUN.SUCCEEDED', 'ACTOR.RUN.FAILED'],
                'request_url': 'https://your-server.example.com/apify-webhook',
            }],
        )
        Actor.log.info(f"Started worker run {run.id} — returning immediately")
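On the receiving end, the webhook handler mostly needs the run id, its status, and the dataset id. The sketch below parses a webhook body into those fields. It assumes the default payload shape, with a top-level `eventType` and a `resource` object holding the run record; if you configured a custom payload template the keys will differ. `parse_run_webhook` and the sample ids are hypothetical.

```python
import json
from typing import Any, Dict


def parse_run_webhook(body: bytes) -> Dict[str, Any]:
    """Extract the fields a handler needs from a run webhook body."""
    payload = json.loads(body)
    # Assumption: "resource" carries the run record, as in the default payload.
    resource = payload.get("resource") or {}
    event_type = payload.get("eventType")
    return {
        "event_type": event_type,
        "run_id": resource.get("id"),
        "status": resource.get("status"),
        "dataset_id": resource.get("defaultDatasetId"),
        "succeeded": event_type == "ACTOR.RUN.SUCCEEDED",
    }


# Hypothetical body, shaped like the assumed default payload.
sample = json.dumps({
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {
        "id": "run-123",
        "status": "SUCCEEDED",
        "defaultDatasetId": "dataset-456",
    },
}).encode("utf-8")
parsed = parse_run_webhook(sample)
```

With `dataset_id` in hand, the handler can read the worker's output the same way the orchestrator would have.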
Passing input and configuring the call
Both Actor.call() and Actor.start() take a third options argument (keyword arguments in Python). The ones you'll reach for:
- memory (memory_mbytes in Python): RAM for the child run, in MB. Defaults to the child actor's default. Bumping it usually scales CPU too.
- timeout (timeout_secs in Python): seconds before the child is forcibly aborted. If omitted, the child uses the timeout from its own default run configuration, which can be hours. Always set this when the child is on a critical path.
- build: pin the child to a specific build tag (e.g. 'latest', 'beta', or a version like '1.2.0'). Lets you ride a stable build while the actor's latest moves.
- webhooks: an array of webhook descriptors fired on ACTOR.RUN.SUCCEEDED, ACTOR.RUN.FAILED, etc. Required if you want to react to a start()ed run without polling.
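If several call sites share the same settings, the options can be assembled in one place. This is a minimal sketch that uses the Python option names from this guide (`memory_mbytes`, `timeout_secs`, `build`, `webhooks`); `child_run_options` is a hypothetical helper, and the webhook descriptor keys mirror the example above rather than a verified schema.

```python
from typing import Any, Dict, Optional


def child_run_options(
    memory_mbytes: int,
    timeout_secs: int,
    build: Optional[str] = None,
    webhook_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a child run, always with an explicit timeout."""
    if memory_mbytes <= 0 or timeout_secs <= 0:
        raise ValueError("memory_mbytes and timeout_secs must be positive")
    options: Dict[str, Any] = {
        "memory_mbytes": memory_mbytes,
        "timeout_secs": timeout_secs,
    }
    if build is not None:
        options["build"] = build
    if webhook_url is not None:
        options["webhooks"] = [{
            "event_types": ["ACTOR.RUN.SUCCEEDED", "ACTOR.RUN.FAILED"],
            "request_url": webhook_url,
        }]
    return options


options = child_run_options(1024, 300, build="beta")
```

The result would be unpacked into the call, for example `await Actor.start('username/your-worker', run_input=..., **options)`.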
Costs
Every called actor is its own billed run, billed to the calling user — not the actor's owner. A free orchestrator that calls 100 paid actors charges the user 100 paid runs. Two consequences:
- Free orchestrators that wrap paid workers are fine — but disclose it on the listing.
- The orchestrator itself keeps billing for memory while it sits in Actor.call(). For long children, prefer Actor.start() plus a webhook handler so the orchestrator can exit.
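A rough model of the bill helps when deciding between Actor.call() and Actor.start(). The sketch below is a simplification with hypothetical prices: it assumes each child costs a flat amount per run (real actors may charge per result or per event instead) and that the waiting orchestrator is billed in compute units, where one compute unit is 1 GB of memory for one hour.

```python
def chained_run_cost(
    children: int,
    price_per_child_run: float,
    orchestrator_memory_gb: float,
    orchestrator_hours: float,
    price_per_compute_unit: float,
) -> float:
    """Estimate what the calling user pays for one orchestrated job."""
    child_cost = children * price_per_child_run
    # One compute unit = 1 GB of memory held for 1 hour.
    compute_units = orchestrator_memory_gb * orchestrator_hours
    return child_cost + compute_units * price_per_compute_unit


# Hypothetical figures: 100 children at 0.05 each, plus an orchestrator that
# holds 1 GB for 2 hours while waiting, at 0.40 per compute unit.
waiting = chained_run_cost(100, 0.05, 1.0, 2.0, 0.40)
# Same job when the orchestrator starts the children and exits after a minute.
exiting = chained_run_cost(100, 0.05, 1.0, 1 / 60, 0.40)
```

With these made-up numbers the children dominate the bill, and exiting early only removes the orchestrator's idle share.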
Error handling
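Actor.call() throws if the child run ends in a non-success status (failed, timed out, aborted), so you don't need to inspect run.status in the happy path. Some SDK versions may instead return the run record with a non-success status, so confirm this against the version you run.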
Two cases where you do need to check run.status yourself:
- You used Actor.start() and looked the run up later via the API. There's no exception path.
- You set waitSecs on Actor.call() low enough that it returns before the child finishes. The returned record will show whatever status the run was in at the cutoff (often RUNNING).
Wrap Actor.call() in try/except (or try/catch) when the child's failure shouldn't kill your run — e.g. enrichment that's nice-to-have but not blocking.
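One way to sketch that wrapper is below. It treats both a raised exception and a non-SUCCEEDED status as a soft failure, so it works whichever way the call reports problems. `call_optional` is a hypothetical helper, the run record is modeled as a plain dict for the sake of a runnable example (a real run object may use attribute access), and the `fake_*` coroutines stand in for something like `lambda: Actor.call(...)`.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Optional

Run = Dict[str, Any]  # stand-in for the run record


async def call_optional(call_child: Callable[[], Awaitable[Run]]) -> Optional[Run]:
    """Run a nice-to-have child; return None instead of raising on failure."""
    try:
        run = await call_child()
    except Exception as exc:  # the start failed, or the call raised on a bad status
        print(f"Optional child raised: {exc!r}")
        return None
    if run.get("status") != "SUCCEEDED":
        print(f"Optional child ended with status {run.get('status')}")
        return None
    return run


# Hypothetical stand-ins for a real child call.
async def fake_success() -> Run:
    return {"id": "run-1", "status": "SUCCEEDED"}


async def fake_failure() -> Run:
    raise RuntimeError("child run failed")


async def fake_timed_out() -> Run:
    return {"id": "run-2", "status": "TIMED-OUT"}
```

The caller checks for None and carries on with the unenriched rows.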
Gotchas worth knowing
- Chained costs compound. Every called actor is its own billed run, billed to the calling user. A “free” orchestrator that calls 100 paid actors charges the user 100 paid runs.
- Actor.call() is synchronous by default. Your run sits idle (still billed for memory) while the child runs. For long children, prefer Actor.start() + a webhook handler.
- Mind the timeout. Default Actor.call() waits up to the platform max (often several hours). Set timeout_secs / timeout to match your SLA. Better to fail fast than rack up runtime.
- Don't infinite-loop. Actor A calling Actor A is allowed and will happily spin until you run out of compute units. Add a hard recursion guard.
- Read datasets via forceCloud / force_cloud=True. Locally, openDataset(id) defaults to local storage. The flag tells the SDK to fetch from the platform, which is what you want for a dataset produced by another actor on the platform.
- Webhook URLs need to be reachable. A webhook firing into localhost won't deliver. Use a tunnel (ngrok, cloudflared) during development.
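A hard recursion guard can be as small as a depth counter carried in the input. The sketch below assumes you control the input schema of the actors in the chain and can add a `depth` field; both the field name and the `MAX_DEPTH` limit are hypothetical.

```python
from typing import Any, Dict

MAX_DEPTH = 3  # hypothetical limit; pick what your chain actually needs


def child_input(parent_input: Dict[str, Any], **overrides: Any) -> Dict[str, Any]:
    """Build the input for a child run, refusing to go deeper than MAX_DEPTH."""
    depth = int(parent_input.get("depth", 0))
    if depth >= MAX_DEPTH:
        raise RuntimeError(f"Refusing to chain deeper than {MAX_DEPTH} levels")
    # The child inherits the parent's input, with the depth bumped by one.
    return {**parent_input, **overrides, "depth": depth + 1}


first = child_input({"startUrls": []})
second = child_input(first, chunkId=42)
```

Build every child's input through this helper, so a self-calling actor stops with an error instead of spinning.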
Where to go next
- Schedule your Apify actor: run an orchestrator on a cron.
- Send actor results to a webhook: the other half of the Actor.start() pattern.
- How to tell if an Apify user is paying: gate which children you call based on tier.
- Apify Pricing Calculator: model the compounding cost of chained runs.