Chain Apify actors: call one from another

Compose actors instead of stuffing every step into one. This guide shows how to call one Apify actor from another, either synchronously with Actor.call() (wait for the result) or fire-and-forget with Actor.start(), and how to read the called actor's dataset.

TL;DR

  • Actor.call(actorId, input, options) — starts the other actor, waits for it to finish, and returns the run record (including defaultDatasetId).
  • Actor.start(actorId, input, options) — starts the other actor and returns immediately with the run id. Use a webhook to react when it's done.
  • Either way, the called actor's output lives in its own dataset — open it with Actor.openDataset(run.defaultDatasetId, { forceCloud: true }).

Why chain actors

Three patterns cover almost every case:

  • Orchestrator → worker. One actor decides what to scrape (e.g. shards a list of 10,000 URLs into 100 chunks) and calls a worker actor per chunk. Workers run in parallel; the orchestrator stays small and cheap.
  • Preprocess → main. Call apify/website-content-crawler to get clean Markdown/HTML, then have your actor parse it. You inherit the crawler's proxy handling, JS rendering, and content extraction without re-implementing any of it.
  • Main → enrich. Your actor produces raw rows, then calls an enrichment actor (translation, summarization, geocoding, screenshotting…) per item. Lets you mix and match best-of-breed actors instead of vendoring everything.
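
The orchestrator → worker pattern can be sketched as below. This is a minimal illustration, not a definitive implementation: chunk() and fanOut() are hypothetical helper names, and startRun is an injected function so the logic can be tried without the platform. Inside a real actor it would wrap Actor.start().

```javascript
// Split a list into fixed-size chunks (the last chunk may be shorter).
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Start one worker run per chunk and resolve with whatever startRun returns.
// `startRun` is a hypothetical injected function: (input) => Promise<run>.
async function fanOut(urls, chunkSize, startRun) {
  const chunks = chunk(urls, chunkSize);
  return Promise.all(
    chunks.map((part, chunkId) => startRun({
      chunkId,
      startUrls: part.map((url) => ({ url })),
    })),
  );
}
```

In an actor, the call might look like fanOut(urls, 100, (input) => Actor.start('username/your-worker', input)). With 10,000 URLs and a chunk size of 100, that starts 100 worker runs.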

Actor.call(): wait for a result

Actor.call() blocks until the child run reaches a terminal status (succeeded, failed, timed out, or aborted) and returns the full run record. The fields you'll use most are run.id, run.status, and run.defaultDatasetId.

import { Actor, log } from 'apify';

await Actor.init();

// Calls another actor, waits for it to finish, returns the run record.
const run = await Actor.call('apify/website-content-crawler', {
  startUrls: [{ url: 'https://docs.apify.com' }],
  maxCrawlPages: 50,
  saveMarkdown: true,
}, {
  memory: 2048,   // MB
  timeout: 300,   // seconds
});

log.info(`Called run ${run.id} finished with status ${run.status}`);

// Read the called actor's default dataset.
const dataset = await Actor.openDataset(run.defaultDatasetId, { forceCloud: true });
const { items } = await dataset.getData();

log.info(`Got ${items.length} items from the called actor`);

// Re-emit (or transform, or enrich) into this run's dataset.
for (const item of items) {
  await Actor.pushData({
    sourceUrl: item.url,
    markdown: item.markdown,
    processedAt: new Date().toISOString(),
  });
}

await Actor.exit();

Reading the called actor's dataset

The called actor writes to its own default dataset. Open it by id with forceCloud: true (Python: force_cloud=True) so the SDK fetches from the Apify platform instead of looking for a local store:

JavaScript:

const dataset = await Actor.openDataset(run.defaultDatasetId, { forceCloud: true });
const { items } = await dataset.getData();

Python:

dataset = await Actor.open_dataset(id=run.default_dataset_id, force_cloud=True)
items = (await dataset.get_data()).items

From there you can re-emit, transform, or enrich the items into your run's dataset with Actor.pushData() / Actor.push_data(); that is what your users actually receive.
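
If the called actor produced a lot of rows, you may prefer to read its dataset in pages rather than in one getData() call. The sketch below assumes getData() accepts offset and limit options and resolves to an object with an items array. readAllItems is a hypothetical helper name, and the page size of 1000 is an arbitrary choice.

```javascript
// Read every item from a dataset page by page.
// `dataset` is any object exposing getData({ offset, limit }) that resolves
// to { items }, e.g. the result of Actor.openDataset(...).
async function readAllItems(dataset, pageSize = 1000) {
  const all = [];
  for (let offset = 0; ; offset += pageSize) {
    const { items } = await dataset.getData({ offset, limit: pageSize });
    all.push(...items);
    if (items.length < pageSize) break; // short (or empty) page: nothing left
  }
  return all;
}
```

Usage would be something like const items = await readAllItems(dataset); in place of the single getData() call.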

Actor.start(): fire and forget

When the child run is long, or you want fan-out parallelism, don't sit and wait. Actor.start() returns the run id the moment the child is queued — your actor can move on (or exit) and react later via a webhook.

import { Actor, log } from 'apify';

await Actor.init();

const { id } = await Actor.start('username/your-worker', {
  chunkId: 42,
  startUrls: [{ url: 'https://example.com' }],
}, {
  memory: 1024,
  // Optional: notify a webhook when the worker finishes.
  webhooks: [{
    eventTypes: ['ACTOR.RUN.SUCCEEDED', 'ACTOR.RUN.FAILED'],
    requestUrl: 'https://your-server.example.com/apify-webhook',
  }],
});

log.info(`Started worker run ${id}, returning immediately`);

await Actor.exit();
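
On the receiving end, the webhook handler has to work out which run finished and where its data lives. The sketch below assumes the default webhook payload, which carries an eventType string and a resource object holding the run record. If you configure a custom payload template, the shape will differ. summarizeWebhook is a hypothetical helper name.

```javascript
// Turn a webhook payload into a small decision object.
// Assumes the default payload shape: { eventType, resource, ... }, where
// `resource` is the run record of the finished worker.
function summarizeWebhook(payload) {
  const run = (payload && payload.resource) || {};
  return {
    runId: run.id ?? null,
    status: run.status ?? null,
    datasetId: run.defaultDatasetId ?? null,
    succeeded: payload?.eventType === 'ACTOR.RUN.SUCCEEDED',
  };
}
```

Your server would call this on the parsed JSON body, then open datasetId only when succeeded is true.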

Passing input and configuring the call

Both Actor.call() and Actor.start() take a third options argument. The ones you'll reach for:

  • memory (memory_mbytes in Python) — RAM for the child run, in MB. Defaults to the child actor's default. Bumping it usually scales CPU too.
  • timeout (timeout_secs in Python) — seconds before the child is forcibly aborted. Default is the platform max (often hours). Always set this when the child is on a critical path.
  • build — pin the child to a specific build tag (e.g. 'latest', 'beta', or a version like '1.2.0'). Lets you ride a stable build while the actor's latest moves.
  • webhooks — an array of webhook descriptors fired on ACTOR.RUN.SUCCEEDED, ACTOR.RUN.FAILED, etc. Required if you want to react to a start()ed run without polling.
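
A small helper can keep these options consistent across call sites. The sketch below is illustrative only. buildCallOptions is a hypothetical name, and the power-of-two memory check reflects the platform's memory rule as assumed here, so verify it against the current Apify documentation before relying on it.

```javascript
// Build the options object passed as the third argument to
// Actor.call() / Actor.start(). Only keys that were provided are set,
// so omitted values fall back to the child actor's defaults.
function buildCallOptions({ memory, timeout, build } = {}) {
  const options = {};
  if (memory !== undefined) {
    // Assumption: memory must be a power of two, at least 128 MB.
    const valid = Number.isInteger(memory) && memory >= 128
      && (memory & (memory - 1)) === 0;
    if (!valid) {
      throw new RangeError(`memory must be a power of two >= 128 MB, got ${memory}`);
    }
    options.memory = memory;
  }
  if (timeout !== undefined) {
    if (!Number.isInteger(timeout) || timeout <= 0) {
      throw new RangeError(`timeout must be a positive number of seconds, got ${timeout}`);
    }
    options.timeout = timeout;
  }
  if (build !== undefined) options.build = build;
  return options;
}
```

For example, buildCallOptions({ memory: 2048, timeout: 300, build: 'beta' }) yields an object you can pass straight through as the third argument.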

Costs

Every called actor is its own billed run, billed to the calling user — not the actor's owner. A free orchestrator that calls 100 paid actors charges the user 100 paid runs. Two consequences:

  • Free orchestrators that wrap paid workers are fine — but disclose it on the listing.
  • The orchestrator itself keeps billing for memory while it sits in Actor.call(). For long children, prefer Actor.start() plus a webhook handler so the orchestrator can exit.
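
To see why the second point matters, here is some rough arithmetic. It assumes one compute unit equals 1 GB of memory held for one hour. The memory size and durations are made-up figures for illustration, and computeUnits is a hypothetical helper name.

```javascript
// Compute units consumed by a run: GB of memory multiplied by hours of runtime.
// Assumes 1 CU = 1 GB-hour.
function computeUnits(memoryMb, seconds) {
  return (memoryMb / 1024) * (seconds / 3600);
}

// Orchestrator at 1024 MB blocked in Actor.call() while a 2-hour child runs.
const waiting = computeUnits(1024, 2 * 3600);

// Same orchestrator that starts the child and exits after roughly 30 seconds.
const fireAndForget = computeUnits(1024, 30);

console.log({ waiting, fireAndForget });
```

Under these assumptions the waiting orchestrator uses 2 compute units on top of the child's own usage, while the fire-and-forget version uses under 0.01.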

Error handling

Actor.call() throws if the child run ends in a non-success status (failed, timed out, aborted), so you don't need to inspect run.status in the happy path.

Two cases where you do need to check run.status yourself:

  • You used Actor.start() and looked the run up later via the API; there is no exception path.
  • You set waitSecs on Actor.call() low enough that it returns before the child finishes. The returned record will show whatever status the run was in at the cutoff (often RUNNING).

Wrap Actor.call() in try/except (or try/catch) when the child's failure shouldn't kill your run — e.g. enrichment that's nice-to-have but not blocking.
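
A wrapper for that nice-to-have case might look like the sketch below. callOptional is a hypothetical helper, and callActor is injected so the logic can be exercised off-platform. Besides catching exceptions, it also checks the returned status defensively, on the assumption that some SDK versions or a low waitSecs may hand back a run record that did not succeed.

```javascript
// Run an optional child actor without letting its failure kill the parent.
// `callActor` is a hypothetical injected function; inside an actor it would be
// (id, input, opts) => Actor.call(id, input, opts).
async function callOptional(callActor, actorId, input, options) {
  try {
    const run = await callActor(actorId, input, options);
    // Defensive: treat a returned non-success status as a failure too.
    if (run.status !== 'SUCCEEDED') {
      const error = new Error(`Run ${run.id} ended with status ${run.status}`);
      return { ok: false, run, error };
    }
    return { ok: true, run, error: null };
  } catch (error) {
    return { ok: false, run: null, error };
  }
}
```

The caller can then branch on ok: enrich the rows when it is true, and push the raw rows unchanged when it is false.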

Gotchas worth knowing

  • Chained costs compound. Every called actor is its own billed run, billed to the calling user. A “free” orchestrator that calls 100 paid actors charges the user 100 paid runs.
  • Actor.call() is synchronous by default. Your run sits idle (still billed for memory) while the child runs. For long children, prefer Actor.start() + a webhook handler.
  • Mind the timeout. Default Actor.call() waits up to the platform max (often several hours). Set timeout_secs / timeout to match your SLA — better to fail fast than rack up runtime.
  • Don't infinite-loop. Actor A calling Actor A is allowed and will happily spin until you run out of compute units. Add a hard recursion guard.
  • Read datasets via forceCloud / force_cloud=True. Locally, openDataset(id) defaults to local storage. The flag tells the SDK to fetch from the platform — which is what you want for a dataset produced by another actor on the platform.
  • Webhook URLs need to be reachable. A webhook firing into localhost won't deliver. Use a tunnel (ngrok, cloudflared) during development.
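
One way to build the recursion guard mentioned above is to carry a depth counter in the input. The sketch below assumes a convention invented for this example: chainDepth is not an Apify field, and the limit of 3 is arbitrary.

```javascript
// Hypothetical convention: each actor passes its depth to the child under
// `chainDepth`, and refuses to chain once the limit is reached.
const MAX_CHAIN_DEPTH = 3;

function nextChildInput(parentInput, childInput, maxDepth = MAX_CHAIN_DEPTH) {
  // A missing or malformed depth is treated as the top of the chain.
  const depth = Number.isInteger(parentInput?.chainDepth) ? parentInput.chainDepth : 0;
  if (depth >= maxDepth) {
    throw new Error(`Refusing to chain: depth ${depth} reached the limit of ${maxDepth}`);
  }
  return { ...childInput, chainDepth: depth + 1 };
}
```

An actor would read its own input with Actor.getInput(), build the child's input through nextChildInput(), and only then call Actor.call() or Actor.start().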
