Schedule your Apify actor without burning credits

Apify Schedules let you run any actor on a cron expression. The trick is making the actor idempotent with a tiny key-value cursor, so a daily 6am run doesn't re-scrape what yesterday's run already produced.

TL;DR

  • Apify has Schedules — cron-style triggers that fire a chosen actor with chosen input.
  • Schedules run in UTC. Always.
  • A naive scheduled actor re-does its full job on every fire. You pay each time.
  • Make the actor idempotent: store a cursor in a named KV store, fetch only what's new since the last run, dedup by ID before pushing.
  • Save the cursor only on success so failed runs retry the same window.

Set up a schedule in the Apify Console

  1. Open console.apify.com/schedules.
  2. Click Create new (top right).
  3. Give the schedule a name and a cron expression. For a daily 06:00 UTC run:
0 6 * * *
  4. Click Actions → Add actor, pick the actor, and provide the input JSON it should run with. (You can add multiple actors to the same schedule; they all fire on the cron tick.)
  5. Save. The schedule starts firing on the next tick, with no further deploy needed.

Schedules are first-class objects: you can pause one, edit the cron, change input, or trigger a one-off run from the same screen. They also have an API, so you can manage them programmatically with your APIFY_TOKEN if you'd rather keep them in source control.
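If you do keep schedules in source control, a minimal sketch might look like the one below. It assumes the apify-client package and its client.schedules().create() call. The payload field names (cronExpression, actions, RUN_ACTOR, runInput) reflect one reading of the Schedules API and should be checked against the current API reference. The helper name, the actor ID, and the input are placeholders.

```javascript
// Hypothetical helper: builds the payload for a schedule that runs one actor.
// Field names are assumptions to verify against the Apify Schedules API reference.
function buildSchedulePayload({ name, cronExpression, actorId, input }) {
  return {
    name,
    cronExpression,
    isEnabled: true,
    actions: [
      {
        type: 'RUN_ACTOR',
        actorId,
        runInput: {
          body: JSON.stringify(input),
          contentType: 'application/json; charset=utf-8',
        },
      },
    ],
  };
}

const payload = buildSchedulePayload({
  name: 'daily-product-tracker',
  cronExpression: '0 6 * * *', // daily at 06:00 UTC
  actorId: 'YOUR_ACTOR_ID', // placeholder
  input: { maxItems: 1000 }, // placeholder input
});

// With apify-client installed, creating the schedule would look roughly like:
//   import { ApifyClient } from 'apify-client';
//   const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
//   await client.schedules().create(payload);
```

Keeping the payload builder separate from the API call makes the schedule definition easy to review in a pull request before anything is sent to the platform.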

Why a naive scheduled actor wastes money

Imagine a product-tracker actor. Every morning at 06:00 it scrapes the same 1,000 product pages, builds the same 1,000-row dataset, and exits. Tomorrow it does it again. The dataset grows by 1,000 rows per day even though almost nothing on those pages changed.

You pay for:

  • Compute units to fetch and parse 1,000 pages (most of which are unchanged).
  • Proxy traffic to load each one.
  • Dataset storage that grows linearly with time, not with new information.

The fix isn't to run less often — your users want fresh data. The fix is to make each scheduled run only do new work.
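To put rough numbers on the storage side of that, here is a back-of-the-envelope sketch. The 2% daily change rate is an illustrative assumption, not a measurement; substitute a figure from your own source.

```javascript
// Rows stored after `days` daily runs over `pages` product pages.
// `changeRate` (fraction of pages that change per day) is an illustrative assumption.
function rowsAfter(days, pages, changeRate) {
  const naive = days * pages; // every run stores every page again
  const incremental = pages + Math.round((days - 1) * pages * changeRate); // first run, then only changes
  return { naive, incremental };
}

console.log(rowsAfter(30, 1000, 0.02)); // { naive: 30000, incremental: 1580 }
```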

The cursor pattern

A cursor is a tiny piece of state that says “this is where I left off.” Store it in a named KV store so it survives between runs but doesn't pollute your default storage tab:

import { Actor } from 'apify';

const STORE = 'schedule-cursor';
const KEY = 'cursor';

export async function loadCursor() {
  const store = await Actor.openKeyValueStore(STORE);
  return (await store.getValue(KEY)) ?? {
    lastRunAt: 0,
    lastSeenIds: [],
  };
}

export async function saveCursor(cursor) {
  const store = await Actor.openKeyValueStore(STORE);
  await store.setValue(KEY, cursor);
}

Two fields are usually enough: lastRunAt (a millisecond timestamp) for time-based sources, and lastSeenIds (a bounded list of recent IDs) for dedup against any source that doesn't have reliable monotonic timestamps.
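One way to keep lastSeenIds bounded is to merge the old and new IDs and trim from the front. The helper below is a hypothetical sketch (mergeSeenIds is not part of the Apify SDK) and it orders IDs by when they were first seen.

```javascript
// Hypothetical helper: merge previously seen IDs with this run's IDs,
// drop duplicates, and keep only the last `limit` in first-seen order.
function mergeSeenIds(previousIds, currentIds, limit = 200) {
  const merged = [...new Set([...previousIds, ...currentIds])];
  return merged.slice(-limit);
}

const previousCursor = { lastRunAt: 1_700_000_000_000, lastSeenIds: ['a', 'b', 'c'] };
const nextIds = mergeSeenIds(previousCursor.lastSeenIds, ['c', 'd'], 3);
console.log(nextIds); // [ 'b', 'c', 'd' ]
```

Merging rather than overwriting means a quiet run that fetches only a few items does not wipe out the IDs remembered from busier runs.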

Wire it into your actor

Load the cursor → fetch only what's newer than lastRunAt → dedup against lastSeenIds → push the new items → save the cursor:

import { Actor, log } from 'apify';
import { loadCursor, saveCursor } from './cursor.js';
import { fetchItems } from './source.js';

await Actor.init();

const cursor = await loadCursor();
const now = Date.now();

// Look back to the last run plus a 60-second overlap, capped at 7 days.
// With no cursor yet (first run), look back 1 day.
const sinceMs = cursor.lastRunAt
  ? Math.min(now - cursor.lastRunAt + 60_000, 7 * 86400_000)
  : 86400_000;

const items = await fetchItems({ since: now - sinceMs });

// Dedup: skip anything we've already pushed in a previous run.
const previouslySeen = new Set(cursor.lastSeenIds);
const newItems = items.filter((item) => !previouslySeen.has(item.id));

for (const item of newItems) {
  await Actor.pushData(item);
}

// `log` is the logger exported by the apify package.
log.info(`Pushed ${newItems.length} new items (skipped ${items.length - newItems.length} dupes)`);

// Save the cursor — keep the 200 most recent IDs to bound storage.
await saveCursor({
  lastRunAt: now,
  lastSeenIds: items.slice(-200).map((i) => i.id),
});

await Actor.exit();

Three details worth calling out:

  • The 60-second overlap. now - cursor.lastRunAt + 60_000 re-fetches one extra minute of data on every run. This is your safety net for items that arrive at the source slightly after their createdAt.
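  • The 7-day cap. If the cursor is very old (you paused the schedule for a month), don't try to backfill the entire gap in one run. Cap the lookback so the run completes inside the actor timeout. Because the cursor then moves to the current time, anything older than 7 days is skipped, so backfill it separately if you need it. A missing cursor (first run) falls back to a one-day lookback instead.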
  • Dedup first, push second. Filtering before pushData is what makes the dataset clean. The lastSeenIds set is bounded to 200 — tune up if your throughput justifies it.
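The window arithmetic from the first two bullets can be pulled into a small pure function and checked with concrete numbers. This is a sketch that mirrors the logic in the actor code above; the function name lookbackMs is made up for illustration.

```javascript
const MINUTE = 60_000;
const DAY = 86_400_000;

// Overlap by one minute, cap at 7 days, and fall back to 1 day
// when there is no cursor yet (lastRunAt === 0).
function lookbackMs(lastRunAt, now) {
  if (!lastRunAt) return DAY;
  return Math.min(now - lastRunAt + MINUTE, 7 * DAY);
}

const now = Date.UTC(2024, 0, 31, 6, 0, 0);
console.log(lookbackMs(0, now) / DAY);              // 1 (first run)
console.log(lookbackMs(now - DAY, now) / MINUTE);   // 1441 (one day plus the overlap)
console.log(lookbackMs(now - 30 * DAY, now) / DAY); // 7 (capped)
```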

Cron in UTC

Apify schedules run in UTC. There is no per-schedule timezone setting. Plan your cron expression on crontab.guru and then mentally subtract your offset.

Example: you want the actor to run at 06:00 PT every day.

  • 06:00 PT = 13:00 UTC (during PDT) / 14:00 UTC (during PST).
  • A 0 13 * * * schedule will fire at 06:00 PT in summer and 05:00 PT in winter.

If your users are sensitive to the local clock and you live in a country that does DST, you have two options: pick a UTC time you're happy to drift by an hour twice a year, or run the schedule hourly and have the actor itself check whether it's the “right” local hour to do work.
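The second option can be sketched with the standard Intl API, which handles DST for you. The function names and the America/Los_Angeles zone are illustrative choices; the schedule itself would use an hourly cron such as 0 * * * *.

```javascript
// Returns the hour (0-23) that `date` falls on in the given IANA time zone.
function localHour(date, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hour: 'numeric',
    hourCycle: 'h23',
  }).formatToParts(date);
  return Number(parts.find((part) => part.type === 'hour').value);
}

// Hypothetical guard for an hourly schedule: only do work at the target local hour.
function shouldRunNow(date, timeZone, targetHour) {
  return localHour(date, timeZone) === targetHour;
}

console.log(shouldRunNow(new Date('2024-07-01T13:00:00Z'), 'America/Los_Angeles', 6)); // true (PDT)
console.log(shouldRunNow(new Date('2024-01-15T14:00:00Z'), 'America/Los_Angeles', 6)); // true (PST)
console.log(shouldRunNow(new Date('2024-01-15T13:00:00Z'), 'America/Los_Angeles', 6)); // false (05:00 PST)
```

The trade-off is that 23 of every 24 runs start, check the clock, and exit. Each of those still costs a small amount of compute, so exit as early as possible in the run.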

Overlapping runs

Apify will start a new scheduled run on every cron tick — even if the previous run hasn't finished. For a daily schedule that's fine. For a 5-minute schedule against a slow source, you can end up with several copies running at once, all racing on the same cursor.

Two ways to defend against this:

  • maxConcurrentRuns on the schedule. Set it to 1 so the platform refuses to start a new run while the previous one is still going. Available on most paid plans.
  • A KV-store lock as a fallback. At the top of main, write { runId, startedAt } to a lock key. If the key already exists and its startedAt is fresh enough, exit immediately. Clear the key in a finally block. This works on any plan, but you have to handle stale locks (left by a previous crash) yourself, usually with a rule like “the lock is older than the actor's max run duration, so ignore it.”
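A minimal sketch of the KV-store lock follows. The store is passed in so the logic is easy to test; in an actor it would be the result of Actor.openKeyValueStore('schedule-cursor'), and the run ID could come from Actor.getEnv().actorRunId (both are assumptions about how you wire it up). The function names and the lock key are made up for illustration. Note that the read-then-write is not atomic, so two runs starting at the same instant can both take the lock.

```javascript
const LOCK_KEY = 'lock';

// `store` is anything with async getValue/setValue.
// `maxAgeMs` should be at least the actor's maximum run duration.
async function acquireLock(store, runId, maxAgeMs, now = Date.now()) {
  const existing = await store.getValue(LOCK_KEY);
  if (existing && now - existing.startedAt < maxAgeMs) return false; // fresh lock: back off
  await store.setValue(LOCK_KEY, { runId, startedAt: now }); // free or stale: take it
  return true;
}

async function releaseLock(store) {
  // In the Apify SDK, setting a key to null deletes the record.
  await store.setValue(LOCK_KEY, null);
}

// Runs `doWork` only if the lock is free, and always releases it afterwards.
async function runExclusive(store, runId, maxAgeMs, doWork) {
  if (!(await acquireLock(store, runId, maxAgeMs))) return 'skipped';
  try {
    await doWork();
    return 'done';
  } finally {
    await releaseLock(store);
  }
}
```

Because the release sits in a finally block, a run that throws still clears the lock. Only a hard crash or timeout leaves it behind, which is the case the staleness check covers.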

Gotchas worth knowing

  • Cron is UTC. A 0 9 * * * schedule fires at 09:00 UTC, which is 01:00 PST (02:00 PDT) or 04:00 EST (05:00 EDT). Use crontab.guru to plan, then convert your local time to UTC.
  • Missed runs don't backfill. If the platform is down for an hour and your cron fires during the gap, you simply miss that run. The cursor pattern handles this — the next run picks up the wider window automatically.
  • Schedules can overlap. A 5-minute schedule firing while the previous run still hasn't finished will start a second run by default. Set maxConcurrentRuns: 1 on the schedule, or guard with a KV-store lock if your platform tier doesn't support it.
  • Failures don't update the cursor. Save the cursor only on success — that way a failed run lets the next attempt re-process the same window.
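  • State is shared across runs. The named KV store schedule-cursor persists across all runs of this actor, which is exactly what you want. Named stores belong to your Apify account rather than to a single actor, so give each scheduled actor its own store name. The state is not per-user: scheduled runs don't have a user-tier concept the way a manual run does.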
  • A 200-ID dedup window may not be enough. Tune it to your throughput — for high-volume actors, store hashes or use a Bloom filter.
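For the high-volume case in the last bullet, a Bloom filter trades a small false-positive rate for constant storage. The class below is a minimal sketch, not a tuned implementation: the sizes, the seeded FNV-1a hashing, and the class itself are illustrative assumptions. A false positive means a genuinely new item gets skipped, so size the filter generously.

```javascript
// Minimal Bloom filter sketch: `bits` slots, `hashes` probes per ID.
// False positives are possible; false negatives are not.
class BloomFilter {
  constructor(bits = 8192, hashes = 4, data = null) {
    this.bits = bits;
    this.hashes = hashes;
    this.data = data ? Uint8Array.from(data) : new Uint8Array(Math.ceil(bits / 8));
  }

  // FNV-1a 32-bit hash, varied per probe by a seed.
  static hash(text, seed) {
    let h = (0x811c9dc5 ^ seed) >>> 0;
    for (let i = 0; i < text.length; i++) {
      h ^= text.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h;
  }

  positions(id) {
    const out = [];
    for (let i = 0; i < this.hashes; i++) {
      out.push(BloomFilter.hash(String(id), i) % this.bits);
    }
    return out;
  }

  add(id) {
    for (const p of this.positions(id)) this.data[p >> 3] |= 1 << (p & 7);
  }

  has(id) {
    return this.positions(id).every((p) => (this.data[p >> 3] & (1 << (p & 7))) !== 0);
  }

  // Plain object that can be saved in the cursor as JSON.
  toJSON() {
    return { bits: this.bits, hashes: this.hashes, data: Array.from(this.data) };
  }
}

const seen = new BloomFilter();
seen.add('sku-1');
console.log(seen.has('sku-1')); // true

// Round-trip through JSON, as it would be when stored in the cursor.
const saved = JSON.parse(JSON.stringify(seen));
const restored = new BloomFilter(saved.bits, saved.hashes, saved.data);
console.log(restored.has('sku-1')); // true
```

Unlike the 200-ID list, a Bloom filter never forgets, so it fills up over time. Rotating to a fresh filter periodically is one way to keep the false-positive rate in check.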

Spotted a bug, or want a guide on something else?

support@mail.apifyhub.com