Schedule your Apify actor without burning credits
Apify Schedules let you run any actor on a cron expression. The trick is making the actor idempotent with a tiny key-value cursor, so a daily 6am run doesn't re-scrape what yesterday's run already produced.
TL;DR
- Apify has Schedules — cron-style triggers that fire a chosen actor with chosen input.
- Schedules run in UTC. Always.
- A naive scheduled actor re-does its full job on every fire. You pay each time.
- Make the actor idempotent: store a cursor in a named KV store, fetch only what's new since the last run, dedup by ID before pushing.
- Save the cursor only on success so failed runs retry the same window.
Set up a schedule in the Apify Console
- Open console.apify.com/schedules.
- Click Create new (top right).
- Give the schedule a name and a cron expression. For a daily 06:00 UTC run:
    0 6 * * *
- Click Actions → Add actor, pick the actor, and provide the input JSON it should run with. (You can add multiple actors to the same schedule; they all fire on the cron tick.)
- Save. The schedule starts firing on the next tick — no further deploy needed.
Schedules are first-class objects: you can pause one, edit the cron, change input, or trigger a one-off run from the same screen. They also have an API, so you can manage them programmatically with your APIFY_TOKEN if you'd rather keep them in source control.
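If you do keep schedules in source control, the request body could be built along these lines. This is a sketch rather than a verified client: the endpoint URL and the field names (cronExpression, isEnabled, actions, RUN_ACTOR, runInput) are assumptions written from memory of the public Apify API and should be checked against the API reference, which may require further fields. The schedule name, actor ID, and input are placeholders.

```python
import json
import os
import urllib.request

# Assumed endpoint; confirm against the Apify API reference.
SCHEDULES_URL = "https://api.apify.com/v2/schedules"


def build_schedule_payload(name: str, cron: str, actor_id: str, actor_input: dict) -> dict:
    """Build the JSON body for one schedule that runs one actor (assumed field names)."""
    return {
        "name": name,
        "cronExpression": cron,
        "isEnabled": True,
        "actions": [
            {
                "type": "RUN_ACTOR",
                "actorId": actor_id,
                "runInput": {
                    # The actor input is sent as a JSON string.
                    "body": json.dumps(actor_input),
                    "contentType": "application/json; charset=utf-8",
                },
            }
        ],
    }


def build_request(payload: dict, token: str) -> urllib.request.Request:
    """Prepare (but do not send) the authenticated POST request."""
    return urllib.request.Request(
        SCHEDULES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = build_schedule_payload(
    name="daily-product-tracker",      # hypothetical schedule name
    cron="0 6 * * *",                  # daily at 06:00
    actor_id="YOUR_ACTOR_ID",          # placeholder
    actor_input={"startUrls": []},     # placeholder input
)
request = build_request(payload, os.environ.get("APIFY_TOKEN", "dummy-token"))
# urllib.request.urlopen(request) would send it; omitted so the sketch has no side effects.
```

Keeping the payload builder separate from the HTTP call means the schedule definition can be reviewed and diffed like any other config.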
Why a naive scheduled actor wastes money
Imagine a product-tracker actor. Every morning at 06:00 it scrapes the same 1,000 product pages, builds the same 1,000-row dataset, and exits. Tomorrow it does it again. The dataset grows by 1,000 rows per day even though almost nothing on those pages changed.
You pay for:
- Compute units to fetch and parse 1,000 pages (most of which are unchanged).
- Proxy traffic to load each one.
- Dataset storage that grows linearly with time, not with new information.
The fix isn't to run less often — your users want fresh data. The fix is to make each scheduled run only do new work.
The cursor pattern
A cursor is a tiny piece of state that says “this is where I left off.” Stored in a named KV store so it survives between runs but doesn't pollute your default storage tab:
// cursor.js
import { Actor } from 'apify';

const STORE = 'schedule-cursor';
const KEY = 'cursor';

export async function loadCursor() {
    const store = await Actor.openKeyValueStore(STORE);
    return (await store.getValue(KEY)) ?? {
        lastRunAt: 0,
        lastSeenIds: [],
    };
}

export async function saveCursor(cursor) {
    const store = await Actor.openKeyValueStore(STORE);
    await store.setValue(KEY, cursor);
}

# cursor.py
from apify import Actor

STORE = 'schedule-cursor'
KEY = 'cursor'


async def load_cursor() -> dict:
    store = await Actor.open_key_value_store(name=STORE)
    return (await store.get_value(KEY)) or {
        'lastRunAt': 0,
        'lastSeenIds': [],
    }


async def save_cursor(cursor: dict) -> None:
    store = await Actor.open_key_value_store(name=STORE)
    await store.set_value(KEY, cursor)
Two fields are usually enough: lastRunAt (a millisecond timestamp) for time-based sources, and lastSeenIds (a bounded list of recent IDs) for dedup against any source that doesn't have reliable monotonic timestamps.
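One way to keep lastSeenIds bounded while still remembering IDs from earlier runs is to merge the previous list with the current run's IDs and keep only the newest entries. A minimal sketch, assuming both lists are ordered oldest to newest; merge_seen_ids and its limit parameter are illustrative names, not part of the Apify SDK:

```python
def merge_seen_ids(previous: list, current: list, limit: int = 200) -> list:
    """Merge two ID lists (oldest first), drop duplicates, keep the newest `limit`."""
    if limit <= 0:
        return []
    merged = {}
    for item_id in [*previous, *current]:
        # Removing and re-inserting moves a repeated ID to the "newest" end.
        merged.pop(item_id, None)
        merged[item_id] = True
    return list(merged)[-limit:]


cursor = {"lastRunAt": 0, "lastSeenIds": ["a", "b", "c"]}
cursor["lastSeenIds"] = merge_seen_ids(cursor["lastSeenIds"], ["c", "d"], limit=3)
print(cursor["lastSeenIds"])  # ['b', 'c', 'd']
```

Merging matters when a run returns few items: replacing the list outright would forget IDs that the overlap window can still re-deliver.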
Wire it into your actor
Load the cursor → fetch only what's newer than lastRunAt → dedup against lastSeenIds → push the new items → save the cursor:
import { Actor, log } from 'apify';
import { loadCursor, saveCursor } from './cursor.js';
import { fetchItems } from './source.js';

await Actor.init();

const cursor = await loadCursor();
const now = Date.now();

// Look back to the last run plus a 60-second overlap, capped at 7 days.
// With no cursor yet (first run), look back one day.
const sinceMs = cursor.lastRunAt
    ? Math.min(now - cursor.lastRunAt + 60_000, 7 * 86_400_000)
    : 86_400_000;
const items = await fetchItems({ since: now - sinceMs });

// Dedup: skip anything we've already pushed in a previous run.
const previouslySeen = new Set(cursor.lastSeenIds);
const newItems = items.filter((item) => !previouslySeen.has(item.id));

for (const item of newItems) {
    await Actor.pushData(item);
}
log.info(`Pushed ${newItems.length} new items (skipped ${items.length - newItems.length} dupes)`);

// Save the cursor only after every push succeeded, so a failed run retries the same window.
// Keep the 200 most recent IDs to bound storage.
await saveCursor({
    lastRunAt: now,
    lastSeenIds: items.slice(-200).map((i) => i.id),
});

await Actor.exit();

import time

from apify import Actor

from .cursor import load_cursor, save_cursor
from .source import fetch_items


async def main() -> None:
    async with Actor:
        cursor = await load_cursor()
        now_ms = int(time.time() * 1000)

        # Look back to the last run plus a 60-second overlap, capped at 7 days.
        # With no cursor yet (first run), look back one day.
        if cursor['lastRunAt']:
            since_ms = min(now_ms - cursor['lastRunAt'] + 60_000, 7 * 86_400_000)
        else:
            since_ms = 86_400_000
        items = await fetch_items(since=now_ms - since_ms)

        # Dedup: skip anything we've already pushed in a previous run.
        previously_seen = set(cursor['lastSeenIds'])
        new_items = [item for item in items if item['id'] not in previously_seen]

        for item in new_items:
            await Actor.push_data(item)
        Actor.log.info(
            f"Pushed {len(new_items)} new items "
            f"(skipped {len(items) - len(new_items)} dupes)"
        )

        # Save the cursor only after every push succeeded.
        # Keep the 200 most recent IDs to bound storage.
        await save_cursor({
            'lastRunAt': now_ms,
            'lastSeenIds': [i['id'] for i in items[-200:]],
        })
Three details worth calling out:
- The 60-second overlap. now - cursor.lastRunAt + 60_000 re-fetches one extra minute of data on every run. This is your safety net for items that arrive at the source slightly after their createdAt.
- The 7-day cap. If the cursor is very old (you paused the schedule for a month), don't try to backfill the entire gap in one run. Cap the lookback so the run completes inside the actor timeout, then let subsequent runs catch up naturally. A missing cursor (the first run) falls back to a one-day lookback instead.
- Dedup first, push second. Filtering before pushData is what keeps the dataset clean. The lastSeenIds list is bounded to 200; raise it if your throughput justifies it.
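To make the overlap and the cap concrete, the lookback arithmetic can be pulled out into a pure function and checked with a few numbers. A small sketch; lookback_ms and its parameter names are illustrative, and the defaults mirror the constants used in the actor code above:

```python
MINUTE_MS = 60_000
DAY_MS = 86_400_000


def lookback_ms(now_ms: int, last_run_at: int,
                overlap_ms: int = MINUTE_MS,
                cap_ms: int = 7 * DAY_MS,
                first_run_ms: int = DAY_MS) -> int:
    """Return how far back (in ms) the next fetch should reach."""
    if not last_run_at:
        # No cursor yet: first run, use the fixed default window.
        return first_run_ms
    # Time since the last run plus the overlap, never more than the cap.
    return min(now_ms - last_run_at + overlap_ms, cap_ms)


now = 1_700_000_000_000
print(lookback_ms(now, now - DAY_MS))       # 86460000  (one day + 60 s overlap)
print(lookback_ms(now, now - 30 * DAY_MS))  # 604800000 (capped at 7 days)
print(lookback_ms(now, 0))                  # 86400000  (first run: one day)
```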
Cron in UTC
Apify schedules run in UTC. There is no per-schedule timezone setting. Plan your cron expression on crontab.guru and then mentally subtract your offset.
Example: you want the actor to run at 06:00 PT every day.
- 06:00 PT = 13:00 UTC (during PDT) / 14:00 UTC (during PST).
- A 0 13 * * * schedule will fire at 06:00 PT in summer and 05:00 PT in winter.
If your users are sensitive to the local clock and you live in a country that does DST, you have two options: pick a UTC time you're happy to drift by an hour twice a year, or run the schedule hourly and have the actor itself check whether it's the “right” local hour to do work.
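The second option (an hourly schedule plus a local-hour check inside the actor) might look like the sketch below. It assumes an hourly cron such as 0 * * * *, a target of 06:00 in America/Los_Angeles, and that the IANA timezone database is available to Python's zoneinfo module; is_target_local_hour is an illustrative helper name.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def is_target_local_hour(now_utc: datetime, tz_name: str, target_hour: int) -> bool:
    """True when the given UTC instant falls in `target_hour` of the local day."""
    local = now_utc.astimezone(ZoneInfo(tz_name))
    return local.hour == target_hour


summer = datetime(2024, 7, 1, 13, 0, tzinfo=timezone.utc)   # 06:00 PDT
winter = datetime(2024, 1, 15, 14, 0, tzinfo=timezone.utc)  # 06:00 PST
too_early = datetime(2024, 1, 15, 13, 0, tzinfo=timezone.utc)  # 05:00 PST

print(is_target_local_hour(summer, "America/Los_Angeles", 6))     # True
print(is_target_local_hour(winter, "America/Los_Angeles", 6))     # True
print(is_target_local_hour(too_early, "America/Los_Angeles", 6))  # False
```

In the actor, a False result would mean exiting right away, so the 23 off-hour runs per day cost only their startup time.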
Overlapping runs
Apify will start a new scheduled run on every cron tick — even if the previous run hasn't finished. For a daily schedule that's fine. For a 5-minute schedule against a slow source, you can end up with several copies running at once, all racing on the same cursor.
Two ways to defend against this:
- maxConcurrentRuns on the schedule. Set it to 1 so the platform refuses to start a new run while the previous one is still going. Available on most paid plans.
- A KV-store lock as a fallback. At the top of main, write { runId, startedAt } to a lock key. If the key already exists and its startedAt is fresh enough, exit immediately. Clear the key in a finally block. This works on any plan, but you have to handle stale locks (a previous crash) yourself, usually with a rule like “the lock is older than the actor's max run duration, so ignore it.”
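The KV-store lock described above could be sketched as follows. To keep the example runnable on its own, it uses an in-memory stand-in that only mimics the async get_value and set_value methods of a key-value store; the lock key name, the helper names, and the convention that writing None clears the record are assumptions. The read-then-write is not atomic, so two runs starting within the same instant can still both acquire the lock.

```python
import asyncio
import time
from typing import Optional

LOCK_KEY = "lock"  # hypothetical key name


class InMemoryStore:
    """Stand-in for a key-value store with async get_value/set_value."""

    def __init__(self) -> None:
        self._data = {}

    async def get_value(self, key: str):
        return self._data.get(key)

    async def set_value(self, key: str, value) -> None:
        self._data[key] = value


async def try_acquire_lock(store, run_id: str, max_run_ms: int,
                           now_ms: Optional[int] = None) -> bool:
    """Take the lock unless a fresh one exists; stale locks are overwritten."""
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    lock = await store.get_value(LOCK_KEY)
    if lock and now_ms - lock["startedAt"] < max_run_ms:
        return False  # fresh lock: another run is presumably still active
    await store.set_value(LOCK_KEY, {"runId": run_id, "startedAt": now_ms})
    return True


async def release_lock(store, run_id: str) -> None:
    """Clear the lock, but only if this run still owns it."""
    lock = await store.get_value(LOCK_KEY)
    if lock and lock["runId"] == run_id:
        await store.set_value(LOCK_KEY, None)


async def demo() -> list:
    store = InMemoryStore()
    hour = 3_600_000
    results = [
        await try_acquire_lock(store, "run-1", hour, now_ms=0),         # acquired
        await try_acquire_lock(store, "run-2", hour, now_ms=60_000),    # blocked
        await try_acquire_lock(store, "run-3", hour, now_ms=2 * hour),  # stale lock taken over
    ]
    await release_lock(store, "run-3")
    results.append(await try_acquire_lock(store, "run-4", hour, now_ms=2 * hour + 1))
    return results


results = asyncio.run(demo())
print(results)  # [True, False, True, True]
```

In a real actor, the work would sit in a try block after a successful try_acquire_lock, with release_lock in the matching finally.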
Gotchas worth knowing
- Cron is UTC. A 0 9 * * * schedule fires at 09:00 UTC, which is 01:00 PST or 04:00 EST (02:00 PDT or 05:00 EDT). Use crontab.guru to plan, then mentally subtract your offset.
- Missed runs don't backfill. If the platform is down for an hour and your cron tick falls inside the gap, you simply miss that run. The cursor pattern handles this: the next run picks up the wider window automatically.
- Schedules can overlap. A 5-minute schedule firing while the previous run still hasn't finished will start a second run by default. Set maxConcurrentRuns: 1 on the schedule, or guard with a KV-store lock if your platform tier doesn't support it.
- Failures don't update the cursor. Save the cursor only on success; that way a failed run lets the next attempt re-process the same window.
- State is per actor, as long as the store name is unique to it. The named KV store schedule-cursor is shared across all runs of this actor, which is exactly what you want. Named stores are looked up by name within your account, so two actors that both open schedule-cursor would share one cursor. It's not per-user: scheduled runs don't have a user-tier concept the way a manual run does.
- A 200-ID dedup window may not be enough. Tune it to your throughput; for high-volume actors, store hashes or use a Bloom filter.
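For the high-volume case in the last gotcha, a Bloom filter keeps the dedup state at a fixed size no matter how many IDs it has seen. The sketch below is a minimal standard-library version, not a tuned implementation: the sizes are arbitrary, size_bits is assumed to be a multiple of 8, and the to_json layout is a hypothetical format for storing the filter next to the cursor. A Bloom filter never forgets an ID it has seen, but it can report a false positive, which here would mean a genuinely new item gets skipped.

```python
import base64
import hashlib
from typing import Iterator, Optional


class BloomFilter:
    """Fixed-size set sketch: no false negatives, some false positives."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 4,
                 data: Optional[bytes] = None) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(data) if data is not None else bytearray(size_bits // 8)

    def _positions(self, item_id: str) -> Iterator[int]:
        # Derive num_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item_id}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item_id: str) -> None:
        for pos in self._positions(item_id):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item_id: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item_id))

    def to_json(self) -> dict:
        """JSON-safe form, suitable for saving in a key-value store."""
        return {
            "sizeBits": self.size_bits,
            "numHashes": self.num_hashes,
            "bits": base64.b64encode(bytes(self.bits)).decode("ascii"),
        }

    @classmethod
    def from_json(cls, payload: dict) -> "BloomFilter":
        return cls(payload["sizeBits"], payload["numHashes"],
                   base64.b64decode(payload["bits"]))


seen = BloomFilter()
for item_id in ("sku-1", "sku-2", "sku-3"):
    seen.add(item_id)

# Round-trip through the JSON form, as a save/load between runs would.
restored = BloomFilter.from_json(seen.to_json())
print(all(i in restored for i in ("sku-1", "sku-2", "sku-3")))  # True
```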
Where to go next
- Cache results across runs — the companion pattern: when the source hasn't changed, return last run's data instead of re-fetching.
- How to tell if an Apify user is paying — handy if your scheduled runs serve different users on different tiers.
- Apify Pricing Calculator — work out what a daily schedule actually costs before you ship it.