5 Apify run-log patterns that make production debugging 10× faster


I run 78 Apify actors (31 published in the public Store, 47 private). The Trustpilot scraper alone has 951 lifetime runs across 3 paying users; the Reddit scraper has 80+; the email extractor has 107. Most of them work fine until they don’t, and when they don’t, you get a Slack ping at 04:12 UTC that says “Trustpilot run #523 failed” and a link to a 14 MB log file.

If your logs look like every other Node.js process — console.log("processing item") over and over — finding the actual failure inside that 14 MB takes 30 minutes of grepping. With the 5 patterns below, the same failure takes 30 seconds.

These are the patterns I wish I’d put into every actor on day one. None of them require a new dependency. All of them survive the move from “one-off scraper” to “actor with paying users.”


Pattern 1: Tag-prefix every log line so grep narrows in one pass

The default console.log("retrying") is useless across 50,000 lines. The fix is a fixed vocabulary of square-bracket tags, one per category of event.

Before:

console.log("retrying after 429");
console.log("got 429 again");
console.log("backoff 4s");
console.log("retrying");

After:

const log = (tag, msg, extra = {}) =>
  console.log(`[${tag}] ${msg} ${JSON.stringify(extra)}`);

log('RETRY', 'http_429', { url, attempt: 2, backoff_ms: 4000 });
log('SOFT-BLOCK', 'cloudflare_challenge', { url, status: 200, body_marker: 'cf-challenge' });
log('DEDUP-SKIP', 'already_in_dataset', { unique_key: review.id });

Now your post-mortem grep is one line:

apify call ... | grep '\[SOFT-BLOCK\]'

The vocabulary I use across all 78 actors: [INIT], [RETRY], [SOFT-BLOCK], [DEDUP-SKIP], [PAGINATION], [PROXY], [OUT], [FATAL], [SUMMARY]. Pick yours and stick to it — half the value of the pattern is consistency across actors so grep behaves the same on every run.
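One way to keep the vocabulary consistent is to have the logger reject tags that are not on the list. This is a minimal sketch, not part of any Apify API: makeLogger and its sink parameter are hypothetical names, and the tag set simply mirrors the list above (extend it with any tags of your own, such as PROGRESS).

```javascript
// Hypothetical helper: TAGS mirrors the vocabulary listed above.
const TAGS = new Set([
  'INIT', 'RETRY', 'SOFT-BLOCK', 'DEDUP-SKIP', 'PAGINATION',
  'PROXY', 'OUT', 'FATAL', 'SUMMARY',
]);

// `sink` is an assumed injection point so the logger can be tested
// without writing to stdout; it defaults to console.log.
function makeLogger(sink = console.log) {
  return (tag, msg, extra = {}) => {
    if (!TAGS.has(tag)) {
      // Fail loudly so a typo like 'RETRIES' is caught in development.
      throw new Error(`Unknown log tag: ${tag}`);
    }
    sink(`[${tag}] ${msg} ${JSON.stringify(extra)}`);
  };
}

const log = makeLogger();
log('RETRY', 'http_429', { attempt: 2, backoff_ms: 4000 });
```

Throwing on an unknown tag is a design choice that suits local runs and CI; in production you may prefer to downgrade it to a warning so a logging typo never kills a paid run.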

Pattern 2: Log structured JSON, not free-form strings

Once you have tags, take the next step and put the actual data in JSON. Apify’s run log is plain text, but jq works fine once you strip the prefix, provided every line has the shape [TAG] event {json}.

Before:

console.log(`processed ${count} items, took ${elapsed}ms`);

After:

log('PROGRESS', 'page_done', {
  page: pageNum,
  items_extracted: count,
  ms: elapsed,
  proxy_country: proxy.country,
});

After the run, query across all pages:

apify run-logs ... | grep '\[PROGRESS\]' | sed 's/.*\[PROGRESS\] page_done //' \
  | jq -s 'map(.ms) | {avg: (add/length), max: max, min: min}'

That’s { "avg": 1842, "max": 9314, "min": 412 } — and the max: 9314ms is your slowest page. Without structured logs you’d have to grep + awk + manually parse the string. With them, one jq answers the question.

The cost of this pattern is roughly 8 extra characters per log line. The benefit is that every log line is queryable forever, including across runs (more on that in Pattern 5).
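If you would rather run the same aggregation in Node than in a sed + jq pipeline, the parsing is only a few lines. The sketch below assumes the [TAG] event {json} line shape from the log helper; parseLogLine and pageTimingStats are illustrative names, and the sample lines (including the timestamp prefix) are made up.

```javascript
// Parse one line of the form "<any prefix> [TAG] event {json}".
// Returns null for lines that do not match, so free-form output is skipped.
function parseLogLine(line) {
  const match = line.match(/\[([A-Z-]+)\] (\S+) (\{.*\})\s*$/);
  if (!match) return null;
  try {
    return { tag: match[1], event: match[2], data: JSON.parse(match[3]) };
  } catch {
    return null; // truncated or malformed JSON payload
  }
}

// Average, max and min of the `ms` field across all page_done events.
function pageTimingStats(logText) {
  const ms = logText
    .split('\n')
    .map(parseLogLine)
    .filter((e) => e && e.tag === 'PROGRESS' && e.event === 'page_done')
    .map((e) => e.data.ms);
  if (ms.length === 0) return null;
  return {
    avg: ms.reduce((a, b) => a + b, 0) / ms.length,
    max: Math.max(...ms),
    min: Math.min(...ms),
  };
}

// Made-up sample input for illustration only.
const sample = [
  '2024-01-01T00:00:00.000Z [PROGRESS] page_done {"page":1,"items_extracted":20,"ms":400}',
  '2024-01-01T00:00:02.000Z [PROGRESS] page_done {"page":2,"items_extracted":20,"ms":600}',
  'some free-form line that should be ignored',
].join('\n');
console.log(pageTimingStats(sample));
```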

Pattern 3: Log every pagination boundary so resumability is free

When a 6-hour scraper dies at hour 4, the question is: which page do I restart from? If you only log “processing item 12,847 of unknown” you have no idea. If you log every pagination boundary, restart is a one-line decision.

Before:

for (let page = 1; ; page++) {
  const items = await fetchPage(page);
  if (items.length === 0) break;
  await Actor.pushData(items);
}

After:

let lastCursor = null; // ID of the last item pushed so far
for (let page = 1; ; page++) {
  log('PAGINATION', 'page_start', { page, cursor: lastCursor });
  const items = await fetchPage(page);
  if (items.length === 0) {
    log('PAGINATION', 'page_empty_terminating', { page });
    break;
  }
  await Actor.pushData(items);
  lastCursor = items[items.length - 1].id;
  log('PAGINATION', 'page_done', { page, items: items.length, cursor: lastCursor });
}

Now if the run dies at hour 4, you grep '\[PAGINATION\]' | tail -5 and see the last successful boundary. Restart input becomes { "start_page": 247, "start_cursor": "abc..." }. You don’t lose 4 hours of work, and customers don’t see the gap.

I missed this on the Reddit scraper for the first 30 runs. When run #31 ghosted at 4am on a Saturday, restart took 90 minutes of manual cursor reconstruction. After this pattern landed, restart was 90 seconds.
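The restart decision can also be scripted. This is a sketch under assumptions: it expects page_done lines shaped like the ones logged above, restartInputFromLog is a hypothetical helper, and the actor still has to read start_page and start_cursor from its input on its own.

```javascript
const MARKER = '[PAGINATION] page_done ';

// Scan a run log and build restart input from the last completed page.
function restartInputFromLog(logText) {
  let last = null;
  for (const line of logText.split('\n')) {
    const idx = line.indexOf(MARKER);
    if (idx === -1) continue;
    try {
      last = JSON.parse(line.slice(idx + MARKER.length));
    } catch {
      // Ignore a line that was cut off mid-write when the run died.
    }
  }
  if (!last) return { start_page: 1, start_cursor: null };
  return { start_page: last.page + 1, start_cursor: last.cursor };
}

// Made-up log tail: page 246 finished, page 247 started, then the run died.
const logTail = [
  '[PAGINATION] page_start {"page":246,"cursor":"aaa"}',
  '[PAGINATION] page_done {"page":246,"items":20,"cursor":"abc"}',
  '[PAGINATION] page_start {"page":247,"cursor":"abc"}',
  '[PAGINATION] page_done {"page":247,"ite',
].join('\n');
console.log(restartInputFromLog(logTail));
```

Only a fully written page_done line counts as a completed boundary, so a page whose log line was truncated is fetched again on restart rather than silently skipped.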

Pattern 4: Log proxy IP per request — auditable rotation, free debugging

When a customer says “my run got soft-blocked,” the first question is “did your proxy actually rotate?” Without per-request proxy logging, the answer is “I think so.” With it, you can prove rotation happened — or discover that you used the same IP for 400 requests because rotation was misconfigured.

Before:

const response = await gotScraping({ url, proxyUrl: await proxyConfig.newUrl() });

After:

const proxyUrl = await proxyConfig.newUrl();
const proxyHost = new URL(proxyUrl).hostname;
log('PROXY', 'request', { url, proxy_host: proxyHost, session: sessionId });
const response = await gotScraping({ url, proxyUrl });
log('PROXY', 'response', { url, proxy_host: proxyHost, status: response.statusCode, ms: response.timings.phases.total });

Two queries unlock:

# Did rotation actually happen? (sed strips everything before the JSON payload)
apify run-logs ... | grep '\[PROXY\] request' | sed 's/^[^{]*//' \
  | jq -r '.proxy_host' | sort -u | wc -l
# 47   <-- 47 unique proxy hosts across the run, rotation works

# Which proxy host saw the most 403s?
apify run-logs ... | grep '\[PROXY\] response' | sed 's/^[^{]*//' \
  | jq -s 'map(select(.status==403)) | group_by(.proxy_host)
      | map({host: .[0].proxy_host, count: length}) | sort_by(.count) | reverse | .[0:5]'

That second query has saved me hours on Trustpilot specifically — 1 proxy host out of the residential pool was getting 80% of the 403s, and the answer was “rotate that subnet out, not the whole pool.”
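The same two questions can be answered in a single pass in Node. This is a sketch that assumes the [PROXY] request and [PROXY] response line shapes logged above; proxyReport is an illustrative name and the sample hosts are invented.

```javascript
// One pass over the log: count unique proxy hosts and rank hosts by 403s.
function proxyReport(logText) {
  const hosts = new Set();
  const forbiddenByHost = new Map();
  for (const line of logText.split('\n')) {
    const m = line.match(/\[PROXY\] (request|response) (\{.*\})\s*$/);
    if (!m) continue;
    let data;
    try {
      data = JSON.parse(m[2]);
    } catch {
      continue; // skip truncated lines
    }
    if (m[1] === 'request') hosts.add(data.proxy_host);
    if (m[1] === 'response' && data.status === 403) {
      const prev = forbiddenByHost.get(data.proxy_host) || 0;
      forbiddenByHost.set(data.proxy_host, prev + 1);
    }
  }
  const worst = [...forbiddenByHost.entries()]
    .map(([host, count]) => ({ host, count }))
    .sort((a, b) => b.count - a.count)
    .slice(0, 5);
  return { unique_hosts: hosts.size, worst_403_hosts: worst };
}

// Invented sample: two hosts, one of which collects every 403.
const sampleLog = [
  '[PROXY] request {"url":"https://example.com/1","proxy_host":"10.0.0.1"}',
  '[PROXY] response {"url":"https://example.com/1","proxy_host":"10.0.0.1","status":403}',
  '[PROXY] request {"url":"https://example.com/2","proxy_host":"10.0.0.2"}',
  '[PROXY] response {"url":"https://example.com/2","proxy_host":"10.0.0.2","status":200}',
  '[PROXY] request {"url":"https://example.com/3","proxy_host":"10.0.0.1"}',
  '[PROXY] response {"url":"https://example.com/3","proxy_host":"10.0.0.1","status":403}',
].join('\n');
console.log(JSON.stringify(proxyReport(sampleLog)));
```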

Pattern 5: Write a run-end summary to dataset-meta so jq works across runs

The four patterns above make a single run debuggable. This last one makes the fleet debuggable — you can answer “across the last 30 runs, how many soft-blocked?” without re-fetching 30 log files.

At end-of-run, write a single summary object to a known KV key (or to a small _meta dataset).

Before: no run-end log — you scroll back through console output to see what happened.

After:

const endTs = Date.now(); // capture once so ended_at and duration_ms agree
const summary = {
  run_id: process.env.APIFY_ACTOR_RUN_ID,
  started_at: startTs,
  ended_at: endTs,
  duration_ms: endTs - startTs,
  items_pushed: itemsCount,
  pages_scraped: pageNum,
  retries_total: retryCount,
  soft_blocks: softBlockCount,
  proxy_hosts_used: proxyHosts.size,
  fatal: null,
};
log('SUMMARY', 'run_complete', summary);
await Actor.setValue('RUN_SUMMARY', summary);
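The fatal field only earns its place if the summary is also written when the run throws. One possible shape is a small wrapper around the scraper body. Everything here is an assumption for illustration: runWithSummary, its work/save/now parameters, and the run_failed event name are hypothetical. In an actor, save could be a function that calls Actor.setValue('RUN_SUMMARY', summary).

```javascript
// Hypothetical wrapper: `work` is the scraper body and resolves to counters,
// `save` persists the summary, `now` is injectable for testing.
async function runWithSummary(work, save, now = Date.now) {
  const summary = { started_at: now(), fatal: null };
  try {
    Object.assign(summary, await work());
  } catch (err) {
    summary.fatal = err instanceof Error ? err.message : String(err);
    throw err; // re-throw so the run is still reported as failed
  } finally {
    summary.ended_at = now();
    summary.duration_ms = summary.ended_at - summary.started_at;
    const event = summary.fatal ? 'run_failed' : 'run_complete';
    console.log(`[SUMMARY] ${event} ${JSON.stringify(summary)}`);
    await save(summary); // runs on success and on failure
  }
  return summary;
}

// Usage sketch with an in-memory store standing in for the KV store.
const store = [];
runWithSummary(
  async () => ({ items_pushed: 3, soft_blocks: 0 }),
  async (s) => { store.push(s); },
).then((s) => console.log('saved summary for', s.items_pushed, 'items'));
```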

Now across the fleet:

# Last 30 runs, how often did soft-blocks happen?
apify task runs ... --limit 30 --json | \
  jq -r '.items[].id' | \
  while read -r id; do apify run kv-store-record get "$id" RUN_SUMMARY; done | \
  jq -s 'map(.soft_blocks) | {total: add, runs: length, avg: (add/length), max: max}'

Output: { "total": 47, "runs": 30, "avg": 1.57, "max": 18 } (avg rounded). Now you know that one run hit 18 soft-blocks, and you can find it and drill into that one specifically. Without this pattern, that question costs 30 minutes of clicking through the Apify UI.

I added this to the Trustpilot scraper after run #500 specifically. Two weeks later it caught a regression: a new Cloudflare rule on Trustpilot bumped soft-blocks from ~1/run average to ~6/run average. The first sign was the soft_blocks field jumping in 3 consecutive run summaries — caught the trend in 24 hours instead of 5 days of customer complaints.
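That kind of trend check is easy to automate once the summaries exist. This is a minimal sketch, assuming an array of RUN_SUMMARY objects ordered oldest to newest; softBlockRegression, threshold, and streak are illustrative names and defaults, not values taken from the Trustpilot actor.

```javascript
// Flag a regression when soft_blocks exceeds `threshold` in `streak`
// consecutive runs. Summaries are assumed to be ordered oldest to newest.
function softBlockRegression(summaries, { threshold = 3, streak = 3 } = {}) {
  let consecutive = 0;
  for (const s of summaries) {
    consecutive = s.soft_blocks > threshold ? consecutive + 1 : 0;
    if (consecutive >= streak) {
      return { regressed: true, detected_at: s.run_id };
    }
  }
  return { regressed: false, detected_at: null };
}

// Made-up history: about 1 soft-block per run, then a jump to about 6.
const history = [
  { run_id: 'r1', soft_blocks: 1 },
  { run_id: 'r2', soft_blocks: 0 },
  { run_id: 'r3', soft_blocks: 2 },
  { run_id: 'r4', soft_blocks: 6 },
  { run_id: 'r5', soft_blocks: 7 },
  { run_id: 'r6', soft_blocks: 5 },
];
console.log(softBlockRegression(history));
```

Requiring a streak rather than a single spike keeps one noisy run (like the max of 18 above) from paging anyone, at the cost of detecting a real regression a couple of runs later.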


Putting it together

Adopt all 5 and your post-mortems shrink from “30 minutes of grep” to a few one-liners:

Question                               Command
What broke this run?                   grep '\[FATAL\]'
Where do I restart from?               grep '\[PAGINATION\]' | tail -5
Did proxy rotation work?               grep '\[PROXY\] request' | sed 's/^[^{]*//' | jq -r .proxy_host | sort -u | wc -l
Which proxies got 403’d?               grep '\[PROXY\] response' | sed 's/^[^{]*//' | jq 'select(.status==403)'
How does this run compare to last 30?  ... | jq RUN_SUMMARY

These patterns add maybe 40 lines of code to an actor and pay for themselves the first time a run dies at 04:00 UTC.


If you’re running production scrapers on Apify and want a second pair of eyes on log hygiene + restart strategy + proxy auditability — I do paid case-study writeups and operational reviews. Email spinov001@gmail.com or browse my Apify Store (78 actors, 1,000+ paying-user runs). More write-ups at blog.spinov.online and @scraping_ai.