Description drift in serverless function catalogs — a monthly refresh playbook


If you run a catalog of serverless functions — Apify actors, Lambda, Cloud Run, Modal, Replicate — every description field starts rotting the moment you publish it. Run counts climb. User patterns change. The example use cases you cited at launch turn out to be the wrong ones. The catalog page that drove 60% of your inbound discovery six months ago is now an artifact of the past, quietly under-selling whatever you currently ship.

I run 32 public actors on the Apify Store. They’ve accumulated 2190 lifetime runs across the portfolio. Last month I noticed something embarrassing: a description that said “15 runs” was sitting in front of an actor that had quietly crossed 26. A description that said “no paying users yet” was on an actor with one. Numbers I’d written to set baselines had become millstones. Worse, the ICPs I’d guessed at launch were wrong by half: actors I’d positioned as “developer-tooling” were getting most of their traction from procurement teams, and vice versa.

This post is the 30-minute monthly refresh playbook I now run. It’s three steps: detect, decide, ship. The code is inline with each step.

Step 1 — Detect what’s stale

The detection step is dumb on purpose. You don’t need a dashboard. You need a single query that surfaces actors where the run count on the live description doesn’t match the run count in the metrics API.

For Apify, it takes one REST call to list your actors, then one call per actor to fetch its description and stats:

import os
import re

import requests

API = "https://api.apify.com/v2"
TOKEN = os.environ["APIFY_TOKEN"]
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

def actors_with_stale_descriptions(user_id: str):
    # First call: list the actors in the catalog (first 100 only, no pagination).
    resp = requests.get(
        f"{API}/users/{user_id}/actors",
        headers=HEADERS,
        params={"limit": 100, "offset": 0},
        timeout=30,
    )
    resp.raise_for_status()
    actors = resp.json()["data"]["items"]

    stale = []
    for actor in actors:
        # Second call, once per actor: fetch the live description and stats.
        resp = requests.get(f"{API}/acts/{actor['id']}", headers=HEADERS, timeout=30)
        resp.raise_for_status()
        meta = resp.json()["data"]
        desc = meta.get("description") or ""
        live_runs = meta.get("stats", {}).get("totalRuns", 0)
        # Pull the first "N runs" figure quoted in the description, if any.
        m = re.search(r"(\d+)\s+runs?", desc)
        if not m:
            continue
        described_runs = int(m.group(1))
        # Flag anything 5 or more runs off the live count.
        if abs(live_runs - described_runs) >= 5:
            stale.append((actor["name"], described_runs, live_runs))
    return stale

Run this once a month against your catalog. Anything where the described run count is 5+ off the live count goes in the refresh queue. For my portfolio, last week it surfaced five actors: walmart-reviews (described 7, live 18), threads-scraper (23 → 40), country-info-scraper (19 → 30), exchange-rate-scraper (18 → 29), json-validator-formatter (15 → 26).
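
The comparison itself needs no network access, so it can be pulled out and checked offline before pointing it at a real catalog. A minimal sketch, assuming each record is a (name, description, live_runs) tuple; find_stale, described_run_count, and that record shape are illustrative names of mine, not part of the Apify API:

```python
import re

# Same pattern as the detection query, with a word boundary so "runs" must end the word.
RUNS_PATTERN = re.compile(r"(\d+)\s+runs?\b")

def described_run_count(description):
    """Return the first 'N runs' figure quoted in a description, or None."""
    match = RUNS_PATTERN.search(description or "")
    return int(match.group(1)) if match else None

def find_stale(records, threshold=5):
    """records: iterable of (name, description, live_runs) tuples (assumed shape)."""
    stale = []
    for name, description, live_runs in records:
        described = described_run_count(description)
        if described is None:
            continue  # no run count quoted, so nothing to drift
        if abs(live_runs - described) >= threshold:
            stale.append((name, described, live_runs))
    return stale
```

One limitation worth knowing: a figure written with a thousands separator, such as "2,190 runs", is read as 190, so either write counts without separators or extend the pattern.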

Step 2 — Decide what new signal to surface

This is the step where most catalogs go wrong. You can’t just bump the run count and call it done. The reason description drift matters at all is that the wrong number is anchored to the wrong ICP.

Three signals I now decide on per actor before drafting the new description:

1. Recent paying-user count, not lifetime run count. Lifetime runs reward the past. The number that matters for a prospect skimming your catalog is “is anyone paying for this right now?” Apify exposes stats.userActors7DaysWithRunCount and stats.userActors30DaysWithRunCount. If the 7-day count is at least 1, your description should say so; that’s a stronger signal than a four-figure lifetime number.

2. ICP, not feature. The first version of my walmart-reviews-scraper description led with “scrapes Walmart product reviews.” Useful, but generic. The refresh leads with the actual ICP I’d inferred from looking at runs: “consumer-reviews monitoring for retail brand teams.” That’s the audience that converts. Feature-led descriptions invite a bidding war against every other scraper. ICP-led descriptions invite a single buyer with a budget.

3. Cross-portfolio anchor. Every refreshed description now ends with the same two lines: “Production scraping tips: t.me/scraping_ai” and “Engineering notes: blog.spinov.online.” The Apify Store has surprisingly little internal navigation — most readers see one actor page and leave. The footer is the portfolio’s only chance to compound discovery across the catalog.
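
The first signal can be reduced to a small fallback chain: prefer the 7-day count, then the 30-day count, then lifetime runs. A rough sketch, assuming the stats dict uses the two field names quoted above plus totalRuns; the function name and the exact wording of each phrase are my own illustration:

```python
def recency_signal(stats):
    """Pick the strongest recency phrase available in an actor's stats dict."""
    u7 = stats.get("userActors7DaysWithRunCount", 0)    # field name as quoted in the post
    u30 = stats.get("userActors30DaysWithRunCount", 0)  # field name as quoted in the post
    if u7 >= 1:
        return f"{u7} active user{'s' if u7 != 1 else ''} in the last 7 days"
    if u30 >= 1:
        return f"{u30} active user{'s' if u30 != 1 else ''} in the last 30 days"
    total = stats.get("totalRuns", 0)
    # Phrased as "N runs ..." so a run-count drift check can still find the number.
    return f"{total} runs to date" if total else ""
```

Note that these counts measure users who ran the actor, which is a proxy for paying users rather than the same thing; swap in a billing figure if you have one.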

Step 3 — Ship, verify, log

The actual write is one PUT call. The trap is forgetting that you have to verify it landed. Apify’s PUT returns 200 even when the description has been silently truncated past the 300-character limit (ask me how I know). The verify pattern:

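def refresh_description(actor_id: str, new_desc: str):
    # Fail locally rather than let the platform truncate silently.
    if len(new_desc) > 300:
        raise ValueError(f"description exceeds 300 chars: {len(new_desc)}")
    requests.put(
        f"{API}/acts/{actor_id}",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"description": new_desc},
        timeout=30,
    ).raise_for_status()
    # Verify: read the description back and compare it to what was sent.
    resp = requests.get(
        f"{API}/acts/{actor_id}",
        headers={"Authorization": f"Bearer {TOKEN}"},
        timeout=30,
    )
    resp.raise_for_status()
    live = resp.json()["data"]["description"]
    # Raise instead of assert so the check still runs under python -O.
    if live != new_desc:
        raise RuntimeError("description silently truncated or rejected")
    return live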

The verify GET is the half of the function that catches the silent failures. I learned this the hard way after a cross-post pipeline that “succeeded” with HTTP 200 for ten days while serving 404 pages downstream. Write paths that don’t verify on the read side are write paths that lie to you.

Log every refresh somewhere stable. I keep a flat-file tracker (daily_publish_tracker.md) that gets one line per PUT — timestamp, actor name, what changed, char count, whether the GET-verify matched. Six months in, the tracker has caught two silent rollbacks where Apify reverted my edits during what looked like internal store re-indexing.
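
A tracker entry along those lines could be produced like this. It is only a sketch: the pipe-separated layout, the function names, and the "verified"/"MISMATCH" labels are assumptions of mine, not the actual format of daily_publish_tracker.md.

```python
from datetime import datetime, timezone

def tracker_line(actor_name, change, description, verified, now=None):
    """Format one tracker entry: timestamp, actor, change, char count, verify result."""
    now = now or datetime.now(timezone.utc)
    status = "verified" if verified else "MISMATCH"
    return (
        f"- {now:%Y-%m-%d %H:%M}Z | {actor_name} | {change} | "
        f"{len(description)} chars | {status}"
    )

def log_refresh(path, actor_name, change, description, verified):
    """Append one entry per PUT to the flat-file tracker."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(tracker_line(actor_name, change, description, verified) + "\n")
```

Keeping the formatting separate from the file write makes the line format easy to check without touching disk, and an append-only file means a later rollback shows up as a new MISMATCH entry instead of overwriting history.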

Four mistakes I made on the first pass

  1. Refreshing five descriptions in a single session. Looked productive. Was reckless — bulk metadata edits across a portfolio can trip platform anti-spam heuristics. Now I cap at three per day, none of them on the same ICP.

  2. Citing lifetime runs across the portfolio in every individual actor description. “Part of a 32-actor catalog with 2190 lifetime runs” reads like a brag and clogs the 300-character limit. The portfolio anchor belongs in the footer URLs, not the lead.

  3. Forgetting that the description is the first thing AI scrapers see. Half my organic catalog traffic now comes from agents indexing Apify Store for actor-discovery prompts. Descriptions written for human skimming get summarized into nothing; descriptions written with a structured one-line opener (“[ICP]: [outcome]. [Method]. [Recency signal].”) survive the summary.

  4. Not budgeting time for the next refresh. Drift is recurring. I now block 30 minutes on the first Monday of each month — same way you’d run a deps upgrade on a package.json. Without the recurring slot it slides to “I’ll get to it” and never happens.
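
Mistakes 2 and 3 can both be guarded against at draft time by assembling the description from the structured opener and the footer, then checking the length before anything is sent. A minimal sketch, assuming the description field accepts newlines and that the footer counts toward the 300-character limit; build_description and its parameters are hypothetical names:

```python
MAX_DESCRIPTION_CHARS = 300  # limit cited in this post

# The two footer lines quoted in Step 2.
FOOTER = (
    "Production scraping tips: t.me/scraping_ai\n"
    "Engineering notes: blog.spinov.online"
)

def build_description(icp, outcome, method, recency, footer=FOOTER):
    """Assemble '[ICP]: [outcome]. [Method]. [Recency signal].' plus the footer."""
    parts = [f"{icp}: {outcome}.", f"{method}."]
    if recency:
        parts.append(f"{recency}.")  # skip the slot when there is no signal to show
    description = " ".join(parts) + "\n" + footer
    if len(description) > MAX_DESCRIPTION_CHARS:
        raise ValueError(
            f"description is {len(description)} chars, limit is {MAX_DESCRIPTION_CHARS}"
        )
    return description
```

For example, build_description("Retail brand teams", "monitor consumer reviews", "Scrapes Walmart product reviews", "1 active user in the last 7 days") yields a one-line opener followed by the two footer lines, well inside the limit.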

The whole loop — detect, decide, ship — takes 20 to 40 minutes monthly for a 30-actor portfolio. The compound payoff is that prospects who land on any one actor page see current numbers, current ICP, and a path back to the rest of the catalog. That’s the only thing description fields are for.


Production scraping tips: t.me/scraping_ai
Apify portfolio: apify.com/knotless_cadence (32 public actors, 2190 lifetime runs)
Questions or custom scraper work: spinov001@gmail.com