What 250 runs of a Trustpilot scraper taught me about anti-bot patterns


Last week the Trustpilot scraper I publish on the Apify Store crossed 250 organic runs. No paid promotion, no Reddit posts, no affiliate push; users just found it by searching the store. That makes it a useful sample size to talk about what actually held up under production load and what didn't.

This post is the post-mortem. Numbers are real. The actor is public — you can read the source.

The architecture in one sentence

A Playwright-based actor with a small request queue, residential proxy rotation per session, and a normalized-output schema (reviewer, rating, body, language, published_at, verified).
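For concreteness, here is roughly what one output record looks like. This is a minimal sketch only: the field names come from the schema above, but the concrete types (rating as a 1-5 number, published_at as an ISO string, and so on) are my assumptions, not a copy of the actor's source.

```ts
// Sketch of one normalized review record, matching the fields listed above.
// The types are assumptions; the published actor's schema may differ.
interface TrustpilotReview {
  reviewer: string;      // display name of the review author
  rating: number;        // star rating, assumed to be 1-5
  body: string;          // full review text
  language: string;      // locale the review was served in, e.g. "en-US"
  published_at: string;  // assumed ISO 8601 timestamp
  verified: boolean;     // whether Trustpilot marks the review as verified
}
```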

What surprised me on the way to 250 runs

1. The slowest path was almost never the bot detection

I had built three layers of evasion (residential proxies, randomized fingerprints, human-like scroll patterns) expecting Trustpilot to push back hard. They didn’t, in any meaningful way. The error budget was eaten by something more boring:

  • Locale parsing. Trustpilot serves dates and rating widgets differently per locale. ~14% of my early failures were parser bugs, not blocks.
  • Pagination edge cases. A profile with exactly 20, 100, or 500 reviews would render slightly differently than one with 23. ~9% of failures.

If you’re tempted to start a scraper with a stealth library, start with parser tests instead. Detection costs you a request. Bad parsing costs you the entire dataset.
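To make that concrete, here is a minimal sketch of "parser tests first", assuming a hypothetical parseReviews(html) helper and a handful of saved fixture pages; the file names, the helper, and the expected counts are all illustrative, not taken from the actor.

```ts
// Fixture-driven parser tests using Node's built-in test runner.
// parseReviews() and the fixture files are hypothetical placeholders.
import { test } from 'node:test';
import assert from 'node:assert/strict';
import { readFileSync } from 'node:fs';
import { parseReviews } from './parser'; // hypothetical parser module

const fixtures = [
  { file: 'fixtures/profile-en-US.html', expected: 20 },
  { file: 'fixtures/profile-de-DE.html', expected: 20 },      // locale-specific date/rating markup
  { file: 'fixtures/profile-23-reviews.html', expected: 23 }, // awkward pagination count
];

for (const { file, expected } of fixtures) {
  test(`parses ${file}`, () => {
    const reviews = parseReviews(readFileSync(file, 'utf8'));
    assert.equal(reviews.length, expected);
    for (const r of reviews) {
      assert.ok(r.rating >= 1 && r.rating <= 5);
      assert.ok(!Number.isNaN(Date.parse(r.published_at))); // locale date bugs surface here
    }
  });
}
```

These run against static HTML, so they cost nothing per run and catch exactly the locale and pagination bugs described above.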

2. Residential proxies paid for themselves only at one threshold

Datacenter IPs worked perfectly for the first ~50 reviews of any given profile. After that, an invisible rate limit kicked in: not a CAPTCHA, just empty responses. Switching to residential at the 50-review mark dropped the empty-response rate from 22% to 0.4%.

For shorter profiles (under 50 reviews), datacenter was cheaper and identical in success rate. The actor now picks the proxy tier dynamically, based on the target profile's review count fetched in the first request.
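In code, that decision is small. Here is a sketch assuming the Apify SDK's proxy configuration; the threshold and the RESIDENTIAL group mirror what is described above, but the rest is illustrative rather than the actor's actual source.

```ts
// Pick the proxy tier from the profile's review count (fetched on the first,
// cheap request). Below the threshold, the default datacenter pool is enough;
// above it, the whole session goes through residential proxies.
import { Actor } from 'apify';

const RESIDENTIAL_THRESHOLD = 50;

async function proxyConfigurationFor(reviewCount: number) {
  if (reviewCount < RESIDENTIAL_THRESHOLD) {
    return Actor.createProxyConfiguration(); // datacenter: cheaper, same success rate here
  }
  return Actor.createProxyConfiguration({ groups: ['RESIDENTIAL'] });
}
```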

3. Pay-per-result pricing changed how users behaved

I started the actor on a flat per-run pricing model. ~40% of runs were people testing tiny profiles to see if it worked — high cost, low value. After switching to pay-per-result, users pulled larger datasets in fewer runs because the marginal cost of each review was visible.

Net effect: lower run count, higher revenue, happier users. The Apify Store team has been pushing PPR for good reason.

What the error log actually looks like

Across the most recent 30-day window:

Failure category                           Share of failed runs
Parser edge cases (locale, pagination)     41%
Proxy provider rate limits                 19%
Trustpilot HTML structure change           14%
User-supplied invalid URL                  12%
Out-of-memory on huge profiles              8%
Other                                       6%

Bot detection — the thing I expected to dominate — wasn’t even in the table.

What I would do differently if I started today

  1. Write parser tests first. Five fixture HTML files (one per locale, one with edge-case pagination) before a single Playwright call.
  2. Make memory limits explicit. Profiles with 5,000+ reviews need streaming output, not “build the array, return at the end” (see the sketch after this list). This bit me twice.
  3. Skip the stealth library entirely on the first version. Add it only after a real block, not preemptively.
  4. Default to pay-per-result pricing. It changes user behavior in a way that helps you both.
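On point 2, “streaming output” just means pushing each parsed page to the dataset as you go instead of accumulating one giant array. Here is a sketch assuming the Apify JS SDK; fetchPage and the Review type are hypothetical stand-ins for the actor's real Playwright code.

```ts
// Stream reviews to the dataset page by page so memory stays flat,
// even on profiles with 5,000+ reviews.
import { Actor } from 'apify';

// Hypothetical shape of a parsed review (see the schema sketch earlier).
type Review = { reviewer: string; rating: number; body: string; published_at: string };

async function streamAllReviews(fetchPage: (page: number) => Promise<Review[]>) {
  for (let page = 1; ; page++) {
    const reviews = await fetchPage(page); // one pagination page, parsed
    if (reviews.length === 0) break;       // no more pages
    await Actor.pushData(reviews);         // each batch leaves the process immediately
  }
}
```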

Want a custom scraper for a target this actor doesn’t cover?

Email spinov001@gmail.com with the target site, the data shape you need, and rough volume per month. Pilot rate is $100 for one delivery or $150 for a three-target series.

— Aleksei