Spinov · Web Scraping & AI Research

9 Keyless Health APIs: You Asked for v1, You Got v8

Jul 29, 2026
10 Keyless Statistics APIs: Never Delete the World Row

Jul 27, 2026
9 Keyless Currency APIs, and the Yen That Breaks Your Cents

Jul 26, 2026
I Gave 9 Free IP-Geolocation APIs One IP. 3 Pointed to a Farm in Kansas.

Jul 25, 2026
7 No-Key Weather APIs: Finding One Is Easy, the Units Are the Trap

Jul 24, 2026
11 Keyless Package-Registry APIs, and Why You Can't Pin `latest`

Jul 23, 2026
16 Keyless DNS and Cert APIs. Only 3 of 12 Answer for Themselves.

Jul 22, 2026
Your Bot Asked 5 Keyless Crypto APIs for UNI. None Said Which UNI.

Jul 21, 2026
16 Keyless Game APIs, 5 Incompatible Ways to Say "Empty"

Jul 20, 2026
Free Sports APIs: I Curl-Tested 15, 8 Need No Key

Jul 19, 2026
Your Dictionary API 404s on Real Words. Here Are 9 Keyless.

Jul 18, 2026
15 Keyless Transit APIs, and a Timestamp Their Own Docs Get Wrong

Jul 17, 2026
10 Free Art & Museum APIs With No Key (2026)

Jul 16, 2026
11 Free Space & Astronomy APIs With No Key (2026)

Jul 15, 2026
11 Free Music APIs With No Key or Signup (2026)

Jul 14, 2026
10 Free Facts, Jokes & Name APIs With No Key (2026)

Jul 13, 2026
9 Free Mock & Fake-Data APIs With No Key (2026)

Jul 12, 2026
8 Free Pop-Culture APIs With No Key (2026)

Jul 11, 2026
8 Free Research Paper APIs With No Key (2026)

Jul 10, 2026
8 Free Food & Nutrition APIs (No Key, Tested 2026)

Jul 9, 2026
10 Free Government APIs With No Key or Signup (2026)

Jul 7, 2026
9 Free Public Holiday & Time APIs With No Key (2026)

Jul 6, 2026
9 Free Company Data APIs With No Key or Signup (2026)

Jul 5, 2026
8 Free CVE & Vulnerability APIs With No Key (2026)

Jul 4, 2026
8 Free Geocoding APIs With No Key and No Signup (2026)

Jul 3, 2026
11 Free No-Key APIs Your AI Agent Can Use to Read the Web

Jul 2, 2026
Your LLM JSON Got Cut Off. Don't Just Raise max_tokens

Jul 1, 2026
Your Agent Success Rate Counts Only the Survivors

Jun 30, 2026
RAG Chunking: Overlap=0 Drops Facts on the Boundary

Jun 29, 2026
SSRF in AI Agents: Blocking 169.254 by String Isn't Enough

Jun 28, 2026
Caching LLM Calls: A Raw Prompt Key Almost Never Hits

Jun 27, 2026
You Can't Unit-Test an AI Agent. You Can Regression-Gate It.

Jun 26, 2026
Your Agent Trusts the Tool's Description. The Attack Hides There.

Jun 25, 2026
Your AI Agent Logged Its Own API Key. I Wrote the 40-Line Redactor.

Jun 24, 2026
Your RAG Answers Confidently. The Source Doesn't Say That.

Jun 23, 2026
The MCP Tool Your Agent Calls Changed Its Schema. It Didn't Notice.

Jun 22, 2026
Your AI Agent Scraped a Page. The Page Told It What to Do.

Jun 21, 2026
Your Agent Doesn't Run Out of Context. It Degrades at 79%

Jun 20, 2026
The Cheaper API Was 2.5x Cheaper. It Cost 1.6x More.

Jun 19, 2026
One Empty 200 OK Poisoned 5 of My Agent's 10 Steps

Jun 18, 2026
The HTTP Code Your AI Agent Doesn't Handle Yet: 402

Jun 17, 2026
Your AI Agent Will Double-Charge on a Lost Response

Jun 16, 2026
Your AI Agent's Memory Has No Expiry Date: I Scored Freshness on a Real Corpus

Jun 15, 2026
Your AI Agent Re-Reads Every Page It Already Saw. I Measured the 8x Context Tax

Jun 14, 2026
Your AI Agent Trusts a 200 OK. I Logged How Often the Page Was Garbage

Jun 13, 2026
Give Your AI Agent a Web-Fetch Tool: a 60-Line MCP Server (Free, Self-Hosted)

Jun 12, 2026
Your Scraper Re-Downloads Everything. Most Didn't Change.

Jun 10, 2026
Your Scraper Got Clean Data. The Site Lied to It.

Jun 9, 2026
Your Scraper Passes Every Run. It's Still Rotting.

Jun 8, 2026
Your Scraper Collected 50 Rows. There Were 4,000.

Jun 7, 2026
Your Scraper Died at Row 12,000. The Rerun Pattern.

Jun 6, 2026
A 30-Line Probe That Tells You If a Page Needs a Browser

Jun 5, 2026
You Pay for the Bandwidth That Returns Nothing

Jun 4, 2026
A Budget Brake That Stops a Scraper Before $200

Jun 3, 2026
Spoofing Your Scraper's Fingerprint Is a Losing Arcade

Jun 2, 2026
Your Scraper Returned a Clean Row. It Was Wrong.

Jun 1, 2026
9 Free LLM APIs in 2026 You Can Use Without a Credit Card

May 31, 2026
HTTP 200 Is a Lie: A 30-Line Schema Canary for Source Drift

May 30, 2026
Feeding Raw HTML to Your LLM Is a Token Tax. I Measured It on 10 Real Pages — Median 7.4×, and It Hits Every Scheduled Run

May 29, 2026
I've Run 2,190 Production Scrapes. The Framework You Pick Isn't What Breaks — Here's What Actually Does

May 28, 2026
Scraping All the Text Is the Easy 10%. Keeping the Corpus Worth Training On Is the Other 90% — Notes From 962 Runs

May 27, 2026
I've Run 2,190 Production Scrapes — "Ethical" Isn't a robots.txt Question, It's a Rate-Limit One

May 25, 2026
Conditional GET in production scrapers: what I learned wiring it into 3 actors

May 19, 2026
Three memory-leak patterns in long-running scrapers (and how I caught them after 968 Trustpilot runs)

May 18, 2026
Token Economics of Agent-Driven Scraping: When LLM Agents Cost 50× More Than a Cron Job

May 18, 2026
5 Apify dataset deduplication patterns that stop double-billing your customers

May 17, 2026
5 Apify scheduler mistakes that quietly burn compute units

May 15, 2026
Token Bucket vs Exponential Backoff: What Changed After 966 Runs

May 15, 2026
Building a Proxy Health Monitor for 24/7 Scraper Uptime

May 13, 2026
5 production scraping failures from 1000+ runs (and the fixes that actually shipped)

May 12, 2026
Description drift in serverless function catalogs — a monthly refresh playbook

May 12, 2026
3 Telegram Channels Worth Following for Production Data Engineering

May 11, 2026
I write production scrapers. AI made 30% of them worse. Here's the rule of thumb.

May 11, 2026
5 Apify webhook patterns that turn one-off scrapers into reliable data pipelines

May 3, 2026
5 Apify run-log patterns that make production debugging 10x faster

May 1, 2026
5 Apify Scheduler Mistakes That Quietly Burn Compute Units (And the Cron Fixes)

May 1, 2026
5 Apify run-log patterns that make production debugging 10× faster

May 1, 2026
Five Apify Input Schema Mistakes And The Fixes That Stuck

May 1, 2026
Apify vs. self-hosted: the three numbers I use to decide

Apr 30, 2026
Cost per result: a 4-line worksheet for Apify actors

Apr 30, 2026
Dead features in your own code: a self-audit story from my Apify actor

Apr 30, 2026
DuckDB + dbt: a zero-cost analytics warehouse for projects under 100 GB

Apr 30, 2026
Idempotent webhook receivers in 50 lines of Python

Apr 30, 2026
Three operational rules I added after my Trustpilot scraper crossed 100 runs

Apr 30, 2026
Why your retry logic is broken (and the 30-line fix)

Apr 30, 2026
Schema drift killed our pipeline — three contract tests that catch it

Apr 30, 2026
When NOT to scrape: 3 patterns where I now reach for an API instead

Apr 30, 2026
Automate Your Backups with MinIO: Free S3-Compatible Storage for Everything

Apr 29, 2026
Traefik + Docker: Zero-Config Reverse Proxy That Discovers Your Containers Automatically

Apr 29, 2026
How my Trustpilot scraper survived 949 production runs (and the 3 things that almost killed it)

Apr 29, 2026
Welcome — what this blog is for

Apr 27, 2026
What 250 runs of a Trustpilot scraper taught me about anti-bot patterns

Apr 25, 2026

© 2026 Aleksei Spinov · Apify Store · @scraping_ai on Telegram · Sponsor this blog · spinov001@gmail.com

Some posts contain affiliate links to scraping/proxy providers (Oxylabs, Bright Data) — disclosed at the article level.