
API & Caching

#10 How a 6-Hour Cache Turned One Rate-Limit Error into a Persistent News Bug

Build Log

A cache-window design mistake amplified temporary failures into long-lived stale UI states.

cache TTL design, stale failure state, last known good

1) TL;DR

2) What I Tried

I stretched the cache window to 6 hours to reduce API costs.

3) What Broke

A temporary failure, such as a single rate-limit error, was cached for the full window and rendered as if it were valid live content.

4) Root Cause

The cache policy had no validity gate before writes and no separate, shorter TTL for failure payloads.

5) Before (Code Path)

cache layer
- one long TTL for mixed payload quality
- failure states cached like successful states
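A minimal sketch of what a "before" path like this could look like, not the actual code from the project. The class name `NaiveCache`, the `get_or_fetch` method, and the injected `clock` are all hypothetical; only the 6-hour window comes from the post itself.

```python
import time
from typing import Any, Callable, Dict, Tuple

LONG_TTL_SECONDS = 6 * 60 * 60  # the 6-hour window from the post title


class NaiveCache:
    """Hypothetical cache that stores every payload with the same TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (expires_at, payload)
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        payload = fetch()  # may be a failure payload, e.g. a rate-limit response
        # Bug: no validity check, so a failure is cached for the full window.
        self._store[key] = (now + LONG_TTL_SECONDS, payload)
        return payload
```

With this shape, one rate-limit response at the wrong moment is served to every reader until the 6-hour entry expires, even if the upstream API recovers seconds later.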

6) After (Code Path)

cache layer
+ quality gate before cache write
+ short TTL for failure payloads
+ stale fallback only from last-known-good snapshots
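The three changes in the "after" path can be sketched roughly as follows. This is an illustration under assumptions, not the shipped implementation: `GatedCache`, `is_valid`, and the 60-second failure TTL are hypothetical names and values chosen for the example.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

LONG_TTL_SECONDS = 6 * 60 * 60  # validated payloads keep the 6-hour window
FAILURE_TTL_SECONDS = 60        # assumed value: short, so failures retry soon


class GatedCache:
    """Hypothetical cache with a quality gate and a last-known-good fallback."""

    def __init__(
        self,
        is_valid: Callable[[Any], bool],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._is_valid = is_valid
        self._clock = clock
        # key -> (expires_at, payload to serve until then)
        self._entries: Dict[str, Tuple[float, Optional[Any]]] = {}
        # key -> most recent payload that passed the gate
        self._last_good: Dict[str, Any] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Optional[Any]:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        payload = fetch()
        if self._is_valid(payload):
            # Quality gate passed: snapshot it and cache with the long TTL.
            self._last_good[key] = payload
            self._entries[key] = (now + LONG_TTL_SECONDS, payload)
            return payload
        # Gate failed: never store the failure payload itself. Serve the
        # last-known-good snapshot (or None) and retry after the short TTL.
        served = self._last_good.get(key)
        self._entries[key] = (now + FAILURE_TTL_SECONDS, served)
        return served
```

A `None` return means there is no validated snapshot yet, which lets the UI show an explicit "unavailable" state instead of rendering an error payload as news.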

7) Evidence (Git History)

8) What I Learned

TTL strategy must encode data quality, not only freshness timing.

9) Frequently Asked Questions

Should long TTLs be avoided entirely?

No. Use a long TTL for validated states and a short TTL for failure states.

What is last-known-good in practice?

The most recent payload that passes schema and business checks.
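As a rough illustration of "schema and business checks", here is a small validator. The payload shape (a `headlines` list of items with `title` and `summary` strings) and the `MIN_HEADLINES` rule are assumptions made for the example; the real checks depend on the actual API response.

```python
from typing import Any

MIN_HEADLINES = 1  # assumed business rule: an empty feed is not "good"


def passes_schema(payload: Any) -> bool:
    """Structural check: the payload has the expected fields and types."""
    if not isinstance(payload, dict):
        return False
    items = payload.get("headlines")
    if not isinstance(items, list):
        return False
    return all(
        isinstance(item, dict)
        and isinstance(item.get("title"), str)
        and isinstance(item.get("summary"), str)
        for item in items
    )


def passes_business_checks(payload: dict) -> bool:
    """Content check: enough items, and no blank titles or summaries."""
    items = payload["headlines"]
    if len(items) < MIN_HEADLINES:
        return False
    return all(item["title"].strip() and item["summary"].strip() for item in items)


def is_last_known_good_candidate(payload: Any) -> bool:
    # The schema check runs first, so the business checks can index safely.
    return passes_schema(payload) and passes_business_checks(payload)
```

Only payloads for which `is_last_known_good_candidate` returns `True` would be allowed to replace the stored snapshot.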

Why is this SEO/GEO relevant?

If invalid payloads persist, broken summaries can be indexed by search engines or cited by generative engines.