How can I improve async data retrieval and caching?
Introduction
Async retrieval gets data without blocking UI or request threads, but performance still degrades if the same remote call is repeated excessively. Caching solves that, yet naive cache code can introduce stale data and race conditions. A strong approach combines request deduplication, time-based cache policies, and explicit invalidation.
Build a Request-Deduplicating Async Cache
A common problem is a traffic burst where multiple callers request the same key at the same time. Without deduplication, each caller triggers a separate network call. You can avoid this by storing in-flight promises per key.
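A minimal sketch of that idea follows. The names `DedupCache`, `Loader`, and `ttlMs` are assumptions made for illustration, not part of any particular library, and the TTL handling is kept deliberately simple.

```typescript
// Illustrative names only: DedupCache, Loader, and ttlMs are assumptions for this sketch.
type Loader<T> = (key: string) => Promise<T>;

interface CacheEntry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds after which the entry counts as expired
}

class DedupCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private inFlight = new Map<string, Promise<T>>();
  private loader: Loader<T>;
  private ttlMs: number;

  constructor(loader: Loader<T>, ttlMs: number) {
    this.loader = loader;
    this.ttlMs = ttlMs;
  }

  // Deliberately not `async`, so concurrent callers receive the same promise object.
  get(key: string): Promise<T> {
    const cached = this.entries.get(key);
    if (cached !== undefined && cached.expiresAt > Date.now()) {
      return Promise.resolve(cached.value);
    }

    const pending = this.inFlight.get(key);
    if (pending !== undefined) {
      return pending; // join the request that is already running
    }

    const request = this.loader(key).then(
      (value) => {
        this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
        this.inFlight.delete(key);
        return value;
      },
      (error) => {
        this.inFlight.delete(key); // failures are not stored, so the next call retries
        throw error;
      },
    );
    this.inFlight.set(key, request);
    return request;
  }
}

// Usage: three simultaneous callers share a single loader invocation.
let loaderCalls = 0;
const profiles = new DedupCache<string>((key) => {
  loaderCalls += 1;
  return Promise.resolve(`profile for ${key}`);
}, 30000);

const first = profiles.get("user:1");
const second = profiles.get("user:1");
const third = profiles.get("user:1");
```

Because `get` returns the stored promise rather than wrapping it, all three callers wait on the same request, and the in-flight entry is removed as soon as that request settles.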
This structure avoids duplicate remote calls and keeps latency stable under concurrency spikes.
Add Stale-While-Revalidate for Better UX
Strict TTL caches can create sudden latency spikes when entries expire, because the first caller after expiry has to wait for a full reload. A stale-while-revalidate policy improves responsiveness by returning the stale value immediately while a refresh runs in the background.
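One way to sketch this policy is shown below. `SwrCache`, `freshMs`, `staleMs`, and the injectable `now` clock are hypothetical names chosen for the example; the two windows are an assumption about how "fresh" and "stale but usable" might be separated.

```typescript
// Illustrative sketch: SwrCache, freshMs, and staleMs are assumed names, not a library API.
interface SwrEntry<T> {
  value: T;
  freshUntil: number; // before this time the value is served without any refresh
  staleUntil: number; // before this time the value may be served while a refresh runs
}

class SwrCache<T> {
  private entries = new Map<string, SwrEntry<T>>();
  private refreshing = new Map<string, Promise<T>>();
  private loader: (key: string) => Promise<T>;
  private freshMs: number;
  private staleMs: number;
  private now: () => number;

  constructor(
    loader: (key: string) => Promise<T>,
    freshMs: number,
    staleMs: number,
    now: () => number = () => Date.now(),
  ) {
    this.loader = loader;
    this.freshMs = freshMs;
    this.staleMs = staleMs;
    this.now = now;
  }

  // Stores a value directly, for example after a successful write.
  set(key: string, value: T): void {
    const t = this.now();
    const freshUntil = t + this.freshMs;
    this.entries.set(key, { value, freshUntil, staleUntil: freshUntil + this.staleMs });
  }

  get(key: string): Promise<T> {
    const entry = this.entries.get(key);
    const t = this.now();
    if (entry !== undefined && t < entry.freshUntil) {
      return Promise.resolve(entry.value); // fresh hit
    }
    if (entry !== undefined && t < entry.staleUntil) {
      // Stale hit: answer immediately and refresh in the background.
      // A failed refresh is swallowed here so the stale value keeps being served.
      this.refresh(key).catch(() => undefined);
      return Promise.resolve(entry.value);
    }
    return this.refresh(key); // miss or too old: the caller waits for the loader
  }

  private refresh(key: string): Promise<T> {
    const pending = this.refreshing.get(key);
    if (pending !== undefined) {
      return pending; // at most one refresh per key at a time
    }
    const request = this.loader(key).then(
      (value) => {
        this.set(key, value);
        this.refreshing.delete(key);
        return value;
      },
      (error) => {
        this.refreshing.delete(key);
        throw error;
      },
    );
    this.refreshing.set(key, request);
    return request;
  }
}

// Usage with a manual clock: fresh for 1 s, then usable as stale for 9 more seconds.
let clock = 0;
let refreshCalls = 0;
const dashboard = new SwrCache<string>(
  (key) => {
    refreshCalls += 1;
    return Promise.resolve(`new ${key}`);
  },
  1000,
  9000,
  () => clock,
);
dashboard.set("card:1", "old card:1");
clock = 500;
const freshAnswer = dashboard.get("card:1"); // fresh window: no loader call
clock = 5000;
const staleAnswer = dashboard.get("card:1"); // stale window: old value now, refresh starts
```

In this sketch a failed background refresh leaves the stale entry in place until `staleUntil`, which trades freshness for availability; a stricter service might prefer to surface the error instead.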
This pattern is especially useful for profile pages, dashboard cards, and catalog screens where slightly old data is acceptable for short intervals.
Layer the Cache Near Data Boundaries
Place cache logic in repository or data access modules, not at scattered call sites. Central placement makes invalidation and metrics easier to manage.
For server applications, keep a per-process in-memory cache for hot keys and add a distributed cache for multi-instance consistency. For frontend applications, use an in-memory cache for the current session and persistent browser storage for limited offline durability.
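The server-side layering could look roughly like the sketch below. `SharedCache` is a hypothetical interface standing in for whatever distributed cache client is in use (it is not the API of any real client), and `UserRepository`, `fetchUser`, and the TTL defaults are likewise assumptions. In-flight deduplication is left out to keep the layering itself visible.

```typescript
// Hypothetical interface for a distributed cache; real client libraries differ.
interface SharedCache {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string, ttlMs: number): Promise<void>;
}

interface User {
  id: string;
  name: string;
}

// All caching for users lives here; call sites only ever call getUser().
class UserRepository {
  private local = new Map<string, { user: User; expiresAt: number }>();
  private shared: SharedCache;
  private fetchUser: (id: string) => Promise<User>;
  private localTtlMs: number;
  private sharedTtlMs: number;

  constructor(
    shared: SharedCache,
    fetchUser: (id: string) => Promise<User>,
    localTtlMs: number = 5000,
    sharedTtlMs: number = 60000,
  ) {
    this.shared = shared;
    this.fetchUser = fetchUser;
    this.localTtlMs = localTtlMs;
    this.sharedTtlMs = sharedTtlMs;
  }

  async getUser(id: string): Promise<User> {
    const key = `user:${id}`;

    // Tier 1: per-process memory, cheapest and shortest-lived.
    const hot = this.local.get(key);
    if (hot !== undefined && hot.expiresAt > Date.now()) {
      return hot.user;
    }

    // Tier 2: shared cache, visible to every instance.
    const cached = await this.shared.get(key);
    if (cached !== undefined) {
      const user = JSON.parse(cached) as User;
      this.remember(key, user);
      return user;
    }

    // Origin: load the data, then populate both tiers.
    const user = await this.fetchUser(id);
    await this.shared.set(key, JSON.stringify(user), this.sharedTtlMs);
    this.remember(key, user);
    return user;
  }

  private remember(key: string, user: User): void {
    this.local.set(key, { user, expiresAt: Date.now() + this.localTtlMs });
  }
}
```

Keeping the local TTL much shorter than the shared one is a design choice in this sketch: it limits how long one instance can serve a value that another instance has already replaced.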
Metrics to track:
- cache hit ratio
- p95 loader latency
- number of in-flight deduplicated requests
- stale serve count and refresh failures
Without telemetry, cache bugs hide behind temporary performance gains.
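As a rough illustration, the metrics listed above can start as plain counters that the cache increments. `CacheMetrics` and its field names are hypothetical, and the percentile here is a simple nearest-rank calculation over stored samples; production systems would more likely export histograms to a metrics backend.

```typescript
// Illustrative counters only; CacheMetrics and its field names are assumptions.
class CacheMetrics {
  hits = 0;
  misses = 0;
  dedupedRequests = 0; // callers that joined an in-flight request
  staleServes = 0; // responses answered from a stale entry
  refreshFailures = 0; // background refreshes that rejected
  private loaderLatenciesMs: number[] = [];

  recordLoaderLatency(ms: number): void {
    this.loaderLatenciesMs.push(ms);
  }

  hitRatio(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }

  // Nearest-rank p95 over all recorded samples; undefined when nothing was recorded.
  p95LoaderLatencyMs(): number | undefined {
    const count = this.loaderLatenciesMs.length;
    if (count === 0) {
      return undefined;
    }
    const sorted = this.loaderLatenciesMs.slice().sort((a, b) => a - b);
    const rank = Math.ceil((95 * count) / 100); // 1-based rank
    return sorted[rank - 1];
  }
}

// Usage: the cache would bump these at the matching code paths.
const metrics = new CacheMetrics();
metrics.hits += 9;
metrics.misses += 1;
metrics.recordLoaderLatency(120);
console.log(metrics.hitRatio(), metrics.p95LoaderLatencyMs());
```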
Handle Errors and Expiration Intentionally
Do not cache failures by default unless your service has explicit negative caching rules. If the upstream service is unstable, short-lived failure caching may protect infrastructure, but tune it carefully so that it does not mask recovery.
When invalidation events exist, such as update mutations or webhooks, clear targeted keys immediately instead of waiting for TTL expiry. Time-based expiration is a fallback, not your only consistency mechanism.
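Both rules can be sketched in one small cache. `PolicyCache`, `negativeTtlMs`, and `invalidate` are names invented for this example: failures are not stored unless a negative TTL is explicitly configured, and `invalidate` gives mutation handlers or webhook receivers a way to drop a key at once.

```typescript
// Illustrative sketch: PolicyCache, negativeTtlMs, and invalidate are assumed names.
type Slot<T> =
  | { kind: "value"; value: T; expiresAt: number }
  | { kind: "error"; error: unknown; expiresAt: number };

class PolicyCache<T> {
  private slots = new Map<string, Slot<T>>();
  private fetcher: (key: string) => Promise<T>;
  private ttlMs: number;
  private negativeTtlMs: number;

  // negativeTtlMs defaults to 0, which means failures are never cached.
  constructor(fetcher: (key: string) => Promise<T>, ttlMs: number, negativeTtlMs: number = 0) {
    this.fetcher = fetcher;
    this.ttlMs = ttlMs;
    this.negativeTtlMs = negativeTtlMs;
  }

  async get(key: string): Promise<T> {
    const slot = this.slots.get(key);
    if (slot !== undefined && slot.expiresAt > Date.now()) {
      if (slot.kind === "value") {
        return slot.value;
      }
      throw slot.error; // explicit, short-lived negative cache hit
    }
    try {
      const value = await this.fetcher(key);
      this.slots.set(key, { kind: "value", value, expiresAt: Date.now() + this.ttlMs });
      return value;
    } catch (error) {
      if (this.negativeTtlMs > 0) {
        this.slots.set(key, { kind: "error", error, expiresAt: Date.now() + this.negativeTtlMs });
      } else {
        this.slots.delete(key); // keep nothing, so the next call retries upstream
      }
      throw error;
    }
  }

  // Call from update mutations or webhook handlers; returns true if a key was dropped.
  invalidate(key: string): boolean {
    return this.slots.delete(key);
  }
}
```

A write path would typically persist the change first and then call `invalidate` for the affected key, so the next read reloads the new state instead of waiting for the TTL to run out.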
Common Pitfalls
- Caching at every call site instead of one centralized data access layer.
- Ignoring in-flight deduplication, causing thundering herd behavior.
- Using one TTL for all data classes even when freshness requirements differ.
- Caching error responses unintentionally and serving repeated failures.
- Forgetting metrics, which makes tuning cache effectiveness guesswork.
Summary
- Combine async retrieval with a key-based cache and in-flight request deduplication.
- Use stale-while-revalidate when a fast response matters more than absolute freshness.
- Keep cache logic near data boundaries for maintainable invalidation.
- Track hit ratio and refresh outcomes to tune policies with evidence.
- Treat TTL as one consistency tool, alongside event-driven invalidation.

