I'm learning about caching and I've run into a potential problem that I'd like clarification on.
Suppose the database p90 response time is too high and we need to decrease it.
DB: p50 (median) is 100ms, p90 is 200ms
Cache: 50ms
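We cache the most common requests, but those tend to be the inexpensive ones. The average response time is now closer to 50ms, yet the slow queries that make up the p90 are typically cache misses. Assuming the cache is checked before the database, a miss pays for the cache lookup plus the query, so the p90 becomes 200ms + 50ms = 250ms.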
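To make the arithmetic concrete, here is a rough sketch of the scenario. The 80/20 split between cheap and expensive queries, the assumption that only the cheap ones are ever cached, and the helper names `percentile` and `with_cache` are all made up for illustration; only the 50/100/200ms figures come from the numbers above.

```python
import math

CACHE_MS = 50  # cache lookup time from the numbers above


def percentile(latencies, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    ordered = sorted(latencies)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def with_cache(db_ms, is_hit):
    # Cache is checked first: a hit costs only the lookup,
    # a miss costs the lookup plus the database query.
    return CACHE_MS if is_hit else CACHE_MS + db_ms


# Hypothetical workload: 80 cheap queries (100 ms) and 20 expensive ones (200 ms).
db_latencies = [100] * 80 + [200] * 20

# Assumption: only the cheap, frequent queries end up in the cache.
cached_latencies = [with_cache(ms, is_hit=(ms == 100)) for ms in db_latencies]

before = (percentile(db_latencies, 50), percentile(db_latencies, 90))
after = (percentile(cached_latencies, 50), percentile(cached_latencies, 90))
print(before, after)  # (100, 200) (50, 250)
```

Under these assumptions the p50 drops from 100ms to 50ms, while the p90 rises from 200ms to 250ms.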
Caching seems to be the wrong tool. Is my understanding correct?
But even in the right scenario, I can imagine cache misses in a small cache increasing the p50 response time, especially if requests are uniformly distributed across keys. How do we handle that?
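Here is a back-of-the-envelope sketch of that worry. It assumes every key is requested equally often, every query costs the same 100ms on the database, and the cache holds a fixed subset of keys, so the hit rate is simply `cached_keys / total_keys`. The function names are hypothetical.

```python
def uniform_mean_latency(cached_keys, total_keys, cache_ms=50, db_ms=100):
    """Mean latency when every key is equally likely to be requested.

    Every request pays the cache lookup; the misses also pay the DB query.
    """
    return cache_ms + db_ms * (total_keys - cached_keys) / total_keys


def uniform_p50(cached_keys, total_keys, cache_ms=50, db_ms=100):
    """Median latency: a hit only if at least half of the requests hit the cache."""
    if 2 * cached_keys >= total_keys:
        return cache_ms
    return cache_ms + db_ms


# A cache holding 10% of the keys makes both numbers worse than the 100 ms baseline.
print(uniform_mean_latency(10, 100), uniform_p50(10, 100))  # 140.0 150

# Break-even for the mean is a hit rate of cache_ms / db_ms, i.e. 50% here.
print(uniform_mean_latency(50, 100), uniform_p50(50, 100))  # 100.0 50
```

With these numbers, a cache that holds fewer than half of the keys raises the p50 from 100ms to 150ms, which is the effect I am asking about.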