I’m building an API that lets people search for books (a simplified example). One endpoint takes a query string and returns the top 10 matching results. For example:

/api/v1/book/find?title=Ha

It will return

Hangover
Hammer of the Gods
Harry Potter and The Sorcerer's Stone
Harry Potter and The Chamber of Secrets
Harry Potter and The Prisoner of Azkaban
Harry Potter and The Goblet of Fire
Harry Potter and The Order of the Phoenix
Harry Potter and The Half-blood Prince
Harry Potter and The Deathly Hallows
Harry Potter and The Cursed Child

One more keystroke: /api/v1/book/find?title=Har, and the result changes to:

Harry Potter and The Sorcerer's Stone
Harry Potter and The Chamber of Secrets
Harry Potter and The Prisoner of Azkaban
Harry Potter and The Goblet of Fire
Harry Potter and The Order of the Phoenix
Harry Potter and The Half-blood Prince
Harry Potter and The Deathly Hallows
Harry Potter and The Cursed Child

Assume all URLs are parsed correctly and every API request results in a check against the database. However, my database hosting has a limited read quota, so I want to avoid reading the database as much as possible.

Off the top of my head, I have two solutions:

  1. Hash the query string and store it in a cache, with the result list as the value.
  2. Add a delay (debounce). For example, only send the request once the user has stopped typing for 2 seconds (maybe combined with the cache as well). A rough sketch follows this list.
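
Here's roughly what I mean by option 2, as a client-side sketch (the 2-second window and the fetch call are placeholders for whatever the UI actually does):

```typescript
// Rough sketch of idea 2: debounce keystrokes on the client so only the
// "final" query (after 2 seconds of inactivity) actually hits the API.
function debounce<T extends unknown[]>(
  fn: (...args: T) => void,
  delayMs: number,
): (...args: T) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: T) => {
    clearTimeout(timer);                  // cancel the pending call, if any
    timer = setTimeout(() => fn(...args), delayMs);
  };
}

// Placeholder for whatever fetch call the UI actually makes.
const searchBooks = (title: string) =>
  fetch(`/api/v1/book/find?title=${encodeURIComponent(title)}`);

// Typing "H", "a", "r" quickly fires searchBooks only once, with "Har".
const debouncedSearch = debounce(searchBooks, 2000);
```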

Are there good strategies/patterns for this type of problem? Thanks.


Are there good strategies/patterns for this type of problem? Thanks.

As for caching, yes, there are popular strategies, such as Cache-Aside and Read-Through. Read-Through has a counterpart, Write-Through, which keeps the cache up to date on every write/update operation; otherwise, the cache only gets updated when a request misses it.
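
A minimal Cache-Aside sketch, assuming an in-memory Map and a hypothetical queryDatabase function (in Read-Through, the same check-miss-load logic would live inside the cache layer instead of the application):

```typescript
// Cache-Aside: the application checks the cache first and, on a miss,
// reads the database itself and fills the cache for the next caller.
const cache = new Map<string, string[]>();

async function findBooks(title: string): Promise<string[]> {
  const key = title.toLowerCase();            // normalize so "Ha" === "ha"
  const hit = cache.get(key);
  if (hit !== undefined) return hit;          // cache hit: no DB read

  const results = await queryDatabase(title); // cache miss: one DB read
  cache.set(key, results);
  return results;
}

// Hypothetical stand-in for the real data access layer.
async function queryDatabase(title: string): Promise<string[]> {
  return [];
}
```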

While caching is common practice, it is up to you to decide (among other things) where to implement it. Server-side? Client-side? Both? At the network level (HTTP caching)? In the ORM (if any)?

The more caches you have, the harder it is to make the system predictable. If you implement different caches across the solution, my experience says you would do well to look for a way to evict/expire each cache individually or all together.
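
One way to keep that manageable, sketched here with plain in-memory Maps, is to register every cache in one place so each can be evicted on its own or all at once:

```typescript
// Register every cache in one place so each one can be evicted
// individually or all of them together.
const cacheRegistry = new Map<string, Map<string, unknown>>();

function getCache(name: string): Map<string, unknown> {
  let c = cacheRegistry.get(name);
  if (!c) {
    c = new Map();
    cacheRegistry.set(name, c);
  }
  return c;
}

function evict(name: string): void {
  cacheRegistry.get(name)?.clear();          // e.g. evict("book-search")
}

function evictAll(): void {
  for (const c of cacheRegistry.values()) c.clear();
}
```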

As for design patterns, Leaderboards or Lookup Tables might interest you. Both are backed by data preloaded into memory. These might not relate to the endpoint in question but to API requests sent automatically or constantly, e.g. by widgets. For example, think of lists like best sellers, most popular or most wanted, collections by author, sagas, etc.: the kind of data you can afford to keep out of date for a while and (why not) cache on both sides.
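
For example, a preloaded "best sellers" list could look like this sketch (loadBestSellersFromDb and the 10-minute refresh interval are assumptions):

```typescript
// Lookup-table style: load the "best sellers" list into memory at startup
// and refresh it on a timer, so the endpoint never reads the DB directly.
let bestSellers: string[] = [];

async function refreshBestSellers(): Promise<void> {
  bestSellers = await loadBestSellersFromDb();
}

async function startup(): Promise<void> {
  await refreshBestSellers();                   // preload once at boot
  setInterval(refreshBestSellers, 10 * 60_000); // tolerate 10 min of staleness
}

// Hypothetical loader; in reality, one DB read per refresh interval.
async function loadBestSellersFromDb(): Promise<string[]> {
  return [];
}
```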

Add a delay (debounce). For example, only send the request once the user has stopped typing for 2 seconds (maybe combined with the cache as well).

Managing caches can be difficult enough already; I wouldn’t recommend making it harder. The delay adds to the latency inherent in the public network (internet) and the private one, connection throttling, cache TTLs, etc. Keep it simple, take some metrics, and make decisions based on the data you gather.

Regarding metrics: if you wonder which APIs or queries are worth caching, try gathering metrics from production. Rank the most consumed endpoints and DB queries, and evaluate how often the data involved changes, so you can set reasonable TTLs or decide where to place the cache (client, server, or both).
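
A bare-bones, framework-agnostic sketch of that ranking (call recordHit from wherever you handle requests):

```typescript
// Count hits per endpoint so you can rank what is worth caching.
const hits = new Map<string, number>();

function recordHit(endpoint: string): void {
  hits.set(endpoint, (hits.get(endpoint) ?? 0) + 1);
}

function topEndpoints(n: number): [string, number][] {
  return [...hits.entries()]
    .sort((a, b) => b[1] - a[1])   // most consumed first
    .slice(0, n);
}

// Call recordHit("/api/v1/book/find") from the request handler, then
// periodically log topEndpoints(10) and compare against the DB query log.
```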


However, my database hosting has a limited read quota, so I want to avoid reading the database as much as possible.

Maybe you should start by fixing the database, then. Databases, by design, are intended to handle a lot of reads, fast. If you’re trying to avoid reading the database as much as possible, it makes little sense to use a database in the first place.

Once you’ve fixed the main issue, check how many requests the search actually generates. It may not be the real bottleneck: for instance, by logging the requests and how long they take, you may discover that you’re spending 80% of the time performing the same query over and over somewhere else in the application, while the searches account for only 1% of the time.
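
A sketch of that kind of logging, where the label and runSearchQuery are just placeholder names for each query site:

```typescript
// Wrap a query so its duration is logged; the totals per label show
// where the time really goes (the search, or some other hot query).
const totals = new Map<string, number>();

async function timed<T>(label: string, run: () => Promise<T>): Promise<T> {
  const start = performance.now();
  try {
    return await run();
  } finally {
    const ms = performance.now() - start;
    totals.set(label, (totals.get(label) ?? 0) + ms);
    console.log(`${label}: ${ms.toFixed(1)} ms`);
  }
}

// Usage: const rows = await timed("book-search", () => runSearchQuery("Har"));
```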

If and only if you discover that there are indeed a lot of searches being performed, and that they have a real, concrete footprint on the database (and the searches are legitimate, coming from real users, and not, say, a DDoS attack), then:

  • Either cache the results of the API, bearing in mind that you’ll have to be very careful when it comes to invalidating those results. If, at some point, a ninth Harry Potter book is added, do you invalidate the entire cache? Or do you loop through all the cached results and figure out which ones match the new book?

  • Or just put the entire list of books in the cache, and search against it (see the sketch after this list). Note that this solves the invalidation problem: when a new book is added, you don’t need to reload the entire list in the cache; you can just add the new book to it. However, this approach has two drawbacks:

    1. It moves the search logic from the database to the application server.

    2. It forces the application server to retrieve the entire list of books. This may not be an issue with a few tens of thousands of books, but it would start to have a significant impact with millions of books.
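
A sketch of that second option, assuming the full title list fits comfortably in memory (loadAllTitlesFromDb is a placeholder):

```typescript
// Second option: keep the whole book list in memory and search it there.
// Adding a book appends to the list instead of invalidating anything.
let allBooks: string[] = [];

async function preloadBooks(): Promise<void> {
  allBooks = await loadAllTitlesFromDb();  // one big read at startup
}

function findBooks(title: string): string[] {
  const prefix = title.toLowerCase();
  return allBooks
    .filter((t) => t.toLowerCase().startsWith(prefix))
    .slice(0, 10);                         // top 10, as the endpoint returns
}

function addBook(title: string): void {
  allBooks.push(title);                    // no cache invalidation needed
}

// Placeholder for the real query.
async function loadAllTitlesFromDb(): Promise<string[]> {
  return [];
}
```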
