The application I work on manages hundreds of thousands of records each in a different state in their lifecycle. A large requirement is to have a nearly accurate count of these objects in each state so they can be assigned in batches. I.E. User X needs to know there are at least 25 objects in state Y so they can assign a batch of 25.
Right now there is a incredibly inefficient set of queries that runs for each state to count the objects every time the user hits this webpage and returns the results. These queries get ran millions of times a day.
An alternative we have come up with is to run the queries once every 1-5 minutes and cache the results in a Redis instance, where each time a user requests the count they just read it from there.
This still has a decent amount of overhead as the number of different states, and permutations of substates is quite high. So once every N minutes we will still have to hit the DB quite hard.
TLDR; I feel like there is a way I can maintain an accurate-ish count of what objects are in what state of their lifecycle without having to constantly count from the DB. One thought I had was essentially to have a trigger, or event push/pop from a queue each time a object entered/exited a state, letting you get the count instantly by getting the size of the queue, but this seems more resource intensive than just counting once every 3 minutes.
It doesn’t look like you need a “nearly accurate count” – you say you need to know that there are at least 25 objects in a state.
To achieve this, you can for example have one method that counts the objects in that state, and possibly stops counting once it reaches 1,000 for example, and decrease the counter every time you move an object away from the state. So you have a property “guaranteed available count”. If the “guaranteed available count” goes below 25, you count objects in that state again.