How do you cache complicated data?


I’m working on a desktop application that stores its data in a complicated object graph. Many read-only operations need to run expensive transformations on parts of that graph, for example to display graphical elements or to compute derived values. For performance, I want to avoid repeating a transformation if the data it depends on hasn’t changed. Changes that affect a result can happen in many different objects, including objects that are linked by something other than composition, such as ID numbers or string names. Many of these transformations also read from overlapping sets of objects.

general idea

What kind of architecture could I use to reliably cache it?

Currently, it’s a mixture of ad-hoc approaches: lazy evaluation with dirty flags attached to the data, one flag for each of its several uses; caches whose dependencies all hold references to them and tell them when to invalidate; and batching some operations so that the transformation is done explicitly once at the start and the transformed data is reused for the rest of the operation. This is a giant mess.
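To make the dirty-flag part concrete, here is a simplified sketch of the pattern rather than real code from the application. The names `Shape`, `set_point`, and `outline` are made up for illustration, and the bounding-box calculation stands in for an expensive transformation.

```python
class Shape:
    """Hypothetical data object carrying one dirty flag per derived use."""

    def __init__(self, points):
        self._points = list(points)
        self._outline_dirty = True   # flag for the "display outline" use
        self._outline = None
        self.transform_count = 0     # only here to show how often we recompute

    def set_point(self, index, point):
        self._points[index] = point
        # Every mutator has to remember to set every relevant flag.
        self._outline_dirty = True

    def outline(self):
        # Lazy evaluation: recompute only if something changed since last time.
        if self._outline_dirty:
            self._outline = self._expensive_outline()
            self._outline_dirty = False
        return self._outline

    def _expensive_outline(self):
        # Stand-in for an expensive transformation: a bounding box.
        self.transform_count += 1
        xs = [x for x, _ in self._points]
        ys = [y for _, y in self._points]
        return (min(xs), min(ys), max(xs), max(ys))


shape = Shape([(0, 0), (2, 3)])
shape.outline()
shape.outline()                 # served from the cached value
shape.set_point(1, (5, 1))      # marks the outline dirty
shape.outline()                 # recomputed once
```

This works for a single object, but it breaks down once a result depends on several objects, because each of them needs its own flag for each use and every mutator has to know about all of them.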

I thought about having objects fire an event whenever they change, with the cache for each kind of transformed data listening to those events. But I worry about performance problems when what is sometimes a million tiny objects all fire the same event because all of them are being changed at once. I’m also unsure how the caches would get references to those objects in order to listen to their events, and I’m worried that there will be a lot of code everywhere for detecting changes in every field of every class.
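A minimal sketch of the event idea, assuming a plain callback list on each object. `Node`, `subscribe`, and `SumCache` are hypothetical names, and summing the values stands in for the real transformation.

```python
class Node:
    """Hypothetical tiny data object that notifies listeners on change."""

    def __init__(self, value):
        self._value = value
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        # Change detection has to be written for every field like this one.
        if new_value != self._value:
            self._value = new_value
            for callback in self._listeners:
                callback(self)


class SumCache:
    """Hypothetical cache that invalidates itself when any node changes."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._valid = False
        self._total = None
        self.invalidations = 0       # counts callbacks, for illustration only
        for node in self._nodes:
            node.subscribe(self._on_changed)

    def _on_changed(self, node):
        self.invalidations += 1
        self._valid = False

    def total(self):
        if not self._valid:
            self._total = sum(n.value for n in self._nodes)
            self._valid = True
        return self._total


nodes = [Node(i) for i in range(1000)]
cache = SumCache(nodes)
cache.total()
for node in nodes:
    node.value += 1              # each assignment fires one callback
print(cache.invalidations)       # one callback per changed node
```

The sketch shows both concerns: a bulk edit triggers one callback per object even though the cache only needs to be invalidated once, and the cache has to be handed every object it depends on up front in order to subscribe.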
