Fast sliding cache counter

  softwareengineering

We have an email system that handles incoming and outgoing emails, the risk is that we get so called feedback loops, a bounce email is answered with another bounce etc. We have safeguards in place to handle this, but I need a last line of defense.

I want to keep a cache that keeps track of per email how often it was sent in the last 5 minutes, if it goes over a certain threshold nothing is sent and a log entry is generated.

But we handle a few 1000 emails an hour so I want this lookup to be fairly fast.

What I was thinking is to use a dictionary where the key is the email address and the value is a list of dates. Every minute a scheduled task will purge this dictionary.

Pros: Its very fast to check how many emails are sent in the last 5 hours, just the length of the list of the dictionary key for that email.
Cons: cleaning is expensive, the list would have to be locked and every item has to be inspected.

Is there a good way to utilize such a caching mechanism?

1

Conceptually, it should work with a dictionary keyed on the email address and with an ordered list of timestamps when the mails were sent.

As the list is ordered, new timestamps are always added to one end and the old timestamps are removed from the other. And you don’t need to traverse the whole list, but only up to the first entry that doesn’t need to be purged from it.

On the efficient implementation side, look for a dictionary and ordered list that don’t need to be completely locked to add/remove elements. That is probably most critical for the dictionary.

1

LEAVE A COMMENT