When using an event-based component, I often feel some pain during the maintenance phase.
Since the executed code is split across many places, it can be quite hard to figure out which parts of the code will actually be involved at runtime.
This can lead to subtle, hard-to-debug problems when someone adds new event handlers.
Edit from comments:
Even with some good practices on board, like having an application-wide event bus and handlers that delegate the business logic to other parts of the app, there comes a point where the code starts to become hard to read because there are a lot of handlers registered from many different places (especially true when there is a bus).
Then the sequence diagram starts to look overly complex, the time spent figuring out what is happening increases, and debugging sessions become messy (a breakpoint in the handler manager while it iterates over the handlers, especially joyful with async handlers and some filtering on top of it).
//////////////
Example
I have a service that retrieves some data from the server. On the client we have a basic component that calls this service using a callback. To provide extension points to the users of the component and to avoid coupling between different components, we fire some events: one before the query is sent, one when the answer comes back and another one in case of a failure.
We have a basic set of handlers that are pre-registered which provide the default behavior of the component.
Now users of the component (and we are user of the component too) can add some handlers to perform some change on the behavior (modify the query, logs, data analysis, data filtering, data massaging, UI fancy animation, chain multiple sequential queries, whatever).
So some handlers must be executed before/after some others, and they are registered from a lot of different entry points in the application.
After a while, it can happen that a dozen or more handlers are registered, and working with that can be tedious and hazardous.
This design emerged because using inheritance was starting to become a complete mess. The event system is used as a kind of composition where you don’t yet know what your composites will be.
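For illustration, here is a minimal sketch of the kind of component described above; all the names (DataService, on, fetch, the event names) are hypothetical, invented for this example only.

```python
class DataService:
    # Fires events before a query, on response, and on failure.

    def __init__(self, transport):
        self._transport = transport
        self._handlers = {"before-query": [], "response": [], "failure": []}
        # Pre-registered handler providing the default behavior.
        self.on("response", self._default_render)

    def on(self, event, handler):
        self._handlers[event].append(handler)

    def _fire(self, event, payload):
        for handler in self._handlers[event]:
            handler(payload)

    def fetch(self, query):
        self._fire("before-query", query)    # extension point: modify or log the query
        try:
            data = self._transport.send(query)
        except Exception as error:
            self._fire("failure", error)     # extension point: error reporting, retries
            return
        self._fire("response", data)         # extension point: filtering, massaging, UI updates

    def _default_render(self, data):
        print("default rendering of", data)
```

Every call site in the application can add handlers through on, which is exactly what makes the runtime behavior hard to trace once a dozen of them pile up.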
End of example
//////////////
So I’m wondering how other people are tackling this kind of code. Both when writing and reading it.
Do you have any methods or tools that let you write and maintain such code without too much pain?
10
I’ve found that processing events using a stack of internal events (more specifically, a LIFO queue with arbitrary removal) greatly simplifies event-driven programming. It allows you to split the processing of an “external event” into several smaller “internal events”, with well-defined state in between. For more information, see my answer to this question.
Here I present a simple example which is solved by this pattern.
Suppose you are using object A to perform some service, and you give it a callback to inform you when it’s done. However, A is such that after calling your callback, it may need to do some more work. A hazard arises when, within that callback, you decide that you don’t need A any more, and you destroy it some way or another. But you’re being called from A – if A, after your callback returns, cannot safely figure out that it was destroyed, a crash could result when it attempts to perform the remaining work.
NOTE: It’s true that you could do the “destruction” in some other way, like decrementing a refcount, but that just leads to intermediate states, and extra code and bugs from handling them; it’s better for A to simply stop working entirely once you no longer need it, rather than continue in some intermediate state.
In my pattern, A would simply schedule the further work it needs to do by pushing an internal event (job) onto the event loop’s LIFO queue, then proceed to call the callback, and return to the event loop immediately. This piece of code is no longer a hazard, since A just returns. Now, if the callback doesn’t destroy A, the pushed job will eventually be executed by the event loop to do the extra work (after the callback is done, and all its pushed jobs, recursively). On the other hand, if the callback does destroy A, A’s destructor or deinit function can remove the pushed job from the event stack, preventing it from ever executing.
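Below is a minimal sketch of the pattern, assuming a hypothetical EventLoop with a LIFO job queue that supports arbitrary removal; the names are mine, not from any particular framework.

```python
class EventLoop:
    def __init__(self):
        self._jobs = []                 # LIFO: jobs run from the end

    def push(self, job):
        self._jobs.append(job)
        return job                      # handle that can be used for removal

    def remove(self, job):
        if job in self._jobs:
            self._jobs.remove(job)      # arbitrary removal cancels a pending job

    def run(self):
        while self._jobs:
            self._jobs.pop()()          # most recently pushed job runs first


class A:
    def __init__(self, loop, callback):
        self._loop = loop
        self._callback = callback
        self._pending = None

    def do_service(self):
        # Schedule the remaining work *before* calling back, then return to
        # the event loop immediately; nothing runs after the callback in this
        # frame, so destroying A inside the callback is safe.
        self._pending = self._loop.push(self._finish_work)
        self._callback()

    def destroy(self):
        # Deinit: cancel the scheduled job so it never touches a dead A.
        if self._pending is not None:
            self._loop.remove(self._pending)
            self._pending = None

    def _finish_work(self):
        print("A finishing its remaining work")
```

If the callback calls destroy(), the pushed job is removed and never runs; otherwise the event loop picks it up right after the callback (and anything the callback itself pushed) has finished.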
0
I think proper logging can help quite a bit. Make sure that every event thrown/handled is logged somewhere (you can use a logging framework for this). When you’re debugging, you can consult the logs to see the exact order in which your code executed when the bug occurred. Often this will really help narrow down the cause of the problem.
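A small sketch of the idea, assuming a hypothetical event bus: log every event at dispatch time and every handler just before it runs, so that the log reconstructs the exact execution order after the fact.

```python
import logging

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("events")

class LoggingBus:
    def __init__(self):
        self._handlers = {}

    def on(self, event, handler):
        self._handlers.setdefault(event, []).append(handler)

    def fire(self, event, payload=None):
        log.debug("event fired: %s payload=%r", event, payload)
        for handler in self._handlers.get(event, []):
            log.debug("  -> handler %s", getattr(handler, "__name__", repr(handler)))
            handler(payload)
```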
2
I wanted to update this answer since I’ve had some eureka moments after “flattening” control flows and have formulated some new thoughts on the subject.
Complex Side Effects vs. Complex Control Flows
What I’ve found is that my brain can tolerate complex side effects or complex graph-like control flows as you typically find with event handling, but not the combo of both.
I can easily reason about code that causes 4 different side effects if they’re being applied with a very simple control flow, like that of a sequential for loop. My brain can tolerate a sequential loop which resizes and repositions elements, animates them, redraws them, and updates some kind of auxiliary status. That’s easy enough to comprehend.
I can likewise comprehend a complex control flow as you might get with cascading events or traversing a complex graph-like data structure if there’s just a very simple side effect going on in the process where order doesn’t matter the slightest bit, like marking elements to be processed in a deferred fashion in a simple sequential loop.
Where I get lost and confused and overwhelmed is when complex control flows cause complex side effects. In that case the complex control flow makes it difficult to predict in advance where you’re going to end up, while the complex side effects make it difficult to predict exactly what’s going to happen as a result and in what order. It’s the combination of these two things that makes it so uncomfortable: even if the code works perfectly fine right now, it’s scary to change it for fear of causing unwanted side effects.
Complex control flows tend to make it difficult to reason about when/where things are going to happen. That only becomes really headache-inducing if these complex control flows are triggering a complex combination of side effects where it’s important to understand when/where things happen, like side effects that have some kind of order dependency where one thing should happen before the next.
Simplify the Control Flow or the Side Effects
So what do you do when you encounter the above scenario which is so difficult to comprehend? The strategy is to either simplify the control flow or the side effects.
A widely-applicable strategy for simplifying side effects is to favor deferred processing. Using a GUI resize event as an example, the normal temptation might be to reapply the GUI layout, reposition and resize the child widgets, triggering another cascade of layout applications and resizing and repositioning down the hierarchy, along with repainting the controls, possibly triggering some unique events for widgets that have custom resizing behavior which trigger more events leading to who-knows-where, etc. Instead of trying to do all this in one pass or by spamming the event queue, one possible solution is to descend down the widget hierarchy and mark which widgets need their layouts updated. Then, in a later, deferred pass with a straightforward sequential control flow, reapply all the layouts for the widgets that need it. You might then mark which widgets need to be repainted. Again, in a deferred sequential pass with a straightforward control flow, repaint the widgets marked as needing to be redrawn.
This simplifies both the control flow and the side effects: the control flow no longer cascades recursive events during the graph traversal; instead the cascades occur in a deferred sequential loop, which might then be handled by another deferred sequential loop. The side effects become simple where it counts, since during the more complex graph-like control flows all we’re doing is marking what needs to be processed by the deferred sequential loops that trigger the more complex side effects.
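Here is a rough sketch of that mark-then-process approach; Widget, the dirty flags, and the pass functions are hypothetical names used only to illustrate the pattern.

```python
class Widget:
    def __init__(self, children=()):
        self.children = list(children)
        self.needs_layout = False
        self.needs_repaint = False

    def apply_layout(self):
        print("layout", id(self))

    def repaint(self):
        print("repaint", id(self))


def all_widgets(root):
    # Flatten the hierarchy once so the deferred passes are plain loops.
    stack = [root]
    while stack:
        w = stack.pop()
        yield w
        stack.extend(w.children)


def mark_for_layout(root):
    # Complex graph-like traversal, but the only side effect is a flag.
    for w in all_widgets(root):
        w.needs_layout = True


def layout_pass(widgets):
    # Deferred pass: simple sequential control flow, the real side effects.
    for w in widgets:
        if w.needs_layout:
            w.apply_layout()
            w.needs_layout = False
            w.needs_repaint = True      # feeds the next deferred pass


def repaint_pass(widgets):
    for w in widgets:
        if w.needs_repaint:
            w.repaint()
            w.needs_repaint = False


root = Widget([Widget(), Widget([Widget()])])
mark_for_layout(root)
widgets = list(all_widgets(root))
layout_pass(widgets)
repaint_pass(widgets)
```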
This does come with some processing overhead, but it might then open up doors to, say, doing these deferred passes in parallel, potentially allowing you to end up with an even more efficient solution than the one you started with, if performance is a concern. Generally performance shouldn’t be much of a concern in most cases though. Most importantly, while this might seem like a moot difference, I have found it so much easier to reason about. It makes it much easier to predict what happens and when, and I can’t overstate the value that has in being able to more easily comprehend what’s going on.
So I’m wondering how other people are tackling this kind of code. Both when writing and reading it.
The model of event-driven programming simplifies coding to some extent. It probably evolved as a replacement for the big Select (or Case) statements used in older languages and gained popularity in early visual development environments such as VB 3 (don’t quote me on the history, I did not check it)!
The model becomes a pain when the event sequence matters and when one business action is split across many events. That style of processing negates the benefits of the approach. At all costs, try to keep the action code encapsulated in the corresponding event, and don’t raise events from within events; that quickly becomes far worse than the spaghetti resulting from GoTo.
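As a small, hypothetical sketch of that guideline: the whole business action lives sequentially inside the one handler for the triggering event, instead of the handler raising further events (saved, validated, persisted, ...) that other handlers react to.

```python
def on_save_clicked(form, repository):
    # One business action, one event handler, plain sequential code;
    # form and repository are assumed to provide these methods.
    data = form.collect_values()
    errors = validate(data)
    if errors:
        form.show_errors(errors)
        return
    repository.persist(data)
    form.show_confirmation()

def validate(data):
    return [] if data else ["nothing to save"]
```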
Sometimes developers are eager to provide GUI functionality that requires such event dependencies, and in those cases there is often no alternative that is significantly simpler.
The bottom line is that the technique is not bad if used wisely.
7
It sounds like you are looking for State Machines & Event Driven Activities.
However, you might also want to look at State Machine Markup Workflow Sample.
Here is a short overview of the state machine implementation. A state machine workflow consists of states. Each state is composed of one or more event handlers. Each event handler must contain a delay or an IEventActivity as its first activity. Each event handler can also contain a SetStateActivity activity that is used to transition from one state to another.
Each state machine workflow has two properties: InitialStateName and CompletedStateName. When an instance of the state machine workflow is created, it starts in the state named by the InitialStateName property. When the state machine reaches the state named by the CompletedStateName property, it finishes execution.
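This is not the Workflow Foundation API itself, just a generic, minimal state-machine sketch of the same idea with made-up names: named states, per-state event handlers, explicit transitions, and an initial and a completed state.

```python
class StateMachine:
    def __init__(self, initial_state, completed_state):
        self.state = initial_state
        self.completed_state = completed_state
        self._handlers = {}                         # (state, event) -> handler

    def on(self, state, event, handler):
        self._handlers[(state, event)] = handler

    def fire(self, event, payload=None):
        if self.state == self.completed_state:
            return                                  # finished: ignore further events
        handler = self._handlers.get((self.state, event))
        if handler is not None:
            next_state = handler(payload)           # handler may return a transition
            if next_state is not None:
                self.state = next_state


# Hypothetical order-processing usage:
machine = StateMachine(initial_state="WaitingForOrder", completed_state="Done")
machine.on("WaitingForOrder", "order_received", lambda order: "Processing")
machine.on("Processing", "shipped", lambda order: "Done")

machine.fire("order_received", {"id": 42})
machine.fire("shipped")
print(machine.state)                                # -> Done
```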
2
Event-driven code is not the real problem. In fact, I have no problem following the logic in event-driven code where callbacks are explicitly defined or inline callbacks are used. For example, generator-style callbacks in Tornado are very easy to follow.
What is really hard to debug are dynamically generated function calls: the (anti?)pattern I would call the Callback Factory from Hell. However, this kind of function factory is equally hard to debug in a traditional flow.
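To make the contrast concrete, here is a small sketch with made-up names (and asyncio rather than Tornado, but the shape is the same): the dynamically built callback only shows up as an anonymous closure in a stack trace, while the explicitly named coroutine reads top to bottom.

```python
import asyncio

# The "callback factory" shape: the handler is assembled at runtime,
# so debugging shows only a nested closure with no obvious origin.
def make_handler(transform, sink):
    def handler(event):
        sink(transform(event))
    return handler

# The flat, explicit shape: one named coroutine, linear control flow.
async def handle_request(fetch, render):
    data = await fetch("/items")        # easy to step through in order
    render(data)

async def main():
    # Stand-in fetch: asyncio.sleep with a result plays the role of an
    # asynchronous HTTP call for this sketch.
    await handle_request(lambda url: asyncio.sleep(0, result=[1, 2, 3]), print)

asyncio.run(main())
```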
What has worked for me is making each event stand on its own, without reference to other events. If they are coming in asynchronously, you don’t have a sequence, so trying to figure out what happens in what order is pointless, besides being impossible.
What you do end up with is a bunch of data structures that are getting read, modified, created and removed by a dozen threads in no particular order. You’ve got to do extremely careful multi-threaded programming, which is not easy. You’ve also got to think multi-threaded, as in “With this event, I am going to look at the data I have at a particular instant, without regard to what it was a microsecond earlier, without regard to what just changed it, and without regard to what the 100 threads waiting for me to release the lock are going to do to it. Then I will make my changes based on this event and what I see. Then I am done.”
One thing I find myself doing is scanning for a particular Collection and making sure that both the reference and the collection itself (if it is not thread-safe) are locked and synchronized correctly with other data. As more events are added, this chore grows. But if I were tracking the relationships between events, that chore would grow a lot faster. Plus, sometimes a lot of the locking can be isolated in its own method, actually making the code simpler.
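A small sketch of that, with hypothetical names: each handler looks only at the data as it is right now, under one lock, makes its change, and is done, with no reasoning about what other events did before it.

```python
import threading

class SharedIndex:
    def __init__(self):
        self._lock = threading.Lock()
        self._items = {}                        # shared collection

    # Locking isolated in its own methods, as mentioned above.
    def apply(self, key, update):
        with self._lock:
            current = self._items.get(key)
            self._items[key] = update(current)

    def snapshot(self):
        with self._lock:
            return dict(self._items)


def on_price_event(index, event):
    # Stands on its own: no assumptions about event ordering.
    index.apply(event["symbol"], lambda old: event["price"])
```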
Treating each thread as a completely independent entity is difficult (because of the hard-core multi-threading) but doable. “Scalable” may be the word I’m looking for. Twice as many events take only twice as much work, and maybe only 1.5 times as much. Trying to coordinate more asynchronous events will bury you quickly.
2