Representing a rule in a ruleset

How to represent rules for a rule engine as objects?

A rule would be

if (booleanExpression(input)) then a chain of generic actions” else next rule

…where the generic actions might be e.g. passing the input to a different chain of rules, returning a conclusion (terminating analysis with result), or gathering additional data – though probably other actions might prove necessary too.

Now the problem of parsing arbitrary boolean conditions aside (let’s assume we have a fully working booleanExpression class available), how would such a Rule object look like, and specifically, how would the Ruleset object containing orderly, interconnected grid of these look like, in particular in a structure that will be possible to process by the actual engine. (if the data structures don’t make it obvious how such engine should process them, maybe a hint about it too?)

This will be written in C++ but a language-agnostic answer is most welcome too.

This is derivative of my question about Event Correlator, which seems to be unanswerable simply because it’s too broad to tackle in one answer, so I’m trying to split the problem into smaller, bite-sized chunks.

Given that you have already defined what a rule is, the corresponding data structure would be straightforward (pseudo-code):

class Rule {
  Condition condition;
  ActionSet positiveActions, negativeActions
}

However, as you indicated by your reference to a ruleset, the interesting part of rule engines is the process of rule selection and their execution strategy. I’m afraid that there are as many variants as you believe there might be – plus some.

Typically though, a rule engine’s rules – at least for about the dozen rule engines I’ve seen – do not contain an “else” part. The idea of rules being that they have some condition on when they apply, but if they do not apply, then they simply are not applied. Hence, the engine development comes down to two core tasks:

Determine which rules are applicable

Numerous research papers are available on this one. The selection of applicable rules depends highly on the data structure used to represent your rules. If you truly want to allow arbitrary boolean conditions, then clearly, nothing can be determined unless the condition is evaluated on all possible inputs. Hence, most rule engines have some objects/predicates/constraints/whatever-the-things-are-called, which you have to use for the part of the rule that makes it applicable. This allows the engine to intelligently (read: efficiently) select from all rules and inputs the applicable ones.

Here’s a more concrete example of how something like this could look like in f.ex. Prolog:

MySignal(_, true, false) :- do_something

MySignal(_, true, true) :- do_something_else

Clearly an implementation can abuse the fact that these rules only apply to MySignal and only in case of the second argument being true. Hence, it is possible to improve the rule lookup with a dedicated lookup table for MySignal.

Note that in addition to the rules there is one very important other aspect, namely the input or more generally, the data on which these rules are supposed to operate. In most rule engines, it is furthermore possible to modify this data by the application of rules (i.e., you start with something, then the rules add new things to that data). In this example, the input would need to be stored such that one can efficiently determine if there is a MySignal input (again some kind of lookup table could be envisioned).

This input handling also needs a connection to your rules. In rule engines this is either performed via the above-mentioned objects/predicates/etc., or via logical variables. In other words: the actions that your rule has to perform will typically depend on the input, and as such, the input needs to be passable to the actions in some way. More concretely consider the following example:

MySignal(X, false, false) :- print(X)
Determine execution strategy of applicable rules

Most engines fall back to a trivial top-to-bottom strategy here, as in, the textually first rule in the rules file that matches the input is applied. You can go arbitrarily complex here as well. Just a few examples of different possible strategies:
- Prolog adds backtracking to the rule execution strategy such that a failure leads to another rule being used during the backtracking.
- Priority annotations made to rules can be used to derive a priority and from that the execution order.
- Probabilities can be annotated as well, leading to a probabilistic rule engine that selects from the applicable rules via a PRNG.
- Simple strategies are also possible, like simply executing the first rule found or all rules that are applicable (the latter only works if the rules do not consume the input).

So essentially your question boils down to which way you want to go: either full-blown rule engine, or a simple filter-map combination.

Just for the sake of completeness, here’s a faked rule-engine approach that might be applicable to your case (functional pseudo-code, without else-part but that should be clear). In this case, the rule selection is simply to check that the input satisfies the condition and the execution strategy is to apply all applicable rules in their textual order:

let I be the list of inputs
let R be a list of rules
I map (i => R filter ( r => r.condition(i) == true ) map ( r => r.action(i) )

Obviously, the runtime complexity is very bad with increasing number of rules due to the need of having to check each combination of condition and input. After all, a prime feature of rule engines is that they provide some means to deal with this combinatorial explosion.

In case you are interested on the proper rule engines, here are a few additional keywords to get you started on your quest on how to implement these: rule-based operational semantics, RETE, unification

Well, how will you invoke this thing?

I would assume it’s something like this:

for (auto rule: ruleset) {
    if (rule.matches(context)) {
        if (rule.execute() == Stop)
            break # so rules can stop early
        # else: default action is to continue
    }
}

So, you have an object with with two methods

bool Rule::matches(Context const& input) { return booleanExpression(input); }
void execute() {
    for (auto action: actions) action.execute();
}

Is that the sort of thing you’re looking for? This trivially works for a simple linear ruleset (a bit like iptables, though even that needs branching), but for a decision tree, one of the available actions needs to be “recursively evaluate this nested subtree”. It’ll be easier to demonstrate with a small sample ruleset, if you have one.

I would suggest you try to go away using actual boolean expressions and figure out what actions or events you’re doing to the “system” that you’re developing. If you’re using if-statements you might as well use a scripting language.

Once upon a time I’ve created a DSL in JSON format that defined rules as actions or events instead of using boolean expressions. I chose JSON because the DSL code was generated from a web backend and was going to be used in a javascript web front end.

Consider a series of options one user can take in a designer UI, all individually identifiable by a name. The designer UI for the example below will be cars that the user can select. The DSL basically looked like this:

// an array of rules
"rules": [
    // rule number 1
    { 
        // the originating option of the event or action
        "source": "car_1",

        // the option(s) that receives of the action or event
        "destination": ["color_yellow", "color_blue"],

        // the event or action that should happen
        // in this case the destination should be disabled
        "action": "disable",

        // what to roll back to if the destination was selected
        "instead": "color_red"  
    },
    // rule number 2, same as above but used to 
    // demonstrate chaining
    {
        "source": "roof_sunroof",
        "destination": "color_red",
        "action": "disable",
        "instead": "color_green"
    },
    ... and so on         
]

In this example, in the first rule: if the user selected the option “car_1” the color options “color_yellow” and “color_blue” will be disabled (because that particular car does not come in yellow and blue colors). In the second rule, if the user selects “roof_sunroof”, then the option of “color_red” is disabled.

Meanwhile the front-end would take this JSON file and create all the options on the User Interface and use the rules array to determine which events to set up.

The actions and events were implemented in such extend that they were chainable by the rule set, so if one source would disable and the roll back option was also disabled, the event would look up what to roll back to instead. In the example above if the option “roof_sunroof” was selected and then “car_1”, the application would first try to rollback to “color_red” (which is disabled) and then to “color_green”. If there was any circular dependencies, it would most likely be a bug in the DSL rather in the designer UI. Note that the events enforce the rules without the need to build a complex tree.

On the back-end side, it’s easier to represent the rules as objects if they’re definitions for a DSL. Sure if you go the if-statement route, you could persist the rule as a text script.

This is somewhat oblique to your question, but it sounds to me as though what you are talking about in this case is actually the creation of a Domain Specific Language- if you start researching those you will find all kinds of interesting articles. Martin Fowler provides a good starting point.

DSLs are a very useful area to be conscious of and something that I think we will be seeing a lot more of over the next few years.

Filed under: softwareengineering - @ 01:04

Thẻ: data-structures, rules-and-constraints, rules-engine

Thiết kế website giá rẻ

Danh mục

Representing a rule in a ruleset