An Open View

How to detect an infinite loop in your BRMS Rules and auto recover - part 2

So you just got into building rules and you finally got something to work, when suddenly you hear your laptop fan start spinning out of control. You have no idea what is going on, and before you can do anything your laptop becomes completely unresponsive. All you remember is that the last thing you did was execute a set of business rules running on your local application server, whether JBoss/WildFly or Tomcat.

Well, chances are good that you just hit your first infinite loop! This is part two of a three-part blog post that covers how to detect an infinite loop in your rules and how to auto recover.

How to prevent an Infinite loop

In Part 1, we noticed how easy it is to generate an infinite loop. And the bad part of it is that you had no intention of creating one, so it always catches you off guard. So how do we prevent it from happening?

Avoid modifying objects on the RHS

As discussed in Part 1, infinite loops are created when you make use of the update/modify, delete/retract and insert commands in the RHS of the rule. So the immediate short answer is to avoid or limit the use of these commands in the RHS of the rules. But life is never that simple, and it is inevitable that you will eventually need to make use of these Working Memory modifiers.

Create well-defined constraints on the LHS

Infinite loops also depend on the combination of rule constraints in the LHS. This is very important to understand and remember. As shown previously (Part I - Code Block 2), by just slightly refining the constraint of the dog rule from "count >= 0" to "count < 10" in Code Block 1, I was able to avoid an infinite loop. By improving rule conditions, 90% of infinite loops can be avoided. It all starts with the business rule requirements: poorly defined business rules leave the door open for infinite loops. Changes to the action part (RHS) of a rule over time, driven by continuously changing business requirements and made without reviewing the rule conditions/constraints at the same time, can also create situations in which infinite loops occur. Therefore, it is ALWAYS very important to revisit the rule conditions the moment you make use of any of the Working Memory modifiers (update/modify, delete/retract or insert) in the RHS of the rule.
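To make the effect of that one constraint concrete, here is a minimal Java sketch. It is a toy model of a single rule that modifies its own fact and is re-evaluated after every firing. It does not use the Drools API, and the names ConstraintDemo, fireUntilNoMatch and MAX_FIRINGS are made up for illustration.

```java
import java.util.function.IntPredicate;

// Toy model of one rule whose RHS increments a counter and notifies the engine,
// so the rule is re-evaluated after every firing. This is NOT the Drools API.
class ConstraintDemo {

    // Hypothetical safety cap so the demo itself cannot run forever.
    static final int MAX_FIRINGS = 1_000;

    // Fires the rule for as long as its LHS constraint matches the counter.
    // Returns the number of firings, or -1 if the cap was reached (a runaway loop).
    static int fireUntilNoMatch(IntPredicate lhs) {
        int count = 0;   // the fact's "count" attribute
        int firings = 0;
        while (lhs.test(count)) {
            if (firings == MAX_FIRINGS) {
                return -1;   // still matching after the cap: treat it as infinite
            }
            count++;         // RHS: modify the fact, which triggers re-evaluation
            firings++;
        }
        return firings;
    }

    public static void main(String[] args) {
        System.out.println("count >= 0 : " + fireUntilNoMatch(c -> c >= 0));
        System.out.println("count < 10 : " + fireUntilNoMatch(c -> c < 10));
    }
}
```

With "count >= 0" the constraint stays true no matter how often the RHS runs, so the sketch gives up at the cap. With "count < 10" the RHS eventually makes the constraint false and the rule stops after ten firings.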

Also, look at other rules that have similar conditions on the same domain model objects. If an object is modified in the RHS of a rule, see whether you can add constraints to that rule and to other rules to improve the rule definitions and prevent them from firing again. By going through this exercise, you may notice that other rules are firing multiple times (without causing an infinite loop), which you did not expect when you originally wrote them. So it helps not only to avoid infinite loops, but can also prevent undesired/unexplained outcomes and improve rule definitions and the overall performance of the rules engine.

But be careful, I have seen how developers can go overboard and start to contaminate the fact model by adding all sorts of additional attributes, not associated with the business domain model at all. Do not use attributes unrelated to the business domain model. There are other paths that can be followed to avoid infinite loops, which we will look at next.

Make use of the agenda-group Rule attribute

Another good practice for avoiding infinite loops is to make use of agenda groups. When rules make many modifications to domain model objects, it is a good time to ask yourself whether the rules can be partitioned into a "flow" of logical steps/phases that objects should go through. For example, you insert facts into the rules engine, and the natural "flow" would be to go through an initialize phase (by firing a specific set of rules), then a calculate phase and lastly a validate phase. In this example it becomes clear that we have three rule sets: one for initialize, one for calculate and one for validate. With this knowledge you can group your rules into rule sets by adding an attribute called agenda-group to each rule.

Dividing rules into agenda groups makes it possible to control rule execution, so that at a given point in time the rules engine can only fire a specific set of rules. For example, if you modify a domain model object during the initialize phase (only firing rules with the agenda-group attribute set to "initialize"), no rules in the calculate phase will fire. That rules out the possibility of a rule in the calculate agenda group modifying the same object and causing the rules in the initialize agenda group to re-fire. Code Block 4 below provides an example where the cat and dog rules above are put in their own agenda groups:

[Image: Code Block 4 - Agenda Groups]

Now, it is possible to fire/activate only the dog rule(s) by setting the focus on the agenda group dog and then firing the rule(s) using the Java code below as an example:

[Image: Code Block 5 - Agenda Groups Set Focus]

After firing the dog rule(s), we can now set the focus on the cat rule(s) and fire only the cat rule(s) as demonstrated in Code Block 5 above. Keep in mind that all rules by default belong to the "MAIN" agenda group, if you do not specify the agenda-group attribute for a rule.
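The idea of giving one agenda group the focus at a time can be sketched in plain Java. This is a toy model under my own assumptions, not the Drools API: the classes AgendaGroupDemo, Rule and Counter, the fire method and the counting rules are all hypothetical. The dog rule counts up and the cat rule counts down, so without groups each keeps making the other match again.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Toy model of agenda groups: only rules in the focused group may fire.
// This is NOT the Drools API; all names here are hypothetical.
class AgendaGroupDemo {

    static final int MAX_FIRINGS = 100; // safety cap for the demo

    static final class Counter {
        int count;
    }

    static final class Rule {
        final String name;
        final String group;
        final Predicate<Counter> when; // LHS
        final Consumer<Counter> then;  // RHS

        Rule(String name, String group, Predicate<Counter> when, Consumer<Counter> then) {
            this.name = name;
            this.group = group;
            this.when = when;
            this.then = then;
        }
    }

    // The dog rule counts up and the cat rule counts down.
    static List<Rule> rules() {
        return Arrays.asList(
            new Rule("dog", "dog", c -> c.count < 3, c -> c.count++),
            new Rule("cat", "cat", c -> c.count > 0, c -> c.count--));
    }

    // Fires matching rules until nothing matches or the cap is reached.
    // A null focus means "ignore groups"; otherwise only that group may fire.
    static List<String> fire(List<Rule> rules, Counter fact, String focus) {
        List<String> fired = new ArrayList<>();
        boolean firedSomething = true;
        while (firedSomething && fired.size() < MAX_FIRINGS) {
            firedSomething = false;
            for (Rule r : rules) {
                boolean inFocus = focus == null || focus.equals(r.group);
                if (inFocus && r.when.test(fact) && fired.size() < MAX_FIRINGS) {
                    r.then.accept(fact);
                    fired.add(r.name);
                    firedSomething = true;
                }
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        System.out.println("no groups: " + fire(rules(), new Counter(), null).size() + " firings");
        Counter c = new Counter();
        System.out.println("focus dog: " + fire(rules(), c, "dog"));
        System.out.println("focus cat: " + fire(rules(), c, "cat"));
    }
}
```

Without a focus the two rules keep re-triggering each other until the cap is hit. With a focus, each phase runs to completion on its own. In Drools itself, the equivalent of passing a focus should be a call along the lines of kieSession.getAgenda().getAgendaGroup("dog").setFocus() followed by kieSession.fireAllRules(), where kieSession is assumed to be an existing KieSession.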

Make use of modify instead of update

In Drools, both the update and modify commands notify the rules engine that an object was changed and require the rules engine to re-evaluate all rules having conditions based on that object. So why are there two commands for doing the same thing? Well, the modify command is special in that you identify exactly which attributes are modified on an object. If you look at line 7 of Code Block 6 below, you will see an example of the modify command:

[Image: Code Block 6 - Modify vs. Update]

However, the update command (see the example on line 3 above) only takes an object as an input parameter, instead of one or more attributes like the modify command. The modify command is therefore more fine-grained than the update command. Why is that important to know? It tells the rules engine that a very specific attribute was modified on an object, so that it only re-evaluates rules with constraints on that specific attribute instead of on any attribute of the given object. So if a domain model class has twenty attributes and you modified only one, only rules with constraints on that one attribute will be re-evaluated, which makes the rules far less susceptible to an infinite loop (roughly twenty times less, if the constraints are spread evenly over the attributes). In the example below in Code Block 7, the change of the name attribute on line 10 will not cause the rule on line 13 to be activated and fire, because the rule is only interested in the count attribute on line 10.

[Image: Code Block 7 - Modify]

Now, this is where things get interesting. If you actually go and execute these rules in a rules engine, you will find that the first rule did cause the rules engine to re-evaluate both rules and cause the second rule to be re-activated, so what is going on? The truth is that modify by default behaves just like update, so that the rules engine stays backward compatible with former releases. To activate the fine-grained property listeners of the rules engine when using the modify command, you have to annotate the specific domain model class with the @PropertyReactive annotation. So for the above case, to switch on the fine-grained property listeners in the rules engine, annotate the declared Counter class like this:

[Image: Code Block 8 - @PropertyReactive]

Note that inside a DRL file you use @propertyReactive with a lowercase "p", whereas in Java you use @PropertyReactive with an uppercase "P".
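The difference between the default behaviour and the property-reactive behaviour can be pictured with a small Java sketch. It is only a toy model of the decision "which rules get re-evaluated after a change" and does not call Drools. The class PropertyReactiveDemo, its methods and the two rule names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of property reactivity. This is NOT the Drools API.
class PropertyReactiveDemo {

    // Maps each (hypothetical) rule name to the Counter attributes its LHS constrains.
    static Map<String, Set<String>> sampleRules() {
        Map<String, Set<String>> rules = new LinkedHashMap<>();
        rules.put("rule on name", Collections.singleton("name"));
        rules.put("rule on count", Collections.singleton("count"));
        return rules;
    }

    // Returns the rules that are re-evaluated after the given attributes changed.
    // propertyReactive == false: any change to the object re-evaluates every rule.
    // propertyReactive == true:  only rules constraining a changed attribute react.
    static List<String> rulesToReevaluate(Map<String, Set<String>> rules,
                                          Set<String> changed,
                                          boolean propertyReactive) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> rule : rules.entrySet()) {
            boolean constrainsChange = !Collections.disjoint(rule.getValue(), changed);
            if (!propertyReactive || constrainsChange) {
                result.add(rule.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> changed = Collections.singleton("name");
        System.out.println("default:           " + rulesToReevaluate(sampleRules(), changed, false));
        System.out.println("property reactive: " + rulesToReevaluate(sampleRules(), changed, true));
    }
}
```

In the default mode a change to name also wakes up the rule that only looks at count, which is exactly the surprise described above. In the property-reactive mode that rule is left alone.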

Now, the moment you start to play with changes on the attribute level instead of the object level, you may encounter difficult situations, such as changes on nested attributes (e.g., a Person can have an Address with street, city and state attributes) and attributes that can be a combination of other attributes within an object (e.g., Full Name is made up of Last Name and First Name). These issues are out of scope for this blog, but you can go and look at the annotations @watch, @Modifies and @ClassReactive for more information.

Make use of the no-loop Rule attribute

Most rule developers will jump right to the no-loop attribute when encountering an infinite loop. However, at this point it should be clear that there are quite a few other things that you can and should be doing to avoid an infinite loop. So when should you use the no-loop rule attribute?

With complex rules projects, which are close to maturity and ready to go to production, there will be times where you have no choice but to add a rule with some kind of object modification on the RHS that will cause that rule to re-activate itself, thus causing an infinite loop or a so-called self-loop. We briefly touched on the no-loop attribute in Part I - Code Block 3 to prevent the dog rule from calling itself. Basically, a rule with the no-loop attribute set to true (default is false) will not be re-activated due to an object change on the RHS of the SAME rule. In other words, the rule cannot re-activate ITSELF. Use this attribute conservatively and only as a last resort. Using the no-loop attribute interferes with the rules engine algorithm and prevents it from doing its magic. It also reduces the declarative nature and purpose of a rules project.

Make use of the lock-on-active Rule attribute

I have often seen developers add the no-loop attribute only to find out that the infinite loop is still happening. It is a slippery slope: the moment you start to add the no-loop attribute, you find that you need to add it to other rules as well, to the point where suddenly it does not make sense anymore. Well, do not fear, the doctor has something stronger to prescribe: lock-on-active. Like no-loop, lock-on-active is a rule attribute that helps prevent infinite loops. However, lock-on-active is more drastic. It will NOT RE-ACTIVATE a rule within a given agenda group, if ITSELF OR ANY OTHER RULE in the same agenda group triggered a re-evaluation due to object modification in the RHS (remember the default agenda group is MAIN).

So if you have ten rules in an agenda group and say five of them have lock-on-active set to true and five not, the rules engine can re-activate and fire any of the five rules with lock-on-active set to false and with matching conditions on the LHS, but none of the five rules with lock-on-active set to true. To put it differently, if an agenda group receives the focus, any rule within the agenda group with lock-on-active set to true, will not be re-activated again, irrespective of the origin of the modification. If a match was found on a rule with lock-on-active set to true, the activation of the rule will be canceled as well.
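The difference between no-loop and lock-on-active can be illustrated with a toy agenda in Java. This is a simplified model under my own assumptions, not the Drools API and not the actual engine algorithm: every rule always matches, every RHS modifies the fact, and the names LoopAttributesDemo, Guard, run and MAX_FIRINGS are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy agenda for rules that always match and always modify the fact.
// This is NOT the Drools API; it only mimics the re-activation decision.
class LoopAttributesDemo {

    enum Guard { NONE, NO_LOOP, LOCK_ON_ACTIVE }

    static final int MAX_FIRINGS = 100; // safety cap for the demo

    // All rules live in one agenda group and carry the same guard attribute.
    // Returns the total number of firings, capped at MAX_FIRINGS.
    static int run(List<String> rules, Guard guard) {
        Deque<String> agenda = new ArrayDeque<>(rules); // activations after the insert
        int firings = 0;
        while (!agenda.isEmpty() && firings < MAX_FIRINGS) {
            String firing = agenda.poll();
            firings++; // the RHS runs and modifies the fact
            // The modification makes the engine look at every rule again.
            for (String candidate : rules) {
                if (guard == Guard.LOCK_ON_ACTIVE) {
                    continue; // never re-activated, whoever made the change
                }
                if (guard == Guard.NO_LOOP && candidate.equals(firing)) {
                    continue; // a rule cannot re-activate itself
                }
                if (!agenda.contains(candidate)) {
                    agenda.add(candidate);
                }
            }
        }
        return firings;
    }

    public static void main(String[] args) {
        List<String> one = Arrays.asList("ping");
        List<String> two = Arrays.asList("ping", "pong");
        System.out.println("one rule,  no guard:       " + run(one, Guard.NONE));
        System.out.println("one rule,  no-loop:        " + run(one, Guard.NO_LOOP));
        System.out.println("two rules, no-loop:        " + run(two, Guard.NO_LOOP));
        System.out.println("two rules, lock-on-active: " + run(two, Guard.LOCK_ON_ACTIVE));
    }
}
```

In this model no-loop cures the self-loop of a single rule, but two rules with no-loop still re-activate each other until the cap is reached. With lock-on-active each rule fires once and the agenda drains.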

Looking at Code Block 9 below, each of these rules will fire only once, because of well-defined conditions on the LHS. But if you make use of an agenda listener (Java class: org.kie.api.event.rule.AgendaEventListener), you will notice that each of these rules is re-evaluated at least three times, which means the rules engine tried to re-activate each of these rules three times instead of just once. Now, you can add the no-loop attribute to each rule and it will reduce the re-evaluation to two times for each rule, because each rule is prevented from re-activating itself. If you add lock-on-active, each rule will fire only once and will not be re-evaluated again.

[Image: Code Block 9 - Lock-on-active]

Now, let's add a rule (rule 4) to our previous set of rules, one which does not have lock-on-active (see Code Block 10 below). In this case, we went ahead and marked all the rules with lock-on-active except for rule 4. Take a guess: how many times do you think rule 4 will fire for this rule set? The answer is three times. Every time one of the other rules makes a modification to the Counter object, it causes a re-evaluation, resulting in the re-activation of rule 4.

[Image: Code Block 10 - Lock-on-active]

Something else that is interesting is that there is also a rule attribute called activation-group, which is very similar to an agenda group. However, within an activation group only one rule will fire (the rules are mutually exclusive), after which all the other rules in the same activation group are canceled. It is similar to an agenda group in which all the rules have lock-on-active set to true, but it adds mutual exclusivity to the rules. So go try it out and see what it does.

Let's Recap 

  • Avoid or limit the use of the update/modify, delete/retract and insert commands in the RHS of a rule
  • If you have to use one of the above commands, evaluate and refine the rule constraints on the LHS, for rules with constraints on the same domain model classes
  • Identify rule phases/steps to partition rules and implement rule sets by making use of the rule attribute agenda-group
  • Make use of the modify command in combination with @PropertyReactive instead of the update command
  • Make use of the no-loop rule attribute to avoid a self-loop
  • Make use of the lock-on-active rule attribute to avoid a compound-loop
  • Make use of the activation-group rule attribute to avoid a compound-loop

What's next?

Be on the lookout for part 3, where I will cover how to automatically detect an infinite loop and exit the Rules Engine, as well as how to generate a notification when one is detected.

Ben-Johan van der Walt

Ben-Johan van der Walt is a Software Architect/Engineer with over 20 years of experience leading successful projects of various sizes and scopes. He is a seasoned professional, with outstanding project planning, execution, mentoring and support skills. He is always ready for a challenge.