Positive action
Having seen that the error/timeout/retry loop can cause performance loss for the surviving slaves, and having done what we can to optimize the loop, there is one more trick available: simply not entering the loop, or at least not so often. We have of course already optimized our polling rate for normal use, but if we greatly reduce polling further after we have encountered a fault, we will also greatly reduce those time-wasting error loops. We should still retry once in a while, hoping that the fault has been repaired – but every five minutes is probably enough.
Slave Thing disabling
Earlier we saw how to detect a failing poller Thing and use that to trigger a rule. Within that rule, we can use the openHAB REST API to disable the TCP/serial bridge Thing that “owns” the poller. That will also disable any other pollers on this Modbus slave.
The rule can also start an independent timer that will enable the Thing again in a few minutes. At that time, the binding will attempt to recover the connection to the slave and restart polling. Should this fail again, our rule will be triggered once more.
rule "modbus slave22 error"
when
    // any poller Thing for this slave
    Thing "modbus:poller:slave22:s22_coils" changed or
    System started // in case faulty at boot up
then
    var Tstatus = getThingStatusInfo("modbus:poller:slave22:s22_coils").getStatus()
    if (Tstatus.toString == "OFFLINE") {
        // set the TCP/serial bridge Thing offline
        // this will stop new polling requests on all associated pollers
        sendHttpPutRequest("http://localhost:8080/rest/things/modbus:tcp:slave22/enable", "application/json", "false")
        // now schedule a future recovery attempt
        createTimer(now.plusMinutes(5), [ |
            sendHttpPutRequest("http://localhost:8080/rest/things/modbus:tcp:slave22/enable", "application/json", "true")
        ])
    }
end
Note that you will likely see a couple of further errors logged after the disabling, as already scheduled polls work through the queue.
There is a potential snag – if we reboot with a faulty slave, the rule will not be triggered by a poller status change, and we would never reduce polling or attempt recovery. The workaround is the addition of the System started trigger to the rule; normally the poller will not be offline at boot, but if it is, we will begin our periodic recovery attempts instead of regular polling.
You could instead use the REST API to alter the refresh parameter of the poller Thing. You would also need to arrange for later restoration of the “normal” refresh parameter. There are drawbacks to that method:
(a) It won’t work if you use xxx.things text files to configure Modbus, as file-based Thing configurations are not editable at runtime.
(b) You may have several pollers to manage for each slave.
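As a rough sketch of that alternative, the openHAB REST API allows updating an editable Thing’s configuration with a PUT to its /config endpoint. The Thing UID and millisecond values below are illustrative only, reusing the example slave from the rule above:

    // sketch only: slow a (UI-managed) poller down after a fault by raising
    // its refresh parameter (milliseconds) via the Thing config endpoint
    sendHttpPutRequest("http://localhost:8080/rest/things/modbus:poller:slave22:s22_coils/config", "application/json", '{"refresh": 300000}') // retry every 5 minutes
    // ... and later restore the normal polling rate
    sendHttpPutRequest("http://localhost:8080/rest/things/modbus:poller:slave22:s22_coils/config", "application/json", '{"refresh": 1000}') // back to 1 second

Note that this must be repeated for every poller belonging to the slave, which is why disabling the bridge Thing is usually the simpler approach.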
Summary
Some simple rules can further reduce the performance impact of Modbus faults, while still allowing for automatic recovery after “repair”.