[SOLVED] Many (100+) Modbus pollers: are they slowing down the binding?

Hi,

I just upgraded from modbus binding v1 to v2.

I have one tcp-server (slave), a Carlo Gavazzi SH2WEB, which has a massive modbus-map with all its functions and their parameters.

In ver1 I had defined 144 slaves (one for each function I wanted OH to read/write). It has worked very well and response time has been within a second.

In ver2 I figure I have to define a poller for each of the old “slave” definitions to achieve the same, because the Modbus registers I use are not consecutive.

Now I have defined 100+ Things, each with its own poller.

I find that the response time is now much longer: approximately 8-10 seconds from changing, for example, a dimmer in the sitemap until it happens in real life.

  1. Can someone with knowledge about the binding explain why? @ssalonen?
  2. What can I do to improve the performance? Writing is what I’d like to give higher priority; reading is not so critical.

Extract of my .things file (couldn’t fit all lines):

// Modbus to Smart-House
Bridge modbus:tcp:SH2WEB [ host="192.168.1.198", port=502, id=1 ] {
    // smart-house (Fx) Sovrum 4 (Kontor) - Ljusfunktion Tak_Status
    Bridge poller P1 [ start=0, length=1, refresh=1000, type="holding" ] {
        Thing data D1 [ readStart="0", readValueType="uint16", writeStart="0", writeValueType="uint16", writeType="holding" ]
    }
    // smart-house (Fx) Sovrum 4 (Kontor) - Ljusfunktion Fönster_Status
    Bridge poller P2 [ start=38, length=1, refresh=1000, type="holding" ] {
        Thing data D2 [ readStart="38", readValueType="uint16", writeStart="38", writeValueType="uint16", writeType="holding" ]
    }
    // smart-house (Fx) Vardagsrum - Ljusfunktion Fönster LU_Status
    Bridge poller P3 [ start=76, length=1, refresh=1000, type="holding" ] {
        Thing data D3 [ readStart="76", readValueType="uint16", writeStart="76", writeValueType="uint16", writeType="holding" ]
    }
    // smart-house (Fx) Entré- Ljusfunktion VU_Status
    Bridge poller P4 [ start=114, length=1, refresh=1000, type="holding" ] {
        Thing data D4 [ readStart="114", readValueType="uint16", writeStart="114", writeValueType="uint16", writeType="holding" ]
    }
    // smart-house (Fx) Sovrum 3 (P&J) - Temperatur_Status
    Bridge poller P5 [ start=176, length=1, refresh=1000, type="holding" ] {
        Thing data D5 [ readStart="176", readValueType="uint16", writeStart="176", writeValueType="uint16", writeType="holding" ]
    }
    // smart-house (Fx) WC - Dimbart ljus_Status
    Bridge poller P6 [ start=272, length=1, refresh=1000, type="holding" ] {
        Thing data D6 [ readStart="272", readValueType="uint16", readTransform="JS(dimmertransform.js)", writeStart="272", writeValueType="uint16", writeType="holding" ]
    }
    // smart-house (Fx) Hall - Ljusfunktion ute entré_Status
    Bridge poller P7 [ start=349, length=1, refresh=1000, type="holding" ] {
        Thing data D7 [ readStart="349", readValueType="uint16", writeStart="349", writeValueType="uint16", writeType="holding" ]
    }
    // smart-house (Fx) Ljuskanal - Natt_Status
    Bridge poller P8 [ start=387, length=1, refresh=1000, type="holding" ] {
        Thing data D8 [ readStart="387", readValueType="uint16", writeStart="387", writeValueType="uint16", writeType="holding" ]
    }
    // smart-house (Fx) Sovrum 3 (P&J) - Ljusfunktion Tak_Status
    Bridge poller P9 [ start=487, length=1, refresh=1000, type="holding" ] {
        Thing data D9 [ readStart="487", readValueType="uint16", writeStart="487", writeValueType="uint16", writeType="holding" ]
    }
    // smart-house (Fx) Sovrum 3 (P&J) - Ljusfunktion Fönster_Status
    Bridge poller P10 [ start=525, length=1, refresh=1000, type="holding" ] {
        Thing data D10 [ readStart="525", readValueType="uint16", writeStart="525", writeValueType="uint16", writeType="holding" ]
    }
    // smart-house (Fx) Parkering Motorvärmare_Status
    Bridge poller P11 [ start=563, length=1, refresh=1000, type="holding" ] {
        Thing data D11 [ readStart="563", readValueType="uint16", writeStart="563", writeValueType="uint16", writeType="holding" ]
    }
    // smart-house (Fx) Spa - Ljusfunktion Spegel_Status
    Bridge poller P12 [ start=587, length=1, refresh=1000, type="holding" ] {
        Thing data D12 [ readStart="587", readValueType="uint16", writeStart="587", writeValueType="uint16", writeType="holding" ]
    }

....

}

Make sure you are logging no errors, either communications issues or configuration problems. Waiting for timeouts and retries really eats up time.

Don’t write to things that you don’t need to write to. It’s okay to configure read-only registers.

Don’t read things more often than you have to (the v2 binding allows you to poll different things at different rates).
For example, room temperature won’t need sampling every second.
100 polls per second allows only 10 milliseconds for each poll. There’ll be a number of places in your system that can cause bottlenecks, so anything you can do to reduce loading helps.
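
As a rough sketch of mixed poll rates, using two registers from the extract above: the light status stays at one second while the temperature poller is slowed down. The refresh=60000 value is only an assumed figure for illustration, and treating D5 as read-only (no write parameters) is also an assumption about how that register is used.

```
Bridge modbus:tcp:SH2WEB [ host="192.168.1.198", port=502, id=1 ] {
    // Light status: keep the 1 s poll so the sitemap follows the wall switches
    Bridge poller P1 [ start=0, length=1, refresh=1000, type="holding" ] {
        Thing data D1 [ readStart="0", readValueType="uint16", writeStart="0", writeValueType="uint16", writeType="holding" ]
    }
    // Room temperature: changes slowly, so poll once a minute (assumed value)
    Bridge poller P5 [ start=176, length=1, refresh=60000, type="holding" ] {
        // Read-only: no writeStart / writeValueType / writeType (assumes OH never writes this register)
        Thing data D5 [ readStart="176", readValueType="uint16" ]
    }
}
```

Every poller moved from 1000 ms to 60000 ms removes roughly 59 of its 60 requests per minute from the queue.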

Don’t read things that you don’t need to read at all (the v2 binding allows write-only registers).
If OH is writing contents, it already knows what they are - can they change?
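
A minimal sketch of a write-only register, assuming (as I understand the binding documentation) that a data thing may sit directly under the tcp bridge without a poller. The thing name D7write is hypothetical, and register 349 is just taken from the extract above; check the documentation for your binding version before relying on this.

```
Bridge modbus:tcp:SH2WEB [ host="192.168.1.198", port=502, id=1 ] {
    // Write-only: child of the tcp bridge, no poller and no read parameters
    Thing data D7write [ writeStart="349", writeValueType="uint16", writeType="holding" ]
}
```

The trade-off is that the Item state is never refreshed from the slave, so this only suits registers that nothing else (a wall switch, for example) can change.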

On write priority: there’s no way to schedule that. When you change an OH Item, any associated Modbus binding gets written to “immediately” (but via a queue).
Hence the suggestion to look carefully at anything you can avoid writing to, use as a read-only instead.

Thanks for your quick and detailed reply, appreciated!

I will try to optimize based on your advice and see what improvements I can achieve.

Would it be more effective to reduce the number of pollers where possible (where there are several registers within the allowed span) and create more data things as children of each poller? I’d prefer not to, but if it improves response time significantly I will consider it.

I do not know for sure, but I’m pretty confident that it leads to better performance in both communications and processing.
Your slave does need to be able to deal with registers included in a block / span that might not be defined. Some slaves reject such reads; yours is presumably under your control?
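
For illustration, here is how the first four pollers from the extract above might be merged into one. Registers 0 to 114 make 115 registers, which fits within the Modbus limit of 125 holding registers per read request. The poller name "lights" is hypothetical, and the whole sketch assumes the SH2WEB accepts a read spanning the unused addresses in between.

```
Bridge modbus:tcp:SH2WEB [ host="192.168.1.198", port=502, id=1 ] {
    // One request per second covers registers 0..114, replacing pollers P1-P4
    Bridge poller lights [ start=0, length=115, refresh=1000, type="holding" ] {
        Thing data D1 [ readStart="0", readValueType="uint16", writeStart="0", writeValueType="uint16", writeType="holding" ]
        Thing data D2 [ readStart="38", readValueType="uint16", writeStart="38", writeValueType="uint16", writeType="holding" ]
        Thing data D3 [ readStart="76", readValueType="uint16", writeStart="76", writeValueType="uint16", writeType="holding" ]
        Thing data D4 [ readStart="114", readValueType="uint16", writeStart="114", writeValueType="uint16", writeType="holding" ]
    }
}
```

Note that moving a data thing under a different poller changes its Thing UID, so the channel links in the .items file would need updating to match.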

More on that topic … as revealed in other threads, the Modbus binding starts off with a queue to initially ‘refresh’ the read registers. If there’s not much headroom in the system(s), it can take some time to catch up on that queue as new poll requests come along and add to it.
In other words, don’t despair if the initial response is sluggish; it may settle down after a few minutes.

With just one slave targeted, you might be hitting the limits of how quickly it can turn polls around.

Your slave does need to be able to deal with registers included in a block / span that might not be defined. Some slaves reject such reads; yours is presumably under your control?

I know that slaves can reject such requests. I will run some tests to see how the SH2WEB reacts. It is also possible to rearrange its Modbus map manually, but I’d prefer not to touch it, since I am aiming for a generic solution for other users of the system as well.

Cheekily paging @mbs38 as the user experienced with “demanding” Modbus networks. Any further advice for us?

Unfortunately not. Everything I would have suggested has already been proposed by you, for example decreasing the number of pollers. However, I do not think it is really going to solve the problem, because I am having similar issues with my serial setup. Actually, this is why I am still delaying a move to the 2.x binding on my production systems.

In my setup something like this

Bridge modbus:serial:hut22 [ port="/dev/ttyUSB0", baud=38400, id=22, stopBits="1.0", parity="none",dataBits=8, echo=false, encoding="rtu", flowControlIn="none", flowControlOut="none", receiveTimeoutMillis=200, timeBetweenTransactionsMillis=5 ]
{

        Bridge poller coils [ start=0, length=99, refresh=20, type="coil" ]
        {
                Thing data out0 [ readStart="0", readValueType="bit", writeStart="0", writeValueType="bit", writeType="coil" ]
                Thing data out1 [ readStart="1", readValueType="bit", writeStart="1", writeValueType="bit", writeType="coil" ]
                Thing data out2 [ readStart="2", readValueType="bit", writeStart="2", writeValueType="bit", writeType="coil" ]
                Thing data out3 [ readStart="3", readValueType="bit", writeStart="3", writeValueType="bit", writeType="coil" ]
                Thing data out4 [ readStart="4", readValueType="bit", writeStart="4", writeValueType="bit", writeType="coil" ]
                Thing data out5 [ readStart="5", readValueType="bit", writeStart="5", writeValueType="bit", writeType="coil" ]
                  .
                  .
                  .
                  .
                Thing data out95 [ readStart="95", readValueType="bit", writeStart="95", writeValueType="bit", writeType="coil" ]                                    
        }
        
        Bridge poller sensor    [ start=52, length=7, refresh=20, type="holding" ]
        {
                Thing data voltage              [ readStart="52", readValueType="uint16" ]
        
        }
}

causes high CPU load and also responds kind of sluggishly. In fact both CPU cores on my Intel NUC (N3050 CPU) are at around 80% all the time, causing the system to heat up and the fan speed to rise to audible levels. With the old 1.x binding the load was negligible even with >50 Modbus transactions per second.

Ssalonen and I are already in a private conversation about this. I have given him access to one of my systems so he can do some real-world tests. He is working on a solution and, if I have understood correctly, part of the problem has been identified.

Maybe @ssalonen can clarify?

Best regards,
Max

Thanks for opening this up here.

Yes, we are investigating where the bottleneck is… It’s a bit too early to conclude anything for sure. At this stage it looks like the things and channels of new bindings bring a lot of overhead.

I will update when I have more time to work on this and have something solid to share.

Best
Sami

There’s some discussion ongoing in the Eclipse SmartHome project on this; follow it for updates: https://github.com/eclipse/smarthome/issues/6416#issuecomment-433983134

There might be some performance issues with the openHAB core itself.

The new openHAB 2.4 snapshot should now include the framework improvements (see link above), please try it out and feel free to comment.

Hi,

I have done changes in two steps.

First I removed all unnecessary writes according to the previous suggestions. This resulted in removing the write parameters from 41 pollers out of 129. I also lengthened the refresh interval of some signals from 1000 ms to 10000 ms. I did not notice any improvement in response time.

I have now updated to the snapshot 2.4.0~20181109205612-1 (Build #1418).

I do not experience any improvement in response times compared with before the update. This is my perception only. How could I measure it more reliably so that you get better feedback?

Sorry that you are still seeing poor behavior. Thanks for trying out all this, I think this is a really valuable test case for improving the performance.

Before proceeding further, in addition to updating openHAB to the snapshot, I would update the modbus binding and transport to the latest snapshot. The latest version has a fix for the “queue” accumulating, which was discussed earlier in this thread. Ensure you have the latest version active (check the date) by running bundle:list | grep -i modbus in the Karaf console.

If you are still experiencing problems, please share the full configuration here (with latest changes).

Then I would like you to set the logging to verbose (check the documentation for details) and record the slow response event (both events.log and the general openhab.log are of interest).

I understand that the main problem is that after changing a Dimmer item from the sitemap, it takes a long time for something to change in the real world:

I find that the response time is now much longer: approximately 8-10 seconds from changing, for example, a dimmer in the sitemap until it happens in real life.

From the verbose logs we can see how long delay there is between “item receiving a command” and “binding writing to slave”.

From the logs we will also see whether the polling is following the expected cycle, or whether it is lagging behind due to the sheer number of different requests. The polling and writes might be competing for the communication channel with the slaves.

Best
Sami

@Pe_Man
I faced a similar problem. Mine was caused by misconfigured persistence.

I had done this:

Items {
    * : strategy = everyUpdate
}

I saw no performance problems before I started using the Modbus binding, but it is clear what was happening. The Modbus binding is my first binding that polls, so I got item updates for nearly 100 items every second, and every one of them was persisted. This made my OH respond very slowly.

I changed it to

Items {
    * : strategy = everyChange
}

and the problems are gone.

Maybe it is the same on your installation.

That’s well spotted!

Hi,

I too have recently moved from the Modbus 1 to the Modbus 2 binding and am reading and writing many addresses on an SH2WEB, the same as @Pe_Man. With the Modbus 1 binding the response was almost instant. With the Modbus 2 binding it takes about 10 seconds. Please let me know if there have been any developments to solve this.

Thanks

Hi

There has been progress, please have a look at my previous responses and report back.

I wasn’t sure where to find this but I found this documentation: https://www.openhab.org/docs/configuration/persistence.html#persistence
and found that I had no persistence activated. I followed the recommendations and added rrd4j as default persistence, added the file with the default recommended values:

Strategies {
        everyHour : "0 0 * * * ?"
        everyDay  : "0 0 0 * * ?"

        // if no strategy is specified for an Item entry below, the default list will be used
        default = everyChange
}

/*
 * Each line in this section defines for which Item(s) which strategy(ies) should be applied.
 * You can list single items, use "*" for all items or "groupitem*" for all members of a group
 * Item (excl. the group Item itself).
 */
Items {
        // persist the state of all Items on every change and restore them from the db at startup
        * : strategy = everyChange, restoreOnStartup

        // additionally, persist all temperature and weather values every hour
        Temperature*, Weather* : strategy = everyHour
}

But after doing this I did not see any improvement of the performance.

/per

The persistence issue was about IF you had persistence running, THEN Modbus frequent polls could cause frequent updates, triggering persistence to do its work, with an effect of adding load to the system and causing apparent poor performance (because there is more work being done)

If you didn’t have persistence, then adding it will not make anything work better.

It’s probably time to share some logs?

I see and understand that, but since I had no persistence added, and I understood from the documentation that it was important to have a default persistence service, I thought it would be good to mention this:

Default Persistence Service

It is important to select a default persistence service. You should do this even if you have only one persistence add-on installed.

To select a default persistence service, in paper UI, select Configuration and then System from the side menu. Scroll down to “Persistence”, and select your default service from the drop-down list. Note that you must first install a persistence add-on before you make this selection. Be sure to save your choice once you have selected your default service.