Experiencing occasional Random delay in rule execution

  • Platform information:
  • Openhab running as a VM.
  • Hardware: 9900k (1 core, 2 threads 4GB of ram i think)
  • OS: Ubuntu 18.04 server
  • Java Runtime Environment: which java platform is used and what version
  • openHAB version: 2.4
  • Issue of the topic:

I am experiencing a random delay in rule execution occasionally. Its not isolated to one rule and it involves both z-wave devices and other devices. It only happens rarely but doesnt seem to happen to any specific kind of rule.

I have already eliminated any Thread:sleeps over 500ms. One more note, like i said above, this isnt specific to rules that have delay in them.

I will give some examples:
Rule 1: I want any 1 switch in my bathroom to control all the lights in my bathroom. There are a total of three switches. So this rule says that if any of the switches are triggered (on or off) that the other 2 will follow. Sometimes this happens almost immediately and sometimes it takes 10 or 15 seconds.

Rule 2: Leaving scene. I activate this rule when i am leaving the house. It turns all the lights off, turns the tv off and opens the garage door.
It has been working fine for months but just yesterday i activated this rule and it worked as expected except for the garage didnt open. So i opened it manually and left. After I had gotten in the car, backed out, lowered the garage and driven about a mile, i got a notification that the garage just went up. It seems that the garage up command didnt go through for a few minutes!

All in all this isnt specific to a rule or a type of item. it just seems like Openhab is too busy to get around to the command at the moment causing a delay. I find that hard to believe as Openhab has 2 threads of a 9900k and 4bg of ram. probably very OP for a simple openhab install.

I will post the actual rules from the rules file when I get home but since this is more of a general issue, your thoughts are appreciated.

And still this could be the reason. Only 2 threads can easily be kept busy by any kind of busy-waiting programming. I wouldn’t bet my life on it there isn’t any. Don’t know by heart how to increase but there’s threads on the forum I believe you should be able to find.

Another potential reason is a bad zwave network.
Enable zwave debugging to see how loaded the queue is that should give you a first hint, then eventually heal single node or the full network.
ZWave fixes actually is a good reason to move to 2.5M1 or M3. Network heal was disabled in some previous version not sure off the top of my head if it was there in 2.4.

I am not sure what you mean by ‘don’t know how to increase’. I can easily give openhanded 2 more threads to see if it helps but I am not sure if I hav not change anything in openhab too.

I will try enabling z-wave debugging later on. Thanks!

Not being that familiar with how stable OpenHABs releases are, I have been afraid to upgrade to a non-released version. Would you recommend me upgrading?

Yes I meant to provide more threads but how can you simply “give more” if you don’t even know how to configure that in OH ?
You don’t mean you do any threading in your programming, do you ?

yes

I just meant to assign 2 more threads to the VM in my hypervisor.

I am not sure how to configure any type of CPU threading in OpenHAB.

That wouldn’t change a thing.

Oh okay. I am obviously unfamiliar with what you are referring to then. Can you explain?

The JavaVM (I assume you don’t mean that by “VM”, do you?) is multithreaded, so openHAB runs multithreaded. There’s parameters to change in $USER_DATA/etc/* how many threads to use/have ready for use.

No, I mean virtual machine as seen by Proxmox, my hypervisor.

Okay thank you. I will look tonight to see if I can change the settings in that file. Is there an explanation on how to do this somewhere?

That’s a whole bunch of files. I think the parameters are not documented. There were discussions on the forum I don’t find right now.
But you would need to have a pretty unusual set of rules if it didn’t work with the default settings so I don’t think it’s the reason.
I’d look into the zwave network first. Add some debug outputs to better track what rules run when.
Another thing comes to mind: do you edit rules or items in place ? You shouldn’t. Processing of changes can queue up. You would see java use >= 100% CPU.

Hi @TheJM,

Out of interest what Hypervisor are you using? You describe your VM as 1 core and 2 threads, this is very unfamiliar to me. This could just be the terminology your using. I’m not aware of any Hypervisor that allow you specify the number of threads your VM will be able to use. Besides just running the OS is likely to take 25/30 threads especially while booting.

I would suggest a configuration of 4 vCPU’s spread across two physical cores and a 1Gb of RAM per core used. Having too much RAM is just as bad as not having enough.

Cheers,

Garry

I apologize as my terminology probably wasnt correct (I am new to the hypervisor world). I am using Proxmox VE. I have a 9900k with 8 cores, 16 threads. The VMs see that as 16 vCPUs so in my case 1vCPU=1thread.

I will do that.

I believe you mentioned that 2.5 has some more z-wave features than 2.4 did. Would you recommend me upgrading and then troubleshooting? Also, are there any best practices to do before upgrading (besides making a backup)? Just dont want to lose the majority of my configs.

Thank you for your help.

I’d upgrade first. Backup your config (openhab-cli backup ...)

A single vCPU is more than enough. 4 GB is reasonable for x86-64bit. That usually takes 4 times the amount of ARM/32bit, and 1GB on a RasPi is the minimum it takes OH to run well.

Your use of the term “Threads” is unusual and overlaps with other meanings for the term. In an OH context, there are a number of thread pools that OH pulls from to run just about everything. By default, the thread pool for Timers is of size 2 and Rules is of size 5. If you have more Rules than there are threads available in the pool trying to run at the same time, the later ones will have to wait until a thread frees up.

These have nothing to do with your VM “threads”.

See Why have my Rules stopped running? Why Thread::sleep is a bad idea for a full explanation.

But I tend to agree that your problem may not be related to this assuming you have no other long running or blocking commands like executeCommandLine or sendHttp*Request.

Depending on the Hypervisor and physical hardware being used as the host, your not likely to be able to address more than a Gb of RAM at a time. In addition memory addressed by Hypervisors is divided across the CPU bus, so you need to make sure that you have a minimum of two vCPU’s and they are spread even across physical sockets. But were now heading way off topic.

So I upgraded to 2.5 and enabled debugging and found an issue with a couple of my rules doing this not im not sure what im looking for.

I actually just had another incident where a rule didn’t work properly. I have a daytime scene and it turns on and off several lights. It works almost every time but this time, one of the lights didn’t turn on. After checking the logs, it looks normal. The Kitchen Pendant lights are the one that didn’t actually turn on. It looks normal.

Like i said above, i don’t see anything weird in the logs. What should i be looking for?

2019-09-15 10:03:32.373 [ome.event.ItemCommandEvent] - Item ‘DummyDaytimeScene’ received command ON

2019-09-15 10:03:32.375 [vent.ItemStateChangedEvent] - DummyDaytimeScene changed from OFF to ON

2019-09-15 10:03:32.378 [ome.event.ItemCommandEvent] - Item ‘ZWaveNode006GarageEntryHallDimmerSwitchSwitchDimmer’ received command 100

2019-09-15 10:03:32.381 [ome.event.ItemCommandEvent] - Item ‘ZWave007HallwayRecessedLightsDimmerSwitchSwitchDimmer’ received command 100

2019-09-15 10:03:32.382 [nt.ItemStatePredictedEvent] - ZWaveNode006GarageEntryHallDimmerSwitchSwitchDimmer predicted to become 100

2019-09-15 10:03:32.384 [ome.event.ItemCommandEvent] - Item ‘ZWave008GreatRoomSwitchLampDimmerSwitchSwitchDimmer’ received command 100

2019-09-15 10:03:32.390 [ome.event.ItemCommandEvent] - Item ‘ZWave010KitchenPendantLightsDimmerSwitchSwitchDimmer’ received command 50

2019-09-15 10:03:32.391 [ome.event.ItemCommandEvent] - Item ‘ZWave009KitchenRecessedLightsDimmerSwitchSwitchDimmer’ received command 100

2019-09-15 10:03:32.394 [ome.event.ItemCommandEvent] - Item ‘ZWave020GreatRoomLightDimmerSwitchSwitchDimmer’ received command 100

2019-09-15 10:03:32.396 [ome.event.ItemCommandEvent] - Item ‘ZWave019MasterBedroomFoyerLightDimmerSwitchSwitchDimmer’ received command OFF

2019-09-15 10:03:32.398 [ome.event.ItemCommandEvent] - Item ‘ZWaveNode006GarageEntryHallDimmerSwitchSwitchDimmer’ received command OFF

2019-09-15 10:03:32.399 [ome.event.ItemCommandEvent] - Item ‘DummyDaytimeScene’ received command OFF

2019-09-15 10:03:32.399 [nt.ItemStatePredictedEvent] - ZWave007HallwayRecessedLightsDimmerSwitchSwitchDimmer predicted to become 100

2019-09-15 10:03:32.405 [nt.ItemStatePredictedEvent] - ZWave008GreatRoomSwitchLampDimmerSwitchSwitchDimmer predicted to become 100

2019-09-15 10:03:32.408 [vent.ItemStateChangedEvent] - ZWaveNode006GarageEntryHallDimmerSwitchSwitchDimmer changed from 0 to 100

2019-09-15 10:03:32.408 [nt.ItemStatePredictedEvent] - ZWave010KitchenPendantLightsDimmerSwitchSwitchDimmer predicted to become 50

2019-09-15 10:03:32.410 [nt.ItemStatePredictedEvent] - ZWave009KitchenRecessedLightsDimmerSwitchSwitchDimmer predicted to become 100

2019-09-15 10:03:32.411 [nt.ItemStatePredictedEvent] - ZWave020GreatRoomLightDimmerSwitchSwitchDimmer predicted to become 100

2019-09-15 10:03:32.413 [nt.ItemStatePredictedEvent] - ZWave019MasterBedroomFoyerLightDimmerSwitchSwitchDimmer predicted to become OFF

2019-09-15 10:03:32.415 [nt.ItemStatePredictedEvent] - ZWaveNode006GarageEntryHallDimmerSwitchSwitchDimmer predicted to become OFF

You’re banging out a lot of commands in a few milliseconds.

Isolating node 6 from your events

2019-09-15 10:03:32.378 [ome.event.ItemCommandEvent] - Item ‘ZWaveNode006GarageEntryHallDimmerSwitchSwitchDimmer’ received command 100
2019-09-15 10:03:32.382 [nt.ItemStatePredictedEvent] - ZWaveNode006GarageEntryHallDimmerSwitchSwitchDimmer predicted to become 100
2019-09-15 10:03:32.398 [ome.event.ItemCommandEvent] - Item ‘ZWaveNode006GarageEntryHallDimmerSwitchSwitchDimmer’ received command OFF
2019-09-15 10:03:32.408 [vent.ItemStateChangedEvent] - ZWaveNode006GarageEntryHallDimmerSwitchSwitchDimmer changed from 0 to 100
2019-09-15 10:03:32.415 [nt.ItemStatePredictedEvent] - ZWaveNode006GarageEntryHallDimmerSwitchSwitchDimmer predicted to become OFF

you send it two commands within 20 milliseconds. I wouldn’t like to guess which one gets to your device first. It’s not necessarily the same order as they get processed in openHAB.

ZWave packet loss. You would need to enable debugging for the zwave binding.