[2.5.0M1+Z-Wave Snapshot] Z-Wave power consumption not updating, but manual polling works instantly

leif · June 26, 2019, 4:05pm

I’m running openHAB 2.5.0M1 on raspbian. I think I’m using a Z-Wave 2.5.0 snapshot from June 18th, but ever since upgrading to 2.5.0M1 i can’t see ANYTHING z-wave related in karaf - bundle - list. I do see it in paper UI - bindings (which is strange, right?) but the very device that I updated to the snapshot binding for (0A01:5102) IS updating just fine.

So, for a few days I’ve been having problems with the NEO Coolcam NAS-WR01ZE plugs (0003:1087) monitoring my dehumidifiers not updating the power usage figure.
I finally realized why i’ve been feeling stuffy all evening – both dehumidifiers were full, my script had not warned me, and humidity was at 90 percent. Turns out that in a tropical climate, air conditioning alone is not enough when you’ve built an insanely well-insulated house! Who’da thunk it!

So anyway.

I have many other identical plugs, and one of them IS updating.

These are all in the same room as the controller.

Here is my z-wave map:

Node 10 is another 0003:1087 device and it’s reporting its power fine.

Nodes 27 and 28 are the dehumidifiers. They’re not updating at all right now, but if I click “refresh items” on the thing in HABmin, it updates instantly!

Would appreciate any help… and it would also be nice to have a way to automate the manual refresh from a script – even if it’s just even 5 minutes that would be way better than it having the wrong state for hours and hours. I can live with my dehumidifiers being off for a few minutes (in a worst case).

Obviously automatic updating should work too. I tailed and grepped the Z-Wave log while forcing power usage to change, before and after clicking refresh items, and NO automatic updates occur… however refresh items works instantly.

Here’s what the log looks like when I manually click refresh items:

2019-06-26 22:17:57.389 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 28: Application Command Request (ALIVE:DONE)
2019-06-26 22:17:57.390 [DEBUG] [ng.zwave.internal.protocol.ZWaveNode] - NODE 28: resetResendCount initComplete=true isDead=false
2019-06-26 22:17:57.391 [DEBUG] [ng.zwave.internal.protocol.ZWaveNode] - NODE 28: Incoming command class COMMAND_CLASS_METER, endpoint 0
2019-06-26 22:17:57.392 [DEBUG] [ng.zwave.internal.protocol.ZWaveNode] - NODE 28: SECURITY not supported
2019-06-26 22:17:57.394 [DEBUG] [tocol.commandclass.ZWaveCommandClass] - NODE 28: Received COMMAND_CLASS_METER V3 METER_REPORT
2019-06-26 22:17:57.395 [DEBUG] [.commandclass.ZWaveMeterCommandClass] - NODE 28: Meter: Type=Electric(1), Scale=W(2), Value=0E+1
2019-06-26 22:17:57.396 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 28: Got an event from Z-Wave network: ZWaveMeterValueEvent
2019-06-26 22:17:57.397 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 28: Got a value event from Z-Wave network, endpoint=0, command class=COMMAND_CLASS_METER, value=0E+1
2019-06-26 22:17:57.399 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 28: Updating channel state zwave:device:109534bc:node28:meter_watts to 0 [DecimalType]
2019-06-26 22:17:57.401 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 28: Commands processed 1.
2019-06-26 22:17:57.402 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 28: Checking command org.openhab.binding.zwave.internal.protocol.ZWaveCommandClassPayload@1044769.
2019-06-26 22:17:57.407 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 28: Command verified org.openhab.binding.zwave.internal.protocol.ZWaveCommandClassPayload@1044769.
2019-06-26 22:17:57.409 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 28: notifyTransactionResponse TID:2752 DONE
2019-06-26 22:17:57.415 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 28: Got an event from Z-Wave network: ZWaveTransactionCompletedEvent

And here’s what it looked like back when it happened automatically. I had to go back several log files find where it last happened, but it was really just yesterday:

2019-06-25 19:23:06.201 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 28: Application Command Request (ALIVE:DYNAMIC_VALUES)
2019-06-25 19:23:06.201 [DEBUG] [ng.zwave.internal.protocol.ZWaveNode] - NODE 28: Incoming command class COMMAND_CLASS_METER, endpoint 0
2019-06-25 19:23:06.202 [DEBUG] [ng.zwave.internal.protocol.ZWaveNode] - NODE 28: SECURITY not supported
2019-06-25 19:23:06.202 [DEBUG] [tocol.commandclass.ZWaveCommandClass] - NODE 28: Received COMMAND_CLASS_METER V3 METER_REPORT
2019-06-25 19:23:06.203 [DEBUG] [.commandclass.ZWaveMeterCommandClass] - NODE 28: Meter: Type=Electric(1), Scale=V(4), Value=220.1
2019-06-25 19:23:06.204 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 28: Got an event from Z-Wave network: ZWaveMeterValueEvent
2019-06-25 19:23:06.204 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 28: Got a value event from Z-Wave network, endpoint=0, command class=COMMAND_CLASS_METER, value=220.1
2019-06-25 19:23:06.205 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 28: Updating channel state zwave:device:109534bc:node28:meter_voltage to 220.1 [DecimalType]
2019-06-25 19:23:06.206 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 28: Commands processed 1.
2019-06-25 19:23:06.206 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 28: Checking command org.openhab.binding.zwave.internal.protocol.ZWaveCommandClassPayload@a66451.
2019-06-25 19:23:06.207 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 28: Command verified org.openhab.binding.zwave.internal.protocol.ZWaveCommandClassPayload@a66451.
2019-06-25 19:23:06.207 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 28: notifyTransactionResponse TID:71 DONE
2019-06-25 19:23:06.208 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 28: Got an event from Z-Wave network: ZWaveTransactionCompletedEvent

And one for NODE 27 too:

2019-06-25 19:23:03.728 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 27: Application Command Request (ALIVE:DYNAMIC_VALUES)
2019-06-25 19:23:03.729 [DEBUG] [ng.zwave.internal.protocol.ZWaveNode] - NODE 27: Incoming command class COMMAND_CLASS_METER, endpoint 0
2019-06-25 19:23:03.729 [DEBUG] [ng.zwave.internal.protocol.ZWaveNode] - NODE 27: SECURITY not supported
2019-06-25 19:23:03.730 [DEBUG] [tocol.commandclass.ZWaveCommandClass] - NODE 27: Received COMMAND_CLASS_METER V3 METER_REPORT
2019-06-25 19:23:03.731 [DEBUG] [.commandclass.ZWaveMeterCommandClass] - NODE 27: Meter: Type=Electric(1), Scale=V(4), Value=224.48
2019-06-25 19:23:03.732 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 27: Got an event from Z-Wave network: ZWaveMeterValueEvent
2019-06-25 19:23:03.733 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 27: Got a value event from Z-Wave network, endpoint=0, command class=COMMAND_CLASS_METER, value=224.48
2019-06-25 19:23:03.734 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 27: Updating channel state zwave:device:109534bc:node27:meter_voltage to 224.48 [DecimalType]
2019-06-25 19:23:03.735 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 27: Commands processed 1.
2019-06-25 19:23:03.736 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 27: Checking command org.openhab.binding.zwave.internal.protocol.ZWaveCommandClassPayload@6a922b.
2019-06-25 19:23:03.737 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 27: Command verified org.openhab.binding.zwave.internal.protocol.ZWaveCommandClassPayload@6a922b.
2019-06-25 19:23:03.737 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 27: notifyTransactionResponse TID:64 DONE
2019-06-25 19:23:03.739 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 27: Got an event from Z-Wave network: ZWaveTransactionCompletedEvent

Here’s the entirety of my ZWave log as a 7zip file (5MB only!)

And here are my ZWave XML files.

Not sure what else might be useful for troubleshooting.

Help, please?

By the way, it might make sense for the “node ID” to be more prominently displayed in HABmin. I couldn’t find it for the longest time, and even now that I know where it is (in Device Configuration), it’s quite well hidden. I’ve taken to naming my things after the node number as a workaround, like so:

rossko57 · June 26, 2019, 4:19pm

If you see no periodic updates from your zwave devices, then you need to look for device setting.

You’ll probably be interested in openHABs REFRESH command as a stopgap. It’s not a panacea and you should look at why your devices are not reporting.

leif · June 26, 2019, 4:24pm

Thank you. I have not changed the configuration of the devices since it was working. They are and have always been configured to report, like so:

Thanks for the tip about Refresh, I will try that right now!
edit: It works! Thank you!

events.sendCommand(ir.getItem("Dehumidifier1_Power"),"REFRESH")
events.sendCommand(ir.getItem("Dehumidifier2_Power"),"REFRESH")

rossko57 · June 26, 2019, 4:30pm

Well you told us your expected updates are not in the zwave log?

leif · June 26, 2019, 4:31pm

I did not, I did actually include two log entries from when it did still work automatically, but they’re from yesterday, several log files back. Clarified by edit.

leif · June 26, 2019, 4:50pm

Future visitor: here is my python script for forced refresh, designed to go slow in order not to overload the z-wave network:

from threading import Timer

from core.log import logging, LOG_PREFIX
log = logging.getLogger(LOG_PREFIX + ".forced_refresh")

def logprint(text):
	log.debug(text)

class Refresh(object):
	def __init__(self, itemid):
		self.itemid=itemid

#		logprint(self.itemid)

	def run(self):
		try:
			events.sendCommand(ir.getItem(self.itemid),"REFRESH")
		except:
			logprint("couldn't refresh " + self.itemid)
		else:
			pass
			#logprint("refreshed " + self.itemid)
		finally:
			pass



#add more items here
refresh_list = [
	Refresh("Dehumidifier1_Power"),
	Refresh("Dehumidifier2_Power"),
	 ]



class RefreshController(object):
	def __init__(self,refreshlist):
		self.refreshlist=refreshlist
		self.hTimer=0
		self.index=0

	def run(self):

		if self.index < len(self.refreshlist):
			#logprint("running. let's refresh " + self.refreshlist[self.index].itemid )
			refresh_list[self.index].run()


		#We'll refresh one item at a time, every 5 seconds, and then start over after 5 minutes.
		#If the list is more than 60 items long we will refresh less often.

		self.index+=1
		self.index %= max(60,len(refresh_list))	#always count 60 items even if the list is shorter

		self.hTimer=Timer(5, self.run)	#one every 5 seconds
		self.hTimer.start()
		pass

	def stop(self):
		if self.hTimer:
			self.hTimer.cancel()
		logprint("script unloaded.")


refreshctrl=RefreshController(refresh_list)

refreshctrl.run()

def scriptUnloaded():
	refreshctrl.stop()

This is a band-aid which will let me keep my sanity.
I would still obviously very much have updates working.

sihui · June 26, 2019, 5:04pm

That looks weird. This is how it looks for me since a couple of years:

leif · June 26, 2019, 5:23pm

Is that part of the manually entered item title? Same workaround as I did, only backwards?

sihui · June 26, 2019, 5:36pm

Nope.
I remember that sometimes I had to delete the Thing and readd it to get the node number displayed automatically

rossko57 · June 26, 2019, 7:04pm

Wow … but I suppose that is a consequence of the debug level logging (which proved useful in this case)

I note there are version issues with this product and parameters changing.

As there have been relevant device database changes in recent months … if you had these devices configured before you updated binding to snapshot, I think you made need to delete and rediscover to pick up changes??

expire binding is a useful tool for error detection and handling of periodic reporting devices.

Number myItem "flux sensor" (testGroup) {channel="blah", expire="60m"}

Set the timing element to say, 1.5x or 2.5 your expected reporting period.
When not updated, sets default state UNDEF which allows you to tailor your rules not to operate on old stale data.

Of course you can error detect

rule "wahh problems"
when
   Item myItem changed to UNDEF
then
   logInfo("fault", "sensor failure " + triggeringItem.name)
end

and extend to a retry

rule "wahh problems"
when
   Member of testGroup changed to UNDEF
then
   logInfo("fault", "sensor failure " + triggeringItem.name)
   triggeringItem.sendCommand(RefreshType.REFRESH)  // a workaround until next OH release
end

There is a limitation, it does not detect if the sensor is broken at load time when no update starts the expire timer

rule "wahh problems"
when
   System started or
   Item myItem changed to UNDEF
then
   if (myItem.state == NULL || myItem.state == UNDEF) {
      logInfo("fault", "sensor failure " + myItem.name)
      myItem.sendCommand(RefreshType.REFRESH)  // a workaround until next OH release
   }
end

5iver · June 26, 2019, 11:30pm

@rule("Refresh badly configured devices")
@when("Time cron 5 0/5 * * * ?")# I always add an offset for cron jobs
def refresh_devices(event):
    events.sendCommand("Dehumidifier1_Power", "REFRESH")
    #sleep(5)# if you really feel a need for it, but zwave does very well managing it's own queue and can handle multiple commands
    events.sendCommand("Dehumidifier2_Power", "REFRESH")

leif · June 27, 2019, 12:33am

LMAO, Scott . Okay I see what you mean. Holy crap. See, this is why I’m here. Thank you!

My one comment is that z-wave queing may not always be as good at managing itself as we may want. In particular, when the network is busy, it can take significant time (on the order of a couple of seconds) for a command to get through. That’s what I was trying to avoid by spreading things out.

also, sleep(5) violates every systems design principle I know. I can’t do it, if it’s not ready it needs to reschedule, not lollygag.

But still, holy crap. I will rewrite my script.

leif · June 27, 2019, 1:51am

Okay, “re-add your devices” seemed to make sense, so I tried deleting 27 and 28, and re-added them. I even restarted openHAB afterwards.

It seemed to work, the power usage even updated a couple of times during initialization.

But once it was all up and running, I turned the dehumidifiers off manuallyy… and the figures again did not update! Stayed at 300 and 230 watts, respectively.

Here’s my Z-Wave log. I restarted openHAB (after already having disabled my refresh script and re-added the devices) at 07:49:28.

Plenty of other Z-Wave devices are updating, but not those two. Problem remains.

I started looking through the log files:

2019-06-27 07:50:12.615 [DEBUG] [alization.ZWaveNodeInitStageAdvancer] - NODE 27: Node advancer: Restored from file - skipping static initialisation

That was another a-ha moment – maybe I need to stop openHAB, delete /var/lib/openhab2/tmp and cache, and start it again. So I did that. Took its sweet time, and then it came back up – with the same problem.

Other devices are updating. Not those two.

Please note that these devices worked flawlessly for months with the Fibaro system I was using before. One of the few things that did, but they did work. So, it would be a major coincidence if it was the devices themselves having a problem.

I finally figured out how to grep -i so now I was able to find the z-wave binding by grep -i wave.

192 │ Active │ 80 │ 2.5.0.201906161432 │ openHAB Add-ons :: Bundles :: ZWave Binding

So it’s definitely running the snapshot binding.

Stuck again. More help, please?

Also, the Node ID is missing from HABmin! The only place it is shown is in the label, which is a freeform user text field. This seems like an oversight to me, obviously if nowhere else it should still be shown under “attributes” shouldn’t it? Attributes shows all the neighbours, and even the device ID, but not the node ID? Paper UI shows it in the form of zwave:device:109534bc:node38 but not HABmin. Some devices show it under “device configuration” but not others.

5iver · June 27, 2019, 4:36am

Come to think of it, you may not need a rule at all. The zwave binding should be able to do this for you. Just go into the device settings in Habmin and set the polling interval for the device to 5 minutes. There’s still an issue if the devices are not reporting, but this may be the simplest workaround. Normally, I set this to 1 day for all of my devices, since they all report data on their own and this is really unneeded traffic. I think default is 1h.

In looking at your log, your meters are set to report every 3s. You may want to rethink that, as it makes for a very busy network. Also, consider not linking Items to all of the Channels… if you don’t use the data, there’s no need for it on the network or to be processed by the binding and OH. Do you know about the log viewer? It looks like node 29 is reporting on it’s own, but less frequently than the others. I don’t know this device, but what do you have the configuration parameters set to? https://www.cd-jackson.com/index.php/zwave/zwave-device-database/zwave-device-list/devicesummary/397

It can be longer (15s IIRC), due to retries. But delays are a sign that there is an issue with communications in the network. Normally, commands should be sent/received in 10s of ms.

It’s displayed under Configuration… but you need to go to Tools and enable ‘Show advanced settings’.

list -s may help

leif · June 27, 2019, 5:00am

Thanks for taking the time to look through the logs, Scott!

I didn’t think of that either. I’ll try it!
Edit: The shortest length you can set is 10 minutes… that’s a bit long for the purpose of this device. I’d rather know sooner that the dehumidifier tank is full so I can empty it and let it keep working.

Really? I don’t see that, and I certainly wouldn’t have intended to! Which node ID were you seeing reporting that often? The default NEO Coolcam “meter report interval” is 300. I assumed that’s seconds, not centiseconds, as nothing happens at that time scale on a Z-Wave network with only 100kbps to share.

Node 29:

It’s a TZ79 in the consumer board in my workshop, driving a split-unit air conditioner set to 30C (Dry Mode). In this mode the compressor will come on periodically to get the humidity out, regardless of what the temperature is. Normally it idles with fan-only. I think that’s what we’re seeing, it looks to be working correctly.

Node 9 measures the power consumption of my NAS. It’s probably reporting often because of power usage fluctuations, I don’t think it’s timing. I’ve been meaning to widen those thresholds, I guess I’ll do it now.
Node 10 measures the power consumption of my workstation PC. See above.
Node 11 measures the power consumption and switches power to my workstation (6 monitors, a HT audio amplifier, a subwoofer, and two lamps). As above, tolerances too narrow.

Indeed. This has gotten much better since I have given up on and excluded problematic devices. openHAB makes it easy to get to the bottom of these kinds of problems compared to what I was used to from Fibaro.

ARGH!! Okay, I’ll try not to kill the messenger, but I will argue that there is nothing advanced about the node ID in a z-wave network… it’s just about the most fundamental thing there is. Does anyone else agree that perhaps the Node ID should be displayed more prominently, not in a pitch black cellar with no stairs leading to it in the bottom of a locked filing cabinet in a disused lavatory with a sign on the door saying “beware of the leopard”?

5iver · June 27, 2019, 5:18am

My understanding is that Habmin will be retired in OH3 and the functionality will be included in a unified administrative UI. Because of this and other factors, it is very unlikely a developer will put time into Habmin.

I don’t recall… put you’re log in the viewer and you can’t miss them.

leif · June 27, 2019, 5:23am

I done nothing but looking at the log. I can’t see anything happening at that interval, that’s why I asked. It was probably the power fluctuations then.

Fair enough. Looking forward to OH3 then.

5iver · June 27, 2019, 5:23am

That sounds like a device config param. I was meaning the binding configuration… it periodically polls device states for devices that do not self report and also after a startup. Look again… I think this is another advanced setting.

sihui · June 27, 2019, 4:34pm

I can assure you that I did not manually edit anything regarding my screenshot
All my Things do show the node id in HABmin …

leif · June 28, 2019, 2:56am

Wasn’t implying that you edited it Sihui! Of course not. Your things indeed show the ID as part of the label, they were created that way, and you never changed the names from the default. But, with multiple unlabeled devices, how do you keep track of which one is in which room?

When I included my Z-Wave devices as things in PaperUI, I labeled them one by one as I added them, and I saw that the subheading under each device was something like zwave:device:109534bc:node6. So, I change the name to “Living room ceiling lights”. In PaperUI this showed as:

Living room ceiling lights
zwave:device:109534bc:node12

After adding 30 odd devices, I had a couple of problems, and then I went to HABmin… where the devices were presented with the names I had entered ONLY… no node ID.

Starting in paperUI, there was no indication that I needed to keep the node ID in the name! I thought it was superfluous.

I went back into PaperUI and added the node ID back to each device manually. Sometimes the operation wasn’t saved and then I had to do it again… maybe about a third of the time. Stuff happens.

So, I’m fine now. My things are labeled and my network is mostly under control. I’m only posting about it here because the way things are laid out, there’s no chance I’ll be the last one to make this mistake, and so it is something that can be improved. This is one of many crucial differences between an ethusiast-driven project like openHAB and the many commercial products… most commercial products don’t make improvenents unless it improves the bottom line… whereas we (in a project like openHAB) make improvents just because.

That said, for all the other discussion and workarounds, I still have no idea what to try next to troubleshoot the power reporting, so that I don’t have to rely on polling. Any ideas? Again, this worked fine in openHAB previously. Not sure if it stopped exactly when updating to 2.5.0M1 but that may have been it.