Debugging Homematic IP: no updates on some HmIP devices after some time

Hello community,

openHAB 2.5.10 on a RPi4, via LAN
RaspberryMatic 3.53.30 on a RPi3, via LAN

Many of my Homematic IP things in OH stop receiving data after some time (typically after some weeks or month, sometimes after two weeks).
This does happen only for single devices, not all together.
Actually a weather station HmIP-SWO-PR, and a HmIP-eTRV-2 (thermostat valve drive) are affected.

The HmIP devices properly receive demand values from OH - at least the thermostat valve; I wonder if setting a demand wind speed would make sense :wink:

Within Homematic WebUI I see that the devices work properly.
I saw in an earlier topic, that specifying a callback adress for the binding may help, if OH is connected via LAN and WIFI at the same time (two IPs). Mine is only connected via LAN, but I added the OH address as callbackHost.

Because it is usually working for quite some time I doubt that there is any network or firewall issue.
Non-IP devices work fine all the time.

I know that this no serious way of debugging, but I tried to dis- and enable the thing in PaperUI, which did not help to “reconnect” to the device.
Restarting OH did help. I did not touch anything in Homematic meanwhile.

As far as I know it is OH that polls data from HM.
Is there a way to check, if this polling for a particular thing (or item) actually happens in openHAB, or if this request was answered by Homematic?

And further, how can I check in HM, if the polling/request arrives, and if HM is answering the poll? (yes, I know, this question should be asked in another forum :flushed: )

Thanks,
Phil

Hi Phil

Found your post because I have similar problems. Some HmIP devices stop updating, but in my case sometimes even after 1-2 hours after a OH3 restart. A OH3 restart helps.

Did you make any progress on this? Didn’t know about the callback, need to read up on that.

Thanks

@pgruetter Can you check your openhab.log whether there are information about a lost connection to the CCU?

There is a problem that after a disconnect and re-connect updates from HmIP devices are no longer displayed.

As far as I know the callback is only necessary if you have more than one IP address.

As far as I know it is OH that polls data from HM.

This is not quite correct. Some values are retrieved by polling (e.g. the duty cycle or content of system variables) but device status information are sent from the Homematic gateway to the binding. Therefore the binding subscribes to events from HM gateway with the callback IP address.

Can’t see anything, no. Just grep’d through all logs for “disconnect” and “homematic” but can’t see anything out of the ordinary. Since the last restart, I didn’t have any problems. Will check the logs next time this happens, although I have to say it only happened a couple of times so far.

I could also not find any trace of lost connections. I have issues only for single devices; some HmIP devices are working without any problems.
I am actually thinking of moving over to MQTT, by means of RedMatic, that forwards events via this protocol. Yet I need to get more familiar with NodeRed.
BR
Phil

You could also try to ssh into the CCU and check the log files in /var/log. Maybe there are some hints of what may have happened.

Another option that maybe could help would be an automatic disconnect/reconnect to the CCU every n hours.

I’m facing the same issue. After some time, the homematic related items are not updated anymore. Sometimes its working normal for just a few hours, sometimes its working normally for weeks/month.

When this behavior occurs, the values in the Homematic UI are updated normally, so I would assume the problem is not related to Homematic itself.

Im running on OH3 + debmatic, but I recognized similar issues with OH2.
Debug logging is enabled and I can see many parsing errors, but I guess they are not related to the issue, as those parsing issues just occur after a binding restart.
Today at around 8:10 I recognized the items were not updated anymore. Since some temperature items are persisted, I can see the time approximately.
Neither in the openhab nor in the homematic log there are strange messages at this time. (Log attached)

Just now I added the callback ip to the Thing configuration, maybe this helps.

Any hint is appreciated.

openhab.log (577.4 KB) homematic.log (185.3 KB)

Happened again today, and this time I actually found something in the logs:

2021-01-30 11:19:12.349 [WARN ] [ommunicator.AbstractHomematicGateway] - Connection lost on gateway ‘KEQ0384588’, cause: “IOException java.util.concurrent.TimeoutException: Total timeout 15000 ms elapsed”
2021-01-30 11:20:07.378 [INFO ] [ommunicator.AbstractHomematicGateway] - Connection resumed on gateway ‘KEQ0384588’

This is right around the time the sensors didn’t get any updates anymore, you can see the flatline at shortly after 11:00:
image

You wrote there’s a known problem after a re-connect. Do you have any more detailed info about that?
Is there a workaround to detect this reconnects and restart the binding? The problem is solved if I disable the homematic binding in the karaf console and activate it again.

Guess a higher timeout would help already, but that’s not a definitive solution.

Thanks a lot

I have enhanced the automatic reconnect. You can get a test version here GitHub - MHerbst/openhab-addons-test You can use it with OH version 3.0 and above.

Thanks a lot! I’ll give that a try :+1:

Perfect, your test version works correctly for me. I disconnected the CCU2 until I got the message in the logs, then reconnected it. Now after 12 hours I checked all HmIP sensors that failed in the past, and the all send updates.

Can you create a PR to have this included in the next official release?
Thanks a lot for your work!

Good to hear, that it helped in your environment.

Already done. Is included in the 3.1 snapshots.

1 Like

Hey Martin,

I have the same issues with some HMIP sensors and tried to install the updated binding w/o success so far.

My environment:

OH 3.0.1 on Raspberry 4 and a CCU2

Steps I did to enable:

Uninstalled the HM 3.0.1 binding
systemctl openhab stop
openhab-cli clean-cache
cd /usr/share/openhab/add-ons
wget https://github.com/MHerbst/openhab-addons-test/blob/master/org.openhab.binding.homematic-3.1.0-SNAPSHOT.jar

Everything looks good so far:


I also installed the additional packages:

After that I started OH again and checked for the HM binding:

Somehow it does not show in the list - any ideas what could be the reason?

Thanks!

Bindings that are only stored in the addons folder are not visible in the list.

You can check whether it is working in the Karaf Console. Run bundle:list | grep Homematic to get the binding’s state. If it shows “active” then everything is fine. If the state is “installed”, you can try to start it with bundle:start <number>. You will see an error message if it can’t be started.

BTW: the test version of the binding requires a snapshot version. It won’t run on 3.0.1.

1 Like

Hello,

same issue here with Homematic IP Device HmIP-eTRV-C-2 and Openhab 3.2.0 (build: 2473) and Homematic Binding 3.2.0-SNAPSHOT.

In the openhab log i see no errors, but success changed state from CCU. Current round about 24° Degrees. But, Openhab shows always 22.1° Degrees since more than 2 Days. I will not do a restart now. Possible we can debug the current state
 ? It is not a duty cycle issue. DC at 2-3%

2021-10-11 16:15:21.064 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' changed from 25 °C to 25.5 °C
2021-10-11 16:15:52.410 [INFO ] [openhab.event.ItemCommandEvent      ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' received command 26
2021-10-11 16:15:52.411 [INFO ] [penhab.event.ItemStatePredictedEvent] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' predicted to become 26
2021-10-11 16:15:52.419 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' changed from 25.5 °C to 26 °C
2021-10-11 16:15:54.039 [INFO ] [openhab.event.ItemCommandEvent      ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' received command 25.5
2021-10-11 16:15:54.042 [INFO ] [penhab.event.ItemStatePredictedEvent] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' predicted to become 25.5
2021-10-11 16:15:54.051 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' changed from 26 °C to 25.5 °C
2021-10-11 16:15:55.332 [INFO ] [openhab.event.ItemCommandEvent      ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' received command 25
2021-10-11 16:15:55.334 [INFO ] [penhab.event.ItemStatePredictedEvent] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' predicted to become 25
2021-10-11 16:15:55.342 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' changed from 25.5 °C to 25 °C
2021-10-11 16:15:59.205 [INFO ] [openhab.event.ItemCommandEvent      ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' received command 24.5
2021-10-11 16:15:59.207 [INFO ] [penhab.event.ItemStatePredictedEvent] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' predicted to become 24.5
2021-10-11 16:15:59.215 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' changed from 25 °C to 24.5 °C
2021-10-11 16:16:00.494 [INFO ] [openhab.event.ItemCommandEvent      ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' received command 24
2021-10-11 16:16:00.548 [INFO ] [penhab.event.ItemStatePredictedEvent] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' predicted to become 24
2021-10-11 16:16:00.551 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' changed from 24.5 °C to 24 °C
2021-10-12 11:30:31.778 [INFO ] [openhab.event.ItemCommandEvent      ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' received command 24.5
2021-10-12 11:30:31.782 [INFO ] [penhab.event.ItemStatePredictedEvent] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' predicted to become 24.5
2021-10-12 11:30:31.790 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' changed from 24 °C to 24.5 °C
2021-10-12 11:30:33.246 [INFO ] [openhab.event.ItemCommandEvent      ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' received command 25
2021-10-12 11:30:33.248 [INFO ] [penhab.event.ItemStatePredictedEvent] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' predicted to become 25
2021-10-12 11:30:33.255 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' changed from 24.5 °C to 25 °C
2021-10-12 15:55:19.570 [INFO ] [openhab.event.ItemCommandEvent      ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' received command 24.5
2021-10-12 15:55:19.574 [INFO ] [penhab.event.ItemStatePredictedEvent] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' predicted to become 24.5
2021-10-12 15:55:19.583 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' changed from 25 °C to 24.5 °C
2021-10-12 15:55:20.490 [INFO ] [openhab.event.ItemCommandEvent      ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' received command 24
2021-10-12 15:55:20.491 [INFO ] [penhab.event.ItemStatePredictedEvent] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' predicted to become 24
2021-10-12 15:55:20.499 [INFO ] [openhab.event.ItemStateChangedEvent ] - Item 'HmIPeTRVC2***geheim***_SetPointTemperature' changed from 24.5 °C to 24 °C

CCU are correct.

EDIT 1
Looks like the issue exist since 2 days at night 02:00 am.

EDIT 2
Ahaaa, looks like the ActualTemperatur not updated since 2 days. But
 Why ?

EDIT 3
Mhhh, Openhab reboot solved the issue
 Not good


Do you know whether the connection between OH and the CCU was interrupted? In this case there is a general problem with HmIP devices. I have created a test version of the binding where I try to fix this problem. See my post from Jan 30 here in this thread.

What kind of CCU do you have? For a CCU2 the fix is probably not sufficient, becaut it nees longer until all HmIP devices are online again.

@MHerbst
I have the same problem. I’m using Raspberrymatic in a Docker container and since some versions there is a monitoring of the various services. When monitoring decides to restart a service, no more events arrive in openHab. In my case, the HMIPServer get’s restarted sometimes per day and the connection is lost.

So we need to modify the connection tracker so that it not only tests the client, but also whether the server receives events. The best way i think is to send a ping and check if a pong is received. If there is no pong event, the RPC server must be restarted. I had already implemented this for the RF interface, but you removed it in January :wink:

I think we need the ping/pong for the RF and the HMIP interface. I am very busy, so I would like to ask you if you can implement these checks.

@gerrieg
I think the current way of detecting a lost connection is sufficent. The re-connect for the RF interface is OK and worked during my tests perfectly.

The big problem is the re-connect for the HmIP interface. It can take quite a while until this process is available and ready to register a callback. In the release version the add-on tried to register the callback for HmIP too early.

Therefore I have already modified the (re)connect handling.

I have already implemented a test version of the binding with a retry for the event registration. I have not created a PR for it yet, because I wanted to get more test results.
I also have to make the retry mechanism more flexible, because a CCU2 needs much longer than a CCU3 until the HmIP process is completely restarted.

Would be great if you could test it in your environement (it needs at least the latest 3.2 milestone version).
If the solution is not sufficient, I would (re)add a ping/pong mechanism.

@MHerbst
Good news! I’m ready to test, where can i find the source/jar?

It’s here: MHerbst/openhab-addons-test (github.com)

I will extend it in the next days to make the number of retries and the time span between the retries configurable.