Homematic IP device status updates stop being received by openHAB (but controlling these devices still works)

Platform information:

  • Homematic IP Hardware: CCU2 (2.59.7), multiple Homematic IP devices
  • openHAB Hardware: Raspi 3 (Raspian OS 10)
  • openHAB Software: 3.1.0 Release Build, running on Docker 2.6.2
  • openHAB Bindings: Homematic Binding (logging the heating, controlling lights and blinds), Gardena Binding for Gardena Gateway / smart irrigation control (logging soil humidity), Alexa Binding (controlling the Homematic lights)

I observed a rather odd behavior with my Homematic binding. While I could fix it easily with a reboot of openHAB, I’d like to know whether this is either a (known) bug or a configuration issue on my side.

Observation:
(log entries illustrative, so log times do not match my observation)
Until around 4:12 pm today, things worked perfectly: I was able to control Homematic items perfectly via openHAB…

22:49:24.360 [INFO ] [openhab.event.ItemCommandEvent       ] - Item 'Stehlampe' received command ON
22:49:24.368 [INFO ] [openhab.event.ItemStatePredictedEvent] - Item 'Stehlampe' predicted to become ON
22:49:24.379 [INFO ] [openhab.event.ItemStateChangedEvent  ] - Item 'Stehlampe' changed from 14 to 100

… and sensors and devices (if switched on the device itself / not via openHAB) regularly sent their status updates to openHAB…

23:16:08.637 [INFO ] [openhab.event.ItemStateChangedEvent  ] - Item 'Lichtsensor_Garten_HighestIllumination' changed from 1.08 lx to 1.09 lx
23:16:50.843 [INFO ] [openhab.event.ItemStateChangedEvent  ] - Item 'Thermostat_Garten_ActualTemperature' changed from 23.3 °C to 23.2 °C
23:30:52.802 [INFO ] [openhab.event.ItemStateChangedEvent  ] - Item 'Stehlampe' changed from 0 to 100
23:30:55.839 [INFO ] [openhab.event.ItemStateChangedEvent  ] - Item 'Stehlampe' changed from 100 to 0

At 4:12 something odd happened: While I could continue to control my Homematic devices via openHAB, the status updates of the sensors and devices stopped reaching openHAB.

Example:
While the CCU2 continued to monitor my light sensor with a time resolution of 1 - 3 minutes…


… it did not reach openHAB anymore (refer to the straight line at the end of the graph)…

… even though the communication between openHAB and the CCU was still in place (I could still control devices).

What’s even more odd: The communication between the CCU and Openhab did not stop completely at 4:12. After a short break there was one more update at 4:32, and then after a loooong break another update only at 10:15 (~ 6 hours later).

After doing an openHAB reboot, everything went back to normal (devices and sensors sending their status updates regularly again).

My interpretation: Since…

  • …everything was still logged correctly at the CCU…
  • … a reboot of the openHAB binding fixed the problem…
  • … and, during the time of the odd behavior, devices could still be controlled via openHAB…
  • … the problem can’t be on the CCU-side…
  • … and I suspect maybe a bug in the openHAB Homematic binding?

Or is it a configuration issue?

Problem is that, if this happens unnoticed, it prevents logging altogether for as long as you reboot openHAB.

Looking forward to your thoughts.

Update: What I experienced sounds similar to what has been described here After network outage Homematic items are visible and receiving updates on the device, but not updating values in Openhab - #10 by Cplant and here Debugging Homematic IP: no updates on some HmIP devices after some time

Update 2: After having had a bunch of connectivity problems for months, and after (unsuccessfully) having changed almost everything except the rounter, I finally replaced the router with something decent (Fritz! Box). And voillà: All my connectivity problems are gone.

The IP devices are a bit “stubborn”. I have tried to find a solution and you can download a test version of the binding here: https://github.com/MHerbst/openhab-addons-test

I am not sure whether it will work with 3.1. Maybe you have to use a Milestone release of 3.2. Because you are using the binding in a Docker container you also have to configure the correct callback address.

I observed the same behaviour including the most recent Openhab 4.04. However I noticed that after rebooting the Homematic binding the problem shows up again after only a couple of hours. Restarting openhab itself fixes the problem for a couple of days to a couple of weeks.

I can confirm @mschlee observation. Also after update to 4.04 homematic binding stop updating values w/o any log messages. All things show up as online. A restart of the binding solves the issue for a couple of hours.
Updating to 4.1.0.M4 did not help.

A very weired observation. I also have the NTP binding stopis being updated. There often it helps to restart the Spotify binding.

This leads me to the hypothesis that these issues might be correlated / originated from I have a the the issues reported around Heap, threads, memory issues mentioned:

Even after restart of openHAB homematic binding only runs stable for apporx 24h. Then no updates are received. I have th following log entry. in openhab.log.

2023-12-17 02:40:05.113 [ERROR] [ommunicator.AbstractHomematicGateway] - java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Total timeout 15000 ms elapsed
2023-12-17 02:40:40.704 [WARN ] [ternal.handler.HomematicThingHandler] - Value for datapoint HmDatapoint[name=TX_THRESHOLD_POWER,value=0.0,defaultValue=100.0,type=FLOAT,minValue=0.01,maxValue=160000.0,options=null,readOnly=false,readable=true,unit=W,description=TX_THRESHOLD_POWER,info=null,paramsetType=MASTER,virtual=false,trigger=false] is outside of valid range, using default value for config.
2023-12-17 02:40:42.021 [WARN ] [ternal.handler.HomematicThingHandler] - Value for datapoint HmDatapoint[name=TX_THRESHOLD_POWER,value=0.0,defaultValue=100.0,type=FLOAT,minValue=0.01,maxValue=160000.0,options=null,readOnly=false,readable=true,unit=W,description=TX_THRESHOLD_POWER,info=null,paramsetType=MASTER,virtual=false,trigger=false] is outside of valid range, using default value for config.
2023-12-17 02:40:49.405 [WARN ] [ternal.handler.HomematicThingHandler] - Value for datapoint HmDatapoint[name=TX_THRESHOLD_POWER,value=0.0,defaultValue=100.0,type=FLOAT,minValue=0.01,maxValue=160000.0,options=null,readOnly=false,readable=true,unit=W,description=TX_THRESHOLD_POWER,info=null,paramsetType=MASTER,virtual=false,trigger=false] is outside of valid range, using default value for config.
2023-12-17 02:40:49.495 [WARN ] [ternal.handler.HomematicThingHandler] - Value for datapoint HmDatapoint[name=TX_THRESHOLD_POWER,value=0.0,defaultValue=100.0,type=FLOAT,minValue=0.01,maxValue=160000.0,options=null,readOnly=false,readable=true,unit=W,description=TX_THRESHOLD_POWER,info=null,paramsetType=MASTER,virtual=false,trigger=false] is outside of valid range, using default value for config.

Here my thing configuration:

UID: homematic:bridge:LEQ1005809
label: thi_HM-CCU3-App - 192.168.0.236
thingTypeUID: homematic:bridge
configuration:
  cuxdPort: 8701
  socketMaxAlive: 900
  installModeDuration: 60
  callbackRegTimeout: 120
  hmIpPort: 2010
  timeout: 15
  factoryResetOnDeletion: false
  discoveryTimeToLive: -1
  wiredPort: 2000
  gatewayType: ccu
  groupPort: 9292
  gatewayAddress: 192.168.0.236
  unpairOnDeletion: false
  rfPort: 2001
  bufferSize: 2048
channels:
  - id: DUTY_CYCLE_RATIO
    channelTypeUID: homematic:DUTY_CYCLE_RATIO
    label: Auslastungsgrad
    description: Aktuelle Zyklusbenutzung
    configuration: {}

My info:


                           _   _     _     ____
   ___   ___   ___   ___  | | | |   / \   | __ )
  / _ \ / _ \ / _ \ / _ \ | |_| |  / _ \  |  _ \
 | (_) | (_) |  __/| | | ||  _  | / ___ \ | |_) )
  \___/|  __/ \___/|_| |_||_| |_|/_/   \_\|____/
       |_|       4.1.0.M4 - Milestone Build

Use '<tab>' for a list of available commands
and '[cmd] --help' for help on a specific command.
To exit, use '<ctrl-d>' or 'logout'.

openhab> info
Karaf
  Karaf version               4.4.4
  Karaf home                  /usr/share/openhab/runtime
  Karaf base                  /var/lib/openhab
  OSGi Framework              org.eclipse.osgi-3.18.0.v20220516-2155

JVM
  Java Virtual Machine        OpenJDK 64-Bit Server VM version 17.0.9+8-LTS
  Version                     17.0.9
  Vendor                      Azul Systems, Inc.
  Pid                         1338504
  Uptime                      1 day 14 hours
  Process CPU time            1 hour 37 minutes
  Process CPU load            0.03
  System CPU load             0.09
  Open file descriptors       239
  Max file descriptors        102,642
  Total compile time          9 minutes
Threads
  Live threads                297
  Daemon threads              138
  Peak                        313
  Total started               78989
Memory
  Current heap size           618,353 kbytes
  Maximum heap size           2,048,000 kbytes
  Committed heap size         845,824 kbytes
  Pending objects             0
  Garbage collector           Name = 'G1 Young Generation', Collections = 872, Time = 24.253 seconds
  Garbage collector           Name = 'G1 Old Generation', Collections = 0, Time = 0.000 seconds
Classes
  Current classes loaded      34,543
  Total classes loaded        39,929
  Total classes unloaded      5,386
Operating system
  Name                        Linux version 6.2.16-3-pve
  Architecture                amd64
  Processors                  2
  Total physical memory       3,432,636 kbytes
  Free physical memory        119,040 kbytes

It seems the be same issue as decribed here:

Anyone an idea how to approach this?

The issue is not appearing any more. Most likely updating CCU3 to latest firmware version 3.73.9 was the solution.

Homematic binding is now stable since 7 days.