openHABian / Services freezes

Hi,

i´ve also installed htop. At the moment everythings look fine. I´ve to wait until the systems freezes again to see if i´ve the same behavior like Robert.
The difference to Robert is that i dont have mosquitto installed.

2 Likes

With MQTT deactivated, everything seems fine for a while:

I guess, it has to do with the interpretation of values being published by various devices - I saw some “java conversation” errors in my logs. But even if some format isn’t okay, it shouldn’t crash the whole system - right?

@Thorhammer, do you have any java errors in your logfile?

the openhab.log or another one?
Now, with the MQTT binding deactivated: is the OH log clean?

some general questions:

  • what kind of mqtt traffic exists between openHAB and the broker?
  • how many OH items are using the mqtt binding?
  • item configs: mainly incoming (subscribing) or outgoing (publishing) (or both) from openHAB towards the broker?
  • how often the items receive updates?

There may be a bug somewhere in the code but we need to dig deep to identify it :slight_smile:

Hi Robert, i´ve no java errors in my logfile.

Hi Dim,
here are my mqtt-items:

subscribing:

String PoolSolarStatus        "Status Pool-Sensoren [%s]"      <status2>                     {mqtt="<[mqtt_broker:poolsolar/status:state:default]"}
Number PoolSolarTempSkimmer   "Skimmer [%.1f °C]"              <skimmer_temp>   (gPoolSolar) {mqtt="<[mqtt_broker:poolsolar/temp_skimmer:state:default]"}
Number PoolSolarTempPumpe     "Pumpe [%.1f °C]"                <pump_temp>      (gPoolSolar) {mqtt="<[mqtt_broker:poolsolar/temp_pumpe:state:default]"}
Number PoolSolarTempDach      "Kollektor [%.1f °C]"            <kollektor_temp> (gPoolSolar) {mqtt="<[mqtt_broker:poolsolar/temp_dach:state:default]"}
Number PoolSolarTempSiri      "Schwimmbad [%.1f °C]"        ["CurrentTemperature"]       {mqtt="<[mqtt_broker:poolsolar/temp_skimmer:state:default]"}
String AlarmStatus                 "Status Alarmanlage [%s]"         <alarm>        {mqtt="<[mqtt_broker:haus/alarmwatcher/status:state:default]"}
Number Badezimmer_Temperatur          "Temperatur [%.1f °C]"                     <temperature>  {mqtt="<[mqtt_broker:haus/badezimmer/temp:state:default]"}
Number Badezimmer_Feuchte             "Luftfeuchte [%.1f %%]"                    <humidity>     {mqtt="<[mqtt_broker:haus/badezimmer/humidity:state:default]"}

publishing:

Number PoolSolarRelais        "Pool-Relais"                                                  {mqtt=">[mqtt_broker:poolsolar/relais:command:ON:1],>[mqtt_broker:poolsolar/relais:command:OFF:0]"}
Switch Gartentor                   "Gartentor"                       <tuere>        ["Switchable"] {autoupdate="false", mqtt=">[mqtt_broker:haus/gartentor/open:command:ON:1],>[mqtt_broker:haus/gartentor/open:command:OFF:0]"}

I also do have a bunch of rules; I can send them to you if you need to see them all.
Somehow I assume, it happens in this part here:

///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
// Pool-Steuerung
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
rule Pool_Solar_Steuerung
     when 
          // Bei Betätigung des Schaltes oder jede Minute
          Item PoolSolarMode received command or
          Time cron "0 0/1 * * * ?"   
     then 
       // Durchschnitt aus Skimmer- und Pumpen-Temperatur berechnen
       var Number PoolTempDS = ((PoolSolarTempSkimmer.state as DecimalType + PoolSolarTempPumpe.state as DecimalType)/2)
       // Temperaturdifferenz von Dachtemperatur abziehen (für den einfacheren Vergleich)
       var Number PoolTempDach = PoolSolarTempDach.state as DecimalType
       var Number PoolTempDiff = PoolSolarTempDiff.state as DecimalType
       var Number PoolTempDD = (PoolTempDach - PoolTempDiff)
       logDebug("RULE","PoolTempDS=" + PoolTempDS + ", PoolTempDach=" + PoolTempDach + ", PoolTempDiff=" + PoolTempDiff + ", PoolTempDD=" + PoolTempDD)
 
       switch (PoolSolarMode.state)
       {
       // Wenn "AUTO" gewählt ist, dann abhängig von der Temperaturdifferenz schalten
       case 2: if ( PoolTempDS < PoolTempDD )
                  {
                    sendCommand(PoolSolarRelais, 1)
                    logDebug("RULE", "PoolSolarMode AUTO: ON (Pool=" + PoolTempDS +"°, Dach=" + PoolTempDach + "°)")
                  } else
                  {
                    sendCommand(PoolSolarRelais, 0)
                    logDebug("RULE", "PoolSolarMode AUTO: OFF (Pool=" + PoolTempDS +"°, Dach=" + PoolTempDach + "°)")
                  }       
       // "OFF" (manuell)
       case 0:  {
                      sendCommand(PoolSolarRelais, 0)
                      logDebug("RULE", "PoolSolarMode MANUAL: OFF")
                    }
       // "ON" (manuell)
       case 1:   {
                      sendCommand(PoolSolarRelais, 1)
                      logDebug("RULE", "PoolSolarMode MANUAL: ON")
                     }
       }
       
end
1 Like

regarding the question how often they are being updated:

  • “Alarmstatus” is being published by the Arduino who monitors my alarm system as soon as the state changes - this happens a few times a day
  • Badezimmer_Temperatur and Badezimmer_Feuchte are being updated every 5 minutes
  • all other items (belonging to our pool) are being updated every 60 seconds
1 Like

I don’t see a lot of load on the OH2<->Broker link…
this shouldn’t be the problem…
I don’t know what to check next :frowning:
I would try to narrow it down by slowly adding stuff to the mqtt setup in OH2 to see if some specific config causes these problems…

Crashed again, this time without MQTT - this doesn’t seem to be the problem.
One CPU core again is at 100%.

Now I try to deactivate the Harmony Hub, because I found some errors in the log:

2017-07-15 22:29:22.182 [WARN ] [harmonyhub.handler.HarmonyHubHandler] - heartbeat failed

java.lang.RuntimeException: Failed communicating with Harmony Hub

	at net.whistlingfish.harmony.HarmonyClient.sendOAStanza(HarmonyClient.java:275)[189:org.openhab.binding.harmonyhub:2.1.0]

	at net.whistlingfish.harmony.HarmonyClient.sendOAStanza(HarmonyClient.java:264)[189:org.openhab.binding.harmonyhub:2.1.0]

	at net.whistlingfish.harmony.HarmonyClient.sendPing(HarmonyClient.java:333)[189:org.openhab.binding.harmonyhub:2.1.0]

	at org.openhab.binding.harmonyhub.handler.HarmonyHubHandler$2.run(HarmonyHubHandler.java:215)[189:org.openhab.binding.harmonyhub:2.1.0]

	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)[:1.8.0_121]

	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)[:1.8.0_121]

	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)[:1.8.0_121]

	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)[:1.8.0_121]

	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)[:1.8.0_121]

	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)[:1.8.0_121]

	at java.lang.Thread.run(Thread.java:745)[:1.8.0_121]

Caused by: org.jivesoftware.smack.SmackException$NoResponseException: No response received within reply timeout. Timeout was 5000ms (~5s). Waited for response using: net.whistlingfish.harmony.protocol.OAReplyFilter@14583d2.

	at org.jivesoftware.smack.SmackException$NoResponseException.newWith(SmackException.java:110)

	at org.jivesoftware.smack.SmackException$NoResponseException.newWith(SmackException.java:97)

	at net.whistlingfish.harmony.HarmonyClient.getNextStanzaSkipContinues(HarmonyClient.java:287)[189:org.openhab.binding.harmonyhub:2.1.0]

	at net.whistlingfish.harmony.HarmonyClient.sendOAStanza(HarmonyClient.java:273)[189:org.openhab.binding.harmonyhub:2.1.0]

	... 10 more

2017-07-15 22:30:05.841 [WARN ] [ore.internal.events.OSGiEventManager] - Dispatching event to subscriber 'org.eclipse.smarthome.io.monitor.internal.EventLogger@10e48a6' takes more than 5000ms.

2017-07-15 22:29:29.015 [WARN ] [ore.internal.events.OSGiEventManager] - Dispatching event to subscriber 'org.eclipse.smarthome.core.thing.internal.ThingManager@1ad441a' takes more than 5000ms.

2017-07-15 22:30:42.285 [WARN ] [ore.internal.events.OSGiEventManager] - Dispatching event to subscriber 'org.eclipse.smarthome.core.internal.items.ItemUpdater@3c20b1' takes more than 5000ms.

2017-07-15 22:31:04.802 [WARN ] [software.smack.tcp.XMPPTCPConnection] - shutdownDone was not marked as successful by the writer thread

org.jivesoftware.smack.SmackException$NoResponseException: No response received within reply timeout. Timeout was 5000ms (~5s). While waiting for shutdown completed

	at org.jivesoftware.smack.SmackException$NoResponseException.newWith(SmackException.java:92)

	at org.jivesoftware.smack.SynchronizationPoint.checkForResponse(SynchronizationPoint.java:260)[189:org.openhab.binding.harmonyhub:2.1.0]

	at org.jivesoftware.smack.SynchronizationPoint.checkIfSuccessOrWait(SynchronizationPoint.java:155)[189:org.openhab.binding.harmonyhub:2.1.0]

	at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketWriter.shutdown(XMPPTCPConnection.java:1346)[189:org.openhab.binding.harmonyhub:2.1.0]

	at org.jivesoftware.smack.tcp.XMPPTCPConnection.shutdown(XMPPTCPConnection.java:484)[189:org.openhab.binding.harmonyhub:2.1.0]

	at org.jivesoftware.smack.tcp.XMPPTCPConnection.shutdown(XMPPTCPConnection.java:465)[189:org.openhab.binding.harmonyhub:2.1.0]

	at org.jivesoftware.smack.AbstractXMPPConnection.disconnect(AbstractXMPPConnection.java:732)[189:org.openhab.binding.harmonyhub:2.1.0]

	at org.jivesoftware.smack.AbstractXMPPConnection.disconnect(AbstractXMPPConnection.java:707)[189:org.openhab.binding.harmonyhub:2.1.0]

	at net.whistlingfish.harmony.HarmonyClient.disconnect(HarmonyClient.java:106)[189:org.openhab.binding.harmonyhub:2.1.0]

	at org.openhab.binding.harmonyhub.handler.HarmonyHubHandler.disconnectFromHub(HarmonyHubHandler.java:237)[189:org.openhab.binding.harmonyhub:2.1.0]

	at org.openhab.binding.harmonyhub.handler.HarmonyHubHandler.setOfflineAndReconnect(HarmonyHubHandler.java:242)[189:org.openhab.binding.harmonyhub:2.1.0]

	at org.openhab.binding.harmonyhub.handler.HarmonyHubHandler.access$3(HarmonyHubHandler.java:241)[189:org.openhab.binding.harmonyhub:2.1.0]

	at org.openhab.binding.harmonyhub.handler.HarmonyHubHandler$2.run(HarmonyHubHandler.java:218)[189:org.openhab.binding.harmonyhub:2.1.0]

	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)[:1.8.0_121]

	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)[:1.8.0_121]

	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)[:1.8.0_121]

	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)[:1.8.0_121]

	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)[:1.8.0_121]

	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)[:1.8.0_121]

	at java.lang.Thread.run(Thread.java:745)[:1.8.0_121]

2017-07-15 22:30:42.290 [WARN ] [.whistlingfish.harmony.HarmonyClient] - Send heartbeat failed

java.lang.RuntimeException: Failed communicating with Harmony Hub

	at net.whistlingfish.harmony.HarmonyClient.sendOAStanza(HarmonyClient.java:275)[189:org.openhab.binding.harmonyhub:2.1.0]

	at net.whistlingfish.harmony.HarmonyClient.sendOAStanza(HarmonyClient.java:264)[189:org.openhab.binding.harmonyhub:2.1.0]

	at net.whistlingfish.harmony.HarmonyClient.sendPing(HarmonyClient.java:333)[189:org.openhab.binding.harmonyhub:2.1.0]

	at net.whistlingfish.harmony.HarmonyClient$2.run(HarmonyClient.java:177)[189:org.openhab.binding.harmonyhub:2.1.0]

	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)[:1.8.0_121]

	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)[:1.8.0_121]

	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)[:1.8.0_121]

	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)[:1.8.0_121]

	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)[:1.8.0_121]

	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)[:1.8.0_121]

	at java.lang.Thread.run(Thread.java:745)[:1.8.0_121]

Caused by: org.jivesoftware.smack.SmackException$NotConnectedException: The connection HarmonyXMPPTCPConnection[client@1111/main] (1) is no longer connected. done=true smResumptionPossible=false

	at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketWriter.throwNotConnectedExceptionIfDoneAndResumptionNotPossible(XMPPTCPConnection.java:1307)[189:org.openhab.binding.harmonyhub:2.1.0]

	at org.jivesoftware.smack.tcp.XMPPTCPConnection.throwNotConnectedExceptionIfAppropriate(XMPPTCPConnection.java:356)[189:org.openhab.binding.harmonyhub:2.1.0]

	at org.jivesoftware.smack.AbstractXMPPConnection.sendStanza(AbstractXMPPConnection.java:670)[189:org.openhab.binding.harmonyhub:2.1.0]

	at net.whistlingfish.harmony.protocol.HarmonyXMPPTCPConnection.sendStanza(HarmonyXMPPTCPConnection.java:49)[189:org.openhab.binding.harmonyhub:2.1.0]

	at net.whistlingfish.harmony.HarmonyClient.sendOAStanza(HarmonyClient.java:272)[189:org.openhab.binding.harmonyhub:2.1.0]

	... 10 more

@Thorhammer: Do you have a Logitech Harmony Hub in your configuration?

:frowning: :frowning: :frowning:

I hate unstable systems :face_with_monocle:
I wish I could help more… I am switching off for tonight :slight_smile:

No, i don´t have a Logitech Harmony Hub.
What i´ve done now is to clone the SD Card with “dd” to a new one…only to make sure it´s not the SD card.
I double checked syslog, dmesg, debug, kern.log, messages, openhab.log, homegear.log, homegear.err but none of them show errors that give me a hint.

Should I try to move to 2.2 snapshot? Now it’s not working at all; couldn’t get any worse with 2.2, could it?

Switching now over to 2.2 “unstable” to see if it’s more stable than the 2.1 “stable”. If the problem vanishes, fine - and if not, I’ll open a bug request in github.

I’m using the unstable branch since almost 3 years now and only had issues two or three times. Reverting back is easy so almost no risk … good luck.

2 Likes

Running 2.2 now - cannot tell much at the moment except that the load on the single cores seems to spread across all cores, not only one core at 100%. Will post an update later. Thanks for all your contribution, especially @Dim

1 Like

Running stable on 2.2 for two hours now, no delays anymore, system is fully responsive. Whatever these problems caused, they are gone now with 2.2 “unstable”.

2 Likes

My excitment was premature - it happened again:

@Thorhammer, do you use Homekit integration? Now I get JAVA exceptions from HomeKit…maybe this causes the crashes?

Hi Robert,

yes i do but i don´t have java exceptions…not yet :wink:
So now since changing the SD card…more then 24h stable system. So let´s see if this was the error. I´ll keep you informed.

Still stable?

hi, yes still stable. I´ve ordered a second PI and i´ll test the card in the new PI.