Java runs amok weekly

Hi Skinah,
Thanks for that assessment. Basic question I know…but I’m not sure about the Java update procedure here… presume I logout/quit openhab… is the update on the Openhab resources page and can it overwrite/update the previous Java without uninstall… don’t want to make things worse…
UPDATE: Figured it out. All updated to latest versions as you recommended. Thanks for your help. We’ll see now what happens this week :slightly_smiling_face:

Hello again Skinah…and everyone else on this topic, Java gone wild again between late last night and this morning. Running at 128% CPU with fans blasting. Updated to latest Java stuff yesterday. Logs show nothing at all unusual. I’m starting to despair… :frowning:
Shell and log screen shots attached. Anybody any new insights…? Thanks to all so far.

May we see your events.log for he period?

Is that HABpanel activity just you working again?

Hi Ross,
Yes that was me messing about with Habpanel. Events log is rather long. Copied into a text file with ‘txt’ suffix…hope that’s readable to all?
Events_log.txt (263.3 KB)
regards,

Ross, details of Java wild process using Activity Monitor on the Mac (saves me from tormenting Unix code):
Activity monitor for Java process Activity Stats for Java Sample of java.txt (264.8 KB)
Maybe this will shed some more light on what’s going on… otherwise I’m baffled.
The ‘java.txt’ file was produced by clicking the ‘sample’ button.
Thanks again.

2020-10-13 07:56:59.285 [vent.ChannelTriggeredEvent] - astro:sun:local:rise#event triggered START
2020-10-13 07:56:59.285 [vent.ChannelTriggeredEvent] - astro:sun:local:civilDawn#event triggered END
2020-10-13 07:56:59.286 [vent.ChannelTriggeredEvent] - astro:sun:local:civilDawn#event triggered END
2020-10-13 07:56:59.286 [vent.ChannelTriggeredEvent] - astro:sun:local:rise#event triggered START

You’re getting doubled Astro events, which is interesting but unlikely to be a problem. Just make a note for now, and see if it persists after anything else gets sorted out.

2020-10-13 02:54:24.198 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:c0ec4738:node21' has been updated.
2020-10-13 02:54:24.203 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:c0ec4738:node9' has been updated.
2020-10-13 02:56:22.227 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:c0ec4738:node11' has been updated.
2020-10-13 02:56:24.694 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:c0ec4738:node9' has been updated.
2020-10-13 02:56:26.478 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:c0ec4738:node12' has been updated.
2020-10-13 02:56:28.432 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:c0ec4738:node11' has been updated.
2020-10-13 02:57:07.789 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:c0ec4738:node13' has been updated.
2020-10-13 02:57:09.506 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:c0ec4738:node12' has been updated.
2020-10-13 02:57:16.933 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:c0ec4738:node13' has been updated.
2020-10-13 02:57:17.287 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:c0ec4738:node18' has been updated.
2020-10-13 02:57:19.416 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:c0ec4738:node14' has been updated.
2020-10-13 02:57:20.261 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:c0ec4738:node22' has been updated.
2020-10-13 02:57:30.684 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:c0ec4738:node18' has been updated.
2020-10-13 02:57:34.128 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:c0ec4738:node22' has been updated.
2020-10-13 02:57:36.306 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:c0ec4738:node23' has been updated.

This is not usual.
“Thing updated” is referring to the configuration of the Thing, not status like online/offline.
It suggests your configuration does not quite match reality.
What happens at this time of morning is a zwave “network heal”, which can be a fairly disruptive process on zwave (but who cares at openHAB end).
However, knock-on problems can arise from Thing updates, unexpected reinitialization of various stuff like the binding itself. I don’t know if twenty such Thing updates lead to a queue of twenty reinitialzes …

2020-10-13 07:16:18.920 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:c0ec4738:node15' has been updated.
...
2020-10-13 10:00:32.928 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:c0ec4738:node20' has been updated.

Other Thing updates trickle in later on as devices wakeup, we’re talking long-running events here.

For purposes of elimination, you can I believe disable the zwave nightly heal.

For a fix, you may need to delete and rediscover each Thing so that they match the discovery. This is often needed if pre-existing Things are carried forward to updated binding that changed some aspect. If you re-use the same Thing uID, pre-existing Items etc. should reattach.

Hi Ross, thank you for all your time on this. I appreciate it.
I have disabled the z-wave nightly heal now from within the HABmin panel and saved the change.
What are the consequences…short or long term…of not having a wave heal?
Will I wait and see before deleting the ‘Things’?
I am using an Aeotec z-wave stick as my zwave radio. It can be brought about the house to add things as required whose details are then stored within it. Do I need to actually wipe its memory /reset or will I just delete the ‘things’ from Openhab and let Openhab rediscover them? If for some reason they get a different ‘Thing uID’ will I just overwrite it in the Paper UI with the original ID? I don’t want to lose all the setup/connections between the present items and the present Things.
regards,

It’s to do with pathway mapping. As I understand it, everything will work, you only need to heal when you reconfigure the mesh by moving/adding/removing nodes. And then I think it’s more “desirable” than “needed”.

No, leave it alone! Only look at OH Things.

As I understand it should not, because the unchanged stick will supply the same id

Hi Ross, …and others who contributed…time to give up on this I fear and close the topic as ‘unsolved’. I really thought Ross’s suggestion to disable ‘Nightly Z-wave heal’ had worked but today, 8 days later, Java is at it again. Sometime between 1 and 9 am it got out of control. Have examined all logs and nothing particular stands out. So time to pack it in and try another hardware platform or a different system altogether. Maybe I’ll be luckier with Openhab 3… Thanks to all who offered advice and were generous with their time.
regards, Denis

Final effort to solve Java runaway if anyone can help. Caught Java event in the act!
Java unravels around 01.47 am 27-Oct. I shut down system at 02.00 am.
Had installed binding ‘Systeminfo’ on previous day to log fan speed and CPU temperature and set up a rule to notify me if they went beyond certain limits.
Openhab log, Events log and CPU Temperature log attached for runaway period.
I had installed another Binding ‘Network Discovery’ shortly before the event but deleted the ‘things’ discovered and uninstalled it because I decided it wasn’t useful to me. Is this the culprit this time?? It’s all very odd the way the runaways seem to happen in the early hours even though Z-Wave heal is NOT enabled. Also, ‘Network Discover’ was not involved in previous runaways. All advice very welcome. regards, Java runaway-Events log.txt (226.1 KB) Javarunaway starts at 01.45-openhab-log.txt (5.3 KB) Javarunaway-CPU Temperature log.txt (21.1 KB)

Did you restart OH just before the log starts?

2020-10-27 01:45:29.922 [INFO ] [basic.internal.servlet.WebAppServlet] - Stopped Basic UI\
2020-10-27 01:45:38.538 [INFO ] [ui.habmin.internal.servlet.HABminApp] - Started HABmin servlet at /habmin\
2020-10-27 01:45:38.807 [INFO ] [panel.internal.HABPanelDashboardTile] - Started HABPanel at /habpanel\
2020-10-27 01:45:38.922 [INFO ] [openhab.ui.paper.internal.PaperUIApp] - Started Paper UI at /paperui\
2020-10-27 01:45:39.840 [WARN ] [lipse.smarthome.io.net.exec.ExecUtil] - Execution failed (Exit value: -559038737. Caused by java.io.IOException: Cannot run program "arping" (in directory "."): error=2, No such file or directory)\
2020-10-27 01:45:39.908 [WARN ] [lipse.smarthome.io.net.exec.ExecUtil] - Execution failed (Exit value: -559038737. Caused by java.io.IOException: Cannot run program "arping" (in directory "."): error=2, No such file or directory)\
2020-10-27 01:45:42.339 [INFO ] [zwave.handler.ZWaveControllerHandler] - Attempting to add listener when controller is null\
2020-10-27 01:45:47.500 [INFO ] [ve.internal.protocol.ZWaveController] - Starting ZWave controller\
2020-10-27 01:45:47.501 [INFO ] [ve.internal.protocol.ZWaveController] - ZWave timeout is set to 5000ms. Soft reset is false.\
2020-10-27 01:45:49.354 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = XXXXXXXXXXXXXXXXXXX, base URL = http://localhost:8080)\

These lines indicate that at least a few services re-/starts. Can you include the logs a few more minutes back?

1 Like

Deeply suspicious this is around the time when heals should often happen.

So far as I can see, what you are labelling “java runaway” appears to be a system boot up. We need that preceding info.

I do not like this entry -

2020-10-27 01:45:51.880 [me.event.ThingUpdatedEvent] - Thing 'zwave:serial_zstick:c0ec4738' has been updated.

as before, it suggests you recorded Thing configuration does not quite match the reality discovered.

Hello Anders, no I didn’t restart OH at all at any stage. I have posted the OH log info for earlier (from 22.20 actually) as you requested. Also now posting Events log from earlier till shutdown. I had been experimenting with my ‘fan speed’ rule by ramping up the fan with ‘Mac fans control’ to simulate a runaway during the day with no adverse affects.
Reminder: I’m running OH in its own Desktop folder on MacOSX. Restart done by simply dragging the ‘start.sh’ file into a Terminal window. Shutdown by ‘logout’ in Terminal window. regards and thanks, OpenHab log a bit earlier.txt (2.1 KB) Events log from 0030 until shutdown at 0200.txt (827.1 KB)

Well, let’s be clear …

So, us who cannot see what you are doing are going to assume the apparent reboot at 01:45 is not your doing, but the 02:00 one is your intervention.

Alright, logs show business as usual until 01:45 where we have a major reinitialize. Very similar to a reboot - but maybe it’s just all/many bindings reinitializing.
The first to speak up is zwave - and it is still suspicious that time of day relates to zwave heal activities.
I do not like the stick Thing updating.
But let’s call zwave ‘prime suspect’, not culprit.

There are I think other ways to cause chaos - binding version updates, or failure to to update, etc. What is in your addons.config? People have seen very similar affects with rubbish in there - but usually much more frequently.

In the middle of all thing initializations:

2020-10-27 01:45:39.913 [thome.event.ExtensionEvent] - Extension 'binding-network' has been installed.\

As you suspected, that might be the cause this time, but unfortunately doesn’t help with the issue the other times it has happened for you.

1 Like

Yes absolutely Ross.

Z-Wave heal definitely deactivated using Habmin.

Addons.config folder is empty. I have only 9 bindings installed: Astro, Automower, Expire, Harmony Hub, OpenWeatherMap map, Sysinfo, Tradfri (which I don’t use so far), Weather Binding ver 1.14.0 (which I don’t use and dunno why I installed it), Z-Wave. Also, Map & Scale Transformations, OpenHan cloud com add-on and MapDBpersistence.
Ross can you step me through your suggested process of deleting all my ‘things’ (I’m using paper UI for adding things)…do I delete the Z-Wave stick first or the other things first? Just wary of mucking up…I do have backups done. Thanks,

addons.config is not a folder, it is file. Squirelled away in some other place from the addons folder I think.
addons.config is a working copy that we are not supposed to edit, but if it gets messed up sometimes you have to.
It is supposed to mirror the similarly named addons.cfg file.

Don’t rush around deleting all Things. One step at a time.
Guessing you not yet attempted this as described earlier, focus on the stick Thing.
Note the UID.
Delete the Thing.
Reboot for belt and braces.
Shortly find the rediscovered stick I think? Zwave discovers? If not, manually recreate however you did it before.
Make sure the UID is the same and all its children should reattach.

The point here is to create the Thing under the current binding version, not use a historic Thing shape.

Haha isn’t it always the way that stuff gets squirrelled away :upside_down_face: I probably could hunt it down in unix but I’ll go with your advice and leave it alone for the moment.
I’ll follow your steps and just focus on the Aeotec stick Thing and delete & re-add etc. I’ll then do nothing else to the system and see how it all runs during the week…so that way we’re only changing one part at a time…does that sound like a good plan?
ps. do you suspect anything with that old ‘Weather binding’ ver 1.14.0? I may have installed that solely for Habpanel at some stage when experimenting. thanks again,

Why keep it if not using it.

You might also look into that network binding oddity @pacive noted - that sounds like the work of addons.cfg
Maybe it’s gone away after a reboot after reinstalling, maybe it hasn’t.

Hello Ross, more bad news. I deleted and reconnected the Z-Wave stick as you recommended 4 days ago. Everything fine until early hours of this morning…must be the halloween factor… :alien: My Java/temperature/fan speed rule alerted me at 02.27 am. I shutdown OpenHab at 03.28 am
I attach events log from 01.30 until shutdown at 03.28. You’ll see the CPU temperature creeping up from 02.27 onwards. MacMini doing nothing, just idling quietly. I wasn’t on it. Found Java hogging >220% of CPU!! (using Terminal app utility).
Also attach filtered events log to spot those ‘double astro’ events you mentioned in earlier posts. I can’t see anything drastic in the events log that might explain Java runaway. Also, I checked the Java garbage/thread situation before shutdown in the Karaf terminal using ‘shell:info’ and all seemed relatively normal. I’m really baffled and don’t want to abandon Openhab after all my time investment and designing a nice Habpanel interface too. Hope you have a few more suggestions. regards,Astro double events.txt (3.2 KB) Events log 31Oct.txt (377.0 KB)