Sorry for not acknowledging your post, I didn’t see it until today; Three months late!
Thanks for the advice. Unfortunately clearing the cache did not solve the problem. System still runs, but the log file is reporting the error. Example:
2018-10-22 15:38:40.327 [ERROR] [core.karaf.internal.FeatureInstaller] - Failed installing 'openhab-misc-myopenhab': Error restarting bundles:
Unable to acquire the state change lock for the module: osgi.identity; osgi.identity="org.openhab.binding.insteonplm"; type="osgi.bundle"; version:Version="1.12.0" [id=210] STARTED [STOPPED]
I’m fixing that now. After my post I found that OpenHab2 had been unexpectedly corrupted from the cache clear. The system was stuck endlessly trying to load.
I am reverting back to an earlier SD Card backup image. That is much easier/faster than trying to debug the crash problem.
The name of the addon is openhabcloud and not myopenhab. You might be seeing this error now because the feature installation again logs missing addons on error (see PR #459).
Interesting. I failed reproducing it on 2.5.0-SNAPSHOT after adding some invalid binding to addons.cfg. I did open Basic UI, but maybe my desktop PC is too fast so it doesn’t continuously try to reconnect and add more SSE listeners. Did you test it on a more resource constrained device (rpi3) and did you obtain a heapdump?
Basic UI is not a factor here. Sorry if I wasn’t clear, but I was referring to the issue with invalid binding names in addons.cfg.
My test box is fairly powerful (6 core, 3.7 GHz, 8 GB). When I put an invalid binding name in addons.cfg, and monitor memory use with pidstat, I can see memory increase slightly every minute when it fails to install the bogus binding.
Below is a snip of a pidstat I’m running on the openHAB java process that logs every 15 seconds. This box is running build 1484. You can see the RSS and %MEM slowly increasing every minute. When I started this at 09:42:28, %MEM was at 11.67. It’s now at 12.57%.
I tried to debug this a while back, but I got a little lost in the code. I never did figure out what was causing the leak.
09:53:13 AM UID PID minflt/s majflt/s VSZ RSS %MEM Command
09:53:28 AM 0 25958 0.20 0.00 7052464 1020164 12.49 java
09:53:43 AM 0 25958 25.27 0.00 7054512 1020336 12.49 java
09:53:58 AM 0 25958 0.40 0.00 7054512 1020336 12.49 java
09:54:13 AM 0 25958 4.67 0.00 7054512 1020336 12.49 java
09:54:28 AM 0 25958 0.87 0.00 7054512 1020336 12.49 java
09:54:43 AM 0 25958 22.53 0.00 7054512 1020588 12.50 java
09:54:58 AM 0 25958 1.80 0.00 7054512 1020588 12.50 java
09:55:13 AM 0 25958 3.47 0.00 7054512 1020588 12.50 java
09:55:28 AM 0 25958 0.80 0.00 7054512 1020588 12.50 java
09:55:43 AM 0 25958 34.07 0.00 7054512 1021500 12.51 java
09:55:58 AM 0 25958 1.93 0.00 7054512 1021500 12.51 java
09:56:13 AM 0 25958 4.47 0.00 7054512 1021500 12.51 java
09:56:28 AM 0 25958 0.33 0.00 7054512 1021500 12.51 java
09:56:43 AM 0 25958 34.80 0.00 7054512 1022280 12.52 java
09:56:58 AM 0 25958 0.47 0.00 7054512 1022280 12.52 java
09:57:13 AM 0 25958 4.93 0.00 7054512 1022280 12.52 java
09:57:28 AM 0 25958 0.60 0.00 7054512 1022208 12.52 java
09:57:43 AM 0 25958 33.00 0.00 7054512 1022744 12.52 java
09:57:58 AM 0 25958 1.80 0.00 7054512 1022744 12.52 java
09:58:13 AM 0 25958 2.00 0.00 7054512 1022744 12.52 java
09:58:28 AM 0 25958 2.67 0.00 7054512 1022744 12.52 java
09:58:43 AM 0 25958 38.67 0.00 7054512 1023696 12.53 java
09:58:58 AM 0 25958 1.53 0.00 7054512 1023696 12.53 java
09:59:13 AM 0 25958 11.13 0.00 7054512 1023696 12.53 java
09:59:28 AM 0 25958 2.40 0.00 7054512 1023696 12.53 java
09:59:43 AM 0 25958 44.20 0.00 7054752 1024764 12.55 java
09:59:58 AM 0 25958 0.53 0.00 7054752 1024764 12.55 java
10:00:13 AM 0 25958 12.60 0.00 7054752 1024764 12.55 java
10:00:28 AM 0 25958 3.40 0.00 7054752 1024764 12.55 java
10:00:43 AM 0 25958 34.47 0.00 7056800 1025184 12.55 java
10:00:58 AM 0 25958 14.60 0.00 7056800 1025188 12.55 java
10:01:13 AM 0 25958 6.13 0.00 7056800 1025188 12.55 java
10:01:28 AM 0 25958 3.47 0.00 7056800 1025188 12.55 java
10:01:43 AM 0 25958 27.60 0.00 7056800 1025868 12.56 java
10:01:58 AM 0 25958 1.73 0.00 7056800 1025868 12.56 java
10:02:13 AM 0 25958 3.47 0.00 7056800 1025868 12.56 java
10:02:28 AM 0 25958 0.67 0.00 7056800 1025868 12.56 java
10:02:43 AM 0 25958 29.47 0.00 7056800 1026244 12.57 java
10:02:58 AM 0 25958 0.40 0.00 7056800 1026244 12.57 java
10:03:13 AM 0 25958 3.60 0.00 7056800 1026244 12.57 java
I should add that I first noticed this when one of my prod boxes was crashing every couple weeks. Once I removed the bogus binding name, that box has run rock solid for the last 11 months.
I had indeed the idea that things slowed down.
I did remove the myopenhab last night, to wait till things could be fixed, so you tell me I never have to add it and my system will run better?
did I understand that correctly?
Based on my testing results, any invalid binding name in addons.cfg will cause a memory leak that eventually will consume all available memory. The first indication of this may be that your system becomes more sluggish. Depending on the hardware config of your system, it could take a couple days or a few weeks. Not sure what your experience has been.
@wborn FTR, I’m running this to capture the above memory stats.
pidstat -r -p <openHAB process ID> 15
which reports memory consumption of the openHAB java process every 15 seconds.
And, if you run pidstat with a more fine-grained interval (like every 1 or 2 seconds), you’ll see a flurry of page faults every time the feature installer runs.