Jetty update, Karaf 4.1.3 upgrade and full LSP support

Thatā€™s why I didnā€™t configure it that way as a default. It will eventually write the log, but possibly only after a few MB of data have piled up - until then, everything is only held in memory (which should make it much faster than if every single line is directly written to disk).

Just my 2 cents after updating and running into similar problems.

Snapshot 1061 started to show strange behaviour after running successfully for weeks.
Harddisk activity would increase to severe levels for no reason, so i decided to reboot.

After the reboot ( still on snapshot 1061) my openhab would not start anymoreā€¦ UI would show " waiting for UIā€™s to install". Cleared cache, cleared temp ā€¦ no succes.

After reading this thread i decided to update to snapshot 1102ā€¦used apt-get (running on Ubuntu) to updateā€¦ hoorayā€¦ Openhab starts again.

After that i decided to update to the latest snapshot (1114) only to run into the same problems again. No matter what i triedā€¦ it would remain stuck at " installing UIā€™s".

After some investigation i went ahead and cleared my addons folder that still contained JAR files for the Yeelight and Unify bindingā€¦Succes!! Openhab startsā€¦ iā€™ve tried several restartsā€¦ no errorsā€¦ all is working fineā€¦

hope this information helps to clear the situation

Thanks for your observations, @snoekieboe!
As you identified the jars in the addons folder as an issue: May I ask you to try this reconfiguration and see whether it solves the issue?

Do you have a link to this issue?

1 Like

I need to update to #1115?
if so, then I updated to openHAB 2.2.0 Build #1115
warnings are available :frowning:
but now I know for sure, these warnings appear when the network is running a miniDLNA server.

thankyou for information, this has also fixed for me.
1115 running fine.

This helped me as well. I got jars for zwave, robonect and minecraft bindings. Zwave i cannot live without so I removed the other two and then everything worked. Then I thought Iā€™d check exactly which of them was the culprit so I put them back, first one and then the other with openhab restarts in between. But now all jar files are back and everything is still working like it should, so I donā€™t really understand why it didnā€™t work in the first place. Confusedā€¦

I did some more testing and found out that the JAR files in the addons folder only create problems if Openhab is started and the binding as defined in org.openhab.addons.cfg still need to be installed.

Once these binding (including the UI and standard packages) have been installed you can put your JAR files back in the addons folder and Openhab will start and run without problems

@Kai THIS does not seem to solve it for me.

1 Like

Just for the record: This workaround completely breaks the Tellstick binding. Iā€™m starting to understand why @Kai wouldnā€™t recommend it :slight_smile:

Yes, I think this is what I have seen in testing too.

I cannot imagine that this could be related. Do you have any proof?

Iā€™m starting to understand why @Kai wouldnā€™t recommend it :slight_smile:

I was just about to recommend it - no not recommend it but to set this as a default. I did some tests with openHAB 2.1 and saw that this never ever did blacklisting of event handlers, no matter how long they block. So with the Karaf upgrade, this seems to be a new behaviour (and the problem of exceeded timeouts had been present before) - by adding this option, we would be back at full 2.1 compatibility. I have created a PR for it, so if you think this is a bad idea, you should veto asap :slight_smile:

Sounds as if Felix FileInstall isnā€™t our friend either :frowning: Not just the KAR handling is broken, also normal jars cause issues. Can anyone confirm that this is only since the Karaf upgrade? Or were there problems on Karaf 4.0.8 already as well?

To speed up things Iā€™ve set the zwave binding to DEBUG, makes a lot of logging data :grinning:

In about 14 hours of log files (30 MB of data) both timeouts

2017-12-06 08:47:49.578 [WARN ] [nal.common.AbstractInvocationHandler] - Timeout of 5000ms exceeded while calling method 'ThingHandler.initialize()' on 'org.eclipse.smarthome.binding.hue.handler.HueLightHandler@7aec67'. Thread 'safeCall-18' (444) is in state 'WAITING'
2017-12-06 08:50:03.906 [WARN ] [nal.common.AbstractInvocationHandler] - Timeout of 5000ms exceeded while calling method 'EventSubscriber.receive()' on 'org.eclipse.smarthome.io.monitor.internal.EventLogger@1b389ce'. Thread 'safeCall-18' (444) is in state 'RUNNABLE'

did not show up anymore.
So what should be my lessons learned? I guess this is not an openHAB related problem, but a problem with my hardware?
Btw, Iā€™m using a RPi3 with 128 GB USB 2.0 SSD Harddisk attached, so there is no sd card playing around.
I need to upgrade from Jessie to Stretch anyway, so maybe itā€™s time to do a new OS install ā€¦

Not more than Iā€™m describing in this issue. I did everything I could to try to get the binding going (restarted telldus-core, restarted the binding, restarted entire openhab, fiddled around with the library path setting) and nothing helped. As a desperate measure I removed the org.apache.felix.eventadmin.Timeout=0 and restarted OH, and suddenly everything was working.

edit: Ok, Iā€™m starting to give up my efforts to understand whatā€™s happening here. I now put the Timeout=0 back and restarted OH and everything seems to be working. I HATE problems that I donā€™t understand!

edit 2: Nope. Too happy too soon. Another restart of the binding and itā€™s broken againā€¦

Havinā€™ problems, too. Havenā€™t seen this before the Karaf upgrade, and it started some time after I recently added another jar. OH somehow stopped and after that it refused to do anything after its startup sequence. Tried multiple restarts, removed/readded jars until I noticed my ZWave controller thing was gone, in fact ALL of my things were gone.
Had to recreate the controller manually, now itā€™s working again (I think ā€¦ still checking).
On one restart occasion, I noticed it keeps loading .items file multiple times ??
FWIW, felix.eventadmin.Timeout=0 was in place before and all the time.
This was with #1099 as well as now with latest #1115.
Confused, tooā€¦

Well, it hints at the fact that indeed sometimes the logging framework (log4j2) takes long for writing a single logging message to disk - and requiring more than 5 seconds to execute a logging line in the code is something rather unexpected to all developersā€¦ As I have removed the event handler timeout in the distro now, I would prefer to keep the logging configuration the way it is (i.e. to always flush it to disk), because it is rather annoying to see the latest log entries only after minutes, when the logger decides that stuff should be written to the file. We should keep it in the back of our minds though that extensive logging might cause an issue on some platforms and that changing it to use a memory buffer can resolve this.

1 Like

I have changed the Jetty configuration in the latest build #1117. Could you update to this and see if the warnings are gone?

Sorry for my late reply, I was traveling.

This is probably a stupid question, but what do mean exactly with ā€œlog rolloverā€? I guess this is the event where openhab.log becomes openhab.log.X and a new (empty) openhab.log is created. Right?

If I look at the timing of the first Blacklist event, this might be possible:

  • 2017-12-03 22:00:59.962 => log rollover;
    In between: 67.600 entries in the openhab.log are written;
  • 2017-12-03 22:01:24.384 => Blacklist event occurs;
    In between: 28.200 entries in the openhab.log are written;
  • 2017-12-03 22:01:30.098 => next rollover event occurs;

On the other hand, Iā€™m not sure if this Blacklist event can be considered as a reliable/realistic case. As soon as I switched on the trace for org.eclipse.smarthome.core.internal.common, OpenHAB was mostly occupied with writing data to the log (it needed just 30 seconds to create 16Mb of log data).

My second Blacklist event (2017-12-03 22:51:18.769) happened approx. 50 minutes after a log rollover (and after 23000 lines were written in the logfile). And nothing significant happens right before that event. The last event in openhab.log was 9 minutes earlier:

2017-12-03 22:42:52.895 [INFO ] [lipse.smarthome.model.script.Spotify] - Spotify client ID and Secret set: spotify_client_id (...)
2017-12-03 22:51:18.769 [WARN ] [org.apache.karaf.services.eventadmin] - EventAdmin: Blacklisting ServiceReference [{org.osgi.service.event.EventHandler, org.eclipse.smarthome.core.events.EventPublisher}={event.topics=smarthome, component.name=org.eclipse.smarthome.core.internal.events.OSGiEventManager, component.id=68, service.id=157, service.bundleid=111, service.scope=bundle} | Bundle(org.eclipse.smarthome.core_0.9.0.201712011551 [111])] due to timeout!

I saw you included the workaround as default in the latest snapshot build. Iā€™m not sure if that means you are still investigating the issue or not. My experience shows at least no disadvantages with this workaround.

Kind regards,
Dries

After moving to #1119, OH reproducibly fails to load all of its bindings as defined in services/addons.cfg.
Iā€™ve got 3 jars in the addons folder that do load. When I remove them, addons-defined bundle processing on (re)start is fine.
Any hint how to force reprocessing of addons.cfg, any bundle I would need to restart for that to happen ?

Iā€™ll follow up with another maybe stupid question: Is the openhab logs supposed to be rotated? Mine has never done any rotating, they just fill up forever. Iā€™m using the package from snapshot release (Debian). Should a logrotate config file be included in the package?