Handler takes more than 5000ms for processing event

It seems that some recent refactoring causes multiple safeMethodCalls in a row, which can lead to an exhausted thread pool very easily. @sjka is working on a fix, I hope it won’t take long. I’ll wait with the next ESH stable build until this is available.

I added this line in my config and for now it looks promising.

Kai, I also haven’t seen any repository build for the unstable snapshot version on bintray for the last couple of days. The last build is from February 5th. Is this intentionally or is there a broken mechanism?

@sjka did multiple fixes and improvements on that part (thanks for this!) and the problem should hopefully be gone now (without the need to reconfigure the pool size). I have just done a new ESH stable build and an openHAB distro that contains the fix. Please test and report back (especially in case you still see a problem!).

We had to deactivate the snapshot apt builds due to quota issues. @Benjy is working on moving them to a new place. Meanwhile, I will trigger manual apt snapshots, whenever something important happens (like just right now), so the latest build is available via apt as well).

Updated to the latest snapshot and removed the “workaround” from runtime.cfg.

My dashbuttons stuck at INITIALIZING, error in the log is:

Initializing handler for thing ‘amazondashbutton:dashbutton:ac-63-be-55-0b-1a’ takes more than 5000ms.

Moreover, all my Hues and the bridge remain offline.

I just updated to the latest snapshot (didn’t previously apply the workaround but was having the same issue) and mine started back up with no problem.

@philKrylc Sounds weird - I wouldn’t know how the fixes could be causing this; maybe something else is wrong in your setup. If the problem remains, could you provide more detailed logs (including org.eclipse.smarthome.core.common debug) here?

A bit more information, I wasn’t having this issue until I increased the number of things I own. Now, when I restart openHAB two of my five Amazon Dash buttons (and it is always two, but could be different each time) will get stuck on initialising, one of my six hue bulbs will respond very slowly too. bundle:restart ### fixes this problem each time.

Will get a debug log when it next happens.

1 Like

@Benjy And you are on a snapshot that has the fixes already?

I believe so yes, this is with the build on 12-02-17 10:46:42 (#789)

Ok, then we’ll need logs…

2017-02-13 17:25:36.282 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created thread pool 'ConfigStatusService' with size 5
2017-02-13 17:25:37.110 [DEBUG] [ore.common.registry.AbstractRegistry] - Provider 'org.eclipse.smarthome.core.thing.internal.ChannelItemProvider' has been added.
2017-02-13 17:25:37.157 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created scheduled thread pool 'thingManager' of size 5
2017-02-13 17:25:38.234 [DEBUG] [ore.common.registry.AbstractRegistry] - Provider 'org.eclipse.smarthome.model.item.internal.GenericItemProvider' has been added.
2017-02-13 17:25:41.585 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'mapdb.persist'
2017-02-13 17:25:46.604 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'other.rules'
2017-02-13 17:25:48.691 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'alexwork.rules'
2017-02-13 17:25:49.281 [INFO ] [el.core.internal.ModelRepositoryImpl] - Validation issues found in configuration model 'bedroom.rules', using it anyway:
There is no context to infer the closure's argument types from. Consider typing the arguments or put the closures into a typed context.
2017-02-13 17:25:49.288 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'bedroom.rules'
2017-02-13 17:25:49.628 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'livingroom.rules'
2017-02-13 17:25:51.176 [INFO ] [el.core.internal.ModelRepositoryImpl] - Validation issues found in configuration model 'hallway.rules', using it anyway:
There is no context to infer the closure's argument types from. Consider typing the arguments or put the closures into a typed context.
2017-02-13 17:25:51.181 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'hallway.rules'
2017-02-13 17:25:51.637 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'kitchen.rules'
2017-02-13 17:25:52.855 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'default.sitemap'
2017-02-13 17:25:52.988 [DEBUG] [ore.common.registry.AbstractRegistry] - Provider 'org.eclipse.smarthome.model.thing.internal.GenericItemChannelLinkProvider' has been added.
2017-02-13 17:25:53.021 [DEBUG] [ore.common.registry.AbstractRegistry] - Provider 'org.eclipse.smarthome.model.thing.internal.GenericThingProvider' has been added.
2017-02-13 17:25:53.350 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created thread pool 'safeCall' with size 3
2017-02-13 17:25:53.456 [DEBUG] [ore.common.registry.AbstractRegistry] - Provider 'org.eclipse.smarthome.core.thing.link.ManagedItemChannelLinkProvider' has been added.
2017-02-13 17:25:53.839 [DEBUG] [ore.common.registry.AbstractRegistry] - Provider 'org.eclipse.smarthome.core.thing.ManagedThingProvider' has been added.
2017-02-13 17:25:54.013 [DEBUG] [ore.common.registry.AbstractRegistry] - Provider 'org.eclipse.smarthome.core.items.ManagedItemProvider' has been added.
2017-02-13 17:25:54.031 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created scheduled thread pool 'discovery' of size 3
2017-02-13 17:25:55.251 [INFO ] [.dashboard.internal.DashboardService] - Started dashboard at /start
2017-02-13 17:25:55.516 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created scheduled thread pool 'thingHandler' of size 3
2017-02-13 17:25:57.057 [INFO ] [basic.internal.servlet.WebAppServlet] - Started Basic UI at /basicui/app
2017-02-13 17:25:57.202 [INFO ] [arthome.ui.paper.internal.PaperUIApp] - Started Paper UI at /paperui
2017-02-13 17:25:57.304 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created scheduled thread pool 'pcapNetworkInterfaceService' of size 5
2017-02-13 17:25:58.399 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created thread pool 'items' with size 5
2017-02-13 17:25:58.611 [WARN ] [org.pcap4j.util.PropertiesLoader    ] - [org/pcap4j/pcap4j.properties] Could not get value by org.pcap4j.af.inet, use default value: null
2017-02-13 17:25:58.614 [WARN ] [org.pcap4j.util.PropertiesLoader    ] - [org/pcap4j/pcap4j.properties] Could not get value by org.pcap4j.af.inet6, use default value: null
2017-02-13 17:25:58.616 [WARN ] [org.pcap4j.util.PropertiesLoader    ] - [org/pcap4j/pcap4j.properties] Could not get value by org.pcap4j.af.packet, use default value: null
2017-02-13 17:25:58.617 [WARN ] [org.pcap4j.util.PropertiesLoader    ] - [org/pcap4j/pcap4j.properties] Could not get value by org.pcap4j.af.link, use default value: null
2017-02-13 17:25:58.675 [INFO ] [ui.habmin.internal.servlet.HABminApp] - Started HABmin servlet at /habmin
2017-02-13 17:25:58.812 [WARN ] [org.pcap4j.util.PropertiesLoader    ] - [org/pcap4j/pcap4j.properties] Could not get value by org.pcap4j.dlt.raw, use default value: null
2017-02-13 17:25:58.823 [INFO ] [panel.internal.HABPanelDashboardTile] - Started HABPanel at /habpanel
2017-02-13 17:26:01.732 [INFO ] [ing.zwave.handler.ZWaveSerialHandler] - Connecting to serial port '/dev/ttyACM0'
2017-02-13 17:26:01.832 [INFO ] [ing.zwave.handler.ZWaveSerialHandler] - Serial port is initialized
2017-02-13 17:26:01.859 [INFO ] [mmandclass.ZWaveSecurityCommandClass] - Update networkKey
2017-02-13 17:26:01.901 [INFO ] [ve.internal.protocol.ZWaveController] - Starting ZWave controller
2017-02-13 17:26:01.903 [INFO ] [ve.internal.protocol.ZWaveController] - ZWave timeout is set to 5000ms. Soft reset is false.
2017-02-13 17:26:02.315 [DEBUG] [arthome.core.common.SafeMethodCaller] - Timeout of 5000ms exceeded, thread safeCall-2 (99) in state RUNNABLE is at org.pcap4j.core.NativeMappings.pcap_findalldevs(NativeMappings.java:-2).
2017-02-13 17:26:02.316 [WARN ] [ome.core.thing.internal.ThingManager] - Initializing handler for thing 'amazondashbutton:dashbutton:50-f5-da-93-9f-67' takes more than 5000ms.

Hope this helps.

This is perfect - just exactly what the changes were about: To correctly identify badly behaving handlers and blame them (and no one else anymore):

2017-02-13 17:26:02.315 [DEBUG] [arthome.core.common.SafeMethodCaller] - Timeout of 5000ms exceeded, thread safeCall-2 (99) in state RUNNABLE is at org.pcap4j.core.NativeMappings.pcap_findalldevs(NativeMappings.java:-2).
2017-02-13 17:26:02.316 [WARN ] [ome.core.thing.internal.ThingManager] - Initializing handler for thing 'amazondashbutton:dashbutton:50-f5-da-93-9f-67' takes more than 5000ms.

This is absolutely true: The handler takes more than 5s to initialize() because it does some fancy pcap stuff in there. Instead, it should schedule a job to do such long running activities asynchronously. @oliver Could you improve the implementation that way?

2 Likes

@Kai, sure, but I don’t know when I will find the time.

I’m getting the same errors with the homematic binding. :frowning:

I created https://github.com/openhab/openhab2-addons/issues/1967 and https://github.com/openhab/openhab2-addons/pull/1968

Could this be related to my 5s delay problem?

No - your problem is related to a timeout on the network.

Indeed. I will look into it, but it seems to be systematic re-transmissions that fails after 3 attempts on multiple nodes.
I have a mix of Nexa WallPlugs and BuildIn Dimmers and ON/OFF switches.
Heard of any issues with Nexa Z-wave?

Ok, but it’s still not related to this thread - it’s a retransmission in the zwave binding - not the handler timeout.

I’d suggest to discuss in a separate thread, but please provide a debug log so we can see what’s happening.