Tradfri - Ignoring unmatchable piggy-backed response from /x.x.x.x:5684

Does anyone have any idea why my logs are suddenly full of this?

14:06:16.072 [INFO ] [ipse.californium.core.network.Matcher] - Ignoring unmatchable piggy-backed response from /10.21.2.30:5684: ACK-2.05   MID=65055, Token=3553, OptionSet={"Content-Format":"application/json", "Max-Age":604800}, "{"9002":1549996701,"9001".. 223 bytes
14:06:16.074 [INFO ] [ipse.californium.core.network.Matcher] - Ignoring unmatchable piggy-backed response from /10.21.2.30:5684: ACK-2.05   MID=65057, Token=12, OptionSet={"Content-Format":"application/json", "Max-Age":604800}, "{"9002":1550086616,"9001".. 278 bytes
14:06:16.102 [INFO ] [ipse.californium.core.network.Matcher] - Ignoring unmatchable piggy-backed response from /10.21.2.30:5684: ACK-2.05   MID=65056, Token=e2, OptionSet={"Content-Format":"application/json", "Max-Age":604800}, "{"9002":1551904545,"9001".. 227 bytes
14:06:18.515 [INFO ] [ipse.californium.core.network.Matcher] - Ignoring unmatchable piggy-backed response from /10.21.2.30:5684: ACK-2.05   MID=65057, Token=12, OptionSet={"Content-Format":"application/json", "Max-Age":604800}, "{"9002":1550086616,"9001".. 278 bytes
14:06:24.631 [INFO ] [ipse.californium.core.network.Matcher] - Ignoring unmatchable piggy-backed response from /10.21.2.30:5684: ACK-2.04   MID=65058, Token=442dd8b8340a2fc7, OptionSet={}, no payload
14:06:24.701 [INFO ] [ipse.californium.core.network.Matcher] - Ignoring unmatchable piggy-backed response from /10.21.2.30:5684: ACK-2.05   MID=65059, Token=f15a0901, OptionSet={"Content-Format":"application/json", "Max-Age":604800}, "{"9023":"2.tradfri.pool.".. 463 bytes
14:06:24.703 [INFO ] [ipse.californium.core.network.Matcher] - Ignoring unmatchable piggy-backed response from /10.21.2.30:5684: ACK-2.05   MID=65060, Token=5da5c9, OptionSet={"Content-Format":"application/json", "Max-Age":604800}, "[65551,65542,65544,65554".. 157 bytes
14:06:24.774 [INFO ] [ipse.californium.core.network.Matcher] - Ignoring unmatchable piggy-backed response from /10.21.2.30:5684: ACK-2.05   MID=65059, Token=f15a0901, OptionSet={"Content-Format":"application/json", "Max-Age":604800}, "{"9023":"2.tradfri.pool.".. 463 bytes
14:06:24.776 [INFO ] [ipse.californium.core.network.Matcher] - Ignoring unmatchable piggy-backed response from /10.21.2.30:5684: ACK-2.05   MID=65060, Token=5da5c9, OptionSet={"Content-Format":"application/json", "Max-Age":604800}, "[65551,65542,65544,65554".. 157 bytes

Binding to gateway communication has a dreadful delay when this happens. turning lights on and off takes seconds rather than instant.

Thanks
Chris

No idea, but most recently, I am seeing the identical behavior on my systemā€¦something newā€¦

running 2.5M2; restarted both OH2 and my VM multiple times, cleaned cache and tmp. But the issue is coming back and spamming my logs. Did you solve this @chriscolden

no. it still doing it. the gateway is flaky at the moment. if it does go online after a while it will start spamming the logs, but seems to still work although with a bit of a delay.

Thanksā€¦weird thoughā€¦have not read anything else, looks like we are the only ones with that issueā€¦

I do have the exact same behavior. While with openhab there is ā€œjustā€ a heavy delay the ikea app doesnā€™t work anymore at all. It looks like the gateway is leakingā€¦ after a restart (followed by the openhab-re-initialization, see https://github.com/openhab/openhab2-addons/issues/5489), it works for a while then the log-spamming starts and the gateway gets unusable. Iā€™m using 2.5M2 as well, but I think I saw this behavior before.
Ideas for isolating the problem:

  • Check if the issue occurs with the openhab binding disabled (or: enable it after a 1-2 days) to check for interference.
  • Could it be dependent on the amount of paired devices? In my case 10 panels and 8 remote controllers are paired.

Iā€™ve just disabled the gateway in paperui and restarted my gateway.

I have noticed that when it goes offline in openhab the app and google home work fine for a while. So my guess is the binding is causing the gateway to have too many connection requests which fail. That issue you have posted shows the logic is wrong, however I am configured solely via paperui and therefore only have the client id and psk stored, which would suggest I shouldnā€™t be seeing that issue. So maybe there is something else a miss here.

Anyway Iā€™ll see how I get on for a few days without openhab in the mix.

Chris

So far so good. Tradfri Gateway has remained stable for Google home and the tradfri app. Iā€™ll leave it a while longer before enabling the gateway in paper ui.

Updated openhab today as I think some commits to the binding happened last night which I think will be included in todays snapshot. Iā€™ve decided to enable the gateway in openhab to see how we go, although I think those commits werenā€™t to fix our issues. 2.5 days of a stable gateway which is more than i had before. Let see if things break now Iā€™ve got the gateway online in openhab.

Itā€™s already spamming the logs with the piggy-backed response

Edit - its now 7 hours since i enabled the gateway. itā€™s just gone offline in paper ui. Gateway is fine in app and google home for now.

Edit - itā€™s now 2am Google home and tradfri app now dead. Unclear when this exactly died. I think this little experiment proves something though, itā€™s the binding causing the gateway instabilities not a bad firmware from IKEA.

Just a blind guess: I donā€™t know Google home, but the ikea app is nothing you keep running permanently. Meaning: Usually you open it, perform an action and then close it again. So most likely the app only queries the status ob the connected devices when opening the app (and then does some refreshing).
Openhab does constant polling of the status every 3 seconds. Maybe this is ā€œtoo muchā€ for the gateway? What I will try today: Activating the gateway with only one more lamp active. If I understand the code correctly this should lower the amount of requests to 2 (one for the gateway, one for the lamp) every 3 seconds.

Does anybody know if there is a chance to debug the gateway device itself? Would be interesting to see if there is some logging inside the gateway that gives a hint about the requests from the binding.

You are quite possibly correct there. Google Home wont be polling constantly I wouldnt have thought.

Gateway is now running for 2 days with only the gateway itself added to OpenHab, no individual devices. But: According to the logs all devices are already queried in some way, regardless whether the things are registered in Openhab or not. I guess this is also for the discovery:

  1. One call to the gateway itself:
  2. Response with all the registered devices
  3. One GET request for each of the registered devices (16 in my case)

This happens every minute (not every 3 seconds as I wrote earlier. The 3 seconds are just for the Openhab internal observer).

So far no piggy-backed response in the logsā€¦ I will now add the first panel to Openhab and wait.

I suppose this is part of the auto discovery process. Maybe there is an option to disable auto-discovery? Once the things are discovered there is no need for this until a new devices installed.

I think in general auto discovery has a limited benefit, but adds overhead when the protocol does not signal a new device (like UPnP, mDNS do) and the binding has to poll to identify them. The user knows best when Iā€™m initially installing the binding or have installed a device. Where is the challenge to go to the Inbox and run a manual scan to get those. For this maybe the binding should have an option to disable auto discovery once the initial setup has been performed.

As far as I can remeber, there is an option to disable discovery at karaf console.

The piggy-back log entry is back, but as already written: The amount of requests was not really different after having attached one panel (except for status updates when turning the panel on or off).
In the used californium library I came across the following old bug:


To me this looks a bit like the issue we are seeing. The first piggy-back message in my case has the ID 35718:
2019-08-25 17:10:44.986 [INFO ] [pse.californium.core.network.Matcher] - Ignoring unmatchable piggy-backed response from /192.168.188.45:5684: ACK-2.05 MID=35718, Token=6f0816b29b5c4b, OptionSet={"Content-Format":"application/json", "Max-Age":604800}, "{"9001":"FLOALT panel WS".. 292 bytes
The issue in californium is closed, but I am having a hard time to find out whether it was already solved in the version OpenHab is using (1.0.6 according to the dependencies.xml).
If this is really related to the bug it would mean that with more items connected to the gateway the critical number is reached earlier.

one option would be to use NON (Non-confirmale) instead of CON (wich requires ACKs). Piggy bagged is only used in CON mode. I think the risk that a packet gets lost on the local network is acceptable. The CON mode results in one ACK per GET, which could create a lot of traffic when you have a bunch of devices. According to the RFC the client has to support piggy bagging when using CON mode.

I just had a quick look to the binding. I would say there are 2 locations where you need to insert a client.usNONs().
TradfriGatewayHandler: establishConnect():
try {
URI uri = new URI(gatewayURI);
deviceClient = new TradfriCoapClient(uri);
deviceClient.useNONs();
} catch (URISyntaxException e) {
logger.error(ā€œIllegal gateway URI ā€˜{}ā€™: {}ā€, gatewayURI, e.getMessage());
updateStatus(ThingStatus.OFFLINE, ThingStatusDetail.CONFIGURATION_ERROR, e.getMessage());
return;
}

and TradfriThingHandler:
try {
URI uri = new URI(uriString);
coapClient = new TradfriCoapClient(uri);
coapClient.setEndpoint(handler.getEndpoint());
coapClient.useNONs();
} catch (URISyntaxException e) {
logger.debug(ā€œIllegal device URI {}: {}ā€, uriString, e.getMessage());
updateStatus(ThingStatus.OFFLINE, ThingStatusDetail.CONFIGURATION_ERROR, e.getMessage());
return;
}

I can give that a try. Another idea I had, was trying out the 2.x branch of californium. While the code changes needed in the binding are minimal, I am having trouble in assembling the package due to the changed dependencies:

[ERROR] Failed to execute goal org.apache.karaf.tooling:karaf-maven-plugin:4.2.6:verify (karaf-feature-verification) on project org.openhab.binding.tradfri: Feature resolution failed for [openhab-binding-tradfri/2.5.0.SNAPSHOT]
[ERROR] Message: Unable to resolve root: missing requirement [root] osgi.identity; osgi.identity=openhab-binding-tradfri; type=karaf.feature; version=2.5.0.SNAPSHOT; filter:="(&(osgi.identity=openhab-binding-tradfri)(type=karaf.feature)(version>=2.5.0.SNAPSHOT))" [caused by: Unable to resolve openhab-binding-tradfri/2.5.0.SNAPSHOT: missing requirement [openhab-binding-tradfri/2.5.0.SNAPSHOT] osgi.identity; osgi.identity=org.openhab.binding.tradfri; type=osgi.bundle; version="[2.5.0.201909070948,2.5.0.201909070948]"; resolution:=mandatory [caused by: Unable to resolve org.openhab.binding.tradfri/2.5.0.201909070948: missing requirement [org.openhab.binding.tradfri/2.5.0.201909070948] osgi.wiring.package; filter:="(&(osgi.wiring.package=org.eclipse.californium.core)(version>=2.0.0)(!(version>=3.0.0)))"]]
[ERROR] Repositories: {
[ERROR] 	file:/Users/<me>/Documents/workspace/openhab2-addons/bundles/org.openhab.binding.tradfri/target/feature/feature.xml
[ERROR] 	mvn:org.apache.karaf.features/framework/4.2.6/xml/features
[ERROR] 	mvn:org.apache.karaf.features/standard/4.2.6/xml/features
[ERROR] 	mvn:org.openhab.core.features.karaf/org.openhab.core.features.karaf.openhab-core/2.5.0-SNAPSHOT/xml/features
[ERROR] 	mvn:org.openhab.core.features.karaf/org.openhab.core.features.karaf.openhab-tp/2.5.0-SNAPSHOT/xml/features
[ERROR] 	mvn:org.ops4j.pax.web/pax-web-features/7.2.10/xml/features
[ERROR] }

I tried with californium 2.0.0-M17 and 2.0.0-SNAPSHOT - same result. I tried changing this in openhab-core as well, but then I donā€™t get this built either. Any hint on how to deal with this?

I gave the useNONs() a try, this is the outcome:
As expected the message about piggy-backed reponse does no longer occur. Still after a while, the gatway is not usable with the ikea app anymore and has a huge delay in OpenHab - although the discovery-messages still appear regularly.
After having enabled debug log level in californium, I do see the following messages in the log:
a couple of:

2019-09-10 19:26:41.043 [DEBUG] [pse.californium.core.network.Matcher] - Stored open request by KeyMID[42254 for null:0], KeyToken[81496741]

followed by
2019-09-10 19:26:52.483 [DEBUG] [.core.network.stack.ReliabilityLayer] - Timeout: retransmission limit reached, exchange failed, message: CON-GET MID=42249, Token=19662ee9cc4b25, OptionSet={"Observe":0, "Uri-Port":5684, "Uri-Path":["15001","65546"]}, no payload

I will reset the gateway now and try to find the tipping point between successful connects and the first timeout occurrence.
Any idea on how the upgrade of californium could be handled? I have the impression that it could be an issue with the version number, maybe the 2.0.0-M17 is considered as >2.0.0?

I managed to get a build with californium 2.0.0-M17. It is running stable for more than 2 days now neither crashing the gateway nor any piggy-backed messages. The ikea app is still responsive. So for me it is really an improvement.
Code changes were minor, please find them here: https://github.com/flo-02-mu/openhab2-addons/tree/update-californium
@chriscolden As you had similar issues, maybe you want to try it out as well? To do so, you have to put the following three binaries into the addons folder:

According to https://projects.eclipse.org/projects/iot.californium/releases/2.0.0/plan the stable 2.0.0 release is planned for 2nd October. So maybe openhab can switch to the new version then? I donā€™t think many bindings are currently dependent on californium.