GlobalCache Broken Pipe Error

jeremybd · August 12, 2019, 6:08am

Hello,

I’m trying to set my home automation using GlobalCache iTach Flex Serial Cable. I currently have an home automation system (Colorado vNet) but it’s proprieritary and the only way to open it is over Serial.

Everything seems to be working fine (Control with iPhone, Rules, Mapping…) the only issue I’m encountering is that it seems that if I don’t use it for some times (a few hours?) the first command I send always fails.

I encountering this error:

2019-08-12 08:00:00.922 [ERROR] [.GlobalCacheHandler$CommandProcessor] - Comm error for thing itachFlex:000C1E0363AE at 192.168.1.75: Broken pipe (Write failed)
2019-08-12 08:00:00.933 [ERROR] [e.internal.command.CommandSendserial] - Execute 'sendserial' failed on thing 000C1E0363AE at 192.168.1.75: errorCode=N/A, errorMessage=ERROR: Broken pipe (Write failed)
==> /var/log/openhab2/events.log <==
2019-08-12 08:00:00.950 [hingStatusInfoChangedEvent] - 'globalcache:itachFlex:000C1E0363AE' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): ERROR: Broken pipe (Write failed)
2019-08-12 08:00:00.958 [vent.ItemStateChangedEvent] - GC_DIRECT changed from 3,DIM-6,2,21,0%0D to 2,ITACH,73,1,1%0D
==> /var/log/openhab2/openhab.log <==
2019-08-12 08:00:10.972 [INFO ] [GlobalCacheHandler$ConnectionManager] - Got a connection to command port for thing itachFlex:000C1E0363AE at 192.168.1.75
2019-08-12 08:00:10.981 [INFO ] [GlobalCacheHandler$ConnectionManager] - Got a connection to serial-1 port for thing itachFlex:000C1E0363AE at 192.168.1.75
==> /var/log/openhab2/events.log <==
2019-08-12 08:00:10.997 [hingStatusInfoChangedEvent] - 'globalcache:itachFlex:000C1E0363AE' changed from OFFLINE (COMMUNICATION_ERROR): ERROR: Broken pipe (Write failed) to ONLINE

Then I can issue my commands just fine, but the first one will fail. It seems that the connection to the iTach Flex is lost at some point.

How can I debug that? Is there something I can do to solve this issue?

Here is my globalcache.things:

globalcache:itachFlex:000C1E0363AE [ ipAddress="192.168.1.75", mapFilename="serial-codes.map", activeCable="FLEX_SERIAL"]

And my GC Related Items:

String GC "My GC device"  {channel="globalcache:itachFlex:000C1E0363AE:sl-m1#c1"}
String GC_DIRECT "GC_DIRECT"  {channel="globalcache:itachFlex:000C1E0363AE:sl-m1#c1-direct"}

I have two items so that I can both send Direct and mapped commands through the same iTach Device.

Thanks in advance for your help,

mhilbush · August 13, 2019, 1:42am

I’ve not seen this issue. I have a iTach Flex and I’m using it in almost exactly the same manner as you – except I have the iTach Flex connected to the serial port on a Russound zoned audio controller. There are times when it goes for days without sending a serial command, yet the first command that’s sent after this time period works fine.

This is just a guess, but is your Colorado vNet system possible going to sleep due to inactivity, causing the binding to fail on the write of the command?

jeremybd · August 13, 2019, 5:15am

Thanks for the answer. The vnet system is not going to sleep I have 4 other serial connectors (not with itach) plugged into the vnet and none of them are doing the same.

I’m missing something but I have no idea how to debug it…

mhilbush · August 13, 2019, 12:41pm

So, the IOException (Broken Pipe) generally means that the other end of the connection was closed when the binding tried to perform the write operation on the TCP socket. That would suggest that the iTach Flex closed the TCP connection on port 4999 (that’s the port over which the serial strings are exchanged versus port 4998, which is used for controlling the iTach).

I don’t know why this would be happening, and I’m not sure what else to suggest. You could try putting the globalcache binding into DEBUG or TRACE mode, but I’m not sure of that will provide any useful info (it’s been a while since I looked at the code). I suppose you also could check to see if the iTach is on the latest firmware.

If you decide to try DEBUG or TRACE mode, post your log here and I’ll take a look. Note that TRACE mode may be a bit lengthy, so it would also help if you could pinpoint where in the log (timestamp) the issue occurred.

As a last resort, if you’re handy with Wireshark, you could do a packet capture on port 4999. That might reveal what’s going on.

jeremybd · August 14, 2019, 5:29am

Ok so here is my log from last night :

openhab.log

Does not seem to be really useful but maybe you will see something that I don’t the first line is an error from last night without the TRACE mode, and the one this morning occurs at 08:00:01.137 where there is another Broken Pipe.

I have used this iTach for 2 years with iRule without any issue, could it be an issue with openhab or the raspberry ?

Thanks

mhilbush · August 14, 2019, 11:49am

Agree that there’s nothing very useful here.

One other thought that might explain what’s happening…

The globalcache binding maintains a persistent TCP connection to the iTach on ports 4998 and 4999. It’s possible that iRule opens the connection(s) each time a command is sent. This possibly could explain why you don’t see the issue with iRule. This is speculation on my part. While I’ve used iRule in the past, I don’t really know how it works internally.

This could be verified by running Wireshark on port 4999. With iRule, you would see the TCP connection set up (SYN, SYN-ACK, ACK) and terminated (FIN, ACK) on each command.

jeremybd · August 14, 2019, 12:21pm

I’ll try to investigate with wire shark over the next few days see if I can see something that might explain this issue.

I’ve also reset my iTach and added a rule that send a fake command every 3 hours to see how it goes. I also found trace of a broken pipe with my tado and Nuki binding. I will try to use another raspberry when I get another one.

Thanks for the help

mhilbush · August 14, 2019, 2:30pm

That’s a good idea to try sending a fake command every so often.

Let me know how it goes.

Interesting… Could be a coincidence, or could be the same root cause as with globalcache.

jeremybd · August 15, 2019, 6:15am

So fake commands does not work, it seems that even if my first command at 8am is failing even if a command was sent less than 3 hours ago…

I’ll run a tcpdump on my rapsberrypi today to see if it goes somewhere.

Thanks a lot,

mhilbush · August 15, 2019, 10:49am

That’ll work if you can’t run Wireshark on the openHAB box. Just save to a file and then import into Wireshark. You only should need ports 4998 and 4999.

jeremybd · August 15, 2019, 11:14am

My openHAB box is the raspberry I’m using openHABian. Hopefully that will lead somewhere.

mhilbush · August 15, 2019, 11:16am

I’m not sure it matters, but does your RPi have a wired or wireless connection to your network?

jeremybd · August 15, 2019, 2:01pm

Wired connection directly on the router as the GlobalCache device (I wanted to eliminate any possibility of faulty cable/switch)

jeremybd · August 16, 2019, 10:04am

Did you also setup your globalcache as receive item?

I did that yesterday ( plus changing the fake command to every 2 hours) and have not seen any broken pipe error yet. I will continue to monitor it but I’m wondering if adding the globalcache also as receiver would change anything.

mhilbush · August 16, 2019, 10:44am

Yes, I do have the receive channel linked to an item.

I’ll need to look at the code. I don’t recall if it would make a difference whether the receive channel is linked to an item or not.

Maybe it was the change from 3 to 2 hours. You could try changing back to 3 hours, and leave the receive channel linked to see if that makes a difference. In the meantime, I’ll look at the code.

jeremybd · August 16, 2019, 10:53am

Yes that’s what I plan to do

I will just wait 2-3 days to see if I have any errors and start rolling back everything I tried to prevent the broken pipe errors.

mhilbush · August 16, 2019, 10:57am

From what I can see, the serial port reader is started when you tick “Enable Two Way” in the thing config. It doesn’t matter whether or not the channel is linked to an item. I suppose it’s possible for a buffer to fill up to the point where it causes the connection to close. I’ve not seen that, nor has it been reported by anyone (yet).

I’d suggest you disable sending the periodic command completely. Leave “Enable Two Way” ticked in the config. If after a few days you don’t see an issue, untick “Enable Two Way” to prove that the issue reappears.

jeremybd · August 16, 2019, 1:36pm

Actually it seems that I’m mistaken and I had one this morning at 08:57:00 and actually several other during the night the logs and the tcpdump but can’t seem to notice something that might indicate what’s wrong.

The files are here if you can take a look https://www.dropbox.com/sh/a5lktl97bomrc66/AAC7MVYBakoJj9Wd8b1J-BxCa?dl=0

I really appreciate it thanks a lot

mhilbush · August 16, 2019, 2:34pm

I see a lot of traffic going back and forth between the iTach and openHAB. That all looks pretty normal. I don’t see anything in the tcpdump that looks bad.

I do see numerous Broken Pipe errors in openhab.log. I see those for the globalcache binding, as well as quite a few from Jetty. I’m sorry, but I don’t know what to make of those errors.

jeremybd · August 16, 2019, 2:53pm

Thanks anyway for helping I really appreciate it.

Will try to check every piece of hardware to see if it might be the power unit or the raspberry itself or the globalcache. Hopefully I’ll figure it out.

Again thank you