RFXCom read error - RFXComStreamReader Connector died: java.lang.IndexOutOfBoundsException

Hello,

I am facing read issues with RFXCom. Below are the details.

I am running OpenHab 2.4.0 on Raspberry Pi 4 (4G Ram version) with RFXCom RFXtrx433XL running firmware version 1041. The RFXCom is connected to serial port /dev/ttyUSB0.

Following a restart of the Raspberry everything works as expected and DEBUG and TRACE logs show correctly the messages captured by the RFXCom for the different 433MHz Things.

After a period of time, the below show in /var/log/openhab2/openhab.log TRACE logs:

2019-09-25 23:25:19.369 [DEBUG] [g.rfxcom.handler.RFXComBridgeHandler] - Checking RFXCOM transceiver connection, thing status = ONLINE
2019-09-25 23:26:19.372 [DEBUG] [g.rfxcom.handler.RFXComBridgeHandler] - Checking RFXCOM transceiver connection, thing status = ONLINE
2019-09-25 23:27:19.374 [DEBUG] [g.rfxcom.handler.RFXComBridgeHandler] - Checking RFXCOM transceiver connection, thing status = ONLINE
2019-09-25 23:28:05.084 [TRACE] [rnal.connector.RFXComSerialConnector] - RXTX library CPU load workaround, sleep forever
2019-09-25 23:28:05.082 [ERROR] [nternal.connector.RFXComStreamReader] - Connector died:
java.lang.IndexOutOfBoundsException: null
        at gnu.io.RXTXPort$SerialInputStream.read(RXTXPort.java:1383) ~[?:?]
        at org.openhab.binding.rfxcom.internal.connector.RFXComBaseConnector.read(RFXComBaseConnector.java:63) ~[?:?]
        at org.openhab.binding.rfxcom.internal.connector.RFXComStreamReader.processMessage(RFXComStreamReader.java:81) ~[?:?]
        at org.openhab.binding.rfxcom.internal.connector.RFXComStreamReader.run(RFXComStreamReader.java:61) ~[?:?]
2019-09-25 23:28:19.377 [DEBUG] [g.rfxcom.handler.RFXComBridgeHandler] - Checking RFXCOM transceiver connection, thing status = ONLINE
2019-09-25 23:29:19.379 [DEBUG] [g.rfxcom.handler.RFXComBridgeHandler] - Checking RFXCOM transceiver connection, thing status = ONLINE
2019-09-25 23:30:19.381 [DEBUG] [g.rfxcom.handler.RFXComBridgeHandler] - Checking RFXCOM transceiver connection, thing status = ONLINE
2019-09-25 23:31:19.384 [DEBUG] [g.rfxcom.handler.RFXComBridgeHandler] - Checking RFXCOM transceiver connection, thing status = ONLINE

Beyond this point 433 MHz transmitted messages by Things are no more read as before.
Transmission of 433MHz towards Things continues to work as expected. Reading from Things only is impacted.
Note that I was unable to determine this period (after restart) or the trigger for the above error, yet the logs mention “RXTX library CPU load workaround, sleep forever” which I am unsure of the meaning.

Is this a known issue for you? Is there a way around it or a permanent fix?
Any help from the community is much appreciated.

Thanks loads!!
Patrick

@pstephan, this seems to be bug in rfxcom binding, so could you make an issue to github (https://github.com/openhab/openhab2-addons/issues)?

@pauli_anttila Thanks!
Bug is now raised here: https://github.com/openhab/openhab2-addons/issues/6162

Note: So far, it seems that by turning off some unnecessary listening channels on the RFXCom (Kept on are: Lightning4, Blinds -both, X10) seems to delay the issue from happening. But RFXCom will still become deaf.

Sorry @pstephan that I forgotten to inform you, but I have introduced some fixes which will hopefully also fix your problem. See more details from https://github.com/openhab/openhab2-addons/pull/6143.

Hello @pauli_anttila and thanks for the reply and for working on this.
Can you please:

  • Point out the location of the .jar file to download containing the new committed code and to be used in ${OPENHAB_HOME}/addons (after removing the old RFXCom biding)

  • Few words about the error and it’s solution.

From my end I will test the new binding in the coming week and report back on this thread.
Thanks a lot!

You can test fixed binding via latest openhab snapshot build or just download the latest rfxcom binding snapshot version from here https://ci.openhab.org/job/openHAB2-Bundles/ws/openhab2-addons/bundles/org.openhab.binding.rfxcom/target/org.openhab.binding.rfxcom-2.5.0-SNAPSHOT.jar

Fixes:

  • Fix IndexOutOfBoundsException when message len is 255 (this should be the only reason for it), this actually shouldn’t happen as rfxcom messages are not that long, but if serial port communication is out of control this could happen.
  • If receiver thread dies for any reason, connection to rfxcom controller is reopened, so error should stop receiving data only for short period of time.
  • Logging is improved. So if those fixes doesn’t fix your issue, logging should give more information about it.

Please, enable trace level logs anyway, so that we can see the root cause why IndexOutOfBoundsException happens on your environment.

Thanks @pauli_anttila.

The issue i am facing though now is that after:

  • uninstalling “RFXCOM Binding” in Paper UI
  • dropping the above .jar in “/usr/share/openhab2/addons”
  • reebooting the device

OpenHab is unable to list all the serial ports in Paper UI.
For instance 3 serial ports should be listed in my case:

  • /dev/ttyAMA0
  • /dev/ttyAMC0
  • /dev/ttyUSB0

and now in Paper UI, under " RFXCOM USB Transceiver" and “Z-Wave Serial Controller” only 1 options is shown:

  • /dev/ttyUSB0

Have I missed something here? What logs should I share?
Please let me know if a new post is needed here…

Thanks a lot!
Patrick

So I was able to go around the serial ports not showing, by doing the below. I believe the errors that happened were due to missing dependencies like “Eclipse SmartHome Configuration USB-Serial Discovery”:

  • DO NOT uninstall the “RFXCOM Binding 2.4.0” in Paper UI
  • drop the .jar in “/usr/share/openhab2/addons”
  • (if more dependencies are needed, also drop them in the same folder)
  • reboot the RP4
  • logon to Karf consol (that’s “ssh -p 8101 openhab@localhost” once logged in via ssh on the RP4)
  • list the bundles: “bundle:list”
  • make sure both RFXCom 2.4 and 2.5 are listed
  • stop the RFXCom 2.4.0: “bundle:stop <id_of_RFXCom_2.4.0>”
  • restart the RFXCom 2.5.0: “bundle:restart <id_of_RFXCom_2.5.0>”

Now monitoring the situation in case the RFXCom fix fails.
Will report in a week time.
Thanks!

Hello,

For some reason I have noticed that RFXCom fails silently now. I have noticed this once. Below are the logs showing how:

  • at 08:12:33: 1 set of shutters (RFXCOMSaloonShutterLogia2_Shutter) opened correctly, and just after
  • at 08:12:34: RFXCom did not transmit the needed command for the second shutter (RFXCOMBlinds1Actuator_Shutter). Also RFXCom stoped receiving 433 MHz messages.

Thus, I have implemented a cron job that restart RFXCom 2.5 binding everyday during the night.
Since then I had no issues to report.

Logs:

2019-10-05 08:10:50.075 [DEBUG] [internal.handler.RFXComBridgeHandler] - Checking RFXCOM transceiver connection, thing status = ONLINE
2019-10-05 08:11:50.077 [DEBUG] [internal.handler.RFXComBridgeHandler] - Checking RFXCOM transceiver connection, thing status = ONLINE

==> /var/log/openhab2/events.log <==
2019-10-05 08:12:33.525 [ome.event.ItemCommandEvent] - Item 'RFXCOMSaloonShutterLogia2_Shutter' received command UP

==> /var/log/openhab2/openhab.log <==
2019-10-05 08:12:33.529 [DEBUG] [fxcom.internal.handler.RFXComHandler] - Received channel: rfxcom:blinds1:XXXXX:shutter, command: UP
2019-10-05 08:12:33.533 [DEBUG] [internal.handler.RFXComBridgeHandler] - Transmitting message 'Raw data = unknown, Packet type = BLINDS1, Seq number = 0, Sub type = T6, Device Id = XXXXXXXXXX.X, Command = OPEN, Signal level = 0, Battery level = 0'
2019-10-05 08:12:33.535 [TRACE] [rnal.connector.RFXComSerialConnector] - Send data (len=10): 091906007608B3710000

==> /var/log/openhab2/events.log <==
2019-10-05 08:12:33.539 [nt.ItemStatePredictedEvent] - RFXCOMSaloonShutterLogia2_Shutter predicted to become UP
2019-10-05 08:12:33.549 [vent.ItemStateChangedEvent] - RFXCOMSaloonShutterLogia2_Shutter changed from 100 to 0
2019-10-05 08:12:34.077 [ome.event.ItemCommandEvent] - Item 'RFXCOMBlinds1Actuator_Shutter' received command UP
2019-10-05 08:12:34.080 [nt.ItemStatePredictedEvent] - RFXCOMBlinds1Actuator_Shutter predicted to become UP

==> /var/log/openhab2/openhab.log <==
2019-10-05 08:12:34.081 [DEBUG] [fxcom.internal.handler.RFXComHandler] - Received channel: rfxcom:blinds1:XXXXX:shutter, command: UP

==> /var/log/openhab2/events.log <==
2019-10-05 08:12:34.087 [vent.ItemStateChangedEvent] - RFXCOMBlinds1Actuator_Shutter changed from 100 to 0

==> /var/log/openhab2/openhab.log <==
2019-10-05 08:12:40.514 [DEBUG] [fxcom.internal.handler.RFXComHandler] - Received channel: rfxcom:blinds1:XXXXX:shutter, command: UP

==> /var/log/openhab2/events.log <==
2019-10-05 08:12:40.518 [ome.event.ItemCommandEvent] - Item 'RFXCOMBlinds1Actuator_Shutter' received command UP
2019-10-05 08:12:40.521 [nt.ItemStatePredictedEvent] - RFXCOMBlinds1Actuator_Shutter predicted to become UP
2019-10-05 08:12:43.182 [ome.event.ItemCommandEvent] - Item 'RFXCOMBlinds1Actuator_Shutter' received command UP
2019-10-05 08:12:43.186 [nt.ItemStatePredictedEvent] - RFXCOMBlinds1Actuator_Shutter predicted to become UP

==> /var/log/openhab2/openhab.log <==
2019-10-05 08:12:43.185 [DEBUG] [fxcom.internal.handler.RFXComHandler] - Received channel: rfxcom:blinds1:XXXXX:shutter, command: UP
2019-10-05 08:12:50.078 [DEBUG] [internal.handler.RFXComBridgeHandler] - Checking RFXCOM transceiver connection, thing status = ONLINE

==> /var/log/openhab2/events.log <==
2019-10-05 08:12:54.123 [ome.event.ItemCommandEvent] - Item 'RFXCOMBlinds1Actuator_Shutter' received command UP
2019-10-05 08:12:54.128 [nt.ItemStatePredictedEvent] - RFXCOMBlinds1Actuator_Shutter predicted to become UP

==> /var/log/openhab2/openhab.log <==
2019-10-05 08:12:54.131 [DEBUG] [fxcom.internal.handler.RFXComHandler] - Received channel: rfxcom:blinds1:XXXXX:shutter, command: UP

Thanks!

Hello @pauli_anttila
The issue seems to persist even with an automatic restart of the RFXCom binding bundle every 24 hours. As described above the transmitter receiver fails now silently. Here is what is observed in chronological order:

  • No more 433 MHz transmissions are received/logged, whereas no error showed.
  • The first command to transmit is sent by RFXCom correctly but all subsequent commands fail.

The logs above in the thread show this.

Any thoughts how to proceed here? What other logs can I check/provide?

Thanks a lot!

I havet the same problem. I have to restart my PI every night.

/mike

I fixed bug which caused receiver thread to die in situation where message length was 255. But as I said on my previous post, this shouldn’t happen as messages from RFXCom controller is never that long. To be able to help, you should provide full logs in trace level (logs which contains communication just before and after communication dies).

Hi @pauli_anttila, sure no problem. The logs above are already TRACE level for RFXCom binding, thus i’ll be sharing with you the full logs since last known command that worked as expected.
Note that the original error of “RFXComStreamReadConnector died” is no more showing.
Let me get back to you this WE as I am out of town with no remote access to my system.

Yes, but I can’t see any log messages about the message lengths, etc which new binding versions should print out in trace level.

Well, that’s then good and bad news as it also indicates that root cause have been the illegal message length. So it seems that your serial port communication goes out of control and most probably that can’t fixed by the binding. Hopefully we can get more information from logs, if you are able to catch the moment when problem occurs.

Hello @pauli_anttila, as promissed attached are the logs starting by the last known working command, until issue is seen. Device ids are masked.
So far I am not seeing any explicit message to indicate an error.
I’ll try enabling other TRACE logs on the Pi4. Please let me know if anything in those logs rings a bell for you.

extract-openhab.log (106.7 KB)

@pstephan, 2019-10-04 23:26:53 binding receive last message from RFXCom controller, after that nothing is received. Receiver thread is still running as when you shutdown, reader thread is shutdown correctly as well. 2019-10-05 08:12:33 you tried to send message, but binding doesn’t receive any response. Because next send after couple of seconds later doesn’t even print “Transmitting message …” we can assume that serial port writing have jammed in previous sending. So problem could be in the serial library, operating system (serial port driver) or maybe even rfxcom controller itself.

You could try to run openhab and rfxcom binding e.g. on your pc (windows/mac/linux) to see if problem occurs there as well.

Does syslog or dmesg shows any suspicious?

@pauli_anttila, so sorry for this very late reply.

Fact is currently i am running this with 4 restarts every 24 hours of the RFXCom biding (using a cron job). I haven’t notice any error in transmit. I can’t know if misses happened on receive of course.
To answer your question about errors in linux log files, i do not recall checking dmesg or syslog.

If i have time i’ll remove the cron jobs and retest without any restart until issue reproduces. I’ll set some other logs to TRACE too and will give dmseg and syslog a check.

I don’t believe it will be possible for me to test on my pc this. Thus any help from the community is more than welcome.

This issue seems to be solved in OpenHab v2.5.
Also no need to restart the RFXCom binding service.