I have been having very similar problems recently but I have made some changes to my system that have reduced the frequency of the problem considerably.
I have openHAB 2.0 running on Ubuntu 16.10 as a VM on VMware ESXi 6.0u3. I use USB passthrough for an Aeotec Zwave stick S2, an RFXtrx433e (433MHz), a CUL (can’t remember the exact version) (868MHz) and a bluetooth stick. The issue would manifest itself as a complete failure of Zwave comms via the Aeotec Zwave USB stick S2. Everything would behave normally for between a couple of hours and half a day, but then one of my Zwave devices would time out with similar messages to yours, then another, and another, and within a few minutes the USB stick wouldn’t be able to send any messages; all attempts timed out. I noticed that the frequency of this behaviour increased significantly after I’d upgraded from VMware ESXi 6.0u3 to 6.5. Some googling showed that VMware changed the USB stack in v6.5 and that this had caused problems for lots of users. It is possible to force ESXi to use the old driver by issuing the command listed in the following knowledgebase article:
I tried that and, while it extended the working life of the Zwave binding after a reboot of the Ubuntu server running openHAB, the problem would still occur, so I reverted to ESXi 6.0u3.
Maybe there was too much Zwave traffic on my system and this was swamping the Zstick? Or maybe running openHAB on ESXi with USB passthrough reduced the amount of traffic the USB stick could cope with before dying? In an effort to reduce the amount of Zwave traffic, I looked at the config parameters of the 4 Aeotec Multisensor6 devices I have and set the polling interval to 1 hour on each of them. I had lowered it to 10 minutes in the past when I was first setting up my network. This extended the “working life” of the binding to around 24 hours, but the problem would still occur and eventually all nodes would be declared dead.
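To put rough numbers on that polling change, here is a minimal Python sketch. It assumes each poll counts as one scheduled request per device and ignores any reports the sensors send on their own, so it only shows the relative reduction, not the real message count. The function name `polls_per_day` is made up for illustration.

```python
def polls_per_day(device_count: int, interval_minutes: int) -> int:
    """Scheduled polls per day for devices polled at a fixed interval.

    Assumption: one poll = one request per device; real Zwave traffic
    per poll may be higher (retries, multiple sensor values, etc.).
    """
    if device_count < 0 or interval_minutes <= 0:
        raise ValueError("device_count must be >= 0 and interval_minutes > 0")
    minutes_per_day = 24 * 60
    return device_count * (minutes_per_day // interval_minutes)


# The 4 Multisensor6 devices at the old and new polling intervals.
before = polls_per_day(4, 10)  # 10-minute interval
after = polls_per_day(4, 60)   # 60-minute interval

print(f"10 min: {before} polls/day, 60 min: {after} polls/day")
print(f"reduction factor: {before / after:.0f}x")
```

Under those assumptions, going from 10 minutes to 60 minutes cuts the scheduled polls for the four sensors from 576 to 96 a day, a sixfold drop.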
Progress, but not perfect. Further investigation of the logs revealed lots of references to dead nodes. I thought that maybe the system was sending messages to dead nodes and somewhere along the line a backlog was growing of requests waiting for responses that would never arrive. I physically removed a Duwi ceiling light dimmer that no longer worked, and I used the Zensys Windows software to remove nonexistent nodes from the Zwave USB stick. These nonexistent nodes were left over from devices I’d factory reset without first excluding them from the Zstick. This made a huge difference! As of writing, my openHAB server has been running for over two and a half days without problems. There is still the occasional message in the log about a dead node, but this seems to be a temporary issue that resolves itself.
TL;DR
Don’t use ESXi 6.5, use 6.0u3
Ensure that the polling period on all your devices is set to at least 60 minutes
Remove any missing nodes from your Zstick using the Zensys software