OpenHabian on Raspberry PI 3 Occasionally Locks Up, Can't SSH in, After reboot Rules no longer work

Based on the recommendation of others… I’m opening this issue as a NEW and separate thread…

I’ve been fighting with rules that work perfectly for a while (1-3 weeks) then they suddenly stop working and I need to delete and recreate them to make them functional again.

I’m running OpenHabian on a Raspberry Pi 3 Model B Plus Rev 1.3.

I use an Aeon Labs Aeotec Z-Wave Z-Stick, Gen5 Z-Wave Plus - ZW090 USB Stick.

I’m running OpenHabian 4.19.42-v7+ #1219 SMP Tue May 14 21:20:58 BST 2019 armv7l.

OpenHabian boots/runs from an SD Card.

I set up my rules using the Paper UI “Experimental” Rules.

I installed and configured OpenHabian back in May/June of this year. As noted above, I have an ongoing issue where “rules suddenly stop working” (lights stop turning on/off at the prescribed times/events).

I set up some very simple rules in Paper UI to operate switches that turn lights ON/OFF in my home. The rules make use of Astro Channel events such as local Sun Civil Dusk and local Sun Sunrise events to run a command that turns ON/OFF Z-Wave and Wemo Wifi Switches that are connected to lights.

The rules work great for anywhere from 1 up to 3 weeks. Then they just stop working. ALL OF THEM.
At this point, when I try to open Paper UI or ssh into OpenHabian from another machine on my home network, I often can’t. I have to reboot my Raspberry PI to get OpenHabian running and accessible again. After the reboot, Paper UI comes back and I’m able to ssh in again, but the rules appear to be corrupt/broken. If I edit a rule by making a minor change to the rule description, save it, and then manually run it with the “play button” icon in Paper UI, it appears to work again. Sometimes, however, a rule that works with the “play button” no longer triggers on its Channel event or fixed time, and I need to delete the rule and recreate it in Paper UI before it works again. This goes for ALL of the rules. Things then work for a while before I have to repeat the entire cycle in another 1-3 weeks.

Guidance and suggestions for debugging and tracing the root cause of this issue would be appreciated.

Where should I start?

Thanks in advance for your help.

…Steve

@5iver you support NGRE correct?
What version of openHAB are you running? The stable 2.4 is now quite old and many are running 2.5M2 or later.
That rules engine was considered experimental. I believe it is an addon so you might be able to run a newer version with an older OH. Scott should know for sure.

Upgrade to 2.5M3 and see if you have the same issues. A lot has changed since 2.4. Do you use an SD card or SSD? If just an SD, it may be a good time to refresh it.

Even if just for standardization for support, it’s best to update everything at the same time.

Also, just to mention: your title says Raspberry PI 2 and your description says Raspberry PI 3 Model B, which I think is the minimum recommended platform.

I have not yet made the jump from M2 to M3. My current main use is Z-Wave but I will likely work more with the API too.
In your opinion, is M3 as stable as M2?

I use snapshots, so haven’t tried either. But by definition, a more recent milestone will be an improvement over an older version. 2.5M2 was IMO an exception and was rushed to release for unknown reasons without adequate tools to support further development efforts… and there are still outstanding issues… but there are other threads about that.

OK. M2 has been fine for me.
Since I am doing some physical reorganizing of my zwave network I will wait for that to get stable before upgrading. It is curious that some devices, although close to my HUSBZB stick prefer to route through another more distant node.

Thanks for pointing that out. I’ve updated the Thread description to reflect Raspberry PI 3

Hi,

I’d suggest two actions for different reasons:

  • Back up all your config to a remote disk.
    I’ve been using RPi2 and RPi3 for years, and on the few occasions when there have been issues, they have been with hardware and not software. Modern uSD cards last for years, but as I don’t take any special precautions (no RAM disk, write logs to uSD continuously), they do eventually start behaving oddly - and random lockups are a symptom I’ve seen.
    You can never have too many backups! :)
  • I have used rules created in text files, and don’t tend to see any issues. Without any evidence, this makes me suspect Paper UI “Experimental” Rules!
    Have you tried adding instrumentation to one or more rules? I use varying patterns of log messages which are normally commented out, but help a lot when issues arise. Add entry/exit log lines to ALL rules, and consider a regular event (such as a cron job) to create a ‘heart beat’ so you can tell when things lock up.

As I don’t use the NGRE, here’s a plain old vanilla example…

    rule    "CronHeartBeatTest"
    when
        /*        sec min hr dom mon dow [yr]    */
        Time cron "5   *   *  ?   *   *"
    then
        logInfo("CronHeartBeatTest", "Rule entry...")

        // optional stuff to print state variables goes here
        //logInfo("ItemName State", "ItemName <" + ItemName.state +">")


        logInfo("CronHeartBeatTest", "Rule exit...")
    end

Even commented out with //, the entry / exit log lines also have the benefit of adding documentation and context - you know WHICH rule that brace ends.
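
Once a heartbeat rule like this is logging, the gaps between heartbeat lines show roughly when the system stopped responding. Here is a rough Python sketch (not part of openHAB) that scans log lines for such gaps. The function name `find_heartbeat_gaps`, the two-minute threshold, and the assumed log line shape (a `YYYY-MM-DD HH:MM:SS` timestamp at the start of each line) are my own assumptions, so adjust them to match what your openhab.log actually looks like.

```python
import re
from datetime import datetime, timedelta

# Assumed line shape (check against your own log before relying on it):
# 2019-08-20 12:00:05.123 [INFO ] [...script.CronHeartBeatTest] - Rule entry...
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def find_heartbeat_gaps(lines, marker="CronHeartBeatTest",
                        max_gap=timedelta(minutes=2)):
    """Return (last_seen, next_seen) pairs where consecutive heartbeat
    lines are further apart than max_gap."""
    gaps = []
    previous = None
    for line in lines:
        # Only look at lines written by the heartbeat rule.
        if marker not in line:
            continue
        match = TIMESTAMP.match(line)
        if match is None:
            # Skip lines that do not start with a timestamp.
            continue
        current = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if previous is not None and current - previous > max_gap:
            gaps.append((previous, current))
        previous = current
    return gaps
```

The function only iterates over lines, so you can pass it an open file handle for your log file as well as a list of strings. Each returned pair brackets a period with no heartbeat, which is where I would start looking in the system logs.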

@5iver

I used openhabian-config and the option to upgrade/change the installed version to install the latest test release (2.5M3). According to openhabian-config, it downloaded and installed the image successfully. However, I couldn’t get it running. I tried rebooting several times, also tried backing out and downgrading my installation back to the prior “stable” release 2.4, but after a couple of hours of hitting the wall, I finally gave up.

For now I’ve downloaded the Raspbian openHAB 2.5.0.M3 milestone Runtime (available here: https://openhab.jfrog.io/openhab/libs-milestone-local/org/openhab/distro/openhab/2.5.0.M3/openhab-2.5.0.M3.zip) and installed it manually according to the instructions on the openHAB site here: https://www.openhab.org/docs/installation/linux.html#manual-installation

I now have 2.5.0.M3 installed and running.

I didn’t bother preserving my old config or trying to restore it, as I hadn’t invested in a lot of home automation to this point. I had to install the Z-Wave, Wemo and Astro bindings again. My Z-Stick Z-Wave controller still had its list of registered Z-Wave devices, so the setup was pretty straightforward and quick. Faster than the openHAB upgrade :)

I installed the Next Generation Rules Engine (NGRE) from the Paper UI (Add-ons:MISC:Rule Engine (Experimental)).

I then recreated my rules in the NGRE (Experimental) Rules Paper UI interface.

So I’m now back to the same point I was at with openHab 2.4 Stable.

I will monitor my Raspberry PI, openHAB installation and the NGRE rules for the next few weeks to see if I get better durability and avoid the “Rules Suddenly Stop Working” and “PI/openHAB hangs with no Web UI and no ssh access” issues.

I’d rather make use of the Paper UI rules config and the NGRE if it’s stable in 2.5.0.M3, so I’m sticking with it rather than heading down the tried-and-true text based rules path for now. The average person is going to want to use Paper UI to set up rules because of the ease of use. Hopefully the team can focus on making NGRE and the Paper UI rules config more robust.

Wish me luck! I’ll report back on my findings over the coming weeks.

…Steve

Just a thought.
There may be 2 different things happening here. My 2.5M3 Pi seems to lock up after a few weeks. Ssh and OpenHAB do not work but my nginx reverse proxy works giving 500 error because OH is down.

After a reboot my rules work as expected, though. This is on Raspbian Stretch running off an SSD on a USB port. I also had the lockup happen when using an SD card.

There is no team… just individuals volunteering their time and development efforts. I plan to add some more ModuleTypes, like “Member of” and “System started”, and to migrate the core actions, but only after finishing up a Jython ScriptEngine addon. I also have some ideas for expanding the new rule engine to support multiple paths.

You might want to take a look at this… it’s not something you can set up in a UI, but it simplifies rule creation a LOT. The example provided requires setting up Jython and the helper libraries.

How does NGRE relate to the other JSR(?) scripting options? Still trying to wrap my head around the choices & options.
When I move from Rules DSL I am considering Jython. I assume it is more powerful than NGRE?

I’ve covered this in some other topics, so you might find more detail in those.

  • New rule engine (NGRE)
    • Rules can be stored in JSON resource bundles
      • These can be created by hand, through the REST API (used by Paper UI), or they can be imported from a template
      • The only editor right now is Paper UI, but the JSON files can be edited directly too
      • Rules and ModuleTypes can be imported from templates (this will be built out into a new ExtensionService, like the Eclipse Marketplace)
      • These rules can include Scripted Conditions and Actions for any supported language
        • This is basically scripted automation inside an Action or Condition
        • Libraries can be utilized in Scripted Conditions and Actions
        • Scripted Conditions and Actions can be edited through the REST API (e.g. Paper UI)
    • Scripted automation is included with the new rule engine and allows the use of scripting languages that have implemented the Java Scripting API
      • Script files can contain several rules, as is done in a .rules file for the rules DSL
      • Creating rules is difficult without the Jython decorators, which make them as easy to create as it is in the rules DSL
      • Helper libraries have been written for Jython and JavaScript, and are planned for Groovy
        • The Jython helper libraries are more evolved and IMO are more powerful than the JS ones
        • JavaScript will benefit greatly from an upgrade to JDK9, where some of the ES6 features were added to Nashorn, allowing for the decorators to be included for the JS HLs
      • Other languages should also work and just need to be tested
      • You can do a LOT more than just create rules using scripted automation
      • It is still marked as experimental, so that breaking changes could be made outside of a major release, but there have not been breaking changes since S1319 (if you ignore the ESH package name changes in S1566)

While the rules DSL (arguably) helped inexperienced coders to build rules, it is proprietary and limits what you can do with your rules. Scripted automation lets you use well-known and documented scripting languages to create your rules and it is pretty much unlimited in its capabilities.

Rules created through a JSON file, the REST API (Paper UI), or scripted automation are ALL using the new rule engine. For use with scripted automation, I recommend Jython and feel it is the most powerful. For testing, I have Jython, JS and Groovy scripts, so try them out! The new rule engine and scripted automation can be used along with the rules DSL.
