Struggles with stability

So yeah, my RPI reports Java 11.0.6. Doesn’t OH flag such an inconsistency? I have to say I’d be mighty pi**ed if I’ve just spent a huge amount of time struggling with problems caused by such a fundamental discrepancy.

No. OH3 will be Java 11 but the current stable version requires Java 8.

Amazing. :roll_eyes:

Java 8 installed. Will OH automatically pick up Java 8 over 11 or do I need to set a default somewhere?

Not sure. I think there is a command, update-alternatives?

Got it. Now let’s see if there’s any improvement. I hope I don’t have to go back to the restore though…

Pfff… better but no cigar. Looks like I’ll have to restore the backup. :rage:

I know there is a lot written here but not a whole lot of details. But I see at least four different and unrelated issues here.

First, I recommend using openHABian since you are on an RPi anyway. This is a known good configuration which will eliminate a number of sources of error such as having the wrong JRE installed.

Issue 1: Evohome

Clearly this binding is not happy for some reason. Put that binding into debug logging, gather the logs around the time that it loses its connection, and submit an Issue on the binding if it seems warranted. It’s impossible to say what is going on and why it’s losing the connection without some logs. But it does seem to be a binding issue.

I didn’t know you could delete links using openhab-cli. But that would only work if you are defining the Links in PaperUI. Are you doing so or are your Links defined in your .items files (e.g. { channel="evo:...)?

Also note that openhab-cli clean-cache just deletes the cache and tmp directories. There is no need to do both.

When you clear the cache, it causes OH to download and reinstall the bindings again. That’s probably why it started to work again. But I’m willing to bet that restarting OH or just restarting the bundle from the Karaf console would have been sufficient.

Issue 2: Duplicate Things

This is an odd one but it could be caused by the deleting of the Links or having a mix of Links defined by PaperUI and others defined in .items files. When you have stray Links in JSONDB that refer to Things or Items that no longer exist, OH gets confused and you end up with zombie Items or Things. But without more details it’s impossible to tell what is going on.

Issue 3: Grafana

Well, the data really flows from OH to InfluxDB. Grafana pulls the data from there. So you need to focus your looking on InfluxDB. Is it running? Are new values being written to it? Is the add-on still installed?

Again, putting the add-on into debug or trace level logging would be helpful.

Issue 4: Items don’t exist in Rules

This is a known issue that hits only a minority of users some of the time. There is an issue open for it but a fix will probably have to wait for OH 3.

The workaround is to wait for OH to finish starting after clearing the cache and then simply restart OH. A very few have reported needing to restart OH twice or more but I’ve only ever had to restart once. The issue is related to the indeterminate boot ordering: stuff starts running before its dependencies are loaded.

You will potentially see this error every time you clear the cache (which includes updates).

Honestly, I think Evohome may have some sort of bug and the rest is caused by your efforts to fix that bug.

You really should not have to delete the Links and I’m not sure where you got that idea. It’s unclear how you are defining your Links and how you do will greatly impact what openhab-cli will do when it removes the Links.

The fact that the binding works for a while but then fails, and the fact that it continues to do this after you clear the cache, is an indicator that you should not clear the cache to address that original problem. So stop doing that. Try a simple restart of OH. If that doesn’t kick start Evohome, uninstall and then reinstall just that one binding. Don’t clear the cache.

When you see the no such Item errors, all you need to do is wait and reboot. But if you stop clearing the cache, you shouldn’t see those errors any more.

I can’t say what’s going on with InfluxDB/Grafana without more details and getting a more stable system to debug from.

Hi Rich,

Thanks for your very helpful answer. Indeed there’s a mish-mash of things going on here, some related, some not. Your remark about Issue 4 is very helpful. On Issue 1, I have a .items file with all my items defined with their channel mappings. So indeed, openhab-cli strips out the items and the links, but I leave the things definitions alone. On boot the items and links/channels are restored. But yeah, this does rather seem like overkill. However a simple reboot (of the whole RPI even) didn’t fix the Evo problem. If it reoccurs I guess I’ll have to read up on the diagnostics you mention.

Issue 2, I’m not sure what happened. I think the Plugwise binding ‘things’ check (re)discovered PW circles that were already present as things and in my ignorance I clicked on them in the inbox. Certainly I’ve seen the Plugwise binding consistently rediscovering existing things. Whatever, it ended up in a big mess.

Issue 3, sure I know that the data flows through Influx. I checked that Influx was running, entered the Influx console and checked what little I know, but nothing jumped out. I’ll have to figure out how to check if new items are flowing into it.

Issue 4. Good to know. My sw background is in concurrency. My experiments with rules the last days have taught me that they don’t have run-to-completion semantics, which leads to interesting (!) cascades of events running through the system when mass updating a group of items.

My backup should now be restored and ready for me to check out. I’ll go and see whether things are any better…

You might find this thread helpful, or not

Interesting example. As an educational exercise I set up a Group of Switches and a rule that when one switch was turned on, all others that were off would also be turned on. It was interesting to see how the Info logs from the resultant cascade of update events to the group were interleaved. Thus I learnt that rules do not have run-to-completion semantics.

This would all be very enjoyable if it weren’t for the problems. Anyway, after restoring the backup and saved config files I seem to be back where I was with what seems to be a working system.


Sigh… still problems. Only some data is making it through to Grafana. Evohome data is flowing but the Plugwise data stopped - see graph. There are also some seriously corrupted looking messages from Plugwise in the logs. I guess it’s time to start from scratch. Don’t think I have the enthusiasm to start that this evening.

The PW log looks like this:

2020-02-26 20:44:51.541 [WARN ] [gwise.internal.PlugwiseMessageSender] - Error sending: No ACK received after 1 second: 0023000D6F000072970B47AE
2020-02-26 20:44:52.694 [WARN ] [gwise.internal.PlugwiseMessageSender] - Error sending: No ACK received after 1 second: 0012000D6F000072970B573E
2020-02-26 20:44:53.035 [WARN ] [se.internal.PlugwiseMessageProcessor] - Plugwise protocol message error: 0000153101E9
00131531000D6F000027B000C010500000151
001315320D600A61010D40011F4700000000000EEDDB
2020-02-26 20:44:53.848 [WARN ] [gwise.internal.PlugwiseMessageSender] - Error sending: No ACK received after 1 second: 0012000D6F0000AF671FAC98
#ASeusNdIf:Suc A:006007C8#ASeusNdIf:DsiainMC PlugwiseMessageProcessor] - Plugwise protocol message error: 000015330C4F
0DF00F7F
00241533000D6F0000AF671F140291400004487001856539070140234E0844C202B23B

Do you see Items updating in events.log but not showing up in the chart? If not, then the problem has to do with the Plugwise binding. If so, then we need debug logging from the InfluxDB add-on.

Yes. Plugwise events are appearing in the events log and current status is appearing in the Paper UI. Data from Evohome items appears in Grafana, but no data from Plugwise items.

Hi,

Just wanted to say that I’m running OH2 stable on Buster/Java 11 w/o any issues. But yes, as the docs say, Java 8 is required.

Can you please verify in Grafana’s data-source administration that everything is up to date?

Just wanted to tell you that there’s no direct data-flow from OH to Grafana. Grafana “pulls” data from a data-base (in most cases a time-series DB is preferred, e.g. influxDB).

If you need a GUI for data exploration, I strongly recommend Chronograf. After installation, you can easily browse your influxdb and see what happens.

One thing you could also double-check is whether persistence is set up correctly. There was an issue last week, maybe this helps!

OK, if the events are coming in then the issue is with the InfluxDB configuration. Double check your influxdb.persist file and make sure everything that should be persisted is listed and the strategy makes sense. Remember that * is not a wild card in .persist files. “Foo*” means all members of the Group “Foo”, not all Items whose names start with “Foo”.

Put the InfluxDB addon into debug or trace logging to see whether it shows any activity when the Items update or change state.

I’ve never seen it selectively fail like what is described so I’m going to guess somewhere along the way the configuration got messed up.

Hi Rich, you hit the nail on the head. It is indeed an influx .persist problem. All my plugwise items have names beginning with “PW”. So originally my Influx persist file had an entry “PW* : strategy : everyUpdate”. This worked and I assumed that PW* expands to all items with names starting with PW. (Yes, I have to get my naming conventions standardised at some point :innocent:)

But it actually doesn’t make sense to log anything but power and energy measurements to Influx. While trying to track down the cause of the various problems I’ve been seeing, I decided to reduce what was being logged to Influx. I have a group defined for all plugwise energy measurements “PWE” and another group for all plugwise power measurements “PWP”. I have a main group “PW”, that includes these (sub) groups. So I changed the .persist entries to “PWP : strategy everyUpdate” and “PWE : strategy everyUpdate”, expecting that now only plugwise power & energy items would be logged. But yeah, this is the problem.

If I understand you correctly, PW* works because it expands the MAIN “PW” group and not because it’s a wildcard for all PW items? Hmmmm… I don’t think I would have discovered that on my own. I suppose it’s an argument for better naming conventions in my definitions.

At the moment everything seems to be running smoothly. I’ll make a new backup and set about the hard task of figuring out how to do arithmetic with items and the harder task of how to accumulate energy consumption by period. The penny is yet far from dropping on how item methods work and why/how one needs to cast .states into different types to do simple sums. E.g., I can make a rule work that does simple arithmetic with a Number item. But if I change that item to Number:Energy, I can’t figure out how to change the local var needed to perform the arithmetic into the equivalent of Number:Energy. I lack proper understanding of what’s going on, I guess.

Anyway, many thanks for your help.

You can create the other groups too. An Item can belong to multiple groups.
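
For what it’s worth, here is a rough sketch of how the Items section of influxdb.persist might look using the group syntax. PWP and PWE are the groups you described and everyUpdate is the strategy from your original entry. I haven’t tested this against your setup, so treat it as an illustration rather than a drop-in file.

```
// Items section of influxdb.persist (sketch; group names taken from this thread)
Items {
    // The trailing * means "all members of this Group", not a name wildcard
    PWP*, PWE* : strategy = everyUpdate
}
```

Without the *, an entry like PWP refers to the Group Item itself, so only the Group’s own state is persisted and not the states of its members.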

In OH more so than most other platforms or software you may have used, you should always go back to the docs. This is explained there: Persistence | openHAB

Relying on examples or “common sense” does not always end with the right answer so, especially when something isn’t working, it’s always a good idea to return to the docs and reread them to make sure you haven’t made a false assumption.

There are tons of little details like this that can trip you up.

As for the naming convention, I always strongly recommend using names that are meaningful in terms of your actual home automation. Don’t include stuff like what technology it is linked to or some complicated series of fields to name Items. The Items are the model of your home automation so make it a meaningful model. So use names like “Bedroom_Lamp” or “Familyroom_Chandelier” over “Zwave_Node12_Switch”.

Also review Design Pattern: Associated Items for some of the things you can do if you are a little careful with your naming conventions.

It’s not so hard, you just need to cast the states to Number.

(MyNumber1.state as Number) + (MyNumber2.state as Number)

The Rules Engine for Rules DSL isn’t so good at figuring out how to work with DecimalType so just explicitly cast the state to Number every time and you won’t see any problem.

Ah, UoM is different. It does depend on whether you are doing math using the same units or not. For example, if you want to see if the state of a Number:Energy is greater than 5 kWh

MyEnergy.state > 5|kWh

To store that in a local variable with kWh units,

var five = 5|kWh

The bar converts the number to the given units. If you are working with two Number:Energy Items’ states, I don’t think you need to do anything, assuming they are compatible units.

If you need to remove the units of measurement

(My_UoM_item.state as QuantityType<Number>)

Many thanks. I’ll work further on it - it’s all rather addictive, I have to say.