Data Analysis and Filtering - looking for suggestions

Hi All,
I use OH 2.5.7 on an openHABian system on a RPi4, with data being stored in InfluxDB (?ver) and displayed in Grafana 7.1.3. As it may be relevant: the Pi is headless, I also have a headless Ubuntu server, and I use a Windows laptop for my daily computing.

I’ve recently installed magnetic reed switches in a pedestrian gate, and have found that wind generates a lot of false positive notifications of the gate opening. I have a decent bungee cord acting as a spring holding the gate closed, and feel that anything stronger would be a frustration/challenge for my kids.

I’ve graphed what I believe to be the relevant details in Grafana, and was hoping to do some actual analysis on the data so that I can filter out notifications during likely false positive periods, and yet still push notifications otherwise. (I use Telegram for notifications, and am generally quite happy with it - I’ve had to mute the channel during windy periods, though.) Other, more important events, like the driveway button being pressed, still need to be recognized, and currently the false positives are a significant barrier to the wife acceptance factor for getting the app installed on her phone.

I was hoping to find a way to do some analysis in grafana (doesn’t exist, as it’s focussed on displaying, not analyzing, it seems), and looking into options in influxdb was a huge can of worms. It’s hard to figure out where to start, and what route is the best use of my time for the learning curve. As I amass more data, I imagine questions of analysis will come up more often, so I’d kind of like to know the best way to do simple analysis of my data.

In this case, I want to find out what percentage of events (gate opening) are captured by searching for wind direction between 110 and 120, with speeds greater than 32 km/h (for example). I’d then want to be able to graph the outlying data/inverse only, as it’s less likely to be false positives. Validating that would be fairly simple, as it would match my kids school bus schedule and my wife’s run times for the most part.
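A minimal sketch of that calculation in plain Python, assuming the gate events have been exported (for example to CSV) with the wind direction and speed at the time of each event already attached. The `GateEvent` fields, the function names, and the sample rows are all made up for illustration; only the 110-120 degree and 32 km/h thresholds come from the example above.

```python
from dataclasses import dataclass


@dataclass
class GateEvent:
    timestamp: str          # time of the gate-open event (hypothetical format)
    wind_dir_deg: float     # wind direction when the event fired
    wind_speed_kmh: float   # wind speed when the event fired


def is_windy(event, dir_min=110.0, dir_max=120.0, min_speed=32.0):
    """True if the event falls inside the suspected false-positive window."""
    in_direction = dir_min <= event.wind_dir_deg <= dir_max
    return in_direction and event.wind_speed_kmh > min_speed


def split_by_wind(events):
    """Return (windy events, remaining events, percent of events that are windy)."""
    windy = [e for e in events if is_windy(e)]
    other = [e for e in events if not is_windy(e)]
    pct = 100.0 * len(windy) / len(events) if events else 0.0
    return windy, other, pct


# Invented sample rows, only to show the shape of the data.
sample = [
    GateEvent("day1 07:45", 95.0, 12.0),
    GateEvent("day1 13:02", 115.0, 38.0),
    GateEvent("day1 13:04", 112.0, 41.0),
    GateEvent("day1 15:30", 118.0, 20.0),
]

windy, other, pct = split_by_wind(sample)
print(f"{pct:.0f}% of events fall inside the wind window")
print("events outside the window:", [e.timestamp for e in other])
```

The `other` list is the "inverse" set, which could then be written back out and graphed against the school bus and run schedule to check whether it looks like real use.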

In the below images, yellow/orange is the detected gate opening, displaying the frequency. Actual use of the gate is likely 4/day, so most of what is seen are false positives. The blue is wind direction (right axis), and green is wind speed. Superficially, I believe I get false positives when the direction is between 100 and 120 degrees and the speed is above 32, and especially when it is above 40 (pretty sure the unit is km/h).
Past 7 days:

Past 30 days:

I have some superficial experience with Excel for data analysis, and 15+ years ago I used SPSS in university/grad school, so I likely have inappropriate aspirations… This goal should be fairly straightforward, though.
Thanks for any guidance / suggestions on best routes for learning curve (time expenditure) vs benefit for simple analysis like I described above.

Hi,

very interesting issue. To be honest, my proposal would be to change the root of the problem, before throwing a lot of thought and time at it.
Could you maybe share what your pedestrian gate looks like? I have a normal garden fence gate in mind and there must be a reason why you don’t want it to mechanically lock in place I guess?
That at least would solve your problem right away.

As to grafana, well you certainly can do some math in the query, but to be honest it is not exactly what you are looking for to develop a rule, at least in my opinion.

Completely suppressing notifications in a certain wind direction and magnitude frame is clearly not what you want, as you still want to get notifications during that time…

One thing I could think of is that you require a certain amount of time for the “open” state. Probably a person will take at least x time to close the door after opening and going through - so everything below that time you could eliminate as an event.
Before fumbling with an influx query, just get the values to csv and then do it in excel or the like.
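As a rough sketch of the minimum-time idea on exported data, assuming the CSV has been reduced to a time-sorted list of `(seconds, state)` pairs with states `"OPEN"` and `"CLOSED"`. The function name, the 3 second threshold, and the sample values are assumptions for illustration, not something measured on the actual gate.

```python
def real_openings(changes, min_open_s=3.0):
    """Keep only openings that stayed open for at least min_open_s seconds.

    changes: list of (time_in_seconds, state) sorted by time,
             where state is "OPEN" or "CLOSED".
    Returns a list of (opened_at, open_duration_in_seconds).
    """
    kept = []
    opened_at = None
    for ts, state in changes:
        if state == "OPEN" and opened_at is None:
            opened_at = ts                      # start of a candidate opening
        elif state == "CLOSED" and opened_at is not None:
            duration = ts - opened_at
            if duration >= min_open_s:          # long enough to be a person
                kept.append((opened_at, duration))
            opened_at = None                    # wait for the next opening
    return kept


# Invented sample: two short wind rattles and one real pass-through.
sample_changes = [
    (0.0, "OPEN"), (0.4, "CLOSED"),
    (10.0, "OPEN"), (10.2, "CLOSED"),
    (60.0, "OPEN"), (66.5, "CLOSED"),
]

print(real_openings(sample_changes))
```

Trying a few values of `min_open_s` against a windy day and a calm day should show fairly quickly whether a usable threshold exists.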

So yeah, I’d try the minimum time filter and consider mechanical locking. I hope this helps you a little!

Fully agree with @BobMiles. I do data processing for a living and this is a very recurrent situation. My ‘customers/colleagues’ often ask “Can’t you just clean-up the data?” … and my answer is always “You’re better off starting with clean data and good sensors”. In any event, not lecturing here at all. Yes, data filtering is possible, you can do conditional filtering/reporting, some moving-point-average (low pass filter) … there are many options. The easiest might be to find a more robust sensor, one that is not too sensitive to wind and small deviations. Of course, the devil is in the details, and most likely you’ve done a lot of exploring before settling on your current solution.
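To make the moving-average (low pass) idea concrete, here is a small sketch in plain Python. It assumes the contact state has been sampled at even intervals as 0 (closed) and 1 (open); the window size, the 0.5 threshold, and the sample series are illustrative assumptions only.

```python
def moving_average(values, window=3):
    """Trailing moving average; early samples use however many values exist."""
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Invented series: one single-sample blip, then a sustained opening.
raw = [0, 1, 0, 0, 1, 1, 1, 1, 0, 0]

smooth = moving_average(raw, window=3)
# Treat the gate as open only when most of the recent samples were open.
debounced = [1 if v > 0.5 else 0 for v in smooth]

print(debounced)
```

The isolated blip is dropped while the sustained opening survives, at the cost of reporting it a sample late; a larger window filters more but delays more.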

PS. Love those Grafana graphs.

Thanks Bob.
You’re absolutely correct - this is the best fix. I had been distracted by the side question (which I will likely eventually want to come back to).
I have completely silenced the data until the weather improves/I get out there and install a proper latch.

Thanks DrJB, I have a bit of a data and graph fetish, for sure, and it confirmed the problem (from the correct direction, with enough wind, the gate is pushed open).
I agree about fixing the data in the first place. I think the reed switch I used had a fair bit of room (about 5 cm movement relative to the magnet), which I thought was pretty good, and then I installed another magnet beside it to increase this wiggle room and tested to ensure it didn’t activate as it moved between the two magnetic fields. This somewhat easy fix didn’t solve my problems.

I plan on installing a proper latch, and then most of the false positives should be taken care of.

After that, I may have to dive into low pass filters and possibly still filtering out extreme weather for my main driveway gate, which doesn’t have the luxury of being able to have a latch (no plans to install a magnetic switch on this old thing). These plans are on hold, though, as I broke the reed switch when installing it, and … ug … winter weather.