Hello everyone.
Just wanted to share the results of my investigation and my current approach to persisting data to an external database (MySQL specifically) from OH3. I'm an absolute beginner, so please take what I say about OH3 with a critical spirit.
What is all this for?
Once your data are stored in a relational database, you can - if you are handy with SQL - define relationships between the data so you can reaggregate and query them more quickly.
In any case, you can extract the data and work on them using common spreadsheet tools like MS Excel.
The problem
In short, I need to persist data from a sensor I made (ESP8266) which broadcasts its readings through an MQTT broker. openHAB connects to the broker and displays the data. Nice.
I wish I could
I wish I could store the data, after charting, to a MySQL database, as I need them stored accurately to further develop my project (in the future I plan to use different readings of luminosity, humidity, temperature and soil humidity to find the ideal conditions for different plants to grow in my orchard, by controlling shading, watering and the like).
What I tried
I tried setting up MySQL persistence in OH3. I did not know exactly where I would end up, so I just tried. I ran into various issues, but I fixed them.
What was not satisfying
I think that natively there is only partial coverage of my “wish I could”. The main limitation: when you’ve got a sensor board broadcasting different data at the same time, the data seem to always be treated individually by openHAB. I still have to investigate how I can attach metadata and whether this can help fix the issue, but to make it short, the practical effect is that the JDBC MySQL persistence creates a different table for each of the data Items.
Below, for instance, are the three tables created when persisting the three readings from my sensor board.
This is not OK for me, for a few reasons:
a) you lose the link between the data. Three readings with the same original timestamp are treated as (potentially - see the following item) three unrelated readings. This is messy when you want to analyze the data using the timestamp to identify a set of readings taken “at the same time”.
b) openHAB's optimization - to compress data, I bet - posts data to these tables only on value changes. That means that if you have equipment sending 3 sensor readings every second, and of the three one is constant while another is ever-changing, you’ll get 1 line (potentially 0) in one table and 10 in the other.
c) memory use in the database. In MySQL every DATETIME uses 5 bytes and a DECIMAL(3,1) uses 2 bytes, so storing the original packet of 1 timestamp and three readings in a proper form takes 5+2+2+2=11 bytes. With OH persistence - which creates tables with a DATETIME and a DOUBLE field, the latter taking 8 bytes - it is (5+8)*3=39 bytes (neglecting the optimization mentioned above). Things obviously get worse if your thing has multiple sensors and broadcasts even more readings at the same time. Moreover, DOUBLE is an approximate type while DECIMAL is exact, which makes the analysis easier to read (I’m aware that the sensor readings themselves are obviously approximate).
d) OH's optimization is good but brings data loss. Two examples: first, if a reading is not broadcast while it stays constant, you cannot simply compute an average with a database function, as a value held for 10 minutes gets the same weight as a one-time glitch. Second, you can’t tell whether silence from OH means the data is constant or your sensor is simply dead and no longer broadcasting. Not good.
OK, enough for the foreword.
My current setup to fix this
a) have the sensor broadcast a JSON packet through MQTT and create an Item in OH to receive it:
{ "time": "2020-12-21 14:53:37", "temperature": 13.6, "humidity": 66.2, "senstemp": 12.7 }
b) optionally use JSONPATH to extract the individual values and use them to feed the Items showing the three readings in the OH UI
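As a side note, MySQL understands the same `$.field` path expressions, so you can test the extraction paths directly in the database before wiring them into JSONPATH transformations. A quick sketch against the sample payload above (the `@payload` variable is just for illustration):

```sql
-- JSON_UNQUOTE strips the quotes from string values (same as the ->> operator)
SET @payload = '{ "time": "2020-12-21 14:53:37", "temperature": 13.6, "humidity": 66.2, "senstemp": 12.7 }';
SELECT
  JSON_UNQUOTE(JSON_EXTRACT(@payload, '$.time')) AS ts,
  JSON_EXTRACT(@payload, '$.temperature')        AS temperature;
```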
c) OH will persist the data in a table similar to the one below (don’t get confused by the time value - more on that in item (e); normally it would be an actual timestamp from OH)
d) set up a new table with an optimized structure, something like
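A sketch of such a table, using the column names that the trigger in step (e) expects; the DECIMAL(3,1) precision matches the byte-count reasoning above, and the primary key on the timestamp is my assumption - adjust both to your sensors:

```sql
CREATE TABLE data_tsdc (
  date     DATETIME     NOT NULL,   -- 5 bytes: the sensor's own timestamp
  temp     DECIMAL(3,1),            -- 2 bytes each: exact values, easy to analyze
  tempsens DECIMAL(3,1),
  umidita  DECIMAL(3,1),
  PRIMARY KEY (date)
);
```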
e) add a trigger to the table created by the JDBC MySQL persistence, that is:

DELIMITER $$
CREATE DEFINER=`root`@`localhost` TRIGGER bundle_tsdc_0007_BEFORE_INSERT
BEFORE INSERT ON bundle_tsdc_0007
FOR EACH ROW
BEGIN
  -- copy the JSON fields into the optimized table
  -- (->> unquotes the values so they cast cleanly into DATETIME/DECIMAL)
  INSERT INTO data_tsdc
    SET date = NEW.value->>"$.time",
        temp = NEW.value->>"$.temperature",
        tempsens = NEW.value->>"$.senstemp",
        umidita = NEW.value->>"$.humidity";
  -- pin the timestamp so the persistence keeps overwriting a single row
  SET NEW.time = '2000-01-01 00:00:00';
END$$
DELIMITER ;
Basically, what I am doing here is: when a new data point is pushed to the table by OH, I
- pick the values and push them to the optimized table
- alter the timestamp to a fixed time point, here “2000-01-01 00:00:00”. This causes each new entry to overwrite the previous one, so that - in fact - OH stores just one row in this table.
The result is what I expected, that is:
optimized table
OH transit table
The perks of this are:
- I can read directly in the DB when the last transmission from the equipment happened
- if any error occurs while pushing data to the optimized table, the unimported value will sit in the transit table and won’t get lost. It can be examined at any time and recovered by the db admin.
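For the first perk: the trigger pins the time column but leaves the JSON payload untouched, so the last transmission can be read straight from the single-row transit table. A sketch, assuming the persistence keeps the raw JSON in its value column as shown above:

```sql
-- The transit table holds exactly one row; the JSON inside it still
-- carries the real timestamp from the sensor.
SELECT value->>"$.time" AS last_transmission
FROM bundle_tsdc_0007;
```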
Hope this helps some other newbie.
Improvements
- looking into the Design Pattern: Group Based Persistence to restrict MySQL persistence to this bundle only, as the rest is not useful anymore
- trying to figure out whether I can add a field to the OH table to identify the equipment, so that I can easily check at the database level (by extracting the last timestamp from the data) when each piece of equipment transmitting sensor data was last seen. It could also be done by adding an equipment ID to the JSON string, though.
- it would be nice to improve the JDBC persistence with some of this. Not sure where to start, though. I haven’t even found the code of the persistence in the Git repo.
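If the equipment ID does end up in the JSON - and from there in an extra column of the optimized table (the `equipment` column below is hypothetical) - the “last seen” check becomes a one-liner:

```sql
-- Hypothetical: requires an equipment column fed from an added JSON field
SELECT equipment, MAX(date) AS last_seen
FROM data_tsdc
GROUP BY equipment;
```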