Part One: Deleting Old Values
Based on examples in this Thread I put all hints and further usefull things together:
If you want to cleanup Old Item-Data all code can be send directly in SQL-Database. For jdbc-mysql-persistence all Commands have to be send in SQL (for example in SQL Window in phpmyadmin). has to be changed to your exact table name, you can see in phpmyadmin.
Following Examples are tested with OH 2.5 SNAPSHOT 1680:
1.1) Alter Table for Items to set maximal history of data as last column
ALTER TABLE `OPENHAB2-TABLE`.`items`
ADD COLUMN `deleteafterdays` INT(11) NOT NULL DEFAULT -1;
Explanation: Adds a new column where you can enter after how many days the data will be deleted. -1 means: No data will deleted, 0 means all Data except newest value will be deleted, values from 1 to x means all values older than x days except newest value will be deleted.
1.2) Create procedure can be scheduled directly in SQL-Database
DELIMITER $$
CREATE PROCEDURE `OPENHAB2-TABLE`.`DeleteOldItemValues`()
BEGIN
# declare variables
DECLARE var_interval VARCHAR(20);
DECLARE var_table, var_itemname VARCHAR(500);
DECLARE var_tableId INT(4) zerofill DEFAULT 0;
DECLARE var_done, var_garbage, var_noEntry INT DEFAULT 0;
# This builds an cursor on all ItemIds, which are marked as "garbage"
DECLARE myCursor CURSOR FOR SELECT ItemId, itemname, deleteafterdays FROM items WHERE deleteafterdays >= 0;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET var_done = 1;
OPEN myCursor;
# go through all ItemIds and delete old values
read_loop: LOOP
FETCH myCursor INTO var_tableId, var_itemname, var_garbage;
IF var_done THEN
LEAVE read_loop;
END IF;
SET var_interval = concat(var_garbage, ' DAY');
SET var_table = concat(lower(var_itemname), '_', var_tableID);
SET @latest_time = concat('SELECT max(time) INTO @latest_var FROM ', var_table);
PREPARE psmt FROM @latest_time;
EXECUTE psmt;
DEALLOCATE PREPARE psmt;
# small workaround as variables won't just fit in as table names - so we have to concat the table_name and then we can PREPARE and EXECUTE the statement
IF @latest_var is not null THEN
SET @sql_text = concat('DELETE FROM ', var_table, ' WHERE time < (NOW() - INTERVAL ', var_interval, ') and time <> "', @latest_var, '"');
PREPARE stmt FROM @sql_text;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
END IF;
END LOOP;
CLOSE myCursor;
END$$
Explanation: This SQL-Script is stored in Database and can be used by SQL-Scheduler / EventHandler. If you want to change this script later, you can find it in phpmyadmin at database OPENHAB2-TABLE under procedures (To see after at least on procedure is stored in database).
1.3) Activate Event-Handler in SQL-Database
SET GLOBAL event_scheduler = ON;
Explanation: Turns on event handler in sql-Database. For more information just google for mysql event handler
1.4) Create “Cron”-Job for calling cleanup-OldValues
CREATE EVENT cleanup_OldValues ON SCHEDULE EVERY 1 DAY STARTS '2018-05-02 00:00:00' DO call DeleteOldItemValues();
Explanation: Starts Procedure to delete old Data every midnight. If you want to change this CRON later, you can find it in phpmyadmin at database OPENHAB2-TABLE under events (To see after at least on event is activated database).
1.5) Find configured and unconfigured delete-times with amount of stored values
SELECT A.itemid, A.itemname, A.deleteafterdays, B.TABLE_ROWS from openhab2.items as A, `information_schema`.`TABLES` AS B WHERE concat(A.itemname,'_',LPAD(A.ItemID,4,'0')) = B.TABLE_NAME COLLATE utf8_general_ci AND B.`TABLE_SCHEMA` LIKE 'openhab2' ORDER BY TABLE_ROWS DESC
Explanation: Requests a list with item id, item name, configured value to delete old data and ammount of stored data to priority times to delete in dependence of used storage
Changes done to original Script
- Changed column names to actual names (uppercase, lowercase,…)
- ‘Speaking name’ for column to configure days after which data is deleted
- Changed script to get right data-table-names for item-data, now with lowercased itemname, underscore and id with zerofill
- Changed length of itemnames, now similar to data-table
- Using -1 for unconfigured delete-times, so Option available to delete everything except last value
- integrated and corrected code of emilfelix (missing double-quotes) to except newest value equal of age
- UPDATE 07.05.2019 - Changed Procedure to prevent error if DeleteData configured and Data-Table is empty
- UPDATE 11.07.2019 - Added SQL-Command for phpmyadmin to find configured values in dependency of stored data
- UPDATE 10.11.2020 - Just minor change of cleanup-event name
- UPDATE 15.11.2020 - editorial changes
Part Two: Deleting tables for inactive items
ADDED 10.11.2020 Please read Part One first, explainations already given there are not repeated.
If you want to cleanup old tables from unuses / now inactive Items, this could be with a mix of SQL and some minor changes in openhab2 (item-configuration and one new rule)
Be careful to avoid data loss or openhab2 didn’t work anymore. All scripts tested on my own openhab2 V2.5 Release System with jdbcSQL Persistence
2.1) Alter Table for Items to set maximal inactive time until tables and item is deleted as last column
ALTER TABLE `OPENHAB2-TABLE`.`items`
ADD COLUMN `inactivecycles` INT NOT NULL DEFAULT 0
Explanation: Adds a new column where the cycles (in my case monthly) are stored, while the item is inactive.
2.2) Create procedure can be scheduled directly in SQL-Database
DELIMITER $$
CREATE PROCEDURE `OPENHAB2-TABLE`.`DeleteOldItems`()
BEGIN
# declare variables
DECLARE var_table, var_itemName VARCHAR(500);
DECLARE var_itemID INT(4) zerofill DEFAULT 0;
DECLARE var_done, var_inactivecycles, var_noEntry INT DEFAULT 0;
# This builds an cursor on all ItemIds, which are marked as "garbage"
DECLARE myCursor CURSOR FOR SELECT ItemId, itemname, inactivecycles FROM items WHERE inactivecycles > 6;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET var_done = 1;
OPEN myCursor;
# go through all ItemIds and delete old ValueTable and then Item from Items_table
read_loop: LOOP
FETCH myCursor INTO var_itemID, var_itemName, var_inactivecycles;
IF var_done THEN
LEAVE read_loop;
END IF;
SET var_table = concat(lower(var_itemName), '_', var_itemID);
# Drops table for inactive item:
SET @sql_text = concat('DROP TABLE ', var_table);
PREPARE stmt FROM @sql_text;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
# Deletes entry for inactive Item from items-table
SET @sql_text = concat('DELETE FROM items WHERE items.ItemId = ', var_itemID);
PREPARE stmt FROM @sql_text;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
END LOOP;
CLOSE myCursor;
END$$
Explanation: In my case the inactivecycles must be higher then 6 (month). You can change in 4th line starting with DECLARE to change it.
2.3) Activate Event-Handler in SQL-Database (only if not activated with Part One 1.3)
SET GLOBAL event_scheduler = ON;
Explanation: Look at 1.3
2.4) Create “Cron”-Job for calling cleanup-OldItems
CREATE EVENT cleanup_OldItems ON SCHEDULE EVERY 1 DAY STARTS '2020-11-10 23:55:00' DO call DeleteOldItems();
Explanation: Look at 1.4
2.5) Add a group for all items in openhab2-configuration.
Example for groups.items:
Group gAll
Explanation: This group all items has to be in is used to lookup if item is active in following rule.
2.6) Add all items and groups to this group
Examples:
String Itemname "Label" <icon> (gAll)
Group Groupname "Label" <icon> (gAll)
Work carefully at this step: Missing Item in this group can cause data loss and openhab2 problems at restart. Be sure to add group at every item and group!
2.7) Add rule, which raises inactive cycle periodically and mark active items with -1 in field inactive cycles
Rule SQLmaintenance.rules:
var mysql_host = "localhost"
var mysql_user = "MYSQL-USER-OPENHAB"
var mysql_pw = "MYSQL-USER-PASSWORD"
var mysql_db = "openhab2"
rule "Mark inactive Items in SQL"
when
// Monthly every 9th at 23:45
Time cron "0 45 23 9 * ? *"
then
logInfo("SQL-Maintenance", "Increase cycle item is inactive for all items in SQL")
var String sql_query = "/bin/sh@@-c@@mysql -h " + mysql_host + " -u " + mysql_user + " -p" + mysql_pw + " -e \"UPDATE items set inactivecycles = inactivecycles + '1';\" -D " + mysql_db
var String mysql = executeCommandLine(sql_query, 5000)
logInfo("SQL-Maintenance", "Mark openhab2-items as active (-1) in SQL now")
gAll?.members.forEach [activeItem|
sql_query = "/bin/sh@@-c@@mysql -h " + mysql_host + " -u " + mysql_user + " -p" + mysql_pw + " -e \"UPDATE items set inactivecycles = '-1' WHERE itemname = '" + activeItem.name + "';\" -D " + mysql_db
mysql = executeCommandLine(sql_query, 5000)
]
end
Explanation: First Part increases every time the rule fires the field inactivecycles. Second Part do a loop through all items and groups in gAll and overwrite inactivecycles with -1 (-1 means active item)
Be sure to check the file rights for the rule because of listed mysql-Credentials. Should be openhab:openhab 640 to avoid reading through others.
2.8) Check inactive items to be sure all active are in group gAll
If you didn’t change cycle you now have 6 month time to check the table items in mysql. All Items with value > 0 will be deleted and their tables locked if the entry reaches 7. If you want to hurry up: Just change rule and fire it minutly two or three times and set it back to month. then you can check immediatly.
To check look items with inactivecycles > 0 and do a search over your item configuration for the item name.
- UPDATE 15.11.2020 - editorial changes, deleted double explanation to Part One
Part Three: Deleting unneeded values
ADDED 15.11.2020 Please read Part One first, explainations already given there are not repeated.
Example:
You historicize values at changes and minutly to get clear changes in graphs. If you have a) 1, b) 1, c) 1, d) 1, e) 2, f) 2, g) 3, so all would look same with only a) 1, d) 1, e) 2, f) 2, g) 3. The following deletes double values in mysql-table except first and last one with same value.
Because of performance-reasons the procedure will only cleanup part of the table every time it runs, starting with newest data, maximal the first 200 values. So prefer set the trigger to every hour. If new data is cleared, old deleted, then after some time your data will be fine.
Be careful to avoid data loss or openhab2 didn’t work anymore. All scripts tested on my own openhab2 V2.5 Release System with jdbcSQL Persistence
Additional information: after one week of running, amount of data rows for openhab2 went down from 9.7m to 4.2m, in storage from 360MB to 140 MB.
3.1) Alter Table for Items to configure this function
ALTER TABLE `OPENHAB2-TABLE`.`items`
ADD COLUMN `deletedouble` INT NOT NULL DEFAULT 0
Explanation: Adds a new column where you can configure which tables should be considered (value manually has to be set to 1).
3.2) Create procedure can be scheduled directly in SQL-Database
DELIMITER $$
CREATE PROCEDURE `OPENHAB2-TABLE`.`DeleteDoubleValues`()
BEGIN
# declare variables
DECLARE var_table, var_itemname VARCHAR(500);
DECLARE var_tableId INT(4) zerofill DEFAULT 0;
DECLARE var_done INT DEFAULT 0;
# This builds an cursor on all Item-Tables
DECLARE myCursor CURSOR FOR SELECT ItemId, itemname FROM items WHERE deletedouble=1;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET var_done = 1;
OPEN myCursor;
# go through all ItemIds and delete double and unneeded values (previousValue and next value are the same)
read_loop: LOOP
FETCH myCursor INTO var_tableId, var_itemname;
IF var_done THEN
LEAVE read_loop;
END IF;
SET var_table = concat(lower(var_itemname), '_', var_tableID);
select concat('DEBUG - ', 'Bearbeite Tabelle: ', var_table) AS 'Debug:';
SET @del_sql = concat('DELETE del FROM ', var_table, ' AS del JOIN (SELECT * FROM (((SELECT ROW_NUMBER() OVER(ORDER BY time DESC) AS rowA, time as timeA, value as valueA from ', var_table, ' LIMIT 200) as A JOIN (SELECT ROW_NUMBER() OVER(ORDER BY time DESC) AS rowB, value as valueB from ', var_table, ' LIMIT 200) as B ON A.rowA=B.rowB+1 ) JOIN (SELECT ROW_NUMBER() OVER(ORDER BY time DESC) AS rowC, value as valueC from ', var_table, ' LIMIT 200) as C ON A.rowA=C.rowC-1 ) where A.valueA=B.valueB and A.valueA=C.valueC) as indel ON del.time=indel.timeA');
PREPARE psmt FROM @del_sql;
EXECUTE psmt;
DEALLOCATE PREPARE psmt;
END LOOP;
CLOSE myCursor;
END$$
Explanation: For performance reasons only first 200 entries are considered. If you want to do a bigger block each run, you have to change LIMIT 200 three times in the Procedure. But be carefull, raise this will increase mysql exponential. You can test increase after some time if all existing tables are optimized.
3.3) Activate Event-Handler in SQL-Database (only if not activated with Part One 1.3 or 2.3)
SET GLOBAL event_scheduler = ON;
Explanation: See 1.3
3.5) Create “Cron”-Job for calling cleanup-OldItems
CREATE EVENT cleanup_OldItems ON SCHEDULE EVERY 1 HOUR STARTS '2020-11-10 23:55:00' DO call DeleteDoubleValues();
3.6) Optional: Check running time of procedure
Following commands could be used to check runtime (Bash and mysql):
bash:#/ mysql -u USER -p
mysql> connect openhab2;
mysql> call DeleteDoubleValues();
Explanation: Logs in to mysql, then connect to openhab-Database. Then it calls the procedure and you have an incrementing value of seconds you can see the full runtime. User and Database name must be changed to yours.
- UPDATE 15.11.2020 - Added information about check the running time
- UPDATE 19.11.2020 - Added Feedback
Hope you like it, hints, failures or wishes just per personell message to avoid tutorial will get lost in thread
Open wishes
- Waiting for further interesting ideas to keep sql perfomant and cleaned up Just write per personally message.