OH2 - saving SD Card Writes


Been there, done that! :laughing:
Pulling my hair out a whole night... but that was when I was running the Pi from an SD card (and a premium one as a matter of fact, no cheap one) and that was the moment when I decided to go this path with the USB disk and whole system backup. I even considered, for a very brief moment, a PiDrive or an SSD :slight_smile:

I have moved to SSD after having lots of issues with SD cards. Touch wood the SSDs have been working flawlessly for about 12 months.

https://melgrubb.com/2016/12/11/rphs-v2-hdd/

Is the PiDrive a good solution? Does it work well and reliably?

@WayneStronach, @Lolodomo and others:
You're missing the point. While SD cards are the default and more likely to get hit than e.g. SSDs, ANY device can crash.
You MUST have a recovery concept.
And if you do, you don't need to use an SSD or HDD any more; you can stick with SD as well.
Getting back to what this thread is about, the only thing I'd consider is to move volatile data to a separate medium such as a USB stick.

Actually I think you are missing the point - SD cards are very unreliable when used on RPIs with Openhab. I understand that this is because Openhab writes and rewrites to the same place very quickly - but whatever the reason, I have never had more than about 6 months before the cards crash and get errors. I have used all different types, speeds and qualities and had the same result.

The SSDs, on the other hand, have worked 100% for about 12 months.

For reliability, I use two RPIs both running and logging the same data, but with only one allowed to write to outputs (and run rules etc). I regularly back both RPIs up to another hard disk. I have also been working towards using a more reliable system using https://www.howtoforge.com/setting-up-a-high-availability-load-balancer-with-haproxy-keepalived-on-debian-lenny

To my knowledge, the PiDrive is not an SSD. We can expect a good lifespan.
For my part, so far I am only saving my OH setup file manually.

Feel free to invest time and money in your private data center, it's fun (hopefully) and does not do harm.
Redundant HW, loadbalancers, virtualization ... don't get me wrong, I keep building this stuff in my other (professional) life each day.
But you can't expect the average openHAB user to be able to or willing to do that at home.

This being a public smart home user forum, and since I try to give advice in my posts that applies to as many users as possible, we need a solution that is as simple and as cheap as possible.

The point is not how reliable SDs ultimately are (your personal experience does not make your statement a generally valid statement, and if you read properly, while I don't agree I'm not saying you're wrong either, I also advocate for offloading writes to another medium).
The point is: no matter if SD or SSD, I need a recovery concept, and when I have one such as a clone SD card, I needn't be afraid of a crash.

However, the less HW and SW the better. Actually I just swap my SD card against the clone card I have prepared and I'm good to go in less than 5 minutes. Even my mom can be instructed to do that.
That's a simple, cheap solution applicable to everyone, and great MTTR and even WAF, too :slight_smile:

Except that you lose all your data when you swap SD cards (at least back to the point at which you cloned it). I tried running clones for a while but found that the SD card corruption occurred slowly and this meant that the clone became of little real use.

Agree that my personal experience doesn't mean much, however if you search these forums you will see that my experiences with openhab and RPI/sd cards are not unique.

Well, on average you don't have daily changes any more once you've arrived at a working smart home setup, so restoring a, say, one-month-old version in case of emergency is fine, too.
Then again, that of course was only part of the story. You also should have a daily file based backup mechanism in place to catch at least the changes since you last cloned.
I'm using a backup tool to combine both of these methods. See GitHub issue link above.

What are the symptoms of SD card corruption? I've been running fine on the same oversized card for around 2 years so far without any issues I know of.

On that one ... since a Pi only has 1GB of RAM, half of which is usually in use for OH, do you, and if so how do you, actually ensure that too much logging etc. will not result in a RAM shortage followed by swapping?

Logrotate and minimising excessive logging mainly. I don't for example run access logs unless I'm troubleshooting. The events.log will be turned off when I work out how to do it.

I don't understand. A filesystem is either intact or it is not. You should ensure it is (fsck, journal verification) before you clone.
But then you restore to a different HW with a completely different (much lower) level of 'worn-out-ness', i.e. in comparison it's a setback in terms of corruption level, and thus an increase in remaining TTL.

My question was whether there's something like a recommended value for the tmpfs 'size' parameter to restrict RAM usage...

I'd have to look tomorrow - it's night time here.
I basically use a version of the Open Energy Monitor configuration with the bits I don't need taken out. The build card is on GitHub, look for the EMONPI image build guide. It runs mainly read only with an extra partition for data with access times turned off. Also many things are sent to tmpfs. I run their 4GB image expanded to 8GB on a 32GB Sandisk Extreme card with the remaining space free. It's worked very well so far with zero corruption despite several unplanned power outages.

From what Iā€™ve seen and seen reported here:

  • system freezes and remains unresponsive, particularly after a reboot
  • system randomly crashes and reboots
  • writes don't actually get saved on disk permanently (e.g. you change a file and the next day your changes are gone)
  • random service restarts

I currently have a failing SD card in one of my Pis (I don't run OH on a Pi but some things are controlled by remote Pis and the one that is failing is the only one I haven't configured to be read only yet) and I see that it is failing through Tripwire. I get a report showing a bunch of changes. I update the Tripwire DB to accept those changes. Run it again and the changes are still showing up. So either something is constantly changing my config or my writes are not being saved. I've also seen a few manual config file edits disappearing after a while.

I just haven't gotten around to rebuilding it, it has been a very busy month.
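For anyone without Tripwire, the symptom described above (changes that keep reappearing, or edits that vanish) can be watched for with a much simpler checksum comparison. The following is only a rough Python sketch and not a Tripwire replacement; the function names `snapshot` and `changed_files` are made up for illustration. Note that a check run right after a write may be answered from the page cache, so it is most telling when the second snapshot is taken after a reboot.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of one file."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def snapshot(root: Path) -> dict:
    """Map every file under root (by relative path) to its digest."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old: dict, new: dict) -> list:
    """List paths that were added, removed, or modified between two snapshots."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))
```

Take one snapshot of the config directory after editing, keep it somewhere off the card, and compare it with a fresh snapshot later. If files you did not touch show up in `changed_files`, or your own edits have reverted, the card is a likely suspect.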

Personally, just to point to an alternative approach to backup/restore, I use git to store my configs and Ansible playbooks to build up the machine. When I need to restore one of my Pis I just need to put the stock Raspbian image on it with the wifi config file added so it joins the network and then run my ansible playbooks. It probably takes a little longer than a restore of an SD card image but I get the added benefit of having my Pi's config fully documented and configuration controlled and I don't have to do any maintenance to make sure I have recent backups.

But, unlike most, I have no permanent state on these Pis and therefore lose nothing in a crash. I even offload the logs using rsyslog. And I only have to recover the config to be back up and running. And of course I back up my git server where my configs and ansible scripts are stored.

This wouldn't work for everyone and probably wouldn't be a great choice for openHAB, but I post it to get some users thinking about potential other approaches.


Nice setup!

Maybe developing a generally applicable backup script for openhabian would be of interest to @ThomDietrich. I can help too!

BR,
George

Thanks for that, as my cards age I'll keep an eye out. Judging by the comments of others on this thread my "avoid any disk writes where possible" and use a fast SD Card with most of it unallocated approach has paid off.
Moving forward I'll keep doing image backups after major system changes, and before my weekly scheduled reboots I'll stop the openhab2 service and then copy the contents of /var/lib/openhab2 and /etc/openhab2/ onto a memory stick.
I use Open Energy Monitor for my graphing on my secondary Pi, so I'll add the Openhab backup on that one to the OEM backup routine and automate it.
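The copy step described above could be automated along these lines. This is just a minimal Python sketch under assumptions: `backup_dirs` is a hypothetical helper name, the archive naming is invented, and the openhab2 service is assumed to have been stopped beforehand so the files are not changing during the copy.

```python
import tarfile
import time
from pathlib import Path


def backup_dirs(sources, dest_dir, stamp=None):
    """Pack the given directories into one timestamped .tar.gz under dest_dir."""
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    dest = Path(dest_dir) / f"openhab2-backup-{stamp}.tar.gz"
    with tarfile.open(dest, "w:gz") as tar:
        for src in sources:
            src = Path(src)
            # Store entries without the leading slash, e.g. etc/openhab2/...
            tar.add(src, arcname=str(src).lstrip("/"))
    return dest
```

A call such as `backup_dirs(["/var/lib/openhab2", "/etc/openhab2"], "/mnt/usbstick")` would then write one archive per run; `/mnt/usbstick` is an assumed mount point for the memory stick, so adjust it to wherever yours is mounted.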
@mstormi my tmpfs settings from fstab are:

tmpfs  /tmp                tmpfs  nodev,nosuid,size=30M,mode=1777  0 0
tmpfs  /var/log            tmpfs  nodev,nosuid,size=50M,mode=1777  0 0
tmpfs  /var/lib/dhcp       tmpfs  nodev,nosuid,size=1M,mode=1777   0 0
tmpfs  /var/cache/samba    tmpfs  nodev,nosuid,size=40M,mode=1777  0 0
tmpfs  /var/lib/logrotate  tmpfs  nodev,nosuid,size=1M,mode=1777   0 0

I run a read only root, with /etc/openhab2 and /var/lib/openhab2 redirected to an ext2 partition with file and directory access time turned off.
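On the earlier question of how much RAM these mounts can take: the `size=` option is an upper bound per mount, and tmpfs only uses memory for what is actually stored, so the sum of the caps is the worst case. A small Python sketch that adds up the caps from the fstab entries above (it only understands K/M/G suffixes, not percentages, and `tmpfs_caps_mb` is a made-up helper name):

```python
import re

FSTAB = """\
tmpfs /tmp               tmpfs nodev,nosuid,size=30M,mode=1777 0 0
tmpfs /var/log           tmpfs nodev,nosuid,size=50M,mode=1777 0 0
tmpfs /var/lib/dhcp      tmpfs nodev,nosuid,size=1M,mode=1777  0 0
tmpfs /var/cache/samba   tmpfs nodev,nosuid,size=40M,mode=1777 0 0
tmpfs /var/lib/logrotate tmpfs nodev,nosuid,size=1M,mode=1777  0 0
"""

# Conversion factors to megabytes for the supported suffixes.
UNITS = {"K": 1 / 1024, "M": 1, "G": 1024}


def tmpfs_caps_mb(fstab_text):
    """Return {mount point: size cap in MB} for tmpfs lines carrying size=<n><K|M|G>."""
    caps = {}
    for line in fstab_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0].startswith("#") or fields[2] != "tmpfs":
            continue
        match = re.search(r"size=(\d+)([KMG])", fields[3], re.IGNORECASE)
        if match:
            caps[fields[1]] = int(match.group(1)) * UNITS[match.group(2).upper()]
    return caps


caps = tmpfs_caps_mb(FSTAB)
total_mb = sum(caps.values())
print(f"worst-case tmpfs RAM use: {total_mb} MB")  # prints 122 MB for the entries above
```

So even with every mount full, this setup would tie up roughly 122 MB of the Pi's 1GB.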

I've also just turned the events.log off until I need it by changing line 6 of /var/lib/openhab2/org.ops4j.pax.logging.cfg to

log4j.logger.smarthome.event = ERROR, event, osgi:*

This doesn't affect the card writes as /var/log is in RAM on my system but it does waste clock cycles. This file gets a lot of writing for anyone who hasn't redirected their /var/log folder to tmpfs; on my system it got 2.8MB in around 8 hours, most of which I was asleep for. Not the end of the world in terms of card writes, but everything adds up.
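To put the 2.8MB figure in perspective, here is a quick back-of-the-envelope calculation in Python. It assumes the overnight rate is representative of the whole day (daytime activity is probably higher) and ignores any write amplification on the card itself.

```python
MB_WRITTEN = 2.8      # figure quoted in the post
HOURS_OBSERVED = 8    # "around 8 hours"

# Scale the observed rate to a day, then to a year (1 GB = 1024 MB here).
mb_per_day = MB_WRITTEN / HOURS_OBSERVED * 24
gb_per_year = mb_per_day * 365 / 1024

print(f"{mb_per_day:.1f} MB/day, {gb_per_year:.1f} GB/year")  # 8.4 MB/day, 3.0 GB/year
```

That works out to roughly 8.4 MB a day, or about 3 GB a year, from events.log alone.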

Whilst we are a bit off topic how have you found the Razberry card? I'm awaiting some Z Wave blind controllers from Indiegogo and need to add a Z Wave interface to my Pi3 for when they arrive, I was thinking of an Aeon Labs Gen 5 Stick, but the Razberry is about half the price. Does it do all the frequencies and is it capable of just working without endless tweaking and bugs?

Well, the USB stick has one big advantage, in my opinion: you can use it with a RPi or with a regular PC. Heck, you can even use it with a NAS and an openHAB docker container or a virtual machine. With the RaZberry you have to stick to the Pi, of course. Also, as far as I know, the Aeon stick has a small battery so you can move it and perform inclusion and exclusion near the device. With the RaZberry, you'll have to bring the device to the Pi :smile:
As for your last question... never needed to tweak anything. I just plugged in the daughter board and was good to go :wink: