Most statements about SD card wear-out and filesystem corruption that you can read on this forum and elsewhere on the internet miss the important points, so with this post I try to demystify the topic and correct misconceptions. It is constantly updated and meant to serve as a reference for new users. You probably got directed here by me or another forum responder in response to a question or post of yours.
If you find any of the information contained here to be incorrect please let me know.
Note I’m assuming you’re using a Raspberry Pi to boot off the internal SD reader.
Information may or may not apply to other SBCs or modified RPi setups.
First, storage corruption can happen when your server loses power while writing to disk, and SD cards are particularly affected: every flash controller has some cache memory, so a completed write command does not necessarily mean that the data has actually reached the medium. Note I’m not talking about file system level handling.
Power losses happen a lot in home automation setups, particularly if you’re in a build phase and you or others are working on the electrical system of the house. Fortunately there’s a simple solution: get a UPS. Most add surge protection, too, and let you run the OH server, Internet router and other critical systems on battery for at least a couple of minutes.
If you’re using a RPi, you might use a simple powerbank as a UPS, but make sure to get one that allows charging and powering at the same time; most do not provide this capability.
Also, it must be able to deliver the full amount of power that the regular power adapter for your RPi provides. The Raspberry Pi Foundation recommends 2.5A for a RPi3 with power hungry USB peripherals, and for the RPi4 they even had to move to USB-C and 3A supplies. Common supplies deliver 1A or 2.1A at most. You usually get away with 1A, but don’t forget to factor in all your HATs and USB-attached devices, and remember that you need to size your system for peak power consumption, such as at boot or backup time, not for the average value.
Note that if underpowered, a RPi3 or older will power down the USB chipset among the first components, and since Ethernet is connected via USB, the first symptom usually is network problems.
You’ll usually get to see ‘under-voltage’ messages in syslog, too, as well as the lightning symbol on the boot screen. The power LED on newer RPIs (3, 4) will also flicker OFF while input current is insufficient.
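Besides syslog, the firmware keeps a status word that you can query with `vcgencmd get_throttled`. The following is a minimal sketch of how that value could be decoded; the function name `decode_throttled` is made up for this example, and only the under-voltage and throttling bits are handled.

```shell
#!/bin/sh
# Decode the bit mask printed by `vcgencmd get_throttled`, e.g. "throttled=0x50005".
# decode_throttled is a hypothetical helper name, not part of any Raspberry Pi tool.
decode_throttled() {
    value=$(( $1 ))   # accepts hex input such as 0x50005
    [ $(( value & 0x1 )) -ne 0 ]     && echo "under-voltage detected now"
    [ $(( value & 0x4 )) -ne 0 ]     && echo "currently throttled"
    [ $(( value & 0x10000 )) -ne 0 ] && echo "under-voltage has occurred since boot"
    [ $(( value & 0x40000 )) -ne 0 ] && echo "throttling has occurred since boot"
    [ "$value" -eq 0 ]               && echo "no power or throttling problems reported"
    return 0
}

# Only query the firmware when vcgencmd is available (i.e. on a Raspberry Pi).
if command -v vcgencmd >/dev/null 2>&1; then
    raw=$(vcgencmd get_throttled)
    decode_throttled "${raw#throttled=}"
fi
```

The "has occurred" bits stay set until the next reboot, so they also reveal short voltage dips you did not notice when they happened.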
SD and other media
Second, with maybe one exception (see #1 below), there’s no way to increase the reliability of an SD card. They suffer from wear-out leading to corruption, and you can do little about it. Even worse, this is not an SD thing but a memory chip thing. The same technology is used in USB sticks, eMMC cards and even SSDs, so the following applies even if you use one of those.
There’s some variance w.r.t. error-free runtime: some cards, models or brands are better than others, but all but the cheapest SD cards already do wear levelling to some extent. Read on if you’re interested in details. There are also cards tagged ‘industrial grade’ that have a larger spare buffer built in.
So do SSDs, but these have a DRAM cache in addition, which effectively results in relatively few writes to flash memory. That is why they’re way less affected than SD and eMMC.
Unfortunately, all of this ignores the fact that once set up, openHAB keeps writing to the same files again and again, in rapid succession. Wear levelling may not apply or may simply not be enough in this case.
For what it’s worth, contrary to what many believe, SD size is not a good indicator of buffer size: twice the size does not imply twice the buffer. Wear levelling algorithms are proprietary and undisclosed, and you never know how large your safety buffer is (no vendor tells). And even cards with a large buffer fail at some point in time. Don’t get fooled by advice to buy a specific card because some guy told you he’s been running it without problems. He was just lucky.
And while we’re at it: get an “A1” or even “A2” rated SD card. It’s not more reliable but considerably faster than traditional ones under random access conditions such as those we have in openHAB(ian).
If you shop on Amazon, use smile.amazon.com and select openHAB foundation to donate to. Thanks !
Either way, selecting a ‘better’ card or ‘proper’ medium is no solution to the corruption problem.
You need to take a complementary measure (#2 below).
There are two really useful things you can do to fight corruption:
- reduce write operations (to SD or flash memory in general)
Ideally, put persistence, logs and swap into RAM and sync them to a permanent medium.
You can use any permanent medium (USB stick, SSD or NFS mount on NAS) to put these on.
Losing RAM (on reboot) or the medium with these files is not critical. openHAB usually keeps working, and you can restore them from backup.
Corruption of the system and of the data you need to keep, on the other hand, is critical.
In a nutshell: use ZRAM.
That’s a RAM disk with compression for swap and the most active directories.
See this thread.
I recommend keeping the existing swap as a fallback solution. Note the ZRAM swap is created with a higher priority so it’s used first.
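To verify that the ZRAM swap really is preferred over the fallback swap, you can look at the priorities the kernel reports in `/proc/swaps`. Here is a small sketch that lists swap areas with the highest priority first; `list_swaps_by_priority` is a name chosen for this example only.

```shell
#!/bin/sh
# List swap areas sorted by priority, highest first.
# $1: file in /proc/swaps format (defaults to /proc/swaps itself).
# list_swaps_by_priority is a hypothetical helper name.
list_swaps_by_priority() {
    # Skip the header line, then emit "priority name" so sort can order numerically.
    awk 'NR > 1 { print $5, $1 }' "${1:-/proc/swaps}" \
        | sort -rn \
        | awk '{ print $2, "priority=" $1 }'
}

# Print the live configuration when running on Linux.
if [ -r /proc/swaps ]; then
    list_swaps_by_priority
fi
```

The ZRAM device should show up in the first line of the output. `swapon --show` prints the same information in table form.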
Adding a mount option like commit=60 (the ext4 journal commit interval in seconds) to the relevant entries in /etc/fstab will result in files being written only every 60 seconds, greatly reducing the number of writes, but note it doesn’t apply to swap or NFS.
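The sketch below assumes that the option meant here is the ext4 mount option `commit=` (with `commit=60` for a 60 second interval). The fstab line in the comment is a placeholder, not a recommendation for your device names, and `commit_interval` is a helper name invented for this example. It reads the mount table to show whether the option is active for a given mount point.

```shell
#!/bin/sh
# Example /etc/fstab entry (device and the other options are placeholders):
#   /dev/mmcblk0p2  /  ext4  defaults,noatime,commit=60  0  1

# Print the commit interval of an ext4 mount, or "default" if none is set.
# $1: mount point, $2: file in /proc/mounts format (defaults to /proc/mounts).
# commit_interval is a hypothetical helper name.
commit_interval() {
    awk -v mp="$1" '$2 == mp && $3 == "ext4" {
        n = split($4, opts, ",")
        interval = "default"
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^commit=/) interval = substr(opts[i], 8)
        print interval
        exit
    }' "${2:-/proc/mounts}"
}

# Show the setting of the root filesystem when a mount table is available.
if [ -r /proc/mounts ]; then
    commit_interval /
fi
```

On ext4 the default interval is 5 seconds, so "default" in the output means the option has not taken effect for that mount.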
Moving write-intensive files is a small one-time effort and will greatly reduce the risk of a crash caused by SD card corruption, but it won’t fully eliminate it. The worst thing that can happen is that you lose logs because that medium fails, and eventually this makes OH hang so you need to restart it.
The much more important point is that offloading logging and persistence is not sufficient all by itself. So either way, you also need to
- make daily backups
This will not increase runtime, but it will mitigate the impact of a crash of the SD card (or USB stick, USB-attached SSD or other disk) or of an accidental admin error.
openHABian now comes with Amanda, a professional backup system.
You might be unaware that openHABian is not just a RPi disk image: it is a set of scripts that can be installed on top of any Debian-like UNIX as well. Once you have installed these, you don’t have to migrate to an openHABian based setup; you can choose to install only some of the optional components such as Amanda.
Use the new auto backup feature in openHABian to clone your SD card right at installation time or via menu option. In case of a crash, you just need to exchange cards and you are good to go.
Find below what it effectively does so you can execute it manually as well. But you will forget about doing things when they’re not automated, so only ever use this in addition to an automated solution.
The manual way is to attach a USB card writer and clone the SD card. Basically the command to do so is something like
dd if=/dev/mmcblk0 of=/dev/sdX
X depends on your HW setup, so it’s potentially changing when you attach/detach USB devices.
fdisk -l will list your current devices.
If you have storage mounted, you can also directly send the output file there (of=/path/to/file) so you have a backup.
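Since a mistyped target device can overwrite the wrong disk, it may help to assemble the dd command first and look at it before running it. This is only a sketch: `clone_cmd` is a name made up here, and the `bs=4M`, `conv=fsync` and `status=progress` arguments are additions of mine (GNU dd) that are not part of the command above.

```shell
#!/bin/sh
# Build (but do not run) a dd command that clones the SD card.
# $1: source device, $2: target device or image file.
# clone_cmd is a hypothetical helper; the extra dd options are assumptions:
#   bs=4M           copy in larger blocks than the 512 byte default
#   conv=fsync      flush the output to the medium before dd exits
#   status=progress show how far the copy has got
clone_cmd() {
    src=$1
    dst=$2
    if [ -z "$src" ] || [ -z "$dst" ]; then
        echo "usage: clone_cmd SOURCE TARGET" >&2
        return 2
    fi
    if [ "$src" = "$dst" ]; then
        echo "refusing to clone $src onto itself" >&2
        return 1
    fi
    echo "dd if=$src of=$dst bs=4M conv=fsync status=progress"
}

# Print the command for review; /dev/sdX is the same placeholder as above.
clone_cmd /dev/mmcblk0 /dev/sdX
```

Check the device names with `fdisk -l` or `lsblk` right before you run the printed command as root, and pass a file path as the second argument to write an image to mounted storage instead.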