I need help/input on recovering from a failing SD card

I think my sd-card is failing. I get ext4-fs errors pointing to problematic inodes during boot (and afterwards as well). They started all of a sudden, without any change to the system.

I am running a Raspberry Pi 4 with OpenHABian 1.5/OpenHAB 2.5:

openhab-distro : 2.5.2
openhab-core : 2.5.0
openhab1-addons : 1.14.0
openhab2-addons : 2.5.2
karaf : 4.2.7

I was able to still make an image of the failing sdcard with dd and copied the image to a new equally-sized sdcard.
With the new sdcard booting works without issue, but during operation I get an error in the logs, which seems to have come from it attempting to write to the logs (the error said something about log.py, if I remember correctly - I am using Jython).

I am not sure whether the following makes sense, so I would like some input on whether my suspicion is plausible and, if so, how to test or correct it:

I suspect that on the failing sdcard the log files wore out the card, so the file-system got corrupted along with the files (unfortunately I had not moved the logging to e.g. RAM). By taking the dd image I copied any corruption along with the system and mirrored it to the new sdcard. So in effect, I now have a working sdcard, but with a corrupted file-system/log-file.
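
To illustrate why the corruption would carry over: a block-level copy reproduces every block verbatim, damaged file-system metadata included. The Python sketch below is only an illustration of that idea, not what dd does internally. `copy_blocks` and its parameters are made-up names, and zero-filling unreadable blocks is an assumption modelled on dd's `conv=noerror,sync` options.

```python
import os


def copy_blocks(src_path, dst_path, block_size=4 * 1024 * 1024):
    """Copy src to dst block by block, zero-filling blocks that fail to read.

    Returns the byte offsets of the blocks that could not be read.
    """
    bad_offsets = []
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        total = src.seek(0, os.SEEK_END)  # size of the image file or device
        offset = 0
        while offset < total:
            want = min(block_size, total - offset)
            try:
                src.seek(offset)
                chunk = src.read(want)
            except OSError:
                # Unreadable block: remember it and write zeros instead.
                chunk = b""
                bad_offsets.append(offset)
            # Pad short or failed reads so later data keeps its offset.
            dst.write(chunk.ljust(want, b"\0"))
            offset += want
    return bad_offsets
```

Note that nothing in this loop looks at files or directories: whatever state the ext4 structures were in on the old card is the state they have on the copy.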

With this idea in mind, I have tried running fsck on the partitions of the new sdcard (by mounting it on a Linux PC) to fix the broken file. However, fsck reports an overwhelming number of file-system errors, which makes it seem like the file-system is completely broken. I think that running fsck to the end will mess up the file-system completely (which is not a big issue, since I still have the original FS-image on my PC).

So my question is: Does the above make sense? If so: Is there any way I can recover from this without reinstalling OpenHABian and copying the files over?

One idea that came to mind is:
I could format the partitions of the new sdcard so that the file-systems are clean again and then copy the files themselves from the mounted disk-images. That way any corrupted file would “just” be ignored (or I could ignore them) and the FS-corruption of the disk-image would not carry over. Does that make sense?
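
As a rough sketch of that file-level idea (as opposed to a block-level image), the Python below walks a mounted source tree and copies what it can, collecting the paths that fail instead of aborting. `copy_tree_skip_errors` is a hypothetical helper, not an existing tool. It does not preserve ownership or special files, so for a real root file-system something like `rsync -a` run as root would be the more usual choice.

```python
import os
import shutil


def copy_tree_skip_errors(src_root, dst_root):
    """Copy files from src_root to dst_root, skipping anything unreadable.

    Returns the source paths that could not be copied.
    """
    failed = []

    def note_walk_error(err):
        # Directories that cannot be listed are recorded, not fatal.
        failed.append(err.filename)

    for dirpath, dirnames, filenames in os.walk(src_root, onerror=note_walk_error):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(target_dir, exist_ok=True)

        # os.walk does not descend into symlinked directories; recreate the links.
        for name in dirnames:
            src = os.path.join(dirpath, name)
            if os.path.islink(src):
                try:
                    os.symlink(os.readlink(src), os.path.join(target_dir, name))
                except OSError:
                    failed.append(src)

        for name in filenames:
            src = os.path.join(dirpath, name)
            try:
                # copy2 keeps timestamps and mode bits, but not ownership.
                shutil.copy2(src, os.path.join(target_dir, name),
                             follow_symlinks=False)
            except OSError:
                failed.append(src)
    return failed
```

The returned list shows which files were lost, but a file that copies without an I/O error can still contain damaged data, so this does not prove the result is sound.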

By the way: I also have backups created with the OpenHAB backup-script (don’t remember the command I used). So I think I could get everything set up again based on a clean install. But I would like to avoid this if possible.

Any suggestions are greatly appreciated.

Best,
Michael

Stop messing with copies of what may already be corrupted, you’re wasting your time.
Install openHABian to a fresh SD and use openhab-cli restore.
And enable ZRAM this time.

I can’t comment on your idea, as it sounds like you know more about Linux than I do. But even without knowing if it will work, I think you’d be better off just doing a fresh install and copying over a backup from your failing system. The alternative is to put in a lot of work while always leaving some doubt that your system is working properly. That just makes future troubleshooting much harder, because it’s impossible to be certain that the problem doesn’t stem from the previously failed SD card.

Good luck!

What the previous speakers said :wink:
I would look into adding an SSD and only keep the boot partition on the SD card.
openHAB produces quite heavy log files even in info mode.
If you turn on debug for zwave or zigbee, the SD card will get a very intensive write load that it is not designed to handle.

ZRAM takes care of that without needing an SSD. Nothing wrong with adding an SSD, but it’s easy to enable ZRAM in openhabian-config and just as effective at avoiding SD writes. Plus, it doesn’t cost anything.

If some files are missing, the system won’t boot or it will misbehave in unpredictable and unreproducible ways. I experienced several SD failures and every time it was different (and as I was less experienced, it took a while to realize it was a failure).
mstormi suggested the fastest way to recover, possibly with a new SD card.
After I connected my raspberry pi v3b to a powerbank UPS (capable of simultaneously charging and delivering power without interruption) and activated ZRAM (in recent openHABian versions it is activated by default), I never experienced corruption again. There is a very informative thread, opened by mstormi, that is certainly worth reading.

Good luck

I agree that ZRAM is the way to go for all lightweight RPi installations.
I put log2RAM on my RPis that serve various other purposes (not openHAB) because it is easy to set up.
However, with, say, 40 things (mainly zwave and/or zigbee), mosquitto, influxdb and grafana, there is no doubt that one is well into SSD territory.
If one can live without a db and grafana and restrict persistence, fine :slight_smile:

openHABian includes InfluxDB/Grafana, too, and if you make heavy use of it you can easily extend ZRAM to apply to the data directories it’s using. Or any other, FWIW.

Thanks for the input to everyone.

Given these two comments, I agree that it makes more sense to put in the extra effort now to restore everything, even if it hurts.

So one more question:

Should I restore with the version I was running (which was 2.5.2) and then upgrade to the current version (which I now see is already at 2.5.6)? Or is it better to install 2.5.6 directly and try restoring my backup to that instead (also because ZRAM is enabled by default)?

The former seems probably safer, because I know my setup should work as is. But then the upgrade might turn out “painful”(?).

Thanks for any suggestions.

Start with what openHABian will install (2.5.6-2, that is). It should work, as AFAIK there have not been any breaking changes since 2.5.2.

Thanks for this input!

Could you point me to the thread by mstormi?

Regarding the UPS:
This sounds interesting. Does this work with all powerbanks or is this special hardware? Could you maybe tell me the type that you are using, so that I have something to go on? Do you know if yours would also work for my Raspberry Pi 4 model B?

ZRAM status
BTW ZRAM is activated on new openHABian installations.

It is neither. Some powerbanks allow for this. Just pay attention when you order.
Also get one capable of 3A (most allow for 2.1 or 2.4A only).

PS: and if you go for Amazon, use smile.amazon.com and select openHAB foundation to donate to

The other thread I was referring to was this: Corrupt Filesystem every 2-3 months.

That’s an important point. In my case (a raspberry pi v3b) the original power supply is rated 2.5A, but I have to use the 2.1A port of the power bank (with a very short USB cable).

Three years ago I bought this powerbank: it delivers power whether or not it is being charged, and there is no interruption during a power outage. This feature is not mentioned in the specifications and I’ve read somewhere that these powerbanks are not designed to work in this way. My raspberry can run up to 24h (at least when it was new, now probably less): I can disconnect the powerbank from the original power supply and take the raspi with me to a location where I can attach it to a screen.
Alternatively there are mini-UPS units, still based on lithium batteries. I bought one like this for a 12V router; it should also have a 5V output, but I do not know if the rating is enough for the Raspberry Pi 4, which is more power hungry than the raspberry pi v3b. I sent it back as it was not able to power my fritzbox router (but it worked with a less power-hungry access point).