I think my sd-card is failing. I get ext4-fs errors pointing to problematic inodes while booting (and afterwards as well). They started appearing all of a sudden, without any change to the system.
I am running a Raspberry Pi 4 with OpenHABian 1.5/OpenHAB 2.5:
openhab-distro : 2.5.2
openhab-core : 2.5.0
openhab1-addons : 1.14.0
openhab2-addons : 2.5.2
karaf : 4.2.7
I was still able to make an image of the failing sdcard with dd, and I copied the image to a new, equally sized sdcard.
With the new sdcard, booting works without issue, but during operation I get an error in the logs that seems to come from an attempt to write to the logs (the error said something about log.py, if I remember correctly - I am using Jython).
I am not sure whether the following makes sense, so I would like some input on whether my suspicion is plausible and, if so, how to test or correct it:
I suspect that on the failing sdcard the log files wore out the card, and that the file system got corrupted along with the log file (unfortunately I had not moved the logging to e.g. RAM). By making the dd image I copied all of that corruption and mirrored it onto the new sdcard. So in effect I now have a working sdcard, but with a corrupted file system/log file.
With this idea in mind, I tried running fsck on the partitions of the new sdcard (by mounting it on a Linux PC) to fix the broken file. However, fsck reports an overwhelming number of file-system errors, which makes it look as if the file system is completely broken. I think that letting fsck run to the end will mess up the file system completely (which is not a big issue, since I still have the original FS image on my PC).
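To gauge the damage without changing anything, a read-only check is one option. The following is only a minimal sketch, assuming e2fsck (from e2fsprogs) is installed and that a single ext4 partition has been extracted from the card image into its own file. The file name root-partition.img and both function names are hypothetical. The -n option opens the file system read-only and answers "no" to every repair prompt, and -f forces a full check.

```python
import subprocess


def build_readonly_fsck_command(partition_image):
    # -f: force a full check even if the file system is marked clean
    # -n: open read-only and answer "no" to all questions (no repairs)
    return ["e2fsck", "-f", "-n", partition_image]


def check_readonly(partition_image):
    # Runs e2fsck without modifying the image and returns its exit
    # status together with the report it printed.
    result = subprocess.run(
        build_readonly_fsck_command(partition_image),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Hypothetical file name: an image of one ext4 partition only.
    status, report = check_readonly("root-partition.img")
    print(report)
    # The exit status is a bit mask: 0 means no errors were found,
    # and bit value 4 means errors were found and left uncorrected.
    print("exit status:", status)
```

Running this against a copy of the image rather than the only copy keeps the original untouched in case a later repair attempt goes wrong.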
So my question is: Does the above make sense? If so: Is there any way I can recover from this without reinstalling OpenHABian and copying the files over?
One idea that came to mind is:
I could format the partitions of the new sdcard so that the file systems are clean again, and then copy the files themselves from the mounted disk image. That way any corrupted file would “just” be ignored (or I could ignore it), and the FS corruption of the disk image would not carry over. Does that make sense?
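The file-level copy idea could be sketched roughly as follows. This is an illustration under assumptions, not a tested recovery procedure: the function name and the two mount points are hypothetical, the image is assumed to be mounted read-only, and shutil.copy2 preserves permission bits and timestamps but not file ownership, so a real system copy would need extra care (and root privileges). Files that cannot be read are recorded and skipped instead of aborting the whole copy.

```python
import os
import shutil


def copy_tree_skipping_errors(src_root, dst_root):
    """Copy src_root into dst_root and return the paths that failed."""
    failed = []

    def on_walk_error(err):
        # A directory that could not be listed (e.g. a damaged inode).
        failed.append(err.filename)

    for dirpath, dirnames, filenames in os.walk(src_root, onerror=on_walk_error):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(target_dir, exist_ok=True)

        # os.walk does not descend into symlinked directories, so
        # recreate those links here alongside the regular files.
        links = [d for d in dirnames if os.path.islink(os.path.join(dirpath, d))]

        for name in filenames + links:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            try:
                # follow_symlinks=False copies a symlink as a symlink.
                shutil.copy2(src, dst, follow_symlinks=False)
            except OSError:
                # Unreadable or special file: note it and carry on.
                failed.append(src)
    return failed


if __name__ == "__main__":
    # Hypothetical mount points for the old image and the fresh card.
    skipped = copy_tree_skipping_errors("/mnt/old-image", "/mnt/new-card")
    print("Skipped", len(skipped), "entries")
    for path in skipped:
        print(path)
```

The returned list shows which files would need to be restored from a backup or a fresh install afterwards.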
By the way: I also have backups created with the OpenHAB backup script (I don’t remember the command I used). So I think I could get everything set up again based on a clean install. But I would like to avoid this if possible.
Any suggestions are greatly appreciated.