Corrupt FileSystems every 2-3 months?

I found this in the discussion about log2ram. Looks very interesting.

This is exactly why I learned Ansible. I am proficient in Linux and yet I can’t remember everything I do to set up my RPis and VMs. Using Ansible, all the setup becomes scripted. The scripts get checked into my source control, like I do with my OH configs. I don’t have to remember what I did because it’s all captured in the YAML. I can go back and look at the history and see how my config has changed over time. And if I have a worst-case scenario and lose everything, setting everything back up the way it was is just a matter of setting up the logins on the machines and running one command.

Here is my Ansible role for moving /var/log to a tmpfs file system (i.e. logging to RAM).

---
# tasks file for min-writes
# http://www.zdnet.com/article/raspberry-pi-extending-the-life-of-the-sd-card/
- name: Mount /tmp to tmpfs
  mount:
    path: /tmp
    src: tmpfs
    fstype: tmpfs
    opts: defaults,noatime,nosuid,size=100m
    dump: 0
    state: mounted
  become: yes

- name: Mount /var/tmp to tmpfs
  mount:
    path: /var/tmp
    src: tmpfs
    fstype: tmpfs
    opts: defaults,noatime,nosuid,size=30m
    dump: 0
    state: mounted
  become: yes

- name: Mount /var/log to tmpfs
  mount:
    path: /var/log
    src: tmpfs
    fstype: tmpfs
    opts: defaults,noatime,nosuid,mode=0755,size=100m
    dump: 0
    state: mounted
  become: yes

#- name: Mount /var/run to tmpfs
#  mount:
#    path: /var/run
#    src: tmpfs
#    fstype: tmpfs
#    opts: defaults,noatime,nosuid,mode=0755,size=2m
#    dump: 0
#    state: mounted
#  become: yes

- name: Reboot
  include_role:
    name: reboot

There are ways to collapse the above into one task but I haven’t bothered to go back and update my old playbooks yet with new things I’ve learned about Ansible.
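A minimal sketch of one way to collapse them, assuming Ansible 2.5 or later so that `loop` is available (older versions would use `with_items`). The task name and the item keys `path` and `opts` are illustrative and not taken from the original role; the mount options are the ones from the tasks above.

```yaml
---
# Sketch only: one looped task replacing the three separate mount tasks.
# The item keys "path" and "opts" are illustrative names, not from the original role.
- name: Mount write-heavy directories to tmpfs
  mount:
    path: "{{ item.path }}"
    src: tmpfs
    fstype: tmpfs
    opts: "{{ item.opts }}"
    dump: 0
    state: mounted
  become: yes
  loop:
    - { path: /tmp, opts: "defaults,noatime,nosuid,size=100m" }
    - { path: /var/tmp, opts: "defaults,noatime,nosuid,size=30m" }
    - { path: /var/log, opts: "defaults,noatime,nosuid,mode=0755,size=100m" }
```

The reboot step would stay as its own task after the loop, since it only needs to run once.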

I don’t remember why I commented out linking /var/run to a tmpfs.

The script that Russell links to does the same thing using Bash scripting, with some additions to preserve the logs periodically. I just let my logs and tmp folders disappear on a reboot unless I’m actively debugging a problem. Then I’ll enable a cron job like the script linked to above does. If I wanted to write the logs to disk periodically, I’d add a task to the role above that creates a cron job to do that.
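Such a cron task might look roughly like the sketch below. It is only an illustration: the destination directory `/var/log.persist`, the hourly schedule, and the task names are assumptions, not taken from the role or from the linked script.

```yaml
# Sketch only: copy the RAM-backed logs to persistent storage once an hour.
# /var/log.persist is a hypothetical destination; the schedule is an assumption too.
- name: Create directory for persisted logs
  file:
    path: /var/log.persist
    state: directory
    mode: "0755"
  become: yes

- name: Periodically save /var/log to disk
  cron:
    name: persist tmpfs logs
    minute: "0"
    job: "cp -a /var/log/. /var/log.persist/"
  become: yes
```

Because the task runs with `become: yes`, the entry lands in root's crontab. Keep in mind that every copy is a write to the SD card again, so the interval is a trade-off between how much log history survives a reboot and how much wear is saved.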

For anyone overwhelmed with needing to maintain more than just a couple of Linux machines, I highly recommend spending the time to learn Ansible. It’s not hard to learn and the little bit of time you spend up front will pay huge dividends in the long run. If you do it right (I don’t yet) and build it to be idempotent (i.e. no changes are made if no changes are needed), you can use the same scripts that build the system to update/upgrade it as well. I have a separate set of upgrade playbooks. But that means I can upgrade apt packages, Docker images, git-cloned software that needs to be built, my own code, etc. on all of my machines with one command. The amount of time this has saved me far outweighs the amount of time I invested learning it.
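To give an idea of what an upgrade playbook can look like, here is a minimal sketch for Debian-based hosts. It is not one of the actual playbooks mentioned above; the group name `homeservers` is hypothetical and has to match whatever your own inventory defines.

```yaml
---
# Sketch only: a minimal upgrade playbook for Debian-based hosts.
# The group name "homeservers" is hypothetical; use a group from your own inventory.
- hosts: homeservers
  become: yes
  tasks:
    - name: Update the apt cache and upgrade installed packages
      apt:
        update_cache: yes
        upgrade: dist
```

Saved under a name of your choosing (say `upgrade.yml`), it would be run with `ansible-playbook -i <your inventory> upgrade.yml`. Further tasks for Docker images or git checkouts would be added to the same `tasks` list.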


Note the recent update on zram in my main post and this new post.
I encourage everyone to have a look at https://github.com/openhab/openhabian/pull/576 and help with testing by deploying this on boxes of yours and giving me some feedback. See the last comment on GitHub for how to install. Standard disclaimer applies: use at your own risk.


Markus,

I read your post regarding “maximizing” resiliency. I do not currently have the necessary infrastructure (or means) to pursue everything you recommend.

In the interim, I have used a “better quality” (for what that’s worth) USB stick drive and moved root onto it.

I corresponded with you a few weeks ago regarding zRAM and have seen the recent posts about getting it fully incorporated, and I am waiting for the PR to be merged. I plan on turning zRAM back on as soon as that’s merged.

I am just confused overall about what all these moving parts are actually doing. I understand that both moving root to USB and using zRAM are intended to minimize the write cycles on the SD card to postpone its inevitable failure. But what pieces are where? In particular, what do I need to back up so that I can restore after a catastrophic failure?

  • I have moved root to USB
  • I will be using zRAM
  • I have a fairly simple setup. About the only thing besides openHAB and some of its add-ons is mosquitto. I was using an encrypted broker but have transitioned to using myopenhab.org instead (so no certificates to worry about).
  • I have backups of the mosquitto configuration.
  • I regularly back up /srv/openhab2-conf and /srv/openhab2-userdata (“openHABian” references)

So…

  • What remains on the SD card? Should I make image backups of it regularly or is conf and userdata enough? Is there any other recommended approach to back up the information on the SD card?
  • Do I need to make a backup of the root I moved to the USB stick? If so, what is the recommended means?

Many thanks in advance!

Mike

It is merged now.

Historically, “move to USB” was built long before ZRAM (and not by me), and the two features were not designed to work with each other, so the effect of combining them is unknown even to me. All I can do is recommend not mixing them and going with ZRAM only.

I would set up another system from scratch (openHABian image), enable ZRAM and then use openhab-cli restore for anything OH, plus restore whatever you changed beyond OH (mosquitto etc.; there’s no per-application backup/restore for all of these, so you have to do that manually).

Yes, use Amanda to back up your SD card. Read the README.

Thanks for your reply.

I’m thinking this may have been the source of some of the weirdness I saw a few weeks ago! Since both options are available in openhabian-config, there probably needs to be some sort of check or warning.

Mike

I just wanted a clarification. I have a Raspberry Pi 3 B, openHAB 2.4, openHABian 1.5 and a 16 GB microSD card. I installed zram from the openhabian-config tool. Do I have to do anything else? Excuse the stupid question.

In order to get what?

@mstormi
To get the benefit of zram, which I thought was meant to protect the SD card from excessive writes and keep it from getting corrupted so often.

No, that should do. Mind the limitations that apply, though.

As an update - I wrote this back in February 2018 and haven’t had a problem since - although there have been household power outages in that time.

That’s about one year and ten months without corruption problems so, in my experience, at least, a pretty robust combination is:

RPi
5V Power bank
USB SSD

I’m also running 2.4

I have been running my system on a Raspi3 with a 32GB SD card for more than 2 years now without any problems!

Holger


What do you want to tell us?
That everybody should do as you do, and that you will come to rescue their setup when it is hit by wear-out?

No, I only wanted to say that a simple SD card setup does not necessarily have to crash after 2-3 months.
And if mine did, I could just copy an image backup to a new SD card and everything would be good again.

No one said it must or will crash within a specific short timeframe. It all depends on a number of factors, such as the extent of logging and persistence usage, and ultimately it’s left to chance.
But comments like yours suggest that it might not be necessary to mitigate the risk or even to prepare for disaster. That in turn is hazardous and very bad advice. Mitigating the risk and preparing for disaster are crucial to proper openHAB operation!

Very nice article and I guess very helpful along my way to make my system more stable.

Does this mean that WiFi is more stable than Ethernet when using additional USB devices such as zwave sticks or SSDs?

Please be aware that this is only true for Raspberry Pi 1 to 3 but not Pi 4.
And even where Ethernet is connected via USB, this is not a matter of stability, only of speed. As the USB bus is also used by other USB devices such as USB sticks or even external SSDs, their traffic will affect LAN speed and vice versa.

Please quote comprehensively when you do. I emphasized that again, but it was already part of the article.

So the bottom line is that the Ethernet connection on my RPi4 will (most likely) not be affected by low-voltage effects caused by an SSD and/or a USB zwave stick, I assume.

You are right, I apologize :wink: but to be honest, the problem here is a quote that was shortened too much. It was intended as an answer to NCO (and I thought I had marked it as a reply… maybe I deleted that, too?)