ZRAM status

This article tracks and documents the current status of the ZRAM feature in openHABian.
I’ll update it when there’s significant news.

What is ZRAM and how does it work?

The ZRAM config utility for swap, directories and logs is an OS enhancement for IoT / maker projects. It reduces block wear on SD, NAND and eMMC media caused by write operations such as logging and persistence in openHAB. Data is held in RAM using compression to minimise the precious memory footprint, is written out to disk only extremely infrequently, and working dirs run at near-RAM speed; the compression ratio you achieve depends on your data and on the compression algorithm you choose.

It uses a table in /etc/ztab in which any number and combination of ZRAM drives can be defined. This branch combines ZRAM with an OverlayFS mount so that a syncFromDisk on start is not needed. That allows for quicker boots and larger directories, because no complete directory copy is required: the on-disk data simply remains the lower mount of the OverlayFS.
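
For illustration, an /etc/ztab can look roughly like this (a sketch only; the algorithm, sizes and exact column layout depend on the zram-config version you have installed, so take the comments in your own file as the reference):

    # swap  alg   mem_limit  disk_size  swap_priority  page-cluster  swappiness
    swap    lz4   200M       600M       75             0             80
    # dir   alg   mem_limit  disk_size  target_dir           bind_dir
    dir     lz4   150M       500M       /var/lib/openhab2    /openhab2.bind
    # log   alg   mem_limit  disk_size  target_dir  bind_dir    oldlog_dir
    log     lz4   50M        150M       /var/log    /log.bind   /opt/zram/oldlog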

In the openHABian implementation, swap, /var/log/* and /var/lib/openhab2/* are moved into ZRAM.
There’s a log in /usr/local/share/zram-config/log/.

On proper service shutdown (zram-config stop), zram’ed directories (OverlayFS) will a) be synced to the ‘lower’ filesystem (the directory of the same name on the root filesystem located on the default boot medium)
and b) have their ‘upper’ filesystem (the part in memory) lazily unmounted.
Lazy unmounting means that processes may still have files open in such a directory; that is required so that system processes using dirs such as e.g. /var/log can run (and keep running).
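
To see what that looks like on a running system, the standard tools suffice (a minimal sketch; device numbers and mount points will differ per installation):

    zramctl                            # zram devices with size, compression algorithm and usage
    df -h | grep -E 'zram|overlay'     # zram-backed and OverlayFS mount points and their free space
    mount | grep overlay               # OverlayFS mounts incl. their lowerdir/upperdir options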

Do’s and Don’ts

  • Rebooting … the former recommendation not to use reboot or halt is obsolete because all current OSes seem to behave correctly. But nothing in life is 100% safe and it’s not wrong in principle, so I keep it in the ‘known issues’ section for reference.

  • you must not switch off your openHABian server unless you have properly shut it down.
    While this has been a requirement unrelated to ZRAM essentially ever since UNIX has existed, it’s amazing how many people still do this today. Put the server on a UPS to safeguard it from power outages.

  • don’t use zram-config sync (even if it is available in your installed version) unless you know what you’re doing and only for test purposes.
    It is known to not work in a number of situations and may cause data loss.

  • Make use of a backup solution such as Amanda to have daily backups of your zram’ed directories.

  • double-check your OH logging settings in /var/lib/openhab2/etc/org.ops4j.pax.logging.cfg and make sure they won’t generate more logs than fit into the ZRAM directory as sized in /etc/ztab (and note that system logging also goes there!); see the example after this list.

  • you shouldn’t use ZRAM unless you’re on an SBC (a small ‘single board’ computer such as a Raspberry Pi) that runs its OS off a medium based on flash memory such as an SD card or (!) a USB stick. USB sticks are no better or safer than internal SD cards - another good reason for ZRAM.

  • don’t run ZRAM on machines that have less than 1 GB of RAM, such as an RPi Zero.

  • don’t run it on non-SBCs, or on SBCs modified to run off “safe” media such as an SSD or HDD.
    Well, you can, but the benefit is negligible, so the cost/risk-to-benefit ratio is bad.

  • If (and only if) you have an SBC with more than 1 GB of RAM, such as a RasPi 4 with 2 or 4 GB or an Odroid C2 with 2 GB, you may increase the amount of RAM assigned to ZRAM in /etc/ztab, starting with /var/lib/openhab2.
    That’ll help you stay away from the ‘cache’ issue mentioned below. Don’t worry, ZRAM will not occupy the full maximum amount unless it really needs to.

  • you may change the list and sizes of the directories to cache by changing entries in /etc/ztab.
    For example, if you consider it too risky to run all of /var/lib/openhab2 off ZRAM because you want, say, the jsondb not to be on there, you can put a comment sign in front of that line or replace it with 2 lines for /var/lib/openhab2/tmp and …/cache (see the sketch after this list). But be aware that on the downside this will increase the write load on your SD card and thus the likelihood of getting hit by wearout.

  • statistics are available: run zramctl without options.
    For advanced stats, the file /sys/block/zram<id>/mm_stat has all the info you need; for the complete set of statistics, see the kernel docs.
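
As a sketch of the ‘replace it with 2 lines’ idea from the /etc/ztab bullet above (the bind_dir names below are made up for illustration and the sizes are placeholders - keep the column layout of your own ztab):

    # instead of one entry covering all of /var/lib/openhab2 ...
    #dir   lz4   150M   500M   /var/lib/openhab2          /openhab2.bind
    # ... only put the Karaf-generated directories into ZRAM:
    dir    lz4   100M   350M   /var/lib/openhab2/tmp      /openhab2.tmp.bind
    dir    lz4   100M   350M   /var/lib/openhab2/cache    /openhab2.cache.bind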
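
As an example for the logging point above: in a typical openHAB 2.x org.ops4j.pax.logging.cfg the rollover of openhab.log is size-based, so lowering the rollover size (and the number of kept files) reduces what has to fit into ZRAM. The property names below are a sketch taken from a common default config - verify them against your own file before editing:

    # size at which openhab.log rolls over (default configs often use 16MB)
    log4j2.appender.out.policies.size.size = 8MB
    # number of rolled-over files to keep (DefaultRolloverStrategy)
    log4j2.appender.out.strategy.max = 3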

Known issues

You might run into misbehavior. Please let me know when you do, and include the relevant details: have you done anything lately which possibly was a trigger? Edited a .items or .rules file? Added a Thing or binding that might come with a problem? Changed the openHAB version (incl. milestones, snapshots)?

  • Don’t use the reboot and halt commands.
    Yes, this is somewhat overdone, but some Unices (plural of “UNIX”), including openHABian/Raspbian jessie(?) and older, may still skip the steps to properly run down your machine’s services, including ZRAM, and that can result in all changes since startup being lost.
    Buster (the Raspbian release that the latest openHABian 1.5 image is built on) is known to work, but that only applies to boxes installed using this image or properly upgraded to buster, which does not happen automatically if your box was installed before the 1.5 buster-based image was available…
    There are reports of older OS versions failing here, although it can be misleading that even a proper shutdown kills networking before eventually syncing ZRAM to disk, so you might believe it skips the shutdown scripts although it does not. You can only see that if you have a console attached.
    By the way, ‘shutdown -r’ is the proper command to use.
    If you want to be on the safe side: stop openHAB using sudo systemctl stop openhab2, wait for it to finish, and then manually stop ZRAM: sudo zram-config stop
    You must use a proper procedure to shutdown/reboot your server so that it syncs files to disk - otherwise they get lost.
    That’s /var/log and /var/lib/openhab2, and it essentially applies to logs and persistence data.
    Short of undiscovered bugs, the tool/script zram-config will take care of this sync when called with the stop parameter. Note there’s a large number of alternative commands/options to reboot, and how they work differs within Raspbian, let alone among UNIX implementations.
    We neither know nor could document all of them. If you don’t find “your” way of rebooting mentioned as known to work, then it possibly isn’t safe.

  • after systemctl stop zram-config or /usr/local/bin/zram-config stop, a ZRAM device /dev/zramX might persist (X is a number). You can try removing it via zramctl -r /dev/zramX. If that fails, reboot to get rid of it. If need be, run systemctl disable openhab2 beforehand so it doesn’t start automatically after boot.

  • When running openHAB2 off zram’ed directories (which is what we want, as it is the whole point of this feature), note that OH2 makes use of Karaf, and Karaf comes with a ‘cache’. Depending on your OH config, that generates 200+ MB of changed data that ZRAM needs to hold in RAM right from OH start. Unfortunately the Karaf ‘cache’ size cannot be limited to a specified maximum amount of RAM, and there is no way to cache some parts while dropping others (so the name ‘cache’ is somewhat misleading here).
    The ZRAM size assignments are defined in /etc/ztab and are a tradeoff between what OH uses (usually ~500-700 MB on ARM, varying with your OH config) and what Karaf needs for its cache (~200-250 MB, also depending on your OH config).
    You’ll quickly notice there isn’t much headroom on an SBC that typically has 1 GB of RAM, let alone when running an OH version with a memory leak (although that may in fact not be a big problem w.r.t. media wearout, as it typically results in RAM pages being paged out only once).
    That means that if you start OH from an empty/cleared cache, it’ll use those 200+ MB in (Z)RAM right away, and any changes during operation come on top.

You might run into space issues if you need to clear the cache and OH then regenerates it on start.
A (hopefully safe) method to work around such problems (also collected into a script sketch after these steps) is to:

  1. shutdown openHAB2: sudo systemctl stop openhab2

  2. delete the cache: sudo rm -rf /var/lib/openhab2/tmp/* /var/lib/openhab2/cache/* /opt/zram/openhab2.bind/tmp/* /opt/zram/openhab2.bind/cache/*

  3. start openHAB2 and let it complete initialization of items, rules etc.: sudo systemctl start openhab2

  4. shutdown openHAB2 again: sudo systemctl stop openhab2

  5. stop ZRAM to make it sync to disk (notably the Karaf-generated files in /var/lib/openhab2/cache and …/tmp): sudo /usr/local/bin/zram-config stop

  6. start ZRAM again: sudo /usr/local/bin/zram-config start

  7. start openHAB2 again: sudo systemctl start openhab2

    Steps 4-7 should effectively be executed if you (see above) properly reboot (shutdown -r) your machine, but again, if you want to be on the safe side, do them manually.
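
Collected into a script, those steps could look like this (a sketch only, using the same paths as above; the fixed wait for OH to finish initialising is arbitrary - better watch openhab.log if you want to be certain):

    #!/bin/bash
    # clear the Karaf cache while keeping ZRAM consistent; run as root (sudo)
    systemctl stop openhab2
    rm -rf /var/lib/openhab2/tmp/* /var/lib/openhab2/cache/* \
           /opt/zram/openhab2.bind/tmp/* /opt/zram/openhab2.bind/cache/*
    systemctl start openhab2
    sleep 600                          # crude wait for items/rules initialization to complete
    systemctl stop openhab2
    /usr/local/bin/zram-config stop    # syncs the regenerated cache/tmp (and /var/log) to disk
    /usr/local/bin/zram-config start
    systemctl start openhab2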

  • zram-config sync is a feature to allow for “online sync” that is being worked on, but there’s no estimate as to when that will be available.
    For now it is said to fail in a number of situations. I encourage everyone to help with testing but be aware that’s at your own risk of data loss. Drop me a note if you’re volunteering.

  • you might see ‘permission denied’ messages saying the system cannot write to either of these directories or to files below them because they’re read-only.
    This has not been reported by users but occasionally showed up during testing. There are a lot of potential reasons for this, and it is still under investigation which one is the most likely/most frequent cause.
    ZRAM never syncs to disk unless you stop it (doing so syncs and then unmounts the ZRAM directories).
    Just like on disk, the amount of RAM that ZRAM has available must be larger than the amount of data to be stored there, but it’s a little more complicated than that: the data is also compressed with a varying compression factor, so the exact amount of raw data that will fit is unknown. The amount of memory ZRAM has available is static (defined in /etc/ztab).
    Now if something writes more changed data to zram’ed directories than fits in there, ‘permission denied’ is (most often) how the system protects itself against even more severe impacts. You can check how close you are to the limit as shown below.
    You can try deleting zram’ed data, but usually you would need to (and the safer method clearly is to) shut down your system.
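
To check how close you are to that limit, compare what zramctl reports against the sizes from /etc/ztab, or read the raw counters (the device number and the exact field order depend on your setup and kernel version - see the kernel docs):

    zramctl                          # DATA = uncompressed data, COMPR/TOTAL = RAM actually used, DISKSIZE = configured limit
    cat /sys/block/zram0/mm_stat     # orig_data_size compr_data_size mem_used_total mem_limit mem_used_max same_pages ...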

@mstormi Markus,

My system with zram is misbehaving. I suspect I have corrupted the zram configuration. I tried to re-install zram using openhabian-config, but it can no longer find your zram repository on GitHub. Does this mean that this BETA trial is not moving forward at this time? If your repository is no longer available, should the openhabian-config entry be deleted as well?

I have disabled zram (systemctl disable zram-config) for now. Is there anything I need to do to remove the directories/mounts? Any other cleanup?

At this point, utilities do not seem to be able to write any logs. I have run the “fix permissions” entry in openhabian-config. I’ve had to disable logging for mosquitto for it to function properly. Something is seriously hosed and I need to resolve it. Any guidance is greatly appreciated.

Regards.

Mike

No. I’m waiting for the original author to transfer his repo; he’s slow to answer, unfortunately. Meanwhile you can install from his repo, see this thread.

You might also be able to fix your install: unless you deleted binaries or directories, I believe all you need to do is fix /etc/ztab (if you changed it), re-enable, and reboot. The original is at /opt/openhabian/includes/ztab. I’d suggest increasing the disksize for /var/lib/openhab2 to 600+ MB while you’re at it.

No, you need to stop it and possibly reboot. See if df and zramctl show any remainders.
Anything else, I’m afraid, is likely not a zram issue.
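
If you go that route, a minimal sketch of the repair (assuming the default openHABian locations mentioned above; adjust the ztab sizes before rebooting):

    sudo systemctl stop openhab2
    sudo /usr/local/bin/zram-config stop               # sync and unmount whatever is still zram'ed (skip if already stopped)
    sudo cp /opt/openhabian/includes/ztab /etc/ztab    # restore the original table, then edit the disk_size for /var/lib/openhab2
    sudo systemctl enable zram-config
    sudo shutdown -r now                               # proper reboot, as discussed above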

Thanks Markus. I’ll make note of that thread for future reference.

I have decided to revisit ZRAM at a later date. Right now the Chromecast binding (and others) are in conflict with the REST Docs (Jackson library incompatibility). I am planning to migrate to MQTT2 and leveraging the REST Docs to clone my Things more easily (than hundreds of PaperUI clicks). All this to say that I’ll revisit ZRAM when I revisit MQTT2 (hopefully M4 resolves the Jackson library issues).

Mike

P.S.

I’m not sure what happened with Mosquitto. I had to uninstall it and reinstall it to resolve the issue. My openHAB deployment is stable again… I hope :wink:

Sorry for maybe asking a dumb question, but I have a hard time trying to get Z-RAM to work on openHABian. I just updated the config tool (openhabian-config) and tried again (as I have many times over the past weeks) to activate Z-RAM support, but unfortunately got the same error:
[screenshot of the error message]

Am I missing something or are you all still waiting for an update? Thank you very much for any advice or pointing me in the right direction!

I’m waiting for this PR to get merged. You can apply it yourself to make this work (essentially just replace the repo name “zram-config” in /opt/openhabian/functions/zram.bash with “openhabian-zram”).

Thank you very much, works like a charm, excellent!
By the way, why does it take 27(!) days to get a PR merged? I am just wondering why all users still need to suffer just because no one approves a PR which is already 4 weeks old?

It’s merged now, so you should be able to use openhabian-config to install ZRAM.

Hi,

I use the latest openHABian image together with an RPi 4 4GB. What is the proper way to shut down? Only shutdown -r? I use the PiJuice UPS for a proper shutdown. Its internal scripts send SYS_FUNC_HALT and power off completely one minute later. Is it safe to use this script so that all ZRAM content is synced to the SD card? Also, I often use sudo reboot to restart the RPi. Is this no longer allowed either?

I hope somebody can describe the best way to do a proper shutdown and reboot in a little more detail.

best regards René

RTFP
Yes the full one. I don’t write essays because I like to write.

To properly power off the RPi you should use sudo shutdown -h (halt the system) instead of sudo shutdown -r (reboot the system).

@mstormi
Just some follow up questions in a proper thread:

  • Has this been tested on the RPi 4?
  • Are there any downsides to using ZRAM? Like using more CPU or anything like that…

It works in openHABian 1.5 which is buster based. The Pi model does not make a difference.

Read the post #1

I’ve started to play with this some more. It seems to be stable enough to rely upon on a remote machine.

It is probably worth mentioning that if you use openhab-cli backup, you will want to move the zip files out of /var/lib/openhab2/backups. There’s no reason to have these files loaded into zram.

Great, thanks. An upcoming openHABian PR will unmark it as beta. No, not exclusively because of this statement of yours, but thanks nevertheless.

Right. @benjy, could you please take care of that?
Dunno which repo openhab-cli belongs to, else I would have filed an issue there.

I already implemented a similar thing with openHAB log files:

In essence, you’ll need to schedule a cron job to relocate the backup files to another location outside the control of ZRAM, e.g. somewhere in /home/openhabian.

openhab-cli provides a shortcut for the distro’s backup script (which has a default directory of ${OPENHAB_USERDATA}/backups), but that’s only if a path isn’t specified.

You can also change the default directory by setting ${OPENHAB_BACKUPS} in the /etc/default/openhab2 file. I’d recommend doing this during the zram setup stage and notifying the user of the change.
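
A sketch of both options (the target directory and the cron schedule below are made up for illustration; ${OPENHAB_BACKUPS} and the default backup location are as described above):

    # option 1: set the backup target in /etc/default/openhab2
    OPENHAB_BACKUPS=/home/openhabian/backups

    # option 2: a nightly cron job (crontab -e) moving finished backups out of the zram'ed directory
    30 3 * * * mv /var/lib/openhab2/backups/*.zip /home/openhabian/backups/ 2>/dev/null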

@mstormi, does zram sync after an apt update has changed the contents of the /var/lib/openhab2 directory? I’m not entirely convinced that the whole folder should be included.

Well, the changes are active but not synced to disk/SD. No, you cannot force a sync while the filesystem is online, so they’re just “virtual” changes until you shut down (with a sync).
I agree we could give some more thought to what to include and what not, but note it’s not possible to exclude subdirs if the parent dir is included, so that sort of forces some bundling we may not want to have.

I think the safest approach would be to check if zram is active and temporarily disable it during the openHAB upgrade process.
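
A sketch of what that could look like, using only commands already shown in this thread (package name and paths assumed from an openHAB 2 / openHABian install; skip the zram-config calls if ZRAM isn’t active):

    sudo systemctl stop openhab2
    sudo /usr/local/bin/zram-config stop      # sync current ZRAM content back to disk
    sudo apt-get install --only-upgrade openhab2
    sudo /usr/local/bin/zram-config start
    sudo systemctl start openhab2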