[Solved] OH3 Docker Container on Kubernetes permissions issues

Hi all,

I am trying to run openHAB 3.0.1 on Kubernetes. While things work using NFS storage for persistence, I am now trying to move to Rancher Longhorn, as it provides distributed block storage across my 4-node K8s cluster and would make my setup fully highly available.

I am already running other workloads like Apache-PHP, Grafana, Prometheus, MQTT and InfluxDB on the cluster, but unfortunately I cannot get openHAB up and running because of a small issue with permissions:

Longhorn.io v1.1.0 only provides read/write access to the persistent volume mounts as root:root.

While I got Grafana to run by setting

  securityContext:
    runAsUser: 472
    fsGroup: 472

The same does not work for openHAB using uid 9001 and gid 9001, and I have tried all variants.

The error message I get when doing so is

2021-02-22T23:06:51.093692405+01:00 stderr F + echo 'Create group openhab with id 9001'
2021-02-22T23:06:51.093739052+01:00 stdout F Create group openhab with id 9001
2021-02-22T23:06:51.093751274+01:00 stderr F + groupadd -g 9001 openhab
2021-02-22T23:06:51.1007452+01:00 stderr F groupadd: Permission denied.
2021-02-22T23:06:51.100841106+01:00 stderr F groupadd: cannot lock /etc/group; try again later.

Actually this is a bit surprising to me as I expected only the following shares to be on the “root only” volumes:

  • /openhab/addons > openhab-addons-pvc (which is my longhorn PersistentVolumeClaim)
  • /openhab/conf
  • /openhab/userdata

However, "/etc/group" should reside on the non-persistent container storage. Does the initialization script maybe try to change /etc/group when it is already running as uid 9001 / gid 9001?

If so, I wonder why the script is not executed as root, with uid 9001 only being activated once openHAB itself starts.

I might be completely wrong. Currently I do not see a way to fix this and would be happy with some ideas.

There is a feature request for Longhorn to define (initial) volume permissions, but it seems to be planned for v1.1.1 (it already got moved from v1.1.0): "[FEATURE] add volume attribute for filesystem owner", Issue #1165, longhorn/longhorn on GitHub.

I am running a bit out of ideas and would have to go back to NFS, where I have two other challenges:

  • openHAB no longer recognizes file changes on NFS shares
  • my NFS server is not HA (while the Longhorn storage system is distributed across my whole K8s cluster and thus is HA enabled)

Could someone explain how the openHAB init script works? Would it be possible for the container to first make all the changes that need root access and only then switch to user:group 9001:9001?

I know it is a bad idea, but running openHAB as “root” would be an option. However, the init script would then rename root to openhab, which is not that good of an idea :slight_smile:

Thx for help/ideas.

Cheers,
Jens

The way the container works, if I recall correctly, is that it expects to be run as root, and the entrypoint script does some maintenance, permission fixes, and such. Then it switches to the 9001 user (or whatever IDs the USER_ID and GROUP_ID environment variables are set to). Everything that openHAB has to read and write needs to be accessible to user 9001 (or USER_ID).

You can see exactly what the script does at openhab-docker/debian/entrypoint at main · openhab/openhab-docker · GitHub (there is a separate alpine one too but it does pretty much the same things).

You’ll see in the entrypoint script that if the user doesn’t exist it will create it. Based on your logs above the defaults probably don’t already exist. So it does indeed look like it tries to change /etc/group on that first run of the container.

To support running openHAB as an arbitrary UID:GID, the script needs the ability to create that user.

It appears to be doing so by default. However, when you supplied runAsUser you prevented that: those first steps now run as whatever user you defined the container to run under (9001) instead of as root, before the script ever switches over to the openhab user. So you are essentially undoing what you actually want to happen.
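
As a rough illustration of that "start as root, then drop privileges" flow, here is a minimal sketch. It is not the real entrypoint script: plan_startup is a made-up helper that only prints the commands (taken from the traces in this thread) instead of running them, so it is safe to try anywhere. The 9001 defaults are the ones mentioned above.

```shell
#!/bin/sh
# Simplified sketch, NOT the actual openHAB entrypoint.
# plan_startup is a hypothetical helper: it only echoes what would happen.
USER_ID="${USER_ID:-9001}"
GROUP_ID="${GROUP_ID:-9001}"

plan_startup() {
    current_uid="$1"
    if [ "$current_uid" -ne 0 ]; then
        # groupadd/adduser have to write /etc/group and /etc/passwd,
        # which a non-root user such as 9001 is not allowed to do.
        echo "uid $current_uid: cannot create group/user (groupadd: Permission denied.)"
        return 1
    fi
    # Running as root: create the account, fix ownership, then drop privileges.
    echo "groupadd -g $GROUP_ID openhab"
    echo "adduser -u $USER_ID --disabled-password --gecos '' --home /openhab --gid $GROUP_ID openhab"
    echo "chown -R openhab:openhab /openhab"
    echo "exec gosu openhab tini -s ./start.sh server"
}

# Show the plan for whoever is running this sketch.
plan_startup "$(id -u)" || true
```

With runAsUser: 9001 the container lands in the first branch, which matches the "groupadd: Permission denied." error in the first post.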

Hi Rich,

thx for the fast response. I have now tried to run the system as root (uid 0) and the error behaviour changes. It still has to be a problem with the storage permissions, though, since I can launch openHAB if I run without any persistent storage attached:

securityContext:
  runAsUser: 0
  runAsGroup: 0
  fsGroup: 0

2021-02-24T21:42:53.040565666+01:00 stderr F ++ test -t 0
2021-02-24T21:42:53.040700943+01:00 stderr F ++ echo false
2021-02-24T21:42:53.040726072+01:00 stderr F + interactive=false
2021-02-24T21:42:53.040746924+01:00 stderr F + set -euo pipefail
2021-02-24T21:42:53.04076585+01:00 stderr F + IFS=’
2021-02-24T21:42:53.040785387+01:00 stderr F ’
2021-02-24T21:42:53.040805442+01:00 stderr F + ‘[’ limited = unlimited ‘]’
2021-02-24T21:42:53.040824961+01:00 stderr F + rm -f /openhab/runtime/instances/instance.properties
2021-02-24T21:42:53.046369784+01:00 stderr F + rm -f /openhab/userdata/tmp/instances/instance.properties
2021-02-24T21:42:53.056163145+01:00 stderr F + NEW_USER_ID=9001
2021-02-24T21:42:53.056273015+01:00 stderr F + NEW_GROUP_ID=9001
2021-02-24T21:42:53.056301163+01:00 stderr F + echo ‘Starting with openhab user id: 9001 and group id: 9001’
2021-02-24T21:42:53.056547994+01:00 stdout F Starting with openhab user id: 9001 and group id: 9001
2021-02-24T21:42:53.056769992+01:00 stderr F + id -u openhab
2021-02-24T21:42:53.067300051+01:00 stderr F ++ getent group 9001
2021-02-24T21:42:53.073324296+01:00 stdout F Create group openhab with id 9001
2021-02-24T21:42:53.073325907+01:00 stderr F + ‘[’ -z ‘’ ‘]’
2021-02-24T21:42:53.073403591+01:00 stderr F + echo ‘Create group openhab with id 9001’
2021-02-24T21:42:53.073439054+01:00 stderr F + groupadd -g 9001 openhab
2021-02-24T21:42:53.119185943+01:00 stderr F + echo ‘Create user openhab with id 9001’
2021-02-24T21:42:53.11924635+01:00 stderr F + adduser -u 9001 --disabled-password --gecos ‘’ --home /openhab --gid 9001 openhab
2021-02-24T21:42:53.119185925+01:00 stdout F Create user openhab with id 9001
2021-02-24T21:42:53.237447291+01:00 stdout F Warning: The home dir /openhab you specified already exists.
2021-02-24T21:42:53.237506291+01:00 stdout F Adding user `openhab' ...
2021-02-24T21:42:53.237528013+01:00 stdout F Adding new user `openhab' (9001) with group `openhab' ...
2021-02-24T21:42:53.379450141+01:00 stderr F adduser: Warning: The home directory `/openhab' does not belong to the user you are currently creating.
2021-02-24T21:42:53.379576937+01:00 stdout F The home directory `/openhab' already exists. Not copying from `/etc/skel'.
2021-02-24T21:42:53.432786355+01:00 stderr F + groupadd -g 14 uucp2
2021-02-24T21:42:53.478762687+01:00 stderr F + groupadd -g 16 dialout2
2021-02-24T21:42:53.5370711+01:00 stderr F + groupadd -g 18 dialout3
2021-02-24T21:42:53.610457051+01:00 stderr F + groupadd -g 32 uucp3
2021-02-24T21:42:53.6657551+01:00 stderr F + groupadd -g 997 gpio
2021-02-24T21:42:53.710126038+01:00 stderr F + adduser openhab dialout
2021-02-24T21:42:53.843979198+01:00 stdout F Adding user `openhab' to group `dialout' ...
2021-02-24T21:42:53.852254757+01:00 stdout F Adding user openhab to group dialout
2021-02-24T21:42:53.891088317+01:00 stdout F Done.
2021-02-24T21:42:53.89406568+01:00 stderr F + adduser openhab uucp
2021-02-24T21:42:54.025838117+01:00 stdout F Adding user `openhab' to group `uucp' ...
2021-02-24T21:42:54.035756625+01:00 stdout F Adding user openhab to group uucp
2021-02-24T21:42:54.072154057+01:00 stdout F Done.
2021-02-24T21:42:54.07872226+01:00 stderr F + adduser openhab uucp2
2021-02-24T21:42:54.215447136+01:00 stdout F Adding user `openhab' to group `uucp2' ...
2021-02-24T21:42:54.231915792+01:00 stdout F Adding user openhab to group uucp2
2021-02-24T21:42:54.28046662+01:00 stdout F Done.
2021-02-24T21:42:54.282476788+01:00 stderr F + adduser openhab dialout2
2021-02-24T21:42:54.392775203+01:00 stdout F Adding user `openhab' to group `dialout2' ...
2021-02-24T21:42:54.401594572+01:00 stdout F Adding user openhab to group dialout2
2021-02-24T21:42:54.437277695+01:00 stdout F Done.
2021-02-24T21:42:54.440739443+01:00 stderr F + adduser openhab dialout3
2021-02-24T21:42:54.548835284+01:00 stdout F Adding user `openhab' to group `dialout3' ...
2021-02-24T21:42:54.558514146+01:00 stdout F Adding user openhab to group dialout3
2021-02-24T21:42:54.601157321+01:00 stdout F Done.
2021-02-24T21:42:54.607909949+01:00 stderr F + adduser openhab uucp3
2021-02-24T21:42:54.723914556+01:00 stdout F Adding user `openhab' to group `uucp3' ...
2021-02-24T21:42:54.731631898+01:00 stdout F Adding user openhab to group uucp3
2021-02-24T21:42:54.761221331+01:00 stdout F Done.
2021-02-24T21:42:54.764907837+01:00 stderr F + adduser openhab gpio
2021-02-24T21:42:54.904604076+01:00 stdout F Adding user `openhab' to group `gpio' ...
2021-02-24T21:42:54.912718229+01:00 stdout F Adding user openhab to group gpio
2021-02-24T21:42:54.963255615+01:00 stdout F Done.
2021-02-24T21:42:54.967758595+01:00 stderr F + initialize_volume /openhab/conf /openhab/dist/conf
2021-02-24T21:42:54.967825409+01:00 stderr F + volume=/openhab/conf
2021-02-24T21:42:54.967846965+01:00 stderr F + source=/openhab/dist/conf
2021-02-24T21:42:54.969082954+01:00 stderr F ++ ls -A /openhab/conf
2021-02-24T21:42:54.980423265+01:00 stderr F + ‘[’ -z lost+found ‘]’
2021-02-24T21:42:54.980824002+01:00 stderr F + initialize_volume /openhab/userdata /openhab/dist/userdata
2021-02-24T21:42:54.98088852+01:00 stderr F + volume=/openhab/userdata
2021-02-24T21:42:54.980911391+01:00 stderr F + source=/openhab/dist/userdata
2021-02-24T21:42:54.982161343+01:00 stderr F ++ ls -A /openhab/userdata
2021-02-24T21:42:54.992189073+01:00 stderr F + ‘[’ -z ‘lost+found
2021-02-24T21:42:54.992309812+01:00 stderr F tmp’ ‘]’
2021-02-24T21:42:54.994466035+01:00 stderr F ++ cmp /openhab/userdata/etc/version.properties /openhab/dist/userdata/etc/version.properties
2021-02-24T21:42:55.018285425+01:00 stderr F cmp: /openhab/userdata/etc/version.properties: No such file or directory
2021-02-24T21:42:55.021402195+01:00 stderr F + ‘[’ ‘!’ -z ‘]’
2021-02-24T21:42:55.021477027+01:00 stderr F + chown -R openhab:openhab /openhab
2021-02-24T21:43:01.781928871+01:00 stderr F + sync
2021-02-24T21:43:01.966363089+01:00 stderr F + ‘[’ -d /etc/cont-init.d ‘]’
2021-02-24T21:43:01.966447311+01:00 stderr F + sync
2021-02-24T21:43:01.983915161+01:00 stderr F + ‘[’ false == false ‘]’
2021-02-24T21:43:01.983984735+01:00 stderr F ++ IFS=’ ’
2021-02-24T21:43:01.984009327+01:00 stderr F ++ echo gosu openhab tini -s ./start.sh
2021-02-24T21:43:01.989877851+01:00 stderr F + ‘[’ ‘gosu openhab tini -s ./start.sh’ == ‘gosu openhab tini -s ./start.sh’ ‘]’
2021-02-24T21:43:01.989947943+01:00 stderr F + command=($@ server)
2021-02-24T21:43:01.99282527+01:00 stderr F + exec gosu openhab tini -s ./start.sh server
2021-02-24T21:43:02.011108762+01:00 stdout F Launching the openHAB runtime…
2021-02-24T21:43:02.492836194+01:00 stdout F KARAF_ETC is not valid: /openhab/userdata/etc

The only directory that I see being created on the volumes is /openhab/userdata/tmp. It belongs to 9001:9001, since I did not change the openhab user but only the container user…

Maybe I am not seeing the forest for the trees :stuck_out_tongue:
Thx again,
Jens

All I can offer is to study the entrypoint script, and maybe modify it to log more information so you can learn more about what it does and how.

The “Not copying from /etc/skel” line stands out though, as well as the line above it warning that /openhab belongs to the wrong user. One of the things that the entrypoint does is to copy the default set of configs from /etc/skel to /openhab/userdata, which populates it with the stuff openHAB needs to run. If those files are not copied over, OH won’t be able to start.

And the last line of the output proves that out. /openhab/userdata/etc isn’t valid (because it’s not there).
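
One more thing that might be worth checking, going only by the trace above: the initialize_volume steps test the output of `ls -A` for emptiness, and on both volumes that output already contains lost+found. If the script only copies its defaults into an empty directory (an assumption on my side, I have not verified it against the real script), a freshly formatted block volume would never be populated. A small sketch of that check, with is_empty_volume as a made-up name and temp directories standing in for the volumes:

```shell
#!/bin/sh
# Mimics the emptiness test visible in the trace: [ -z "$(ls -A <volume>)" ].
# is_empty_volume is a hypothetical name, not taken from the real script.
is_empty_volume() {
    [ -z "$(ls -A "$1")" ]
}

fresh="$(mktemp -d)"        # stands in for a truly empty mount
formatted="$(mktemp -d)"    # stands in for a freshly formatted block volume
mkdir "$formatted/lost+found"

for dir in "$fresh" "$formatted"; do
    if is_empty_volume "$dir"; then
        echo "empty: defaults would be copied in"
    else
        echo "not empty: defaults are skipped"
    fi
done

rm -rf "$fresh" "$formatted"
```

The second directory is reported as not empty purely because of lost+found, which would fit with /openhab/userdata/etc never showing up.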

Yes, it is this stupid permission issue with Longhorn expecting the owner to be root. If I change the container itself to run as user 9001 (which did the trick for Grafana, changing it to 472), I run into the issue that as user 9001 I cannot change /etc/group.

I know this is not a “volume” use case that many users will follow, so I highly appreciate you taking the time to discuss and troubleshoot with me.

I do see 2 options:

  1. Waiting for Longhorn to provide a fix, hopefully with v1.1.1, which is expected in a couple of weeks.
  2. Is there a reason why you do not pre-populate /etc/group with an entry for openhab:x:9001, so that if someone provides a different user and group ID you would only change the IDs accordingly?

I am trying one more thing, which is copying the directories and data from a plain 3.0.1 Docker init, pushing them to the Longhorn volumes and changing permissions to 777.

Hi Rich,

sometimes the easy solution is so close. Actually that last step did the trick: just creating the whole file structure in a different container, copying all the data and setting ownership to 9001:9001.

The container came straight up with no changes to the security context of the pod and without changing to a different user ID/group ID.
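
For anyone who wants to repeat this, a minimal sketch of the copy step. seed_volume and the paths in the commented example are placeholders I made up for illustration; adjust them to wherever your seed data and volume are mounted. The ownership change only works when run as root, so the sketch skips it otherwise.

```shell
#!/bin/sh
# Sketch of the manual workaround: seed a volume from a known-good copy.
# seed_volume and all paths are hypothetical placeholders.
seed_volume() {
    src="$1"
    dst="$2"
    owner="${3:-9001:9001}"
    mkdir -p "$dst"
    # "src/." copies the directory contents, including dotfiles, into dst.
    cp -a "$src/." "$dst/"
    # Handing files to another user needs root; skip quietly otherwise.
    if [ "$(id -u)" -eq 0 ]; then
        chown -R "$owner" "$dst"
    fi
}

# Example call (placeholder paths):
# seed_volume /tmp/openhab-seed/userdata /mnt/volume/userdata
```

Run it once per volume (addons, conf, userdata) from a helper container that has the volumes mounted.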

Will close this for now and mark it as a solution :slight_smile:

Thx again (especially for all the great stuff you guys do for openHAB)!

Jens