Ansible Revisited

First - again a big thank you for sharing this work with us.

Working on adapting it for my own use, a few questions regarding the openHAB role popped up:

  1. Role create-user:
    That’s probably a private role you wrote for your own use? What’s the result of the line service: openHAB? At least the built-in Ansible user module does not know that option :thinking:
  2. Add openhab user to the dialout group - What’s the dialout group? Somehow that doesn’t ring a bell for me - neither in the context of openHAB nor docker nor Ansible :thinking:
  3. Task Pull/update the openHAB docker image
    devices:
      - "/dev/ttyUSB0:/dev/ttyUSB0:rwm"
      - "/dev/ttyUSB1:/dev/ttyUSB1:rwm"
  • Where do these USB devices come into play? What do they have to do with the docker task?
  4. Step 11: If a new image was pulled, wait five minutes and restart the container
    So repeating the task Pull/update the openHAB docker image does restart the openHAB container? Or should the task after Wait a couple minutes if a new image was pulled look different? Especially the line restart: False in both instances of the task confuses me :thinking: Also, I don’t get why you set the variable openhab_pulled again - shouldn’t that be when: openhab_pulled.changed?
    Would the following be enough or is it necessary to repeat all the options?
- name: Restart openHAB container
  docker_container:
    name: openhab
    restart: true

(You might have noticed that my understanding of the Ansible docker_container module is a bit lacking, so this is not only pettiness on my part :sweat_smile: )

4b) In the first version of Pull/update the openHAB docker image you are using hostname: argus.koshak.net, in the second one you are using hostname: "{{ ansible_fqdn }}".

Yes, this is a custom role. Here is the tasks/main.yml.

---
# tasks file for roles/create-user

- name: "Create {{ user_name }} group so we can control the gid"
  group:
    gid: "{{ gid }}"
    name: "{{ user_name }}"
    state: present
    system: True
  become: True

- name: "Create {{ user_name }} user"
  user:
    comment: "{{ service }} service"
    createhome: "{{ create_home }}"
    group: "{{ user_name }}"
    name: "{{ user_name }}"
    shell: /usr/sbin/nologin
    state: present
    system: True
    uid: "{{ uid }}"
  become: True

- name: "Add {{ default_user }} to the {{ user_name }} group"
  user:
    append: True
    groups: "{{ user_name }}"
    name: "{{ default_user }}"
  become: True

It makes sure that the user and group exist with the passed-in UID and GID, and it adds the “default user” (my usual login) to the group.

On most Debian-based Linux instances a user must be a member of the dialout group to have permission to read and write to serial devices, like a Zwave controller. It’s discussed in the Linux installation instructions: openHAB on Linux | openHAB

  3. They allow the container to access the hardware on the host. In this case ttyUSB0 is the Zwave controller and ttyUSB1 is the Zigbee Coordinator. A container does not have access to any serial devices like this without explicitly giving it permission to access them.

This is not needed in OH 3. It was just a workaround to deal with the fact that when the cache was cleared in OH 2, some of us would have openHAB fail to come up properly that first time and require a restart to be operational. I’ve removed that block since moving to OH 3.

The first docker_container task pulls the new version (if there is one) and recreates the container with the given properties. It registers a variable to indicate that a new image was pulled. Only if a new image was pulled will it wait five minutes for OH to come back up completely. Then it runs the pull/restart again. This time there won’t be a new image to pull, so the container is just restarted.
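
The intent, expressed more directly than what I actually had (a hedged sketch; in my real playbook the second step was just a copy of the full Pull/update task above):

- name: Wait a couple minutes if a new image was pulled
  pause:
    minutes: 5
  when: openhab_pulled.changed

- name: Restart openHAB so it comes up cleanly on the new image
  docker_container:
    name: openhab
    restart: True          # force a restart of the existing container
  when: openhab_pulled.changed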

But again, this was to work around a bug in OH 2.5 that doesn’t exist in OH 3 so I’ve removed those.

And because this was a temporary fix for a bug, I didn’t spend much time thinking about it or working on it. I just copied the first docker_container task as written.

I’ve also dropped InfluxDB and Grafana so my openHAB tasks have changed significantly. In fact I currently have it so it will install a test version (OH 3) and a production version (2.5) based on the value of a variable. But now that I’m fully on OH 3 all of that stuff is not used any more.

The new openHAB tasks file is:

---
# tasks file for roles/openhab

- name: Debug
  debug:
    msg: |
      openhab_home = {{ openhab_home }}
      repo = {{ openhab_conf_repo }}
      version = {{ openhab_version }}

- name: Create the openhab user and group
  include_role:
    name: create-user
  vars:
    uid: "{{ openhab_uid }}"
    gid: "{{ openhab_uid }}"
    user_name: openhab
    create_home: False
    service: openHAB

- block:

  - name: Add openhab user to the dialout group
    user:
      append: True
      groups: dialout
      name: openhab

  - name: Create if necessary and set the permissions on the openHAB data folder
    file:
      path: "{{ openhab_home }}"
      state: directory
      owner: openhab
      group: openhab
      mode: u=rwx,g=rwx,o=rx

  become: True

- name: See if config is already present
  stat:
    path: "{{ openhab_home }}/userdata/etc/version.properties"
  register: conf_present

- name: Check to see if the git host is up
  shell: "nc -vz {{ git_host }} {{ git_port }}"
  register: git_running
  changed_when: False
  failed_when: False

- name: Checkout openHAB configs if this is a new install
  git:
    repo: "{{ openhab_conf_repo }}"
    dest: "{{ openhab_home }}"
    accept_hostkey: True
  when: (git_running['stderr'] is match(".* succeeded!")) and
        (not conf_present.stat.exists)

- name: Create missing folders
  file:
    path: "{{ item }}"
    state: directory
    owner: openhab
    group: openhab
    mode: u=rwx,g=rwx,o=rx
  loop:
    - "{{ openhab_home }}/userdata/cache"
    - "{{ openhab_home }}/userdata/logs"
    - "{{ openhab_home }}/userdata/persistence"
    - "{{ openhab_home }}/userdata/tmp"
  become: True

- name: Change ownership of openHAB configs
  file:
    path: "{{ openhab_home }}"
    owner: openhab
    group: openhab
    recurse: yes
  become: True
  when: (git_running['stderr'] is match(".* succeeded!")) and
        (not conf_present.stat.exists)

# Kept for reference but in OH 3 I've moved to rrd4j and built in charting
#- name: Create the InfluxDB database
#  influxdb_database:
#    hostname: "{{ influxdb_ip_address }}"
#    database_name: "{{ openhab_influxdb_database_name }}"
#    state: present
#    username: "{{ influxdb_admin_user }}"
#    password: "{{ influxdb_admin_password }}"
#
#- name: Create the InfluxDB openHAB user and grant permissions
#  influxdb_user:
#    hostname: "{{ influxdb_ip_address }}"
#    user_name: "{{ influxdb_openhab_user }}"
#    user_password: "{{ influxdb_openhab_password }}"
#    login_username: "{{ influxdb_admin_user }}"
#    login_password: "{{ influxdb_admin_password }}"
#    grants:
#      - database: "{{ openhab_influxdb_database_name }}"
#        privilege: 'ALL'
#
#- name: Create InfluxDB Grafana user and grant read permissions
#  influxdb_user:
#    hostname: "{{ influxdb_ip_address }}"
#    user_name: "{{ influxdb_grafana_user }}"
#    user_password: "{{ influxdb_grafana_password }}"
#    login_username: "{{ influxdb_admin_user }}"
#    login_password: "{{ influxdb_admin_password }}"
#    grants:
#      - database: "{{ openhab_influxdb_database_name }}"
#        privilege: 'READ'

- name: Check the current version of openHAB
  shell: grep openhab-distro {{ openhab_home }}/userdata/etc/version.properties | cut -d ' ' -f 4
  register: old_version
  when: conf_present.stat.exists
  changed_when: False

- name: Pull/update the openHAB docker image
  docker_container:
    detach: True
    devices:
      - "/dev/ttyUSB0:/dev/ttyUSB0:rwm"
      - "/dev/ttyUSB1:/dev/ttyUSB1:rwm"
    env:
      CRYPTO_POLICY: unlimited
    hostname: "{{ ansible_fqdn }}"
    image: openhab/openhab:{{ openhab_version }}
    log_driver: syslog
    name: openhab
    network_mode: host
    pull: True
    restart: False
    restart_policy: always
    tty: True
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
      - "{{ openhab_home }}/conf:/openhab/conf"
      - "{{ openhab_home }}/userdata:/openhab/userdata"
      - "{{ openhab_home }}/addons:/openhab/addons"
  register: openhab_pulled

All those openhab variables are defined in defaults. And when I want to deploy a test version on a different machine I override the defaults, primarily changing the openhab_version variable.
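
For context, the role’s defaults look roughly like this (a sketch with placeholder values, not my actual file), and the test machine’s host_vars just overrides openhab_version with a different tag:

# roles/openhab/defaults/main.yml (sketch, placeholder values)
openhab_version: "3.0.1"
openhab_uid: 9001
openhab_home: /srv/openhab
openhab_conf_repo: "git@example.com:configs/openhab.git"
git_host: example.com
git_port: 22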

And I’m running once again into my old friend java.lang.NumberFormatException: null when I try to start the container with my existing config:

Launching the openHAB runtime...
null
Error occurred shutting down framework: java.lang.NumberFormatException: null
java.lang.NumberFormatException: null
        at java.lang.Integer.parseInt(Integer.java:542)
        at java.lang.Integer.parseInt(Integer.java:615)
        at org.apache.karaf.main.ConfigProperties.<init>(ConfigProperties.java:235)
        at org.apache.karaf.main.Main.updateInstancePidAfterShutdown(Main.java:227)
        at org.apache.karaf.main.Main.main(Main.java:192)

How do you @rlkoshak manage to start a new container with an existing git repo? You once mentioned that I would need the content of the userdata folder in addition to the config folder.
So what files / folders in the userdata folder do you include into your repo? For example userdata/logs/* wouldn’t make any sense… :thinking:
What else are auto-generated / often changing files that could and should be ignored by a git repo?

I run in Docker so both my config and userdata folders are in the same parent folder. This parent folder gets checked in to git. Here is my .gitignore:

userdata/*
!userdata/secrets
!userdata/etc
!userdata/jsondb
!userdata/config
!userdata/habot
!userdata/openhabcloud
!userdata/uuid
!userdata/zigbee
!userdata/zwave
userdata/jsondb/backup
userdata/backup
.Trash-1000
conf/html/hum.jpg
conf/html/light.jpg
conf/html/power.jpg
conf/html/temp.jpg
*py.class
*.swp
*~
*.old

If you want the persisted data to be checked in too (MapDB and rrd4j) add a !userdata/persistence to the list.

But that isn’t related to your problem. If you see that error after copying your configs over, the problem is that one of the config files you copied over has a parameter that expects a number but doesn’t have a parsable number as its value.

Maybe that’s related to just throwing my openHAB 2.5 config into a new openHAB 3 Docker instance :sweat_smile:
At least now that I decided to start with a fresh instance & repo for OH3 and just integrate the old config piece by piece into the new repo, I haven’t had that problem again :slight_smile:

But leaving the null pointer issue aside, I have a more general question about your workflow. As far as I get it you check out your config-repo via Ansible and set your openhab service account as owner of these files.
I would guess you are not logging in with the openhab service account? Do you always use Ansible to pull changes? Do you also use Ansible to push changes from your openHAB machine (aka machine on which the openHAB Docker instance is running) to your git host?

Background for my questions: if I ssh into the host of my openHAB instance with a separate account (which is a member of the openhab group), let’s call it default account, and try to do any git actions manually, I’m not allowed to execute them.
If I use sudo to force the issue, I’m running into a bunch of different problems…

So how do you commit changes, for example to the jsondb, happening on argus?
In the past I circumvented that issue by skipping an openhab service account and just setting the openhab container to the U/GID of my default user. But as far as I understand it that’s not really the best idea from a security standpoint?

In practice yes, I pretty much only pull through Ansible. That’s primarily because I only pull when I’m moving the config to another location or another machine. And when I do that there is a ton of other stuff that has to be done so I always do that sort of thing through Ansible.

But there is nothing technically preventing me from pulling at any time. I just don’t run into situations where I need to pull manually.

As for pushes, I just use my login account. When I create the openhab user, I add the openhab group to my main login account. That’s the last task in the create-user role. {{ default_user }} is my main login.

But what I don’t remember is if I’ve ever had to manually add group write permissions to the files and folders to make commits and pushes possible. I don’t think so because when I moved to OH 3 I moved around quite a bit but I have no recollection of doing anything manually like that on each move.

I only push manually because I don’t necessarily want a work in progress to be committed.

Check the permissions on the files and folders. In particular, the ownership and permissions on the .git folder and files/folders under that. Make sure they are owned by openhab or your login user, and if they are owned by openhab, that they have group read/write permissions.

If you’ve ever run an operation as root some of those files and folders may be owned by root now.

In general, though, it’s pretty straightforward for me. When I’m moving to a new location it pulls the configs from git and sets the ownership to openhab:openhab. As I make changes I git add and commit locally using my login account, and finally when I’m ready I’ll push, again as my login account.

It’s not a best practice. Typically you don’t want a service like this, especially one exposed to the network, to have a usable shell account. If you try to log in as that user you can’t, because it doesn’t have a password. If someone compromises the service, they won’t actually get a shell to work in (the account’s shell is nologin). If you run it under your login account instead, then if they break out of the service they will have a shell to work in.

Is the risk large? Probably not. But it’s a relatively simple way to improve security over all, which is why that’s the standard.

You are correct, the cause of my troubles was missing write permissions for the group. The interesting thing is that your task Create if necessary and set the permissions on the openHAB data folder should actually take care of the problem - at least that’s what I was thinking.
But following the task execution via the Ansible debugger, it seems like the task is setting permissions for the openhab_home folder - but not recursively. That’s fixable by adding recurse: yes to the task and running it after the checkout / update task :slight_smile:

  - name: Create if necessary and set the permissions on the openHAB data folder
    file:
      path: "{{ openhab_home }}"
      state: directory
      owner: openhab
      group: openhab
      mode: u=rwx,g=rwx,o=rx
      recurse: yes
    become: true

That happens before the configs are pulled from the repo though so it only applies to the home folder.

So it won’t have a chance to correct the files checked out from git until the second time you run the playbook. Did you run the playbook after the configs were first checked back out?

I may have never noticed this before because I run my playbooks a couple times a week as I use the same playbooks and roles to update/upgrade as I do to install.

Anyway, if you want the permissions to be correct the very first time that the playbook is run, add another recursive change of the permissions task after the “Change ownership of openHAB configs”.
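
Something like this, placed right after “Change ownership of openHAB configs” (a sketch; the symbolic mode is just one way to express “add group read/write”):

- name: Make the checked out configs group writable
  file:
    path: "{{ openhab_home }}"
    state: directory
    owner: openhab
    group: openhab
    mode: g+rw               # add group read/write so the login account can commit and push
    recurse: yes
  become: True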

I changed the order of the tasks. I didn’t see a reason to set file permissions before checking out the git repo, so I just moved the file permission task to after the git update / checkout.

If you don’t create the folder first then the git checkout will fail because it doesn’t have a folder to check out into.

Nope - the following runs just fine even on the first run; the git checkout creates the necessary folder:

---
# tasks file for openhab

- name: Debug
  debug:
    msg: |
      openhab_home = {{ openhab_home }}
      repo = {{ openhab_repo }}
      version = {{ openhab_version }}

- name: "create user {{ openhab_user }} and add {{ system_user }} to group {{ openhab_user }}"
  include_role:
    name: create-service-account
  vars:
    uid: "{{ openhab_uid }}"
    gid: "{{ openhab_uid }}"
    service_user: "{{ openhab_user }}"
    create_home: False
    service: openHAB

- name: "checkout / update openHAB config"
  git:
    accept_hostkey: yes
    dest: "{{ openhab_home }}/openhab"
    force: "{{ openhab_repo_force }}"
    key_file: "/home/{{ system_user }}/.ssh/id_rsa"
    repo: "{{ openhab_repo }}"
    version: "{{ openhab_repo_branch }}"
  when: openhab_repo|length > 0

- name: Create if necessary and set the permissions on the openHAB data folder
  file:
    path: "{{ openhab_home }}/openhab"
    state: directory
    owner: "{{ openhab_user }}"
    group: "{{ openhab_user }}"
    mode: u=rwx,g=rwx,o=rx
    recurse: yes
  become: True

- name: Create missing folders
  file: 
    path: "{{ item }}"
    state: directory
    owner: "{{ openhab_user }}"
    group: "{{ openhab_user }}"
    mode: u=rwx,g=rwx,o=rx
  loop:
    - "{{ openhab_home }}/openhab/userdata/cache"
    - "{{ openhab_home }}/openhab/userdata/logs"
    - "{{ openhab_home }}/openhab/userdata/persistence"
    - "{{ openhab_home }}/openhab/userdata/tmp"
  become: True

- name: Change ownership of openHAB configs
  file:
    path: "{{ openhab_home }}/openhab"
    owner: "{{ openhab_user }}"
    group: "{{ openhab_user }}"
    recurse: yes
  become: True

- name: See if config is already present
  stat:
    path: "{{ openhab_home }}/openhab/userdata/etc/version.properties"
  register: conf_present

- name: Check the current version of openHAB
  shell: grep openhab-distro {{ openhab_home }}/openhab/userdata/etc/version.properties | cut -d ' ' -f 4
  register: old_version
  when: conf_present.stat.exists
  changed_when: False

- name: Pull/update the openHAB docker image
  tags: openhab
  docker_container:
    detach: True
    env:
      CRYPTO_POLICY: unlimited
      OPENHAB_HTTP_PORT: "8080"
      OPENHAB_HTTPS_PORT: "8443"
      EXTRA_JAVA_OPTS: "-Duser.timezone={{ openhab_timezone }}"
    hostname: "{{ openhab_hostname }}"
    image: openhab/openhab:{{ openhab_version }}
    name: openhab
    network_mode: host
    pull: True
    state: started
    interactive: yes
    restart: False
    restart_policy: always
    tty: True
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
      - "{{ openhab_home }}/openhab/conf:/openhab/conf"
      - "{{ openhab_home }}/openhab/userdata:/openhab/userdata"
      - "{{ openhab_home }}/openhab/addons:/openhab/addons"
  register: openhab_pulled

It may be a permissions thing.

For example, I have all my OH configs in /srv/. /srv/ is owned by root with write permissions only for the user root.

And of course root is not configured with an account on my git repo nor have I configured it to access my git repo using my main login’s credentials (ssh cert in this case).

So only root can create a new folder in /srv/ but only my main login can checkout the configs.

Therefore the root folder for the configs is created by root and then immediately changed to use openhab:openhab as the owner with write permissions for group. That will give my main login permission to git checkout into that folder. Without that though, only root would be able to checkout into that folder.
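
In Ansible terms, the ordering is roughly this (a sketch of the idea, not my literal tasks):

# /srv/ itself is root-owned, so root has to create the config folder first
- name: Create the openHAB data folder owned by openhab with group write
  file:
    path: "{{ openhab_home }}"
    state: directory
    owner: openhab
    group: openhab
    mode: u=rwx,g=rwx,o=rx
  become: True

# only then can the checkout run as my normal login (a member of the openhab
# group), since root has no credentials for the git host
- name: Checkout openHAB configs
  git:
    repo: "{{ openhab_conf_repo }}"
    dest: "{{ openhab_home }}"
    accept_hostkey: True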

The disadvantage of writing all this stuff into playbooks is you forget the whys sometimes. But the advantage is you don’t have to remember it.

Small update: I moved my (heavily inspired by this post) openHAB role into its own repo and published it on GitHub as ansible-openhab.

At some (undefined) point in the future I might actually find the time to document and publish my whole play to set up openHAB on a Raspberry Pi from scratch… :thinking: :sweat_smile:

Consider publishing it to Ansible Galaxy too and people will be able to pull it using the tools built into Ansible itself.

One addition that you might want to make: add

      - "{{ openhab_home }}/cont-init.d:/etc/cont-init.d"

to the volumes. Any scripts that exist in {{ openhab_home }}/cont-init.d will be executed prior to openHAB starting up inside the container. It’s a great way to add stuff to the container (e.g. ffmpeg) without needing to build a custom image.
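
For example, a task that drops such a script into that folder could look like this (a sketch; install-ffmpeg.sh is a made-up name for a script kept in the role’s files/ directory):

- name: Deploy container init scripts that run before openHAB starts
  copy:
    src: install-ffmpeg.sh                                   # hypothetical script that installs ffmpeg inside the container
    dest: "{{ openhab_home }}/cont-init.d/install-ffmpeg.sh"
    owner: openhab
    group: openhab
    mode: u=rwx,g=rx,o=rx
  become: True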

Considering yes, but I have to familiarize myself first with how to publish to Ansible Galaxy. For now my aim is to publish all relevant roles publicly to enable others to build up an openHAB server from scratch :slight_smile: And that means writing documentation :cold_sweat:

On a sidenote - might I also publish your little helper roles where necessary to enable this? For example create-service-account & min-writes? Or would you do it yourself so that I could reference them as a submodule/dependency?

Oh, that sounds good :+1:

Run ansible-lint on the roles and it will point out how to document them for ansible-galaxy publication. I think ultimately the code needs to be in a git service somewhere so you are part of the way there already. Then ansible-lint will show you where you deviate from Ansible best practices (I’ve only 10000 more findings to go).

How to write Ansible Roles and publish them on Ansible Galaxy? provides a good tutorial for how to publish to Galaxy. It’s way easier for end users than requiring a git clone and copy type stuff. The docs are mostly captured in the README.md and in role/meta/main.yml (where you define stuff like license, versions, dependencies, etc.).
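
A bare-bones meta/main.yml looks something like this (all of the values here are placeholders):

galaxy_info:
  author: your_name
  description: Installs and manages openHAB in Docker
  license: Apache-2.0
  min_ansible_version: "2.9"
  platforms:
    - name: Debian
      versions:
        - buster
  galaxy_tags:
    - openhab
    - docker

dependencies: []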

You can publish the other ones too if you like, though I don’t always run min-writes any more. I think zram works better overall and plan on moving to that at some point.

I do have plans to publish stuff to Ansible Galaxy at some point (I have 47 active roles in my system and another dozen or so more that still work but I don’t use any more) but it’s a long slog to make them ready and I’ve higher priority things going right now.

Just include attribution and I’m good with whatever you want to do.

Jep - and most (all?) of the files are actually created for you when you create the role with ansible-galaxy init - but it’s still a big step from ‘the role works for me’ to ‘all Galaxy requirements are satisfied AND the documentation is good enough to make the role useful for others’…

:+1:

I feel you…

Of course