Ansible Revisited

Bruce_Osborne · October 4, 2020, 11:35am

There is a typo somewhere. That should be minutes.

rlkoshak · October 5, 2020, 2:37pm

I did test these but somehow this one must have escaped testing. That task should be

- name: Wait if container restarted
  pause:
    minutes: 1
  when: mosquitto_pulled.changed

I don’t care what passwd_file is at that point. I only need to wait if the Mosquitto container restarted, which is indicated by the mosquitto_pulled variable. And as Bruce points out, I mistyped minutes.

The intent is to only wait for Mosquitto to come back up when the container changed or was restarted for some reason.

marcel_erkel · October 5, 2020, 5:21pm

rlkoshak:

- hosts: homeauto
  vars:
    - update: False
  roles:
    - common
    - { role: vm, when: not update }
    - firemotd
    - mount-data
    - nut-client
    - msmtp
    - fail2ban
    - { role: multitail, openhab_logs: '/srv/openhab2/userdata/logs/' }
    - { role: docker, when not update }
    - role: portainer
      vars:
        - portainer_server: False 
        - pport: 9002
    - openzwave
    - mosquitto
    - influxdb
    - grafana
    - openhab
    - docker-prune
    - tripwire

Shouldn’t that be:

  - { role: docker, when: not update }

(I added a colon after when)

rlkoshak · October 5, 2020, 6:30pm

Yep, I wonder how that isn’t throwing a syntax error when I run it. I run this playbook about every couple of days to update. Good catch!
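
A likely explanation, offered as an assumption rather than something verified against this playbook: in a YAML flow mapping, an entry without a colon is parsed as a key with an empty value, so `when not update` becomes an extra role parameter instead of a condition, and the role simply runs every time. Writing the role in block form, as the portainer entry already does, makes a missing colon much harder to overlook, because there it would typically be a parse error. A minimal sketch using only the names from the quoted playbook:

```yaml
- hosts: homeauto
  vars:
    - update: False
  roles:
    # Block form: each keyword sits on its own line, so a missing
    # colon after "when" stands out instead of being silently accepted.
    - role: docker
      when: not update
```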

ysc · October 6, 2020, 11:48am

You might be interested to know that the OH3 logins have a basic built-in progressive delay mechanism to counter brute force attacks. IIRC it adds 1 second to a global lockout delay, during which nobody can log in (it’s not account-based), after every failed login. You’ll see a message in the log when this happens (but not on the login page):

[WARN ] [tp.auth.internal.AuthorizePageServlet] - Authentication failed: Too many consecutive login attempts

If you get locked out for too long after a number of failed attempts, the solution is to restart openHAB, or the “HTTP Interface Authentication” bundle responsible for logins.

Great article btw!

Bruce_Osborne · October 6, 2020, 12:19pm

@ysc
Is there any documentation or hints to help me here?

eric1905 · December 1, 2020, 9:07am

How are you installing your bindings? By hand?
There was a tutorial by @ads where I can use the rest api to install bindings which I used for ansible with openhab2. But with openhab3 I am not able to do this because I need to authenticate first. Do you already have a role for this one?
Is it also possible to create a configuration file to create my admin user after the installation process or automate the admin creation within the ansible script as well? Otherwise I would need 2 Ansible playbooks. One for the server setup and the openhab installation and one for the openhab configuration with all the bindings and so on.
Maybe @ysc can tell something regarding my second point with the user creation.

Thank you in advance.
Eric

ads · December 1, 2020, 10:00am

I haven’t found time to look into OH3, unfortunately.

ads · December 1, 2020, 10:02am

That is an excellent way to create a Denial of Service - for everyone.

Bruce_Osborne · December 1, 2020, 10:56am

You have 2 choices for that. I think the best is to generate and use an API token. If you enable it, basic authentication is also available. I use an API token with HABapp.
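
For the Ansible side, a token-based call might look roughly like the sketch below. It is only an illustration under several assumptions: `openhab_host` and `openhab_api_token` are hypothetical variables (the token would have to be created in MainUI beforehand and ideally stored in Ansible Vault), the add-on id `binding-mqtt` is just an example, and the endpoint path and accepted status code should be checked against the API explorer of your own openHAB 3 instance.

```yaml
- name: Install a binding through the openHAB 3 REST API (sketch)
  uri:
    # Assumed endpoint; verify it in the API explorer of your instance.
    url: "http://{{ openhab_host }}:8080/rest/addons/binding-mqtt/install"
    method: POST
    headers:
      # openhab_api_token is a hypothetical variable holding a token
      # that was generated manually after the first admin login.
      Authorization: "Bearer {{ openhab_api_token }}"
    status_code: 200
```

The task cannot run on a brand-new install, because the token only exists once an admin account has been created.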

rlkoshak · December 1, 2020, 3:06pm

Yes. What bindings get installed is configuration (see /var/lib/openhab/config/org/openhab/addons.conf I think). All configuration gets saved to the git repo. So when I restore the config from git, it also installs the bindings. Actually creating openHAB configuration from Ansible feels a lot like writing a program to write a program. An interesting exercise but ultimately a waste of time. But that’s just my opinion.

There is a chicken and egg problem. You can’t generate the token until you have an admin login. You don’t have an admin login until after the first time you go to MainUI after installation.

I recently saw mention that you can define users in /var/lib/openhab/etc/users.cfg but don’t know anything more than that. But it might be an avenue for research.

eric1905 · December 5, 2020, 11:59pm

I think @ysc developed the new UI. So maybe he has more information if the timezone and the admin user can be configured by cfg files

ysc · December 6, 2020, 8:21am

It’s not really a UI problem. Well, it can have consequences in the UI but in short, the users obey the provider/registry pattern that we have for things, items, rules and so on.

So you have a UserProvider interface and right now the only implementation of it is the ManagedUserProvider which will source the users from the JSONDB.
If someone ever makes another UserProvider, say from a CSV file, then the resulting users as exposed by the UserRegistry will be the union of all these users provided by all UserProviders.

(but disclaimer, I believe there will be problems, there were shortcuts to assume all users come from the ManagedUserProvider, but it doesn’t mean it has to stay that way).

fex · April 1, 2021, 10:31am

First - again a big thank you for sharing this work with us.

While adapting it for my own use, a few questions regarding the openHAB role popped up:

Role create-user:
That’s probably a private role you wrote for your own use? What’s the result of the line service: openHAB? At least the built-in Ansible user module does not know that option.
Add openhab user to the dialout group - What’s the dialout group? Somehow that doesn’t ring a bell for me - neither in the context of openHAB nor docker nor Ansible
Task Pull/update the openHAB docker image

    devices:
      - "/dev/ttyUSB0:/dev/ttyUSB0:rwm"
      - "/dev/ttyUSB1:/dev/ttyUSB1:rwm"

Where do these USB devices come into play? What do they have to do with the docker task?

Step 11 If a new image was pulled, wait five minutes and restart the container
So repeating the task Pull/update the openHAB docker image does restart the openHAB container? Or should the task after Wait a couple minutes if a new image was pulled look different? In particular, the line restart: False in both instances of the task confuses me. Also, I don’t get why you set the variable openhab_pulled again; shouldn’t that be when: openhab_pulled.changed?
Would the following be enough, or is it necessary to repeat all the options?

- name: Restart openHAB container
  docker_container:
    name: openhab
    restart: true

(You might have noticed that my understanding of the Ansible docker_container module is a bit lacking, so this is not only pettiness on my part.)

4b) In the first version of Pull/update the openHAB docker image you are using hostname: argus.koshak.net, in the second one you are using hostname: "{{ ansible_fqdn }}".

rlkoshak · April 1, 2021, 2:44pm

Yes, this is a custom role. Here is the tasks/main.yml.

---
# tasks file for roles/create-user

- name: "Create {{ user_name }} group so we can control the gid"
  group:
    gid: "{{ gid }}"
    name: "{{ user_name }}"
    state: present
    system: True
  become: True

- name: "Create {{ user_name }} user"
  user:
    comment: "{{ service }} service"
    createhome: "{{ create_home }}"
    group: "{{ user_name }}"
    name: "{{ user_name }}"
    shell: /usr/sbin/nologin
    state: present
    system: True
    uid: "{{ uid }}"
  become: True

- name: "Add {{ default_user }} to the {{ user_name }} group"
  user:
    append: True
    groups: "{{ user_name }}"
    name: "{{ default_user }}"
  become: True

It makes sure that the user and group exist with the passed-in UID and GID, and adds the “default user”, which is my usual login, to the group.

On most Debian based Linux instances a user must be a member of the dialout group to have permission to read and write to serial devices, like a Zwave controller. It’s discussed in the Linux installation instructions: openHAB on Linux | openHAB

They allow the container to access the hardware on the host. In this case ttyUSB0 is the Zwave controller and ttyUSB1 is the Zigbee Coordinator. A container does not have access to any serial devices like this without explicitly giving it permission to access them.

This is not needed in OH 3. It was just a work around to deal with the fact that when the cache was cleared in OH 2, some of us would have openHAB fail to come up properly that first time and require a restart to be operational. I’ve removed that block since moving to OH 3.

The first docker_container task pulls the new version (if there is one) and recreates the container with the given properties. It registers a variable to indicate that a new image was pulled. Only if a new image was pulled does it wait five minutes for OH to completely come back up. Then it runs the pull/restart again. This time there won’t be a new image to pull so the container is just restarted.

But again, this was to work around a bug in OH 2.5 that doesn’t exist in OH 3 so I’ve removed those.

And because this was a temporary fix for a bug, I didn’t spend much time thinking about it or working on it. I just copied the first docker_container task as written.
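
For anyone still on OH 2.5 who needs the workaround, the wait-then-restart part could be sketched more compactly as below. This is not the original approach: instead of repeating the whole docker_container task it swaps in a plain `docker restart` call, and it assumes the container is named openhab and that `openhab_pulled` was registered by the pull task.

```yaml
- name: Wait five minutes if a new image was pulled
  pause:
    minutes: 5
  when: openhab_pulled.changed

- name: Restart the openHAB container after the wait
  # Uses the docker CLI directly rather than the docker_container module,
  # so none of the container options have to be repeated here.
  command: docker restart openhab
  when: openhab_pulled.changed
  become: True
```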

I’ve also dropped InfluxDB and Grafana so my openHAB tasks have changed significantly. In fact I currently have it so it will install a test version (OH 3) and a production version (2.5) based on the value of a variable. But now that I’m fully on OH 3 all of that stuff is not used any more.

The new openHAB task is:

---
# tasks file for roles/openhab

- name: Debug
  debug:
    msg: |
      openhab_home = {{ openhab_home }}
      repo = {{ openhab_conf_repo }}
      version = {{ openhab_version }}

- name: Create the openhab user and group
  include_role:
    name: create-user
  vars:
    uid: "{{ openhab_uid }}"
    gid: "{{ openhab_uid }}"
    user_name: openhab
    create_home: False
    service: openHAB

- block:

  - name: Add openhab user to the dialout group
    user:
      append: True
      groups: dialout
      name: openhab

  - name: Create if necessary and set the permissions on the openHAB data folder
    file:
      path: "{{ openhab_home }}"
      state: directory
      owner: openhab
      group: openhab
      mode: u=rwx,g=rwx,o=rx

  become: True

- name: See if config is already present
  stat:
    path: "{{ openhab_home }}/userdata/etc/version.properties"
  register: conf_present

- name: Check to see if the git server is up
  shell: "nc -vz {{ git_host }} {{ git_port }}"
  register: git_running
  changed_when: False
  failed_when: False

- name: Checkout openHAB configs if this is a new install
  git:
    repo: "{{ openhab_conf_repo }}"
    dest: "{{ openhab_home }}"
    accept_hostkey: True
  when: (git_running['stderr'] is match(".* succeeded!")) and
        (not conf_present.stat.exists)

- name: Create missing folders
  file:
    path: "{{ item }}"
    state: directory
    owner: openhab
    group: openhab
    mode: u=rwx,g=rwx,o=rx
  loop:
    - "{{ openhab_home }}/userdata/cache"
    - "{{ openhab_home }}/userdata/logs"
    - "{{ openhab_home }}/userdata/persistence"
    - "{{ openhab_home }}/userdata/tmp"
  become: True

- name: Change ownership of openHAB configs
  file:
    path: "{{ openhab_home }}"
    owner: openhab
    group: openhab
    recurse: yes
  become: True
  when: (git_running['stderr'] is match(".* succeeded!")) and
        (not conf_present.stat.exists)

# Kept for reference but in OH 3 I've moved to rrd4j and built in charting
#- name: Create the InfluxDB database
#  influxdb_database:
#    hostname: "{{ influxdb_ip_address }}"
#    database_name: "{{ openhab_influxdb_database_name }}"
#    state: present
#    username: "{{ influxdb_admin_user }}"
#    password: "{{ influxdb_admin_password }}"
#
#- name: Create the InfluxDB openHAB user and grant permissions
#  influxdb_user:
#    hostname: "{{ influxdb_ip_address }}"
#    user_name: "{{ influxdb_openhab_user }}"
#    user_password: "{{ influxdb_openhab_password }}"
#    login_username: "{{ influxdb_admin_user }}"
#    login_password: "{{ influxdb_admin_password }}"
#    grants:
#      - database: "{{ openhab_influxdb_database_name }}"
#        privilege: 'ALL'
#
#- name: Create InfluxDB Grafana user and grant read permissions
#  influxdb_user:
#    hostname: "{{ influxdb_ip_address }}"
#    user_name: "{{ influxdb_grafana_user }}"
#    user_password: "{{ influxdb_grafana_password }}"
#    login_username: "{{ influxdb_admin_user }}"
#    login_password: "{{ influxdb_admin_password }}"
#    grants:
#      - database: "{{ openhab_influxdb_database_name }}"
#        privilege: 'READ'

- name: Check the current version of openHAB
  shell: grep openhab-distro {{ openhab_home }}/userdata/etc/version.properties | cut -d ' ' -f 4
  register: old_version
  when: conf_present.stat.exists
  changed_when: False

- name: Pull/update the openHAB docker image
  docker_container:
    detach: True
    devices:
      - "/dev/ttyUSB0:/dev/ttyUSB0:rwm"
      - "/dev/ttyUSB1:/dev/ttyUSB1:rwm"
    env:
      CRYPTO_POLICY: unlimited
    hostname: "{{ ansible_fqdn }}"
    image: openhab/openhab:{{ openhab_version }}
    log_driver: syslog
    name: openhab
    network_mode: host
    pull: True
    restart: False
    restart_policy: always
    tty: True
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
      - "{{ openhab_home }}/conf:/openhab/conf"
      - "{{ openhab_home }}/userdata:/openhab/userdata"
      - "{{ openhab_home }}/addons:/openhab/addons"
  register: openhab_pulled

All those openhab variables are defined in defaults. And when I want to deploy a test version on a different machine I override the defaults, primarily changing the openhab_version variable.
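
As a side note on the reachability check in the listing above: matching on the stderr text of nc depends on which nc variant is installed. One possible alternative, shown only as a sketch and not as what the role actually does, is the wait_for module with a short timeout. `git_check` is a hypothetical variable name; `git_host`, `git_port`, `openhab_conf_repo`, `openhab_home` and `conf_present` are the names already used in the role.

```yaml
- name: Check whether the git server is reachable
  wait_for:
    host: "{{ git_host }}"
    port: "{{ git_port }}"
    timeout: 5
  register: git_check
  # A timeout is reported as a failure; keep going and test the result below.
  ignore_errors: True

- name: Checkout openHAB configs if this is a new install
  git:
    repo: "{{ openhab_conf_repo }}"
    dest: "{{ openhab_home }}"
    accept_hostkey: True
  when:
    - git_check is succeeded
    - not conf_present.stat.exists
```

The trade-off is that an unreachable server shows up as an ignored failure in the play output instead of a silent skip.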

fex · April 5, 2021, 12:38pm

rlkoshak:

- name: Checkout openHAB configs if this is a new install
  git:
    repo: "{{ openhab_conf_repo }}"
    dest: "{{ openhab_home }}"
    accept_hostkey: True
  when: (git_running['stderr'] is match(".* succeeded!")) and
        (not conf_present.stat.exists)

And I’m running once again into my old friend java.lang.NumberFormatException: null when I try to start the container with my existing config:

Launching the openHAB runtime...
null
Error occurred shutting down framework: java.lang.NumberFormatException: null
java.lang.NumberFormatException: null
        at java.lang.Integer.parseInt(Integer.java:542)
        at java.lang.Integer.parseInt(Integer.java:615)
        at org.apache.karaf.main.ConfigProperties.<init>(ConfigProperties.java:235)
        at org.apache.karaf.main.Main.updateInstancePidAfterShutdown(Main.java:227)
        at org.apache.karaf.main.Main.main(Main.java:192)

How do you @rlkoshak manage to start a new container with an existing git repo? You once mentioned that I would need the content of the userdata folder in addition to the config folder.
So what files / folders in the userdata folder do you include into your repo? For example userdata/logs/* wouldn’t make any sense…
What else are auto-generated / often changing files that could and should be ignored by a git repo?

rlkoshak · April 5, 2021, 2:55pm

I run in Docker so both my config and userdata folders are in the same parent folder. This parent folder gets checked in to git. Here is my .gitignore:

userdata/*
!userdata/secrets
!userdata/etc
!userdata/jsondb
!userdata/config
!userdata/habot
!userdata/openhabcloud
!userdata/uuid
!userdata/zigbee
!userdata/zwave
userdata/jsondb/backup
userdata/backup
.Trash-1000
conf/html/hum.jpg
conf/html/light.jpg
conf/html/power.jpg
conf/html/temp.jpg
*py.class
*.swp
*~
*.old

If you want the persisted data to be checked in too (MapDB and rrd4j) add a !userdata/persistence to the list.

But that isn’t related to your problem. If you see that error after copying your configs over, the problem is one of your config files you’ve copied over has a parameter that expects a number but doesn’t have a parsable number as the value.

fex · April 12, 2021, 1:35pm

Maybe that’s related to just throwing my openHAB 2.5 config into a new openHAB 3 Docker instance.
At least now that I decided to start with a fresh instance & repo for OH3 and just integrate the old config piece by piece into the new repo, I haven’t had that problem again.

But leaving the null pointer issue aside, I have a more general question about your workflow. As far as I get it you check out your config-repo via Ansible and set your openhab service account as owner of these files.
I would guess you are not logging in with the openhab service account? Do you always use Ansible to pull changes? Do you also use Ansible to push changes from your openHAB machine (aka machine on which the openHAB Docker instance is running) to your git host?

Background for my questions: if I ssh into the host of my openHAB instance with a separate account (which is a member of the openhab group), let’s call it default account, and try to do any git actions manually, I’m not allowed to execute them.
If I use sudo to force the issue, I’m running into a bunch of different problems…

So how do you commit changes, for example to the jsondb, happening on argus?
In the past I circumvented that issue by skipping an openhab service account and just setting the openhab container to the U/GID of my default user. But as far as I understand it that’s not really the best idea from a security standpoint?

rlkoshak · April 12, 2021, 3:16pm

In practice yes, I pretty much only pull through Ansible. That’s primarily because I only pull when I’m moving the config to another location or another machine. And when I do that there is a ton of other stuff that has to be done so I always do that sort of thing through Ansible.

But there is nothing technically preventing me from pulling at any time. I just don’t have times where I need to pull manually.

As for pushes, I just use my login account. When I create the openhab user, I add the openhab group to my main login account. That’s the last task in the create-user role. {{ default_user }} is my main login.

But what I don’t remember is if I’ve ever had to manually add group write permissions to the files and folders to make commits and pushes possible. I don’t think so because when I moved to OH 3 I moved around quite a bit but I have no recollection of doing anything manually like that on each move.

I only push manually because I don’t necessarily want a work in progress to be committed.

Check the permissions on the files and folders. In particular, the ownership and permissions on the .git folder and files/folders under that. Make sure they are owned by openhab or your login user and that if they are owned by openhab that they have group read/write permissions.

If you’ve ever run an operation as root some of those files and folders may be owned by root now.

In general though it’s pretty straightforward for me. When I’m moving to a new location it pulls the configs from git and sets the ownership to openhab:openhab. As I make changes I git add and commit locally using my login account and finally when I’m ready I’ll push, again as my login account.

It’s not a best practice. Typically you want a service like this, especially one exposed to the network, not to have a shell account. If you try to log in as that user you can’t because it doesn’t have a password. If someone compromises the service and gets dropped to a shell, they will not actually have a shell to work in. If you use your login account that means if they break out of the service they will have a shell to work in.

Is the risk large? Probably not. But it’s a relatively simple way to improve security overall, which is why that’s the standard.

fex · April 12, 2021, 4:01pm

You are correct, the cause of my troubles was missing write permissions for the group. The interesting thing is that your task Create if necessary and set the permissions on the openHAB data folder should actually take care of the problem, or at least that’s what I was thinking.
But following the task execution via the Ansible debugger, it seems like the task is setting permissions for the openhab_home folder, but not recursively. That’s fixable by adding recurse: yes to the task and running it after the checkout / update task.

  - name: Create if necessary and set the permissions on the openHAB data folder
    file:
      path: "{{ openhab_home }}"
      state: directory
      owner: openhab
      group: openhab
      mode: u=rwx,g=rwx,o=rx
      recurse: yes
    become: true
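
One thing to keep in mind with the recursive variant: a lowercase x in the mode sets the execute bit on every regular file as well. A possible refinement, offered as a sketch rather than something tested against this role, is to use the capital X of symbolic modes, which only applies execute to directories and to files that are already executable:

```yaml
- name: Set ownership and permissions on the openHAB data folder recursively
  file:
    path: "{{ openhab_home }}"
    state: directory
    owner: openhab
    group: openhab
    # Capital X: directories stay traversable, config files do not
    # become executable just because of the recursion.
    mode: u=rwX,g=rwX,o=rX
    recurse: yes
  become: True
```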