EDIT: Fixed some bugs identified in the comments
Note: I talk a lot about Gogs below. This weekend (10/4/2020) I gave up on Gogs and am now using GitLab.
I was asked to provide some more information on Ansible and using it with openHAB. For the basics of Ansible, see A quick intro to Ansible, An Ansible 'Getting Started' Guide, and https://docs.ansible.com/. This post isn't going to rehash the basics and will assume some knowledge of Ansible.
Instead I’m going to go over my openHAB related roles, how they work, and the thought process behind them.
Overall Environment
My overall IT environment is much larger than just home automation. Those parts are out of scope for this post for the most part, though I might bring some roles in for discussion and example.
My overall hardware environment consists of a server-class machine running VMware ESXi and a number of Raspberry Pis of various models (1s, 0Ws, and 3s) connected to the same network via ethernet, WiFi, or VPN.
On the ESXi machine I have five VMs (one now powered off):
| Machine Name | Purpose |
|---|---|
| huginn | Virtual desktop running Ubuntu 20.04 Desktop (if anyone has a recommendation for a good distro that performs well as a VM, I'm open to suggestions) |
| fafnir | NAS running OpenMediaVault, sharing file systems using NFS and, in some cases, SAMBA |
| medusa | Media server hosting stuff like NextCloud (and related services), Plex, and Guacamole |
| argus | The home automation server |
| fenrir | I've turned this one off, but it used to run Nightscout and related services |
The RPis include:

| Machine Name | Type | Purpose |
|---|---|---|
| manticore | RPi 1B | NUT server; runs sensor_reporter to push Govee BT temp/humidity sensor readings to MQTT, plus various other odd jobs |
| hydra | RPi 0W | Connected to the door and window sensors that were left behind when the alarm system was ripped out before we bought the house; runs sensor_reporter to report the changes over MQTT |
| cerberos | RPi 3 | Connected to reed sensors and relays to control the garage doors; has a Raspicam attached to verify whether the doors are open and to monitor the garage. Uses sensor_reporter to publish the status of the doors and subscribe for commands to control them |
| norns | RPi 3 | Located over 100 miles away at my dad's house; runs openHAB with a small Zwave network. That openHAB is configured with the MQTT 2.5 Event Bus to report the status of its sensors to the openHAB on argus |
| hestiapi | RPi 0W | HestiaPi thermostat; not managed by Ansible (yet?). Also runs openHAB, but limited to just the thermostat |
For a small business this isn't much, but for a single person to establish and maintain in their spare time, it would be pretty hard to manage manually. Enter Ansible.
Ground Rules
When working with Ansible there are a few ground rules to keep in mind.
- It only really works if everything possible that is done to configure the machines is done through Ansible. Don't manually edit config files (openHAB is a special case, more on that later), install software, etc. Everything goes through Ansible.
- Make the playbooks and tasks idempotent. If no change is required, no change should be made. This is important because it means the playbooks make the minimum number of changes required, which means the same scripts can be used to update/upgrade as were used to install everything in the first place.
- DRY (Don't Repeat Yourself). If you find yourself writing the same code over and over, create a new role out of it and import that into your other roles.
- Make liberal use of variables. Paths, user IDs, version numbers (when you don't want latest), etc. should be in variables.
- Keep secrets safe. Ansible Vault is a good way to store sensitive information like passwords and API keys. NOTE: I'm not yet using this but plan to move to it; a sketch of what that looks like follows this list.
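When I do make that move, the pattern itself is simple. A minimal sketch with hypothetical variable names (the file is created and edited with `ansible-vault create` / `ansible-vault edit`):

```yaml
# group_vars/all/vault.yml -- encrypted at rest by Ansible Vault
# (hypothetical variable names, not my real secrets file)
vault_mosquitto_password: "s3cret"
vault_influxdb_admin_password: "also-s3cret"
```

Playbooks reference the vault_* variables like any other variable, and you add `--ask-vault-pass` (or `--vault-password-file`) to the ansible-playbook command so they can be decrypted at run time.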
Prerequisites
Unfortunately, Ansible can’t do everything. In particular:
- ssh logins need to be enabled
- python3 needs to be installed
Beyond that, because I'm lazy and/or haven't figured out how to accomplish them with Ansible, I do the following manually:

- create user rich (mostly needed on the RPis, as I always remove the pi user)
- configure user rich with ssh certs
- install tripwire (it uses a curses-based UI during installation and I haven't figured out how to interact with it from Ansible yet)

Especially with the RPis, if you have a lot of them you could grab a stock image, make these few small changes, and then resave the image. I just do them manually before running Ansible.
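For the python3 prerequisite, at least, there is an escape hatch I could use instead: Ansible's raw module works over bare ssh with no python on the target. A sketch of a bootstrap play (untested in my environment):

```yaml
---
# Bootstrap python3 with the raw module, which needs only ssh on the target.
# gather_facts must be off because fact gathering itself requires python.
- hosts: pis
  gather_facts: false
  become: true
  tasks:
    - name: Install python3 if it is missing
      raw: test -e /usr/bin/python3 || (apt-get update && apt-get install -y python3)
      changed_when: false
```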
Inventory
The Inventory is pretty straightforward.
[desktops]
huginn
[media]
medusa
[homeauto]
argus
[nas]
fafnir
[pis]
manticore
cerberos
hydra
norns-vpn
[sensor_reporter]
cerberos
hydra
manticore
This is where a given machine is given a job. You can also use it to tag machine types (e.g. sensor_reporter), and it's a place where you can define host-specific variables. The categories we care about here are homeauto and pis.
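Host-specific variables go right on the host line. A hypothetical example (these values are illustrative, not from my real inventory):

```ini
[pis]
manticore ansible_user=rich
norns-vpn ansible_host=10.8.0.6
```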
The Playbooks
Next we have the playbooks. I have a separate playbook for each category, so let's look at the home-auto.yml file.
---
# Prerequisites:
# - install ssh
# - set up ssh for certificate only logins
# - install python3
# - all hosts added to the inventory under [homeauto]
# TODO set up root with ssh certs for git interactions
- hosts: homeauto
vars:
- update: False
roles:
- common
- { role: vm, when: not update }
- firemotd
- mount-data
- nut-client
- msmtp
- fail2ban
- { role: multitail, openhab_logs: '/srv/openhab2/userdata/logs/' }
- { role: docker, when: not update }
- role: portainer
vars:
- portainer_server: False
- pport: 9002
- openzwave
- mosquitto
- influxdb
- grafana
- openhab
- docker-prune
- tripwire
Takeaways:

- I have a common role that does stuff like run apt update and apt upgrade, install tools like vim, install fish (my preferred shell), and, if my Gogs server is running, check out the fish configs.
- There is an update variable set to False. When update is True, some of the roles will be skipped. The roles that get skipped are those that pretty much just install stuff through apt; in those cases the apt upgrade already performed in common handles them. This is primarily a time saver when running the roles to update the machines (see the note after this list for how update gets flipped).
- mount-data mounts the shares from fafnir as NFS mounts.
- Some roles whose software is just installed through apt, like msmtp and fail2ban, are still run during an update because the configuration might need to change.
- You will see in the multitail role that I don't always follow my own advice. openhab_logs should be a variable.
- The portainer role can be used to install either the server or the agent. I need to do this more for some of my other roles (e.g. calibre-desktop and calibre-docker, nut-server and nut-client, etc.).
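One note on that update variable: extra-vars passed on the command line override play vars, so the same playbook can be switched into update mode without editing anything, with something like `ansible-playbook -i inventory home-auto.yml -e "update=True"` (the inventory file name here is an assumption).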
Here is the pis.yml file.
---
- hosts: pis
vars:
- update: False
roles:
- common
- firemotd
- { role: min-writes, when: "inventory_hostname != 'norns-vpn'" }
- msmtp
- { role: nut-server, when: "inventory_hostname == 'manticore'" }
- { role: sensor_reporter, when: "'sensor_reporter' in group_names" }
- tripwire
Takeaways include:

- There are a lot of the same roles. These are indeed the exact same roles as were used for homeauto.
- Some roles are skipped depending on which host the play is running against. This is probably not a best practice, but it was simpler than trying to handle it in the inventory.
- I skip the min-writes role for norns because I set up zram on that machine through openhabian-config.
Creating a role
As you should already know, an Ansible role follows a specific directory structure. I use ansible-galaxy
to initialize a new role with a command like
ansible-galaxy init roles/new-role
This will create all the directories and populate them with initial yml files to be edited.
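The skeleton it creates looks like this (the exact contents vary a little between Ansible versions):

```text
roles/new-role/
├── README.md
├── defaults/main.yml
├── files/
├── handlers/main.yml
├── meta/main.yml
├── tasks/main.yml
├── templates/
├── tests/
│   ├── inventory
│   └── test.yml
└── vars/main.yml
```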
Every file and folder above was created for me. All I had to do was edit them.
Showing every single role here would make this post too long. If there are any roles you would like to see that I don't show, let me know.
Mosquitto
---
# tasks file for roles/mosquitto
- name: Create the mosquitto user
include_role:
name: create-user
vars:
uid: "{{ mosquitto_uid }}"
gid: "{{ mosquitto_uid }}"
user_name: mosquitto
create_home: False
service: Mosquitto
- name: Create mosquitto data folders
file:
path: "{{ item }}"
state: directory
owner: mosquitto
group: mosquitto
mode: u=rwx,g=rwx,o=rx
loop:
- "{{ mosquitto_home }}"
- "{{ mosquitto_home }}/config"
- "{{ mosquitto_home }}/data"
- "{{ mosquitto_home }}/log"
become: True
- name: Copy the prepared mosquitto.conf file
copy:
src: mosquitto.conf
dest: "{{ mosquitto_home }}/config/mosquitto.conf"
mode: u=rw,g=rw
become: True
become_user: mosquitto
- name: check to see if the passwd file exists
stat:
path: "{{ mosquitto_home }}/config/passwd"
changed_when: False
register: passwd_file
- name: Create an empty passwd file
file:
path: "{{ mosquitto_home }}/config/passwd"
state: touch
owner: mosquitto
group: mosquitto
mode: u=rw,g=r,o=r
when: not passwd_file.stat.exists
become: True
- name: Pull/update and start the Mosquitto service
docker_container:
detach: True
exposed_ports:
- "1883"
- "9001"
- "8883"
image: eclipse-mosquitto
log_driver: syslog
name: mosquitto
published_ports:
- "1883:1883"
- "9001:9001"
- "8883:8883"
pull: True
restart: False
restart_policy: always
state: started
user: "{{ mosquitto_uid }}"
volumes:
- /etc/passwd:/etc/passwd:ro
- /etc/localtime:/etc/localtime:ro
- /usr/share/zoneinfo:/usr/share/zoneinfo:ro
- "{{ mosquitto_home }}/config:/mosquitto/config"
- "{{ mosquitto_home }}/log:/mosquitto/log"
- "{{ mosquitto_home }}/data:/mosquitto/data"
register: mosquitto_pulled
- name: Wait if container restarted
pause:
minutes: 1
when: mosquitto_pulled.changed
- name: Check to see if we can log in
shell:
cmd: docker exec mosquitto mosquitto_sub -h localhost -u {{ mosquitto_user }} -P {{ mosquitto_password }} -t test -E
register: mosquitto_sub
changed_when: False
failed_when: False
- name: Update the user and password if logging in failed or the passwd file doesn't exist
shell:
cmd: docker exec mosquitto mosquitto_passwd -c -b /mosquitto/config/passwd {{ mosquitto_user }} {{ mosquitto_password }}
when: (not passwd_file.stat.exists) or
('not authorized' in mosquitto_sub['stdout'])
- name: Restart mosquitto if password changed
docker_container:
name: mosquitto
restart: True
when: (not passwd_file.stat.exists) or
('not authorized' in mosquitto_sub['stdout'])
How it works:

- Create a user for the service to run under
- Create the mosquitto data folders
- Under roles/mosquitto/files I've saved a mosquitto.conf that contains the full mosquitto configuration. This gets copied from where we are running Ansible to the machine we are deploying to.
- Create an empty user/password file if one doesn't already exist
- Pull down and run the mosquitto image from DockerHub
- If the container restarted (which only happens when the image changed on DockerHub), wait long enough for the container to come back up
- Try to connect to mosquitto using the configured username/password (configured in environment variables)
- If logging in failed, create/update the mosquitto user and password. Notice this is part of the idempotency; a non-idempotent way to do it would be to update the username/password every time, whether or not it needs to be changed.
- Finally, if the username/password were updated, restart the container to pick up the changes
Takeaways:

- Actions are only performed when there is a change; idempotent
- It's a TODO of mine to deploy and use certificates for SSL/TLS connections
- A lot of the fields listed in the docker_container task are not required, as they are set to the default and/or are redundant. I keep them to document things about the container. For example, ports 1883, 8883, and 9001 are exposed by the Dockerfile and are not required in the docker_container task.
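The role leans on variables like mosquitto_uid and mosquitto_home that aren't shown above; they live in the role's defaults and in my global variables. A hypothetical roles/mosquitto/defaults/main.yml (my actual values differ):

```yaml
# Hypothetical defaults for the mosquitto role; override per host/group as needed
mosquitto_uid: 1883                # dedicated UID, tracked centrally to avoid collisions
mosquitto_home: /srv/mosquitto     # bind-mounted into the container
mosquitto_user: openhab            # broker login used by the clients
mosquitto_password: "{{ vault_mosquitto_password }}"  # ideally from Ansible Vault
```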
InfluxDB
---
# tasks file for roles/influxdb
- name: Create the influxdb user
include_role:
name: create-user
vars:
uid: "{{ influxdb_uid }}"
gid: "{{ influxdb_uid }}"
user_name: influxdb
create_home: False
service: InfluxDB
- name: Check to see if this is a new install
stat:
path: "{{ influxdb_home }}/data"
register: new_install
- name: Restore from backup
include_role:
name: generic-restore
vars:
service_name: influxdb
service_home: "{{ influxdb_home }}"
service_user: influxdb
service_group: influxdb
when: not new_install.stat.exists
- name: Create the needed directories
file:
path: "{{ item }}"
state: directory
owner: influxdb
group: influxdb
mode: u=rwx,g=rx,o=rx
become: true
register: dir_created
loop:
- "{{ influxdb_home }}/config"
- "{{ influxdb_home }}/data"
- "{{ influxdb_home }}/logs"
when: (not new_install.stat.exists) and (not 'tgz' in backup['stdout'])
- name: Populate the conf file
ini_file:
create: True
dest: "{{ influxdb_home }}/config/influxdb.conf"
section: "{{ item.section }}"
option: "{{ item.option }}"
value: "{{ item.value }}"
mode: u=rw,g=rw
owner: influxdb
group: influxdb
loop:
- { section: "meta", option: "dir", value: '"/var/lib/influxdb/meta"' }
- { section: "data", option: "dir", value: '"/var/lib/influxdb/data"' }
- { section: "data", option: "engine", value: '"tsml"' }
- { section: "data", option: "wal-dir", value: '"/var/lib/influxdb/wal"' }
become: True
- name: Pull and start InfluxDB container
docker_container:
detach: True
exposed_ports:
- "8086"
hostname: "{{ ansible_fqdn }}"
image: influxdb
name: influxdb
log_driver: syslog
published_ports:
- "8086:8086"
pull: True
restart: False
restart_policy: always
user: "{{ influxdb_uid }}:{{ influxdb_uid }}"
volumes:
- "{{ influxdb_home }}/config:/etc/influxdb"
- "{{ influxdb_home }}/data:/var/lib/influxdb"
- "{{ influxdb_home }}/logs:/var/log/influxdb"
- /etc/localtime:/etc/localtime:ro
- /etc/passwd:/etc/passwd:ro
register: influxdb_pulled
- name: Sleep for a few seconds to give the container a chance to come up
pause:
seconds: 20
when: influxdb_pulled.changed
- name: Install influxdb python module
pip:
name: influxdb
state: present
become: True
- name: Create the admin user and grant permissions
influxdb_user:
admin: True
hostname: "{{ influxdb_ip_address }}"
user_name: "{{ influxdb_admin_user }}"
user_password: "{{ influxdb_admin_password }}"
- name: Install the backup script
include_role:
name: generic-backup
vars:
service: influxdb
service_home: "{{ influxdb_home }}"
service_user: influxdb
service_group: influxdb
frequency: daily
email: "{{ email_login }}"
How it works:

1. Create the user
2. Look to see if this is a new install or not. We determine this by whether the data folder exists; if it doesn't, we need to do a restore from backup.
3. If the data folder doesn't exist, restore the backup using a common restore role (see below)
4. Next, make sure the InfluxDB folder and subfolders exist and have the right ownership and permissions.
5. Modify the config file as required by editing it in place, instead of copying a premade file over like was done with Mosquitto.
6. Pull the latest image from DockerHub and, if it's changed, run a new container.
7. If the container was updated/restarted, give it time to come up.
8. Install prerequisites and create the admin user and grant permissions.
9. Finally, deploy the backup script using a common role.
Takeaways:

- The Ansible-provided modules are idempotent, but when performing actions using shell or command it's up to you to make them idempotent (a sketch follows).
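A minimal sketch of that idea (an illustrative task, not one of mine): the creates argument skips a shell task entirely when its output already exists.

```yaml
# Illustrative only: the command runs once; later runs are skipped
- name: Generate a secret key the first time only
  shell: openssl rand -hex 32 > /etc/example/secret.key
  args:
    creates: /etc/example/secret.key   # task is skipped if this file exists
  become: True
```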
Backup and Restore
The generic-backup role creates a script to automatically tgz up a given folder and save it to the mounted backups folder.
---
# tasks file for roles/generic-backup
- name: Make sure the backup folder exists
file:
path: "{{ backup_home }}/{{ service }}"
state: directory
owner: "{{ service_user }}"
group: "{{ service_group }}"
mode: u=rwx,g=rwx,o=rx
become: True
- name: Install the backup script
template:
dest: /etc/cron.{{ frequency }}/backup-{{ service }}
mode: u=rwx,g=rx,o=rx
src: backup-script
become: True
A third way to deploy config files using Ansible is via templates. The template file backup-script in this case is
#!/bin/bash
echo "Backing up {{ service }}"
file={{ backup_home }}/{{ service }}/{{ service }}-$(date +%Y-%m-%d_%H%M).tgz
cd {{ service_home }}
tar cfz "$file" .
sendmail=/usr/sbin/sendmail
email={{ email }}
to='To: '$email'\n'
from='From: '$email'\n'
subject='Subject: {{ service }} Backed Up\n\n'
body=${file}
msg=${to}${from}${subject}${body}
echo -e "$msg" | $sendmail $email
The big difference between a template and just copying a file over, like was done in the mosquitto role, is that Ansible substitutes variables into the file before it's copied. This script gets copied to one of the cron.* folders depending on the desired backup frequency. When it runs, it creates a tgz with the date and time in the file name.
In another thread I mentioned that I need to do more than just send an email indicating the backup ran, and add some checks like the tgz file size and maybe a table of contents. When I do that, I'll add it to the template file above, and the next time I run an update Ansible will deploy the changed file for all roles that use this common role.
For restoration I use:
---
# tasks file for roles/generic-restore
- name: Get the most recent backup, if there is one
shell: ls -t {{ backup_home }}/{{ service_name }}/{{ service_name }}*.tgz | head -n 1
register: backup
changed_when: False
become: True
- name: Print backup file
debug:
msg: Restoring from {{ backup['stdout'] }}
when: ('tgz' in backup['stdout'])
- name: Create the home folder
file:
path: "{{ service_home }}"
state: directory
owner: "{{ service_user }}"
group: "{{ service_group }}"
mode: "u=rwx,g=rwx,o=rx"
become: True
- name: Restore the backup if it exists
unarchive:
src: "{{ backup['stdout'] }}"
dest: "{{ service_home }}"
remote_src: True
when: ('tgz' in backup['stdout'])
become: True
become_user: "{{ service_user }}"
These two roles are designed to work as a pair. The restore looks in the backup folder for the latest backup file the backup script created and unzips it to the destination folder.
Grafana
---
# tasks file for roles/grafana
- name: Create grafana user
include_role:
name: create-user
vars:
uid: "{{ grafana_uid }}"
gid: "{{ grafana_uid }}"
user_name: grafana
create_home: False
service: Grafana
- name: Check to see if this is a new install
stat:
path: "{{ grafana_home }}/grafana.db"
register: new_install
- name: Restore from backup
include_role:
name: generic-restore
vars:
service_name: grafana
service_home: "{{ grafana_home }}"
service_user: grafana
service_group: grafana
when: not new_install.stat.exists
- name: Pull/update Grafana docker container
docker_container:
detach: True
env:
GF_USERS_ALLOW_SIGN_UP: "false"
GF_AUTH_ANONYMOUS_ENABLED: "true"
GF_SECURITY_ALLOW_EMBEDDING: "true"
GF_SECURITY_COOKIE_SECURE: "false"
GF_SECURITY_COOKIE_SAMESITE: "lax"
GF_SECURITY_ADMIN_USER: "admin"
GF_SECURITY_ADMIN_PASSWORD: "admin"
GF_RENDERING_SERVER_URL: http://{{ inventory_hostname }}:3001/render
GF_RENDERING_CALLBACK_URL: http://{{ inventory_hostname }}:3000/
GF_LOG_FILTERS: rendering:debug
exposed_ports:
- "3000"
hostname: grafana.koshak.net
image: grafana/grafana
log_driver: syslog
name: grafana
published_ports:
- "3000:3000"
pull: True
restart: False
restart_policy: always
user: "{{ grafana_uid }}:{{ grafana_uid }}"
volumes:
- "{{ grafana_home }}:/var/lib/grafana"
- /etc/localtime:/etc/localtime:ro
- /etc/passwd:/etc/passwd:ro
- name: Start the image renderer
docker_container:
detach: True
exposed_ports:
- "8081"
hostname: grafana-renderer.koshak.net
image: grafana/grafana-image-renderer
log_driver: syslog
name: grafana-image-renderer
published_ports:
- "3001:8081"
pull: True
restart: False
restart_policy: always
user: "{{ grafana_uid }}:{{ grafana_uid }}"
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/passwd:/etc/passwd:ro
- name: Install the backup script
include_role:
name: generic-backup
vars:
service: grafana
service_home: "{{ grafana_home }}"
service_user: grafana
service_group: grafana
frequency: daily
email: "{{ email_login }}"
How it works:
- You might start to see a pattern here. First create the user.
- Check to see if we need to restore from backup
- Restore from backup if required.
- Pull the latest Grafana image from DockerHub and if it’s changed recreate the container.
- Pull the latest Grafana Image Renderer from DockerHub and if it’s changed recreate the container.
- Deploy the backup script.
This role posed a bit of a challenge for me. The problem was that if I used the Docker host's hostname as the hostname for the container, it resolved as localhost, which meant Grafana couldn't reach the image renderer. That's why they have different hostnames. I probably could have made it work, but I gave up.
By this point you should start to see the rhythm of these roles. You should also see that everything is documented and that changing the configuration is pretty simple: just edit the role.
openHAB
The role you've all been waiting for! This one is a little more complicated because:
- the openHAB configs are stored in Gogs
- InfluxDB database and users need to be created
So why don't I use the same backup script as above? I'm strongly of the opinion that the openHAB configuration is more than just configuration: it's fully fledged software development. Even if you do everything through PaperUI, I recommend using some sort of source control to keep and store your OH configs if you can manage it.
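If you go this route, note that the runtime directories don't belong in the repo, which is why the role below has to recreate cache, logs, persistence, and tmp after a fresh checkout. A hypothetical .gitignore for an openHAB config repo:

```text
# Hypothetical .gitignore; track conf/ and the durable parts of userdata/
userdata/cache/
userdata/logs/
userdata/tmp/
userdata/persistence/
userdata/backup/
```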
---
# tasks file for roles/openhab
- name: Create the openhab user and group
include_role:
name: create-user
vars:
uid: "{{ openhab_uid }}"
gid: "{{ openhab_uid }}"
user_name: openhab
create_home: False
service: openHAB
- block:
- name: Add openhab user to the dialout group
user:
append: True
groups: dialout
name: openhab
- name: Create if necessary and set the permissions on the openHAB data folder
file:
path: "{{ openhab_home }}"
state: directory
owner: openhab
group: openhab
mode: u=rwx,g=rwx,o=rx
become: True
- name: See if config is already present
stat:
path: "{{ openhab_home }}/conf"
register: conf_present
- name: Check to see if Gogs is up
shell: "nc -vz {{ gogs_host }} {{ gogs_port }}"
register: gogs_running
changed_when: False
- name: Checkout openHAB configs if this is a new install
git:
repo: "{{ openhab_conf_repo }}"
dest: "{{ openhab_home }}"
accept_hostkey: True
when: (gogs_running['stderr'] is match(".* succeeded!")) and
(not conf_present.stat.exists)
- name: Create missing folders
file:
path: "{{ item }}"
state: directory
owner: openhab
group: openhab
mode: u=rwx,g=rwx,o=rx
loop:
- "{{ openhab_home }}/userdata/cache"
- "{{ openhab_home }}/userdata/logs"
- "{{ openhab_home }}/userdata/persistence"
- "{{ openhab_home }}/userdata/tmp"
become: True
- name: Change ownership of openHAB configs
file:
path: "{{ openhab_home }}"
owner: openhab
group: openhab
recurse: yes
become: True
when: (gogs_running['stderr'] is match(".* succeeded!")) and
(not conf_present.stat.exists)
- name: Create the InfluxDB database
influxdb_database:
hostname: "{{ influxdb_ip_address }}"
database_name: "{{ openhab_influxdb_database_name }}"
state: present
username: "{{ influxdb_admin_user }}"
password: "{{ influxdb_admin_password }}"
- name: Create the openHAB user and grant permissions
influxdb_user:
user_name: "{{ influxdb_openhab_user }}"
user_password: "{{ influxdb_openhab_password }}"
login_username: "{{ influxdb_admin_user }}"
login_password: "{{ influxdb_admin_password }}"
grants:
- database: "{{ openhab_influxdb_database_name }}"
privilege: 'ALL'
- name: Create Grafana user and grant read permissions
influxdb_user:
user_name: "{{ influxdb_grafana_user }}"
user_password: "{{ influxdb_grafana_password }}"
login_username: "{{ influxdb_admin_user }}"
login_password: "{{ influxdb_admin_password }}"
grants:
- database: "{{ openhab_influxdb_database_name }}"
privilege: 'READ'
# TODO check version numbers to determine if we need to restart
- name: Pull/update the openHAB docker image
docker_container:
detach: True
devices:
- "/dev/ttyUSB0:/dev/ttyUSB0:rwm"
- "/dev/ttyUSB1:/dev/ttyUSB1:rwm"
env:
CRYPTO_POLICY: unlimited
hostname: argus.koshak.net
image: openhab/openhab:{{ openhab_version }}
log_driver: syslog
name: openhab
network_mode: host
pull: True
restart: False
restart_policy: always
tty: True
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- "{{ openhab_home }}/conf:/openhab/conf"
- "{{ openhab_home }}/userdata:/openhab/userdata"
- "{{ openhab_home }}/addons:/openhab/addons"
register: openhab_pulled
- name: Wait a couple minutes if a new image was pulled
pause:
minutes: 5
when: openhab_pulled.changed
- name: Restart the openHAB container if a new image was pulled
docker_container:
name: openhab
restart: True
when: openhab_pulled.changed
How it works:

1. Create the user
2. Add the user to the right groups
3. Create the folder
4. Check to see if we need to restore from Gogs
5. If needed, check out the latest configs from Gogs
6. Make sure all the needed folders exist and have the right ownership and permissions
7. Create the InfluxDB database for openHAB
8. Create the InfluxDB openHAB user
9. Create the InfluxDB grafana user
10. Pull the latest openHAB image and recreate the container if it changed
11. If a new image was pulled, wait five minutes and restart the container
So why is there step 11? As of right now, OH causes a problem for some users, myself included, where after the cache is cleared OH comes back up without recognizing some Items. However, if you wait long enough and then restart, the problem goes away. You may notice that this makes the role not purely idempotent: step 11 is only actually required when the version of OH changes, and the image changes far more frequently than there is a new release of OH (unless you are running the snapshots). Consequently, step 11 should only be performed if the versions from before and after step 10 differ. It's on my todo list to add that; the role could look at $OH_HOME/userdata/version.properties to do it.
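A hedged sketch of that todo, using the version.properties path mentioned above (I haven't tested this, and the file's exact contents may differ between OH versions):

```yaml
# Sketch: capture the version before the pull, again after, restart only on change
- name: Read the openHAB version before updating
  shell: cat {{ openhab_home }}/userdata/version.properties
  register: oh_version_before
  changed_when: False
  failed_when: False

# ... pull/update the container as in step 10, give the new container
# time to rewrite version.properties, then:

- name: Read the openHAB version after updating
  shell: cat {{ openhab_home }}/userdata/version.properties
  register: oh_version_after
  changed_when: False
  failed_when: False

- name: Restart openHAB only when the OH version actually changed
  docker_container:
    name: openhab
    restart: True
  when: oh_version_before['stdout'] != oh_version_after['stdout']
```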
min-writes
---
# tasks file for roles/min-writes
# http://www.zdnet.com/article/raspberry-pi-extending-the-life-of-the-sd-card
- name: Mount temp and log paths to tmpfs
mount:
path: "{{ item.path }}"
src: tmpfs
fstype: tmpfs
opts: defaults,noatime,nosuid,size={{ item.size }}
dump: "0"
state: mounted
loop:
- { "path": "/tmp", "size": "100m" }
- { "path": "/var/tmp", "size": "30m" }
- { "path": "/var/log", "size": "100m" }
register: mounted
become: True
- name: Reboot if changed
include_role:
name: reboot
when: mounted.changed
This one is pretty simple: create tmpfs mounts at the places where the RPi commonly writes, which takes those writes off of the SD card and puts them in memory. It's quick and dirty, but effective.
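Because the mount module is called with state: mounted, it also writes the entries to /etc/fstab so they survive reboots; the resulting line for /tmp looks something like:

```text
tmpfs /tmp tmpfs defaults,noatime,nosuid,size=100m 0 0
```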
Reboot
---
# tasks file for roles/reboot
- name: Restart the machine
shell: sleep 2 && shutdown -r now "Ansible updates triggered"
async: 1
poll: 0
become: True
ignore_errors: True
- name: Wait for machine to come back from reboot
local_action: wait_for host={{ ansible_hostname }} state=started delay=30 timeout=300
I include this role here because it shows an interesting capability: we restart the remote machine and then wait for it to come back online before continuing.
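Worth noting: Ansible 2.7 and later bundle this whole pattern into a built-in module, so the role could shrink to something like:

```yaml
# Equivalent using the built-in reboot module (Ansible >= 2.7)
- name: Restart the machine and wait for it to come back
  reboot:
    reboot_timeout: 300   # seconds to wait for the host to return
  become: True
```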
sensor_reporter
---
# tasks file for roles/sensor_reporter
- name: Create sensor_reporter user
include_role:
name: create-user
vars:
uid: "{{ sensor_reporter_uid }}"
gid: "{{ sensor_reporter_uid }}"
user_name: sensor_reporter
create_home: False
service: sensor_reporter
- name: Check to see if the gpio group exists
shell: grep gpio /etc/group
register: rpi_groups
failed_when: False
changed_when: False
- name: Add the sensor_reporter user to the gpio group
user:
append: True
groups: gpio
name: sensor_reporter
become: True
when: ('gpio' in rpi_groups['stdout'])
- name: Install the prerequisites
apt:
name: [libglib2.0-dev, bluetooth, bluez, python3-bluez, libgpiod2, net-tools]
update_cache: no
become: yes
- name: Install python libraries
pip:
name: [bluepy, bleson, RPI.GPIO, adafruit-blinka, adafruit-circuitpython-dht, paho-mqtt, scapy, requests, sseclient-py]
become: True
- name: Create the home folder
file:
path: "{{ sensor_reporter_home }}"
owner: sensor_reporter
group: sensor_reporter
state: directory
mode: u=rwx,g=rwx,o=rx
become: True
- name: Checkout sensor_reporter from github
git:
accept_hostkey: True
ssh_opts: -o StrictHostKeyChecking=no
dest: "{{ sensor_reporter_home }}"
repo: "{{ sensor_reporter_repo }}"
update: True
register: checked_out
become: True
- name: Change ownership of checked out files
file:
path: "{{ sensor_reporter_home }}"
state: directory
owner: sensor_reporter
group: sensor_reporter
recurse: True
become: True
when: checked_out.changed
- name: Install the start script
copy:
src: "{{ sensor_reporter_home }}/sensor_reporter.service"
dest: /etc/systemd/system
remote_src: True
mode: a+rwx
become: True
when: checked_out.changed
- name: Make sure the service uses the user, home directory, and .ini file
lineinfile:
dest: "/etc/systemd/system/sensor_reporter.service"
regexp: "{{ item.regex }}"
line: "{{ item.value }}"
state: present
loop:
- { "regex": "^User=.*", "value": "User=sensor_reporter" }
- { "regex": "^WorkingDirectory=.*", "value": "WorkingDirectory={{ sensor_reporter_home }}" }
- { "regex": "^ExecStart=.*", "value": "ExecStart=python3 sensor_reporter.py {{ sensor_reporter_configs_home }}/{{ ansible_hostname }}.ini" }
become: True
- name: If this is manticore, run it as root
lineinfile:
dest: "/etc/systemd/system/sensor_reporter.service"
regexp: "^User=.*"
line: "User=root"
state: present
become: True
when: ansible_hostname == "manticore"
- name: Create the sensor_reporter config folder
file:
path: "{{ sensor_reporter_configs_home }}"
owner: sensor_reporter
group: sensor_reporter
state: directory
mode: u=rwx,g=rwx,o=rx
become: True
- name: Checkout the sensor_reporter configs from Gogs
git:
dest: "{{ sensor_reporter_configs_home }}"
repo: "{{ sensor_reporter_configs_repo }}"
update: True
register: configs_checkedout
become: True
- name: Change the ownership of the checked out configs
file:
path: "{{ sensor_reporter_configs_home }}"
state: directory
owner: sensor_reporter
group: sensor_reporter
recurse: True
become: True
when: configs_checkedout.changed
- name: Enable and start sensor_reporter
systemd:
name: sensor_reporter
state: started
enabled: True
daemon_reload: True
become: True
- name: Restart sensor_reporter if there was a change
systemd:
name: sensor_reporter
state: restarted
become: True
when: (checked_out.changed) or (configs_checkedout.changed)
This is a little Python script I wrote almost five years ago and have kept up since. In fact, I just finished a rewrite to make it run in Python 3, improve its overall structure and maintainability, and make it support two-way communication with openHAB in addition to MQTT (previously it only supported pushing data to openHAB). I include it here because it shows how you can create and deploy your own services.
How it works (Ansible, not sensor_reporter):

- Create the user
- Add the user to the gpio group if it exists
- Install required software and Python libraries (I need to make this work on non-RPis)
- Create the home folder and make sure the ownership and permissions are correct
- Check out the code from github
- Deploy the systemd service file to start it as a service
- Modify the systemd service. This shows yet another way to handle configs: copy a default version and then modify it as necessary.
- Create the config folder and check out the configs from the private Gogs server. The service file will look in that configs folder for a <hostname>.ini file. As is clear from above, I've several hosts running sensor_reporter, each with a different configuration.
- Finally, if there was a change that requires a restart, restart the service
code-server
We all should know about editing our openHAB configs using VSCode. What if you could access VSCode through a browser? code-server provides just that capability. I include this role because many openHAB users may find it handy for remote access and editing of openHAB configs.
---
# tasks file for roles/code-server
- name: Get {{ default_user }} uid
getent:
database: passwd
key: "{{ default_user }}"
- name: Create coder user
user:
comment: "code-server user"
createhome: no
name: coder
shell: /usr/sbin/nologin
state: present
system: no
uid: "{{ item.value[1] }}"
group: "{{ default_user }}"
non_unique: yes
with_dict: "{{ getent_passwd }}"
become: True
- name: Make sure config volumes exist on the host
file:
path: /home/{{ default_user }}/.local/share/code-server
state: directory
owner: "{{ default_user }}"
group: "{{ default_user }}"
- name: Install/update code-server
docker_container:
detach: True
log_driver: syslog
image: codercom/code-server:{{ code_server_version }}
name: code-server
published_ports:
- "8080:8080"
pull: True
restart: False
restart_policy: always
state: started
user: coder
volumes:
- /home/{{ default_user }}/.local/share/code-server:/home/coder/.local/share/code-server:rw
- /home/{{ default_user }}/code:/home/coder/project:rw
- /home/{{ default_user }}/.gitconfig:/home/coder/.gitconfig:rw
- /home/{{ default_user }}/.ssh:/home/coder/.ssh:rw
- /etc/passwd:/etc/passwd:ro
- block:
- name: Add code-server to fail2ban's jail.conf
blockinfile:
path: /etc/fail2ban/jail.conf
state: present
insertafter: EOF
block: |
[code-server]
port = 8080
enabled = true
backend = auto
logpath = /var/log/syslog
register: jail_changed
- name: Create/update the filter for fail2ban
blockinfile:
create: True
path: /etc/fail2ban/filter.d/code-server.conf
state: present
block: |
# Fail2Ban filter for code-server
[Definition]
failregex = ^.*: Failed login attempt\s+{\"xForwardedFor\":\"<HOST>\"
ignoreregex =
datepattern = ^%%b %%d %%H:%%M:%%S
register: filter_changed
- name: Restart fail2ban if there were changes
systemd:
name: fail2ban
state: reloaded
when: (jail_changed.changed) or (filter_changed.changed)
become: True
How it works:

- Inside the container everything runs as a coder user, so we create a coder user and give it the same uid as the login user.
- Then we make sure the code-server config volumes exist and have the right permissions.
- Pull and run the code-server Docker image. For a long time the latest tag was not working with their images, so I set the version using a variable and subscribe to the release announcements on github.
- The final block assumes that fail2ban is installed, and adds config so fail2ban bans IPs that appear to be trying to brute force the code-server login.
fail2ban
This isn’t as important for openHAB now, but when OH 3 comes out with login support, it will become more useful, especially for those who choose to expose their openHAB directly to the internet (not recommended).
---
# tasks file for roles/fail2ban
- block:
- name: Install fail2ban
apt:
name: fail2ban
update_cache: no
- name: Copy the prepared jail.local
copy:
src: jail.local
dest: /etc/fail2ban/jail.local
mode: u=rw,g=r,o=r
- name: Enable and start fail2ban
systemd:
daemon_reload: yes
enabled: yes
name: fail2ban.service
state: started
become: yes
When I write the OH3 role, I'll add tasks to deploy filters and configs to block failed logins for openHAB, similar to the code-server role above.
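For reference, the prepared jail.local is plain fail2ban configuration. A minimal illustrative version (not my actual file, which enables more jails) might look like:

```ini
# Illustrative jail.local; my real file enables more jails
[DEFAULT]
bantime  = 1h
findtime = 10m
maxretry = 5

[sshd]
enabled = true
```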
Guacamole
Guacamole is actually pretty awesome! It is a web based RDP/VNC/SSH service. It lets you log into a web page, select one of your machines, and interact with it right through the browser. In my use, especially since 1.2 was released, it's much more responsive than tunneling VNC through SSH. I leave this here as it might be useful for those who want a way to remotely administer their machines.
NOTE: adding guacamole to fail2ban is on my TODO list, but I'm not too worried about security as it supports TOTP 2FA (i.e. you have to enter a code from Authy or Google Authenticator or the like as a second factor in addition to your login).
---
# tasks file for roles/guacamole
- name: Create the guacamole user
include_role:
name: create-user
vars:
uid: "{{ guacamole_uid }}"
gid: "{{ guacamole_uid }}"
user_name: guacamole
create_home: False
service: Guacamole
# uses port 4822 but we don't need to expose it
- name: Pull/update guacd
docker_container:
detach: True
image: guacamole/guacd
log_driver: syslog
name: guacd
env:
GUACD_LOG_LEVEL: "info"
pull: True
restart: False
restart_policy: always
- name: Install psycopg2
pip:
name: psycopg2-binary
- name: Create guacamole PostgreSQL database
postgresql_db:
login_host: "{{ postgresql_host }}"
login_password: "{{ postgresql_password }}"
login_user: "{{ postgresql_user }}"
name: "{{ guacamole_db_name }}"
- name: Create guacamole PostgreSQL user to database
postgresql_user:
db: "{{ guacamole_db_name }}"
login_host: "{{ postgresql_host }}"
login_password: "{{ postgresql_password }}"
login_user: "{{ postgresql_user }}"
name: "{{ guacamole_db_user }}"
password: "{{ guacamole_db_password }}"
- name: Give guacamole_user permissions on tables and sequences
postgresql_privs:
database: "{{ guacamole_db_name }}"
grant_option: True
login_host: "{{ postgresql_host }}"
login_password: "{{ postgresql_password }}"
login_user: "{{ postgresql_user }}"
objs: ALL_IN_SCHEMA
privs: "{{ item.privs }}"
roles: "{{ guacamole_db_user }}"
schema: public
state: present
type: "{{ item.type }}"
loop:
- { "privs": "SELECT,INSERT,UPDATE,DELETE", "type": "table"}
- { "privs": "SELECT,USAGE", "type": "sequence"}
register: db_changed
# Untested, assumes this is running on the same host as PostgreSQL
- name: Initialize the database
block:
- name: Create initdb.sql script
shell: docker run --rm guacamole/guacamole /opt/guacamole/bin/initdb.sh --postgres > /tmp/initdb.sql
args:
creates: /tmp/initdb.sql
- name: Create the databases from initdb.sql
shell: docker exec -i postgres psql -d {{ guacamole_db_name }} {{ guacamole_db_user }} < /tmp/initdb.sql
when: db_changed.changed
- name: Create the guacamole folder
file:
path: "{{ guacamole_home }}"
state: directory
owner: guacamole
group: guacamole
mode: u=rwx,g=rwx,o=rx
become: True
- name: Get the version of guacamole
shell: docker run --rm guacamole/guacamole sh -c 'ls /opt/guacamole/postgresql/guacamole*.jar'
register: guacamole_version_cmd
changed_when: False
- name: Extract the version
set_fact:
guac_version: "{{ guacamole_version_cmd['stdout'] | regex_replace('^/opt/guacamole/postgresql/guacamole-auth-jdbc-postgresql-(.*).jar', '\\1') }}"
- name: Get the current installed TOTP extension name
shell: ls "{{ guacamole_home }}/extensions/guacamole-auth-totp-*.jar"
register: totp_file
changed_when: False
failed_when: False
- name: Extract the totp version
set_fact:
totp_version: "{{ totp_file['stdout'] | regex_replace('^' ~ guacamole_home ~ '/extensions/guacamole-auth-totp-(.*).jar', '\\1') }}"
- name: Debug
debug:
msg: guac version = {{ guac_version }} totp version= {{ totp_file['stdout'] }}
- name: Delete old and download new if totp version doesn't match guacamole version
block:
- name: Make sure extensions exists
file:
path: "{{ guacamole_home }}/extensions"
state: directory
owner: guacamole
group: guacamole
become: True
- name: Delete the old jar file
file:
path: "{{ totp_file['stdout'] }}"
state: absent
become: True
- name: Download and extract the TOTP extension
unarchive:
src: http://apache.org/dyn/closer.cgi?action=download&filename=guacamole/{{ guac_version }}/binary/guacamole-auth-totp-{{ guac_version }}.tar.gz
dest: "{{ guacamole_home }}/extensions"
remote_src: True
become: True
become_user: guacamole
- name: Copy the jar file to the right location
copy:
src: "{{ guacamole_home }}/extensions/guacamole-auth-totp-{{ guac_version }}/guacamole-auth-totp-{{guac_version }}.jar"
dest: "{{ guacamole_home }}/extensions"
remote_src: True
become: True
become_user: guacamole
when: guac_version != totp_version
- name: Start guacamole
docker_container:
detach: True
exposed_ports:
- "8080"
image: guacamole/guacamole
links:
- "guacd:guacd"
log_driver: syslog
name: guacamole
env:
GUACAMOLE_HOME: /etc/guacamole
POSTGRES_HOSTNAME: "{{ postgresql_ip }}"
POSTGRES_PORT: "5432"
POSTGRES_DATABASE: "{{ guacamole_db_name }}"
POSTGRES_USER: "{{ guacamole_db_user }}"
POSTGRES_PASSWORD: "{{ guacamole_db_password }}"
published_ports:
- "9999:8080"
pull: True
restart: False
restart_policy: always
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- /etc/hosts:/etc/hosts:ro
- "{{ guacamole_home }}:/etc/guacamole:rw"
By now, how it works should look pretty familiar:
- Create the user
- Pull and start the guacd container (I don’t know why it has two separate services)
- Create the guacamole PostgreSQL database and user and permissions
- It’s awkward but there is a script inside the container that needs to be run on PostgreSQL to initialize the database. Get the script and then execute it on PostgreSQL. These tasks assume Guacamole and PostgreSQL are running on the same host.
- Create the guacamole folder to store the addons.
- Find out the current version of Guacamole and download the TOTP addon for that version. Delete the old addon, as Guacamole won't start up if the wrong version is present.
- Finally pull the latest image from DockerHub and restart the container if it’s changed.
Lessons Learned
- If you create a lot of users like I do, put the UIDs for them all in the global all file so you can ensure that every one has a unique UID. That way you always know the UID for these users and there will never be a conflict (see the sketch below).
- It takes a lot of work to write idempotent roles. For example, I use Calibre with the DeDRM plugin, and for Calibre I build my own Docker image. To make the role idempotent I need to check the online URL for the latest released version, call calibre --version to determine the currently installed version, and only rebuild the Docker image when they differ. Then I need to do it all over again for the DeDRM plugin. As a result, the container is only restarted or otherwise changed when there is a reason to.
- Preparing a host and deploying a Docker container is usually far less work than installing the software natively. But sometimes it makes sense to install natively, and for particularly complicated stuff (e.g. something you have to check out and compile), doing it through Ansible means you only have to figure it out once.
- Periodically check back for new Docker images for things that didn't have one before. (I'm still surprised there isn't a good Calibre image on DockerHub that just runs it to provide web access, which is all I really need.)
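As a sketch of that first lesson, the global all file becomes a single registry of service UIDs (the values here are made up):

```yaml
# group_vars/all -- one place to hand out UIDs so no two services collide
mosquitto_uid: 1883
influxdb_uid: 1884
grafana_uid: 1885
openhab_uid: 1886
sensor_reporter_uid: 1887
guacamole_uid: 1888
```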