Influxdb stops on OH startup

I’m experiencing similar issues to this topic:

The system is a fresh install of openHABian 1.6.5 on an RPi 4. As in that topic, I had issues installing influxdb from openhabian-config: the install reported a failure, although I could see that influxdb had been installed, but no users or databases had been created. I created those manually afterwards, then ran the script again to install grafana.

I can see that influxdb starts on boot but is stopped again, so I have to start it manually.
The output of sudo journalctl -u influxdb:

Jul 09 15:17:32 openhabian systemd[1]: Started InfluxDB is an open-source, distributed, time series database.
Jul 09 15:17:34 openhabian influxd[438]: ts=2021-07-09T13:17:34.629482Z lvl=info msg="InfluxDB starting" log_id=0VErVp9G000 version=1.8.6 branch=1.8 commit=v1.8.6
Jul 09 15:17:34 openhabian influxd[438]: ts=2021-07-09T13:17:34.629575Z lvl=info msg="Go runtime" log_id=0VErVp9G000 version=go1.13.8 maxprocs=4
Jul 09 15:17:34 openhabian influxd[438]: ts=2021-07-09T13:17:34.775855Z lvl=info msg="Using data dir" log_id=0VErVp9G000 service=store path=/var/lib/influxdb/data
Jul 09 15:17:34 openhabian influxd[438]: ts=2021-07-09T13:17:34.775991Z lvl=info msg="Compaction settings" log_id=0VErVp9G000 service=store max_concurrent_compactions=2 throughput_bytes_per_second=50331648 throug
Jul 09 15:17:34 openhabian influxd[438]: ts=2021-07-09T13:17:34.785056Z lvl=info msg="Open store (start)" log_id=0VErVp9G000 service=store trace_id=0VErVplG000 op_name=tsdb_open op_event=start
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.148170Z lvl=info msg="Opened file" log_id=0VErVp9G000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/_internal/monitor/1/000000001-000000
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.240082Z lvl=info msg="Reading file" log_id=0VErVp9G000 engine=tsm1 service=cacheloader path=/var/lib/influxdb/wal/openhab/autogen/2/_00001.wal size=
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.274411Z lvl=info msg="Opened shard" log_id=0VErVp9G000 service=store trace_id=0VErVplG000 op_name=tsdb_open index_version=inmem path=/var/lib/influx
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.304261Z lvl=info msg="Opened shard" log_id=0VErVp9G000 service=store trace_id=0VErVplG000 op_name=tsdb_open index_version=inmem path=/var/lib/influx
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.304662Z lvl=info msg="Open store (end)" log_id=0VErVp9G000 service=store trace_id=0VErVplG000 op_name=tsdb_open op_event=end op_elapsed=519.598ms
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.311201Z lvl=info msg="Opened service" log_id=0VErVp9G000 service=subscriber
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.311282Z lvl=info msg="Starting monitor service" log_id=0VErVp9G000 service=monitor
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.311313Z lvl=info msg="Registered diagnostics client" log_id=0VErVp9G000 service=monitor name=build
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.311337Z lvl=info msg="Registered diagnostics client" log_id=0VErVp9G000 service=monitor name=runtime
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.311356Z lvl=info msg="Registered diagnostics client" log_id=0VErVp9G000 service=monitor name=network
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.311406Z lvl=info msg="Registered diagnostics client" log_id=0VErVp9G000 service=monitor name=system
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.311447Z lvl=info msg="Starting precreation service" log_id=0VErVp9G000 service=shard-precreation check_interval=10m advance_period=30m
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.311474Z lvl=info msg="Starting snapshot service" log_id=0VErVp9G000 service=snapshot
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.311877Z lvl=info msg="Starting continuous query service" log_id=0VErVp9G000 service=continuous_querier
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.324208Z lvl=info msg="Starting HTTP service" log_id=0VErVp9G000 service=httpd authentication=true
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.324270Z lvl=info msg="opened HTTP access log" log_id=0VErVp9G000 service=httpd path=stderr
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.324291Z lvl=info msg="Auth is enabled but shared-secret is blank. BearerAuthentication is disabled." log_id=0VErVp9G000 service=httpd
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.335754Z lvl=info msg="Listening on HTTP" log_id=0VErVp9G000 service=httpd addr=127.0.0.1:8086 https=false
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.335846Z lvl=info msg="Starting retention policy enforcement service" log_id=0VErVp9G000 service=retention check_interval=30m
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.336641Z lvl=info msg="Listening for signals" log_id=0VErVp9G000
Jul 09 15:17:35 openhabian influxd[438]: ts=2021-07-09T13:17:35.343305Z lvl=info msg="Sending usage statistics to usage.influxdata.com" log_id=0VErVp9G000
Jul 09 15:18:34 openhabian systemd[1]: influxdb.service: Main process exited, code=killed, status=13/PIPE
Jul 09 15:18:34 openhabian systemd[1]: influxdb.service: Succeeded.

I’m pretty confident that zram is causing the issue: uninstalling zram from openhabian-config lets influx start on boot. I have also tried fixing permissions from openhabian-config.
I have also tried uninstalling zram, installing influx and reinstalling zram, but then the issue persists.
I could run the system without zram, but after reading posts about the pros and cons I think that mstormi makes some convincing arguments. Apart from that, seeing that it is a standard feature nowadays, I don’t see why I shouldn’t have it working, because clearly it is supposed to. Apparently something in my system is just off…
PS. When I start influx manually, everything runs fine.

Hope someone can give me some tips as to what I can try, or where I can look for what is going wrong.

What does the journalctl entry look like in case of a manual start?

Thanks for taking an interest - much appreciated!

Log is as follows:

Jul 09 15:52:38 openhabian systemd[1]: Started InfluxDB is an open-source, distributed, time series database.
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.424539Z lvl=info msg="InfluxDB starting" log_id=0VEtWE60000 version=1.8.6 branch=1.8 commit=v1.8.6
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.424594Z lvl=info msg="Go runtime" log_id=0VEtWE60000 version=go1.13.8 maxprocs=4
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.527566Z lvl=info msg="Using data dir" log_id=0VEtWE60000 service=store path=/var/lib/influxdb/data
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.527723Z lvl=info msg="Compaction settings" log_id=0VEtWE60000 service=store max_concurrent_compactions=2 throughput_bytes_per_second=50331648 throughput_bytes_per_second
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.527838Z lvl=info msg="Open store (start)" log_id=0VEtWE60000 service=store trace_id=0VEtWEVl000 op_name=tsdb_open op_event=start
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.652130Z lvl=info msg="Opened file" log_id=0VEtWE60000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/_internal/monitor/1/000000001-000000001.tsm id=0 duration
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.769745Z lvl=info msg="Opened shard" log_id=0VEtWE60000 service=store trace_id=0VEtWEVl000 op_name=tsdb_open index_version=inmem path=/var/lib/influxdb/data/_internal/mon
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.776033Z lvl=info msg="Reading file" log_id=0VEtWE60000 engine=tsm1 service=cacheloader path=/var/lib/influxdb/wal/openhab/autogen/2/_00001.wal size=32394
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.785186Z lvl=info msg="Opened shard" log_id=0VEtWE60000 service=store trace_id=0VEtWEVl000 op_name=tsdb_open index_version=inmem path=/var/lib/influxdb/data/openhab/autog
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.785482Z lvl=info msg="Open store (end)" log_id=0VEtWE60000 service=store trace_id=0VEtWEVl000 op_name=tsdb_open op_event=end op_elapsed=257.651ms
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.785638Z lvl=info msg="Opened service" log_id=0VEtWE60000 service=subscriber
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.785672Z lvl=info msg="Starting monitor service" log_id=0VEtWE60000 service=monitor
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.785702Z lvl=info msg="Registered diagnostics client" log_id=0VEtWE60000 service=monitor name=build
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.785729Z lvl=info msg="Registered diagnostics client" log_id=0VEtWE60000 service=monitor name=runtime
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.785756Z lvl=info msg="Registered diagnostics client" log_id=0VEtWE60000 service=monitor name=network
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.785793Z lvl=info msg="Registered diagnostics client" log_id=0VEtWE60000 service=monitor name=system
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.785832Z lvl=info msg="Starting precreation service" log_id=0VEtWE60000 service=shard-precreation check_interval=10m advance_period=30m
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.785861Z lvl=info msg="Starting snapshot service" log_id=0VEtWE60000 service=snapshot
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.785906Z lvl=info msg="Starting continuous query service" log_id=0VEtWE60000 service=continuous_querier
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.785951Z lvl=info msg="Starting HTTP service" log_id=0VEtWE60000 service=httpd authentication=true
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.785977Z lvl=info msg="opened HTTP access log" log_id=0VEtWE60000 service=httpd path=stderr
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.786000Z lvl=info msg="Auth is enabled but shared-secret is blank. BearerAuthentication is disabled." log_id=0VEtWE60000 service=httpd
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.786999Z lvl=info msg="Listening on HTTP" log_id=0VEtWE60000 service=httpd addr=127.0.0.1:8086 https=false
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.787084Z lvl=info msg="Starting retention policy enforcement service" log_id=0VEtWE60000 service=retention check_interval=30m
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.787396Z lvl=info msg="Listening for signals" log_id=0VEtWE60000
Jul 09 15:52:38 openhabian influxd[4111]: ts=2021-07-09T13:52:38.788119Z lvl=info msg="Sending usage statistics to usage.influxdata.com" log_id=0VEtWE60000
Jul 09 15:52:39 openhabian influxd[4111]: [httpd] 127.0.0.1 - openhab [09/Jul/2021:15:52:39 +0200] "POST /write?db=openhab&rp=autogen&precision=n&consistency=one HTTP/1.1 " 204 0 "-" "okhttp/3.14.4" eccf29d8-e0bc-11eb-8001-dca63250597
Jul 09 15:52:46 openhabian influxd[4111]: [httpd] 127.0.0.1 - openhab [09/Jul/2021:15:52:46 +0200] "POST /write?db=openhab&rp=autogen&precision=n&consistency=one HTTP/1.1 " 204 0 "-" "okhttp/3.14.4" f0ad7759-e0bc-11eb-8002-dca63250597
Jul 09 15:52:52 openhabian influxd[4111]: [httpd] 127.0.0.1 - openhab [09/Jul/2021:15:52:52 +0200] "POST /write?db=openhab&rp=autogen&precision=n&consistency=one HTTP/1.1 " 204 0 "-" "okhttp/3.14.4" f480a26c-e0bc-11eb-8003-dca63250597

With zram enabled and influxdb running, check if the contents of /var/lib/influxdb and /opt/zram/influxdb.bind are the same. If there’s a dir missing in zram try creating it (also set same owner + permissions) to see if that helps.
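A plain ls only shows the top level, so here is a rough sketch for comparing the two trees recursively by name, owner and mode. The helper names list_tree and compare_trees are made up for this post, and the stat -c format assumes GNU coreutils (which openHABian should have). Some of the subdirectories are only readable by influxdb, so it would need to be run from a root shell.

```shell
# Hypothetical helper: print relative path, owner:group and octal mode
# for everything below a directory, sorted so two listings can be compared.
list_tree() {
    (cd "$1" && find . -exec stat -c '%n %U:%G %a' {} + | sort)
}

# Hypothetical helper: compare two trees.
# Prints nothing and returns 0 when names, owners and modes all match;
# otherwise prints both listings and returns 1.
compare_trees() {
    a=$(list_tree "$1") || return 2
    b=$(list_tree "$2") || return 2
    if [ "$a" = "$b" ]; then
        return 0
    fi
    printf '%s\n' "--- $1" "$a" "--- $2" "$b"
    return 1
}
```

Usage would be something like compare_trees /var/lib/influxdb /opt/zram/influxdb.bind; any output points at a directory that is missing or has a different owner or mode on one side.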

At the current boot I had zram enabled and had to start influx manually. Both directories are identical:

openhabian@openhabian:/var/lib/influxdb $ ls -al
total 32
drwxr-xr-x  1 influxdb influxdb 4096 Jul  9 15:17 .
drwxr-xr-x 34 root     root     4096 Jul  9 13:30 ..
drwxr-xr-x  1 influxdb influxdb 4096 Jul  9 13:48 data
drwxr-xr-x  2 influxdb influxdb 4096 Jul  9 13:48 meta
drwx------  1 influxdb influxdb 4096 Jul  9 13:48 wal
openhabian@openhabian:/var/lib/influxdb $ cd /opt/zram/influxdb.bind/
openhabian@openhabian:/opt/zram/influxdb.bind $ ls -a
.  ..  data  meta  wal
openhabian@openhabian:/opt/zram/influxdb.bind $ ls -al
total 20
drwxr-xr-x 5 influxdb influxdb 4096 Jul  9 13:31 .
drwxrwxr-x 9 root     root     4096 Jul  9 15:17 ..
drwxr-xr-x 4 influxdb influxdb 4096 Jul  9 13:48 data
drwxr-xr-x 2 influxdb influxdb 4096 Jul  9 13:48 meta
drwx------ 4 influxdb influxdb 4096 Jul  9 13:48 wal
openhabian@openhabian:/opt/zram/influxdb.bind $

Get us a log that shows why it’s stopped, else this is pure guessing in the dark.

I’m not sure where to find it :thinking: or where to look…
From the other posts I read on the forum, all that was asked for were the 2 logs already posted here, and I was hoping the journalctl output would show something that someone more knowledgeable than me could decipher.
The only thing I can see is what the logs show: I rebooted at 15:16, at 15:17 influx starts… and stops. OH starts, and at 15:52 I manually start influx.
If I disable zram, influx starts and keeps running.
Where can I look to find information as to why?
The log where it shuts down mentions

Jul 09 15:18:34 openhabian systemd[1]: influxdb.service: Main process exited, code=killed, status=13/PIPE

“code=killed, status=13/PIPE” - I’m not that skilled with Linux or OH, but this status code 13 may hold some clues as to why the process gets killed. Where can I find out what 13 means, and what PIPE means in this regard?
Hope you can point me in the right direction.

That means the influxdb process writes to another process via a pipe or socket, but the other process no longer reads.
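In the systemd line, code=killed means the process was ended by a signal, and 13/PIPE is the signal number with its name. If you want to look up a number like that yourself, the shell can translate it. A minimal sketch (the variable name sig_name is just for illustration):

```shell
# Translate the number from "status=13/PIPE" back into a signal name.
# kill -l with a number prints the name without the SIG prefix.
sig_name=$(kill -l 13)
echo "signal 13 is SIG${sig_name}"
```

man 7 signal lists what each signal means; for SIGPIPE it is a write to a pipe or socket with no reader on the other end.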

There is nothing in that info that rings any bells…

You could try to increase the log level to debug in /etc/influxdb/influxdb.conf.
But I can’t promise that it will help.
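As far as I know, in InfluxDB 1.x the level is set in the [logging] section of that file, and the shipped config has it commented out. The sketch below shows the fragment in a comment and a small check for which level is actually active; the function name logging_level is made up for this post, and it assumes the value is written as a quoted string.

```shell
# Fragment to put into /etc/influxdb/influxdb.conf (assumed 1.x layout):
#
#   [logging]
#     level = "debug"
#
# Hypothetical helper: print the active (uncommented) level from the
# [logging] section of the given config file, or nothing if it is unset.
logging_level() {
    sed -n '/^\[logging\]/,/^\[/ s/^[[:space:]]*level[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}
```

After changing the file, influxdb needs a restart for the new level to apply. logging_level /etc/influxdb/influxdb.conf should then print debug; empty output would mean the line is still commented out.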

Increasing the log level did not show anything…but I have had some success:

  1. Your pointers led me to what I think Markus wanted - something that may show why influx does not start. I should add that grafana also does not start on reboot.

journalctl -xe provided the following clues:

Jul 10 22:02:36 openhabian systemd[1]: grafana-server.service: Main process exited, code=killed, status=13/PIPE
-- Subject: Unit process exited
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- An ExecStart= process belonging to unit grafana-server.service has exited.
--
-- The process' exit code is 'killed' and its exit status is 13.
Jul 10 22:02:36 openhabian systemd[1]: grafana-server.service: Succeeded.
-- Subject: Unit succeeded
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- The unit grafana-server.service has successfully entered the 'dead' state.
Jul 10 22:02:40 openhabian frontail[931]: tail: '/var/log/openhab/openhab.log' has been replaced;  following new file
Jul 10 22:02:40 openhabian frontail[931]: tail: '/var/log/openhab/events.log' has been replaced;  following new file
Jul 10 22:02:50 openhabian systemd[1]: systemd-fsckd.service: Succeeded.
-- Subject: Unit succeeded
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- The unit systemd-fsckd.service has successfully entered the 'dead' state.
Jul 10 22:03:12 openhabian systemd[1]: systemd-hostnamed.service: Succeeded.
-- Subject: Unit succeeded
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- The unit systemd-hostnamed.service has successfully entered the 'dead' state.
Jul 10 22:03:24 openhabian systemd[1]: influxdb.service: Main process exited, code=killed, status=13/PIPE
-- Subject: Unit process exited
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- An ExecStart= process belonging to unit influxdb.service has exited.
--
-- The process' exit code is 'killed' and its exit status is 13.
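journalctl -xe mixes every unit together, so the relevant lines are easy to miss. A small filter like the one below might help to pull out only the lines where a service's main process went away. The function name exit_lines is made up for this post, and the two patterns are simply the systemd messages I would expect to matter here.

```shell
# Hypothetical helper: keep only the journal lines that report a service's
# main process exiting or the unit failing. Reads journal text on stdin.
exit_lines() {
    grep -E 'Main process exited|Failed with result'
}
```

It could be fed with the current boot's log for both services, for example sudo journalctl -b -u influxdb -u grafana-server | exit_lines.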

So something in the ExecStart process of each of the programs makes them shut down.
Running the command

/usr/bin/influxd -config /etc/influxdb/influxdb.conf

that influx ExecStart uses gives the following output:

`/usr/bin/influxd -config /etc/influxdb/influxdb.conf

 8888888           .d888 888                   8888888b.  888888b.
   888            d88P"  888                   888  "Y88b 888  "88b
   888            888    888                   888    888 888  .88P
   888   88888b.  888888 888 888  888 888  888 888    888 8888888K.
   888   888 "88b 888    888 888  888  Y8bd8P' 888    888 888  "Y88b
   888   888  888 888    888 888  888   X88K   888    888 888    888
   888   888  888 888    888 Y88b 888 .d8""8b. 888  .d88P 888   d88P
 8888888 888  888 888    888  "Y88888 888  888 8888888P"  8888888P"

2021-07-10T20:06:12.496634Z     info    InfluxDB starting       {"log_id": "0VGWHi40000", "version": "1.8.6", "branch": "1.8", "commit": "v1.8.6"}
2021-07-10T20:06:12.496700Z     info    Go runtime      {"log_id": "0VGWHi40000", "version": "go1.13.8", "maxprocs": 4}
2021-07-10T20:06:12.599770Z     info    Using data dir  {"log_id": "0VGWHi40000", "service": "store", "path": "/var/lib/influxdb/data"}
2021-07-10T20:06:12.599986Z     info    Compaction settings     {"log_id": "0VGWHi40000", "service": "store", "max_concurrent_compactions": 2, "throughput_bytes_per_second": 50331648, "throughput_bytes_per_second_burst": 50331648}
2021-07-10T20:06:12.600159Z     info    Open store (start)      {"log_id": "0VGWHi40000", "service": "store", "trace_id": "0VGWHiU0000", "op_name": "tsdb_open", "op_event": "start"}
2021-07-10T20:06:12.600669Z     info    Open store (end)        {"log_id": "0VGWHi40000", "service": "store", "trace_id": "0VGWHiU0000", "op_name": "tsdb_open", "op_event": "end", "op_elapsed": "0.518ms"}
run: open server: open tsdb store: mkdir /var/lib/influxdb/data/_internal/_series: permission denied
`

I think this permission denied error is the culprit, so I hope it provides an answer as to why influxdb shuts down?
I have tried running sudo chown -R influxdb:influxdb /var/lib/influxdb but that did not resolve the issue.

The second thing I’ve had some success with is a workaround for making influxdb and grafana start at boot. However, I’m the type of person who rarely settles for a workaround; I would rather find the cause of the problem and an actual solution, because that means learning and understanding new things.
The workaround is as follows:
Add the line RestartForceExitStatus=SIGPIPE to the [Service] section of the files /lib/systemd/system/influxdb.service and /lib/systemd/system/grafana-server.service.
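Files under /lib/systemd/system belong to the packages and can be replaced on an upgrade, so the same setting could instead go into a systemd drop-in. A minimal sketch, assuming the usual drop-in location /etc/systemd/system/UNIT.d/; the function name write_sigpipe_dropin and the file name sigpipe.conf are made up for this post.

```shell
# Hypothetical helper: write a drop-in carrying the workaround for one unit.
#   $1 = unit name, e.g. influxdb.service
#   $2 = systemd config root (defaults to /etc/systemd/system)
write_sigpipe_dropin() {
    unit=$1
    root=${2:-/etc/systemd/system}
    mkdir -p "$root/$unit.d" &&
    printf '%s\n' '[Service]' 'RestartForceExitStatus=SIGPIPE' \
        > "$root/$unit.d/sigpipe.conf"
}
```

Run from a root shell it would be called once for influxdb.service and once for grafana-server.service, followed by systemctl daemon-reload so systemd picks up the new files. Like the edit above, this only restarts the service after the SIGPIPE; it does not explain where the SIGPIPE comes from.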

I hope the above can show you why influxdb and grafana don’t start, and guide me as to what I should research or try next to actually find and solve the issue.

Trying to access the permission-denied folder from the console is also denied:

openhabian@openhabian:/var/lib/influxdb/data $ ls -al
total 16
drwxr-xr-x 4 influxdb influxdb 4096 Jul  9 13:48 .
drwxr-xr-x 1 influxdb influxdb 4096 Jul 10 22:02 ..
drwx------ 4 influxdb influxdb 4096 Jul  9 13:31 _internal
drwx------ 4 influxdb influxdb 4096 Jul  9 13:48 openhab
openhabian@openhabian:/var/lib/influxdb/data $ cd _internal/
-bash: cd: _internal/: Permission denied

You cannot cd into the directory because only influxdb has rx access. Others would need rx as well for openhabian to be able to cd into that directory.
But that seems to be correct.
You need to be user root, then you can cd into that directory.

You overlooked that systemd starts it as user influxdb, and that user has write access rights so that isn’t the cause.
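To see this for yourself, the access check can be repeated as the service user rather than as openhabian. A rough sketch, assuming sudo is available; the function name can_write_as is made up for this post.

```shell
# Hypothetical helper: succeed if the given user may enter and write to
# the given directory.
#   $1 = user name, $2 = directory
can_write_as() {
    user=$1
    dir=$2
    if [ "$user" = "$(id -un)" ]; then
        # Already that user: test directly.
        [ -w "$dir" ] && [ -x "$dir" ]
    else
        # Otherwise run the same test under the other account.
        sudo -u "$user" sh -c '[ -w "$1" ] && [ -x "$1" ]' sh "$dir"
    fi
}
```

For example, can_write_as influxdb /var/lib/influxdb/data/_internal && echo ok. The manual influxd test from earlier was run as openhabian, which is why it hit permission denied; run with sudo -u influxdb (and with the service stopped) it would be closer to what systemd does.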

I didn’t overlook that fact - rather I didn’t know it. :wink:
As I mentioned, I’m not that skilled in Linux, so the intricacies of how Linux works are still quite unknown to me. With that in mind, I’ll stop wasting your time and mine, and settle for the workaround I found, since this ExecStart entry was really my only indication of what might have been wrong.

Gentlemen, thank you for your help! I’ll tame my OCD and settle for the workaround, and get on with learning more about OH3.

The workaround (that does not actually solve the underlying problem!) to making influxdb and grafana start at boot is as follows: