OH 3.1.0 InfluxDB stopped working after update to 1.8.7-1

I am wondering whether this is somehow caused by openhabian-config?
I managed to get InfluxDB working after a manual re-install (not via openhabian-config).
All was fine.
Until I ran 50. Backup openHAB config in the openhabian-config tool… after a reboot, the same problem again.
Then I realised that this is the second time I have had this phenomenon (InfluxDB no longer working after 50. Backup openHAB config) …

What I did with openhabian was

02 Upgrade System (which was automatically asked for on starting openhabian conf)

After a reboot and re-login I noticed that I had lots of apt updates pending, so I ran apt-get update + apt-get upgrade and rebooted again.

Then I noticed that influxdb is not working anymore.

apt reinstall influxdb

did not work for me.

The issue seems to be very common, judging by what I read on the internet, but no one has a real solution for it. In addition, I already have version “1.8.10”, while all other threads at the moment report the problem with “1.8.9”.

At the moment I am hesitant to do a completely new install of openHABian, as I think I would run into this trouble again. I am hoping to find a solution somewhere that fixes the problem.

I think it is a problem of missing/wrong permission, group or users.

I have no idea why this does not work and if it is important for influxDB

As far as I understand you use version

[main]{2021-10-14T12:07:14+02:00}(9bbfa35)

it looks like there is an update available since yesterday:

I am not sure if this will fix your problem with:

according to the description it could be.

Looking at messages like

it looks like there is a problem with the filesystem. Strange that the Fix Permissions menu entry seems to have fixed it.

I am now on openHABian Configuration Tool — [main]{2021-10-24T08:54:39+02:00}(855cb4e)

Influx is 1.8.10…

When I ran sudo apt reinstall influxdb followed by sudo systemctl restart influxdb, InfluxDB was running again :grinning:, until I did a sudo reboot; then the same problem came back :grimacing:.

I also confirm that Applying file permissions recommendations... FAILED (setgid backups folder)

… ?

Is there anything in the journalctl output about the root cause, such as a missing directory or missing permissions after the reboot?

journalctl output :

-- A stop job for unit influxdb.service has finished.
--
-- The job identifier is 11072 and the job result is done.
Oct 26 10:39:37 openhabian systemd[1]: influxdb.service: Start request repeated too quickly.
Oct 26 10:39:37 openhabian systemd[1]: influxdb.service: Failed with result 'exit-code'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- The unit influxdb.service has entered the 'failed' state with result 'exit-code'.
Oct 26 10:39:37 openhabian systemd[1]: Failed to start InfluxDB is an open-source, distributed, time series database.
-- Subject: A start job for unit influxdb.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit influxdb.service has finished with a failure.
--
-- The job identifier is 11072 and the job result is failed.
Oct 26 10:39:39 openhabian systemd[1]: systemd-timesyncd.service: Succeeded.
-- Subject: Unit succeeded
-- Defined-By: systemd
-- Support: https://www.debian.org/support

Applying file permissions recommendations... FAILED (setgid backups folder) is the result when selecting Fix File Permissions in Openhabian-config tool

This

just shows that there were too many attempts to start the service.
Unfortunately I do not see any hint about the root cause, i.e. why the individual attempts failed.

As nothing is shown in the journalctl output, is there maybe something in
/var/log/syslog or in /var/log/influxdb/influxdb.log?
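For anyone following along, here is a minimal sketch of how one could pull the relevant lines out of such a log file. The function names are made up for illustration, and reading /var/log/syslog may need sudo on your system.

```shell
#!/bin/sh
# Print every line of a log file that mentions InfluxDB (unit name,
# start script or description), ignoring case.
# $1 = log file, e.g. /var/log/syslog
influx_log_lines() {
    grep -i 'influxd' "$1"
}

# Narrow those lines down to the usual suspects for a failed start.
influx_log_errors() {
    influx_log_lines "$1" | grep -iE 'permission denied|no such file|failed'
}
```

Usage would be something like influx_log_errors /var/log/syslog. If nothing is printed, the start failure was probably logged somewhere else.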

Oct 28 14:48:57 openhabian systemd[1]: Starting InfluxDB is an open-source, distributed, time series database...
Oct 28 14:48:57 openhabian influxd-systemd-start.sh[935]: /usr/lib/influxdb/scripts/influxd-systemd-start.sh: line 5: /var/lib/influxdb/influxd.pid: Permission denied
Oct 28 14:48:57 openhabian systemd[1]: influxdb.service: Control process exited, code=exited, status=1/FAILURE
Oct 28 14:48:57 openhabian systemd[1]: influxdb.service: Failed with result 'exit-code'.
Oct 28 14:48:57 openhabian systemd[1]: Failed to start InfluxDB is an open-source, distributed, time series database.

Have a look at the directory and file permissions (the following lines show how it should look):

ls -ld /var/lib/influxdb /var/lib/influxdb/influxd.pid
drwxr-xr-x 1 influxdb influxdb 4096 Oct 16 23:10 /var/lib/influxdb
-rw-r--r-- 1 influxdb influxdb    5 Oct 16 23:10 /var/lib/influxdb/influxd.pid

Either the directory or the file belongs to root, so the pid file cannot be created by the influxdb user. This appears to be the reason why the startup of influxdb is aborted.
It could be that a manual start as user root created the file or directory with root’s ownership.
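A small sketch of that check as a script, in case it helps. The helper names are hypothetical, and stat -c is the GNU coreutils form that Raspbian ships.

```shell
#!/bin/sh
# Print "owner:group" of a path (GNU coreutils stat).
owner_of() {
    stat -c '%U:%G' "$1"
}

# Return 0 if the path is owned by the expected user:group,
# 1 if the owner differs, 2 if the path does not exist.
check_owner() {
    target=$1
    expected=$2
    [ -e "$target" ] || { echo "$target: missing"; return 2; }
    actual=$(owner_of "$target")
    if [ "$actual" = "$expected" ]; then
        echo "$target: ok ($actual)"
    else
        echo "$target: owned by $actual, expected $expected"
        return 1
    fi
}
```

For this thread the call would be check_owner /var/lib/influxdb influxdb:influxdb, run once right after a reinstall and once more after a reboot to see whether the owner changes.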

Hi,

Wolfgang, first, thank you very much for your support so far.
I’m a semi-noob, meaning I know a few Linux commands but I don’t know how to troubleshoot this kind of problem… :innocent:

I can confirm that the influxdb directory was owned by root as you suggested…

ls: cannot access '/var/lib/influxdb/influxd.pid': No such file or directory
drwxr-sr-x 1 root root 4096 Oct 29 11:21 /var/lib/influxdb

When I chown it to influxdb, this does not solve the problem.

But when I run sudo apt reinstall influxdb again, this solves the issue …
The directory /var/lib/influxdb is then also owned by influxdb, as you suggested, and everything works…

but only until I do a sudo reboot. Then the same problem comes back and, strangely, the /var/lib/influxdb folder is again owned by root !!!

Am I missing something here about rebooting? Should I first stop all services like openhab, influxdb, grafana and then reboot ? … I’m still lost :roll_eyes:

Or could it be that installing the Influxdb addon in openhab (which I also did before the reboot) changes the directory owner to root ?

The addon is not the problem.
Do you run openHABian? Do you use zram?

I can fully confirm that behavior.

Today I did an apt-get update + apt-get upgrade. → Still not working.
I opened openhabian-config → the openHABian changes to “main” were applied → influxdb still not working.

sudo reboot → influx still not working
After

influxdb started to work again → I am happy that I did not reinstall openHABian

influxdb stopped working again, so I have now done an apt reinstall influxdb followed by the restart, and I am leaving my system alone for the moment.

I am running openhabian

###############################################################################
###############  openhabian  ##################################################
###############################################################################
##        Ip = 192.168.1.205
##   Release = Raspbian GNU/Linux 10 (buster)
##    Kernel = Linux 5.10.63-v7l+
##  Platform = Raspberry Pi 4 Model B Rev 1.4
##    Uptime = 0 day(s). 0:0:27
## CPU Usage = 50.25% avg over 4 cpu(s) (4 core(s) x 1 socket(s))
##  CPU Load = 1m: 1.32, 5m: 0.32, 15m: 0.11
##    Memory = Free: 6.77GB (87%), Used: 0.99GB (13%), Total: 7.76GB
##      Swap = Free: 2.43GB (100%), Used: 0.00GB (0%), Total: 2.43GB
##      Root = Free: 218.73GB (97%), Used: 6.18GB (3%), Total: 234.48GB
##   Updates = 0 apt updates available.
##  Sessions = 1 session(s)
## Processes = 140 running processes of 32768 maximum processes
###############################################################################

                          _   _     _     ____   _
  ___   ___   ___   ___  | | | |   / \   | __ ) (_)  ____   ___
 / _ \ / _ \ / _ \ / _ \ | |_| |  / _ \  |  _ \ | | / _  \ / _ \
| (_) | (_) |  __/| | | ||  _  | / ___ \ | |_) )| || (_) || | | |
 \___/|  __/ \___/|_| |_||_| |_|/_/   \_\|____/ |_| \__|_||_| | |
      |_|                          3.1.0 - Release Build

Looking for a place to get started? Check out 'sudo openhabian-config' and the
documentation at https://www.openhab.org/docs/installation/openhabian.html
The openHAB dashboard can be reached at http://hab3:8080
To interact with openHAB on the command line, execute: 'openhab-cli --help'

I am using the main branch

You are currently using the "main" openHABian environment version.

The openHABian version to contain the very latest code for openHAB 3 is called "main".
This is providing you with the latest (openHAB3!) features but less people have tested
it so it is a little more likely that you run into errors.
You can step back a little and switch to use the stable version now called "openHAB3".
You can switch at any time by selecting this menu option again or by setting the
'clonebranch=' parameter in '/etc/openhabian.conf'.

   ( ) openHAB3  recommended standard version of openHABian (openHAB 3)
   (*) main      very latest version of openHABian (openHAB 3)
   ( ) stable    old version of openHABian (openHAB 2)

                       <Ok>                            <Cancel>

I have no idea how to find out whether I use zram or not. My RPi 4 boots directly from an SSD; I have no SD card in the slot.

I hope somebody is able to find out why some of us run into this problem.

As far as I understand, you will see devices like /dev/zram* if zram is enabled and in use.
Besides that, you will have entries in /etc/ztab which describe how the different directories are managed/bound.
The output of mount | grep zram will also show entries if zram is in use.
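As a rough sketch, the mount check can be narrowed down to the directories that actually sit on a zram overlay. The function names are made up, and the filter assumes that zram-config mounts each directory as an overlay whose options mention zram; I have not verified that for every version.

```shell
#!/bin/sh
# Read `mount` output on stdin and print the mount points that are
# overlay filesystems with "zram" somewhere in the line.
# Assumed line layout: <source> on <target> type <fstype> (<options>)
zram_overlay_targets() {
    awk '$5 == "overlay" && /zram/ { print $3 }'
}

# Succeed if the given directory is one of those mount points.
on_zram() {
    mount | zram_overlay_targets | grep -qxF "$1"
}
```

Running mount | zram_overlay_targets lists the affected directories. If on_zram /var/lib/influxdb succeeds, the InfluxDB data directory is managed by zram.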

The output of this

openhabian@hab3:~ $  mount | grep zram
/dev/sda2 on /opt/zram/persistence.bind type ext4 (rw,noatime)
/dev/zram1 on /opt/zram/zram1 type ext4 (rw,noatime)
overlay1 on /var/lib/openhab/persistence type overlay (rw,relatime,lowerdir=/opt/zram/persistence.bind,upperdir=/opt/zram/zram1/upper,workdir=/opt/zram/zram1/workdir,redirect_dir=on)
/dev/sda2 on /opt/zram/influxdb.bind type ext4 (rw,noatime)
/dev/zram2 on /opt/zram/zram2 type ext4 (rw,noatime)
overlay2 on /var/lib/influxdb type overlay (rw,relatime,lowerdir=/opt/zram/influxdb.bind,upperdir=/opt/zram/zram2/upper,workdir=/opt/zram/zram2/workdir,redirect_dir=on)
/dev/sda2 on /opt/zram/log.bind type ext4 (rw,noatime)
/dev/zram3 on /opt/zram/zram3 type ext4 (rw,noatime)
overlay3 on /var/log type overlay (rw,relatime,lowerdir=/opt/zram/log.bind,upperdir=/opt/zram/zram3/upper,workdir=/opt/zram/zram3/workdir,redirect_dir=on)

the “ls” output is

openhabian@hab3:~ $ ls /dev/zram* -l
brw-rw---- 1 root disk 254, 0 Oct 30 15:52 /dev/zram0
brw-rw---- 1 root disk 254, 1 Oct 30 15:52 /dev/zram1
brw-rw---- 1 root disk 254, 2 Oct 30 15:52 /dev/zram2
brw-rw---- 1 root disk 254, 3 Oct 30 15:52 /dev/zram3

Entries of ztab

# swap  alg             mem_limit       disk_size       swap_priority   page-cluster    swappiness
swap    lzo-rle         200M            450M            75              0               80

# dir   alg             mem_limit       disk_size       target_dir                      bind_dir
dir     zstd            150M            350M            /var/lib/openhab/persistence    /persistence.bind
dir     zstd            150M            350M            /var/lib/influxdb               /influxdb.bind

# log   alg             mem_limit       disk_size       target_dir              bind_dir                oldlog_dir
log     zstd            200M            450M            /var/log                /log.bind

Yes, zram is enabled in your case.
The content of /etc/ztab describes how you can stop it.
I would then reinstall influxdb, start the zram service again, and then try a reboot.
In case that does not help, you may add an entry to /usr/lib/influxdb/scripts/influxdb.service ( ExecStartPre=/usr/bin/chown -R influxdb:influxdb /var/lib/influxdb ).
Create a backup of the original file first, so that you can go back to the original version if that does not work.
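If you would rather not touch the packaged unit file, a systemd drop-in is an alternative way to add the same ExecStartPre line. This is only a sketch and I have not tested it on openHABian. The function name and the override.conf file name are my own choice. The "+" prefix is standard systemd syntax for running that one command with full privileges, which I assume is needed here because the "Permission denied" in the log suggests the start script does not run as root. The /usr/bin/chown path is copied from the suggestion above, so check it with command -v chown first.

```shell
#!/bin/sh
# Write a drop-in that resets ownership before each InfluxDB start.
# $1 = drop-in directory, e.g. /etc/systemd/system/influxdb.service.d
write_influxdb_dropin() {
    dropin_dir=$1
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
# "+" runs this command with full privileges even if the unit sets User=
ExecStartPre=+/usr/bin/chown -R influxdb:influxdb /var/lib/influxdb
EOF
}
```

Run it as root with /etc/systemd/system/influxdb.service.d as the argument, then sudo systemctl daemon-reload and sudo systemctl restart influxdb. Deleting override.conf and reloading again undoes the change.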

Thank you, this solved my problem.
I did the following:

sudo zram-config "stop"
sudo apt reinstall influxdb
sudo systemctl start zram-config.service
sudo systemctl restart influxdb
sudo reboot

I ran the restart of influxdb because it was not running after the restart of the zram-config service.

Now after the reboot everything is working like a few weeks ago.

Thank you very much for helping me to solve that nasty issue.


Hello,
I’m having similar problems, but in my case everything stops writing data to influx after a few hours of operation.
I can see in Grafana that data is no longer being written to the database.
I updated everything and reinstalled influxdb, but after about 18 h it still stops writing data.
Today it happened at 7:20:30. After this time there is no data.
I have checked the logs; daemon.log shows:

Nov 10 07:20:10 openHABianDevice rngd[389]: stats: bits received from HRNG source: 600064
Nov 10 07:20:10 openHABianDevice rngd[389]: stats: bits sent to kernel pool: 543072
Nov 10 07:20:10 openHABianDevice rngd[389]: stats: entropy added to kernel pool: 543072
Nov 10 07:20:10 openHABianDevice rngd[389]: stats: FIPS 140-2 successes: 30
Nov 10 07:20:10 openHABianDevice rngd[389]: stats: FIPS 140-2 failures: 0
Nov 10 07:20:10 openHABianDevice rngd[389]: stats: FIPS 140-2(2001-10-10) Monobit: 0
Nov 10 07:20:10 openHABianDevice rngd[389]: stats: FIPS 140-2(2001-10-10) Poker: 0
Nov 10 07:20:10 openHABianDevice rngd[389]: stats: FIPS 140-2(2001-10-10) Runs: 0
Nov 10 07:20:10 openHABianDevice rngd[389]: stats: FIPS 140-2(2001-10-10) Long run: 0
Nov 10 07:20:10 openHABianDevice rngd[389]: stats: FIPS 140-2(2001-10-10) Continuous run: 0
Nov 10 07:20:10 openHABianDevice rngd[389]: stats: HRNG source speed: (min=326.872; avg=473.195; max=494.212)Kibits/s
Nov 10 07:20:10 openHABianDevice rngd[389]: stats: FIPS tests speed: (min=8.997; avg=23.060; max=31.371)Mibits/s
Nov 10 07:20:10 openHABianDevice rngd[389]: stats: Lowest ready-buffers level: 2
Nov 10 07:20:10 openHABianDevice rngd[389]: stats: Entropy starvations: 0
Nov 10 07:20:10 openHABianDevice rngd[389]: stats: Time spent starving for entropy: (min=0; avg=0.000; max=0)us
Nov 10 07:22:47 openHABianDevice karaf[8583]: Exception in thread "OH-items-2687" java.io.IOError: java.io.EOFException
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at org.mapdb.Volume$FileChannelVol.getByte(Volume.java:1000)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at org.mapdb.Volume.getUnsignedShort(Volume.java:109)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at org.mapdb.StoreWAL.longStackGetPage(StoreWAL.java:1046)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at org.mapdb.StoreWAL.longStackTake(StoreWAL.java:913)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at org.mapdb.StoreDirect.freePhysTake(StoreDirect.java:1098)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at org.mapdb.StoreDirect.physAllocate(StoreDirect.java:666)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at org.mapdb.StoreWAL.update(StoreWAL.java:404)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at org.mapdb.Caches$HashTable.update(Caches.java:270)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at org.mapdb.EngineWrapper.update(EngineWrapper.java:63)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at org.mapdb.BTreeMap.put2(BTreeMap.java:707)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at org.mapdb.BTreeMap.put(BTreeMap.java:643)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at org.openhab.persistence.mapdb.internal.MapDbPersistenceService.store(MapDbPersistenceService.java:186)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at org.openhab.core.persistence.internal.PersistenceManagerImpl.handleStateEvent(PersistenceManagerImpl.java:152)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at org.openhab.core.persistence.internal.PersistenceManagerImpl.stateChanged(PersistenceManagerImpl.java:473)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at org.openhab.core.items.GenericItem.lambda$1(GenericItem.java:259)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at java.base/java.lang.Thread.run(Thread.java:829)
Nov 10 07:22:47 openHABianDevice karaf[8583]: Caused by: java.io.EOFException
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at org.mapdb.Volume$FileChannelVol.readFully(Volume.java:947)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011at org.mapdb.Volume$FileChannelVol.getByte(Volume.java:997)
Nov 10 07:22:47 openHABianDevice karaf[8583]: #011... 17 more

I did an upgrade some days ago but hadn’t rebooted after that.
Today I rebooted and InfluxDB stopped working.
InfluxDB 1.8.10-1 was installed with the upgrade, and I found several threads with issues.
The workarounds didn’t work for me, although the new version should already fix the problems.
I finally had enough and installed 1.8.5-1 manually.
That’s not the best solution, but I just don’t want to troubleshoot this for several hours…

Fixed it for me too, thanks.

This fixed the problem for me.
After following the above procedure I had to re-install InfluxDB and Grafana using openhabian menu 2.4, and this time there were no errors.
Now everything is working fine.
Thanks
