InfluxDB doesn't work after reboot, problem with rights

Hello,
I have a problem with InfluxDB running on Raspberry.
Initially I installed it following these tutorials: InfluxDB+Grafana persistence and graphing and InfluxDB+Grafana persistence and graphing - #86 by ThomDietrich
and everything was working perfectly. Signals from sensors were going to InfluxDB on the same Raspberry Pi and I could view them with Grafana on my laptop. However, this only lasted until the first reboot (sorry, I did it by simply unplugging the power, but this must be permitted, since electricity cut-offs do happen and the system must handle them without problems).
After reboot - no new data in the InfluxDB.
I found that the influxd service is not running:

openhabian@openhabian:~ $ influx
Failed to connect to http://localhost:8086: Get http://localhost:8086/ping: dial tcp [::1]:8086: connect: connection refused
Please check your connection settings and ensure 'influxd' is running.

Then, after some research, I found a problem with permissions:

openhabian@openhabian:~ $ systemctl stop influxd
openhabian@openhabian:~ $ /usr/bin/influxd -config /etc/influxdb/influxdb.conf &> /tmp/influxdb.log
openhabian@openhabian:~ $ cat /tmp/influxdb.log
..... some lines here ...
run: open server: open tsdb store: mkdir /var/lib/influxdb/data/_internal/_series: permission denied

Then I found a possible solution here: docs.influxdata.com/influxdb/v1.8/troubleshooting/systemd/ (sorry, not allowed to insert more than two links) and here: github.com/influxdata/influxdb/issues/8912#issuecomment-431896205 (change the owner of /var/lib/influxdb to influxdb). I tried it, but still got an error when I restarted the service with the restart command:

openhabian@openhabian:~ $ sudo chown -R influxdb:influxdb /var/lib/influxdb/*
openhabian@openhabian:~ $ sudo systemctl restart influxdb.service
Job for influxdb.service failed because the control process exited with error code.
See "systemctl status influxdb.service" and "journalctl -xe" for details.
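One way to check whether a chown like the one above really covered the whole tree is to list every entry that is still owned by someone else. Note that the glob /var/lib/influxdb/* matches neither the directory itself nor hidden entries inside it. The following is only a minimal sketch; the helper name not_owned_by is made up for illustration.

```shell
# List every file and directory under a path that is NOT owned by the given user.
# not_owned_by is a hypothetical helper name, not part of InfluxDB or openHABian.
not_owned_by() {
    dir=$1      # directory tree to inspect, e.g. /var/lib/influxdb
    owner=$2    # expected owner (user name or numeric ID), e.g. influxdb
    find "$dir" ! -user "$owner" -print
}

# Example call (needs sudo because the wal directory is only readable by its owner):
#   sudo sh -c "$(declare -f not_owned_by); not_owned_by /var/lib/influxdb influxdb"
```

Empty output means everything under the directory, including the directory itself, belongs to the expected user.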

Rebooted the system - no luck. So I deleted all the data (with sudo rm -rf /var/lib/influxdb/).
Still not running. Then, after some trial and error, I found how to start the service:

openhabian@openhabian:~ $ sudo systemctl daemon-reload
openhabian@openhabian:~ $ sudo systemctl enable influxdb.service
openhabian@openhabian:~ $ sudo systemctl start influxdb.service
openhabian@openhabian:~ $ influx
Connected to http://localhost:8086 version 1.8.9
InfluxDB shell version: 1.8.9

I still do not understand why it simply doesn’t start by itself after a reboot. That was the idea of enabling the service with the commands above, but instead I have to enter these commands again. Nor do I understand why the restart command doesn’t work but the sequence above does.
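To see what systemd itself thinks about the unit after a reboot, the enabled and active states can be queried separately. This is a rough sketch, assuming the unit is called influxdb.service as in the commands above; service_report is a hypothetical helper name.

```shell
# Print whether a systemd unit is enabled at boot and whether it is currently active.
# service_report is a hypothetical helper name used only for this illustration.
service_report() {
    unit=$1
    # is-enabled: will the unit be started at boot? (enabled, disabled, ...)
    printf 'enabled: %s\n' "$(systemctl is-enabled "$unit" 2>/dev/null)"
    # is-active: is the unit running right now? (active, inactive, failed, ...)
    printf 'active:  %s\n' "$(systemctl is-active "$unit" 2>/dev/null)"
}

# Example call:
#   service_report influxdb.service
# The messages the service logged since the current boot can be read with:
#   sudo journalctl -u influxdb.service -b
```

If the unit reports enabled but failed, the journal output for the current boot is usually the quickest way to find the reason for the failed start.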

But OK, this is some progress…
I re-created all tables and granted access rights as in the initial tutorial, but openHAB doesn’t write to InfluxDB. Then I tried to write something to InfluxDB from the CLI but couldn’t, due to an authorization problem:

openhabian@openhabian:~ $ influx -username openhab -password 'my_password' -host localhost
Connected to http://localhost:8086 version 1.8.9
InfluxDB shell version: 1.8.9
> use test
WARN: authorization failed
Using database test
> sine_wave value=0.0
ERR: authorization failed
> exit

I also tried to write to the DB from the Python script from the tutorial (gist.github.com/ThomDietrich/ff836dbe0f0eaa2c5270a846a963893b), which worked fine after the initial installation, but this time it returns an authorization error: Failed to add point to influxdb (401) - aborting.

I’ve checked attributes:

openhabian@openhabian:~ $ cd /var/lib
openhabian@openhabian:/var/lib $ ls -l influxdb
total 16
drwxr-xr-x 3 influxdb influxdb 4096 Sep 13 16:37 data
-rw-r--r-- 1 influxdb influxdb    5 Sep 14 13:41 influxd.pid
drwxr-xr-x 2 influxdb influxdb 4096 Sep 14 01:39 meta
drwx------ 3 influxdb influxdb 4096 Sep 13 16:37 wal
openhabian@openhabian:/var/lib/influxdb/data $ sudo ls -l ./_internal/_series
total 32
drwxr-xr-x 2 influxdb influxdb 4096 Sep 10 14:30 00
drwxr-xr-x 2 influxdb influxdb 4096 Sep 10 14:30 01
drwxr-xr-x 2 influxdb influxdb 4096 Sep 10 14:30 02
drwxr-xr-x 2 influxdb influxdb 4096 Sep 10 14:30 03
drwxr-xr-x 2 influxdb influxdb 4096 Sep 10 14:30 04
drwxr-xr-x 2 influxdb influxdb 4096 Sep 10 14:30 05
drwxr-xr-x 2 influxdb influxdb 4096 Sep 10 14:30 06
drwxr-xr-x 2 influxdb influxdb 4096 Sep 10 14:30 07

Now I need help, I don’t know how to fix this issue. It seems to be an issue with access rights. I’m not strong in Linux and do not understand what rights would be sufficient (who should be the owner), since systemctl starts the influx process.
I’d like to know how to fix this problem without re-installing InfluxDB from scratch, because re-installing won’t be a stable solution. I’d just run into the same problem after any reboot.

P.S. My Grafana connects to the database itself (so the SELECT query works), but gets an empty result.
P.P.S. Sorry, I understand that this question is not related to OpenHAB itself, but rather to Influx. But since I use it in context of OpenHAB and installed using tutorials from OpenHAB - I ask for help here.

  • Platform information:
    • Hardware: Raspberry Pi 3 Model B Plus Rev 1.3 / 1GB RAM / 32GB Flash
    • OS: Raspbian GNU/Linux 10 (buster) / Linux 5.10.52-v7+
    • Java Runtime Environment: which java platform is used and what version
    • openHAB version: 3.1.0

Sorry, it’s not permitted and it runs a very high risk of damaging your file system. It is especially a problem when using any flash based memory. Pulling the plug while a write is taking place will not only lose the file being written but any part of a file in the same sector. With wear leveling that could be anything. To avoid this you either need to not use flash based storage, use an UPS to avoid sudden loss of power, or both since pulling the plug can damage any file system, even those not on flash.

In all likelihood your file system was damaged and it most likely was damaged beyond just InfluxDB. I wouldn’t trust it. Your best bet will be to rebuild/restore from backup. Even if you get past this one problem, you will almost certainly run into another, and then another. It’ll be less work over all to rebuild/restore than to try to fix all these problems.

I’d bet that whatever runs the services at start was also broken by the power loss.

Thanks for the answer. I also read your reply to another question today, but I posted mine at almost the same time, so I hadn’t seen your answer.
It’s very sad that the situation with power loss is like this. Professionally I work with industrial controllers which often control critical processes, and it sits deep in my mind that such systems must handle power outages correctly. Also, I believed that a smart home system must be protected from such issues because power outages always happen. I can’t imagine a big UPS next to a 20 times smaller Raspberry Pi. But it is what it is. Of course, openHAB has nothing to do with this.
Is it possible to use something other than an SD card with the Raspberry Pi to protect against such problems?
Also, let’s say I use an SD card and a UPS. At some point the UPS will also discharge. What to do then? Let’s say I can detect the power outage and provide a digital signal (tolerated by the Raspberry Pi IO), is there a way to react to it and shut down the system automatically? Can it start automatically after power is back (using some IO etc.)?

Regarding backup, I did a backup using CLI: sudo openhab-cli backup but as I understand, this is just some config files, so I’ll have to install Influx and similar services again. At the moment I have no experience with Amanda (just started with openHAB).

Who said anything about a huge battery? They make tiny little UPS that sit right on top of the RPi. Search for “Raspberry Pi UPS” and you’ll find lots of options.

But to your prior point, the RPi was never designed nor intended to be used in industrial contexts.

An HDD. Even an external SSD is often safer even though it uses flash because there are built in buffers and capacitors to let it finish the current write before all the power drains away.

Those are pretty basic UPS functions. Network UPS Tools (NUT) works with most UPSs, including many that work with RPis. They will also often come with their own software to do all of this.

The only one I’m not positive is covered by all UPS is the automatic turning the machines back on when power is restored. Usually the computer will run on battery for a time. Once the battery is depleted to a certain percentage the machine(s) will shutdown normally. Then it’s up to you to bring them back online in the right order.

Correct, that only backs up openHAB’s configs.

UPDATE

Reinstalled everything but the problem persists. InfluxDB stops working each time after reboot. But now I know how to fix it.
This is some problem with access rights. I change the ownership of the data files to a non-root user and then back to the influxdb user:

sudo chown -R 1000:1000 /var/lib/influxdb
sudo chown -R influxdb:influxdb /var/lib/influxdb

and InfluxDB comes back. There is not even a need to restart the service (I just need to wait a couple of seconds).

So, I have a workaround for the problem but have no idea what’s going on there. And my workaround doesn’t solve the problem completely, because after each reboot the database does not start automatically, so I need to enter the two commands above.

I can just imagine that there might be a problem with data buffering in RAM, so the process which flushes data to the sd card influences meta-data. But it’s just guessing, I’m not competent in Linux.

If anyone can explain what’s going on or can help to fix it - I’ll appreciate!

If the permissions are reverting back after a reboot that might point to the problem being a worn out SD card. Each SD card has a finite number of writes that it will support. When you reach that limit weird things start to happen, one of which is changes made before a boot are reverted after the boot.

If you are using ZRAM that too could cause a problem like this because all your changes are made in RAM and only flushed to the card on a reboot. However, assuming you are rebooting and not just pulling the power that’s probably not what’s happening here.

But I’ll say it again. If a system behaves strangely after a power loss, your best bet is to rebuild/restore from backup. You can’t trust the file system after that point. You’ve no idea what files have become corrupted and what that corruption will cause.

The SD card is fresh (SanDisk Ultra, 32GB), just a few days in use on the Raspberry Pi.
Also, I’ve deleted all Influx data and re-installed Influx completely several times. So all files should be new (meaning written to other cells of the card).
I doubt that the SD card is the source of the problem.
But I’ll follow your suggestion. I’ll take another SD card (never used with the Raspberry Pi) and restore the complete system from the image.

And I’ll never pull the power from the Raspberry Pi any more :)

By the way, regarding the UPS. I looked for them and found several devices which stack on the IOs.
The info I needed was how to shut down the Raspberry Pi automatically using a signal from that UPS. So I just want to share what I found. Some devices send info about the battery level over the I2C interface. Then, there are applications to shut down the Raspberry Pi at a low battery level. Here is an example of it. I ordered such a UPS and plan to install it. We’ll see how it works.

You failed to mention you run on openHABian. /var/lib/influxdb is on zram there and on boot rights get reset to what they are on SD source.
You can run menu option 14 or manually change /opt/zram/influxdb.bind/ to fix this.

But there should be no need to change permissions at least if you are using latest openhabian so I suspect you don’t.

I have the smallest UPS I could find that will work with Network UPS Tools. My goal is simply to keep my RPi and modem running when there are short outages (a few minutes or so) and shut down cleanly when there are long outages. In my case, there’s no point to openHAB being available when nothing else in my house has power.

Using NUT to monitor the UPS, openHAB notifies me when the power goes out, when it comes back, and if the RPi gets shut down.

I’ve never thought about this, but the added benefit of having my modem on the UPS is that it drains the battery until it’s dead (after the RPi has shut down). So when the power returns, the RPi boots up again on its own. But yeah, if the battery doesn’t fully deplete, you have to manually reboot the RPi.

Thank you very much for your input! Unfortunately menu option 14 didn’t help, and the file /opt/zram/influxdb.bind does not exist. Maybe my Influx instance is not running on zram, how can I find this out? In /opt/zram/persistence.bind there are no InfluxDB-related folders, but there are db4o, mapdb and rrd4j.
My /etc/ztab:

# swap  alg             mem_limit       disk_size       swap_priority   page-cluster    swappiness
swap    lzo-rle         200M            450M            75              0               80

# dir   alg             mem_limit       disk_size       target_dir                      bind_dir
dir     zstd            150M            350M            /var/lib/openhab/persistence    /persistence.bind

# log   alg             mem_limit       disk_size       target_dir              bind_dir                oldlog_dir
log     zstd            200M            450M            /var/log                /log.bind
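One way to answer the "is it on zram?" question is to list the target_dir column of every dir and log entry in /etc/ztab. The sketch below assumes the column layout shown in the header comments of the file above; zram_targets is a hypothetical helper name.

```shell
# Print the directories that /etc/ztab places on zram.
# zram_targets is a hypothetical helper name; the column positions follow the
# header comments of the ztab shown above (target_dir is the 5th column).
zram_targets() {
    ztab=$1
    # Only "dir" and "log" entries have a target_dir; comment lines start with "#".
    awk '$1 == "dir" || $1 == "log" { print $5 }' "$ztab"
}

# Example call:
#   zram_targets /etc/ztab
```

For the ztab above this prints /var/lib/openhab/persistence and /var/log, so /var/lib/influxdb is not listed there. Independently of ztab, findmnt -T /var/lib/influxdb prints the mount that currently backs that path.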

Sorry if I didn’t mention something. In my first post I wrote the platform information: I run openHAB v3.1.0, which is currently the latest stable version. I strictly followed this and this tutorial on openHAB when installing Influx. No “special” modifications to the system. I started with openHAB just a week ago, trying to build a simple logging system. Currently I have just one temperature/humidity sensor based on an esp8266 (Tasmota), and I use the UI to add bindings, channels and things. So, I mean my installation is quite simple, without any “tricks”.

Any other ideas?

You needlessly manually installed it on top of openhabian which has the option to install influx.
And those instructions are 5 years old and pretty sure outdated.
Install Influx from the menu.

Recommend you don’t use an SDCard - it will invariably get corrupt.

You are right, power gets interrupted in the real world but your industrial controller isn’t using SD Cards either.

I use a SATA/USB Adapter + SSD. It’s rock solid even after an unscheduled power outage.

Did two attempts.

  1. Made a full file system backup (with Win32DiskImager) and restored it to another SD card that had never been used with a Raspberry Pi before. The first time after boot I had to change the owner of the InfluxDB data folder and it worked. But again only for some time. After a couple of reboots (with sudo reboot or sudo halt), it stopped working with the same behavior.

  2. Did a clean installation: wrote the image to a fully formatted SD card using Balena Etcher, then installed Mosquitto and InfluxDB+Grafana from the menu system, then restored the openHAB backup using the menu system. Result - the same as above. It worked until the third or fourth reboot. Then again I have to change the ownership of the data folder and restart the service every time the system boots up.

I have no more ideas :(

  1. Well, you took a byte for byte backup of a system that is already corrupted. You need to restore from a backup from before the power loss. From before the problem started happening. Making a copy of a broken system and restoring that is only going to make a copy of the broken system.

  2. Restored just openHAB or restored from an Amanda backup? When was the backup taken? Are you using ZRAM? There could be something going on with ZRAM.

I usually do not quote myself but here goes (again):

If this time you installed from menu that directory should exist.

Yes, it exists.
I tried to apply menu option 14 but it didn’t help. I have to say that I tried a lot to solve the problem before applying menu option 14, so I can’t be sure if it helped or not.
But then I switched off authorization in /etc/influxdb/influxdb.conf and InfluxDB started, everything was working.
Then I changed the line bind-address = "192.168.0.2:8086" to bind-address = ":8086" and InfluxDB also works with HTTP authorization. I think I changed this line to the wrong state in the past when trying to solve the problem.
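To double-check which address the HTTP API is bound to, the bind-address value can be read from the [http] section of the config. In InfluxDB 1.x there is also a separate top-level bind-address (used for backup and restore), so the section matters. A minimal sketch; http_bind_address is a hypothetical helper name.

```shell
# Print the bind-address set in the [http] section of an InfluxDB 1.x config file.
# http_bind_address is a hypothetical helper name, not part of InfluxDB.
http_bind_address() {
    awk '
        # A new [section] header: remember whether it is [http].
        /^[ \t]*\[/ { in_http = ($0 ~ /^[ \t]*\[http\]/); next }
        # An uncommented bind-address line inside [http].
        in_http && /^[ \t]*bind-address[ \t]*=/ {
            value = $0
            sub(/^[^=]*=/, "", value)   # drop everything up to "="
            sub(/#.*/, "", value)       # drop a trailing comment
            gsub(/[ \t"]/, "", value)   # drop spaces and quotes
            print value
        }
    ' "$1"
}

# Example call:
#   http_bind_address /etc/influxdb/influxdb.conf
```

A value of :8086 means the API listens on all interfaces. A value such as 192.168.0.2:8086 restricts it to that one address, so a client connecting to localhost:8086, like the influx CLI in the first post, would not reach it.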

So far everything works, even after a couple of reboots and shutdowns. I hope it stays that way.

@rlkoshak I installed openHAB from the official image on the new SD card, then installed Mosquitto and InfluxDB+Grafana from the menu, and then restored the settings from the openHAB backup (not Amanda, not byte-for-byte). By default it used zram. Now it works with and without zram.

For the final installation I’m going to switch off zram and use the commit=1800 option in /etc/fstab. The project is to record temperature/humidity at a remote site without an internet connection. There will be 4-6 sensors sending data every 30 minutes, so the write frequency to the SD card is relatively low. The project will run for about 1 year. With such requirements I hope not to wear out the card. I’ll try to install the UPS, but it’s quite possible that somebody will pull the system out of the power socket. Then, as I understand it, if I use zram, the data will be flushed to the SD card only on a proper reboot (or am I wrong?). If so, it’s possible to lose months of data if there is a long power outage and for some reason the system does not shut down properly. The data is the core of this project. That’s why I want to disable zram.

Thank you very much Rich and Markus for your help. Thank all who commented. I learned a lot.

Duh. My belief was the fix was in there but it wasn’t. Added.

That’s right and reasonable not to use zram in this case… However zram does not exclusively apply to and help with DB data but also logs and swap (few know and everyone is overlooking this).
I would therefore not switch off zram completely but rather remove the Influx storage directory from zram. Just edit /etc/ztab to accomplish that.

PS: you should have told upfront about your use case. You made us victim to the XY problem:
How to ask a good question / Help Us Help You - Tutorials & Examples - openHAB Community

This is great! Thanks so much! I had the very same problem and this fixed it. So cool that you found a workaround AND (!) documented it here. I’m so grateful. :)