UI inaccessible after some reboots

angusc · May 20, 2018, 6:11am

I have a recurring issue: sometimes after a sudo reboot the Raspberry Pi restarts as expected, but I’m unable to access the UI via web port 8080 or the OH phone app. However, I just noticed that in this state the MQTT service is actually running, and if I operate a physical switch the action is recorded in the MQTT.fx subscribed topic… so it seems that only the UI is failing to come back up and is not accessible.
Pulling the power plug from the Raspberry Pi and plugging it back in can resolve the issue, but as I’m working away and this is a remote installation, that is not always possible.

Would anyone have any suggestions for what I could look at to pin down this issue? Is there any way that I can force a system reboot from the MQTT.fx client monitor?

Thanx and Regards
Chris

sihui · May 20, 2018, 7:14am

THAT is a really bad idea:

Read this on how to properly shutdown the RPi:

You most likely already have crashed your operating system.

You need to use ssh, see link above and:
https://docs.openhab.org/administration/console.html

vzorglub · May 20, 2018, 8:18am

Mosquitto is a separate application than openHAB (Not sure that is grammatically correct!)
You reboot your Raspi and mosquitto starts. Good
OH doesn’t. Not good…

Log in with ssh and check the status of OH

sudo systemctl status openhab2.service
You can reboot with
sudo reboot
or even shut down (which will require removing and reconnecting the power afterwards)
sudo shutdown -h now
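
Once SSH is reachable, the status check above can be wrapped in a small helper. This is only a sketch: the function name `oh_report` is made up for illustration, the unit name is the `openhab2.service` used above, and it relies on the standard `systemctl is-active` and `systemctl is-enabled` subcommands.

```shell
#!/bin/sh
# Sketch: report whether a systemd unit is running and enabled at boot.
# oh_report is a hypothetical helper name; the unit defaults to openhab2.service.
oh_report() {
    unit="${1:-openhab2.service}"
    # is-active prints e.g. "active", "inactive" or "failed"
    state="$(systemctl is-active "$unit" 2>/dev/null)"
    # is-enabled prints e.g. "enabled" or "disabled"
    boot="$(systemctl is-enabled "$unit" 2>/dev/null)"
    echo "$unit: state=${state:-unknown} boot=${boot:-unknown}"
    # Succeed only when the unit is actually running
    [ "$state" = "active" ]
}

# Usage on the Pi: show the last log lines of this boot if OH is not running
#   oh_report || sudo journalctl -u openhab2.service -b --no-pager | tail -n 50
```

If the unit turns out to be disabled at boot, `sudo systemctl enable openhab2.service` would be the usual fix; if it is enabled but failed, the journal output is the place to look.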

angusc · May 20, 2018, 9:11am

Therein lies the problem: I cannot log in using SSH because the connection is closed by the remote host after I enter the user name “openhabian”, so it is impossible to reboot again without pulling the power. I guess that is the problem with a headless system…

I did install TeamViewer hoping to be able to reboot from its interface, but unfortunately it doesn’t seem that reboot is supported on the RPi TeamViewer interface. Plus, in this state TeamViewer cannot log in either.

Maybe I should try installing VNC to see if that works.

Thanx I had set this up a couple of days ago to allow remote logging, but again without access to SSH I’m unable to access the logs.

P.S. My wife still not playing fair and has not powered off and on again…so I’m stuck waiting in limbo…

So how can we do an openHABian system check to find any corruption? Also, what is corrupted in OH by pulling the power out? Is it reversible, and can it be repaired?

Thanx Chris

sihui · May 20, 2018, 10:00am

You are sure you are using openHABian? Otherwise the password is different … check the docs again:
https://docs.openhab.org/administration/console.html

To check the status of openHAB you need to login via ssh. I don’t know of any other way …

It may not be openHAB, it may be your operating system. Usually if something like that happens one takes a new sd card and restores a backup.
If you don’t have a recent backup, try

But do a backup first!

Good luck.

angusc · May 20, 2018, 1:43pm

Thank you, sihui, for the detailed replies and all the informative links!

Yes I’m sure that to log in to SSH I enter user name of “openhabian” followed by my personal password. When I set up the remote access to the logs using the command:

ssh -p 8101 openhab@localhost

Then the default user:password are as per the documentation, and I changed the default password by using the command:

sudo sed -i -e "s/openhab = .*,/openhab = securePassword,/g" /var/lib/openhab2/etc/users.properties

I’m still waiting for my other half to re-power the RaspberryPi so that I can make a backup and then update OH to the latest version…
Do you know what the command is for:

Reset wife to default condition? --non argumentative mode

Thanx for your help and if it works then I will mark as solution.

sihui · May 20, 2018, 2:06pm

If you are trying to access the ssh shell from remote you need to bind the service to all interfaces. By default you can only login from localhost:
https://docs.openhab.org/administration/console.html#bind-console-to-all-interfaces

Same problem here, tell me if you’ve found a solution

angusc · May 20, 2018, 2:36pm

I’m using a DNS name to access my home internet connection, something like ‘myname.ddns.net:8080’. That allows me to access the OH installation within my home network. The same goes for port 22 when logging in via SSH.

angusc · May 20, 2018, 6:25pm

OK so after a couple of days my daughter eventually was dispatched by my wife to do the deed and restart the Pi!

I ran the below update command and wanted to know if that looks OK or there is anything else that I should do regarding the upgrade (I did the back up thing first):

[20:18:44] openhabian@openHABianPi:~$ sudo apt-get install --reinstall openhab2
Reading package lists... Done
Building dependency tree
Reading state information... Done
0 upgraded, 0 newly installed, 1 reinstalled, 0 to remove and 37 not upgraded.
Need to get 68.6 MB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 https://dl.bintray.com/openhab/apt-repo2 stable/main armhf openhab2 all 2.2.0-1 [68.6 MB]
Fetched 68.6 MB in 42s (1,610 kB/s)
(Reading database ... 54399 files and directories currently installed.)
Preparing to unpack .../openhab2_2.2.0-1_all.deb ...
Unpacking openhab2 (2.2.0-1) over (2.2.0-1) ...
Setting up openhab2 (2.2.0-1) ...
Processing triggers for systemd (232-25+deb9u2) ...
Updating FireMotD available updates count ...
[20:20:48] openhabian@openHABianPi:~$

rlkoshak · May 20, 2018, 6:56pm

You have essentially exposed your entire home automation system to the internet without encryption or authentication?

Please, please, please don’t do this, ever. At a minimum use port 8443 (HTTPS), but really you should, no, must use a reverse proxy with authentication. See the security section under the installation section of the user guide for how to set up nginx or Apache as a reverse proxy.

But even then, I wouldn’t recommend using a reverse proxy unless you really know what you are doing and know how to monitor and detect when your system is compromised and how to mitigate the damage when, not if, your machine does get hacked. The fact that you opened OH up to the internet with no protections whatsoever indicates you probably do not have these skills. So I highly recommend using myopenhab.org with the Cloud Connector binding to access your OH when not at home.

For ssh, I hope you are using certificates for logging in and have turned off password only based logins.

Every well-known port exposed to the internet is under constant attack. If your system is like mine, I bet you will see in sshd’s log at least one failed login attempt per minute, if not more frequently. You won’t even see the attacks against OH because there are zero protections in place.
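
To get a feel for how often that happens, the failed logins in sshd’s log can be tallied per source address. A rough sketch, assuming the usual OpenSSH message format ("Failed password for ... from ADDRESS port ..."); the helper name `failed_ssh_by_ip` is made up, and `/var/log/auth.log` is the Raspbian default location, which may differ on other systems.

```shell
#!/bin/sh
# Sketch: summarise failed SSH password logins per source address.
# failed_ssh_by_ip is a hypothetical helper; it reads sshd log lines on stdin.
failed_ssh_by_ip() {
    # Typical line:
    #   "... sshd[123]: Failed password for root from 203.0.113.5 port 4711 ssh2"
    # Print the word that follows "from", then count occurrences per address.
    awk '/sshd.*Failed password/ {
             for (i = 1; i < NF; i++)
                 if ($i == "from") { print $(i + 1); break }
         }' | sort | uniq -c | sort -rn
}

# Usage (assuming Raspbian's default log location):
#   sudo cat /var/log/auth.log | failed_ssh_by_ip | head
```

Each output line is a count followed by an address, busiest sources first. It only covers password attempts that sshd logged, so it says nothing about probes against port 8080.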

Here is the tl;dr on why powering off any machine that uses flash storage causes corruption.

Flash writes in blocks.

Each block may contain parts of other files (e.g. config files, OS executables, etc.), which also get rewritten when the new write adds its data to the block. What really happens is that the contents of that block of storage are moved to another physical block in the flash storage.

When you pull the power while a write is taking place, you not only lose that write but everything that was in that block.

Unless you go out of your way not to, something is always being written to the SD card. So the risk of corruption is really high. And when a corruption occurs, you cannot predict what gets corrupted. It might be a bit of log file or it might be the driver that allows your RPi to connect to your WiFi.

There really is no way out of the box to predict or detect that this corruption occurred or where it occurred. So if you even suspect it has happened, your only recourse is to rebuild the entire SD card from scratch or from a known good backup.

Another problem can occur after you have been running for quite a while. SD cards wear out, and when that happens corruption can also take place. In that case, you need to rebuild on a brand new SD card.

The fact that your network at worst, sshd at best, stopped working strongly implies that corruption did occur. Of course, since you are running completely unprotected on the internet, someone having attempted and possibly even succeeded in hacking your machine is a possibility as well.

If it were me, I’d potentially save some time and rebuild from scratch on a new SD card and, once rebuilt, never expose it to the internet. Or if you must, access it through ssh using certs for authentication (you can set it up to require both a cert and a password to get 2-factor auth) or set up a VPN. OpenVPN works nicely on an RPi, but there are other, easier options as well. Or access your OH UIs through myopenhab.org.
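
For the ssh part, the relevant settings live in `/etc/ssh/sshd_config`. The sketch below only prints a candidate fragment for review; `sshd_key_only_settings` is a made-up helper name, the keywords are standard OpenSSH options, and it assumes "certs" here means ordinary SSH key pairs and that the rest of the file is at Raspbian defaults.

```shell
#!/bin/sh
# Sketch: print sshd_config settings for key-based login.
# sshd_key_only_settings is a hypothetical helper; review the output and
# merge it into /etc/ssh/sshd_config by hand.
sshd_key_only_settings() {
    cat <<'EOF'
# Allow public-key logins and refuse password logins
PubkeyAuthentication yes
PasswordAuthentication no
# Alternative 2-factor setup: require a key AND the account password.
# (needs "PasswordAuthentication yes" instead of "no" above)
#AuthenticationMethods publickey,password
EOF
}

# Usage:
#   ssh-keygen -t ed25519                  # on the client
#   ssh-copy-id openhabian@<host>          # while password login still works
#   sshd_key_only_settings                 # review, then edit sshd_config
#   sudo sshd -t && sudo systemctl restart ssh
```

Keeping an existing SSH session open while testing a new login is a sensible precaution, since a mistake here on a headless remote box leads straight back to pulling the plug.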

Addendum: in some circumstances you might be able to detect corruption if you use tripwire or aide to alert you when a file unexpectedly changes. However, it won’t be foolproof (what if the tripwire database is what gets corrupted?) and you have to be really knowledgeable to know which files are supposed to change and when, in order to know if something unexpected happened.
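
The basic idea behind such tools can be illustrated with plain checksums. This is a much cruder stand-in for tripwire or aide, not a replacement; `make_baseline` and `check_baseline` are made-up names, and the sketch assumes GNU coreutils `sha256sum`, which Raspbian ships.

```shell
#!/bin/sh
# Sketch: record SHA-256 checksums for a directory, then verify them later.
# make_baseline and check_baseline are hypothetical helper names.

# Write "checksum  path" lines for every regular file under DIR to stdout.
# Use an absolute DIR so the check works from any working directory.
make_baseline() {
    find "$1" -type f -exec sha256sum {} + | sort -k 2
}

# Verify a saved baseline; reports only files that changed or went missing
# and returns non-zero if any check fails.
check_baseline() {
    sha256sum --check --quiet "$1"
}

# Usage (store the baseline OFF the SD card, e.g. on a USB stick or NAS):
#   make_baseline /etc > /mnt/usb/etc.sha256
#   check_baseline /mnt/usb/etc.sha256
```

The same caveat applies as with tripwire: a mismatch only says that a file differs from the baseline, not whether the change was legitimate.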

angusc · May 20, 2018, 8:39pm

No, it is just my test setup that I’m playing with while away from home, but thank you for such candid observations and suggestions. I will switch to RSA keys for authentication.

Back to my question above: does the output from SSH look to be in order? Furthermore, if the SD card becomes corrupted, is it terminal for the card, or will a reformat fix the card and make it “whole” again?

vzorglub · May 20, 2018, 8:48pm

There are no errors, so everything appears to be in order.

No. Once an SD card starts to “go” the individual memory “cells” are gone and can’t be retrieved. They will be set aside by formatting, but there will be new ones sooner rather than later. Put it in the bin.

rlkoshak · May 20, 2018, 9:06pm

The only thing that is terminal for the card is wearing out. A reformat or reimage will undo any corruption that was done, assuming the image you are restoring isn’t corrupted.

If you are in doubt as to whether the problem is wearing out versus a power failure, then I always recommend replacing the SD card.

Unless your test setup is somehow isolated from the rest of your LAN, it can become a starting point from which to attack the rest of your computers. And even if it is isolated, it can become compromised and join a botnet to attack other computers or be conscripted for Bitcoin mining and the like.

It doesn’t tell me anything really. If, for example, the sshd binary got corrupted, there is nothing in the above output that will tell me of that fact. In this context, that output doesn’t tell me anything useful except perhaps that the apt binary isn’t corrupted and perhaps most of the apt-related support files are OK too.

This is good advice only for a card wearing out. If the problem was caused by a loss of power, only the file system is corrupted. The card itself is fine and doesn’t need to be replaced.

Where it becomes a problem is when you don’t know the cause in which case I usually default to replace the card. The cost of a new card is usually less than the time it takes to try continuing with a potentially wearing out card.

angusc · May 21, 2018, 8:38am

Thanx for your replies, the sdcard is only 3 months old so should be ok, but I know what you mean about time vs money in regards to how long it takes to setup and later issues rather than replacement of the card from the outset.

As this is my test system that will become live at some point in the future, I’m more concerned about system reliability, as I will be automating a remote location. That brings me to another question: is it better to have OH installed on a Raspberry Pi or on a NAS such as a Synology in terms of system reliability and data corruption issues?

rlkoshak · May 21, 2018, 12:36pm

It is better not to use an SD card or flash for your main storage. It is good to have a UPS battery backup and to shut down gracefully in a power outage. Even on an HDD, file system corruption can occur at loss of power, though the damage is usually much less severe.

Beyond that it’s mox nix. The RPi itself is capable of running for months at a time. I had one that got up to almost 200 days without a reboot, and it would have gone on indefinitely had we not had a blackout. But this RPi is configured to have almost no writes to the SD card and it is not running OH.

angusc · May 21, 2018, 1:17pm

Thanx Rich,

You had me there on ‘macht nichts’ but Google helped me out. So would an OH installation on the home NAS be more robust than the RPi? It sounds like the RPi with a standard OH installation is going to need regular shutdowns and restarts to avoid problems from the frequent file writes that take place as standard. For that I would need a way to disconnect the power so the RPi can be restarted after correctly shutting down with ‘sudo shutdown’. I could use an NC relay and an ESP, with a command sent over the local network to open and close the relay, given that the RPi would be in a powered-off state.

I already purchased a power bank that I will use as a UPS once I return to base and can lay my hands on the RPi.

rlkoshak · May 21, 2018, 2:55pm

I have a bad habit of using English/American idioms too frequently on this international forum. I had a great thread a while back because I used the idiom “red herring” (a misleading clue).

Define “more robust”. OH will probably have more CPU and memory to work with. If you are writing to an HDD or SSD, your chances of file corruption or of an SD card wearing out are lower. If you have a NAS in RAID, then an HDD/SSD failure is even less of a problem.

But you can run an RPi off of an HDD/SSD. So it isn’t a matter of the RPi being a problem; it is the SD card that is the weak point. If you already have a NAS, you can run OH on that NAS or configure the RPi to put the heavy writes on that NAS over the network (logs, persistence).

No, when the SD card wears out the only way to fix it is to replace the SD card. Restarting the RPi will do nothing to fix the problem. And as discussed above, disconnecting the power has a high likelihood of corrupting your SD card, which will only exacerbate the problem. Never pull the power from an SD-card-driven device unless you have no other choice.

But overall, no matter what you run on, there will be failures. You must have a good backup and restore procedure. Once you have that, then using a more robust system, whatever that may be, really only buys you time between failures. So you need to weigh whether it is worth the extra time to set up something more complex.
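
As a starting point for the backup half of that, something as simple as a timestamped tarball of the configuration directories already helps. A minimal sketch: `backup_dirs` is a made-up helper name, and the openHAB paths in the usage comment assume an apt-based openHAB 2 install.

```shell
#!/bin/sh
# Sketch: archive configuration directories into a timestamped tarball.
# backup_dirs is a hypothetical helper name.
# Usage: backup_dirs DEST_DIR SRC_DIR [SRC_DIR...]
backup_dirs() {
    dest="$1"; shift
    mkdir -p "$dest" || return 1
    archive="$dest/backup-$(date +%Y%m%d-%H%M%S).tar.gz"
    # -c create, -z gzip, -f archive file
    tar -czf "$archive" "$@" || return 1
    # Listing the archive confirms it is readable before we rely on it
    tar -tzf "$archive" >/dev/null || return 1
    # Print the archive path so callers can copy or log it
    echo "$archive"
}

# Usage, run as root (paths are assumptions; the destination should be a
# different device than the SD card, e.g. a NAS mount):
#   backup_dirs /mnt/nas/openhab /etc/openhab2 /var/lib/openhab2
```

A backup is only half of the procedure; restoring it onto a spare SD card once, before it is needed, is the real test.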

angusc · May 26, 2018, 4:29pm

Hi Rich,

Well they do say “a little knowledge is a dangerous thing”, I’m sure there is a German equivalent…

Anyway, I wonder if you might be able to help me out of this one…:

I ran

sudo visudo

And added lines

root    ALL=(ALL:ALL) ALL
openhab ALL=(ALL:ALL) ALL
osmc    ALL=(ALL:ALL) ALL

and

sudo ALL=NOPASSWD: ALL

but I think I messed up as my usual login is with openhabian and not openhab… so now I’m unable to access that file to edit it, as I get the message:

openhabian is not in the sudoers file.  This incident will be reported.

Any advice (apart from the obvious)?

Thanx in anticipation…

rlkoshak · May 26, 2018, 5:44pm

Ouch. So your options are to:

start over, or
mount the SD card on another Linux system, run visudo -f /path/to/sudoers/on/sdcard and add openhabian back in.

I don’t think it is required to modify sudoers to give openhab sudo permission. I think you just need to add the openhab user to the sudo group. I could be wrong with openHABian though. That is how standard Raspbian works.

Finally, please don’t give openhab full sudo permission on everything. You may as well run OH as root. If you must give openhab sudo, only do so for those commands it needs.
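
A restricted rule of that kind could look like the output of the sketch below. The helper name `restricted_sudoers`, the drop-in file name and the command paths are all assumptions for illustration; check the real paths with `command -v reboot` on the Pi. It also assumes the stock `#includedir /etc/sudoers.d` line is present in `/etc/sudoers`.

```shell
#!/bin/sh
# Sketch: generate a sudoers rule that lets one user run only a few
# commands as root without a password. restricted_sudoers is a hypothetical
# helper; the user name and command paths are assumptions.
# Usage: restricted_sudoers USER CMD [CMD...]
restricted_sudoers() {
    user="$1"; shift
    cmds=""
    for c in "$@"; do
        # sudoers requires absolute command paths
        case "$c" in
            /*) ;;
            *) echo "not an absolute path: $c" >&2; return 1 ;;
        esac
        # Join the commands with ", "
        cmds="${cmds:+$cmds, }$c"
    done
    [ -n "$cmds" ] || return 1
    echo "$user ALL=(root) NOPASSWD: $cmds"
}

# Usage: write to a scratch file, validate it, then install it.
#   restricted_sudoers openhab /sbin/reboot /sbin/shutdown > /tmp/openhab-reboot
#   sudo visudo -c -f /tmp/openhab-reboot && \
#     sudo install -m 0440 -o root -g root /tmp/openhab-reboot /etc/sudoers.d/openhab-reboot
```

Validating with `visudo -c -f` before installing is what avoids the lock-out described above: a syntax error is caught while the existing sudo setup still works.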

angusc · May 26, 2018, 7:05pm

Thanx for your advice…I have to wait for two weeks until I return home to be able to do that fix.

Will let you know how it goes then.

BR
Chris