EXTRA_JAVA_OPTS -Xmx settings ie Grafana crashes openhab 2.4

Well, it seems I’m about to give up on finding the reason why Grafana 5.1.4 needs PhantomJS.
This is the error I get:

t=2019-01-09T23:56:37+0100 lvl=info msg=Rendering logger=png-renderer path="d-solo/UgPbUoggz/stort-bad?refresh=30s&orgId=1&panelId=4&from=1547052995409&to=1547074595409&width=1000&height=500&tz=UTC%2B01%3A00"
t=2019-01-09T23:56:37+0100 lvl=eror msg="Could not start command" logger=png-renderer LOG15_ERROR= LOG15_ERROR="Normalized odd number of arguments by adding nil"
t=2019-01-09T23:56:37+0100 lvl=eror msg="Rendering failed." logger=context userId=1 orgId=1 uname=admin error="fork/exec /usr/share/grafana/tools/phantomjs/phantomjs: no such file or directory"
t=2019-01-09T23:56:37+0100 lvl=eror msg="Request Completed" logger=context userId=1 orgId=1 uname=admin method=GET path=/render/d-solo/UgPbUoggz/stort-bad status=500 remote_addr=10.4.28.30 time_ms=15 size=1703 referer="http://10.4.28.237:3000/d/UgPbUoggz/stort-bad?refres$

I renamed the PhantomJS file back to its original name. Then I took a look at @matt1’s suggested changes in the link he provided.

I changed the /etc/default/openhab2 to
EXTRA_JAVA_OPTS="-XX:+HeapDumpOnOutOfMemoryError -Xmx512m"

And then rebooted.

After this, I can render 3 charts without a problem (almost)… It seems like something else fails when I do so.

2019-01-10 00:06:32.003 [WARN ] [su.litvak.chromecast.api.v2.Channel ] - Error while reading
su.litvak.chromecast.api.v2.ChromeCastException: Remote socket closed
	at su.litvak.chromecast.api.v2.Channel.read(Channel.java:425) ~[235:org.openhab.binding.chromecast:2.4.0]
	at su.litvak.chromecast.api.v2.Channel.access$200(Channel.java:51) ~[235:org.openhab.binding.chromecast:2.4.0]
	at su.litvak.chromecast.api.v2.Channel$ReadThread.run(Channel.java:137) [235:org.openhab.binding.chromecast:2.4.0]
2019-01-10 00:06:32.060 [WARN ] [su.litvak.chromecast.api.v2.Channel ] -  <--  null payload in message 

==> /var/log/openhab2/events.log <==

2019-01-10 00:06:32.084 [hingStatusInfoChangedEvent] - 'chromecast:chromecast:255f3cf49521e13fa5f92fc38ae7ac51' changed from ONLINE to OFFLINE
2019-01-10 00:06:32.098 [hingStatusInfoChangedEvent] - 'chromecast:chromecast:255f3cf49521e13fa5f92fc38ae7ac51' changed from OFFLINE to OFFLINE (COMMUNICATION_ERROR): Interrupted while waiting for response

It does come back online again a few seconds later.

2019-01-10 00:06:42.472 [hingStatusInfoChangedEvent] - 'chromecast:chromecast:255f3cf49521e13fa5f92fc38ae7ac51' changed from OFFLINE (COMMUNICATION_ERROR): Interrupted while waiting for response to ONLINE

This only happens sometimes…

What’s good is that openhab doesn’t seem to crash, even though I’m using PhantomJS and rendering… I have been thinking that maybe webview would be better… It’s a lot faster. I have not tested how much memory it takes, though. I’ll keep a close eye on the system over the next few days… I don’t really trust it atm.

However, I still wish I could find the reason why Grafana needs PhantomJS in version 5.1.4. From what I’ve read, it’s only required from version 5.2.0.

Webview with no PhantomJS present? That’s the main point. There should be no PhantomJS on your machine at all.

The webview URL is slightly different; see my example. If you are using the same URL as you are with Image, then there is no difference as far as Grafana is concerned.
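To make the URL difference concrete, here is a minimal Python sketch. It assumes (this is not confirmed anywhere in the thread) that the static-image URL is the panel URL with a `/render` prefix, as the `path=/render/d-solo/...` entry in the log above suggests. The helper name `render_url_to_embed_url` is hypothetical, and the host and dashboard path are simply copied from that log.

```python
from urllib.parse import urlsplit, urlunsplit


def render_url_to_embed_url(render_url: str) -> str:
    """Turn a '/render/d-solo/...' image URL into the '/d-solo/...' panel URL.

    Assumption: the only difference between the two is the '/render' prefix.
    """
    parts = urlsplit(render_url)
    prefix = "/render/"
    if not parts.path.startswith(prefix):
        # Not a render URL (perhaps already a panel URL): leave it untouched.
        return render_url
    embed_path = "/" + parts.path[len(prefix):]
    return urlunsplit(
        (parts.scheme, parts.netloc, embed_path, parts.query, parts.fragment)
    )


# Host and dashboard path taken from the log lines earlier in the thread.
image_url = (
    "http://10.4.28.237:3000/render/d-solo/UgPbUoggz/stort-bad"
    "?orgId=1&panelId=4&width=1000&height=500"
)
print(render_url_to_embed_url(image_url))
```

A URL without the `/render` prefix is drawn by the browser inside the Webview, so PhantomJS should not be involved on the Grafana side.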

Actually, that version comes with PhantomJS, and if you are generating static images in that version, you are using PhantomJS to do so.

The newer version of Grafana does not ship with PhantomJS and has no ability to generate static image charts unless you go out of your way to install PhantomJS yourself.

They got rid of PhantomJS for a reason…

Because older versions of Grafana have always used PhantomJS and shipped with it. They dropped the library with the latest version because PhantomJS is old, buggy, and no longer being maintained. They did not, however, provide a replacement for the functionality that library provided. So your choice is to manually install PhantomJS yourself and suffer the consequences, or abandon generating static chart images.

PhantomJS is only required to be installed manually from 5.2.0 on. Previous versions of Grafana ship with the library included.

This is what I was trying to get at in the recommendation thread. If you are using the latest version of Grafana, which is installed by openHABian, then you should not be experiencing this sort of problem. It is the manual steps you took to add PhantomJS which ultimately led to the problem.

I found I got fewer RAM issues out of a Pi when I left off the Xms parameter. Since I no longer use a Pi, I never looked into it in enough depth to be 100% sure why this was. Perhaps it was fragmentation caused by the use of a GC which does not compact after cleaning.

I value my time more than the cost of an upgrade, so I see it as a waste of time trying to make a system that is clearly near its maximum work, as it may just stop working with the next update or when I add a few extra components to my smart home… I now use an Odroid C2 since it has twice the RAM and twice the CPU power, and everything you own for the Pi can be moved across except the case, which may work too if you take a dremel/file to it.

If I outgrow the C2 (I use 5% of the CPU and only 25% of the RAM, so it’s not close to running out for what I do), I would be looking at this as my next step, but note the RAM will cost you extra as it is not supplied…
https://www.hardkernel.com/shop/odroid-h2/

You can then use your SSD via real SATA 3 and unlock its speed (or use an M.2 drive) and have up to 32 GB of RAM. All in a fanless design with very low power draw. It is x86 based, which means you get fewer issues from things in Linux not working on ARM processors, which happens from time to time.

Yes… I misunderstood the 5.2.0 update of Grafana. I thought PhantomJS was introduced at that time, and that previous versions didn’t have it.

Not exactly, Rich.
The Grafana version that comes with openhabian-config is 5.3.4. This version requires PhantomJS to be able to render charts, right?
But you have a point in that Grafana will work without PhantomJS by using Webview instead. And this makes it acceptable for being included in openhabian-config. However, it should be noted that the use of rendering (PhantomJS) is not recommended, especially not for a user with a computer like the Rpi…

I guess we can agree on this?

However, there is still something strange going on which applies to openhab…
I had no problems with rendering Grafana charts using openhab 2.3 together with Grafana 5.1.4, which we just agreed, used PhantomJS for rendering as well.
My problems started when updating openhab from version 2.3 to version 2.4, without making any changes to Grafana 5.1.4.

This means something must have changed in openhab 2.4 to make this problem appear.
Markus has written this a few times, and I finally agree: it could be that openhab 2.4 just consumes more memory/CPU, which makes this problem appear, together with PhantomJS being a really shitty piece of software.

A few minutes ago openhab crashed again due to rendering. Meaning, the changes Matt suggested in his link didn’t change anything anyway. The difference between last night and now is that I’m connected via the openhab Android app. Last night I was connected locally from my workstation using Google Chrome.

Final conclusion:
If using an Rpi and Grafana - don’t use PhantomJS to render Grafana charts in openhab versions >2.3.
It will probably work fine with openhab 2.3 and previous versions, but not with openhab 2.4 (nor with openhab 2.5, which is what I’m running on my other Rpi 3B+, i.e. the test setup).
Other hardware configurations may or may not work. Overall it’s probably best to just stay away from chart rendering (PhantomJS).

I believe this (or something similar) should be an important notice in the docs of openhabian-config for openhab versions >2.3.

Agree… I have given up trying to solve this.
I’ll be using webview for Grafana charts until I find a suitable replacement for my Rpi.

The Odroid C2 does look interesting. However, I couldn’t find any info on whether it can boot from USB.
I’m thinking about getting an Intel NUC instead. The Intel NUC (J4005) is just a bit more expensive than the Odroid C2 16GB Kit.

Give it one more try and add another swapfile or grow the existing one (see my post on how to, but don’t set the swappiness value).
At least in theory, the memory that’s usable by processes is not RAM but virtual memory, which includes swap, so no Linux should be killing processes unless swap is exhausted, too.
Dunno if Raspbian works differently, but it’s worth a try.

I’ll take a look at your post. Thanks Markus.

Wrong. It only requires that library to render static images of your charts. And as of that version of Grafana, that feature is no longer supported. Grafana removed PhantomJS and did not replace it with anything (not a move I agree with but I have no influence over what Grafana does).

One can ignore this and restore the ability to render static images of charts by manually restoring PhantomJS. But that puts your Grafana in an unsupported and unrecommended configuration.

But static images are not the only way to put charts on your sitemap and the other ways are still supported.

I believe the original tutorial is a wiki and can be edited by anyone to remove the parts that talk about static images.

I don’t see what that could have been since all OH does is call the URL that you pass to it at the refresh rate defined. There isn’t anything that OH could do to cause Grafana to run off the rails like that except to bombard it with requests too fast. While something like that is possible, I would expect that other users who are using the Image tag on the sitemap would be seeing similar problems and to my knowledge there has been no report of it. I don’t see how OH 2.4 using marginally more resources would suddenly cause PhantomJS to go nuts.

I agree the InfluxDB+Grafana tutorial should be updated. I do not think this needs to be mentioned in the openHABian docs. openHABian does not install PhantomJS nor does it provide any information about the option to install it. As far as I know, the only place a user will learn to install it manually is from that thread on the forum or from external sites. We can only take so much responsibility for what users do to their system outside the scripts. We’d have a thousand pages worth of warnings.

I don’t understand this…
If I click the share button of a panel from inside Grafana and then click on direct link, it opens a new browser window and gives a rendering error if PhantomJS isn’t installed in the grafana/tools/ folder.

I know of no other way to put charts on a sitemap except for using webview, as you suggested.

I did expect that as well.
But like I said earlier, this is not just about the upgrade from openhab 2.3 to openhab 2.4 (meaning, it’s not just this single Rpi setup). I also ran the test on my other Rpi (the very minimal test setup with only 3 bindings running). It crashed as well, though it can handle more charts at the same time before it crashes. That’s a clean 2.4 release install.

I would agree with you if it weren’t for the situation I described at the start of this post. If PhantomJS isn’t available, using the direct link will result in a rendering error. I believe this will force some users to go looking for PhantomJS and install it (just like I did). I don’t see myself as being that unique :slight_smile:

Exactly. The original tutorial presents two ways to put charts on your sitemap using a Webview.

Image is no longer supported by Grafana.

Which is why I recommend changing the tutorial which is the only place I’m aware of that mentions the direct link.

Markus, should I just delete the value, or comment out the line CONF_SWAPSIZE=100?
I assume it requires a reboot after the change?

EDIT: I gave it a try without the value and rebooted the Rpi. It did start, but it felt rather slow.
After startup I opened a few sitemaps with rendered charts. It went well. While playing around with sitemaps and rendering charts, I used the free -h command to study what’s going on…

This is how it looked:

[23:09:02] openhabian@openHABianPi:~$ free -h
             total       used       free     shared    buffers     cached
Mem:          970M       938M        32M       6.7M        20M       330M
-/+ buffers/cache:       587M       383M
Swap:         1.9G         0B       1.9G
[23:11:27] openhabian@openHABianPi:~$ free -h
             total       used       free     shared    buffers     cached
Mem:          970M       939M        31M       6.7M        21M       330M
-/+ buffers/cache:       587M       383M
Swap:         1.9G         0B       1.9G
[23:11:56] openhabian@openHABianPi:~$ free -h
             total       used       free     shared    buffers     cached
Mem:          970M       633M       337M       944K       4.1M        67M
-/+ buffers/cache:       562M       408M
Swap:         1.9G        53M       1.9G
[23:15:56] openhabian@openHABianPi:~$ free -h
             total       used       free     shared    buffers     cached
Mem:          970M       649M       320M       416K       1.1M        54M
-/+ buffers/cache:       594M       376M
Swap:         1.9G       485M       1.4G
[23:19:13] openhabian@openHABianPi:~$ free -h
             total       used       free     shared    buffers     cached
Mem:          970M       565M       405M       584K       5.0M        88M
-/+ buffers/cache:       472M       498M
Swap:         1.9G       214M       1.7G
[23:20:28] openhabian@openHABianPi:~$ free -h
             total       used       free     shared    buffers     cached
Mem:          970M       634M       336M       980K       1.6M        66M
-/+ buffers/cache:       566M       404M
Swap:         1.9G       105M       1.8G
[23:24:50] openhabian@openHABianPi:~$

It looks like it makes heavy use of the swapfile now, 485M at peak. That’s way beyond the default 100M it was originally set to…
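For anyone who wants to pull the peak out of a series of `free -h` samples rather than eyeballing it, here is a small Python sketch. It assumes the older column layout shown above (used swap is the third field on the `Swap:` line); the helper names `to_bytes` and `peak_swap_used` are made up for illustration. The sample data is the distinct `Swap:` lines from the output above.

```python
import re

# Multipliers for the single-letter suffixes printed by `free -h`.
UNITS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def to_bytes(value: str) -> float:
    """Convert a human-readable size such as '485M' or '1.9G' to bytes."""
    match = re.fullmatch(r"([\d.]+)([BKMG])", value)
    if match is None:
        raise ValueError(f"unrecognised size: {value!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


def peak_swap_used(free_output: str) -> str:
    """Return the largest 'used' value found on any 'Swap:' line."""
    used = [
        line.split()[2]  # fields: 'Swap:', total, used, free
        for line in free_output.splitlines()
        if line.startswith("Swap:")
    ]
    return max(used, key=to_bytes)


samples = """\
Swap:         1.9G         0B       1.9G
Swap:         1.9G        53M       1.9G
Swap:         1.9G       485M       1.4G
Swap:         1.9G       214M       1.7G
Swap:         1.9G       105M       1.8G
"""
print(peak_swap_used(samples))  # 485M
```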

Does it mean this is solved? Well, not really…
I couldn’t resist trying to push it to its limits. So while being connected to a sitemap with 3 charts rendering, I took my phone and the openhab Android app and opened the same sitemap from it. Then things really slowed down. In fact it slowed down so much that the Unifi Binding failed and lost connection… It came back online a few seconds later.
But what does seem to be solved is that openhab doesn’t crash anymore (or at least I couldn’t get it to crash, only to slow down a lot). But losing a binding even for a few seconds can be bad enough, though.

Of course it will be slow, see my previous reply where I stated you never want a swap to be in use.

Your DDR2 RAM is capable of >4000 MB per second.
Your SSD is crippled by USB 2, which is shared, and will not be reaching 60 MB per second.

66 times slower. If you need more ram, you need more ram.
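The arithmetic behind that figure, as a quick Python sketch. The two throughput numbers are the ones quoted in this post, not measurements, and the function name `slowdown` is just for illustration.

```python
def slowdown(fast_mb_per_s: float, slow_mb_per_s: float) -> float:
    """How many times slower the slow device is than the fast one."""
    return fast_mb_per_s / slow_mb_per_s


# Figures quoted above: >4000 MB/s for RAM, <60 MB/s for an SSD behind USB 2.
ratio = slowdown(4000, 60)
print(f"{ratio:.1f}")  # 66.7, i.e. the "66 times" above, rounded down
```

Since the RAM figure is a lower bound and the USB 2 figure an upper bound, the real gap is at least this large.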

No comparison: the J4005 is dual core and the Odroid H2 is quad core. It also has an extra SATA 3 port and the far better NVMe slot for seriously fast drives, and supports up to 32 GB of RAM instead of only 8. Of course an i5 NUC would be better, but the price jumps, and you may not need that amount of CPU power.

The Odroid C2 can boot from USB, but only if you use a flash card to tell the box to boot from USB; the Pi 1 and 2 used to do it this way.

I personally prefer small cheap systems as I keep a full second system ready to go in case something breaks and I am not around to fix it.

Of course… But the Rpi doesn’t offer this option :slight_smile:

I was comparing C2 with the NUC J4005.
For the H2, it should be compared with the NUC J5005 (NUC7PJYH).

I like that idea as well. And I agree the Rpi/Odroid are suitable for this without paying a lot of money.

Then that is awesome if you can get the NUC for the price of a C2 (locally the NUC with RAM and drive will cost double a C2 setup), as it will unlock the speed of your SSD and give you room for more RAM. The NVMe drives are faster than SATA SSDs, so it is a good feature to have for future proofing, but it is not needed.

Time will tell but at least you have much more memory available to processes now.
(AFAIK Raspbian will now allocate 2 times RAM = 2G for swap so your processes may now grow to 3G in total.)

And now that you mentioned CONF_SWAPSIZE was 100M only, so you had 1.1G virtual mem in total, it’s obvious why that caused your problems.
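A back-of-the-envelope check of those totals in Python, using the figures `free -h` reported earlier in the thread (970M of RAM, 1.9G of swap after the change) and treating virtual memory as simply RAM plus swap. That is a simplification, and `virtual_memory_mb` is a hypothetical helper name.

```python
def virtual_memory_mb(ram_mb: float, swap_mb: float) -> float:
    """Memory available to processes, taken as RAM plus swap (simplified)."""
    return ram_mb + swap_mb


RAM_MB = 970  # "Mem: total" from the free -h output in this thread

before = virtual_memory_mb(RAM_MB, 100)        # old CONF_SWAPSIZE=100
after = virtual_memory_mb(RAM_MB, 1.9 * 1024)  # 1.9G swap reported by free -h

print(f"before: {before / 1024:.1f}G")  # before: 1.0G
print(f"after:  {after / 1024:.1f}G")   # after:  2.8G
```

The exact values come out slightly under the rounded 1.1G and 3G quoted above because the Pi reports 970M of usable RAM rather than a full 1G.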

Well yes and no. Yes of course it’s slower: you’ll see heavy paging right after you start up the Pi and OH, so startup may be noticeably slower, but once things are paged out for the first time most RAM pages needn’t be touched (copied) again, so from then on it’s as fast as usual.
And even if slower, does that create any problem? No, it doesn’t. No need for additional RAM.

The Intel NUC7CJYH (J4005) incl. power supply (and case, of course) = £115
The Intel NUC7PJYH (J5005) incl. power supply (and case, of course) = £165
That’s locally here in Denmark. The Odroid I’ll have to order from the UK.

I don’t think you can compare the C2 and the NUC (J4005) on price. The NUC is a totally different CPU architecture (x86). The C2 is ARM.
But comparing the NUC (J5005) and the Odroid H2, it looks much different. The NUC is far better, and yet a lot cheaper. The Odroid has some advantages though… The NVMe bus is great, and it has GPIO as well. But that doesn’t really make up for its price.

Once it had started, it actually felt quite fast. However, when I pushed it to its limits, it got very slow. And as mentioned, the Unifi binding failed. This shouldn’t be happening. I have seen the Chromecast binding going offline as well. I guess this is due to some timing issues in the way bindings work in openhab.

Looking at the size of the swap file now, it does seem obvious that the setting of 100M was far from enough. But this is the default setting on the Rpi… I guess no one wants to use a swapfile on a very slow SD card, which is the default for the Rpi. And that’s probably why the default setting is 100M.

Surprise!

Wrong, that’s the only way things can function in complex, open setups like this.
You’re never content, are you?

Sure I am… I just notice things like these, because if/when a binding fails (device goes offline) it could have a fatal impact on the rest of the Home Automation System, all depending on which device/binding is failing.
So it’s important to take notice, to make sure that if/when something goes wrong, it won’t have a fatal impact on something else.

Sorry to resurrect this thread.
I am experiencing the same OH crash issue as in the first post: OH 2.5.2 with Grafana 5.1.4 crashes when rendering static images of 2 graphs on an RPi3B.

What is the conclusion here?
Are there any configuration changes that work?