Unfortunately there is still a slight slope in the used heap, but very small. You don’t see it in the nice graphs from @matt1 when only showing 2 days, but on 7 days it is clear.
Created several heap dumps and used the compare function between two snapshots, but no leak was detected.
Didn’t change anything in the last week, only started creating a group so I can later change the persistence strategy from the default to storing only what is needed.
Next steps are:
If no change, start disabling bindings 1 by 1. But this will take some time, especially due to the very small slope.
This morning I enabled monitoring of the number of threads through the Systeminfo binding; this is the result so far.
Given how small the slope on the heap is, perhaps the elapsed time is too short to say anything about this?
Below the graphs of 3 days ago. Guess one could slowly start seeing a slight incline in number of threads.
No time to work on openHAB this coming weekend, so I will keep the system running without making any changes.
Actually it is not too bad to do. Of course nice is different, but with the great help here and the really helpful feature of showing the heap in a graph, it is pretty quick to determine if there is a change.
After about 12h you can already see the direction a bit, and after ~24h you know pretty surely if there is a change. And as you can see from the small slope, I have more than 2 weeks before the free space is gone.
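For anyone who wants to put a number on the slope instead of reading it off the graph, here is a minimal sketch in Python. It assumes you have exported the used-heap values as (hours, MB) pairs; the sample values and the 400 MB maximum heap below are made up for illustration and are not my real data. It fits a straight line through the samples and estimates how long it takes until the heap is full.

```python
def heap_slope(samples):
    """Least-squares slope of (hours, used_mb) samples, in MB per hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_h = sum(h for _, h in samples) / n
    num = sum((t - mean_t) * (h - mean_h) for t, h in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("all samples have the same timestamp")
    return num / den


def hours_until_full(samples, max_heap_mb):
    """Hours until the fitted trend reaches max_heap_mb, or None if not growing."""
    slope = heap_slope(samples)
    if slope <= 0:
        return None
    latest_used = samples[-1][1]
    return (max_heap_mb - latest_used) / slope


# Hypothetical samples: (hours since start, used heap in MB)
example = [(0, 200.0), (12, 206.0), (24, 212.0)]
slope = heap_slope(example)
remaining = hours_until_full(example, max_heap_mb=400.0)
print(f"slope: {slope:.2f} MB/h, full in about {remaining / 24:.1f} days")
```

Keep in mind that the real used heap goes up and down with every garbage collection, so a fit over only a few hours mostly measures that noise. This is why at least 12 to 24 hours of data is needed before the number means anything.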
All bindings have been checked except systeminfo. No change in the inclining slope.
Will try once removing all bindings after the weekend, but this is quite strange…
Can it be that the issue is connected to using the API connected to Nodered, or something else?
I only have 1 rule forwarding the notification; I can also try to disable that one.
Then I am out of ideas, so if somebody has additional things to look at, that would be helpful.
Again an update on the current status. Tried now to delete all bindings and see the result.
Below two screenshots with the outcome: from 16/11 onward the system ran with all bindings removed, after a restart. 14-16/11 was with all bindings on.
Conclusion, there is a slight difference but not a lot. Will activate the bindings again and continue the search in other directions. Maybe the persistence services or NodeRed is causing issues.
Small update from my side, also to help others. After having removed all bindings, removed persistence and disabled all things, I scaled back the search and increased the heap size. Didn’t have much time and was out of ideas.
But I didn’t stop thinking, and when new ideas popped up I tried them. After updates I also kept an eye on the heap level to see if it improved.
Now I’ve found something concrete: apparently I still had 1 .items file which contained a link to a non-existing thing. It didn’t cause any errors, but it did increase the heap growth!
You can see it in the graph below; the improvement is at #1.
But then I also found the scene control in the marketplace and decided to install it. When I wasn’t succeeding at first, I also installed the JS scripting extension. This event is #2 in the graph.
After seeing the heap slope increase again, I removed only the JS scripting (#3) (it actually wasn’t necessary to have it installed at all) and it seems to stabilize a bit again.
Will wait for 1 or 2 days; if stable, I will install it again to see if it is reproducible.
After some more days letting the system run without further changes it seems stable.
Thanks all (special thanks to @matt1, @rossko57, @wborn, @Andrew_Rowe) for the good assistance and for the work on the new used heap channel of the Systeminfo binding, which gives a quick insight.
List of changes made which might or might not have helped; for sure some made the heap leak less steep:
Removed dead nodes from the z-wave system
Reduced persistence level to reduce workload
Reduced the number of unused items/channels, especially items that were updated every second, to reduce workload
Improved modbus following the instructions of Rossko57 (link)
Finally, what I think is the most important change:
Removed 1 ‘dead’ link in an old .items file. I still had one .items file and had completely forgotten about it.
Also took the opportunity to remove one .rules file and move its content to the UI, so I don’t make the same mistake again.
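For others who want to check their own .items files for such dead links, here is a rough sketch in Python. It relies on a channel UID being the thing UID plus ":" plus the channel id, so it cuts off the last segment and looks the rest up in a set of thing UIDs. Everything in the example (item names, channel UIDs, the set of known things) is hypothetical; you would have to fill the set yourself, for example from the list of things shown in the UI. It only looks at `channel="..."` entries on a single line and is not a full parser for the items syntax.

```python
import re

# Matches channel="binding:type:thing:channel" inside an item definition.
CHANNEL_RE = re.compile(r'channel\s*=\s*"([^"]+)"')


def thing_uid_of(channel_uid):
    """Strip the last segment (the channel id) to get the thing UID."""
    return channel_uid.rsplit(":", 1)[0]


def find_dead_links(items_text, known_thing_uids):
    """Return (line number, channel UID) pairs whose thing is not known."""
    dead = []
    for lineno, line in enumerate(items_text.splitlines(), start=1):
        if line.lstrip().startswith("//"):
            continue  # skip commented-out item definitions
        for channel_uid in CHANNEL_RE.findall(line):
            if thing_uid_of(channel_uid) not in known_thing_uids:
                dead.append((lineno, channel_uid))
    return dead


# Hypothetical .items content and thing list, for illustration only.
example_items = "\n".join([
    'Switch Hall_Light "Hall light" { channel="zwave:device:ctrl:node5:switch_binary" }',
    'Number Old_Sensor "Old sensor" { channel="modbus:data:gone:number" }',
])
example_things = {"zwave:device:ctrl:node5"}

for lineno, channel_uid in find_dead_links(example_items, example_things):
    print(f"line {lineno}: no thing found for {channel_uid}")
```

In the example only the second line is reported, because its thing is not in the set.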
Saw the JS scripting was already noticed in other threads so will leave that discussion there.
Please do that, as we all want any bugs to get fixed so that other people don’t go through what you have just done. Please narrow down which item it is and post the line here or a link to the GitHub issue. Also, from your last graph it cannot be from 3.2 Stable; I suggest you back up and then upgrade when you have time to fault find any issues.
Will continue the search, no problem. I have placed the file back, so I should see it within a day or two.
According to openHAB itself I am on 3.2 stable, and I also have the new features from 3.2 (e.g. marketplace).
Why do you think I am not?
I didn’t update the naming of the item when updating from your custom build jar though, if that’s your reference now.
Edit: I actually didn’t think this would be a bug, just me making an error (if this is really reproducible).
It is a bug if Openhab can not handle the user error gracefully. Not acceptable for a program to run out of memory and crash so if you can narrow it down so others can reproduce the fault, then someone can take a look to fix it.
I was wrong then about you being on an old build; it must have been a major change in the core that caused the sudden changes in memory, and not the heap resizing showing in the graph. Ignore my comments on that.
Ok, well, maybe the problem is not solved…
After placing the file back I did manage to get the heap to increase, but after removing it again the increase doesn’t stop anymore.
This might mean the root cause wasn’t the invalid channel, but something else. Perhaps I was too quick to conclude.
I still have an RPI3B doing nothing; I will install a fresh openhabian image on it and import my configuration there. This is to exclude that it is caused by any configuration issue/bug. I think I can exclude the bindings, since I’ve removed them all 1 by 1 and also disabled all things.
Maybe my system is damaged ‘somewhere’ in a way that isn’t overwritten by updating or switching from milestone to stable.
That is possible, as corruption can do anything. Make sure you have a UPS to prevent unplanned power resets while the system is in the middle of writing to the SD card. Also, openhabian has zram features that will help; make sure they are set up.