Here are the requested outputs:
http://kulturnetz-drensteinfurt.de/Medien/lsof_n.txt
http://kulturnetz-drensteinfurt.de/Medien/lsof_p.txt
I’m having a hard time opening them; they also look really small. Can you verify that the process uses only 10 MB?
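For reference, a quick way to check the resident memory of the process looks something like this (a minimal sketch; the pgrep pattern ‘openhab’ is an assumption about how the process shows up in the process list):

# Hypothetical check of the openHAB Java process; RSS/VSZ are reported in KiB.
PID=$(pgrep -f openhab | head -n 1)
ps -o pid,user,rss,vsz,cmd -p "$PID"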
I got an exception while generating the heap dump:
Dumping heap to heap_03.bin ...
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at sun.tools.jmap.JMap.runTool(JMap.java:201)
at sun.tools.jmap.JMap.main(JMap.java:130)
Caused by: java.lang.InternalError: Metadata does not appear to be polymorphic
at sun.jvm.hotspot.types.basic.BasicTypeDataBase.findDynamicTypeForAddress(BasicTypeDataBase.java:278)
at sun.jvm.hotspot.runtime.VirtualBaseConstructor.instantiateWrapperFor(VirtualBaseConstructor.java:102)
at sun.jvm.hotspot.oops.Metadata.instantiateWrapperFor(Metadata.java:68)
at sun.jvm.hotspot.memory.DictionaryEntry.klass(DictionaryEntry.java:71)
at sun.jvm.hotspot.memory.Dictionary.classesDo(Dictionary.java:66)
at sun.jvm.hotspot.memory.SystemDictionary.classesDo(SystemDictionary.java:190)
at sun.jvm.hotspot.memory.SystemDictionary.allClassesDo(SystemDictionary.java:183)
at sun.jvm.hotspot.utilities.HeapHprofBinWriter.writeClasses(HeapHprofBinWriter.java:954)
at sun.jvm.hotspot.utilities.HeapHprofBinWriter.write(HeapHprofBinWriter.java:427)
at sun.jvm.hotspot.tools.HeapDumper.run(HeapDumper.java:62)
at sun.jvm.hotspot.tools.Tool.startInternal(Tool.java:260)
at sun.jvm.hotspot.tools.Tool.start(Tool.java:223)
at sun.jvm.hotspot.tools.Tool.execute(Tool.java:118)
at sun.jvm.hotspot.tools.HeapDumper.main(HeapDumper.java:83)
First result on Google:
Check the user that runs openHAB vs. the user you are trying to create the dump with.
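Following that hint, something along these lines should work (a sketch: the service user name ‘openhab’ and the pgrep pattern are assumptions, and <pid> is a placeholder):

# Check which user owns the openHAB process:
ps -o user= -p "$(pgrep -f openhab | head -n 1)"
# Then take the heap dump as that same user:
sudo -u openhab jmap -dump:live,format=b,file=/tmp/heap_03.bin <pid>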
Here is a snapshot of what happened during a ‘Chrome refresh’ on the Pi 3. The system is losing control of processes and I/O:
pi@MeteringPiRed:~/bin $ sudo ./startPerformanceMonitoring.sh
13:10:27 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 917767 0 38124 899040 13354 0 3210 0 0 0
13:10:37 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 917787 0 38139 903092 13356 0 3210 0 0 0
13:10:47 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 917979 0 38167 906956 13357 0 3210 0 0 0
13:10:58 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 917995 0 38181 911011 13357 0 3210 0 0 0
13:10:08 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 918351 0 38229 914761 13364 0 3215 0 0 0
13:10:18 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 918362 0 38257 918942 13364 0 3215 0 0 0
13:10:29 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 918405 0 38294 922963 13365 0 3217 0 0 0
13:10:39 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 918428 0 38323 927005 13367 0 3217 0 0 0
13:10:50 --- CLOSE_WAIT: 154 --- CPU_STAT: cpu 921434 0 38434 928022 13367 0 3226 0 0 0
13:10:00 --- CLOSE_WAIT: 316 --- CPU_STAT: cpu 924840 0 38534 928621 13367 0 3244 0 0 0
13:10:10 --- CLOSE_WAIT: 511 --- CPU_STAT: cpu 928567 0 38665 928999 13367 0 3264 0 0 0
13:10:21 --- CLOSE_WAIT: 657 --- CPU_STAT: cpu 932107 0 38779 929488 13367 0 3280 0 0 0
13:10:31 --- CLOSE_WAIT: 869 --- CPU_STAT: cpu 935709 0 38881 930004 13367 0 3300 0 0 0
13:10:42 --- CLOSE_WAIT: 934 --- CPU_STAT: cpu 939342 0 38980 930467 13367 0 3319 0 0 0
13:10:53 --- CLOSE_WAIT: 826 --- CPU_STAT: cpu 943116 0 39041 930885 13367 0 3320 0 0 0
13:10:03 --- CLOSE_WAIT: 721 --- CPU_STAT: cpu 946881 0 39144 931312 13367 0 3323 0 0 0
13:10:14 --- CLOSE_WAIT: 605 --- CPU_STAT: cpu 950653 0 39221 931733 13367 0 3325 0 0 0
13:10:25 --- CLOSE_WAIT: 493 --- CPU_STAT: cpu 954365 0 39293 932175 13368 0 3328 0 0 0
13:10:35 --- CLOSE_WAIT: 386 --- CPU_STAT: cpu 958126 0 39357 932601 13368 0 3330 0 0 0
13:10:46 --- CLOSE_WAIT: 283 --- CPU_STAT: cpu 961826 0 39418 933061 13368 0 3335 0 0 0
13:10:56 --- CLOSE_WAIT: 172 --- CPU_STAT: cpu 965586 0 39467 933498 13368 0 3336 0 0 0
13:10:07 --- CLOSE_WAIT: 79 --- CPU_STAT: cpu 969328 0 39549 933872 13368 0 3340 0 0 0
13:10:18 --- CLOSE_WAIT: 2 --- CPU_STAT: cpu 973039 0 39619 934281 13368 0 3343 0 0 0
13:10:28 --- CLOSE_WAIT: 206 --- CPU_STAT: cpu 976940 0 39720 934473 13368 0 3370 0 0 0
13:10:39 --- CLOSE_WAIT: 376 --- CPU_STAT: cpu 980638 0 39801 934909 13368 0 3385 0 0 0
13:10:49 --- CLOSE_WAIT: 594 --- CPU_STAT: cpu 984336 0 39899 935341 13368 0 3410 0 0 0
13:10:00 --- CLOSE_WAIT: 811 --- CPU_STAT: cpu 987988 0 40007 935814 13368 0 3431 0 0 0
13:10:11 --- CLOSE_WAIT: 1031 --- CPU_STAT: cpu 991954 0 40148 935983 13368 0 3457 0 0 0
13:10:21 --- CLOSE_WAIT: 1232 --- CPU_STAT: cpu 997565 0 40283 936520 13368 0 3493 0 0 0
13:10:37 --- CLOSE_WAIT: 1556 --- CPU_STAT: cpu 1001307 0 40402 936993 13368 0 3525 0 0 0
13:10:48 --- CLOSE_WAIT: 1769 --- CPU_STAT: cpu 1004952 0 40574 937417 13368 0 3574 0 0 0
13:10:59 --- CLOSE_WAIT: 1980 --- CPU_STAT: cpu 1008984 0 40739 937594 13369 0 3609 0 0 0
13:10:10 --- CLOSE_WAIT: 2198 --- CPU_STAT: cpu 1012765 0 40902 937995 13369 0 3644 0 0 0
13:10:21 --- CLOSE_WAIT: 2428 --- CPU_STAT: cpu 1016923 0 41053 938171 13369 0 3665 0 0 0
13:10:32 --- CLOSE_WAIT: 2662 --- CPU_STAT: cpu 1020760 0 41213 938628 13370 0 3686 0 0 0
13:10:44 --- CLOSE_WAIT: 2812 --- CPU_STAT: cpu 1024514 0 41353 939093 13370 0 3707 0 0 0
13:10:54 --- CLOSE_WAIT: 2703 --- CPU_STAT: cpu 1028728 0 41490 939249 13370 0 3717 0 0 0
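For context, a minimal sketch of what a monitoring loop like startPerformanceMonitoring.sh could look like (the original script was not shared, so this is an assumption): every ~10 s it prints a timestamp, the number of sockets in CLOSE_WAIT, and the aggregate CPU line from /proc/stat. Incidentally, the constant ‘:10:’ minute field in the output above suggests the real script may use %m (month; 10 in October) where %M (minute) was intended.

#!/bin/bash
# Sketch of the assumed monitoring loop, not the original script.
while true; do
    TS=$(date +%H:%M:%S)                                # %M = minute; %m would print the month
    CW=$(netstat -ant 2>/dev/null | grep -c CLOSE_WAIT) # sockets stuck in CLOSE_WAIT
    CPU=$(grep '^cpu ' /proc/stat)                      # aggregate CPU jiffies
    echo "$TS --- CLOSE_WAIT: $CW --- CPU_STAT: $CPU"
    sleep 10
done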
Were you able to create the heap dump using the user under which openHAB is running?
Here is the ‘complete’ dump
Yes, that one works. It has 2751 connections ‘open’.
http://docs.openhab.org/administration/logging.html
Maybe a Wireshark capture of the traffic between the HTTP client and the host?
Great
And if you also make a thread dump, that might give some more details. If you are in time, I might be able to take a look at them on the train back home.
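For the thread dump, either of these should do (assuming the JVM runs as user ‘openhab’; <pid> is a placeholder):

# Write a thread dump to a file:
sudo -u openhab jstack -l <pid> > threads.txt
# Or ask the JVM to print one to its own stdout/log:
sudo kill -3 <pid>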
OK here they are.
Wireshark captures from two systems: Linux Kubuntu 16.04 32-bit and Windows 7 Professional 64-bit.
http://kulturnetz-drensteinfurt.de/Medien/firefox_linux.pcap
http://kulturnetz-drensteinfurt.de/Medien/firefox_windows.pcap
Please apply this as a filter in Wireshark:
ip.dst == 192.168.0.100 && http
You will see immediately:
Under Linux, the refresh works perfectly: the complete data of the page is requested each time the refresh button is pressed.
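The same filter can also be applied on the command line with tshark (part of the Wireshark suite), using the capture files linked above:

# Show only HTTP traffic towards the openHAB host in the Windows capture:
tshark -r firefox_windows.pcap -Y 'ip.dst == 192.168.0.100 && http'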
Does it render graphs correctly or are there no (correct) graphs? Because the logging shows:
2017-10-09 11:02:00.113 [WARN ] [thome.ui.internal.chart.ChartServlet] - Illegal argument in chart: No Persistence service found.
2017-10-09 11:02:00.086 [WARN ] [thome.ui.internal.chart.ChartServlet] - Illegal argument in chart: No Persistence service found.
2017-10-09 11:02:00.073 [WARN ] [thome.ui.internal.chart.ChartServlet] - Illegal argument in chart: No Persistence service found.
From my point of view, this is a follow-on error.
The first refresh under Windows crashes the service. No chart is refreshed.
Before I refresh, the view in Firefox/Chrome updates according to the configuration of the openHAB service.
Can you try to resolve ‘No Persistence service found’, for example by installing the RRD persistence add-on?
If that resolves it, I might be able to set up a server at home which can reproduce this.
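For reference, on openHAB 2 the add-on can be enabled by listing it in the addons.cfg service config (the path below assumes an apt/openHABian install; adjust as needed):

# /etc/openhab2/services/addons.cfg
# Listing rrd4j here makes openHAB install the RRD persistence add-on:
persistence = rrd4j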
The Zero W is a little bit slower in answering, but it works anyway:
pi@MeteringPiZeroW:~/bin $ sudo ./startPerformanceMonitoring.sh
15:10:51 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 588714 0 54926 1229134 17284 0 10779 0 0 0
15:10:01 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu 588976 0 54966 1229904 17288 0 10780 0 0 0
15:10:12 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu 589324 0 55004 1230559 17291 0 10782 0 0 0
15:10:23 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu 589352 0 55036 1231538 17291 0 10783 0 0 0
15:10:33 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 589987 0 55080 1231914 17301 0 10788 0 0 0
15:10:44 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 590166 0 55113 1232742 17303 0 10790 0 0 0
15:10:54 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 590229 0 55145 1233686 17304 0 10790 0 0 0
This service is already installed.
That’s where the chart data is written.
Then I fear that you are demanding too much of the poor Pis: every update of a data item leads to a refresh of the chart, and it seems that it just cannot keep up the pace. I see 18 requests in the first second, and it only seems to be increasing. Maybe you have to render/refresh your charts in a different way.
If you close the browser, does it then normalize after some time? Will the connections be closed?
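A simple way to watch whether they drain is to count them periodically (ss is the modern netstat replacement; the 10 s interval mirrors the monitoring script):

# Count TCP sockets in CLOSE_WAIT every 10 seconds (tail skips the header line):
watch -n 10 'ss -t state close-wait | tail -n +2 | wc -l'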
It is obviously linked to the use of a Windows browser like Firefox or Chrome. As long as I use a browser on a Linux system, it seems to be OK. When I start the view from Windows, the number of hanging connections increases immediately due to the high number of requests from the Windows browser:
16:10:09 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 2494333 0 93288 3263097 24973 0 11154 0 0 0
16:10:20 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 2496464 0 93338 3264981 24977 0 11164 0 0 0
16:10:30 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu 2497074 0 93364 3268464 24984 0 11168 0 0 0
16:10:40 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu 2497183 0 93382 3272438 24994 0 11171 0 0 0
16:10:51 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu 2499070 0 93427 3274581 25004 0 11181 0 0 0
16:10:01 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu 2501079 0 93484 3276583 25021 0 11186 0 0 0
16:10:11 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 2503287 0 93564 3278361 25031 0 11195 0 0 0
16:10:22 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu 2504508 0 93602 3281214 25041 0 11204 0 0 0
16:10:32 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu 2506503 0 93652 3283259 25051 0 11214 0 0 0
16:10:42 --- CLOSE_WAIT: 98 --- CPU_STAT: cpu 2509080 0 93745 3284745 25054 0 11234 0 0 0
16:10:53 --- CLOSE_WAIT: 359 --- CPU_STAT: cpu 2512920 0 93894 3284936 25054 0 11264 0 0 0
16:10:03 --- CLOSE_WAIT: 603 --- CPU_STAT: cpu 2516567 0 94047 3285297 25054 0 11288 0 0 0
16:10:14 --- CLOSE_WAIT: 876 --- CPU_STAT: cpu 2520228 0 94183 3285696 25054 0 11309 0 0 0
16:10:24 --- CLOSE_WAIT: 1042 --- CPU_STAT: cpu 2523920 0 94291 3286108 25054 0 11335 0 0 0
16:10:35 --- CLOSE_WAIT: 990 --- CPU_STAT: cpu 2529506 0 94405 3286617 25065 0 11354 0 0 0
16:10:51 --- CLOSE_WAIT: 925 --- CPU_STAT: cpu 2533408 0 94501 3286850 25093 0 11359 0 0 0
16:10:01 --- CLOSE_WAIT: 802 --- CPU_STAT: cpu 2537180 0 94610 3287274 25093 0 11363 0 0 0
16:10:12 --- CLOSE_WAIT: 671 --- CPU_STAT: cpu 2540926 0 94704 3287656 25103 0 11366 0 0 0
16:10:23 --- CLOSE_WAIT: 537 --- CPU_STAT: cpu 2544659 0 94771 3287933 25261 0 11368 0 0 0
16:10:33 --- CLOSE_WAIT: 415 --- CPU_STAT: cpu 2548419 0 94837 3288345 25261 0 11370 0 0 0
16:10:44 --- CLOSE_WAIT: 289 --- CPU_STAT: cpu 2552204 0 94902 3288770 25261 0 11370 0 0 0
16:10:55 --- CLOSE_WAIT: 152 --- CPU_STAT: cpu 2555920 0 94978 3289194 25275 0 11375 0 0 0
16:10:05 --- CLOSE_WAIT: 18 --- CPU_STAT: cpu 2559799 0 95082 3289386 25278 0 11387 0 0 0
16:10:16 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu 2561755 0 95126 3290790 25942 0 11394 0 0 0
16:10:26 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 2562647 0 95164 3293971 25952 0 11401 0 0 0
16:10:36 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 2562664 0 95185 3298056 25955 0 11402 0 0 0
16:10:47 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 2562693 0 95200 3302193 25958 0 11402 0 0 0
16:10:57 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 2562713 0 95221 3306219 26036 0 11403 0 0 0
16:10:08 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu 2562978 0 95252 3310016 26061 0 11403 0 0 0
As soon as I close the Windows browser, the connections are closed and the system comes back to normal operation.
The Linux view is still working.
Maybe this is a topic for a new thread, ‘high number of connections after refresh in Windows web browser’?
Okay, that is great; so the bug is mainly on the browser side. I fear that it is by design to refresh on updated items. I had similar but much smaller problems on my Raspberry Pi with my smart meter: I got so much data that it could not keep up anymore, but it did not come to this breaking point.
What you could maybe do is link statically to the chart, or switch to another charting solution like InfluxDB+Grafana for persistence and graphing.
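As a sketch of the static-link idea: a sitemap can point an Image element at the chart servlet with a fixed refresh interval, so the chart is re-rendered on a timer instead of on every item update (the item name ‘Power’ and the 10 s interval are illustrative assumptions; the host is the one from the Wireshark filter above):

// Example sitemap entry; refresh is given in milliseconds
Image url="http://192.168.0.100:8080/chart?items=Power&period=D" refresh=10000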
@rlkoshak, do you have any other suggestions?