java.io.IOException: Too many open files

Yes, that one works; it has 2751 connections ‘open’.

  1. Can you enable some debug logging (see the example commands after this list):

http://docs.openhab.org/administration/logging.html

  2. Can you check in the Chrome developer tools whether the browser really fired that many requests? The Network tab is especially relevant; just check what requests are being fired at the server. Most likely the requests fail, are somehow retried, and fail again; at least that is what I hope :slight_smile:
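For reference, enabling debug logging for the UI bundle could look like this in the openHAB Karaf console (a sketch assuming a standard openHAB 2 installation; the package name is taken from the ChartServlet log lines further below):

ssh -p 8101 openhab@localhost             # open the Karaf console (default password: habopen)
log:set DEBUG org.eclipse.smarthome.ui    # raise the log level for the UI bundle
log:tail                                  # follow the log output live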

Maybe a Wireshark capture of the traffic between the HTTP client and the host?


Great

And if you also make a thread dump, that might give some more details. If you are in time I might be able to take a look at them on the train back home :slight_smile:
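If it helps, a thread dump of the openHAB Java process can be taken with the JDK’s jstack tool, roughly like this (a sketch; the pgrep pattern and output path are just examples):

PID=$(pgrep -f openhab)                      # find the openHAB Java process
jstack "$PID" > /tmp/openhab-threaddump.txt  # write all thread stack traces to a file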

OK, here they are.
Wireshark captures from two systems: Linux (Kubuntu 16.04, 32-bit) and Windows 7 Professional (64-bit)

http://kulturnetz-drensteinfurt.de/Medien/firefox_linux.pcap
http://kulturnetz-drensteinfurt.de/Medien/firefox_windows.pcap

Please apply this as a display filter in Wireshark:

ip.dst == 192.168.0.100 && http
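For a quick check without the GUI, the same display filter can be applied with tshark, if it is installed, to count the matching requests in each capture (a sketch using the capture files linked above):

tshark -r firefox_windows.pcap -Y 'ip.dst == 192.168.0.100 && http' | wc -l   # number of HTTP requests to the host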

You will see immediately:

  • Under Linux: the refresh works perfectly; the complete data of the page is requested each time the refresh button is pressed.
  • Under Windows: after the first refresh the browser generates tons of chart requests and destroys the service.

Under Linux: the refresh works perfectly; the complete data of the page is requested each time the refresh button is pressed.

Does it render graphs correctly or are there no (correct) graphs? Because the logging shows:

2017-10-09 11:02:00.113 [WARN ] [thome.ui.internal.chart.ChartServlet] - Illegal argument in chart: No Persistence service found.
2017-10-09 11:02:00.086 [WARN ] [thome.ui.internal.chart.ChartServlet] - Illegal argument in chart: No Persistence service found.
2017-10-09 11:02:00.073 [WARN ] [thome.ui.internal.chart.ChartServlet] - Illegal argument in chart: No Persistence service found.

From my point of view this is a follow-on error.

The first refresh under Windows crashes the service. No chart is refreshed.

Before I refresh, the view in Firefox / Chrome updates according to the configuration of the openHAB service.

Can you try to resolve ‘No Persistence service found.’, for example by installing the RRD persistence add-on?
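On openHAB 2 that could be done from the Karaf console, for example (a sketch; assumes the standard feature name for the rrd4j add-on):

feature:install openhab-persistence-rrd4j   # install the rrd4j persistence service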

If that resolves it, I might be able to set up a server at home which can reproduce this.

The Zero W is a little bit slower to answer, but it works anyway:

pi@MeteringPiZeroW:~/bin $ sudo  ./startPerformanceMonitoring.sh
15:10:51 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu  588714 0 54926 1229134 17284 0 10779 0 0 0
15:10:01 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu  588976 0 54966 1229904 17288 0 10780 0 0 0
15:10:12 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu  589324 0 55004 1230559 17291 0 10782 0 0 0
15:10:23 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu  589352 0 55036 1231538 17291 0 10783 0 0 0
15:10:33 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu  589987 0 55080 1231914 17301 0 10788 0 0 0
15:10:44 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu  590166 0 55113 1232742 17303 0 10790 0 0 0
15:10:54 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu  590229 0 55145 1233686 17304 0 10790 0 0 0
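The monitoring script itself was not posted; a minimal sketch that would produce this kind of output could look like the following (counting CLOSE_WAIT sockets via netstat is an assumption):

#!/bin/bash
# every ~10 seconds print the time, the number of TCP sockets stuck in
# CLOSE_WAIT, and the aggregate CPU counters from /proc/stat
while true; do
    CW=$(netstat -tan | grep -c CLOSE_WAIT)
    echo "$(date +%H:%M:%S) --- CLOSE_WAIT: $CW --- CPU_STAT: $(head -n1 /proc/stat)"
    sleep 10
done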

This service is already installed :wink:
That’s where the chart data is written.

Then I fear that you are demanding too much of the poor Pi: every update of a data item will lead to a refresh of the chart, and it seems that the Pi just cannot keep up the pace. I see 18 requests in the first second, and it only seems to be increasing. Maybe you have to render / refresh your charts in a different way.

If you close the browser, does it then normalize after some time? Will the connections be closed?
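One way to watch this (a sketch; assumes the Linux net-tools netstat column layout) is to group the CLOSE_WAIT sockets by remote address and see whether the counts drop after the browser is closed:

netstat -tan | awk '$6 == "CLOSE_WAIT" {print $5}' | cut -d: -f1 | sort | uniq -c   # stuck connections per client IP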

It is obviously linked to the use of a Windows browser like Firefox or Chrome. As long as I use a browser on a Linux system, everything seems to be OK. When I start the view from Windows, the number of hanging connections increases immediately due to the high number of requests from the Windows browser:

16:10:09 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu  2494333 0 93288 3263097 24973 0 11154 0 0 0
16:10:20 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu  2496464 0 93338 3264981 24977 0 11164 0 0 0
16:10:30 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu  2497074 0 93364 3268464 24984 0 11168 0 0 0
16:10:40 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu  2497183 0 93382 3272438 24994 0 11171 0 0 0
16:10:51 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu  2499070 0 93427 3274581 25004 0 11181 0 0 0
16:10:01 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu  2501079 0 93484 3276583 25021 0 11186 0 0 0
16:10:11 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu  2503287 0 93564 3278361 25031 0 11195 0 0 0
16:10:22 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu  2504508 0 93602 3281214 25041 0 11204 0 0 0
16:10:32 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu  2506503 0 93652 3283259 25051 0 11214 0 0 0
16:10:42 --- CLOSE_WAIT: 98 --- CPU_STAT: cpu  2509080 0 93745 3284745 25054 0 11234 0 0 0
16:10:53 --- CLOSE_WAIT: 359 --- CPU_STAT: cpu  2512920 0 93894 3284936 25054 0 11264 0 0 0
16:10:03 --- CLOSE_WAIT: 603 --- CPU_STAT: cpu  2516567 0 94047 3285297 25054 0 11288 0 0 0
16:10:14 --- CLOSE_WAIT: 876 --- CPU_STAT: cpu  2520228 0 94183 3285696 25054 0 11309 0 0 0
16:10:24 --- CLOSE_WAIT: 1042 --- CPU_STAT: cpu  2523920 0 94291 3286108 25054 0 11335 0 0 0
16:10:35 --- CLOSE_WAIT: 990 --- CPU_STAT: cpu  2529506 0 94405 3286617 25065 0 11354 0 0 0
16:10:51 --- CLOSE_WAIT: 925 --- CPU_STAT: cpu  2533408 0 94501 3286850 25093 0 11359 0 0 0
16:10:01 --- CLOSE_WAIT: 802 --- CPU_STAT: cpu  2537180 0 94610 3287274 25093 0 11363 0 0 0
16:10:12 --- CLOSE_WAIT: 671 --- CPU_STAT: cpu  2540926 0 94704 3287656 25103 0 11366 0 0 0
16:10:23 --- CLOSE_WAIT: 537 --- CPU_STAT: cpu  2544659 0 94771 3287933 25261 0 11368 0 0 0
16:10:33 --- CLOSE_WAIT: 415 --- CPU_STAT: cpu  2548419 0 94837 3288345 25261 0 11370 0 0 0
16:10:44 --- CLOSE_WAIT: 289 --- CPU_STAT: cpu  2552204 0 94902 3288770 25261 0 11370 0 0 0
16:10:55 --- CLOSE_WAIT: 152 --- CPU_STAT: cpu  2555920 0 94978 3289194 25275 0 11375 0 0 0
16:10:05 --- CLOSE_WAIT: 18 --- CPU_STAT: cpu  2559799 0 95082 3289386 25278 0 11387 0 0 0
16:10:16 --- CLOSE_WAIT: 1 --- CPU_STAT: cpu  2561755 0 95126 3290790 25942 0 11394 0 0 0
16:10:26 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu  2562647 0 95164 3293971 25952 0 11401 0 0 0
16:10:36 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu  2562664 0 95185 3298056 25955 0 11402 0 0 0
16:10:47 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu  2562693 0 95200 3302193 25958 0 11402 0 0 0
16:10:57 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu  2562713 0 95221 3306219 26036 0 11403 0 0 0
16:10:08 --- CLOSE_WAIT: 0 --- CPU_STAT: cpu  2562978 0 95252 3310016 26061 0 11403 0 0 0

As soon as I close the Windows browser, the connections are closed and the system comes back to normal operation.
The Linux view is still working.

Maybe this is a topic for a new thread, ‘High number of connections after refresh in Windows web browser’?

Okay, that is great, so the bug is mainly on the browser side. I fear that it is by design to refresh on updated items. I had similar but much smaller problems on my Raspberry Pi with my smart meter: I got so much data that it could not keep up anymore, but it did not come to this breaking point.

What you could maybe do is link statically to the chart, or switch to another charting solution like InfluxDB+Grafana persistence and graphing.

@rlkoshak, do you have any other suggestions?

That is what I would suggest. I think the important part is to link to the prerendered chart, so all the work goes on in the Grafana server and the browser has just one request and one image to download each refresh period.
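As an illustration (host name, dashboard slug and panel id below are made up), Grafana exposes a render endpoint that returns a single PNG, so the browser, or a sitemap Image element, fetches exactly one image per refresh:

curl -o chart.png 'http://grafana-host:3000/render/dashboard-solo/db/my-dashboard?panelId=1&width=1000&height=400'   # one request, one image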

The problem does not occur because of the refresh period. The service breaks down as soon as I manually refresh the view inside Chrome / Firefox. As long as the browser performs the automatic reload, nothing strange happens. Everything seems to work.

Things only change because of manual interaction by the user.

Anyway, tomorrow I will check the proposed combination of DB and chart service on my Raspberry Pis.

This morning I tested the ‘too many client connections’ problem with a 64-bit Windows 8.1 Pro system. I was not able to reproduce it. The Pi 3 serves faster than the Pi Zero W (surprise, surprise), and I got a response on every browser view refresh.

Only the Windows 7 Pro system causes the trouble with the connections.

Interesting but bizarre…

I have a similar issue, but I don’t have graphs. I am running openHAB 2.3 stable on Synology with the latest Java release; it does this within 24 hours of a clean start (cache, tmp and logs all cleared out).

2018-07-07 07:08:18.378 [ERROR] [me.storage.json.internal.JsonStorage] - Error writing JsonDB to /volume1/@appstore/openHAB/userdata/jsondb/org.eclipse.smarthome.core.thing.Thing.json. Cause /volume1/@appstore/openHAB/userdata/jsondb/org.eclipse.smarthome.core.thing.Thing.json (Too many open files)

Best, Jay
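For a quick check of how close the process is to its file-descriptor limit, the open descriptors can be compared against the per-process limit (a sketch; the pgrep pattern is an assumption):

PID=$(pgrep -f openhab)
ls /proc/$PID/fd | wc -l              # file descriptors currently open
grep 'open files' /proc/$PID/limits   # the per-process limit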

I found that I had exposed over 400 items using the “openHAB Cloud” binding; once I removed all those exposed items, my issue went away.

I’m guessing there is a bug when so many items are exposed?

Best, Jay

Very interesting. I would recommend filing an issue against the openHAB Cloud Connector Add-on here.