Help understanding quartz crashing - "Worker thread was interrupt()'ed."

Folks,

I had another thread around about my cron problems but nothing suggested in there helped.

I’ve dug deeper and have found more clues as to my problem but I am now lost… Any thoughts or help appreciated as this is now causing my “smart home” to be a “dumb home” :smiley:

OH 2.4 on RPi 3B+ with nothing else running on it.
Symptoms - cron works for a few hours and then just stops working.
I’ve removed all long-running (HTTP and script task) cron rules and use a proxy item instead which cron triggers.
I’ve only got about 10 cron rules, one of which runs every minute (it’s a counter)
I’ve removed all “unnecessary” cron jobs for the sake of debugging.
To fix the problem, a restart is the only option.

The “culprit” rule

rule "Calc Last Motion ago"
when     Time cron "0 0/1 * 1/1 * ? *" //every minute
then


    g_MotionMinutesAgo?.allMembers.forEach[item |
        if (item === null){
            logWarn("Calc Last Motion ago", "Item item was null. Very odd")
            return;
        }
        if (item.state == NULL){
            item.sendCommand(0)
            logWarn("Calc Last Motion ago", "Item " + item.name + " was null. Setting to zero")
        }
        else {
            var num = (item.state as Number)
            if (num === null){
                logWarn("Calc Last Motion ago", item.name  + " num was null. Very odd")
                num = 0;
            }
            item.sendCommand(num + 1) 
            //logWarn("Calc Last Motion ago", "Setting to " + (num + 1).toString)
        }
    ]

end 

The Items:

Group:Switch        g_Motion                    "House motion group"
Group:Number:MIN    g_MotionMinutesAgo


Number MotionMinutesAgo (g_MotionMinutesAgo) ///try not use this.
Number MotionMinutesAgo_Proxy (g_MotionMinutesAgo)
Number MotionTest

Switch      Hallway_MotionSensor                                                    <motion>        (g_persist_change, g_Motion)    { channel="mihome:sensor_motion_aq2:158d0001e5266c:motion" }
Number      Hallway_MotionSensor_MotionMinutesAgo "Last Motion detected"                            (g_MotionMinutesAgo)

Switch      Frontroom_MotionSensor                                                    <motion>    (g_persist_change, g_Motion)    { channel="mihome:sensor_motion_aq2:158d00029a5baf:motion" }
Number      Frontroom_MotionSensor_MotionMinutesAgo "Last Motion detected"                            (g_MotionMinutesAgo)
//there are 4 or 5 of these items in the group. 

Take a look at how out of order the log entries are. They’re all over the place.

My theory is this that when someone moves, the motion triggers the rule which resets the counter back to zero however, at the same time, the cron kicks in (or the other way around) and odd things happen. A deadlock of sorts internally trying to update the item and read them?
Whatever is happening, I get the interrupt()'ed exception and null.

This used to happen within a few hours. I changed the quartz thread count to be 5 and, while it still happens, it takes longer before cron stops working all together.

snipped log to see pattern.

1	2019-01-05 10:48:14.674 [INFO ] [home.model.script.Motion - resetting] - Motion from - Kitchen_MotionSensor - resetting to zero	
2	2019-01-05 10:48:12.596 [WARN ] [me.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.eclipse.smarthome.io.monitor.internal.EventLogger@1e2e736' takes more than 5000ms.	
3	2019-01-05 10:47:37.641 [INFO ] [home.model.script.Motion - resetting] - Motion from - Hallway_MotionSensor - resetting to zero	
4	2019-01-05 10:47:27.061 [WARN ] [me.model.script.Motion - Auto Lights] - Bathroom_MotionSensor caused auto-lights - state:OFF	
5	2019-01-05 10:47:17.012 [WARN ] [l.script.controls.tablet.state.rules] - Brighten Screen - Setting brightness to 100.	
6	2019-01-05 10:52:12.890 [WARN ] [me.model.script.Motion - Auto Lights] - Room: Bathroom	
7	2019-01-05 10:49:42.950 [DEBUG] [rg.quartz.core.QuartzSchedulerThread] - batch acquisition of 1 triggers	
8	2019-01-05 10:49:42.949 [ERROR] [org.quartz.simpl.SimpleThreadPool   ] - Worker thread was interrupt()'ed.	
9	java.lang.InterruptedException: null	
10		at java.lang.Object.wait(Native Method) ~[?:?]
11		at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
12	2019-01-05 10:49:12.610 [INFO ] [htwaverf.internal.LightwaveRfBinding] - Update from LightwaveRf ItemName[Power_Main], State[726]	
13	2019-01-05 10:49:12.610 [DEBUG] [org.quartz.core.JobRunShell         ] - Calling execute on job DEFAULT.state.motion.lastmotion.rules#Calc Last Motion ago#0 0/1 * 1/1 * ? *	
14	2019-01-05 10:55:12.954 [INFO ] [htwaverf.internal.LightwaveRfBinding] - Update from LightwaveRf ItemName[Power_Solar_prxy], State[144]	
15	2019-01-05 10:54:42.934 [ERROR] [org.quartz.simpl.SimpleThreadPool   ] - Worker thread was interrupt()'ed.	
16	java.lang.InterruptedException: null	
17		at java.lang.Object.wait(Native Method) ~[?:?]
18		at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
19	2019-01-05 10:53:42.928 [WARN ] [org.apache.karaf.services.eventadmin] - EventAdmin: Event Task Processing Interrupted. Events may not be recieved in proper order.	
20	2019-01-05 10:53:17.891 [WARN ] [me.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.eclipse.smarthome.io.monitor.internal.EventLogger@1e2e736' takes more than 5000ms.	
21	2019-01-05 10:53:13.007 [WARN ] [me.model.script.Motion - Auto Lights] - Turning off	
22	2019-01-05 10:56:09.753 [WARN ] [org.apache.karaf.services.eventadmin] - EventAdmin: Event Task Processing Interrupted. Events may not be recieved in proper order.	
23	2019-01-05 10:55:43.145 [DEBUG] [rg.quartz.core.QuartzSchedulerThread] - batch acquisition of 1 triggers	
24	2019-01-05 10:56:09.760 [INFO ] [htwaverf.internal.LightwaveRfBinding] - Update from LightwaveRf ItemName[Power_Main], State[696]	
25	2019-01-05 10:55:43.135 [ERROR] [org.quartz.simpl.SimpleThreadPool   ] - Worker thread was interrupt()'ed.	
26	java.lang.InterruptedException: null	
27		at java.lang.Object.wait(Native Method) ~[?:?]
28		at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
29	2019-01-05 10:56:09.801 [INFO ] [htwaverf.internal.LightwaveRfBinding] - Update from LightwaveRf ItemName[Power_Solar_prxy], State[149]	
30	2019-01-05 10:56:09.843 [INFO ] [htwaverf.internal.LightwaveRfBinding] - Update from LightwaveRf ItemName[Power_Main], State[701]	

Full log

2019-01-05 10:46:00.005 [DEBUG] [rg.quartz.core.QuartzSchedulerThread] - batch acquisition of 0 triggers
2019-01-05 10:46:00.005 [DEBUG] [org.quartz.core.JobRunShell         ] - Calling execute on job DEFAULT.state.motion.lastmotion.rules#Calc Last Motion ago#0 0/1 * 1/1 * ? *
2019-01-05 10:46:05.426 [INFO ] [htwaverf.internal.LightwaveRfBinding] - Update from LightwaveRf ItemName[Power_Main], State[846]
2019-01-05 10:46:13.955 [INFO ] [htwaverf.internal.LightwaveRfBinding] - Update from LightwaveRf ItemName[Power_Solar_prxy], State[147]
2019-01-05 10:46:20.420 [INFO ] [htwaverf.internal.LightwaveRfBinding] - Update from LightwaveRf ItemName[Power_Main], State[770]
2019-01-05 10:46:23.940 [DEBUG] [rg.quartz.core.QuartzSchedulerThread] - batch acquisition of 0 triggers
2019-01-05 10:46:29.051 [INFO ] [htwaverf.internal.LightwaveRfBinding] - Update from LightwaveRf ItemName[Power_Solar_prxy], State[149]
2019-01-05 10:46:35.417 [INFO ] [htwaverf.internal.LightwaveRfBinding] - Update from LightwaveRf ItemName[Power_Main], State[733]
2019-01-05 10:46:49.459 [INFO ] [home.model.script.Motion - resetting] - Motion from - Kitchen_MotionSensor - resetting to zero
2019-01-05 10:46:47.579 [WARN ] [me.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.eclipse.smarthome.io.monitor.internal.EventLogger@1e2e736' takes more than 5000ms.
2019-01-05 10:46:47.195 [DEBUG] [rg.quartz.core.QuartzSchedulerThread] - batch acquisition of 1 triggers
2019-01-05 10:46:44.141 [INFO ] [htwaverf.internal.LightwaveRfBinding] - Update from LightwaveRf ItemName[Power_Solar_prxy], State[147]
2019-01-05 10:48:42.602 [DEBUG] [rg.quartz.core.QuartzSchedulerThread] - batch acquisition of 1 triggers
2019-01-05 10:48:42.597 [DEBUG] [org.quartz.core.JobRunShell         ] - Calling execute on job DEFAULT.state.motion.lastmotion.rules#Calc Last Motion ago#0 0/1 * 1/1 * ? *
2019-01-05 10:48:29.208 [WARN ] [org.apache.karaf.services.eventadmin] - EventAdmin: Exception during event dispatch [org.osgi.service.event.Event [topic=openhab/command/Power_Solar_prxy] {item=Power_Solar_prxy, bridgemarker=true, command=-999, timestamp=1546685309202} | {org.osgi.service.event.EventHandler}={service.id=483, service.bundleid=192, service.scope=bundle, event.topics=openhab/command/*, ip=10.10.30.12, service.pid=org.openhab.lightwaverf, component.name=org.openhab.binding.lightwaverf.binding, component.id=202} | Bundle(org.openhab.binding.lightwaverf_1.10.0.201706251807 [192])]
java.lang.IllegalArgumentException: ENERGY_CURRENT_USAGE : is unexpected
	at org.openhab.binding.lightwaverf.internal.LightwaverfConvertor.convertToLightwaveRfMessage(LightwaverfConvertor.java:93) ~[?:?]
	at org.openhab.binding.lightwaverf.internal.LightwaveRfBinding.internalReceive(LightwaveRfBinding.java:264) ~[?:?]
	at org.openhab.binding.lightwaverf.internal.LightwaveRfBinding.internalReceiveCommand(LightwaveRfBinding.java:246) ~[?:?]
	at org.openhab.core.binding.AbstractBinding.receiveCommand(AbstractBinding.java:94) ~[?:?]
	at org.openhab.core.events.AbstractEventSubscriber.handleEvent(AbstractEventSubscriber.java:45) ~[?:?]
	at org.apache.felix.eventadmin.impl.handler.EventHandlerProxy.sendEvent(EventHandlerProxy.java:415) [3:org.apache.karaf.services.eventadmin:4.2.1]
	at org.apache.felix.eventadmin.impl.tasks.HandlerTask.runWithoutBlacklistTiming(HandlerTask.java:82) [3:org.apache.karaf.services.eventadmin:4.2.1]
	at org.apache.felix.eventadmin.impl.tasks.SyncDeliverTasks.execute(SyncDeliverTasks.java:104) [3:org.apache.karaf.services.eventadmin:4.2.1]
	at org.apache.felix.eventadmin.impl.tasks.AsyncDeliverTasks$TaskExecuter.run(AsyncDeliverTasks.java:166) [3:org.apache.karaf.services.eventadmin:4.2.1]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:?]
	at java.lang.Thread.run(Thread.java:748) [?:?]
2019-01-05 10:48:14.674 [INFO ] [home.model.script.Motion - resetting] - Motion from - Kitchen_MotionSensor - resetting to zero
2019-01-05 10:48:12.596 [WARN ] [me.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.eclipse.smarthome.io.monitor.internal.EventLogger@1e2e736' takes more than 5000ms.
2019-01-05 10:47:37.641 [INFO ] [home.model.script.Motion - resetting] - Motion from - Hallway_MotionSensor - resetting to zero
2019-01-05 10:47:27.061 [WARN ] [me.model.script.Motion - Auto Lights] - Bathroom_MotionSensor caused auto-lights - state:OFF
2019-01-05 10:47:17.012 [WARN ] [l.script.controls.tablet.state.rules] - Brighten Screen - Setting brightness to 100.
2019-01-05 10:52:12.890 [WARN ] [me.model.script.Motion - Auto Lights] - Room: Bathroom
2019-01-05 10:50:12.855 [WARN ] [org.apache.karaf.services.eventadmin] - EventAdmin: Exception during event dispatch [org.osgi.service.event.Event [topic=openhab/command/Power_Main] {item=Power_Main, bridgemarker=true, command=-999, timestamp=1546685316203} | {org.osgi.service.event.EventHandler}={service.id=483, service.bundleid=192, service.scope=bundle, event.topics=openhab/command/*, ip=10.10.30.12, service.pid=org.openhab.lightwaverf, component.name=org.openhab.binding.lightwaverf.binding, component.id=202} | Bundle(org.openhab.binding.lightwaverf_1.10.0.201706251807 [192])]
java.lang.IllegalArgumentException: ENERGY_CURRENT_USAGE : is unexpected
	at org.openhab.binding.lightwaverf.internal.LightwaverfConvertor.convertToLightwaveRfMessage(LightwaverfConvertor.java:93) ~[?:?]
	at org.openhab.binding.lightwaverf.internal.LightwaveRfBinding.internalReceive(LightwaveRfBinding.java:264) ~[?:?]
	at org.openhab.binding.lightwaverf.internal.LightwaveRfBinding.internalReceiveCommand(LightwaveRfBinding.java:246) ~[?:?]
	at org.openhab.core.binding.AbstractBinding.receiveCommand(AbstractBinding.java:94) ~[?:?]
	at org.openhab.core.events.AbstractEventSubscriber.handleEvent(AbstractEventSubscriber.java:45) ~[?:?]
	at org.apache.felix.eventadmin.impl.handler.EventHandlerProxy.sendEvent(EventHandlerProxy.java:415) [3:org.apache.karaf.services.eventadmin:4.2.1]
	at org.apache.felix.eventadmin.impl.tasks.HandlerTask.runWithoutBlacklistTiming(HandlerTask.java:82) [3:org.apache.karaf.services.eventadmin:4.2.1]
	at org.apache.felix.eventadmin.impl.tasks.SyncDeliverTasks.execute(SyncDeliverTasks.java:104) [3:org.apache.karaf.services.eventadmin:4.2.1]
	at org.apache.felix.eventadmin.impl.tasks.AsyncDeliverTasks$TaskExecuter.run(AsyncDeliverTasks.java:166) [3:org.apache.karaf.services.eventadmin:4.2.1]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:?]
	at java.lang.Thread.run(Thread.java:748) [?:?]
2019-01-05 10:49:42.950 [DEBUG] [rg.quartz.core.QuartzSchedulerThread] - batch acquisition of 1 triggers
2019-01-05 10:49:42.949 [ERROR] [org.quartz.simpl.SimpleThreadPool   ] - Worker thread was interrupt()'ed.
java.lang.InterruptedException: null
	at java.lang.Object.wait(Native Method) ~[?:?]
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
2019-01-05 10:49:12.610 [INFO ] [htwaverf.internal.LightwaveRfBinding] - Update from LightwaveRf ItemName[Power_Main], State[726]
2019-01-05 10:49:12.610 [DEBUG] [org.quartz.core.JobRunShell         ] - Calling execute on job DEFAULT.state.motion.lastmotion.rules#Calc Last Motion ago#0 0/1 * 1/1 * ? *
2019-01-05 10:55:12.954 [INFO ] [htwaverf.internal.LightwaveRfBinding] - Update from LightwaveRf ItemName[Power_Solar_prxy], State[144]
2019-01-05 10:54:42.934 [ERROR] [org.quartz.simpl.SimpleThreadPool   ] - Worker thread was interrupt()'ed.
java.lang.InterruptedException: null
	at java.lang.Object.wait(Native Method) ~[?:?]
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
2019-01-05 10:53:42.928 [WARN ] [org.apache.karaf.services.eventadmin] - EventAdmin: Event Task Processing Interrupted. Events may not be recieved in proper order.
2019-01-05 10:53:17.891 [WARN ] [me.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.eclipse.smarthome.io.monitor.internal.EventLogger@1e2e736' takes more than 5000ms.
2019-01-05 10:53:13.007 [WARN ] [me.model.script.Motion - Auto Lights] - Turning off
2019-01-05 10:56:09.753 [WARN ] [org.apache.karaf.services.eventadmin] - EventAdmin: Event Task Processing Interrupted. Events may not be recieved in proper order.
2019-01-05 10:55:43.145 [DEBUG] [rg.quartz.core.QuartzSchedulerThread] - batch acquisition of 1 triggers
2019-01-05 10:56:09.760 [INFO ] [htwaverf.internal.LightwaveRfBinding] - Update from LightwaveRf ItemName[Power_Main], State[696]
2019-01-05 10:55:43.135 [ERROR] [org.quartz.simpl.SimpleThreadPool   ] - Worker thread was interrupt()'ed.
java.lang.InterruptedException: null
	at java.lang.Object.wait(Native Method) ~[?:?]
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
2019-01-05 10:56:09.801 [INFO ] [htwaverf.internal.LightwaveRfBinding] - Update from LightwaveRf ItemName[Power_Solar_prxy], State[149]
2019-01-05 10:56:09.843 [INFO ] [htwaverf.internal.LightwaveRfBinding] - Update from LightwaveRf ItemName[Power_Main], State[701]
2019-01-05 10:56:09.861 [INFO ] [htwaverf.internal.LightwaveRfBinding] - Update from LightwaveRf ItemName[Power_Solar_prxy], State[144]

This is very possible,
You may want to combine the rules into one and use a lock

So it is this. I comment out one of the rules and it does not break over time.

Yes, using locks is the obvious solution but this did not work either. It behaved a bit odd.

import java.util.concurrent.locks.ReentrantLock

val ReentrantLock motionLock = new ReentrantLock

rule "Motion ago - reset to zeros"
when Member of g_Motion changed to ON
then

    logWarn("Reset zero", "Reset zero - Firing")
    if (motionLock.getHoldCount() > 0){logWarn("Motion - resetting", "reset to zeros - Waiting on other locks. Count:" + motionLock.getHoldCount().toString)}

    try{
        motionLock.lock()
        logWarn("Motion - resetting", "Motion from - " + triggeringItem.name + " - resetting to zero")
        val item= g_MotionMinutesAgo.members.findFirst[ t | t.name == triggeringItem.name + "_MotionMinutesAgo" ] as NumberItem
        item.sendCommand(0)
    }
    catch(Exception t) { 
        logError("Motion - resetting", "Error: " + t.getMessage())
    }
    finally {
        motionLock.unlock()
    }        
    
end 


rule "Calc Last Motion ago Cron"
when     Time cron "0 0/1 * 1/1 * ? *" //every minute
then
    MotionMinutesAgo_CronProxy.sendCommand(ON)
end

rule "Calc Last Motion ago Cron"
when Item MotionMinutesAgo_CronProxy changed to ON
then

    logWarn("Last Motion", "Last Motion - Firing")

    if (motionLock.getHoldCount() > 0){logWarn("Calc Last Motion ago", "Motion Ago - Waiting on other locks. Count:" + motionLock.getHoldCount().toString)}

    try{

        motionLock.lock()

        g_MotionMinutesAgo?.allMembers.forEach[item |
            if (item === null){
                logWarn("Calc Last Motion ago", "Item item was null. Very odd")
                return;
            }
            if (item.state == NULL){
                item.sendCommand(0)
                logWarn("Calc Last Motion ago", "Item " + item.name + " was null. Setting to zero")
            }
            else {
                var num = (item.state as Number)
                if (num === null){
                    logWarn("Calc Last Motion ago", item.name  + " num was null. Very odd")
                    num = 0;
                }
                item.sendCommand(num + 1) 
                logWarn("Calc Last Motion ago", "Setting to " + item.name + " to " + (num + 1).toString)
            }
        ]
    }
    catch(Exception t) { 
        logError("Motion - Motion Ago", "Error: " + t.getMessage())
    }
    finally {
        motionLock.unlock()
    }        
    MotionMinutesAgo_CronProxy.sendCommand(OFF)
end 

So having this is what should work but, it seems to get stuck and not actually get past the second lock.
I need to set up a more controlled instance of it to make something reproduceable.

Still crashes with the above. I wonder if it’s a case of the “member of” rule which is internally monitoring the items. When there is a write and a read at the same time it deadlocks.

2019-01-07 21:23:58.503 [WARN ] [me.model.script.Motion - Auto Lights] - Bathroom_MotionSensor caused auto-lights - state:ON
2019-01-07 21:23:58.513 [WARN ] [home.model.script.Motion - resetting] - Motion from - Bathroom_MotionSensor - resetting to zero
2019-01-07 21:23:58.534 [WARN ] [me.model.script.Motion - Auto Lights] - Room: Bathroom
2019-01-07 21:23:58.619 [WARN ] [me.model.script.Motion - Auto Lights] - Turning on
2019-01-07 21:24:00.002 [DEBUG] [rg.quartz.core.QuartzSchedulerThread] - batch acquisition of 0 triggers
2019-01-07 21:24:00.002 [DEBUG] [org.quartz.core.JobRunShell         ] - Calling execute on job DEFAULT.state.motion.lastmotion.rules#Calc Last Motion ago Cron#0 0/1 * 1/1 * ? *
2019-01-07 21:24:00.059 [WARN ] [e.smarthome.model.script.Last Motion] - Last Motion - Firing
2019-01-07 21:24:00.102 [WARN ] [me.model.script.Calc Last Motion ago] - Setting to MotionMinutesAgo_Proxy to 58
2019-01-07 21:24:00.180 [WARN ] [me.model.script.Calc Last Motion ago] - Setting to Backroom_MotionSensor_MotionMinutesAgo to 47
2019-01-07 21:24:00.212 [WARN ] [me.model.script.Calc Last Motion ago] - Setting to MotionMinutesAgo to 58
2019-01-07 21:24:00.248 [WARN ] [me.model.script.Calc Last Motion ago] - Setting to Wetroom_MotionSensor_MotionMinutesAgo to 22
2019-01-07 21:24:23.991 [DEBUG] [rg.quartz.core.QuartzSchedulerThread] - batch acquisition of 0 triggers
2019-01-07 21:25:00.275 [DEBUG] [rg.quartz.core.QuartzSchedulerThread] - batch acquisition of 1 triggers
2019-01-07 21:24:05.277 [WARN ] [me.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.eclipse.smarthome.io.monitor.internal.EventLogger@1281bb3' takes more than 5000ms.
2019-01-07 21:24:00.275 [WARN ] [me.model.script.Calc Last Motion ago] - Setting to Hallway_MotionSensor_MotionMinutesAgo to 1
2019-01-07 21:26:00.318 [WARN ] [me.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.eclipse.smarthome.io.monitor.internal.EventLogger@1281bb3' takes more than 5000ms.
2019-01-07 21:25:30.284 [DEBUG] [rg.quartz.core.QuartzSchedulerThread] - batch acquisition of 1 triggers
2019-01-07 21:25:30.282 [DEBUG] [org.quartz.core.JobRunShell         ] - Calling execute on job DEFAULT.state.motion.lastmotion.rules#Calc Last Motion ago Cron#0 0/1 * 1/1 * ? *
2019-01-07 21:25:12.461 [WARN ] [l.script.rooms.wetroom.climate.rules] - Thermo-Control. Room Cooled - turning on
2019-01-07 21:27:30.344 [DEBUG] [rg.quartz.core.QuartzSchedulerThread] - batch acquisition of 1 triggers
2019-01-07 21:27:30.334 [ERROR] [org.quartz.simpl.SimpleThreadPool   ] - Worker thread was interrupt()'ed.
java.lang.InterruptedException: null
	at java.lang.Object.wait(Native Method) ~[?:?]
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
2019-01-07 21:29:30.323 [ERROR] [org.quartz.simpl.SimpleThreadPool   ] - Worker thread was interrupt()'ed.
java.lang.InterruptedException: null
	at java.lang.Object.wait(Native Method) ~[?:?]
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
2019-01-07 21:30:00.328 [ERROR] [org.quartz.simpl.SimpleThreadPool   ] - Worker thread was interrupt()'ed.
java.lang.InterruptedException: null
	at java.lang.Object.wait(Native Method) ~[?:?]
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
2019-01-07 21:27:00.304 [DEBUG] [org.quartz.core.JobRunShell         ] - Calling execute on job DEFAULT.state.motion.lastmotion.rules#Calc Last Motion ago Cron#0 0/1 * 1/1 * ? *
2019-01-07 21:26:30.349 [WARN ] [me.model.script.Calc Last Motion ago] - Setting to Kitchen_MotionSensor_MotionMinutesAgo to 1
2019-01-07 21:30:30.334 [ERROR] [org.quartz.simpl.SimpleThreadPool   ] - Worker thread was interrupt()'ed.
java.lang.InterruptedException: null
	at java.lang.Object.wait(Native Method) ~[?:?]
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
2019-01-07 21:31:30.344 [ERROR] [org.quartz.simpl.SimpleThreadPool   ] - Worker thread was interrupt()'ed.
java.lang.InterruptedException: null
	at java.lang.Object.wait(Native Method) ~[?:?]
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
2019-01-07 21:32:00.350 [ERROR] [org.quartz.simpl.SimpleThreadPool   ] - Worker thread was interrupt()'ed.
java.lang.InterruptedException: null
	at java.lang.Object.wait(Native Method) ~[?:?]
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
2019-01-07 21:32:30.355 [ERROR] [org.quartz.simpl.SimpleThreadPool   ] - Worker thread was interrupt()'ed.
java.lang.InterruptedException: null
	at java.lang.Object.wait(Native Method) ~[?:?]
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
2019-01-07 21:33:00.361 [ERROR] [org.quartz.simpl.SimpleThreadPool   ] - Worker thread was interrupt()'ed.
java.lang.InterruptedException: null
	at java.lang.Object.wait(Native Method) ~[?:?]
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
2019-01-07 21:29:14.028 [WARN ] [se.smarthome.model.script.Reset zero] - Reset zero - Firing
2019-01-07 21:29:02.333 [WARN ] [se.smarthome.model.script.Reset zero] - Reset zero - Firing
2019-01-07 21:28:43.279 [WARN ] [rthome.model.script.Wetroom Humidity] - Turning on fan. Humidity is 70.4
2019-01-07 21:28:43.242 [WARN ] [l.script.rooms.wetroom.climate.rules] - Thermo-Control. Room Cooled - turning on
2019-01-07 21:28:35.320 [WARN ] [me.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.eclipse.smarthome.io.monitor.internal.EventLogger@1281bb3' takes more than 5000ms.
2019-01-07 21:31:00.383 [DEBUG] [rg.quartz.core.QuartzSchedulerThread] - batch acquisition of 1 triggers
2019-01-07 21:31:00.378 [ERROR] [org.quartz.simpl.SimpleThreadPool   ] - Worker thread was interrupt()'ed.
java.lang.InterruptedException: null
	at java.lang.Object.wait(Native Method) ~[?:?]
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
2019-01-07 21:31:00.374 [WARN ] [me.model.script.Calc Last Motion ago] - Setting to Frontroom_MotionSensor_MotionMinutesAgo to 58
2019-01-07 21:33:25.827 [WARN ] [me.model.script.Calc Last Motion ago] - Setting to Bathroom_MotionSensor_MotionMinutesAgo to 1
2019-01-07 21:33:25.902 [WARN ] [me.model.script.Motion - Auto Lights] - Wetroom_MotionSensor caused auto-lights - state:ON
2019-01-07 21:33:25.953 [WARN ] [me.model.script.Motion - Auto Lights] - Room: Wetroom
2019-01-07 21:33:25.968 [WARN ] [me.model.script.Calc Last Motion ago] - Setting to Proxy_MotionSensor_MotionMinutesAgo to 58
2019-01-07 21:33:25.989 [WARN ] [home.model.script.Motion - resetting] - Motion from - Hallway_MotionSensor - resetting to zero
2019-01-07 21:33:26.131 [WARN ] [home.model.script.Motion - resetting] - Motion from - Wetroom_MotionSensor - resetting to zero
2019-01-07 21:33:28.317 [WARN ] [me.model.script.Motion - Auto Lights] - Turning on
2019-01-07 21:36:46.249 [WARN ] [l.script.rooms.wetroom.climate.rules] - Thermo-Control. Room Cooled - turning on
2019-01-07 21:37:58.011 [WARN ] [.handler.UniFiControllerThingHandler] - Unhandled exception while refreshing the UniFi Controller unifi:controller:home - Unknown HTTP status code 503 returned by the controller
2019-01-07 21:38:32.424 [WARN ] [se.smarthome.model.script.Reset zero] - Reset zero - Firing
2019-01-07 21:38:32.452 [WARN ] [home.model.script.Motion - resetting] - Motion from - Hallway_MotionSensor - resetting to zero
2019-01-07 21:38:58.612 [WARN ] [se.smarthome.model.script.Reset zero] - Reset zero - Firing
2019-01-07 21:38:58.649 [WARN ] [home.model.script.Motion - resetting] - Motion from - Frontroom_MotionSensor - resetting to zero
2019-01-07 21:39:38.375 [WARN ] [se.smarthome.model.script.Reset zero] - Reset zero - Firing
2019-01-07 21:39:38.396 [WARN ] [home.model.script.Motion - resetting] - Motion from - Hallway_MotionSensor - resetting to zero
2019-01-07 21:39:51.394 [WARN ] [se.smarthome.model.script.Reset zero] - Reset zero - Firing
2019-01-07 21:39:51.409 [WARN ] [home.model.script.Motion - resetting] - Motion from - Kitchen_MotionSensor - resetting to zero
2019-01-07 21:40:40.943 [WARN ] [se.smarthome.model.script.Reset zero] - Reset zero - Firing
2019-01-07 21:40:40.962 [WARN ] [home.model.script.Motion - resetting] - Motion from - Hallway_MotionSensor - resetting to zero
2019-01-07 21:43:28.729 [WARN ] [se.smarthome.model.script.Reset zero] - Reset zero - Firing
2019-01-07 21:43:28.756 [WARN ] [home.model.script.Motion - resetting] - Motion from - Hallway_MotionSensor - resetting to zero

Nonetheless, I cannot use this as it is as it crashes it all :frowning:

Don’t like that.
item seems horribly like a reserved name and I’d use something else, itemx
What happens if no item is found? I’d have a item===null check before operating with it. If that’s unexpected, you can always logInfo a message about it.

Should there be something like a null failure within the try-catch block, rules DSL is pretty rubbish about carrying out the finally part. Meaning your lock never unlocks, and that’s the end of that until reboot.

1 Like

true. I’ll shore that up and see.
I’m doing a test case for it now to try and recreate it without any other input. Will post code when done for others to try / log a bug with it.

From learning the hard way, my group-searching rules are full of …

val seek = gGroup.members ....
if (seek === null) {
    logInfo("rule", "config error, stupid. " + targetname)
} else {
    useful stuff

Right, so this will crash my system. RPi 3B+ with nothing other than OH 2.4 running on it.

To start it, turn each switch on. It’ll then run by itself.
It’s pretty exaggerated but will eventually (10-15 minutes for me with 6 cron threads) crash cron.

The error with this is different to my normal errors but it seems that anything that happens inside a “cron trigger” will eventually bring your system down.

Do you have a test system you can run this one?

Items:

Group:Switch g_cc_test
Group:Number g_cc_test_mot


Switch cc1  (g_cc_test) {expire="1s,command=OFF"} 
Switch cc2  (g_cc_test) {expire="2s,command=OFF"}  
Switch cc3  (g_cc_test) {expire="3s,command=OFF"}
Switch cc4  (g_cc_test) {expire="4s,command=OFF"}
Switch cc5  (g_cc_test) {expire="5s,command=OFF"}
Switch cc6  (g_cc_test) {expire="6s,command=OFF"}

Number cc1_mot  (g_cc_test_mot)
Number cc2_mot  (g_cc_test_mot)  
Number cc3_mot  (g_cc_test_mot)
Number cc4_mot  (g_cc_test_mot)
Number cc5_mot  (g_cc_test_mot)
Number cc6_mot  (g_cc_test_mot)


Rules:

rule "Inc item count"
when Member of g_cc_test changed to ON
then 
        val itemx = g_cc_test_mot.members.findFirst[ t | t.name == triggeringItem.name + "_mot" ] as NumberItem
        if (itemx === null){
            logWarn("Crash Test", "Inc Item count - item was null. Odd")
            return;
        }
        logWarn("Crash Test", "Resetting " + itemx.name + " to zero")
        itemx.sendCommand(0)
end 


rule "cron crash test"
when     Time cron "0/5 0/1 * 1/1 * ? *" //every minute
then
    logWarn("FFFF", "JJJJ")

    g_cc_test_mot?.allMembers.forEach[itemx |
        if (itemx === null){
            logWarn("Calc Last Motion ago", "Item item was null. Very odd")
            return;
        }
        if (itemx.state == NULL){
            itemx.sendCommand(0)
            logWarn("Calc Last Motion ago", "Item " + itemx.name + " was null. Setting to zero")
        }
        else {
            var num = (itemx.state as Number)
            if (num === null){
                logWarn("Calc Last Motion ago", itemx.name  + " num was null. Very odd")
                num = 0;
            }
            itemx.sendCommand(num + 1) 
            logWarn("Calc Last Motion ago", "Setting to " + itemx.name + " to " + (num + 1).toString)
        }
    ]

end

rule "crash test random on"
when Member of g_cc_test changed to OFF
then
    val i = (Math::random * 10.0).intValue + 1
logWarn("SSSS", i.toString())
    createTimer(now.plusSeconds(i)) [|
                    triggeringItem.sendCommand(ON)
                ]    
end 
2019-01-08 21:06:29.589 [DEBUG] [rg.quartz.core.QuartzSchedulerThread] - batch acquisition of 1 triggers
2019-01-08 21:06:29.589 [DEBUG] [org.quartz.core.JobRunShell         ] - Calling execute on job DEFAULT.2019-01-08T21:06:29.587Z: Proxy for org.eclipse.xtext.xbase.lib.Procedures$Procedure0: [ | {
  <XFeatureCallImplCustom>.sendCommand(<XFeatureCallImplCustom>)
} ]
2019-01-08 21:06:29.603 [ERROR] [org.quartz.core.JobRunShell         ] - Job DEFAULT.2019-01-08T21:06:29.587Z: Proxy for org.eclipse.xtext.xbase.lib.Procedures$Procedure0: [ | {
  <XFeatureCallImplCustom>.sendCommand(<XFeatureCallImplCustom>)
} ] threw an unhandled Exception: 
java.lang.NullPointerException: null
	at org.eclipse.smarthome.model.script.engine.ScriptError.<init>(ScriptError.java:66) ~[?:?]
	at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.invokeFeature(ScriptInterpreter.java:140) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:902) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:865) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:224) ~[?:?]
	at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:204) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.evaluateArgumentExpressions(XbaseInterpreter.java:1116) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._invokeFeature(XbaseInterpreter.java:1046) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.invokeFeature(XbaseInterpreter.java:992) ~[?:?]
	at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.invokeFeature(ScriptInterpreter.java:151) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:772) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:220) ~[?:?]
	at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:204) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:447) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:228) ~[?:?]
	at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:204) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.evaluate(XbaseInterpreter.java:190) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.ClosureInvocationHandler.doInvoke(ClosureInvocationHandler.java:46) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.AbstractClosureInvocationHandler.invoke(AbstractClosureInvocationHandler.java:29) ~[?:?]
	at com.sun.proxy.$Proxy142.apply(Unknown Source) ~[?:?]
	at org.eclipse.smarthome.model.script.internal.actions.TimerExecutionJob.execute(TimerExecutionJob.java:49) ~[?:?]
	at org.quartz.core.JobRunShell.run(JobRunShell.java:202) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
2019-01-08 21:06:29.744 [ERROR] [org.quartz.core.ErrorLogger         ] - Job (DEFAULT.2019-01-08T21:06:29.587Z: Proxy for org.eclipse.xtext.xbase.lib.Procedures$Procedure0: [ | {
  <XFeatureCallImplCustom>.sendCommand(<XFeatureCallImplCustom>)
} ] threw an exception.
org.quartz.SchedulerException: Job threw an unhandled exception.
	at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573) [107:org.eclipse.smarthome.core.scheduler:0.10.0.oh240]
Caused by: java.lang.NullPointerException
	at org.eclipse.smarthome.model.script.engine.ScriptError.<init>(ScriptError.java:66) ~[?:?]
	at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.invokeFeature(ScriptInterpreter.java:140) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:902) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:865) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:224) ~[?:?]
	at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:204) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.evaluateArgumentExpressions(XbaseInterpreter.java:1116) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._invokeFeature(XbaseInterpreter.java:1046) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.invokeFeature(XbaseInterpreter.java:992) ~[?:?]
	at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.invokeFeature(ScriptInterpreter.java:151) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:772) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:220) ~[?:?]
	at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:204) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter._doEvaluate(XbaseInterpreter.java:447) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.doEvaluate(XbaseInterpreter.java:228) ~[?:?]
	at org.eclipse.smarthome.model.script.interpreter.ScriptInterpreter.doEvaluate(ScriptInterpreter.java:226) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.internalEvaluate(XbaseInterpreter.java:204) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.XbaseInterpreter.evaluate(XbaseInterpreter.java:190) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.ClosureInvocationHandler.doInvoke(ClosureInvocationHandler.java:46) ~[?:?]
	at org.eclipse.xtext.xbase.interpreter.impl.AbstractClosureInvocationHandler.invoke(AbstractClosureInvocationHandler.java:29) ~[?:?]
	at com.sun.proxy.$Proxy142.apply(Unknown Source) ~[?:?]
	at org.eclipse.smarthome.model.script.internal.actions.TimerExecutionJob.execute(TimerExecutionJob.java:49) ~[?:?]
	at org.quartz.core.JobRunShell.run(JobRunShell.java:202) ~[?:?]
	... 1 more
2019-01-08 21:06:29.986 [DEBUG] [org.quartz.core.JobRunShell         ] - Calling execute on job DEFAULT.2019-01-08T21:06:29.985Z: Proxy for org.eclipse.xtext.xbase.lib.Procedures$Procedure0: [ | {
  <XFeatureCallImplCustom>.sendCommand(<XFeatureCallImplCustom>)
} ]
2019-01-08 21:06:29.990 [DEBUG] [rg.quartz.core.QuartzSchedulerThread] - batch acquisition of 1 triggers
2019-01-08 21:06:30.009 [DEBUG] [org.quartz.core.JobRunShell         ] - Calling execute on job DEFAULT.crash.test.rules#cron crash test#0/5 0/1 * 1/1 * ? *
2019-01-08 21:06:30.011 [DEBUG] [rg.quartz.core.QuartzSchedulerThread] - batch acquisition of 1 triggers
2019-01-08 21:06:30.022 [WARN ] [.eclipse.smarthome.model.script.FFFF] - JJJJ
2019-01-08 21:06:30.026 [WARN ] [se.smarthome.model.script.Crash Test] - Resetting cc1_mot to zero
2019-01-08 21:06:30.038 [WARN ] [me.model.script.Calc Last Motion ago] - Setting to cc3_mot to 3
2019-01-08 21:06:30.056 [WARN ] [me.model.script.Calc Last Motion ago] - Setting to cc2_mot to 44

Just to make sure you’re not chasing a red herring now, consider:-
If you edit and reload a .rules file - any createTimer jobs already scheduled stay running.
Their context of the original rules file is lost and destroyed, though. That usually means that when quartz does execute the scheduled job, there’s almost always a failure. Usually a null error of some kind.

So those can show up N minutes or hours after a rules edit, and you just have to remember what jobs your rules system might have scheduled. An OH reboot will give a clean start, of course.

Yup, seen that and they always show up in log:tail as null. These though never show in log:tail, just the log file.

My test might be a bit dirty with the timers overlapping and swamping…