File not found in rules from time to time

Hi All, not sure if this is the correct place for this question.
Around once a day, i get an error in the logs that looks like this:
2015-10-30 20:55:00.455 ERROR o.e.x.r.i.DefaultResourceDescription[:75] - openzwaveDoor.rules (No such file or directory)

When this happens, the rules in that file stop working entierly.
The resulution is to restart openhab, or ‘touch’ the file in question.
Its not always the same rules file, but i have a couple that it happens tge most to.
Looking at other system logs i cant find any i/o or disk errors.

Is this a bug in openhab? i am running openhab-1.7.1.
Is there something i can do to resolve it?

Many thanks.

The full debug logs can be found here:
2015-10-30 20:55:00.424 DEBUG o.e.x.c.t.u.L.Request[:135] - getReceiverType.result = null
2015-10-30 20:55:00.421 DEBUG o.e.x.u.OnChangeEvictingCache[:273] - Clear 270 cache entries for resource openzwaveDoor.rules after 0 hits and 270 misses (quota: 0%)
2015-10-30 20:55:00.427 DEBUG o.e.x.u.OnChangeEvictingCache[:273] - Clear 1 cache entries for resource openzwaveDoor.rules after 0 hits and 1 misses (quota: 0%)
2015-10-30 20:55:00.428 DEBUG o.e.x.u.OnChangeEvictingCache[:273] - Clear 1 cache entries for resource openzwaveDoor.rules after 0 hits and 1 misses (quota: 0%)
2015-10-30 20:55:00.434 DEBUG o.e.x.c.t.u.L.Request[:142] - getFeature()
2015-10-30 20:55:00.435 DEBUG o.e.x.c.t.u.L.Request[:148] - getFeature.result = /null/
2015-10-30 20:55:00.436 DEBUG o.e.x.c.t.u.L.Request[:142] - getFeature()
2015-10-30 20:55:00.438 DEBUG o.e.x.c.t.u.L.Request[:148] - getFeature.result = /null/
2015-10-30 20:55:00.439 DEBUG o.e.x.c.t.u.L.Request[:213] - getDeclaredType()
2015-10-30 20:55:00.440 DEBUG o.e.x.x.t.AbstractTypeProvider[:346] - getTypeForIdentifiable: org.eclipse.xtext.xbase.impl.XVariableDeclarationImplCustom@4bbf3597 (name: isInZone, writeable: false)
2015-10-30 20:55:00.441 DEBUG o.e.x.x.t.AbstractTypeProvider[:169] - cache skip: JvmParameterizedTypeReference: org.eclipse.xtext.xbase.lib.Functions$Function4
2015-10-30 20:55:00.443 DEBUG o.e.x.x.t.AbstractTypeProvider[:663] - cache hit: false for: org.eclipse.xtext.xbase.impl.XVariableDeclarationImplCustom@4bbf3597 (name: isInZone, writeable: false)
2015-10-30 20:55:00.444 DEBUG o.e.x.x.t.AbstractTypeProvider[:350] - result: JvmParameterizedTypeReference: org.eclipse.xtext.xbase.lib.Functions$Function4 getTypeForIdentifiable(parameterized) : org.eclipse.xtext.xbase.impl.XVariableDeclarationImplCustom@4bbf3597 (name: isInZone, writeable: false)
2015-10-30 20:55:00.445 DEBUG o.e.x.c.t.u.L.Request[:219] - getDeclaredType.result = JvmParameterizedTypeReference: org.eclipse.xtext.xbase.lib.Functions$Function4
2015-10-30 20:55:00.453 DEBUG o.e.x.c.t.u.L.Request[:200] - getExpectedType()
2015-10-30 20:55:00.454 DEBUG o.e.x.x.t.AbstractTypeProvider[:346] - getExpectedType: isInZone
2015-10-30 20:55:00.456 DEBUG o.e.x.c.t.u.L.Primary[:55] - Initialized argument context for request XbaseTypeProvider.getFeatureCallTypeArgContext [expr=.apply(,,,)]
2015-10-30 20:55:00.451 DEBUG o.e.x.u.OnChangeEvictingCache[:273] - Clear 81 cache entries for resource openzwaveDoor.rules after 0 hits and 81 misses (quota: 0%)
2015-10-30 20:55:00.458 DEBUG o.e.x.x.t.AbstractTypeProvider[:346] - getType: Pair(isInZone, null)
2015-10-30 20:55:00.455 ERROR o.e.x.r.i.DefaultResourceDescription[:75] - openzwaveDoor.rules (No such file or directory)
java.io.FileNotFoundException: openzwaveDoor.rules (No such file or directory)
at java.io.FileInputStream.open(Native Method)
2015-10-30 20:55:00.460 DEBUG o.e.x.x.t.AbstractTypeProvider[:350] - result: null getType(parameterized) : Pair(isInZone, null)
2015-10-30 20:55:00.458 INFO o.o.m.c.i.ModelRepositoryImpl[:98] - Refreshing model ‘openzwaveDoor.rules’
2015-10-30 20:55:00.462 DEBUG o.e.x.c.t.u.L.Primary[:77] - getLowerBound(JvmParameterizedTypeReference: org.eclipse.xtext.xbase.lib.Functions$Function4<JvmParameterizedTypeReference: P1, JvmParameterizedTypeReference: P2, JvmParameterizedTypeReference: P3, JvmParameterizedTypeReference: P4, JvmParameterizedTypeReference: Result>)
2015-10-30 20:55:00.461 DEBUG o.e.x.x.t.AbstractTypeProvider[:341] - getType(parameterized) : org.eclipse.xtext.xbase.util.XExpressionHelper.isShortCircuiteBooleanOperation(XExpressionHelper.java:163)
2015-10-30 20:55:00.463 DEBUG o.e.x.c.t.u.L.Request[:161] - getNearestDeclarator()
2015-10-30 20:55:00.462 WARN o.o.m.c.i.ModelRepositoryImpl[:58] - Configuration model ‘openzwaveDoor.rules’ is either empty or cannot be parsed correctly!
2015-10-30 20:55:00.465 DEBUG o.e.x.c.t.u.L.Request[:167] - getNearestDeclarator.result = null
2015-10-30 20:55:00.464 DEBUG o.e.x.x.t.AbstractTypeProvider[:346] - getType: Pair( && , null)
2015-10-30 20:55:00.466 DEBUG o.e.x.c.t.u.L.Request[:129] - getReceiverType()
2015-10-30 20:55:00.467 DEBUG o.e.x.x.t.AbstractTypeProvider[:346] - getTypeForIdentifiable: org.eclipse.xtext.common.types.impl.JvmOperationImpl@15d9035f (visibility: PUBLIC, simpleName: operator_and, identifier: org.eclipse.xtext.xbase.lib.BooleanExtensions.operator_and(boolean,boolean)) (varArgs: false) (static: true, final: false, abstract: false)
2015-10-30 20:55:00.469 DEBUG o.e.x.x.t.AbstractTypeProvider[:663] - cache hit: false for: org.eclipse.xtext.common.types.impl.JvmOperationImpl@15d9035f (visibility: PUBLIC, simpleName: operator_and, identifier: org.eclipse.xtext.xbase.lib.BooleanExtensions.operator_and(boolean,boolean)) (varArgs: false) (static: true, final: false, abstract: false)
2015-10-30 20:55:00.468 DEBUG o.e.x.x.t.AbstractTypeProvider[:346] - getType: Pair(isInZone, null)
2015-10-30 20:55:00.469 DEBUG o.e.x.x.t.AbstractTypeProvider[:350] - result: JvmParameterizedTypeReference: boolean getTypeForIdentifiable(parameterized) : org.eclipse.xtext.common.types.impl.JvmOperationImpl@15d9035f (visibility: PUBLIC, simpleName: operator_and, identifier: org.eclipse.xtext.xbase.lib.BooleanExtensions.operator_and(boolean,boolean)) (varArgs: false) (static: true, final: false, abstract: false)
2015-10-30 20:55:00.471 DEBUG o.e.x.x.t.AbstractTypeProvider[:350] - result: null getType(parameterized) : Pair(isInZone, null)
2015-10-30 20:55:00.471 DEBUG o.e.x.x.t.AbstractTypeProvider[:169] - cache skip: JvmParameterizedTypeReference: boolean
2015-10-30 20:55:00.473 DEBUG o.e.x.c.t.u.L.Request[:135] - getReceiverType.result = null
2015-10-30 20:55:00.473 DEBUG o.e.x.x.t.AbstractTypeProvider[:663] - cache hit: false for: Pair( && , null)
2015-10-30 20:55:00.474 DEBUG o.e.x.c.t.u.L.Request[:142] - getFeature()
2015-10-30 20:55:00.475 DEBUG o.e.x.x.t.AbstractTypeProvider[:350] - result: JvmParameterizedTypeReference: boolean getType(parameterized) : Pair( && , null)
2015-10-30 20:55:00.476 DEBUG o.e.x.c.t.u.L.Request[:148] - getFeature.result = org.eclipse.xtext.xbase.lib.Functions$Function4.apply(P1,P2,P3,P4)
2015-10-30 20:55:00.477 DEBUG o.e.x.x.t.AbstractTypeProvider[:341] - getType(parameterized) : org.eclipse.xtext.xbase.util.XExpressionHelper.isShortCircuiteBooleanOperation(XExpressionHelper.java:164)
2015-10-30 20:55:00.478 DEBUG o.e.x.c.t.u.L.Request[:187] - getExplicitTypeArgument()
2015-10-30 20:55:00.478 DEBUG o.e.x.x.t.AbstractTypeProvider[:346] - getType: Pair( && , null)
2015-10-30 20:55:00.479 DEBUG o.e.x.c.t.u.L.Request[:193] - getExplicitTypeArgument.result = []
2015-10-30 20:55:00.480 DEBUG o.e.x.x.t.AbstractTypeProvider[:346] - getTypeForIdentifiable: org.eclipse.xtext.common.types.impl.JvmOperationImpl@15d9035f (visibility: PUBLIC, simpleName: operator_and, identifier: org.eclipse.xtext.xbase.lib.BooleanExtensions.operator_and(boolean,boolean)) (varArgs: false) (static: true, final: false, abstract: false)
2015-10-30 20:55:00.481 DEBUG o.e.x.c.t.u.L.Request[:174] - getArgumentTypes()

THIS IS WHERE I SIMPLY TOUCH THE FILE TO UPDATE ITS TIMESTAMP.

2015-10-30 20:55:10.494 INFO o.o.m.c.i.ModelRepositoryImpl[:80] - Loading model ‘openzwaveDoor.rules’
2015-10-30 20:55:10.496 DEBUG o.e.x.l.i.DefaultLinkingService[:123] - after getLinkedObjects: node: ‘everyChange’ result: everyChange
2015-10-30 20:55:10.496 DEBUG o.e.x.l.i.DefaultLinkingService[:116] - before getLinkedObjects: node: ‘everyChange’
2015-10-30 20:55:10.497 DEBUG o.e.x.l.i.DefaultLinkingService[:123] - after getLinkedObjects: node: ‘everyChange’ result: everyChange
2015-10-30 20:55:10.620 DEBUG o.e.x.p.a.AbstractInternalAntlrParser[:562] - Parsing took: 121 ms
2015-10-30 20:55:10.637 DEBUG o.e.x.l.i.AbstractCleaningLinker[:35] - beforeModelLinked took: 13ms
2015-10-30 20:55:10.713 DEBUG o.e.x.l.i.AbstractCleaningLinker[:41] - doLinkModel took: 76ms
2015-10-30 20:55:10.714 DEBUG o.e.x.l.i.AbstractCleaningLinker[:47] - afterModelLinked took: 1ms

It’s a bug. A lot of people see it rarely. Most people never see it. There really isn’t a good workaround I know of. Some people put a system started rule that touches the files when items are undefined. I just watch the logs when ii restart.

If you are seeing the bug a lot you can try rearranging your rules into fewer files. I know the bug goes away if you only use one file.

1 Like

I have 7 rule files because the openhab designer does not manage to handle one big file. Additionally changes in one file keeps the remaining functionality untoughed.

I see this bug every time I am restarting openhab.( It seems to be fixed by openhab 2)
The workaround I am using is to have an item for each rule file that is set to ON a startup rule in each rule file.
Also I have a dedicated rule file that is short and is starting with an “a” in the name, that parses all items and checks whether the related items are still “null” and then performs a touch operation via a command line. The requirement for the name “a” seems to determine the priority of the loading. So this file is always loaded and also never fails to load because maybe(?) it is also very short.

in every rule, e.g. in Innenklima.rules

rule "Innenklima startup "
when
System started
then
sendCommand(Innenklima,ON)
end

in the checking rule file starting with “a” in its name. e.g. aarulechecker.rules

rule "check start "
when
Item Rule_reload received update
then
var Boolean not_complete=false

 if (Innenklima.state == Undefined || Innenklima.state == Uninitialized) {
     sendCommand(Innenklima_re,ON)
     Thread::sleep(4000)
     not_complete=true
 }

// further rules are check like above …

 sendCommand(Refreshbrowser,ON)
 if (not_complete == true) createTimer(now.plusSeconds(30)) [|sendCommand(Rule_reload,ON)]

end

// in items
Switch Innenklima_re “Bildschirm ein / aus” { exec= “ON:./configurations/rules/Innenklima.bat,OFF:./configurations/rules/Innenklima.bat”}

and the bat file (Windows) for each rules looks like:

copy /b C:\users\martin\Openhab3\configurations\rules\Innenklima.rules +, C:\users\martin\Openhab3\configurations\rules\Innenklima.rules

2 Likes

Is there anybody actively working on it? I use openhab for locks, lights, power and thermostat. i really cant have it blowing up and freezing my house in the winter.

I also wrote a watchdog script in perl to monitor and touch files or restart openhab during fatal errors. I am simply shocked at the frequency of errors.
I also get this error:
2015-11-03 06:30:10.061 ERROR o.o.c.s.ScriptExecutionThread[:50]- Error during the execution of rule ‘Update Thermostat Device’: Could not invoke method: org.openhab.model.script.actions.BusEvent.sendCommand(org.openhab.core.items.Item,java.lang.String) on instance: null
Which causes items to stop working entierly. a restart of openhab is required in this case.

My understanding is this has been fixed in OH2, or is at least being looked at to be fixed in OH2. I don’t think it is slated to be fixed in OH1 though. OH1’s core is basically frozen with continuing development only being done on bindings. All further development is being done on OH2.

It should only be a problem on openHAB restart which, once your system’s configuration stabilizes, should be relatively infrequent.

This sounds like a configuration/rules error as opposed to an openHAB bug. I think the error is that your items have stopped working, not that the error causes them to stop working. Things to check for:

  • Have all of your Items files and persistence loaded before this rule? If not it may be that the rule is executing before the Item is loaded and populated.
  • Break apart your statements and put in lots of logging of states so you can see which one is not right. I suspect one is either Undefined or Uninitialized at the time the rule throws and error
  • Look for patterns on when the error occurs. Does it occur at the same time? Does it occur immediately after some other event?

I have seen this and similar errors many times and every time was caused by an assumption that I made that was false (e.g. I thought the Item would have a value but it was uninitialized) or a typo.

In both errors, they work fine for a certain period of time, then blow up at random times.

The “no such file or directory” error happens middle of the day and i only restart openhab at night.

The " instance: null" error runs flawlessly most of the time, but randomly messes up when the item dies. At that point all rules and items used in my thermostat stop. There are only two items which are manipulated, and would never be undefined. They use ‘exec’ to get values from the system, and this script can only return a number. When i get it next, i will post the full debug log.

As side note, i have two scripts which run via cron that update item files automatically every 5 mins.
So i am taxing the system to refresh config files a lot.
I will try v. 2 shortly. When will it be stable?

Why?

This is a very unusual use case and probably why you are seeing the errors so frequently. I believe the typical use case is to only reload the items files when you are actively developing them and only restarting openHAB when there is a problem or change to the logging or openhab.cfg configs. I know that personally I’ll go weeks between updates to the Item file and restarts of openHAB. I think my experience is typical.

Whenever you reload your Items files it is like openHAB has restarted. This means that for a very brief time your Items will be Undefined until their last value can be loaded from persistence or obtained from their binding. This brief period of Undefinedness will only be a problem if the rule happens to be executing during that time which should be pretty rare. However, because you cause this Undefined state every five minutes you create a lot of opportunities for this state to occur so it shouldn’t be surprising that you see the error frequently.

Similarly you create a lot of opportunities for the FileNotFoundException error to occur. You are generating more opportunities for these errors to occur in a day than most of us do in a year.

Hopefully OH2 will fix the FileNotFoundException but I doubt it will fix your Undefined Item problem. That has less to do with a bug in OH and more to do with the fact that OH is multi-threaded.

Obviously I don’t know what problem you are trying to solve by reloading the Items files every five minutes but I personally would look for another way to solve your problem. I am skeptical that OH (1 or 2) will ever reliably support this case out of the box. It is not a use case that I suspect the developers have thought of nor is it one that is in demand enough to support.

I have objects which connect to my network and i represent these as items in openhab. These items allow me to control and manipulate these objects.
While i agree i am pushing the boundaries, the system is designed to re-read config files on the fly.
Is there no mutex to protect against config file reloading and rules running?
Note that the items i change regularly are not related to the rules which crash. Also note that while my script runs every 5 min, changes to the items files probably only happens every few hours only.
So when will ver 2 be stable?

Is this error also related?
2015-11-08 16:35:00.154 WARN o.a.cpr.DefaultBroadcaster[:1369] - Duplicate resource be29c181-f6ed-4f43-91f7-ac0b79fe961e. Could be caused by a dead connection not detected by your server. Replacing the old one with the fresh one

It requires me to restart openhab as well.

Do you know a priori the full list of these objects that can connect to your network? If so, perhaps it would take a load off of OH to predefine your Items for all of them and use the visibility attribute on the sitemap to only show those that are actually connected. I still don’t know enough to understand specifically why editing the Items file directly with an external script is required.

There is no mutex. Everything is done in parallel without any locks. Most people, myself included, put in place our own locks in our rules, or otherwise put in state checking to handle cases where our Items are undefined or unknown when it becomes a problem. Though for me it really hasn’t been a problem because I have my polling periods set to ensure the rules are executed after my Items and Persistence is loaded and when I am manually editing my Items files I know to look for these sorts of errors. So during the steady state operation my items are always in some state or another, never Undefined.

If they are in the same file then it doesn’t matter whether they changed. When the timestamp on the Items file is changed openHAB resets all the Items in that file. I’ve also see another posting I can’t find right now where someone has had their Items from other files also be reset when changing a different file. I’ve not been able to reproduce this behavior with my setup so I’ve not looked into it yet.

So long as the script doesn’t cause the timestamp of the file to change then it is indeed only reloading the Items file every couple hours then. You can easily tell by looking in the logs. That is still pretty high for the average OH use case but not as bad as I first assumed. But if the timestamp is changed even if there are no changes (e.g. the script reads in the Items file then writes it out even if there are no changes).

Kai has mentioned that OH 2 is pretty stable right now, though not all of the bindings are stable or have even been ported yet. You might look to see if all the bindings you need have been ported and are stable. One of the big changes in OH2 was splitting up the configs into multiple files so perhaps they have a more reliable mechanism for loading changes to your Items files for your use case. They also support programmatically creating Items so perhaps your script would work better ported into OH 2 directly instead of being external.

I’ve seen that warning on occasion but it has never required me to restart OH. I usually see it when a client connects then disconnects. It is a warning and the code around it corrects the problem so it shouldn’t cause any problems that would require an OH restart.

How did you install openHAB?

Has anyone who experiences this problem tried increasing the soft or hard limit on file descriptors in the script that starts the server VM? It seems conceivable that the OSGi container process may have a crazy number of open file descriptors, and after adding one more script, one more addon JAR, etc., this limit is being hit, but the exception being thrown is not detailed enough to trace back to this root cause. I’ve done no research myself on this, but various descriptions of the issue seem to merit some experimentation with ulimit.

For the time being I do not experience the behavior anymore.
The “workaround” I am using is that I only have two rule files.
One very large file that is containing all the rules that I consider as stable and less likely to be changed. And another small rule file that contains rules that are in development.

The approach has the disadvantage that if I have to change the large rule file I cannot use OH designer anymore and also that the loading after a change takes about 4 minutes (on slow Windows tablet).

What I am still observing is that after a new start of OH, the services are sometimes not loaded before the rules are already firing. So rules are breaking because they rely on persisted items that did not receive their values and thereby have the value NULL. I see similar behavior with transformations. I am using XSLT and MAP and I see a couple of XSLT errors at the beginning.

Overall this behavior is not a showstopper so I can live with it.

1 Like

Sounds like pure race conditions to me… But what about?
(xtend) import’s? Items? Syntax Errors in the rules?

I’m using OH 1.8.3, with which the problem still exists.

That corroborates my theory…

This problem never was fixed and at this point never will given the end of development on 1.x . I do not believe this is a problem in OH 2, though I don’t fully remember the problem described in the thread so I could be wrong.

yes, problem is also present in OH2!

@nickma82 From my testing it seems to happen when OH is busy when viewed with “TOP”. Any time the CPU is at 100% or above is the danger zone for this fault - my OH instances only ever use 2 of the Pi’s 4 cores, so maximum CPU for the Java process is about 197%.
To workaround this bug I run a cron script at startup to move all but 1 of my rules files into a seperate folder at system boot, and then move them back one by one, with a delay added to allow each rule file to load and initialise before moving the next one across. Assuming the delays are higher than the absolute minimum time needed it tames things for about 98% of the time.
Experience suggests that despite the file not found error, at least some aspects of the rule file are loaded and working.

1 Like

@kevin could you please share your cron script?

Will do, but it will have to wait until tomorrow - GMT +12 here, off to bed shortly

@PaulHam

This works as follows:

  1. System is booted and a script is called from crontab before OpenHAB starts loading moves all rules out of the way to allow OpenHAB to load before it has to do anything.
  2. After 20 seconds same script moves PreStart.rules back, I use a fast mini PC these days so the 20 seconds may need to be longer on a Pi, or that one rule can be left in place, but then the rest of the rules will need to be moved by name in the script rather than using a wildcard.
  3. PreStart.rules checks that a persisted item called StartingUp is in a valid state before moving the rest of the rules back.
  4. Another item “DisableScheduler” is set to OFF by one of the other rules when the scheduler finished initialising.

This has proven to be robust solution since I moved to a miniPC

crontab -e reads

@reboot sudo /home/pi/scripts/fixopenhab.sh

fixopenhab.sh reads:

 #!/bin/bash
 # This script fixes openhabs rule problems which screwup on starting, its runs at start from crontab -e
mv /home/pi/openhab2/etc/openhab2/rules/MasterClock.rules /home/pi/openhab2/finalrules
mv /home/pi/openhab2/etc/openhab2/rules/* /home/pi/openhab2/rulesstore/
sleep 20
mv /home/pi/openhab2/rulesstore/PreStart.rules /home/pi/openhab2/etc/openhab2/rules/

PreStart.rules:

var boolean PreStartDone = false    
rule "startup"
when Item StartingUp changed or System started then
if (!PreStartDone && StartingUp.state !==NULL)	{
DisableScheduler.sendCommand(ON) // this is changed to OFF when the timers rule is loaded
executeCommandLine('"/home/pi/scripts/MoveRulesBack.sh"', 10000)
PreStartDone = true
} 
end

MoveRulesBack.sh reads:

#!/bin/bash
# This script moves openhabs rules back into place on starting, its runs at start from prestart.rules
mv /home/pi/openhab2/rulesstore/* /home/pi/openhab2/etc/openhab2/rules/