JSR223 File encoding jython

Spaceman_Spiff · March 19, 2016, 7:15am

Hi,

How do I set the proper file encoding for Jython files?
I tried adding

# -*- coding: utf-8 -*-

as the first line, but then I get the following error:

javax.script.ScriptException: org.python.antlr.ParseException: org.python.antlr.ParseException: encoding declaration in Unicode string
        at org.python.jsr223.PyScriptEngine.scriptException(PyScriptEngine.java:207) ~[jython.jar:na]
        at org.python.jsr223.PyScriptEngine.compileScript(PyScriptEngine.java:88) ~[jython.jar:na]
        at org.python.jsr223.PyScriptEngine.eval(PyScriptEngine.java:47) ~[jython.jar:na]
        at javax.script.AbstractScriptEngine.eval(Unknown Source) ~[na:1.8.0_73]
        at org.openhab.core.jsr223.internal.engine.scriptmanager.Script.loadScript(Script.java:91) ~[bundlefile:na]
        at org.openhab.core.jsr223.internal.engine.scriptmanager.Script.<init>(Script.java:79) ~[bundlefile:na]
        at org.openhab.core.jsr223.internal.engine.scriptmanager.ScriptManager.loadScript(ScriptManager.java:90) [bundlefile:na]
        at org.openhab.core.jsr223.internal.engine.scriptmanager.ScriptManager.scriptsChanged(ScriptManager.java:185) [bundlefile:na]
        at org.openhab.core.jsr223.internal.engine.scriptmanager.ScriptUpdateWatcher.run(ScriptUpdateWatcher.java:105) [bundlefile:na]
        at java.lang.Thread.run(Unknown Source) [na:1.8.0_73]
Caused by: org.python.core.PyException: null
        at org.python.core.Py.JavaError(Py.java:546) ~[jython.jar:na]
        at org.python.core.ParserFacade.parseExpressionOrModule(ParserFacade.java:129) ~[jython.jar:na]
        at org.python.util.PythonInterpreter.compile(PythonInterpreter.java:320) ~[jython.jar:na]
        at org.python.util.PythonInterpreter.compile(PythonInterpreter.java:312) ~[jython.jar:na]
        at org.python.jsr223.PyScriptEngine.compileScript(PyScriptEngine.java:83) ~[jython.jar:na]
        ... 8 common frames omitted
Caused by: org.python.antlr.ParseException: encoding declaration in Unicode string
        at org.python.core.ParserFacade.prepBufReader(ParserFacade.java:281) ~[jython.jar:na]
        at org.python.core.ParserFacade.parseExpressionOrModule(ParserFacade.java:123) ~[jython.jar:na]
        ... 11 common frames omitted

While googling I found that this should actually work, but it obviously does not.

When I remove the entry in the first line, my rule loads fine but I am having problems with special characters like “°C”, “äöü”, etc.
Can anyone point me in the right direction?

steve1 · March 19, 2016, 12:17pm

Are you using the “u” prefix on the Python strings? I tried an example and it worked when not including the comment and prefixing the unicode strings.
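One way to take the file encoding out of the picture while testing is to spell the non-ASCII characters as escape sequences, so the script file stays pure ASCII. A minimal sketch (the variable names are only for illustration):

```python
# Pure-ASCII source: the characters are written as Unicode escapes,
# so the result does not depend on how the script file is decoded.
degrees = u"\u00b0C"              # DEGREE SIGN followed by "C"
umlauts = u"\u00e4\u00f6\u00fc"   # a, o and u with diaeresis

print(repr(degrees))
print(len(degrees))
print(len(umlauts))
```

If the escaped version displays correctly and the literal version does not, the problem is probably in how the file is read rather than in the string handling.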

Here’s some discussion about the encoding comment… http://bugs.jython.org/issue1696

I don’t fully understand the reasoning presented there. It seems to work both with or without the comment if the script is interpreted directly from Jython rather than through the JSR223 ScriptEngine.eval function.
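The error message suggests that the script reaches the parser as text that has already been decoded into characters, in which case a declaration saying how to decode bytes is rejected. The sketch below only illustrates how such a declaration is recognised, using the pattern published in PEP 263. It is not Jython's actual implementation, and `declared_encoding` is a hypothetical helper name.

```python
import re

# Pattern given in PEP 263 for recognising a source encoding declaration.
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")

def declared_encoding(source):
    """Return the encoding named in the first two lines of source, or None."""
    for line in source.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1)
    return None

with_comment = u"# -*- coding: utf-8 -*-\nx = 1\n"
without_comment = u"x = 1\n"

print(declared_encoding(with_comment))
print(declared_encoding(without_comment))
```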

Spaceman_Spiff · March 19, 2016, 1:27pm

I tried both. I also played around with .decode and .encode but I did not manage to get it to work.
My file encoding is utf-8 and I am not including the comment and not including the BOM.
When using u"°C" it doesn’t work either.
What is your file encoding?

steve1 · March 19, 2016, 9:50pm

When you say it’s not working, what do you mean?

Spaceman_Spiff · March 20, 2016, 9:39am

I meant that I have not been able to use special characters in strings like “°C”.
They become something completely unreadable.

print( "°C")
print( u"°C")

becomes

├?┬░C
┬░C
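The second line of that output matches what the UTF-8 bytes of “°C” look like when they are displayed with an old DOS code page such as cp437. That would point at the console encoding rather than at the string itself, although this assumes a Windows-style console, which the thread does not confirm. A small sketch of the effect:

```python
text = u"\u00b0C"                       # the intended string
utf8_bytes = text.encode("utf-8")       # two bytes for the degree sign, one for "C"
as_cp437 = utf8_bytes.decode("cp437")   # what a cp437 console would display

print(repr(utf8_bytes))
# U+252C and U+2591 are the box-drawing and shade characters seen in the output.
print(as_cp437 == u"\u252c\u2591C")
```

The first output line has extra characters in front, which would be consistent with the bytes having been converted one more time along the way, but this sketch does not reproduce it exactly.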

steve1 · March 20, 2016, 2:07pm

I run openHAB as a daemon, so I can’t use print in that context. However, I’ve run standalone Jython scripts that use the JSR223 ScriptEngine class to load scripts that print to the console (Terminal on Mac OS X), and that works correctly (without encoding comment, BOM, etc.). I’ve also logged unicode from an openHAB rule, and that worked too.

riturrioz · August 29, 2016, 12:46pm

Hi @steve1 and @Spaceman_Spiff!

I’m really confused about this unicode, str, utf-8, encode, decode… stuff

I’ve made a really simple jython script and I don’t understand what’s going on here. Here you can see the rule:

import openhab

@openhab.rule
class TestRule(Rule):
    def __init__(self):
        self.log.info("---------------------------------------------")
        self.log.info("---------------------------------------------")
        test = u'habitación'
        self.log.info("test: %s" % (test))
        self.log.info("test's repr: %s" % (repr(test)))
        self.log.info("test's type: %s" % (type(test)))
        test_encode = test.encode("utf-8")
        self.log.info("test_encode: %s" % (test_encode))
        self.log.info("test_encode's repr: %s" % (repr(test_encode)))
        self.log.info("test_encode's type: %s" % (type(test_encode)))
        self.log.info("---------------------------------------------")
        self.log.info("---------------------------------------------")

    def getEventTrigger(self):
        return [

        ]

    def execute(self, event):
        pass

def getRules():
    return RuleSet([
        TestRule()
    ])

And here the openHAB log result:

2016-08-29 14:42:32.811 [INFO ] [.openhab.model.jsr223.TestRule] - ---------------------------------------------
2016-08-29 14:42:32.811 [INFO ] [.openhab.model.jsr223.TestRule] - ---------------------------------------------
2016-08-29 14:42:32.811 [INFO ] [.openhab.model.jsr223.TestRule] - test: habitaci??n
2016-08-29 14:42:32.812 [INFO ] [.openhab.model.jsr223.TestRule] - test's repr: u'habitaci\ufffd\ufffdn'
2016-08-29 14:42:32.812 [INFO ] [.openhab.model.jsr223.TestRule] - test's type: <type 'unicode'>
2016-08-29 14:42:32.812 [INFO ] [.openhab.model.jsr223.TestRule] - test_encode: habitaci??????n
2016-08-29 14:42:32.812 [INFO ] [.openhab.model.jsr223.TestRule] - test_encode's repr: 'habitaci\xef\xbf\xbd\xef\xbf\xbdn'
2016-08-29 14:42:32.812 [INFO ] [.openhab.model.jsr223.TestRule] - test_encode's type: <type 'str'>
2016-08-29 14:42:32.812 [INFO ] [.openhab.model.jsr223.TestRule] - ---------------------------------------------
2016-08-29 14:42:32.812 [INFO ] [.openhab.model.jsr223.TestRule] - ---------------------------------------------

The result is the same whether I define “test” as “habitación” or as u’habitación’.

Does anybody know why “habitaci??n” is printed in the log file instead of “habitación”?

Many thanks for your help guys!

Best regards,

Aitor
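One possible reading of the log above: the repr shows two U+FFFD replacement characters where the single “ó” should be. That is the pattern produced when the two UTF-8 bytes of “ó” are decoded as ASCII with replacement, so the damage may already happen when the script file is read, before any of the rule code runs. The sketch below reproduces both repr values from the log under that assumption (an ASCII default charset, which the thread does not confirm):

```python
original = u"habitaci\u00f3n"
file_bytes = original.encode("utf-8")             # how the word is stored in a UTF-8 file
misread = file_bytes.decode("ascii", "replace")   # each non-ASCII byte becomes U+FFFD

print(misread == u"habitaci\ufffd\ufffdn")
print(repr(misread.encode("utf-8")))
```

If that is the cause, neither the u prefix nor .encode("utf-8") inside the rule can help, because the original character is already lost. Starting the JVM with -Dfile.encoding=UTF-8 would be one thing to try, though that is untested against openHAB here.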