No Emojis in Telegram after splitting unicode string

WOPR · November 7, 2019, 9:40am

Hey guys,
I am having trouble with Emojis in Telegram after using .split().
My rule uses an item whose state contains the user/group and the message, separated by a semicolon:

rule "Telegram-Gateway"
when
   Item Telegram_Gateway received update
then
   val String User = Telegram_Gateway.state.toString.split(';').get(0)
   val String Msg = Telegram_Gateway.state.toString.split(';').get(1)
   val String Timestring = now.toString("dd.MM.YY' 'HH:mm:ss")
   val String Message = "[" + Timestring + "] " + Msg

   logInfo("Telegram-Gateway", "User: " + User + " | Msg: " + Message)

   sendTelegram(User, Message)
   sendTelegram("userxxx", "[" + Timestring + "] " + "\ud83d\udd12")
end

When I send the following message via the console, the gateway message arrives as plain text, whereas the test message contains the lock emoji (Charbase U+1F512: LOCK):
openhab> smarthome:send Telegram_Gateway "userxxx;\ud83d\udd12"

The log output does not look suspicious:
2019-11-07 10:38:36.141 [INFO ] [rthome.model.script.Telegram-Gateway] - User: userxxx | Msg: [07.11.19 10:38:36] \ud83d\udd12

My assumption is that the .split() somehow changes the string, but I cannot figure out why.

Any hints?

Kind regards
Arne

rossko57 · November 7, 2019, 3:24pm

I would guess you need to escape those backslashes i.e. \\ud83d

WOPR · November 8, 2019, 10:11am

Escaping does not work. It just stops the emoji in the test message from being parsed.
I added a screenshot: the 3rd line is parsed like in the example above, the 4th line is hardcoded like in the example above. Lines 1 and 2 are escaped as you suggested. Escaping changes nothing for the parsed message, but the emoji in the hardcoded example is no longer parsed.

Kind regards,
Arne

EDIT: Maybe it is related to this post? Maybe it is also related to this one: Dynamic unicode in rules - openHAB Community (also unsolved, though)

rossko57 · November 8, 2019, 12:05pm

This looks more relevant, it works for them -

but note that was under OH1

Are you just seeing a problem because you bang off two telegrams within a millisecond? It doesn’t seem like a fair test; I would add a little delay while diagnosing, probably makes no difference but why have doubts.

I think it might be helpful to see if your substring is really unicode (two characters) or ASCII (twelve characters):

rule "Telegram-Gateway"
when
   Item Telegram_Gateway received update
then
   val String User = Telegram_Gateway.state.toString.split(';').get(0)
   val String Msg = Telegram_Gateway.state.toString.split(';').get(1)

   logInfo("Telegram-Gateway", "Msg chars: " + Msg.length)

   val String Timestring = now.toString("dd.MM.YY' 'HH:mm:ss")
   val String Message = "[" + Timestring + "] " + Msg

   logInfo("Telegram-Gateway", "User: " + User + " | Msg: " + Message)

   sendTelegram(User, Message)
   Thread::sleep(500)
   sendTelegram("userxxx", "[" + Timestring + "] " + "\ud83d\udd12")

end

WOPR · November 22, 2019, 3:37pm

Sorry for the late reply.
I modified my rule. Sleeping for 500 ms does not change anything. My string is interpreted as ASCII:

2019-11-22 16:31:14.950 [INFO ] [rthome.model.script.Telegram-Gateway] - Msg chars: 12
2019-11-22 16:31:14.951 [INFO ] [rthome.model.script.Telegram-Gateway] - User: userxxx | Msg: [22.11.19 16:31:14] \ud83d\udd12

But why is that important, given that the hardcoded emoji is also entered as ASCII?

Kind regards

rossko57 · November 22, 2019, 4:37pm

Whatever parses
sendTelegram( ... + "\ud83d\udd12")
actually parses it, I suspect: it reads character \, character u, character d etc. and ends up with a string containing two unicode characters (a surrogate pair).

The string manipulation results in a string containing 12 chars, \ u d etc., which of course looks like "\ud83d\udd12" but is not unicode.
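To illustrate the difference, here is a minimal sketch in plain Java rather than rules DSL (the DSL runs on the JVM, so I would expect the string behaviour to be the same, but that is an assumption). The class and constant names are made up for this example. It compares a string literal, where the compiler decodes the escapes, with a string that really contains backslash, u, d and so on:

```java
public class EscapeLengthDemo {
    // The compiler decodes the escapes in this literal: the result is one
    // surrogate pair, i.e. two UTF-16 chars forming the code point U+1F512.
    public static final String LITERAL = "\ud83d\udd12";

    // Doubling the backslashes keeps them as real characters, which mimics
    // the text that arrives in the item state: 12 plain ASCII characters.
    public static final String FROM_ITEM = "\\ud83d\\udd12";

    public static void main(String[] args) {
        System.out.println(LITERAL.length());                            // 2
        System.out.println(LITERAL.codePointCount(0, LITERAL.length())); // 1
        System.out.println(FROM_ITEM.length());                          // 12
        System.out.println(LITERAL.equals(FROM_ITEM));                   // false
    }
}
```

Both strings look identical when written out as escapes in a rule file or a forum post, which is probably why the log line gives no hint that anything is wrong.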

It appears that if you give sendTelegram(someString) a string variable, it uses the string as it is.
If however you give sendTelegram("literal") a string literal, something (and I’m guessing it is rules DSL not the command) parses the literal, where it recognises and encodes the unicode.

What you would like is an encodeStringAsUnicode function to work on your string before passing to telegram. Sounds simple, but I’ve no idea how to do that.
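For what it is worth, such a function can be sketched with a regular expression. The following is plain Java, not rules DSL, and the class name UnicodeUnescape and the method decode are invented for this example; they are not part of openHAB or the Telegram action. It assumes the text only contains escapes of the form backslash, u, four hex digits, and it replaces each one with the corresponding UTF-16 char, so a surrogate pair such as the lock emoji comes out as two chars:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeUnescape {
    // Matches a literal backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Hypothetical helper: turns textual escapes into real UTF-16 chars.
    // Text without any escape is returned unchanged.
    public static String decode(String raw) {
        Matcher m = ESCAPE.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Parse the four hex digits into one UTF-16 code unit.
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against a decoded '$' or backslash.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "[07.11.19 10:38:36] " + "\\ud83d\\udd12";
        String decoded = decode(raw);
        System.out.println(raw.length() - decoded.length()); // 10
    }
}
```

In the rule, the same idea would be applied to Msg before it is passed to sendTelegram. Whether this can be used from the rules DSL as is, or needs to be ported, is untested here. The sketch also does not handle a doubled backslash in front of the u.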