Simple XML Parsing in Python

I’m programming in Python and trying to read the status of my Yamaha network stereo. I get the following XML response:

<YAMAHA_AV rsp="GET" RC="0">
    <System>
        <Power_Control>
            <Power>On</Power>
        </Power_Control>
    </System>
</YAMAHA_AV>

All good so far. Now I’m trying to find the Power element to get the current status. It is there in the XML, but I can’t locate the Power value when I use an XML parser such as ElementTree.

import requests
from xml.etree import ElementTree as ET

url = "http://192.168.128.199/YamahaRemoteControl/ctrl"

payload = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\r\n<YAMAHA_AV cmd=\"GET\">\r\n<System>\r\n    <Power_Control>\r\n        <Power>GetParam</Power>\r\n    </Power_Control>\r\n</System>\r\n</YAMAHA_AV>"
headers = {
    'Content-Type': "text/xml",
    'Cache-Control': "no-cache",
    'Postman-Token': "9ad57fc1-4c78-a921-5967-bef4d2167214"
    }

response = requests.request("POST", url, data=payload, headers=headers)
print(response.text)
rawdata=ET.fromstring(response.content)
result= rawdata.find('Power').text
print (result)
result = rawdata.items()
print(result)

I get a null value. So I stepped through it in the Python interactive shell. When I use the dir() function I can see the findall and findtext attributes on the root, but not on the subelements.

>>> root=ET.parse('yamahaPowerStatus.xml').getroot()
>>> print root
<Element 'YAMAHA_AV' at 0x6386bb0>
>>> PC=root.getchildren()[0].getchildren()
>>> print PC
[<Element 'Power_Control' at 0x6386cf0>]
>>> check= PC.findtext('Power')

Traceback (most recent call last):
  File "<pyshell#40>", line 1, in <module>
    check= PC.findtext('Power')
AttributeError: 'list' object has no attribute 'findtext'

>>> dir(PC)
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__delslice__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getslice__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__setslice__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']

>>> dir(root)
['__class__', '__delattr__', '__delitem__', '__dict__', '__doc__', '__format__', '__getattribute__', '__getitem__', '__hash__', '__init__', '__len__', '__module__', '__new__', '__nonzero__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_children', 'append', 'attrib', 'clear', 'copy', 'extend', 'find', 'findall', 'findtext', 'get', 'getchildren', 'getiterator', 'insert', 'items', 'iter', 'iterfind', 'itertext', 'keys', 'makeelement', 'remove', 'set', 'tag', 'tail', 'text']

So what am I doing wrong? The XML response appears to be properly formatted. I tried other XML parsers and got the same result. I also tried converting it to JSON first, again with the same result. I watched YouTube videos and searched here for something similar, with no luck. I think I’m doing something fundamentally wrong.
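
For reference, a minimal sketch of what seems to be going wrong in the snippets above, using a hard-coded copy of the response so it runs without the receiver. In ElementTree, find('Power') with a bare tag name only looks at the direct children of the element it is called on, so from the root it returns None. A path such as './/Power' searches all descendants instead. The AttributeError in the interactive session comes from calling findtext on the list of children rather than on a single element. The variable names here are only illustrative.

```python
from xml.etree import ElementTree as ET

# Hard-coded copy of the receiver's reply, so no network access is needed.
SAMPLE = """<YAMAHA_AV rsp="GET" RC="0">
    <System>
        <Power_Control>
            <Power>On</Power>
        </Power_Control>
    </System>
</YAMAHA_AV>"""

root = ET.fromstring(SAMPLE)

# find() with a bare tag name only checks the direct children of root,
# and Power is three levels down, so this is None.
direct = root.find('Power')

# './/Power' searches every descendant of root.
anywhere = root.findtext('.//Power')

# A full relative path from the root element also works.
explicit = root.findtext('System/Power_Control/Power')

# Indexing an element yields Element objects, which do have findtext().
power_control = root[0][0]
from_child = power_control.findtext('Power')

print(direct, anywhere, explicit, from_child)
```

Note that the path is relative to the root element, so YAMAHA_AV itself does not appear in it.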

You will probably get a better response on a Python or XML forum. The XML itself looks valid but there are not many Python or XML experts on this forum.

If you were using this from an OH Item, I would expect the XPATH transform would be able to extract the value.

Thanks Rich. Do you have an example of how this would be done with XPATH? I’m thinking I would use an HTTP POST Item and then use XPATH to transform the result. But depending on the result, I need to send several other POSTs.

It would be something like

String MyItem "Yamaha Power" {http="<[http://192.168.128.199/YamahaRemoteControl/ctrl:30000:XPATH(//YAMAHA_AV/System/Power_Control/Power)]"}

The above is an Item bound to the HTTP Binding. It will send a GET request to that URL every 30 seconds and set the state of MyItem to the value of the Power element.

Because the result is “On” rather than “ON”, we can’t assign it to a Switch Item directly. You would have to use a Rule and update a Proxy Item.

Thanks Rich for the example. I copied it to my items file and created a text item in the sitemap but no value appears. I guess I need to troubleshoot it.

The example won’t work without installing and configuring the HTTP binding.

https://docs.openhab.org/addons/bindings/http1/readme.html

…and the XPATH transformation. Now I’m getting an exception

2018-01-11 14:38:35.243 [WARN ] [ab.binding.http.internal.HttpBinding] - Transformation ‘XPATH(//YAMAHA_AV/System/Power_Control/Power)’ threw an exception. [response=]

org.openhab.core.transform.TransformationException: transformation throws exceptions

at org.openhab.core.transform.TransformationHelper$TransformationServiceDelegate.transform(TransformationHelper.java:67) [221:org.openhab.core.compat1x:2.3.0.201801092213]

at org.openhab.binding.http.internal.HttpBinding.execute(HttpBinding.java:194) [229:org.openhab.binding.http:1.12.0.201801080210]

at org.openhab.core.binding.AbstractActiveBinding$BindingActiveService.execute(AbstractActiveBinding.java:144) [221:org.openhab.core.compat1x:2.3.0.201801092213]

at org.openhab.core.service.AbstractActiveService$RefreshThread.run(AbstractActiveService.java:166) [221:org.openhab.core.compat1x:2.3.0.201801092213]

I suspect it is not parsing the response

Drop the transform and look in the log to see exactly what is being returned. I suspect there is something about the response making it not valid xml.
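
One way to run the same check outside openHAB is to feed the raw body to a parser and see whether it fails. The sketch below is only an illustration: describe_body is a hypothetical helper, and the empty string mirrors the [response=] in the warning above, which suggests the binding received an empty body.

```python
from xml.etree import ElementTree as ET


def describe_body(body):
    """Return the root tag if body parses as XML, else a short reason.

    Hypothetical helper for illustration only.
    """
    if not body.strip():
        return "empty body"
    try:
        return ET.fromstring(body).tag
    except ET.ParseError as err:
        return "not valid XML: {}".format(err)


# An empty body, a well-formed reply, and a truncated reply.
print(describe_body(""))
print(describe_body('<YAMAHA_AV rsp="GET" RC="0"><System/></YAMAHA_AV>'))
print(describe_body("<YAMAHA_AV>"))
```

An empty body would make any XML transformation fail regardless of the XPath expression used.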

I dropped the transform from the Item definition, but then I got a parsing error. Do I need to append a different regexp?

String MyItem "Yamaha Power" {http="<[http://192.168.128.199/YamahaRemoteControl/ctrl:30000]"}

2018-01-11 17:11:04.380 [ERROR] [el.item.internal.GenericItemProvider] - Binding configuration of type ‘http’ of item ‘MyItem’ could not be parsed correctly.

org.eclipse.smarthome.model.item.BindingConfigParseException: bindingConfig ‘http://192.168.128.199/YamahaRemoteControl/ctrl:30000’ doesn’t represent a valid in-binding-configuration. A valid configuration is matched by the RegExp ‘(.?)({.})?:(?!//)(\d*):(.*)’

I should have been more specific. Replace the transform with “default”. I don’t think you can have it without something in the transformation section.

default will just pass the data through to the Item unchanged.

If that doesn’t work, try REGEX((.*)) which matches everything.

I finally got this to work. The following Python script turns on my kitchen radio and switches it to AUX2, where the Chromecast Audio is connected. The Exec binding is installed, and I created an Exec Thing where I entered the python command and the path to the script. The interval is set to 0 so it won’t run automatically. In the rules section, I created a rule that runs the exec command when Google Play starts playing.

import time

import requests
from xml.etree import ElementTree as ET

url = "http://192.168.128.199/YamahaRemoteControl/ctrl"

PowerStatus = '<YAMAHA_AV cmd="GET"><System><Power_Control><Power>GetParam</Power></Power_Control></System></YAMAHA_AV>'
PowerOn = '<YAMAHA_AV cmd="PUT"><System><Power_Control><Power>On</Power></Power_Control></System></YAMAHA_AV>'
SetAUX2 = '<YAMAHA_AV cmd="PUT"><System><Input><Input_Sel>AUX2</Input_Sel></Input></System></YAMAHA_AV>'
SetTUNER = '<YAMAHA_AV cmd="PUT"><System><Input><Input_Sel>TUNER</Input_Sel></Input></System></YAMAHA_AV>'

headers = {
    'Content-Type': "text/xml",
    'Cache-Control': "no-cache"
    }


def send(command):
    # POST one XML command to the receiver and return the HTTP response.
    return requests.post(url, data=command, headers=headers, timeout=5)


def power_status():
    # Ask for the power state; findtext returns None if the element is missing.
    tree = ET.fromstring(send(PowerStatus).content)
    return tree.findtext('.//Power_Control/Power')


if power_status() != 'On':
    send(PowerOn)
    # Poll once a second until the receiver reports On, giving up after
    # roughly 30 seconds instead of looping forever.
    for _ in range(30):
        if power_status() == 'On':
            break
        time.sleep(1)

# Switch to AUX2, where the Chromecast Audio is connected.
send(SetAUX2)