I’m trying to get the ID from a waypoint in my gpx-file. The ID is placed in the extension tag of my file. I’m using gpxpy to get other values like the latitude and longitude from the file, but I didn’t find a way to get the ID.
Here you can see my code:
import gpxpy
node_id = []
gpx_file = open("test.gpx", mode='rt', encoding='utf-8')
gpx = gpxpy.parse(gpx_file)
for waypoint in gpx.waypoints:
    node_id.append(waypoint.extensions.id)
And a part of my test.gpx-file:
<wpt lat="53.865650" lon="10.684415">
<extensions>
<ogr:id>17</ogr:id>
<ogr:longitude>10.684415</ogr:longitude>
<ogr:latitude>53.865650</ogr:latitude>
</extensions>
</wpt>
Is there a way to get the id of the waypoint with gpxpy?
waypoint.extensions is just a list, so you can't get an item by name; you have to iterate through it. The "name" of each extension is stored in the element's "tag" property and its value in the "text" property. As I don't have your XML schema to test the ogr:id extension with, I tried the following GPX file instead:
<?xml version="1.0" encoding="UTF-8" ?>
<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1" creator="OSMTracker for Android™ - https://github.com/labexp/osmtracker-android"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd ">
<wpt lat="10.31345465" lon="10.21237815">
<extensions>
<id>17</id>
</extensions>
<ele>110.0</ele>
<time>2018-09-29T09:31:58Z</time>
<name><![CDATA[train station]]></name>
<sat>0</sat>
</wpt>
</gpx>
I wrote a short function to get the ID. It has not been tested against edge cases (for example, a waypoint that has no extensions).
import gpxpy
def getId(waypoint):
    # Return the text of the first extension element whose tag is 'id'.
    for extension in waypoint.extensions:
        if extension.tag == 'id':
            return extension.text
node_id = []
gpx_file = open("test2.gpx", mode='rt', encoding='utf-8')
gpx = gpxpy.parse(gpx_file)
for waypoint in gpx.waypoints:
    print(getId(waypoint))
The function takes a GPX waypoint as its argument and loops through the extensions list. If the list contains an element with the tag (name) "id", it returns that element's text (value).
Best regards
Thimo
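A note on the original file, where the tag is ogr:id rather than id: ElementTree-style parsers report a namespaced tag as "{namespace-uri}id", so a plain comparison with 'id' may not match. The sketch below assumes the extensions are ElementTree-style elements, as described above, and compares only the local part of the tag. The helper names and the namespace URI are made up for illustration, and a hand-built element list stands in for waypoint.extensions so that it runs without gpxpy.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip a leading '{namespace-uri}' from an ElementTree tag, if present."""
    return tag.rsplit('}', 1)[-1]

def extension_text(extensions, name):
    """Return the text of the first extension whose local tag name matches, else None."""
    for extension in extensions or []:
        if local_name(extension.tag) == name:
            return extension.text
    return None

# Stand-in for waypoint.extensions; the namespace URI is a placeholder assumption.
sample = ET.fromstring(
    '<extensions xmlns:ogr="http://example.com/ogr">'
    '<ogr:id>17</ogr:id>'
    '<ogr:longitude>10.684415</ogr:longitude>'
    '</extensions>'
)
extensions = list(sample)
print(extension_text(extensions, 'id'))
```

With a real file you would pass waypoint.extensions instead of the hand-built list; a waypoint without extensions simply yields None.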
I am learning my way around Python and need a little help. I have an XML file from a SOAP API that I am failing to convert to CSV. I managed to get the data easily with the requests library. My struggle is converting it to CSV: I end up with headers but no values.
My XML data:
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<Level2 xmlns="https://xxxxxxxxxx/xxxxxxx">
<Level3>
<ResponseStatus>Success</ResponseStatus>
<ErrorMessage/>
<Message>20 alert(s) generated for this period</Message>
<ProcessingTimeSecs>0.88217689999999993</ProcessingTimeSecs>
<Something1>1</Something1>
<Something2/>
<Something3/>
<Something4/>
<VIP>
<MainVIP>
<Date>20210616</Date>
<RegisteredDate>20210216</RegisteredDate>
<Type>YMBA</Type>
<TypeDescription>TYPE OF ENQUIRY</TypeDescription>
<BusinessName>COMPANY NAME</BusinessName>
<ITNumber>987654321</ITNumber>
<RegistrationNumber>123456789</RegistrationNumber>
<SubscriberNumber>55889977</SubscriberNumber>
<SubscriberReference/>
<TicketNumber>1122336655</TicketNumber>
<SubscriberName>COMPANY NAME 2 </SubscriberName>
<CompletedDate>20210615</CompletedDate>
</MainVIP>
</VIP>
<Something5/>
<Something6/>
<Something7/>
<Something8/>
<Something9/>
<PrincipalSomething10/>
<PrincipalSomething11/>
<PrincipalSomething12/>
<PrincipalSomething13/>
<Something14/>
<Something15/>
<Something16/>
<Something17/>
<Something18/>
<PrincipalSomething19/>
<PrincipalSomething20/>
</Level3>
</Level2>
</soap:Body>
</soap:Envelope>
My Python code looks like this:
import xml.etree.ElementTree as ET
import pandas as pd
cols = ['Date', 'RegisteredDate', 'Type',
        'TypeDescription']
rows = []
# parse xml file
xmlparse = ET.parse('xmldata.xml')
root = xmlparse.getroot()
for i in root:
    Date = i.get('Date').text
    RegisteredDate = i.get('RegisteredDate').text
    Type = i.get('Type').text
    TypeDescription = i.get('TypeDescription').text
    rows.append({'Date': Date,
                 'RegisteredDate': RegisteredDate,
                 'Type': Type,
                 'TypeDescription': TypeDescription})
df = pd.DataFrame(rows, columns=cols)
print(df)
df.to_csv('csvdata.csv')
In my approach, I was following the idea from here https://www.geeksforgeeks.org/convert-xml-to-csv-in-python/
You probably don't need to go through ElementTree; you can feed the xml directly to pandas. If I understand you correctly, this should do it:
df = pd.read_xml(path_to_file, xpath="//*[local-name()='MainVIP']")  # xpath passed by keyword, as newer pandas versions require
df = df.iloc[:,:4]
df
Output from your xml above:
Date RegisteredDate Type TypeDescription
0 20210616 20210216 YMBA TYPE OF ENQUIRY
Without any external library, the code below generates a CSV file.
The idea is to collect the required element data from each MainVIP element and store it in a list of dicts, then loop over the list and write the data to a file.
import xml.etree.ElementTree as ET
xml = ''' <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<Level2 xmlns="https://xxxxxxxxxx/xxxxxxx">
<Level3>
<ResponseStatus>Success</ResponseStatus>
<ErrorMessage/>
<Message>20 alert(s) generated for this period</Message>
<ProcessingTimeSecs>0.88217689999999993</ProcessingTimeSecs>
<Something1>1</Something1>
<Something2/>
<Something3/>
<Something4/>
<VIP>
<MainVIP>
<Date>20210616</Date>
<RegisteredDate>20210216</RegisteredDate>
<Type>YMBA</Type>
<TypeDescription>TYPE OF ENQUIRY</TypeDescription>
<BusinessName>COMPANY NAME</BusinessName>
<ITNumber>987654321</ITNumber>
<RegistrationNumber>123456789</RegistrationNumber>
<SubscriberNumber>55889977</SubscriberNumber>
<SubscriberReference/>
<TicketNumber>1122336655</TicketNumber>
<SubscriberName>COMPANY NAME 2 </SubscriberName>
<CompletedDate>20210615</CompletedDate>
</MainVIP>
</VIP>
<Something5/>
<Something6/>
<Something7/>
<Something8/>
<Something9/>
<PrincipalSomething10/>
<PrincipalSomething11/>
<PrincipalSomething12/>
<PrincipalSomething13/>
<Something14/>
<Something15/>
<Something16/>
<Something17/>
<Something18/>
<PrincipalSomething19/>
<PrincipalSomething20/>
</Level3>
</Level2>
</soap:Body>
</soap:Envelope>'''
cols = ['Date', 'RegisteredDate', 'Type',
        'TypeDescription']
rows = []
NS = '{https://xxxxxxxxxx/xxxxxxx}'
root = ET.fromstring(xml)
for vip in root.findall(f'.//{NS}MainVIP'):
    rows.append({c: vip.find(NS+c).text for c in cols})
with open('out.csv','w') as f:
    f.write(','.join(cols) + '\n')
    for row in rows:
        f.write(','.join(row[c] for c in cols) + '\n')
out.csv
Date,RegisteredDate,Type,TypeDescription
20210616,20210216,YMBA,TYPE OF ENQUIRY
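One caveat with joining values by hand: a value that itself contains a comma would break the columns, and an empty element such as SubscriberReference has no text to join. A possible variation, still standard library only, is to let csv.DictWriter do the quoting and to read values with findtext. This is only a sketch: the shortened XML, the placeholder namespace URI and the comma inside TypeDescription are invented for the demonstration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened stand-in for the SOAP response; the namespace URI is a placeholder.
xml_text = '''<Level2 xmlns="https://example.invalid/ns">
  <VIP>
    <MainVIP>
      <Date>20210616</Date>
      <Type>YMBA</Type>
      <TypeDescription>TYPE, OF ENQUIRY</TypeDescription>
      <SubscriberReference/>
    </MainVIP>
  </VIP>
</Level2>'''

NS = '{https://example.invalid/ns}'
cols = ['Date', 'Type', 'TypeDescription', 'SubscriberReference']

root = ET.fromstring(xml_text)
rows = [
    # findtext returns the default when a child is missing; an empty
    # element such as <SubscriberReference/> yields '' as well.
    {c: vip.findtext(NS + c, default='') for c in cols}
    for vip in root.iter(NS + 'MainVIP')
]

buffer = io.StringIO()  # swap for open('out.csv', 'w', newline='') to write a file
writer = csv.DictWriter(buffer, fieldnames=cols, lineterminator='\n')
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The in-memory buffer is only there to keep the example self-contained; the writer calls are the same for a real file.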
Based on a couple of other examples I've found here, I've created a script that builds an XML file from a CSV input using lxml.etree and lxml.builder. It gives me almost what I need. The one thing I'm struggling with is that I also need to include a single-occurrence tag at the top of the data, containing a static value.
Here's my sample data:
ACTION|INV_ACCT_CLASS|EXT_INV_ID|WAREHOUSE_ID|NAME|CNTRY_CD|PHONE|ADDR_STR1|ADDR_STR2|CITY|ST|ZIP|ADD_KEY_NUM
add|2|AAA_00005|1001213|Company 1|US|9995555555|1313 Mockingbird Lane||New York|NY|10001|44433322
add|2|BBB_00008|1004312|Company 2|US|43255511110|Some other address||Stamford|CT|44112|11122233
My code so far:
import lxml.etree
from lxml.builder import E
import csv
with open("filename.csv") as csvfile:
    # Column names follow the header row of the sample data above.
    results = E.paiInv(*(
        E.invrec(
            E.action(row['ACTION']),
            E.investor(
                E.inv_account_class(row['INV_ACCT_CLASS']),
                E.ext_inv_id(row['EXT_INV_ID']),
                E.warehouse_id(row['WAREHOUSE_ID']),
                E.name(row['NAME']),
                E.cntry_cd(row['CNTRY_CD']),
                E.phone(row['PHONE']),
                E.addr_str1(row['ADDR_STR1']),
                E.addr_str2(row['ADDR_STR2']),
                E.city(row['CITY']),
                E.st(row['ST']),
                E.zip(row['ZIP']),
                E.add_key_num(row['ADD_KEY_NUM'])
            )
        ) for row in csv.DictReader(csvfile, delimiter = '|'))
    )
lxml.etree.ElementTree(results).write("OutputFile.xml")
Here's my output so far:
<paiInv>
<invrec>
<action>add</action>
<investor>
<inv_account_class>2</inv_account_class>
<ext_inv_id>AAA_00005</ext_inv_id>
<warehouse_id>1001213</warehouse_id>
<name>Company 1</name>
<cntry_cd>US</cntry_cd>
<phone>9995555555</phone>
<addr_str1>1313 Mockingbird Lane</addr_str1>
<addr_str2></addr_str2>
<city>New York</city>
<st>NY</st>
<zip>10001</zip>
<add_key_num>44433322</add_key_num>
</investor>
</invrec>
<invrec>
<action>add</action>
<investor>
<inv_account_class>2</inv_account_class>
<ext_inv_id>BBB_00008</ext_inv_id>
<warehouse_id>1004312</warehouse_id>
<name>Company 2</name>
<cntry_cd>US</cntry_cd>
<phone>43255511110</phone>
<addr_str1>Some other address</addr_str1>
<addr_str2></addr_str2>
<city>Stamford</city>
<st>NB</st>
<zip>44112</zip>
<add_key_num>11122233</add_key_num>
</investor>
</invrec>
</paiInv>
And the output I need includes one extra (single occurrence) tag, named request_id, occurring at the top of the data, like this:
<paiInv>
<request_id>req44</request_id>
<invrec>
<action>add</action>
<investor>
<inv_account_class>2</inv_account_class>
<ext_inv_id>AAA_00005</ext_inv_id>
<warehouse_id>1001213</warehouse_id>
<name>Company 1</name>
<cntry_cd>US</cntry_cd>
<phone>9995555555</phone>
<addr_str1>1313 Mockingbird Lane</addr_str1>
<addr_str2></addr_str2>
<city>New York</city>
<st>NY</st>
<zip>10001</zip>
<add_key_num>44433322</add_key_num>
</investor>
</invrec>
<invrec>
<action>add</action>
<investor>
<inv_account_class>2</inv_account_class>
<ext_inv_id>BBB_00008</ext_inv_id>
<warehouse_id>1004312</warehouse_id>
<name>Company 2</name>
<cntry_cd>US</cntry_cd>
<phone>43255511110</phone>
<addr_str1>Some other address</addr_str1>
<addr_str2></addr_str2>
<city>Stamford</city>
<st>NB</st>
<zip>44112</zip>
<add_key_num>11122233</add_key_num>
</investor>
</invrec>
</paiInv>
Any suggestions will be appreciated. I haven't been able to get anything other than syntax errors with my attempts to get the extra tag so far.
Before you save the file, try something like:
doc = lxml.etree.ElementTree(results)
ins = lxml.etree.fromstring('<request_id>req44</request_id>')
ins.tail = "\n"
dest = doc.xpath('/paiInv')[0]
dest.insert(0,ins)
print(lxml.etree.tostring(doc).decode())
The output should be what you are looking for.
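Another way to think about it is to put the static element in first and only then add the per-row elements, so nothing has to be inserted afterwards. The sketch below shows that ordering with the standard library's xml.etree.ElementTree instead of lxml (a deliberate swap so that it runs without extra packages), using two columns of the sample data inlined as a string. With lxml.builder the same ordering should be achievable by passing E.request_id('req44') as the first argument to E.paiInv, ahead of the unpacked rows.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Two columns of the sample data, inlined so the sketch runs on its own.
csv_text = "ACTION|EXT_INV_ID\nadd|AAA_00005\nadd|BBB_00008\n"

root = ET.Element('paiInv')
# The single static element goes in first, before any per-row elements.
ET.SubElement(root, 'request_id').text = 'req44'

for row in csv.DictReader(io.StringIO(csv_text), delimiter='|'):
    invrec = ET.SubElement(root, 'invrec')
    ET.SubElement(invrec, 'action').text = row['ACTION']
    investor = ET.SubElement(invrec, 'investor')
    ET.SubElement(investor, 'ext_inv_id').text = row['EXT_INV_ID']

xml_out = ET.tostring(root, encoding='unicode')
print(xml_out)
```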
I need to parse the XML and get the values. I need to read attribute values such as the person's id="01", which I couldn't get with this code. I also need to fetch the values of grandchild nodes; here that is the "siblings" element and its name tags, but I can't hard-code "siblings" to fetch the values. On top of that, I need to handle multiple attributes and join them to form a unique key, which will become a column in the final table.
import xml.dom
import xml.dom.minidom
doc = xml.dom.minidom.parseString('''
<root>
<person id="01">
<name> abc</name>
<age>32</age>
<address>addr123</address>
<siblings>
<name></name>
</siblings>
</person>
<person id="02">
<name> def</name>
<age>22</age>
<address>addr456</address>
<siblings>
<name></name>
<name></name>
</siblings>
</person>
</root>
''')
innerlist=[]
outerlist=[]
def innerHtml(root):
    text = ''
    nodes = [ root ]
    while not nodes==[]:
        node = nodes.pop()
        if node.nodeType==xml.dom.Node.TEXT_NODE:
            text += node.wholeText
        else:
            nodes.extend(node.childNodes)
    return text
for statusNode in doc.getElementsByTagName('person'):
    for childNode in statusNode.childNodes:
        if childNode.nodeType==xml.dom.Node.ELEMENT_NODE:
            if innerHtml(childNode).strip() != '':
                innerlist.append(childNode.nodeName+" "+innerHtml(childNode).strip())
    outerlist.append(innerlist)
    innerlist=[]
#print(outerlist)
attrlist = []
nodes = doc.getElementsByTagName('person')
for node in nodes:
    if 'id' in node.attributes:
        #print(node.attributes['id'].value)
        attrlist.append(node.attributes['id'].value)
#print(attrlist)
dictionary = dict(zip(attrlist, outerlist))
print(dictionary)
Comment: I have stored it in a dictionary. {'01': ['name abc', 'age 32', 'address addr123'], '02': ['name def', 'age 22', 'address addr456']}.
You can't write such a dict to CSV!
ValueError: dict contains fields not in fieldnames: '01'
Do you REALLY want to convert to CSV?
Read about CSV File Reading and Writing
Comment: Here I need to get the siblings tag also as another inner list.
CSV doesn't support such an inner list.
Edit your question and show the expected CSV output!
Question: XML to CSV conversion
Solution with xml.etree.ElementTree.
Note: I don't understand how you want to handle grandchild node values.
This writes them as a list of dicts in one column.
import csv
import xml.etree.ElementTree as ET
# doc must be the XML text as a string here, not the minidom document from the question
root = ET.fromstring(doc)
fieldnames = None
with open('doc.csv', 'w', newline='') as fh:
    for p in root.findall('person'):
        person = {'_id':p.attrib['id']}
        for element in p:
            if len(element) >= 1:
                # element has children: collect them as a list of dicts
                person[element.tag] = []
                for sub_e in element:
                    person[element.tag].append({sub_e.tag:sub_e.text})
            else:
                person[element.tag] = element.text
        if not fieldnames:
            # the first person determines the CSV columns
            fieldnames = sorted(person)
            w = csv.DictWriter(fh, fieldnames=fieldnames)
            w.writeheader()
        w.writerow(person)
Output:
_id,address,age,name,siblings
01,addr123,32, abc,[{'name': 'sib1'}]
02,addr456,, def,"[{'name': 'sib2'}, {'name': 'sib3'}]"
Tested with Python: 3.4.2
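For the two open points in the question (a unique key built from several attributes, and grandchild values without hard-coding "siblings"), one possible reading is sketched below. It is an assumption about the wanted output, not a confirmed answer: all attribute values are joined into one key column, and the text of any grandchild elements is joined into a single cell. The "dept" attribute, the separators and the function name flatten_person are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the "dept" attribute is an assumption added for illustration.
xml_text = '''<root>
  <person id="01" dept="A">
    <name>abc</name>
    <siblings><name>sib1</name></siblings>
  </person>
  <person id="02" dept="B">
    <name>def</name>
    <siblings><name>sib2</name><name>sib3</name></siblings>
  </person>
</root>'''

def flatten_person(person, key_sep='_', value_sep='|'):
    """Turn one <person> element into a flat dict suitable for csv.DictWriter."""
    # Join every attribute value (sorted by attribute name) into one key.
    row = {'key': key_sep.join(person.attrib[a] for a in sorted(person.attrib))}
    for child in person:
        if len(child):  # element has children of its own (grandchildren of person)
            texts = [(g.text or '').strip() for g in child]
            row[child.tag] = value_sep.join(texts)
        else:
            row[child.tag] = (child.text or '').strip()
    return row

rows = [flatten_person(p) for p in ET.fromstring(xml_text).findall('person')]
print(rows)
```

Because the attribute names are sorted, "dept" comes before "id" in the key; pick whatever order and separator suit the final table.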
Please read the entire question before marking it as a duplicate.
I have a nested XML file which I want to convert to a CSV file.
I have to write a Python script for this.
The XML file is:
<?xml version="1.0"?>
<ListOrdersResponse xmlns="https://mws.amazonservices.com/Orders/2013-09-01">
<ListOrdersResult>
<Orders>
<Order>
<LatestShipDate>2015-06-02T18:29:59Z</LatestShipDate>
<OrderType>StandardOrder</OrderType>
<PurchaseDate>2015-05-31T03:58:30Z</PurchaseDate>
<AmazonOrderId>171-6355256-9594715</AmazonOrderId>
<LastUpdateDate>2015-06-01T04:18:58Z</LastUpdateDate>
<ShipServiceLevel>IN Std Domestic</ShipServiceLevel>
<NumberOfItemsShipped>0</NumberOfItemsShipped>
<OrderStatus>Canceled</OrderStatus>
<SalesChannel>Amazon.in</SalesChannel>
<NumberOfItemsUnshipped>0</NumberOfItemsUnshipped>
<IsPremiumOrder>false</IsPremiumOrder>
<EarliestShipDate>2015-05-31T18:30:00Z</EarliestShipDate>
<MarketplaceId>A21TJRUUN4KGV</MarketplaceId>
<FulfillmentChannel>MFN</FulfillmentChannel>
<IsPrime>false</IsPrime>
<ShipmentServiceLevelCategory>Standard</ShipmentServiceLevelCategory>
</Order>
<Order>
<LatestShipDate>2015-06-02T18:29:59Z</LatestShipDate>
<OrderType>StandardOrder</OrderType>
<PurchaseDate>2015-05-31T04:50:07Z</PurchaseDate>
<BuyerEmail>dr7h1rhy6457rng#marketplace.amazon.in</BuyerEmail>
<AmazonOrderId>403-5551715-2566754</AmazonOrderId>
<LastUpdateDate>2015-06-01T07:52:49Z</LastUpdateDate>
<ShipServiceLevel>IN Exp Dom 2</ShipServiceLevel>
<NumberOfItemsShipped>2</NumberOfItemsShipped>
<OrderStatus>Shipped</OrderStatus>
<SalesChannel>Amazon.in</SalesChannel>
<ShippedByAmazonTFM>false</ShippedByAmazonTFM>
<LatestDeliveryDate>2015-06-06T18:29:59Z</LatestDeliveryDate>
<NumberOfItemsUnshipped>0</NumberOfItemsUnshipped>
<BuyerName>Ajit Nair</BuyerName>
<EarliestDeliveryDate>2015-06-02T18:30:00Z</EarliestDeliveryDate>
<OrderTotal>
<CurrencyCode>INR</CurrencyCode>
<Amount>938.00</Amount>
</OrderTotal>
<IsPremiumOrder>false</IsPremiumOrder>
<EarliestShipDate>2015-05-31T18:30:00Z</EarliestShipDate>
<MarketplaceId>A21TJRUUN4KGV</MarketplaceId>
<FulfillmentChannel>MFN</FulfillmentChannel>
<TFMShipmentStatus>Delivered</TFMShipmentStatus>
<PaymentMethod>Other</PaymentMethod>
<ShippingAddress>
<StateOrRegion>MAHARASHTRA</StateOrRegion>
<City>THANE</City>
<Phone>9769994355</Phone>
<CountryCode>IN</CountryCode>
<PostalCode>400709</PostalCode>
<Name>Ajit Nair</Name>
<AddressLine1>C-25 / con-7 / Chandralok CHS</AddressLine1>
<AddressLine2>Sector-10 ,Koper khairne</AddressLine2>
</ShippingAddress>
<IsPrime>false</IsPrime>
<ShipmentServiceLevelCategory>Expedited</ShipmentServiceLevelCategory>
</Order>
I tried to get the values in the form of a list, but my code doesn't print anything.
My Code:
from xml.etree import ElementTree
with open('orders.xml', 'rb') as f:
    tree = ElementTree.parse(f)
for node in tree.findall('.//Order'):
    oid = node.attrib.get('SellerOrderId')
    if oid:
        print(oid)
What is wrong with my code?
EDIT: Temporary link to complete File Orders.xml
Your XML has a default namespace defined here:
<ListOrdersResponse xmlns="https://mws.amazonservices.com/Orders/2013-09-01">
Note that descendant elements inherit the ancestor's default namespace implicitly, unless otherwise specified. You need to combine the namespace and the local name to form a fully qualified element name, for example:
ns = {'d': 'https://mws.amazonservices.com/Orders/2013-09-01'}
for node in tree.findall('.//d:Order', ns):
    oid = node.attrib.get('SellerOrderId')
    if oid:
        print(oid)
According to the full XML file you linked to, SellerOrderId is a child element of Order rather than an attribute. In that case, you can simply use .//d:Order/d:SellerOrderId to select the elements and then print each one's text, like so:
ns = {'d': 'https://mws.amazonservices.com/Orders/2013-09-01'}
for node in tree.findall('.//d:Order/d:SellerOrderId', ns):
    print(node.text)
output :
171-1322776-9700344
171-4214129-7148305
402-8263846-7042737
402-7017923-9474716
402-9691237-2887553
171-4614227-7597903
403-6729903-2119563
402-2184564-2676353
171-4520392-2088330
402-7986969-8827533
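To see the effect of the default namespace in isolation, here is a small self-contained sketch. It uses a cut-down version of the XML from the question (only the AmazonOrderId elements shown there), so the IDs differ from the SellerOrderId output above. It compares an unqualified lookup with the two equivalent ways of qualifying the name: a prefix map and the "{uri}name" notation.

```python
import xml.etree.ElementTree as ET

xml_text = '''<ListOrdersResponse xmlns="https://mws.amazonservices.com/Orders/2013-09-01">
  <ListOrdersResult>
    <Orders>
      <Order><AmazonOrderId>171-6355256-9594715</AmazonOrderId></Order>
      <Order><AmazonOrderId>403-5551715-2566754</AmazonOrderId></Order>
    </Orders>
  </ListOrdersResult>
</ListOrdersResponse>'''

root = ET.fromstring(xml_text)
uri = 'https://mws.amazonservices.com/Orders/2013-09-01'
ns = {'d': uri}

# Unqualified name: matches nothing, because every element is in the default namespace.
unqualified = root.findall('.//Order')

# Prefix map and '{uri}name' notation are two spellings of the same qualified name.
with_prefix = [e.text for e in root.findall('.//d:Order/d:AmazonOrderId', ns)]
with_braces = [e.text for e in root.findall(f'.//{{{uri}}}Order/{{{uri}}}AmazonOrderId')]

print(len(unqualified), with_prefix, with_braces)
```

The prefix "d" is arbitrary; only the URI has to match the one declared in the document.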
I have an XML file, Orders.xml (excerpt follows):
<?xml version="1.0"?>
<ListOrdersResponse xmlns="https://mws.amazonservices.com/Orders/2013-09-01">
<ListOrdersResult>
<Orders>
<Order>
<LatestShipDate>2015-06-02T18:29:59Z</LatestShipDate>
<OrderType>StandardOrder</OrderType>
<PurchaseDate>2015-05-31T03:58:30Z</PurchaseDate>
<AmazonOrderId>171-6355256-9594715</AmazonOrderId>
<LastUpdateDate>2015-06-01T04:18:58Z</LastUpdateDate>
<ShipServiceLevel>IN Std Domestic</ShipServiceLevel>
<NumberOfItemsShipped>0</NumberOfItemsShipped>
<OrderStatus>Canceled</OrderStatus>
<SalesChannel>Amazon.in</SalesChannel>
<NumberOfItemsUnshipped>0</NumberOfItemsUnshipped>
<IsPremiumOrder>false</IsPremiumOrder>
<EarliestShipDate>2015-05-31T18:30:00Z</EarliestShipDate>
<MarketplaceId>A21TJRUUN4KGV</MarketplaceId>
<FulfillmentChannel>MFN</FulfillmentChannel>
<IsPrime>false</IsPrime>
<ShipmentServiceLevelCategory>Standard</ShipmentServiceLevelCategory>
</Order>
<Order>
<LatestShipDate>2015-06-02T18:29:59Z</LatestShipDate>
<OrderType>StandardOrder</OrderType>
<PurchaseDate>2015-05-31T04:50:07Z</PurchaseDate>
<BuyerEmail>dr7h1rhy6457rng#marketplace.amazon.in</BuyerEmail>
<AmazonOrderId>403-5551715-2566754</AmazonOrderId>
<LastUpdateDate>2015-06-01T07:52:49Z</LastUpdateDate>
<ShipServiceLevel>IN Exp Dom 2</ShipServiceLevel>
<NumberOfItemsShipped>2</NumberOfItemsShipped>
<OrderStatus>Shipped</OrderStatus>
<SalesChannel>Amazon.in</SalesChannel>
<ShippedByAmazonTFM>false</ShippedByAmazonTFM>
<LatestDeliveryDate>2015-06-06T18:29:59Z</LatestDeliveryDate>
<NumberOfItemsUnshipped>0</NumberOfItemsUnshipped>
<BuyerName>Ajit Nair</BuyerName>
<EarliestDeliveryDate>2015-06-02T18:30:00Z</EarliestDeliveryDate>
<OrderTotal>
<CurrencyCode>INR</CurrencyCode>
<Amount>938.00</Amount>
</OrderTotal>
<IsPremiumOrder>false</IsPremiumOrder>
<EarliestShipDate>2015-05-31T18:30:00Z</EarliestShipDate>
<MarketplaceId>A21TJRUUN4KGV</MarketplaceId>
<FulfillmentChannel>MFN</FulfillmentChannel>
<TFMShipmentStatus>Delivered</TFMShipmentStatus>
<PaymentMethod>Other</PaymentMethod>
<ShippingAddress>
<StateOrRegion>MAHARASHTRA</StateOrRegion>
<City>THANE</City>
<Phone>9769994355</Phone>
<CountryCode>IN</CountryCode>
<PostalCode>400709</PostalCode>
<Name>Ajit Nair</Name>
<AddressLine1>C-25 / con-7 / Chandralok CHS</AddressLine1>
<AddressLine2>Sector-10 ,Koper khairne</AddressLine2>
</ShippingAddress>
<IsPrime>false</IsPrime>
<ShipmentServiceLevelCategory>Expedited</ShipmentServiceLevelCategory>
</Order>
</Orders>
<CreatedBefore>2015-06-08T06:45:22Z</CreatedBefore>
<NextToken>smN7fNREdZyaJqJYLDm0ZIfVkJJPpovRb7YcCAmB0tlUojdU4H46trQzazHyYVyLqBXdLk4iogxpJASl2BeRezElfc2tdWR3lK0FtvOjoEqUrelVme04kSJ0wMvlylZkWQWPqGlbsnPaEpJjLWtrc27Vm9nDvRdgFtvOhjiqTWA16vKmtecRgbuZIF9n45mtnrZ4AbBdBTdge/hBzh1HtoVw85GaTVKBVfeXMWcfhX25HmwX5IAmwKfxnqm3JqvZ0Rjw/YZARKQMcjl5+H0CsJGesRwkZOQCBLVDshZ93sFo8v4Do3XuodaFg8ZGJDSTcawcthgh/MGM4KOIYd79q7Aq3I/8b9+STDy5JVgPyI0jQ6ftKc7EcAIwpq2cHuPbP+HgZXNbc7qI4HDvHa5YloEDUrIQbaP8qbwRHLZm6VTmGvVwLKwj6AZ0GNanrGO6</NextToken>
</ListOrdersResult>
<ResponseMetadata>
<RequestId>f2b55344-d281-4bd3-b8b3-788be07b7656</RequestId>
</ResponseMetadata>
</ListOrdersResponse>
I am using a Python script to parse data from the XML file. I want two fields from the file: AmazonOrderId and BuyerName. Some Order elements might not have a BuyerName. When I parse both fields individually, I get a list of 100 AmazonOrderId values and 70 BuyerName values.
I want an empty string instead of nothing, i.e. if an Order doesn't have a buyer name, I want to include '' in its place.
My Code:
from xml.etree import ElementTree
with open('orders.xml', 'rb') as f:
    tree = ElementTree.parse(f)
ns = {'d': 'https://mws.amazonservices.com/Orders/2013-09-01'}
oid = []
bn = []
for node in tree.findall('.//d:Order/d:AmazonOrderId', ns):
    oid.append(node.text)
for node in tree.findall('.//d:Order/d:BuyerName', ns):
    bn.append(node.text)
print(oid)
print(bn)
You can do it in a single loop using findtext(), specifying an empty string as the default:
for node in tree.findall('.//d:Order', namespaces=ns):
    oid.append(node.findtext("d:AmazonOrderId", default='', namespaces=ns))
    bn.append(node.findtext("d:BuyerName", default='', namespaces=ns))
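A runnable miniature of the same idea, in case it helps to try it out: the XML below is a trimmed stand-in for Orders.xml with one order that has no BuyerName. Collecting the two values as a pair per Order (my own variation on the two lists above) keeps each buyer name on the same row as its order ID.

```python
import xml.etree.ElementTree as ET

xml_text = '''<Orders xmlns="https://mws.amazonservices.com/Orders/2013-09-01">
  <Order>
    <AmazonOrderId>171-6355256-9594715</AmazonOrderId>
  </Order>
  <Order>
    <AmazonOrderId>403-5551715-2566754</AmazonOrderId>
    <BuyerName>Ajit Nair</BuyerName>
  </Order>
</Orders>'''

ns = {'d': 'https://mws.amazonservices.com/Orders/2013-09-01'}
root = ET.fromstring(xml_text)

# One (order id, buyer name) pair per Order, so the two columns stay aligned.
pairs = [
    (order.findtext('d:AmazonOrderId', default='', namespaces=ns),
     order.findtext('d:BuyerName', default='', namespaces=ns))
    for order in root.findall('d:Order', ns)
]
print(pairs)
```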