Python lxml and stdin

I have an XML file, books.xml (http://msdn.microsoft.com/en-us/library/ms762271(VS.85).aspx).
I would like to cat books.xml and get every book id together with the genre for that book id.
Something like:
cat books.xml | python reader.py
Any tips or help would be appreciated. Thanks.

To read an XML file from stdin, just use etree.parse. This function accepts a file object, which can be sys.stdin.
import sys
from lxml import etree
tree = etree.parse(sys.stdin)
print([(b.get('id'), b.findtext('genre')) for b in tree.iterfind('book')])
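
If lxml is not installed, the standard library's xml.etree.ElementTree offers the same parse/iterfind calls, so the approach carries over. The following is a minimal sketch under that assumption, not the answer's own code: book_genres is a hypothetical helper, and SAMPLE is a trimmed stand-in for books.xml so the snippet runs without a pipe.

```python
import io
import xml.etree.ElementTree as ET


def book_genres(stream):
    """Return (id, genre) pairs for every <book> directly under the root."""
    tree = ET.parse(stream)
    return [(b.get("id"), b.findtext("genre")) for b in tree.iterfind("book")]


# Hypothetical two-book sample in the shape of books.xml.
SAMPLE = b"""<?xml version="1.0"?>
<catalog>
  <book id="bk101"><genre>Computer</genre></book>
  <book id="bk102"><genre>Fantasy</genre></book>
</catalog>"""

# An in-memory byte stream stands in for stdin here.
pairs = book_genres(io.BytesIO(SAMPLE))
for book_id, genre in pairs:
    print(book_id, genre)
```

In reader.py itself you would call book_genres(sys.stdin.buffer) after importing sys, so the parser receives raw bytes and can honour the encoding declared in the XML prolog.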

Related

searching between two csv output in python

I have written this code, which connects to a server over SSH using the paramiko module and collects the output of a couple of commands in CSV format. Here are the code and its output:
stdin, stdout, stderr = ssh_client.exec_command('isi nfs exports list --verbose --format=csv')
nfs_exports = (stdout.read().decode(encoding='ascii'))
stdin, stdout, stderr = ssh_client.exec_command('isi sync policies list --format=csv | grep True')
active_sync = (stdout.read().decode(encoding='ascii'))
print(nfs_exports)
16,System,"/test/true/usa/synctest",true
29,System,"/test/lab/Lab_File_Pool_1",false
32,System,"/test/vipr/Lab_File_Pool_1",false
33,System,"/test/testing2/apps001",null
print(active_sync)
synctest,/test/nam/test/synctest,sync,True,target.domain123.com
synctest,/test/lab/Lab_File_Pool_1,sync,True,target.domain123.com
synctest,/test/nar/usa/synctest,sync,True,target.domain123.com
synctest,/test/testing2/apps001,sync,True,target.domain123.com
synctest,/test/true/usa/synctest,sync,True,target.domain123.com
Now the challenging part for me: I need to search for each path from the nfs_exports output (for example "/test/true/usa/synctest") in the active_sync output. If a path matches, I need to create new CSV output with all the information from nfs_exports for that path.
Desired output:
33,System,"/test/testing2/apps001",null
29,System,"/test/lab/Lab_File_Pool_1",false

First, I would parse the CSV strings into a more usable format. You could use the csv library or re (regular expressions).
Then, if I understand the question correctly, it should just be a simple double-nested for loop.
for each line in one output:
    for each line in the other output:
        do the paths on the two lines match?
In Python, that looks like:
import csv
import io
def parse_csv(string):
    # wrap the string in a file-like object so csv.reader can consume it
    string_file = io.StringIO(string)
    reader = csv.reader(string_file)
    return list(reader)
nfs_exports = parse_csv(nfs_exports)
active_sync = parse_csv(active_sync)
print(nfs_exports)
print(active_sync)
results = []
for active_sync_line in active_sync:
    active_sync_path = active_sync_line[1]  # path is the 2nd column
    for nfs_export_line in nfs_exports:
        nfs_export_path = nfs_export_line[2]  # path is the 3rd column
        if nfs_export_path.strip() == active_sync_path.strip():
            results.append(nfs_export_line)
print("output:")
for line in results:
    print(",".join(line))
This gives the output:
29,System,/test/lab/Lab_File_Pool_1,false
33,System,/test/testing2/apps001,null
16,System,/test/true/usa/synctest,true
This is a little different from what you posted, but if I understand the question correctly it should be right. If not, please let me know and I can amend this answer.
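
For larger outputs, the nested loop can be replaced by a set lookup. This is a sketch of that variant, not a change to the answer above: exports_with_active_sync is a hypothetical name, the column positions are taken from the sample output in the question, and the sample strings are copied from it (with the first export line assumed to end in a plain true). Rows come back in nfs_exports order.

```python
import csv
import io


def parse_csv(text):
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text)))


def exports_with_active_sync(nfs_exports_csv, active_sync_csv):
    """Return the NFS export rows whose path also appears in a sync policy."""
    # Path is column 1 of the sync listing and column 2 of the export listing.
    synced_paths = {row[1].strip() for row in parse_csv(active_sync_csv) if row}
    return [row for row in parse_csv(nfs_exports_csv)
            if row and row[2].strip() in synced_paths]


nfs_exports_csv = '''16,System,"/test/true/usa/synctest",true
29,System,"/test/lab/Lab_File_Pool_1",false
32,System,"/test/vipr/Lab_File_Pool_1",false
33,System,"/test/testing2/apps001",null
'''
active_sync_csv = '''synctest,/test/nam/test/synctest,sync,True,target.domain123.com
synctest,/test/lab/Lab_File_Pool_1,sync,True,target.domain123.com
synctest,/test/nar/usa/synctest,sync,True,target.domain123.com
synctest,/test/testing2/apps001,sync,True,target.domain123.com
synctest,/test/true/usa/synctest,sync,True,target.domain123.com
'''

matches = exports_with_active_sync(nfs_exports_csv, active_sync_csv)
for row in matches:
    print(",".join(row))
```

Building the set once makes each membership test constant time, so the work grows with the sum of the two outputs rather than their product.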

Unable to create a formatted JSON file in python

I am a newbie in Python (using Python 2.7) and I am trying to write a JSON file like this:
import os
import json
BUILDNUMBER = "1.0.0"
class Foo(object):
    def __init__(self):
        self.buildNumber = BUILDNUMBER
foo = Foo()
s = json.dumps(foo.__dict__)
os.system("echo {0} > ./build.json".format(s))
The contents of build.json look like this:
{buildNumber: 1.0.0}
I want it to look like this
{"buildNumber" : "1.0.0"}
Any help is appreciated.

No, you do not use os.system to call echo to redirect to a file. Never. In Python. Like ever.
Since no one showed how to do it right, this is how you write a JSON file in Python:
with open('./build.json', 'w') as f:
    json.dump(foo.__dict__, f)
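
Since the question asks for a formatted file, here is a self-contained sketch that also passes indent to json.dump. The Foo class and BUILDNUMBER come from the question; write_build_info and the temporary output directory are assumptions made so the example can run anywhere.

```python
import json
import os
import tempfile

BUILDNUMBER = "1.0.0"


class Foo(object):
    def __init__(self):
        self.buildNumber = BUILDNUMBER


def write_build_info(path, obj):
    """Serialize obj's attributes to path as indented JSON."""
    with open(path, "w") as f:
        json.dump(vars(obj), f, indent=4, sort_keys=True)
        f.write("\n")  # end the file with a newline


# A temporary directory stands in for the project directory here.
out_path = os.path.join(tempfile.mkdtemp(), "build.json")
write_build_info(out_path, Foo())

with open(out_path) as f:
    written = f.read()
print(written)
```

Because json.dump does the quoting itself, the keys and values keep their double quotes, which is exactly what the shell stripped in the echo version.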

How to create Jenkins Job with Customized XML in python

I am trying to create a Jenkins job using the jenkins module in Python. I can connect to Jenkins successfully and perform get job_count as well as the create_job() method.
With create_job() I can only get this to work using the "jenkins.EMPTY_CONFIG_XML" parameter. How do I pass my own XML config file? I have the config saved locally and want to pass it in place of EMPTY_CONFIG_XML. I tried a few things that didn't work, and I couldn't find an answer online. The code below works: it creates TestJob with EMPTY_CONFIG_XML. Can someone please help me pass a customized XML file? Thank you for your help!
import jenkins
import xml.etree.ElementTree as ET
server = jenkins.Jenkins("http://x.x.x.x:8080", username="foo", password="baar")
#print server.get_whoami()
server.create_job("TestJob",jenkins.EMPTY_CONFIG_XML)

Looking through the documentation for create_job, the config_xml parameter should be passed as a string representation of the XML.
I used xml.etree.ElementTree to parse the XML file and convert it into a string:
import jenkins
import xml.etree.ElementTree as ET
# path_to_config_file, url, username, password and job_name are placeholders
# that you need to define for your own setup.
def convert_xml_file_to_str():
    tree = ET.parse(path_to_config_file)
    root = tree.getroot()
    return ET.tostring(root, encoding='utf8', method='xml').decode()
def main():
    target_server = jenkins.Jenkins(url, username=username, password=password)
    config = convert_xml_file_to_str()
    target_server.create_job(job_name, config)
main()
I found this thread very helpful in understanding how to parse XML files, it also has a nice explanation about differences between Python2/3.
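
Because create_job only needs the XML as a string, reading the file as text should work as well. The sketch below is an alternative under that assumption, not the approach used in the answer: load_config_xml is a hypothetical helper, the sample config is a made-up minimal project rather than a real Jenkins job, and the file is assumed to be UTF-8. The parse step is only there to fail early on malformed XML.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def load_config_xml(path):
    """Read a config.xml file as text after checking that it is well-formed."""
    with open(path, "rb") as f:
        raw = f.read()
    ET.fromstring(raw)  # raises xml.etree.ElementTree.ParseError if malformed
    return raw.decode("utf-8")  # assumption: the file is UTF-8 encoded


# Hypothetical minimal config written to a temporary file for the demo.
config_path = os.path.join(tempfile.mkdtemp(), "config.xml")
with open(config_path, "w", encoding="utf-8") as f:
    f.write("<?xml version='1.0' encoding='UTF-8'?>\n"
            "<project><description>Created from a local file</description></project>\n")

config = load_config_xml(config_path)
print(config)

# With a jenkins.Jenkins instance as in the question:
# server.create_job("TestJob", config)
```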

Error validating/parsing xml file against xsd with lxml/objectify in Python

In Python/Django, I need to parse and objectify an .xml file according to a given XML Schema made of three .xsd files that refer to each other as follows:
schema3.xsd (referring schema1.xsd)
schema2.xsd (referring schema1.xsd)
schema1.xsd (referring schema2.xsd)
xml schemas import
For this I'm using the following piece of code, which I have already tested successfully with a single xml/xsd pair (where the .xsd is standalone and does not refer to other .xsd files):
import lxml
import os.path
from lxml import etree, objectify
from lxml.etree import XMLSyntaxError
def xml_validator(request):
    # define path of files
    path_file_xml = '../myxmlfile.xml'
    path_file_xsd = '../schema3.xsd'
    # get file XML
    xml_file = open(path_file_xml, 'r')
    xml_string = xml_file.read()
    xml_file.close()
    # get XML Schema
    doc = etree.parse(path_file_xsd)
    schema = etree.XMLSchema(doc)
    # define parser
    parser = objectify.makeparser(schema=schema)
    # transform XML file
    root = objectify.fromstring(xml_string, parser)
    test1 = root.tag
    return render(request, 'risultati.html', {'risultato': test1})
Unfortunately, I'm stuck on the following error, which I get with the multiple .xsd files described above:
complex type 'ObjectType': The content model is not determinist.
Request Method: GET
Request URL: http://127.0.0.1:8000/xml_validator
Django Version: 1.9.1
Exception Type: XMLSchemaParseError
Exception Value: complex type 'ObjectType': The content model is not determinist., line 80
Any ideas about that?
Thanks a lot in advance for any suggestions or useful tips on how to approach this problem.
Cheers
Update 23/03/2016
Here (and in the following answers to the post, because it exceeds the maximum number of characters for a post) is a sample of the files to help figure out the problem:
sample files on GitHub

My best guess would be that your XSD model does not obey the Unique Particle Attribution rule. I would rule that out before looking at anything else.
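
To illustrate what such a violation can look like, here is a sketch with two made-up schemas; neither is taken from the files in the question. In the first, a validator that has just read one item element cannot tell whether it matched the optional particle or the required one, which is the kind of ambiguity the rule forbids, so libxml2 is expected to reject it (the exact message may vary by version). The second states the same content, one or two item elements, unambiguously. compile_schema is a hypothetical helper that reports the line numbers from lxml's error log.

```python
from lxml import etree

XS = 'xmlns:xs="http://www.w3.org/2001/XMLSchema"'

# Hypothetical schema with an ambiguous content model: item?, item
AMBIGUOUS_XSD = ('''<xs:schema %s>
  <xs:complexType name="ObjectType">
    <xs:sequence>
      <xs:element name="item" type="xs:string" minOccurs="0"/>
      <xs:element name="item" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>''' % XS).encode()

# Same allowed content (one or two <item> elements), stated unambiguously.
FIXED_XSD = ('''<xs:schema %s>
  <xs:complexType name="ObjectType">
    <xs:sequence>
      <xs:element name="item" type="xs:string" maxOccurs="2"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>''' % XS).encode()


def compile_schema(xsd_bytes):
    """Return (schema, messages); schema is None if compilation failed."""
    try:
        return etree.XMLSchema(etree.fromstring(xsd_bytes)), []
    except etree.XMLSchemaParseError as exc:
        # error_log entries carry the line number and the libxml2 message
        return None, ["line %d: %s" % (e.line, e.message) for e in exc.error_log]


for label, xsd in (("ambiguous", AMBIGUOUS_XSD), ("fixed", FIXED_XSD)):
    schema, messages = compile_schema(xsd)
    print(label, "compiled" if schema is not None else messages)
```

The reported line number (line 80 in the question's traceback) points at the complex type to inspect for this kind of overlap.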

Parse Json data to Excel

I have data in JSON format available at this link:
JSON Data
What would be the best way to get this into Excel? I know this could be done with Python, but I'm not sure how.

Use the urllib module to fetch the data from the URL (the snippets in this answer are Python 2).
import urllib
url = "http://www.omdbapi.com/?t=UN%20HOMME%20ID%C3%89AL"
res = urllib.urlopen(url)
print res.code
data = res.read()
Parse the data as JSON with the json module.
import json
data1 = json.loads(data)
Use the xlwt module to create the xls file.
data = {"Title":"Un homme idéal","Year":"2015","Rated":"N/A",\
"Released":"18 Mar 2015","Runtime":"97 min","Genre":"Thriller",\
"Director":"Yann Gozlan","Writer":"Yann Gozlan, Guillaume Lemans, Grégoire Vigneron",\
"Actors":"Pierre Niney, Ana Girardot, André Marcon, Valeria Cavalli",\
"Plot":"N/A","Language":"French","Country":"France","Awards":"N/A",\
"Poster":"N/A","Metascore":"N/A","imdbRating":"6.3","imdbVotes":"214",\
"imdbID":"tt4058500","Type":"movie","Response":"True"}
import xlwt
book = xlwt.Workbook(encoding="utf-8")
sheet1 = book.add_sheet("AssetsReport0")
column_count = 0
# one column per key: header in row 0, value in row 1
for title, value in data.iteritems():
    sheet1.write(0, column_count, title)
    sheet1.write(1, column_count, value)
    column_count += 1
file_name = "test.xls"
book.save(file_name)
Get URL from User.
By Command Line Argument:
Use sys.argv to get arguments passed from the command.
Demo:
import sys
print "Arguments:", sys.argv
Output:
vivek:~/workspace/vtestproject/study$ python polydict.py arg1 arg2 arg3
Arguments: ['polydict.py', 'arg1', 'arg2', 'arg3']
By the raw_input() / input() function:
Demo:
>>> url = raw_input("Enter url:-")
Enter url:-www.google.com
>>> url
'www.google.com'
>>>
Note:
Use raw_input() for Python 2.x
Use input() for Python 3.x

To get data from a URL in Python (and print it):
import requests
r = requests.get('http://www.omdbapi.com/?t=UN%20HOMME%20ID%C3%89AL')
print(r.text)
To parse JSON in Python:
import requests
import json
r = requests.get('http://www.omdbapi.com/?t=UN%20HOMME%20ID%C3%89AL')
data = json.loads(r.text)
json.loads returns the parsed JSON as a Python object (a dict for this response).
To convert from JSON to TSV you may use tablib.
To create an Excel document in Python you may use openpyxl (more tools at python-excel.org).
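
If a file that Excel can open is enough, the standard library's csv module is a lighter option than the packages named above. This is a sketch of that swapped-in approach, not of tablib or openpyxl: json_object_to_csv is a hypothetical helper that handles a single flat JSON object, and the sample is a trimmed, hard-coded stand-in for the response so no network call is needed.

```python
import csv
import io
import json


def json_object_to_csv(json_text):
    """Turn one flat JSON object into CSV text: a header row and a value row."""
    record = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(record.keys())    # column headers
    writer.writerow(record.values())  # one row of data
    return buffer.getvalue()


# Trimmed sample shaped like the response quoted in this thread.
sample = '{"Year": "2015", "Genre": "Thriller", "Director": "Yann Gozlan"}'
csv_text = json_object_to_csv(sample)
print(csv_text)
```

To save the result, write csv_text to a file opened with open("movie.csv", "w", newline=""); the file name here is only an example.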
