How to rewrite XML file in gitlab from python

I read an XML file from GitLab into a variable and then manipulate it. I need to write the modified content back to the file in GitLab using that variable. When I use ET.dump, everything in the file is deleted. How can I rewrite an XML file in GitLab from Python?
import gitlab
import io
import xml.etree.ElementTree as ET
gl = gitlab.Gitlab(
    private_token='xxxxx')
gl.auth()
projects = gl.projects.list(owned=True, search='Python')
raw_content = projects[0].files.raw(file_path='9_XML/XML_hw.xml', ref='main')
f = io.BytesIO()
f.write(raw_content)
f.seek(0)
xml_file = ET.parse(f) # read file
# ..... some manipulations with xml_file
project_id = 111111
project = gl.projects.get(project_id)
f = project.files.get(file_path='9_XML/XML_hw.xml', ref='main')
f.content = ET.dump(xml_file) # IT doesn't rewrite, it deletes everything from the file
f.save(branch='main', commit_message='Update file')

ET.dump doesn't produce a return value. It only prints to stdout. As stated in the docs:
Writes an element tree or element structure to sys.stdout. This function should be used for debugging only.
Hence, you end up setting f.content = None.
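Instead of using .dump, use .tostring. Because xml_file is an ElementTree (the result of ET.parse) and tostring expects an Element, pass the tree's root:
xml_str = ET.tostring(xml_file.getroot(), encoding='unicode')
f.content = xml_str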
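A minimal sketch of the serialization step on its own, with no GitLab connection. The helper name tree_to_text and the sample XML are made up for illustration; raw_content stands in for the bytes returned by files.raw(...).

```python
import io
import xml.etree.ElementTree as ET


def tree_to_text(tree):
    """Return the XML text of an ElementTree (hypothetical helper)."""
    # ET.tostring expects an Element, so pass the root rather than the tree.
    return ET.tostring(tree.getroot(), encoding="unicode")


# Stand-in for the bytes that project.files.raw(...) would return.
raw_content = b"<words><word>cat</word></words>"
tree = ET.parse(io.BytesIO(raw_content))

# Example manipulation: append one more element.
ET.SubElement(tree.getroot(), "word").text = "dog"

# This string is what would be assigned to f.content before f.save(...).
new_content = tree_to_text(tree)
```

With encoding="unicode" the result is a str without an XML declaration, which is the kind of value f.content is given in the answer above.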

Related

How to rewrite JSON file in gitlab from python

I have a JSON file in a git repo. I load it into a Python variable and do some manipulations with that variable. How can I update the JSON file in GitLab using this variable?
import gitlab
import json
import io
gl = gitlab.Gitlab(
    private_token='xxxxxxxxx')
gl.auth()
projects = gl.projects.list(owned=True, search='Python')
raw_content = projects[0].files.raw(file_path='8_JSON_Module/json_HW.json', ref='main')
f = io.BytesIO()
f.write(raw_content)
f.seek(0)
data = json.load(f) # read from the file
# ... do some manipulations with variable data
I know that in Python we use this command to update the file but I have no idea how to update it in gitlab
json.dump(data, open('json_HW.json', 'w'))
Use the file object from the python-gitlab files API:
f = project.files.get(file_path='README.rst', ref='main')
decoded_content = f.decode()
new_content = modify_content(decoded_content) # you implement this
# update the contents and commit the changes
f.content = new_content
f.save(branch='main', commit_message='Update file')
You can use json.dumps to get a string for the new content.
f.content = json.dumps(data)
f.save(...)
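As a rough illustration of the difference that matters here: json.dump writes to a file object, while json.dumps returns a string that can be assigned to f.content. The sample data below is invented; raw_content stands in for the bytes returned by files.raw(...).

```python
import json

# Stand-in for the bytes that project.files.raw(...) would return.
raw_content = b'{"name": "demo", "tags": ["a"]}'

# json.loads accepts bytes directly on Python 3.6+, so no BytesIO is needed.
data = json.loads(raw_content)

# Example manipulation.
data["tags"].append("b")

# json.dumps returns a str; this is the value to assign to f.content.
new_content = json.dumps(data, indent=2)
```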

Append XML responses in Python

I am trying to parse multiple XML responses into one file. However, when I write the responses to the file, it contains only the last one. I assume I need to append somewhere in order to keep all of the responses.
Here is my code:
import json
import xml.etree.ElementTree as ET
#loop test
feins = ['800228936', '451957238']
for i in feins:
rr = requests.get('https://pdb-services.nipr.com/pdb-xml-reports/hitlist_xml.cgi?report_type=0&id_fein={}'.format(i),auth=('test', 'test'))
root = ET.fromstring(rr.text)
tree = ET.ElementTree(root)
tree.write("file.xml")
Try changing
for i in feins:
...
tree = ET.ElementTree(root)
tree.write("file.xml")
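to (note the indentation, and that the file is opened in append mode, since "wb" would truncate it on every iteration):
for i in feins:
    ...
    tree = ET.ElementTree(root)
    with open("file.xml", "ab") as f:
        tree.write(f)
and see if it works. Note that the file then holds several XML documents written back to back, not a single well-formed document, and that content from an earlier run is kept as well.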
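If a single well-formed document is wanted, one option (not part of the answer above) is to collect every response under one wrapper element and write the file once after the loop. The sketch below uses literal strings in place of the rr.text values; the function name combine_responses and the wrapper tag "responses" are assumptions.

```python
import xml.etree.ElementTree as ET


def combine_responses(xml_texts, wrapper_tag="responses"):
    """Parse each XML string and attach its root to one wrapper element."""
    wrapper = ET.Element(wrapper_tag)
    for text in xml_texts:
        wrapper.append(ET.fromstring(text))
    return ET.ElementTree(wrapper)


# Stand-ins for the rr.text values collected inside the loop.
responses = ["<hit id='1'/>", "<hit id='2'/>"]
combined = combine_responses(responses)

# Write once, after all responses have been collected:
# combined.write("file.xml", encoding="utf-8", xml_declaration=True)
```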

Loading an XML file from a specific path

I am creating a simple GUI application to manage unknown words while learning a new language. I am having trouble loading an XML file from a specific path because I do not know how to declare file paths properly. The program should first declare the file paths, check whether the directory exists and create it if necessary, check whether the file (an XML document) exists and create it if necessary, write the start and end elements, and finally load the XML document from the specified path.
In C# and Windows, I would do it like this:
string path = Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData);
string vocabulary_path = path + "\\Vocabulary\\Words.xml";
if (!Directory.Exists(path + "\\Vocabulary"))
Directory.CreateDirectory(path + "\\Vocabulary");
if (!File.Exists(vocabulary_path))
{
XmlTextWriter xW = new XmlTextWriter(vocabulary_path, Encoding.UTF8);
xW.WriteStartElement("Words");
xW.WriteEndElement();
xW.Close();
}
XmlDocument xDoc = new XmlDocument();
xDoc.Load(vocabulary_path);
...but I'm using Python and Linux Mint Xfce.
Here is what I have so far:
if not os.path.exists(directory):
    os.makedirs(directory)
my_file = Path("/path/to/file")
if not my_file.is_file():
    # create an XML document and write start and end element into it
In Python, use the ElementTree module:
import os
import xml.etree.ElementTree as et
# path is the base directory and must be defined beforehand
vocabulary = os.path.join(path, "Vocabulary", "Words.xml")
if not os.path.exists(vocabulary):
    if not os.path.exists(os.path.dirname(vocabulary)):
        # the function is os.makedirs; os.mkdirs does not exist
        os.makedirs(os.path.dirname(vocabulary))
    # create a document containing only the empty root element
    doc = et.Element('Words')
    tree = et.ElementTree(doc)
    tree.write(vocabulary)
else:
    tree = et.ElementTree(file=vocabulary)
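The same create-if-missing-then-load flow can be sketched with pathlib. This is only an illustration: the function name load_or_create is made up, and a temporary directory stands in for the real application data folder, which the question leaves open.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def load_or_create(xml_path, root_tag="Words"):
    """Load xml_path, first creating it with an empty root if it is missing."""
    xml_path = Path(xml_path)
    if not xml_path.is_file():
        # Create the parent directory (and any missing ancestors).
        xml_path.parent.mkdir(parents=True, exist_ok=True)
        # Write a document that contains only the empty root element.
        ET.ElementTree(ET.Element(root_tag)).write(
            str(xml_path), encoding="utf-8", xml_declaration=True
        )
    return ET.parse(str(xml_path))


# Assumption: a temporary directory replaces the real base directory.
base = Path(tempfile.mkdtemp())
tree = load_or_create(base / "Vocabulary" / "Words.xml")
```

Calling load_or_create again with the same path takes the other branch and simply parses the existing file.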

Can anyone tell me what error msg "line 1182 in parse" means when I'm trying to parse an xml file in python

This is the code that results in an error message:
import urllib
import xml.etree.ElementTree as ET
url = raw_input('Enter URL:')
urlhandle = urllib.urlopen(url)
data = urlhandle.read()
tree = ET.parse(data)
The error:
I'm new to python. I did read documentation and a couple of tutorials, but clearly I still have done something wrong. I don't believe it is the xml file itself because it does this to two different xml files.
Consider using ElementTree's fromstring():
import urllib
import xml.etree.ElementTree as ET
url = raw_input('Enter URL:')
# http://feeds.bbci.co.uk/news/rss.xml?edition=int
urlhandle = urllib.urlopen(url)
data = urlhandle.read()
tree = ET.fromstring(data)
print ET.tostring(tree, encoding='utf8', method='xml')
data is a reference to the XML content as a string, but the parse() function expects a filename or file object as its argument. That's why there is an error.
urlhandle is a file object, so tree = ET.parse(urlhandle) should work for you.
The error message indicates that your code is trying to open a file whose name is stored in the variable source.
It's failing to open that file (IOError) because the variable source contains a bunch of XML, not a file name.
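The distinction between the two calls can be shown without a network request. In this sketch, written for Python 3 (the thread's code is Python 2), an in-memory byte string with invented content stands in for urlhandle.read(), and io.BytesIO stands in for the file object returned by urlopen.

```python
import io
import xml.etree.ElementTree as ET

# Stand-in for the bytes returned by urlhandle.read().
data = b"<rss><channel><title>News</title></channel></rss>"

# fromstring() takes the XML content itself and returns the root Element.
root_from_text = ET.fromstring(data)

# parse() takes a filename or a file object and returns an ElementTree.
tree_from_file = ET.parse(io.BytesIO(data))

title = tree_from_file.getroot().findtext("channel/title")
```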

Exception when parsing a xml using lxml

I wrote this code to validate my XML file against an XSD:
def parseAndObjectifyXml(xmlPath, xsdPath):
    from lxml import etree
    xsdFile = open(xsdPath)
    schema = etree.XMLSchema(file=xsdFile)
    xmlinput = open(xmlPath)
    xmlContent = xmlinput.read()
    myxml = etree.parse(xmlinput) # In this line xml input is empty
    schema.assertValid(myxml)
but when I try to validate it, xmlinput is empty even though xmlContent is not.
What is the problem?
Files in python have a "current position"; it starts at the beginning of the file (position 0), then, as you read the file, the current position pointer moves along until it reaches the end.
You'll need to put that pointer back to the beginning before the lxml parser can read the contents in full. Use the .seek() method for that:
from lxml import etree
def parseAndObjectifyXml(xmlPath, xsdPath):
    xsdFile = open(xsdPath)
    schema = etree.XMLSchema(file=xsdFile)
    xmlinput = open(xmlPath)
    xmlContent = xmlinput.read()
    xmlinput.seek(0)
    myxml = etree.parse(xmlinput)
    schema.assertValid(myxml)
You only need to do this if you need xmlContent somewhere else too; you could alternatively pass it into the .parse() method if wrapped in a StringIO object to provide the necessary file object methods:
from lxml import etree
# Python 2 import; on Python 3 use io.BytesIO or io.StringIO instead
from cStringIO import StringIO
def parseAndObjectifyXml(xmlPath, xsdPath):
    xsdFile = open(xsdPath)
    schema = etree.XMLSchema(file=xsdFile)
    xmlinput = open(xmlPath)
    xmlContent = xmlinput.read()
    myxml = etree.parse(StringIO(xmlContent))
    schema.assertValid(myxml)
If you are not using xmlContent for anything else, then you do not need the extra .read() call either, and subsequently won't have problems parsing it with lxml; just omit the call altogether, and you won't need to move the current position pointer back to the start either:
from lxml import etree
def parseAndObjectifyXml(xmlPath, xsdPath):
    xsdFile = open(xsdPath)
    schema = etree.XMLSchema(file=xsdFile)
    xmlinput = open(xmlPath)
    myxml = etree.parse(xmlinput)
    schema.assertValid(myxml)
To learn more about .seek() (and its counterpart, .tell()), read up on file objects in the Python tutorial.
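The current-position behaviour can be observed directly with an in-memory stream. This sketch uses the standard library's xml.etree.ElementTree in place of lxml so that it runs without extra packages, and io.BytesIO in place of a real file; the sample XML is invented.

```python
import io
import xml.etree.ElementTree as ET

stream = io.BytesIO(b"<root><item/></root>")

content = stream.read()          # reads everything; the position moves to the end
pos_after_read = stream.tell()   # equals len(content)
rest = stream.read()             # nothing is left, so this is b""

stream.seek(0)                   # move the position back to the start
tree = ET.parse(stream)          # the parser now sees the whole document
```

Without the seek(0) call, the parser would receive an empty stream and fail, which is the situation described in the question.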
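You should use the XML content that you have already read. Because etree.parse() treats a string argument as a filename, parse the content with etree.fromstring():
xmlContent = xmlinput.read()
myxml = etree.fromstring(xmlContent)
instead of:
myxml = etree.parse(xmlinput)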
