Parsing a YAML file with --- in Python

How can I parse a file that contains multiple configs separated by --- in Python?
I have a config file, temp.yaml, which looks like this:
%YAML 1.2
---
name: first
cmp:
  - Some: first
    top:
      top_rate: 16000
      audio_device: "pulse"
---
name: second
components:
  - name: second
    parameters:
      always_on: true
      timeout: 200000
When I read it with
import yaml
with open('./temp.yaml', 'r') as f:
    temp = yaml.load(f)
I get the following error:
Traceback (most recent call last):
  File "temp.py", line 4, in <module>
    temp = yaml.load(f)
  File "/home/pranjald/.local/lib/python3.6/site-packages/yaml/__init__.py", line 114, in load
    return loader.get_single_data()
  File "/home/pranjald/.local/lib/python3.6/site-packages/yaml/constructor.py", line 41, in get_single_data
    node = self.get_single_node()
  File "/home/pranjald/.local/lib/python3.6/site-packages/yaml/composer.py", line 43, in get_single_node
    event.start_mark)
yaml.composer.ComposerError: expected a single document in the stream
  in "./temp.yaml", line 3, column 1
but found another document
  in "./temp.yaml", line 10, column 1

Your input is composed of multiple YAML documents. For that you will need yaml.load_all() or better yet yaml.safe_load_all(). (The latter will not construct arbitrary Python objects outside of data-like structures such as list/dict.)
import yaml
with open('temp.yaml') as f:
    # safe_load_all() is lazy, so consume it while the file is still open
    temp = list(yaml.safe_load_all(f))
As hinted at by the error message, yaml.load() is strict about accepting only a single YAML document.
Note that safe_load_all() returns a generator of Python objects which you'll need to iterate over.
>>> gen = yaml.safe_load_all(f)
>>> next(gen)
{'name': 'first', 'cmp': [{'Some': 'first', 'top': {'top_rate': 16000, 'audio_device': 'pulse'}}]}
>>> next(gen)
{'name': 'second', 'components': [{'name': 'second', 'parameters': {'always_on': True, 'timeout': 200000}}]}
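As a minimal, self-contained sketch of that iteration (assuming PyYAML is installed; the YAML is inlined as a string in place of temp.yaml, and the %YAML directive is left out for brevity), each document comes back as a separate Python object:

```python
import yaml

# Two documents separated by ---, mirroring the structure of temp.yaml.
multi_doc = """\
---
name: first
cmp:
  - Some: first
    top:
      top_rate: 16000
      audio_device: "pulse"
---
name: second
components:
  - name: second
    parameters:
      always_on: true
      timeout: 200000
"""

# safe_load_all() is lazy, so list() forces every document to be parsed now.
documents = list(yaml.safe_load_all(multi_doc))

for doc in documents:
    print(doc["name"])
```

Wrapping the generator in list() is a convenience for small files; for large streams, iterating over the generator directly avoids holding every document in memory.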


Why am I receiving this error using BackTrader on Python?

I am trying to learn how to use the backtrader module in Python. I copied the code directly from the website, but I am receiving an error message.
Here is the website: https://www.backtrader.com/docu/quickstart/quickstart/
I downloaded S&P500 stock data from Yahoo Finance and saved it into an excel file named 'SPY'. Here is the code so far:
from __future__ import (absolute_import, division, print_function,
                        unicode_literals)
import datetime  # For datetime objects
import os.path  # To manage paths
import sys  # To find out the script name (in argv[0])
# Import the backtrader platform
import backtrader as bt
if __name__ == '__main__':
    # Create a cerebro entity
    cerebro = bt.Cerebro()
    # Datas are in a subfolder of the samples. Need to find where the script is
    # because it could have been called from anywhere
    modpath = os.path.dirname(os.path.abspath(sys.argv[0]))
    datapath = os.path.join(modpath, 'C:\\Users\\xboss\\Desktop\\SPY.csv')
    # Create a Data Feed
    data = bt.feeds.YahooFinanceCSVData(
        dataname=datapath,
        # Do not pass values before this date
        fromdate=datetime.datetime(2000, 1, 1),
        # Do not pass values after this date
        todate=datetime.datetime(2000, 12, 31),
        reverse=False)
    # Add the Data Feed to Cerebro
    cerebro.adddata(data)
    # Set our desired cash start
    cerebro.broker.setcash(100000.0)
    # Print out the starting conditions
    print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())
    # Run over everything
    cerebro.run()
    # Print out the final result
    print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())
Here is the error that I am receiving:
C:\Users\xboss\PycharmProjects\BackTraderDemo\venv\Scripts\python.exe C:/Users/xboss/PycharmProjects/BackTraderDemo/backtrader_quickstart.py
Starting Portfolio Value: 100000.00
Traceback (most recent call last):
  File "C:/Users/xboss/PycharmProjects/BackTraderDemo/backtrader_quickstart.py", line 39, in <module>
    cerebro.run()
  File "C:\Users\xboss\PycharmProjects\BackTraderDemo\venv\lib\site-packages\backtrader\cerebro.py", line 1127, in run
    runstrat = self.runstrategies(iterstrat)
  File "C:\Users\xboss\PycharmProjects\BackTraderDemo\venv\lib\site-packages\backtrader\cerebro.py", line 1212, in runstrategies
    data.preload()
  File "C:\Users\xboss\PycharmProjects\BackTraderDemo\venv\lib\site-packages\backtrader\feed.py", line 688, in preload
    while self.load():
  File "C:\Users\xboss\PycharmProjects\BackTraderDemo\venv\lib\site-packages\backtrader\feed.py", line 479, in load
    _loadret = self._load()
  File "C:\Users\xboss\PycharmProjects\BackTraderDemo\venv\lib\site-packages\backtrader\feed.py", line 710, in _load
    return self._loadline(linetokens)
  File "C:\Users\xboss\PycharmProjects\BackTraderDemo\venv\lib\site-packages\backtrader\feeds\yahoo.py", line 129, in _loadline
    dt = date(int(dttxt[0:4]), int(dttxt[5:7]), int(dttxt[8:10]))
ValueError: invalid literal for int() with base 10: '1/29'
Process finished with exit code 1
Does anyone have any suggestions? Any help would be greatly appreciated. Thank you so much for your time!
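You get the error because YahooFinanceCSVData expects dates in YYYY-MM-DD form (yahoo.py slices the date text at fixed positions), and the dates in your file start with '1/29'.
Load a custom CSV with GenericCSVData instead, describing the date format and column positions yourself:
data = bt.feeds.GenericCSVData(
    dataname='SPY.csv',
    fromdate=datetime.datetime(2000, 1, 1),
    todate=datetime.datetime(2000, 12, 31),
    nullvalue=0.0,
    dtformat=('%Y-%m-%d'),  # must match the date format actually used in the CSV
    datetime=0,  # the indices below must match the column order in the CSV
    high=1,
    low=2,
    open=3,
    close=4,
    volume=5,
    openinterest=-1
)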
For more information, see the GenericCSVData section of the backtrader data feeds documentation.
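The ValueError in the traceback can be reproduced without backtrader. The following is only a sketch: it assumes a row in the CSV starts with a date such as 1/29/1993 (the year is a guess, since only '1/29' appears in the error). It shows why the fixed-position slicing in yahoo.py fails and which strptime format would match such a date:

```python
from datetime import datetime

dttxt = "1/29/1993"  # assumed sample value; only '1/29' is visible in the error

# yahoo.py slices the text as if it were YYYY-MM-DD.
year_text = dttxt[0:4]
print(repr(year_text))

try:
    int(year_text)
except ValueError as exc:
    print(exc)

# A format string that matches month/day/year dates.
parsed = datetime.strptime(dttxt, "%m/%d/%Y").date()
print(parsed)
```

If the file really uses this layout, '%m/%d/%Y' would be the value to try for dtformat; it is worth checking a few rows of the CSV first.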

Configuring ruamel.yaml to allow duplicate keys

I'm trying to use the ruamel.yaml library to process a Yaml document that contains duplicate keys. In this case the duplicate key happens to be a merge key <<:.
This is the yaml file, dupe.yml:
foo: &ref1
  a: 1
bar: &ref2
  b: 2
baz:
  <<: *ref1
  <<: *ref2
  c: 3
This is my script:
import ruamel.yaml
yml = ruamel.yaml.YAML()
yml.allow_duplicate_keys = True
doc = yml.load(open('dupe.yml'))
assert doc['baz']['a'] == 1
assert doc['baz']['b'] == 2
assert doc['baz']['c'] == 3
When run, it raises this error:
Traceback (most recent call last):
  File "rua.py", line 5, in <module>
    yml.load(open('dupe.yml'))
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/main.py", line 331, in load
    return constructor.get_single_data()
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 111, in get_single_data
    return self.construct_document(node)
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 121, in construct_document
    for _dummy in generator:
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 1543, in construct_yaml_map
    self.construct_mapping(node, data, deep=True)
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 1448, in construct_mapping
    value = self.construct_object(value_node, deep=deep)
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 174, in construct_object
    for _dummy in generator:
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 1543, in construct_yaml_map
    self.construct_mapping(node, data, deep=True)
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 1399, in construct_mapping
    merge_map = self.flatten_mapping(node)
  File "/usr/local/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 1350, in flatten_mapping
    raise DuplicateKeyError(*args)
ruamel.yaml.constructor.DuplicateKeyError: while constructing a mapping
  in "dupe.yml", line 8, column 3
found duplicate key "<<"
  in "dupe.yml", line 9, column 3
To suppress this check see:
    http://yaml.readthedocs.io/en/latest/api.html#duplicate-keys
Duplicate keys will become an error in future releases, and are errors
by default when using the new API.
How can I make ruamel read this file without errors? The documentation says that allow_duplicate_keys = True will make the loader tolerate duplicated keys, but it doesn't seem to work.
I'm using Python 3.7 and ruamel.yaml 0.15.90.
That
yaml.allow_duplicate_keys = True
only works for non-merge keys in versions before 0.15.91.
In 0.15.91+ this works and the merge key assumes the value of the first instantiation of the key (like with non-merge keys), that means it works as if you had written:
baz:
  <<: *ref1
  c: 3
and not as if you had written:
baz:
  <<: [*ref1, *ref2]
  c: 3
If you need that you have to monkey-patch the flatten routine that handles the merge keys (and that affects loading of all following YAML files with double merge keys):
import sys
import warnings
import ruamel.yaml
# names used by the error branches below
from ruamel.yaml.constructor import (
    ConstructorError,
    DuplicateKeyError,
    DuplicateKeyFutureWarning,
)
yaml_str = """\
foo: &ref1
  a: 1
bar: &ref2
  b: 2
baz:
  <<: *ref1
  <<: *ref2
  c: 3
"""
def my_flatten_mapping(self, node):
    def constructed(value_node):
        # type: (Any) -> Any
        # If the contents of a merge are defined within the
        # merge marker, then they won't have been constructed
        # yet. But if they were already constructed, we need to use
        # the existing object.
        if value_node in self.constructed_objects:
            value = self.constructed_objects[value_node]
        else:
            value = self.construct_object(value_node, deep=False)
        return value
    merge_map_list = []
    index = 0
    while index < len(node.value):
        key_node, value_node = node.value[index]
        if key_node.tag == u'tag:yaml.org,2002:merge':
            if merge_map_list and not self.allow_duplicate_keys:  # double << key
                args = [
                    'while constructing a mapping',
                    node.start_mark,
                    'found duplicate key "{}"'.format(key_node.value),
                    key_node.start_mark,
                    """
                    To suppress this check see:
                        http://yaml.readthedocs.io/en/latest/api.html#duplicate-keys
                    """,
                    """\
                    Duplicate keys will become an error in future releases, and are errors
                    by default when using the new API.
                    """,
                ]
                if self.allow_duplicate_keys is None:
                    warnings.warn(DuplicateKeyFutureWarning(*args))
                else:
                    raise DuplicateKeyError(*args)
            del node.value[index]
            # if key/values from later merge keys have preference you need
            # to insert value_node(s) at the beginning of merge_map_list
            # instead of appending
            if isinstance(value_node, ruamel.yaml.nodes.MappingNode):
                merge_map_list.append((index, constructed(value_node)))
            elif isinstance(value_node, ruamel.yaml.nodes.SequenceNode):
                for subnode in value_node.value:
                    if not isinstance(subnode, ruamel.yaml.nodes.MappingNode):
                        raise ConstructorError(
                            'while constructing a mapping',
                            node.start_mark,
                            'expected a mapping for merging, but found %s' % subnode.id,
                            subnode.start_mark,
                        )
                    merge_map_list.append((index, constructed(subnode)))
            else:
                raise ConstructorError(
                    'while constructing a mapping',
                    node.start_mark,
                    'expected a mapping or list of mappings for merging, '
                    'but found %s' % value_node.id,
                    value_node.start_mark,
                )
        elif key_node.tag == u'tag:yaml.org,2002:value':
            key_node.tag = u'tag:yaml.org,2002:str'
            index += 1
        else:
            index += 1
    return merge_map_list
# replace the library's merge-key handling with the version above
ruamel.yaml.constructor.RoundTripConstructor.flatten_mapping = my_flatten_mapping
yaml = ruamel.yaml.YAML()
yaml.allow_duplicate_keys = True
data = yaml.load(yaml_str)
for k in data['baz']:
    print(k, '>', data['baz'][k])
The above gives:
c > 3
a > 1
b > 2
After reading the library source code, I found a workaround. Setting the option to None prevents the error.
yml.allow_duplicate_keys = None
A warning is still printed to the console, but it's not fatal and the program will continue.
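If editing dupe.yml is an option, the sequence form of the merge key mentioned above (<<: [*ref1, *ref2]) avoids the duplicate key altogether. The sketch below uses PyYAML's safe_load as a stand-in loader, which is an assumption made only for illustration; the question itself uses ruamel.yaml.

```python
import yaml

# Same data as dupe.yml, but with a single merge key that lists both anchors.
doc_text = """\
foo: &ref1
  a: 1
bar: &ref2
  b: 2
baz:
  <<: [*ref1, *ref2]
  c: 3
"""

doc = yaml.safe_load(doc_text)
print(doc["baz"])
```

With a single merge key there is nothing for the duplicate-key check to reject, so no loader option or monkey-patch is involved.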

TypeError: 'xml.etree.ElementTree.Element' object is not callable

I am converting to Python an application I had earlier written in C#. It's a GUI application to manage unknown words while learning a new language. When the application starts, I have to load the words from the XML file which has a pretty simple structure:
<Words>
  <Word>
    <Word>test</Word>
    <Explanation>test</Explanation>
    <Translation>test</Translation>
    <Examples>test</Examples>
  </Word>
</Words>
Nevertheless, I am getting:
/usr/bin/python3.5 /home/cali/PycharmProjects/Vocabulary/Vocabulary.py
Traceback (most recent call last):
  File "/home/cali/PycharmProjects/Vocabulary/Vocabulary.py", line 203, in <module>
    main()
  File "/home/cali/PycharmProjects/Vocabulary/Vocabulary.py", line 198, in main
    gui = Vocabulary(root)
  File "/home/cali/PycharmProjects/Vocabulary/Vocabulary.py", line 28, in __init__
    self.load_words()
  File "/home/cali/PycharmProjects/Vocabulary/Vocabulary.py", line 168, in load_words
    w = Word(node('Word').text, node('Explanation').text, node('Translation').text, node('Example').text)
TypeError: 'xml.etree.ElementTree.Element' object is not callable
This is the original LoadWords() method:
void LoadWords()
{
    words.Clear();
    listView1.Items.Clear();
    string path = Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData);
    string vocabulary_path = path + "\\Vocabulary\\Words.xml";
    if (!Directory.Exists(path + "\\Vocabulary"))
        Directory.CreateDirectory(path + "\\Vocabulary");
    if (!File.Exists(vocabulary_path))
    {
        XmlTextWriter xW = new XmlTextWriter(vocabulary_path, Encoding.UTF8);
        xW.WriteStartElement("Words");
        xW.WriteEndElement();
        xW.Close();
    }
    XmlDocument xDoc = new XmlDocument();
    xDoc.Load(vocabulary_path);
    foreach (XmlNode xNode in xDoc.SelectNodes("Words/Word"))
    {
        Word w = new Word();
        w.WordOrPhrase = xNode.SelectSingleNode("Word").InnerText;
        w.Explanation = xNode.SelectSingleNode("Explanation").InnerText;
        w.Translation = xNode.SelectSingleNode("Translation").InnerText;
        w.Examples = xNode.SelectSingleNode("Examples").InnerText;
        words.Add(w);
        listView1.Items.Add(w.WordOrPhrase);
        WordCount();
    }
}
I don't know how to access each node's inner text.
Here is my load_words function:
def load_words(self):
    self.listBox.delete(0, END)
    self.words.clear()
    path = os.path.expanduser('~/Desktop')
    vocabulary = os.path.join(path, 'Vocabulary', 'Words.xml')
    if not os.path.exists(vocabulary):
        if not os.path.exists(os.path.dirname(vocabulary)):
            os.mkdir(os.path.dirname(vocabulary))
        doc = ET.Element('Words')
        tree = ET.ElementTree(doc)
        tree.write(vocabulary)
    else:
        tree = ET.ElementTree(file=vocabulary)
    for node in tree.findall('Word'):
        w = Word(node('Word').text, node('Explanation').text, node('Translation').text, node('Example').text)
        self.words.append(w)
        self.listBox.insert(w.wordorphrase)
TypeError: 'xml.etree.ElementTree.Element' object is not callable
As the error message says, node is an Element, not a function, so it cannot be called like function_name(arguments) as you did here:
w = Word(node('Word').text, node('Explanation').text, node('Translation').text, node('Example').text)
The closest equivalent to SelectSingleNode() from your C# code is Element.find(). For example, to get the first child element named Word from node and extract its inner text:
inner_text = node.find('Word').text
In your code, that becomes (note that the element in your XML is named Examples, not Example; find() returns None for a name that does not exist):
w = Word(node.find('Word').text, node.find('Explanation').text, node.find('Translation').text, node.find('Examples').text)
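A self-contained sketch of the same lookup, using only the standard library, is shown below. The XML is inlined as a string, and plain tuples stand in for the Word class, which is not shown in the question. It uses findtext(), which returns a default instead of failing when a child element is missing:

```python
import xml.etree.ElementTree as ET

xml_text = """\
<Words>
  <Word>
    <Word>test</Word>
    <Explanation>test</Explanation>
    <Translation>test</Translation>
    <Examples>test</Examples>
  </Word>
</Words>
"""

root = ET.fromstring(xml_text)  # root is the <Words> element

words = []
for node in root.findall("Word"):  # direct <Word> children of <Words> only
    # findtext() returns the default when the named child does not exist
    words.append((
        node.findtext("Word", default=""),
        node.findtext("Explanation", default=""),
        node.findtext("Translation", default=""),
        node.findtext("Examples", default=""),
    ))

print(words)
```

By contrast, node.find('Example') returns None for this document, so accessing .text on it would raise an AttributeError.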

What does 'yaml.parser.ParserError: expected '<document start>', but found '<block mapping start>'' mean?

I have the following YAML file:
[mysqld]
user: "mysql"
pid-file: /var/run/mysqld/mysqld.pid
skip-external-locking
old_passwords: 1
skip-bdb
skip-innodb
create_key: yes
needs_agent: no
knows_oop: True
likes_emacs: TRUE
women:
- Mary Smith
- Susan Williams
and the following Python code:
#!/usr/bin/env python
import yaml
with open("config.yml") as f:
    sample_config = f.read()
print(yaml.load(sample_config))
But it gives me:
Traceback (most recent call last):
  File "/home/moose/Desktop/bla.py", line 9, in <module>
    print(yaml.load(sample_config))
  File "/usr/local/lib/python2.7/dist-packages/yaml/__init__.py", line 71, in load
    return loader.get_single_data()
  File "/usr/local/lib/python2.7/dist-packages/yaml/constructor.py", line 37, in get_single_data
    node = self.get_single_node()
  File "/usr/local/lib/python2.7/dist-packages/yaml/composer.py", line 39, in get_single_node
    if not self.check_event(StreamEndEvent):
  File "/usr/local/lib/python2.7/dist-packages/yaml/parser.py", line 98, in check_event
    self.current_event = self.state()
  File "/usr/local/lib/python2.7/dist-packages/yaml/parser.py", line 174, in parse_document_start
    self.peek_token().start_mark)
yaml.parser.ParserError: expected '<document start>', but found '<block mapping start>'
  in "<string>", line 2, column 1:
    user: "mysql"
[Finished in 0.1s with exit code 1]
[shell_cmd: python -u "/home/moose/Desktop/bla.py"]
[dir: /home/moose/Desktop]
[path: /usr/local/texlive/2013/bin/x86_64-linux:/home/moose/google-cloud-sdk/bin:/home/moose/Downloads/google_appengine:/usr/local/texlive/2013/bin/x86_64-linux:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games]
I have no idea what
expected '<document start>', but found '<block mapping start>'
means and how to fix it. What is <document start> and what is a <block mapping start>?
Your file isn't valid YAML. It looks like a mix of YAML and INI file.
You can't define sections like [mysqld] in YAML. If you want to define a collection of related properties, use a list with nested keys:
- service:
    name: mysql
    type: database
    port: 3306
- service:
    name: ssh
    type: remote access
    port: 22
You can't have bare words like skip-external-locking. Each property requires a value. Use skip-external-locking: true instead.
Here's a version of your document with the syntax errors fixed. I checked this over with YAMLLint, a handy tool for validating YAML.
name: mysqld
user: mysql
pid-file: /var/run/mysqld/mysqld.pid
skip-external-locking: true
old_passwords: 1
skip-bdb: true
skip-innodb: true
create_key: yes
needs_agent: no
knows_oop: True
likes_emacs: TRUE
women:
- Mary Smith
- Susan Williams
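To check that the corrected document loads, here is a short sketch assuming PyYAML is installed, with the document inlined as a string instead of read from config.yml. It also shows that PyYAML resolves yes, no, True and TRUE to Python booleans:

```python
import yaml

fixed_config = """\
name: mysqld
user: mysql
pid-file: /var/run/mysqld/mysqld.pid
skip-external-locking: true
old_passwords: 1
skip-bdb: true
skip-innodb: true
create_key: yes
needs_agent: no
knows_oop: True
likes_emacs: TRUE
women:
- Mary Smith
- Susan Williams
"""

# safe_load() only builds plain data types (dict, list, str, int, bool, ...).
config = yaml.safe_load(fixed_config)

print(config["skip-external-locking"], config["create_key"], config["needs_agent"])
print(config["women"])
```

If a value such as yes should stay a string, quoting it in the YAML ("yes") prevents the boolean conversion.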

Python - SimpleJSON Issue

I'm working with the Mega API and Python in the hope of producing a folder tree readable by Python. At the moment I'm working with the JSON responses Mega's API gives, but for some reason I'm having trouble parsing them. In the past I would simply use simplejson as shown below, but right now it's not working. For now I'm just trying to get the file name. Any help is appreciated!
import simplejson
megaResponseToFileSearch = "(u'BMExefXbYa', {u'a': {u'n': u'A Bullet For Pretty Boy - 1 - Dial M For Murder.mp3'}, u'h': u'BMExXbYa', u'k': (5710166, 21957970, 11015946, 7749654L), u'ts': 13736999, 'iv': (7949460, 15946811, 0, 0), u'p': u'4FlnwBTb', u's': 5236864, 'meta_mac': (529642, 2979591L), u'u': u'xpz_tb-YDUg', u't': 0, 'key': (223xx15874, 642xx8505, 1571620, 26489769L, 799460, 1596811, 559642, 279591L)})"
jsonRespone = simplejson.loads(megaResponseToFileSearch)
print jsonRespone[u'a'][u'n']
ERROR:
Traceback (most recent call last):
  File "D:/Projects/Mega Sync/megasync.py", line 18, in <module>
    jsonRespone = simplejson.loads(file4)
  File "D:\Projects\Mega Sync\simplejson\__init__.py", line 453, in loads
    return _default_decoder.decode(s)
  File "D:\Projects\Mega Sync\simplejson\decoder.py", line 429, in decode
    obj, end = self.raw_decode(s)
  File "D:\Projects\Mega Sync\simplejson\decoder.py", line 451, in raw_decode
    raise JSONDecodeError("No JSON object could be decoded", s, idx)
simplejson.decoder.JSONDecodeError: No JSON object could be decoded: line 1 column 0 (char 0)
EDIT:
I was asked where I got the string from. It's a response to searching for a file using the Mega API. I'm using the module found here. https://github.com/richardasaurus/mega.py
The code itself looks like this:
from mega import Mega
mega = Mega({'verbose': True})
m = mega.login(email, password)
file = m.find('A Bullet For Pretty Boy - 1 - Dial M For Murder.mp3')
print file
What m.find returns is a plain Python tuple whose second element (index 1) is a dictionary:
(u'99M1Tazb',
{u'a': {u'n': u'test.txt'},
u'h': u'99M1Tazb',
u'k': (1145485578, 1435138417, 702505527, 274874292),
u'ts': 1373482712,
'iv': (1883603069, 763415510, 0, 0),
u'p': u'9td12YaY',
u's': 0,
'meta_mac': (1091379956, 402442960),
u'u': u'79_166PAQCA',
u't': 0,
'key': (872626551, 2013967015, 1758609603, 127858020, 1883603069, 763415510, 1091379956, 402442960)})
To get the filename, just use:
print file[1]['a']['n']
So, no need to use simplejson at all.
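A small sketch of both points, written in Python 3 syntax with the standard-library json module standing in for simplejson (both substitutions are assumptions for illustration, and the sample tuple is a shortened version of the one above):

```python
import json

# Sample shaped like the tuple shown above (values shortened for illustration).
found = ("99M1Tazb", {"a": {"n": "test.txt"}, "h": "99M1Tazb", "s": 0, "t": 0})

# Index 0 is the handle, index 1 is the attribute dictionary.
handle, attributes = found
name = attributes["a"]["n"]
print(name)

# str(found) is a Python repr, not JSON, so a JSON decoder rejects it.
try:
    json.loads(str(found))
    decoded = True
except json.JSONDecodeError:
    decoded = False
print(decoded)
```

The decoder fails on the very first character because the repr starts with a parenthesis, which matches the "line 1 column 0 (char 0)" position in the original error.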
