Using the ConfigParser module, how can I filter out and discard every comment from an ini file?
import ConfigParser
config = ConfigParser.ConfigParser()
config.read("sample.cfg")
for section in config.sections():
    print section
    for option in config.options(section):
        print option, "=", config.get(section, option)
E.g., for the ini file below, the basic script above also prints the comment lines that follow the value:
something = 128 ; comment line1
; further comments
; one more line comment
What I need is only the section names and the pure key-value pairs inside them, without any comments. Does ConfigParser handle this somehow, or should I use a regexp, or something else? Cheers
According to the docs, lines starting with ; or # are ignored. Your format doesn't seem to satisfy that requirement. Can you by any chance change the format of your input file?
Edit: since you cannot modify your input files, I'd suggest pre-parsing them with something along these lines:
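config_file = 'sample.cfg'  # path to the original ini file
tmp_fname = 'config.tmp'
with open(config_file) as old_file:
    with open(tmp_fname, 'w') as tmp_file:
        # Move each inline comment onto its own line so that ConfigParser ignores it
        tmp_file.writelines(i.replace(';', '\n;') for i in old_file.readlines())
# then use tmp_fname with ConfigParser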
Obviously, if a semicolon is present in the option values you'll have to be more creative.
Best way is to write a commentless file subclass:
class CommentlessFile(file):
    def readline(self):
        line = super(CommentlessFile, self).readline()
        if line:
            line = line.split(';', 1)[0].strip()
            return line + '\n'
        else:
            return ''
You could then use it with ConfigParser (your code):
import ConfigParser
config = ConfigParser.ConfigParser()
config.readfp(CommentlessFile("sample.cfg"))
for section in config.sections():
    print section
    for option in config.options(section):
        print option, "=", config.get(section, option)
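On Python 3 the file builtin no longer exists, so the subclass above cannot be used as written. The same idea can be sketched with a generator handed to read_file(). This is only a sketch: strip_inline_comments and the sample text are names made up for illustration, and it assumes ';' never appears inside a real value.

```python
import configparser
import io


def strip_inline_comments(lines, marker=";"):
    """Yield each line with everything from the first marker onwards removed."""
    for line in lines:
        # rstrip() rather than strip(), so indentation of continuation lines survives
        yield line.split(marker, 1)[0].rstrip() + "\n"


# Hypothetical sample mirroring the ini fragment from the question
source = io.StringIO(
    "[main]\n"
    "something = 128 ; comment line1\n"
    " ; further comments\n"
    " ; one more line comment\n"
)

parser = configparser.ConfigParser()
parser.read_file(strip_inline_comments(source))
something = parser.get("main", "something")
print(something)
```

The comment-only lines turn into empty lines, which the parser drops from the end of the value.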
It seems your comments are not on lines that start with the comment leader. It should work if the comment leader is the first character on the line.
As the doc said: "(For backwards compatibility, only ; starts an inline comment, while # does not.)" So use ";" and not "#" for inline comments. It is working well for me.
Python 3 comes with a built-in solution: the class configparser.RawConfigParser has a constructor argument inline_comment_prefixes. Example:
import configparser
class MyConfigParser(configparser.RawConfigParser):
    def __init__(self):
        configparser.RawConfigParser.__init__(self, inline_comment_prefixes=('#', ';'))
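A quick way to check what inline_comment_prefixes does is to feed the parser a string instead of a file. The section name and keys below are invented for the demonstration; only the first value comes from the question.

```python
import configparser

# Hypothetical ini content with inline and full-line comments
SAMPLE = """\
[main]
something = 128 ; comment line1
; further comments
name = demo # trailing note
"""

parser = configparser.RawConfigParser(inline_comment_prefixes=(";", "#"))
parser.read_string(SAMPLE)
values = dict(parser.items("main"))
print(values)
```

Note that an inline comment is only recognised when the prefix is preceded by whitespace, so a value such as a;b is left alone.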
I want to save magnet links in an ini file, so I inevitably have to store the "=" character in the file. However, Python treats "=" as the option/value separator, so Python IDLE returns "configparser.DuplicateOptionError: While reading from 'history.ini' [line 3]: option 'magnet' in section '0' already exists" when I use
configparser.ConfigParser().read('history.ini')
If you have any idea how to deal with this problem, please let me know. Thanks in advance.
I cannot reproduce the problem. I can save values containing = characters just fine:
test.ini
[Section]
Key=Val=ue
test.py
import configparser
cp = configparser.ConfigParser()
cp.read('test.ini')
print(cp['Section']['Key']) # Val=ue
I think the actual issue is just that you use the magnet key twice in the config (two lines which both start with magnet=).
If you want to have a list of multiple magnet links you can try using something like magnet00000=, magnet00001=, magnet00002=, and so on. Or switch to JSON.
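To make the diagnosis concrete, here is a small sketch that reproduces the duplicate-key error and then reads numbered keys back as a list. The hashes are placeholders, and passing interpolation=None is my own assumption: real magnet links often contain % escapes, which the default interpolation would try to interpret.

```python
import configparser

# Two lines with the same key: the default (strict) parser rejects this.
DUPLICATED = """\
[0]
magnet = magnet:?xt=urn:btih:aaa&dn=first
magnet = magnet:?xt=urn:btih:bbb&dn=second
"""

# Numbered keys keep every link.
NUMBERED = """\
[0]
magnet00000 = magnet:?xt=urn:btih:aaa&dn=first
magnet00001 = magnet:?xt=urn:btih:bbb&dn=second
"""

try:
    configparser.ConfigParser().read_string(DUPLICATED)
    duplicate_rejected = False
except configparser.DuplicateOptionError:
    duplicate_rejected = True

parser = configparser.ConfigParser(interpolation=None)
parser.read_string(NUMBERED)
# Sort by key so the links come back in the order of their numeric suffix
links = [value for key, value in sorted(parser["0"].items()) if key.startswith("magnet")]
print(duplicate_rejected, links)
```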
I am using the ConfigParser module like this:
from ConfigParser import ConfigParser
c = ConfigParser()
c.read("pymzq.ini")
However, the section name gets botched like this:
>>> c.sections()
['pyzmq:platform.architecture()[0']
for the pymzq.ini file, which has ] in the title to mean something:
[pyzmq:platform.architecture()[0] == '64bit']
url = ${pkgserver:fullurl}/pyzmq/pyzmq-2.2.0-py2.7-linux-x86_64.egg
Looks like ConfigParser uses a regex that only parses section lines up to the first closing bracket, so this is as expected.
You should be able to subclass ConfigParser/RawConfigParser and change that regexp to something that better suits your case, such as ^\[(?P<header>.+)\]$, maybe.
Thanks for the pointer @AKX, I went with:
import re
from ConfigParser import ConfigParser
class MyConfigParser(ConfigParser):
    _SECT_TMPL = r"""
        \[                 # [
        (?P<header>[^$]+)  # Till the end of line
        \]                 # ]
        """
    SECTCRE = re.compile(_SECT_TMPL, re.VERBOSE)
Please let me know if you have any better versions. Source code of the original ConfigParser.
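For anyone on Python 3, the same SECTCRE hook is available on configparser. The sketch below uses the anchored pattern suggested earlier in this thread; the class name and the example URL are made up, and I use RawConfigParser so that the ${...} text in the value is left untouched. As far as I know, Python 3's default pattern is already more permissive than the Python 2 one, so the override may not be strictly needed there.

```python
import configparser
import re


class BracketFriendlyParser(configparser.RawConfigParser):
    # Greedy match: the header runs up to the LAST closing bracket on the line
    SECTCRE = re.compile(r"^\[(?P<header>.+)\]\s*$")


SAMPLE = (
    "[pyzmq:platform.architecture()[0] == '64bit']\n"
    "url = ${pkgserver:fullurl}/pyzmq/example.egg\n"
)

parser = BracketFriendlyParser()
parser.read_string(SAMPLE)
sections = parser.sections()
print(sections)
```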
I am creating a script which needs to parse the YAML that Puppet outputs.
When I make a request against, for example, https://puppet:8140/production/catalog/my.testserver.no, I get some YAML back that looks something like:
--- &id001 !ruby/object:Puppet::Resource::Catalog
aliases: {}
applying: false
classes:
- s_baseconfig
...
edges:
- &id111 !ruby/object:Puppet::Relationship
source: &id047 !ruby/object:Puppet::Resource
catalog: *id001
exported:
and so on... The problem is that when I do a yaml.load(yamlstream), I get an error like:
yaml.constructor.ConstructorError: could not determine a constructor for the tag '!ruby/object:Puppet::Resource::Catalog'
in "<string>", line 1, column 5:
--- &id001 !ruby/object:Puppet::Reso ...
^
As far as I know, this &id001 part is supported in yaml.
Is there any way around this? Can I tell the yaml parser to ignore them?
I only need a couple of lines from the yaml stream, maybe regex is my friend here?
Anyone done any yaml cleanup regexes before?
You can get the yaml output with curl like:
curl --cert /var/lib/puppet/ssl/certs/$(hostname).pem --key /var/lib/puppet/ssl/private_keys/$(hostname).pem --cacert /var/lib/puppet/ssl/certs/ca.pem -H 'Accept: yaml' https://puppet:8140/production/catalog/$(hostname)
I also found some info about this on the Puppet mailing list @ http://www.mail-archive.com/puppet-users@googlegroups.com/msg24143.html. But I can't get it to work correctly...
I emailed Kirill Simonov, the creator of PyYAML, to get help parsing a Puppet YAML file.
He gladly helped with the following code. This code is for parsing a Puppet log, but I'm sure you can modify it to parse other Puppet YAML files.
The idea is to create the correct loader for the Ruby object, then PyYAML can read the data after that.
Here goes:
#!/usr/bin/env python
import yaml
def construct_ruby_object(loader, suffix, node):
    return loader.construct_yaml_map(node)
def construct_ruby_sym(loader, node):
    return loader.construct_yaml_str(node)
yaml.add_multi_constructor(u"!ruby/object:", construct_ruby_object)
yaml.add_constructor(u"!ruby/sym", construct_ruby_sym)
stream = file('201203130939.yaml','r')
mydata = yaml.load(stream)
print mydata
I believe the crux of the matter is the fact that puppet is using yaml "tags" for ruby-fu, and that's confusing the default python loader. In particular, PyYAML has no idea how to construct a ruby/object:Puppet::Resource::Catalog, which makes sense, since that's a ruby object.
Here's a link showing some various uses of yaml tags: http://www.yaml.org/spec/1.2/spec.html#id2761292
I've gotten past this in a brute-force approach by simply doing something like:
cat the_yaml | sed 's#\!ruby/object.*$##gm' > cleaner.yaml
but now I'm stuck on an issue where the *resource_table* block is confusing PyYAML with its complex keys (the use of '? ' to indicate the start of a complex key, specifically).
If you find a nice way around that, please let me know... but given how tied at the hip Puppet is to Ruby, it may just be easier to write your script directly in Ruby.
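If you would rather stay in Python for the brute-force cleanup, a rough equivalent of the sed one-liner can be written with re. Note this is a narrower variant that I made up: it removes only the tag token itself rather than everything to the end of the line, and it does nothing about the complex-key problem mentioned above. strip_ruby_tags and the sample text are illustrative names, not part of any library.

```python
import re

# Matches a "!ruby/object:Some::Class" tag plus the spaces or tabs in front of it
RUBY_TAG = re.compile(r"[ \t]*!ruby/object\S*")


def strip_ruby_tags(text):
    """Return the YAML text with Ruby object tags removed."""
    return RUBY_TAG.sub("", text)


SAMPLE = (
    "--- &id001 !ruby/object:Puppet::Resource::Catalog\n"
    "aliases: {}\n"
    "edges:\n"
    "- &id111 !ruby/object:Puppet::Relationship\n"
)

cleaned = strip_ruby_tags(SAMPLE)
print(cleaned)
```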
I only needed the classes section. So I ended up creating this little python function to strip it out...
Hope it's useful for someone :)
#!/usr/bin/env python
import re
def getSingleYamlClass(className, yamlList):
    printGroup = False
    groupIndent = 0
    firstInGroup = False
    output = ''
    for line in yamlList:
        # Count how many spaces in the beginning of our line
        spaceCount = len(re.findall(r'^[ ]*', line)[0])
        cleanLine = line.strip()
        if cleanLine == className:
            printGroup = True
            groupIndent = spaceCount
            firstInGroup = True
        if printGroup and (spaceCount > groupIndent) or firstInGroup:
            # Strip away the X amount of spaces for this group, so we get valid yaml
            output += re.sub(r'^[ ]{%s}' % groupIndent, '', line) + '\n'
            firstInGroup = False # Reset this
        else:
            # End of our group, reset
            groupIndent = 0
            printGroup = False
    return output
getSingleYamlClass('classes:', open('puppet.yaml').readlines())
Simple YAML parser:
import yaml
with open("file","r") as file:
    for line in file:
        # "result" rather than "re", so the name does not shadow the re module
        result = yaml.load('\n'.join(line.split('?')[1:-1]).replace('?','\n').replace('""','\'').replace('"','\''))
        # print '\n'.join(line.split('?')[1:-1])
        # print '\n'.join(line.split('?')[1:-1]).replace('?','\n').replace('""','\'').replace('"','\'')
        print line
        print result
I'm trying to read a Java multiline i18n properties file, with lines like:
messages.welcome=Hello\
World!
messages.bye=bye
Using this code:
import configobj
properties = configobj.ConfigObj(propertyFileName)
But it fails with multiline values.
Any suggestions?
According to the ConfigObj documentation, configobj requires you to surround multiline values in triple quotes:
Values that contain line breaks
(multi-line values) can be surrounded
by triple quotes. These can also be
used if a value contains both types of
quotes. List members cannot be
surrounded by triple quotes:
If modifying the properties file is out of the question, I suggest using configparser:
In config parsers, values can span
multiple lines as long as they are
indented more than the key that holds
them. By default parsers also let
empty lines to be parts of values.
Here's a quick proof of concept:
#!/usr/bin/env python
# coding: utf-8
from __future__ import print_function
try:
    import ConfigParser as configparser
except ImportError:
    import configparser
try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO
test_ini = """
[some_section]
messages.welcome=Hello\
 World
messages.bye=bye
"""
config = configparser.ConfigParser()
# readfp() was removed in Python 3.12; use read_file() when it exists
read = getattr(config, 'read_file', None) or config.readfp
read(StringIO(test_ini))
print(config.items('some_section'))
Output:
[('messages.welcome', 'Hello World'),
('messages.bye', 'bye')]
Thanks for the answers, this is what I finally did:
Add the section to the first line of the properties file
Remove empty lines
Parse with configparser
Remove the first line (the section added in the first step)
This is an extract of the code:
#!/usr/bin/python
...
# Add the section
subprocess.Popen(['/bin/bash','-c','sed -i \'1i [default]\' '+srcDirectory+"/*.properties"], stdout=subprocess.PIPE)
# Remove empty lines (each one is replaced with a "#" comment line)
subprocess.Popen(['/bin/bash','-c','sed -i \'s/^$/#/g\' '+srcDirectory+"/*.properties"], stdout=subprocess.PIPE)
# Get all i18n files
files=glob.glob(srcDirectory+"/"+baseFileName+"_*.properties")
config = ConfigParser.ConfigParser()
for propFile in files:
    ...
    config.read(propertyFileName)
    value=config.get('default',"someproperty")
    ...
# Remove section
subprocess.Popen(['/bin/bash','-c','sed -i \'1d\' '+srcDirectory+"/*.properties"], stdout=subprocess.PIPE)
I still have trouble with multiline values whose continuation lines don't start with a space. I just fixed them manually, but a sed command could do the trick.
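An alternative that avoids editing the files in place is to join the backslash-continued lines yourself before splitting on "=". The function below is only a sketch of a small subset of the .properties format, and parse_properties is a name I made up: it handles only "=" as the separator, "#" or "!" comments, and a trailing backslash as a continuation marker. It does not handle ":" separators, unicode escapes, or an escaped backslash at the end of a line.

```python
def parse_properties(text):
    """Parse a simplified subset of the Java .properties format into a dict."""
    properties = {}
    buffered = ""  # text collected so far from backslash-continued lines
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not buffered and (not line or line[0] in "#!"):
            continue  # blank line or comment outside of a continuation
        if line.endswith("\\"):
            buffered += line[:-1]  # drop the backslash, keep collecting
            continue
        key, _, value = (buffered + line).partition("=")
        properties[key.strip()] = value.strip()
        buffered = ""
    return properties


SAMPLE = "messages.welcome=Hello\\\nWorld!\nmessages.bye=bye\n"
parsed = parse_properties(SAMPLE)
print(parsed)
```

Leading whitespace on a continuation line is dropped, so the two halves are joined without a space unless one is written before the backslash.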
Format your properties file like this:
messages.welcome="""Hello
World!"""
messages.bye=bye
Give a try to ConfigParser
I don't understand anything in the Java broth, but a regex would help you, I hope:
import re
ch = '''messages.welcome=Hello
World!
messages.bye=bye'''
regx = re.compile('^(messages\.[^= \t]+)[ \t]*=[ \t]*(.+?)(?=^messages\.|\Z)',re.MULTILINE|re.DOTALL)
print regx.findall(ch)
result
[('messages.welcome', 'Hello\n World! \n'), ('messages.bye', 'bye')]
I'm writing a simple program in Python 2.7 using the pycURL library to submit file contents to pastebin.
Here's the code of the program:
#!/usr/bin/env python2
import pycurl, os
def send(file):
    print "Sending file to pastebin...."
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, "http://pastebin.com/api_public.php")
    curl.setopt(pycurl.POST, True)
    curl.setopt(pycurl.POSTFIELDS, "paste_code=%s" % file)
    curl.setopt(pycurl.NOPROGRESS, True)
    curl.perform()
def main():
    content = raw_input("Provide the FULL path to the file: ")
    open = file(content, 'r')
    send(open.readlines())
    return 0
main()
The output pastebin looks like standard Python list: ['string\n', 'line of text\n', ...] etc.
Is there any way I could format it so it looks better and it's actually human-readable? Also, I would be very happy if someone could tell me how to use multiple data inputs in POSTFIELDS. Pastebin API uses paste_code as its main data input, but it can use optional things like paste_name that sets the name of the upload or paste_private that sets it private.
First, use .read() as virhilo said.
The other step is to use urllib.urlencode() to get a string:
curl.setopt(pycurl.POSTFIELDS, urllib.urlencode({"paste_code": file}))
This will also allow you to post more fields:
curl.setopt(pycurl.POSTFIELDS, urllib.urlencode({"paste_code": file, "paste_name": name}))
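In Python 3 the function lives in urllib.parse. Here is a sketch of building the POST body separately from pycurl, so the encoding can be checked on its own. build_post_body is a hypothetical helper, and the value "1" for paste_private is my assumption; check the Pastebin API documentation for the values it actually expects.

```python
from urllib.parse import urlencode


def build_post_body(paste_code, paste_name=None, private=False):
    """Return an application/x-www-form-urlencoded body for the paste fields."""
    fields = {"paste_code": paste_code}
    if paste_name is not None:
        fields["paste_name"] = paste_name
    if private:
        fields["paste_private"] = "1"  # assumed value, not confirmed by the source
    return urlencode(fields)


# Spaces, ampersands and newlines in the file contents are escaped safely
body = build_post_body("hello world & more\n", paste_name="test")
print(body)
```

The resulting string is what would be passed as the POSTFIELDS value.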
import pycurl, os
def send(file_contents, name):
    print "Sending file to pastebin...."
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, "http://pastebin.com/api_public.php")
    curl.setopt(pycurl.POST, True)
    curl.setopt(pycurl.POSTFIELDS, "paste_code=%s&paste_name=%s" \
                % (file_contents, name))
    curl.setopt(pycurl.NOPROGRESS, True)
    curl.perform()
if __name__ == "__main__":
    content = raw_input("Provide the FULL path to the file: ")
    with open(content, 'r') as f:
        send(f.read(), "yournamehere")
    print
When reading files, use the with statement (this makes sure your file gets closed properly if something goes wrong).
There's no need to have a main function and then call it. Use the if __name__ == "__main__" construct to have your script run automagically when called (but not when it is imported as a module).
For posting multiple values, you can manually build the URL: just separate the different key, value pairs with an ampersand (&). Like this: key1=value1&key2=value2. Or you can build one with urllib.urlencode (as others suggested).
EDIT: using urllib.urlencode on strings which are to be posted makes sure content is encoded properly when your source string contains some funny / reserved / unusual characters.
Use .read() instead of .readlines().
The POSTFIELDS should be sent the same way as you send query string arguments. So, in the first place, it's necessary to encode the string that you're sending as paste_code, and then, using &, you can add more POST arguments.
Example:
paste_code=hello%20world&paste_name=test
Good luck!