Using ConfigParser to read a file without section name - python

I am using ConfigParser to read the runtime configuration of a script.
I would like to have the flexibility of not providing a section name (there are scripts which are simple enough; they don't need a 'section'). ConfigParser will throw a NoSectionError exception, and will not accept the file.
How can I make ConfigParser simply retrieve the (key, value) tuples of a config file without section names?
For instance:
key1=val1
key2:val2
I would rather not write to the config file.

You can do this in a single line of code.
In python 3, prepend a fake section header to your config file data, and pass it to read_string().
from configparser import ConfigParser
parser = ConfigParser()
with open("foo.conf") as stream:
    parser.read_string("[top]\n" + stream.read())  # This line does the trick.
You could also use itertools.chain() to simulate a section header for read_file(). This might be more memory-efficient than the above approach, which might be helpful if you have large config files in a constrained runtime environment.
from configparser import ConfigParser
from itertools import chain
parser = ConfigParser()
with open("foo.conf") as lines:
    lines = chain(("[top]",), lines)  # This line does the trick.
    parser.read_file(lines)
In python 2, prepend a fake section header to your config file data, wrap the result in a StringIO object, and pass it to readfp().
from ConfigParser import ConfigParser
from StringIO import StringIO
parser = ConfigParser()
with open("foo.conf") as stream:
    stream = StringIO("[top]\n" + stream.read())  # This line does the trick.
    parser.readfp(stream)
With any of these approaches, your config settings will be available in parser.items('top').
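As a quick check of the idea, here is a minimal Python 3 sketch that wraps the read_string() variant in a helper and returns the settings as a plain dict. The helper name read_sectionless and the default section name "top" are illustrative choices for this sketch, not part of configparser.

```python
from configparser import ConfigParser


def read_sectionless(text, section="top"):
    """Parse section-less config text by prepending a fake section header.

    `read_sectionless` and `section` are hypothetical names for this sketch.
    """
    parser = ConfigParser()
    parser.read_string("[{}]\n{}".format(section, text))
    # items() yields (key, value) tuples; dict() makes lookups convenient.
    return dict(parser.items(section))


# The sample file from the question, using both "=" and ":" delimiters.
settings = read_sectionless("key1=val1\nkey2:val2\n")
print(settings)
```

Both delimiter styles from the question are handled because ConfigParser accepts "=" and ":" by default.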
You could use StringIO in python 3 as well, perhaps for compatibility with both old and new python interpreters, but note that it now lives in the io module and readfp() is now deprecated.
Alternatively, you might consider using a TOML parser instead of ConfigParser.

Alex Martelli provided a solution for using ConfigParser to parse .properties files (which are apparently section-less config files).
His solution is a file-like wrapper that will automagically insert a dummy section heading to satisfy ConfigParser's requirements.

Inspired by this answer by jterrace, I came up with this solution:
Read entire file into a string
Prefix with a default section name
Use StringIO to mimic a file-like object
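import ConfigParser
import StringIO

# Python 2: prepend a dummy section header, then parse from an in-memory file.
ini_str = '[root]\n' + open(ini_path, 'r').read()
ini_fp = StringIO.StringIO(ini_str)
config = ConfigParser.RawConfigParser()
config.readfp(ini_fp)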
EDIT for future googlers: As of Python 3.4+ readfp is deprecated, and StringIO is not needed anymore. Instead we can use read_string directly:
import configparser  # the module is lowercase in Python 3

with open('config_file') as f:
    file_content = '[dummy_section]\n' + f.read()
config_parser = configparser.RawConfigParser()
config_parser.read_string(file_content)

You can use the ConfigObj library to do that simply: http://www.voidspace.org.uk/python/configobj.html
Updated: Find latest code here.
If you are under Debian/Ubuntu, you can install this module using your package manager:
apt-get install python-configobj
An example of use:
from configobj import ConfigObj
config = ConfigObj('myConfigFile.ini')
config.get('key1') # You will get val1
config.get('key2') # You will get val2

The easiest way to do this is to use python's CSV parser, in my opinion. Here's a read/write function demonstrating this approach as well as a test driver. This should work provided the values are not allowed to be multi-line. :)
import csv
import operator
def read_properties(filename):
    """ Reads a given properties file with each line of the format key=value. Returns a dictionary containing the pairs.
    Keyword arguments:
    filename -- the name of the file to be read
    """
    result={ }
    with open(filename, "rb") as csvfile:
        reader = csv.reader(csvfile, delimiter='=', escapechar='\\', quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) != 2:
                raise csv.Error("Too many fields on row with contents: "+str(row))
            result[row[0]] = row[1]
    return result

def write_properties(filename,dictionary):
    """ Writes the provided dictionary in key-sorted order to a properties file with each line of the format key=value
    Keyword arguments:
    filename -- the name of the file to be written
    dictionary -- a dictionary containing the key/value pairs.
    """
    with open(filename, "wb") as csvfile:
        writer = csv.writer(csvfile, delimiter='=', escapechar='\\', quoting=csv.QUOTE_NONE)
        for key, value in sorted(dictionary.items(), key=operator.itemgetter(0)):
            writer.writerow([ key, value])

def main():
    data={
        "Hello": "5+5=10",
        "World": "Snausage",
        "Awesome": "Possum"
    }
    filename="test.properties"
    write_properties(filename,data)
    newdata=read_properties(filename)
    print "Read in: "
    print newdata
    print
    contents=""
    with open(filename, 'rb') as propfile:
        contents=propfile.read()
    print "File contents:"
    print contents
    print ["Failure!", "Success!"][data == newdata]
    return

if __name__ == '__main__':
    main()
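The code above targets Python 2 (binary file modes and print statements). A rough Python 3 adaptation, assuming the same single-line key=value format, might look like the sketch below. Opening the files in text mode with newline="" follows the csv module's documented recommendation; skipping blank lines is an addition of this sketch, not part of the original answer.

```python
import csv


def read_properties(filename):
    """Return a dict of the key=value pairs found in filename."""
    result = {}
    with open(filename, newline="") as csvfile:
        reader = csv.reader(csvfile, delimiter="=", escapechar="\\",
                            quoting=csv.QUOTE_NONE)
        for row in reader:
            if not row:
                continue  # blank line: skipped (an assumption of this sketch)
            if len(row) != 2:
                raise csv.Error("Expected key=value, got: " + str(row))
            result[row[0]] = row[1]
    return result


def write_properties(filename, dictionary):
    """Write the pairs in key-sorted order, one key=value per line."""
    with open(filename, "w", newline="") as csvfile:
        writer = csv.writer(csvfile, delimiter="=", escapechar="\\",
                            quoting=csv.QUOTE_NONE)
        for key in sorted(dictionary):
            # A literal "=" inside a value is written with a backslash escape.
            writer.writerow([key, dictionary[key]])
```

As with the original, this should only be relied on when values cannot span multiple lines.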

Having run into this problem myself, I wrote a complete wrapper around ConfigParser (the version in Python 2) that can read and write files without sections transparently, based on Alex Martelli's approach linked in the accepted answer. It should be a drop-in replacement for any usage of ConfigParser. Posting it in case anyone who needs it finds this page.
import ConfigParser
import StringIO


class SectionlessConfigParser(ConfigParser.RawConfigParser):
    """
    Extends ConfigParser to allow files without sections.
    This is done by wrapping read files and prepending them with a placeholder
    section, which defaults to '__config__'
    """

    def __init__(self, *args, **kwargs):
        default_section = kwargs.pop('default_section', None)
        ConfigParser.RawConfigParser.__init__(self, *args, **kwargs)
        self._default_section = None
        self.set_default_section(default_section or '__config__')

    def get_default_section(self):
        return self._default_section

    def set_default_section(self, section):
        self.add_section(section)
        # move all values from the previous default section to the new one
        try:
            default_section_items = self.items(self._default_section)
            self.remove_section(self._default_section)
        except ConfigParser.NoSectionError:
            pass
        else:
            for (key, value) in default_section_items:
                self.set(section, key, value)
        self._default_section = section

    def read(self, filenames):
        if isinstance(filenames, basestring):
            filenames = [filenames]
        read_ok = []
        for filename in filenames:
            try:
                with open(filename) as fp:
                    self.readfp(fp)
            except IOError:
                continue
            else:
                read_ok.append(filename)
        return read_ok

    def readfp(self, fp, *args, **kwargs):
        # The module (not the class) is imported above, so qualify the name.
        stream = StringIO.StringIO()
        try:
            stream.name = fp.name
        except AttributeError:
            pass
        stream.write('[' + self._default_section + ']\n')
        stream.write(fp.read())
        stream.seek(0, 0)
        return ConfigParser.RawConfigParser.readfp(self, stream, *args,
                                                   **kwargs)

    def write(self, fp):
        # Write the items from the default section manually and then remove them
        # from the data. They'll be re-added later.
        default_section_items = []  # stays empty if the section is missing
        try:
            default_section_items = self.items(self._default_section)
            self.remove_section(self._default_section)
            for (key, value) in default_section_items:
                fp.write("{0} = {1}\n".format(key, value))
            fp.write("\n")
        except ConfigParser.NoSectionError:
            pass
        ConfigParser.RawConfigParser.write(self, fp)
        self.add_section(self._default_section)
        for (key, value) in default_section_items:
            self.set(self._default_section, key, value)
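For Python 3, a much smaller subclass can apply the same placeholder-section idea. The following is only a sketch under stated assumptions: the placeholder_section keyword is a hypothetical name, only reading is covered (write() is left untouched), and read() is overridden because the standard implementation does not route through read_file().

```python
import configparser
import os
from itertools import chain


class SectionlessConfigParser(configparser.RawConfigParser):
    """Read-only sketch: prepend a placeholder section to every input."""

    def __init__(self, *args, placeholder_section="__config__", **kwargs):
        # `placeholder_section` is a hypothetical keyword for this sketch.
        super().__init__(*args, **kwargs)
        self._placeholder = placeholder_section

    def read_file(self, f, source=None):
        if source is None:
            source = getattr(f, "name", "<???>")
        header = "[{}]\n".format(self._placeholder)
        # chain() yields the fake header first, then the real lines lazily.
        super().read_file(chain((header,), f), source)

    def read(self, filenames, encoding=None):
        if isinstance(filenames, (str, bytes, os.PathLike)):
            filenames = [filenames]
        read_ok = []
        for filename in filenames:
            try:
                with open(filename, encoding=encoding) as fp:
                    self.read_file(fp, source=os.fspath(filename))
            except OSError:
                continue  # unreadable files are skipped, like the base class
            read_ok.append(os.fspath(filename))
        return read_ok
```

read_string() needs no override here because it delegates to read_file(), so parser.read_string("key1=val1\n") followed by parser.get("__config__", "key1") should work as expected.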

Blueicefield's answer mentioned configobj, but the original lib only supports Python 2. It now has a Python 3+ compatible port:
https://github.com/DiffSK/configobj
The API hasn't changed; see its documentation.

Related

Custom function to read file from current path in Python needs refactoring

I had to write a custom function to load a yaml file from the current working directory. The function itself works and my intention was to write it in a pure fashion but my senior colleague told me that the way I wrote this function is utterly bad and I have to rewrite it.
Which commandment in Python did I violate? Can anyone tell me what I did wrong here and what a "professional" solution would look like?
from typing import Dict
import yaml
from yaml import SafeLoader
from pathlib import Path
import os
def read_yaml_from_cwd(file: str) -> Dict:
    """[reads a yaml file from current working directory]
    Parameters
    ----------
    file : str
        [.yaml or .yml file]
    Returns
    -------
    Dict
        [Dictionary]
    """
    path = os.path.join(Path.cwd().resolve(), file)
    if os.path.isfile(path):
        with open(path) as f:
            content = yaml.load(f, Loader=SafeLoader)
            return content
    else:
        return None
content = read_yaml_from_cwd("test.yaml")
print(content)
The significant parts of your function can be reduced to this:
import yaml
from yaml import SafeLoader
def read_yaml_from_cwd(file):
    try:
        with open(file) as f:
            return yaml.load(f, Loader=SafeLoader)
    except Exception:
        pass
In this way, the function will return either a dict object, or None if the file cannot be opened or cannot be parsed by the yaml loader.

configparser does not show sections

I added a section and its values to an ini file, but configparser doesn't print the sections I have. What I've done:
import configparser
import os
# creating path
current_path = os.getcwd()
path = 'ini'
try:
    os.mkdir(path)
except OSError:
    print("Creation of the directory %s failed" % path)
# add section and its values
config = configparser.ConfigParser()
config['section-1'] = {'somekey' : 'somevalue'}
file = open(f'ini/inifile.ini', 'a')
with file as f:
    config.write(f)
file.close()
# get sections
config = configparser.ConfigParser()
file = open(f'ini/inifile.ini')
with file as f:
    config.read(f)
print(config.sections())
file.close()
returns
[]
Similar code was in the documentation, but it doesn't work. What am I doing wrong, and how can I solve this problem?
From the docs, config.read() takes in a filename (or list of them), not a file object:
read(filenames, encoding=None)
Attempt to read and parse an iterable of filenames, returning a list of filenames which were successfully parsed.
If filenames is a string, a bytes object or a path-like object, it is treated as a single filename. ...
If none of the named files exist, the ConfigParser instance will contain an empty dataset. ...
A file object is an iterable of strings, so basically the config parser is trying to read each string in the file as a filename. Which is sort of interesting and silly, because if you passed it a file that contained the filename of your actual config...it would work.
Anyways, you should pass the filename directly to config.read(), i.e.
config.read("ini/inifile.ini")
Or, if you want to use a file object instead, simply use config.read_file(f). Read the docs for read_file() for more information.
As an aside, you are duplicating some of the work the context manager is doing for no gain. You can use the with block without creating the object explicitly first or closing it after (it will get closed automatically). Keep it simple:
with open("path/to/file.txt") as f:
    do_stuff_with_file(f)
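To make the difference concrete, here is a small self-contained sketch that writes an ini file to a temporary directory and then loads it three ways. The helper names and the temporary path are illustrative only; the behaviour shown (an empty result when a file object is handed to read()) matches the explanation above.

```python
import configparser
import os
import tempfile


def sections_via_read(path):
    """Pass the filename to read(): the intended usage."""
    parser = configparser.ConfigParser()
    parser.read(path)
    return parser.sections()


def sections_via_read_file(path):
    """Pass an open file object to read_file()."""
    parser = configparser.ConfigParser()
    with open(path) as f:
        parser.read_file(f)
    return parser.sections()


def sections_via_read_with_file_object(path):
    """The mistake from the question: each line is tried as a filename."""
    parser = configparser.ConfigParser()
    with open(path) as f:
        parser.read(f)
    return parser.sections()


with tempfile.TemporaryDirectory() as tmp:
    ini_path = os.path.join(tmp, "inifile.ini")
    writer = configparser.ConfigParser()
    writer["section-1"] = {"somekey": "somevalue"}
    with open(ini_path, "w") as f:
        writer.write(f)

    by_name = sections_via_read(ini_path)
    by_file = sections_via_read_file(ini_path)
    by_mistake = sections_via_read_with_file_object(ini_path)

print(by_name, by_file, by_mistake)
```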

Mocking a file in python

In my python package I have a configuration module that reads a yaml file (when creating the instance) at an explicit location, i.e. something like
class YamlConfig(object):
    def __init__(self):
        filename = os.path.join(os.path.expanduser('~'), '.hanzo\\config.yml')
        with open(filename) as fs:
            self.cfg = yaml.load(fs.read())
Now what should I do when writing my unit test if I don't want to use the explicitly specified file? Instead I want to create a temporary config.yml to be used for testing.
I could simply allow for a specified filename in __init__(), but I strongly prefer forcing the filename location. I.e. like this
class YamlConfig(object):
    def __init__(self, filename=os.path.join(os.path.expanduser('~'), '.hanzo\\config.yml')):
        with open(filename) as fs:
            self.cfg = yaml.load(fs.read())
Are there other ways to solve my issue? I guess it might be possible using mock the right way? Also feel free to comment on the upsides and downsides.
You need to run the test inside the mock context:
import unittest.mock as umock
with umock.patch('__main__.open', umock.mock_open(read_data='yaml data')):
    # Run your test here and open the file
    with open('filename') as f:
        data = f.read()  # the mock hands back the read_data string
You might need to use the yaml library to generate YAML output for read_data.
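A slightly fuller sketch of the same idea, using only the standard library, is shown below. The function names and the fake path are hypothetical; builtins.open is patched here so the example does not depend on which module the code under test lives in.

```python
from unittest import mock


def load_text(path):
    """Stand-in for code under test that reads a config file from disk."""
    with open(path) as f:
        return f.read()


def load_with_fake_file(fake_contents, path="/nonexistent/config.yml"):
    """Run load_text() while open() is replaced by an in-memory mock."""
    opener = mock.mock_open(read_data=fake_contents)
    with mock.patch("builtins.open", opener):
        result = load_text(path)
    # The real filesystem was never touched; only the mock was called.
    opener.assert_called_once_with(path)
    return result


print(load_with_fake_file("name: demo\n"))
```

With this pattern the class can keep its hard-coded filename, since the test never reaches the real file.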
I did this for one of my projects. Here's what my function looked like (simplified):
def parse_template(cve: str) -> Dict:
    """Parse yaml templates and return relevant information
    Args:
        cve: CVE id to identify respective template
    Returns:
        Dict: return CVE name
    """
    year = cve.split("-")[1]
    file_path = f"/cves/{year}/{cve}.yaml"
    with open(file_path, "r") as f:
        data = yaml.load(f, Loader=yaml.SafeLoader)
    return {
        "name": data.get("name", "None")
    }
To write tests for this, we first create a python fixture which patches the "open" function which is responsible for opening my yaml file.
@pytest.fixture
def mock_template(mocker: MockFixture):
    """Mock yaml file"""
    sample_dict={
        "name":"Example cve name"
    }
    mocked_yaml = mocker.mock_open(read_data=yaml.dump(sample_dict))
    mocker.patch("builtins.open", mocked_yaml)
Then we just use this fixture to validate our function with something like:
def test_parse_template(mock_template:pytest.fixture) -> None:
    """Tests for opening yaml template and extract data"""
    result=parse_template(cve="CVE-2022-1337")
    assert "name" in result.keys() # or any other validation

Using ConfigParser to read non-standard config files

I am having a config file of the form
# foo.conf
[section1]
foo=bar
buzz=123
[section2]
line1
line2
line3
that I want to parse using the Python ConfigParser library. Note that section2 does not contain key/value pairs but some raw text instead. I would like to have a possibility to read all (raw) content of section2 to a variable.
Does ConfigParser allow me to read this file or can one of its classes be subclassed in an easy manner to do so?
Using the standard
import ConfigParser
config = ConfigParser.ConfigParser()
config.read('foo.conf')
yields ConfigParser.ParsingError: File contains parsing errors: foo.conf
You could try to use an io adapter to transform the input file into a format suitable for ConfigParser. One way is to transform each plain line (one that is not empty, a comment, a section header, or a key=value line) into linei=original_line, where i starts at 1 in each section and increases with each such line.
Possible code:
import io
import re


class ConfParsAdapter(io.RawIOBase):
    @staticmethod
    def _confParsAdapter(fd):
        num = 1
        rxsec = re.compile(r'\[.*\]( *#.*)?$')  # section header
        rxkv = re.compile('.+?=.*')             # key=value line
        rxvoid = re.compile('(#.*)?$')          # empty or comment line
        for line in fd:
            if rxsec.match(line.strip()):
                num = 1
            elif rxkv.match(line) or rxvoid.match(line.strip()):
                pass
            else:
                line = 'line{}={}'.format(num, line)
                num += 1
            yield(line)

    def __init__(self, fd):
        self.fd = self._confParsAdapter(fd)

    def readline(self, hint=-1):
        try:
            return next(self.fd)
        except StopIteration:
            return ""
That way, you could use with your current file without changing anything in it:
>>> parser = ConfigParser.RawConfigParser()
>>> parser.readfp(ConfParsAdapter(open('foo.conf')))
>>> parser.sections()
['section1', 'section2']
>>> parser.items('section2')
[('line1', 'line1'), ('line2', 'line2'), ('line3', 'line3')]
>>>
As far as I know, ConfigParser cannot do this:
The ConfigParser class implements a basic configuration file parser
language which provides a structure similar to what you would find on
Microsoft Windows INI files.
It seems that your conf file is not a basic configuration file, so there are maybe two ways you can parse it.
Read the conf file and modify it.
Generate a valid configuration file.
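If Python 3 is an option, another route (not mentioned in the answers above, so treat it as an alternative technique) is the allow_no_value flag of configparser, which accepts bare lines as keys without values. The sketch below makes several assumptions: only "=" is treated as a delimiter so that raw lines containing ":" are not split, key case is preserved by replacing optionxform, and the helper name raw_section_lines is made up for illustration.

```python
import configparser

SAMPLE = """\
[section1]
foo=bar
buzz=123
[section2]
line1
line2
line3
"""


def raw_section_lines(text, section):
    """Return the bare (value-less) lines of a section, in file order."""
    parser = configparser.RawConfigParser(allow_no_value=True,
                                          delimiters=("=",))
    parser.optionxform = str  # keep the original case of each line
    parser.read_string(text)
    # Bare lines are stored as option names whose value is None.
    return parser.options(section)


print(raw_section_lines(SAMPLE, "section2"))
```

Caveats worth keeping in mind: a raw line that contains "=" is still split into key and value, indented lines are treated as continuations, and repeated identical lines raise DuplicateOptionError unless strict=False is passed.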

How to read from a text file compressed with 7z?

I would like to read (in Python 2.7), line by line, from a csv (text) file, which is 7z compressed. I don't want to decompress the entire (large) file, but to stream the lines.
I tried pylzma.decompressobj() unsuccessfully. I get a data error. Note that this code doesn't yet read line by line:
input_filename = r"testing.csv.7z"
with open(input_filename, 'rb') as infile:
    obj = pylzma.decompressobj()
    o = open('decompressed.raw', 'wb')
    obj = pylzma.decompressobj()
    while True:
        tmp = infile.read(1)
        if not tmp: break
        o.write(obj.decompress(tmp))
    o.close()
Output:
o.write(obj.decompress(tmp))
ValueError: data error during decompression
This will allow you to iterate the lines. It's partially derived from some code I found in an answer to another question.
At this point in time (pylzma-0.5.0) the py7zlib module doesn't implement an API that would allow archive members to be read as a stream of bytes or characters — its ArchiveFile class only provides a read() function that decompresses and returns the uncompressed data in a member all at once. Given that, about the best that can be done is return bytes or lines iteratively via a Python generator using that as a buffer.
The following does the latter, but may not help if the problem is the archive member file itself is huge.
The code below should work in Python 3.x as well as 2.7.
import io
import os
import py7zlib
class SevenZFileError(py7zlib.ArchiveError):
    pass


class SevenZFile(object):
    @classmethod
    def is_7zfile(cls, filepath):
        """ Determine if filepath points to a valid 7z archive. """
        is7z = False
        fp = None
        try:
            fp = open(filepath, 'rb')
            archive = py7zlib.Archive7z(fp)
            _ = len(archive.getnames())
            is7z = True
        finally:
            if fp: fp.close()
        return is7z

    def __init__(self, filepath):
        fp = open(filepath, 'rb')
        self.filepath = filepath
        self.archive = py7zlib.Archive7z(fp)

    def __contains__(self, name):
        return name in self.archive.getnames()

    def readlines(self, name, newline=''):
        r""" Iterator of lines from named archive member.
        `newline` controls how line endings are handled.
        It can be None, '', '\n', '\r', and '\r\n' and works the same way as it does
        in StringIO. Note however that the default value is different and is to enable
        universal newlines mode, but line endings are returned untranslated.
        """
        archivefile = self.archive.getmember(name)
        if not archivefile:
            raise SevenZFileError('archive member %r not found in %r' %
                                  (name, self.filepath))
        # Decompress entire member and return its contents iteratively.
        data = archivefile.read().decode()
        for line in io.StringIO(data, newline=newline):
            yield line


if __name__ == '__main__':
    import csv

    if SevenZFile.is_7zfile('testing.csv.7z'):
        sevenZfile = SevenZFile('testing.csv.7z')
        if 'testing.csv' not in sevenZfile:
            print('testing.csv is not a member of testing.csv.7z')
        else:
            reader = csv.reader(sevenZfile.readlines('testing.csv'))
            for row in reader:
                print(', '.join(row))
If you were using Python 3.3+, you might be able to do this using the lzma module which was added to the standard library in that version.
See: lzma Examples
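One caveat before relying on this: as far as I know, the standard lzma module reads .xz and legacy .lzma streams, not the .7z archive container, so it only helps if the data can be stored as, say, testing.csv.xz instead. Under that assumption, a line-by-line reader is short; the function name iter_xz_lines is made up for this sketch.

```python
import lzma


def iter_xz_lines(path, encoding="utf-8"):
    """Yield text lines from an .xz (or .lzma) file without loading it whole.

    Assumes the file is a plain LZMA/XZ stream, not a .7z archive.
    """
    # Text mode ("rt") decodes on the fly; newline="" leaves line endings
    # untouched, which is what the csv module expects from its input.
    with lzma.open(path, mode="rt", encoding=encoding, newline="") as handle:
        for line in handle:
            yield line
```

The generator can be passed straight to csv.reader(), for example csv.reader(iter_xz_lines("testing.csv.xz")), where the filename is a placeholder.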
If you can use python 3, there is a useful library, py7zr, which supports partial extraction of 7zip archives, as below:
import py7zr
import re
filter_pattern = re.compile(r'<your/target/file_and_directories/regex/expression>')
with py7zr.SevenZipFile('archive.7z', 'r') as archive:
    allfiles = archive.getnames()
    # keep only the member names that match the pattern
    selective_files = [f for f in allfiles if filter_pattern.match(f)]
    archive.extract(targets=selective_files)
