how to create a class built on file.io - python

I am writing a module that converts between several different file formats (e.g. VHDL to Verilog, Excel table to VHDL, etc.). It's not so hard, but there is a lot of language-specific formatting to do. It just occurred to me that an elegant way to do this would be to have a class type for each file format, built on file io. The class would inherit the methods of file, but also the ability to read or write syntax specific to that format. I could not find any examples of a file io superclass or how to write one. My idea was that to instantiate it (open the file) I could use:
my_lib_file = Libfile(filename, 'w')
and to write a simple parameter to the libfile I could use something like
my_lib_file.simple_parameter(param, value)
Such a class would tie together the many file specific functions I currently have in a neat way. Actually I would prefer to be able to instantiate the class as part of a with statement e.g.:
with Libfile(filename, 'w') as my_lib_file:
    for param, value in my_stuff.items():
        my_lib_file.simple_parameter(param, value)

This is the wrong way to think about it.
You inherit in order to be reused: the base class provides an interface which others can use. For file-like objects that's mainly read and write. But here you only want callers to use another function, simple_parameter; calling write directly could mess up the format.
Really, you don't want it to be a file-like object at all. You want to write to a file when the user calls simple_parameter. The implementation should delegate to a member file-like object instead, e.g.:
class LibFile:
    def __init__(self, file):
        self.file = file

    def simple_parameter(self, param, value):
        self.file.write('{}: {}\n'.format(param, value))
This is easy to test as you could pass in anything that supports write:
>>> import sys
>>> lib = LibFile(sys.stdout)
>>> lib.simple_parameter('name', 'Stephen')
name: Stephen
edit:
If you really want the class to manage the lifetime of the file you can provide a close function and use the closing context manager:
class Formatter:
    def __init__(self, filename, mode):
        self.file = open(filename, mode)

    def close(self):
        self.file.close()
Usage:
class LibFormatter(Formatter):
    def simple_parameter(self, param, value):
        self.file.write('{}: {}\n'.format(param, value))

from contextlib import closing

with closing(LibFormatter('library.txt', 'w')) as lib:
    ...  # etc
2nd edit:
If you don't want to use closing, you can write your own context manager:
class ManagedFile:
    def __init__(self, filename, mode):
        self.file = open(filename, mode)

    def __enter__(self):
        return self

    def __exit__(self, *args):
        self.close()

    def close(self):
        self.file.close()
Usage:
class LibFormatter(ManagedFile):
    def simple_parameter(self, param, value):
        self.file.write('{}: {}\n'.format(param, value))

with LibFormatter('library.txt', 'w') as lib:
    ...  # etc

My two line solution is as follows:
with open(lib_loc + '\\' + lib_name + '.lib', 'w') as lib_file_handle:
    lib_file = Liberty(lib_file_handle)
    # do stuff using lib_file
the class initialization is as follows:
def __init__(self, file):
    ''' associate this instance with the given file handle '''
    self.f = file
Now, instead of passing the raw file handle to my functions, I pass the class instance, which carries the formatting functions with it.
The simplest function is:
def wr(self, line):
    ''' write line given to file '''
    self.f.write(line + '\n')
Which means I am replicating the write method built into the file object, which is what I was trying to avoid.
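One way to avoid duplicating the file object's methods, without subclassing it, is to forward unknown attribute lookups to the wrapped handle with __getattr__. This is a sketch, not part of the original code; the Liberty name is reused for illustration:

```python
import io

class Liberty:
    '''Wraps a file handle; format-specific methods live here,
    everything else falls through to the underlying file.'''
    def __init__(self, file):
        self.f = file

    def simple_parameter(self, param, value):
        self.f.write('{}: {}\n'.format(param, value))

    def __getattr__(self, name):
        # Called only when normal lookup fails, so self.f.write,
        # self.f.close etc. are reachable without re-wrapping them.
        return getattr(self.f, name)

buf = io.StringIO()
lib = Liberty(buf)
lib.simple_parameter('name', 'x')
lib.write('raw line\n')   # delegated to the StringIO handle
```

The trade-off is that the wrapper no longer controls every write, so the format guarantee from the accepted answer is weakened.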

I have now found a satisfactory way of doing what I wanted. The following is my base class, which is built on the base functions of file_io (but is not a subclass) and a simple example for writing CSV files. I also have Formatters for HTML, Verilog and others. Code is:
class Formatter():
    ''' base class to manage the opening of a file, in order to build classes which write a file
        with a specific format, so as to be able to pass the formatting functions to a subroutine
        along with the file handle.
        Designed for use with the "with" statement, and to shorten the argument lists of functions
        which use the file.
    '''
    def __init__(self, filename):
        ''' open the given file for writing and associate this instance with it
        '''
        self.f = open(filename, 'w')

    def wr(self, line, end='\n'):
        ''' write line given to file
        '''
        self.f.write(line + end)

    def wr_newline(self, num):
        ''' write num newlines to file
        '''
        self.f.write('\n' * num)

    def __enter__(self):
        ''' needed for use with with statement
        '''
        return self

    def __exit__(self, *args):
        ''' needed for use with with statement
        '''
        self.close()

    def close(self):
        ''' explicit file close for use with procedural programming style
        '''
        self.f.close()

class CSV(Formatter):
    ''' class to write items using comma-separated-value formatting
        inherits:
        => wr(line, end='\n')
        => wr_newline(n)
        all file io functions via the self.f variable
    '''
    @staticmethod
    def pp(item):
        ''' pretty-print - can't have , or \n in strings used in CSV files
        '''
        return str(item).replace('\n', '/').replace(',', '-')

    def __init__(self, filename):
        ''' open the file given as a CSV file
        '''
        super().__init__(filename + '.csv')

    def wp(self, item):
        ''' write a single item to the file
        '''
        self.f.write(self.pp(item) + ', ')

    def ws(self, itemlist):
        ''' write a csv line from a list variable
        '''
        self.wr(','.join([self.pp(item) for item in itemlist]))
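A quick usage check of the CSV class (these are condensed copies of the classes above so the snippet runs standalone; the temp-file path is illustrative):

```python
import os
import tempfile

# Condensed copies of the Formatter/CSV classes above, so the
# example runs on its own.
class Formatter:
    def __init__(self, filename):
        self.f = open(filename, 'w')
    def wr(self, line, end='\n'):
        self.f.write(line + end)
    def wr_newline(self, num):
        self.f.write('\n' * num)
    def __enter__(self):
        return self
    def __exit__(self, *args):
        self.close()
    def close(self):
        self.f.close()

class CSV(Formatter):
    @staticmethod
    def pp(item):
        return str(item).replace('\n', '/').replace(',', '-')
    def __init__(self, filename):
        super().__init__(filename + '.csv')
    def ws(self, itemlist):
        self.wr(','.join(self.pp(item) for item in itemlist))

base = os.path.join(tempfile.mkdtemp(), 'results')
with CSV(base) as out:                 # opens results.csv for writing
    out.ws(['name', 'value'])          # header row
    out.ws(['clk,period', '10ns'])     # the comma is sanitized to '-'

with open(base + '.csv') as fh:
    content = fh.read()
```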


Python mixins with context managers don't solve "super" call correctly

I am writing a class representing a file. This class has some optional features: normally files are stored in memory, but sometimes there is a need for storing them on disk, sometimes I want to store them as zip files, and so on. I decided to use mixins: I can subclass the File class and, when needed, add the mixins I actually need for a given case. In this situation, reading/writing a file is an operation that requires some preparation and some cleanup (e.g. I need to unzip the file, perform some write, and then zip the updated version again). For this purpose I wanted to use custom context managers, to ensure these actions are performed even if there's an exception or return statement in the middle of the with statement. Here's my code:
class File(object):
    def read(self):
        return "file content"

class ZipMixin(object):
    def read(self):
        with self:
            return super(ZipMixin, self).read()

    def __enter__(self):
        print("Unzipping")
        return self

    def __exit__(self, *args):
        print("Zipping back")

class SaveMixin(object):
    def read(self):
        with self:
            return super(SaveMixin, self).read()

    def __enter__(self):
        print("Loading to memory")
        return self

    def __exit__(self, *args):
        print("Removing from memory, saving on disk")

class SaveZipFile(SaveMixin, ZipMixin, File):
    pass

f = SaveZipFile()
print(f.read())
However, the output is quite disappointing:
Loading to memory
Loading to memory
Removing from memory, saving on disk
Removing from memory, saving on disk
file content
while it should be:
Loading to memory
Unzipping
Zipping back
Removing from memory, saving on disk
file content
Apparently, all calls to super in mixins with context managers are not passed "in chain" to all mixins, but rather two times to first mixin, then directly to superclass (omitting intermediate mixins). I tested it both with python 2 and 3, same result. What is wrong?
What happens?
The super calls work as you expect them to: the read methods of both of your mixins are called in the expected order.
However, you use with self: in both the SaveMixin and ZipMixin read methods.
self is the same object in both cases, resulting in the same __enter__ and __exit__ methods being used, regardless of the declaring class.
According to the method resolution order of the SaveZipFile class, the methods of the SaveMixin class are used:
>>> SaveZipFile.__mro__
(<class '__main__.SaveZipFile'>, <class '__main__.SaveMixin'>, <class '__main__.ZipMixin'>, <class '__main__.File'>, <class 'object'>)
In short, the read methods of your SaveMixin and ZipMixin classes are called in the correct order, but with self: uses the __enter__ and __exit__ methods of the SaveMixin class both times.
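You can verify this directly: attribute lookup on the instance always starts at the front of the MRO, so both mixins see SaveMixin's __enter__ when they execute with self:. A minimal check, using stripped-down copies of the classes above:

```python
class File(object):
    def read(self):
        return "file content"

class ZipMixin(object):
    def __enter__(self):
        return "zip enter"
    def __exit__(self, *args):
        pass

class SaveMixin(object):
    def __enter__(self):
        return "save enter"
    def __exit__(self, *args):
        pass

class SaveZipFile(SaveMixin, ZipMixin, File):
    pass

f = SaveZipFile()
# Lookup of __enter__ on the instance walks the MRO and stops at the
# first match, which is SaveMixin - no matter whose read() is running.
which = type(f).__enter__(f)
```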
How can this be resolved?
It seems like the with statement is not optimal for use with mixins, but a possible solution is using the Decorator pattern:
class File(object):
    def read(self):
        return "file content"

class ZipDecorator(object):
    def __init__(self, inner):
        self.inner = inner

    def read(self):
        with self:
            return self.inner.read()

    def __enter__(self):
        print("Unzipping")
        return self

    def __exit__(self, *args):
        print("Zipping back")

class SaveDecorator(object):
    def __init__(self, inner):
        self.inner = inner

    def read(self):
        with self:
            return self.inner.read()

    def __enter__(self):
        print("Loading to memory")
        return self

    def __exit__(self, *args):
        print("Removing from memory, saving on disk")

class SaveZipFile(object):
    def read(self):
        decorated_file = SaveDecorator(
            ZipDecorator(
                File()
            )
        )
        return decorated_file.read()

f = SaveZipFile()
print(f.read())
Output:
Loading to memory
Unzipping
Zipping back
Removing from memory, saving on disk
file content
The self that you're passing around is of type SaveZipFile. If you look at the MRO (method resolution order) of SaveZipFile, it's something like this:
             object
            /   |   \
    SaveMixin ZipMixin File
            \   |   /
          SaveZipFile
When you write with self:, it ends up calling self.__enter__(). Since self is of type SaveZipFile, lookup follows that class's MRO (going "up" the graph, searching the paths left to right), and finds a match on the first path, in SaveMixin.
If you're going to offer the zip and save functionality as mixins, you're probably better off using the try/finally pattern and letting super determine which class's method should be called and in what order:
class File(object):
    def read(self):
        return "file content"

class ZipMixin(object):
    def read(self):
        try:
            print("Unzipping")
            return super(ZipMixin, self).read()
        finally:
            print("Zipping back")

class SaveMixin(object):
    def read(self):
        try:
            print("Loading to memory")
            return super(SaveMixin, self).read()
        finally:
            print("Removing from memory, saving on disk")

class SaveZipFile(SaveMixin, ZipMixin, File):
    pass
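A quick run of this try/finally version (copied here with stdout captured) shows the calls nesting in MRO order, which is exactly the output the question asked for:

```python
import io
from contextlib import redirect_stdout

class File(object):
    def read(self):
        return "file content"

class ZipMixin(object):
    def read(self):
        try:
            print("Unzipping")
            return super(ZipMixin, self).read()
        finally:
            print("Zipping back")

class SaveMixin(object):
    def read(self):
        try:
            print("Loading to memory")
            return super(SaveMixin, self).read()
        finally:
            print("Removing from memory, saving on disk")

class SaveZipFile(SaveMixin, ZipMixin, File):
    pass

buf = io.StringIO()
with redirect_stdout(buf):
    result = SaveZipFile().read()
# SaveMixin runs first (front of the MRO), its finally-block runs last,
# so the zip steps nest inside the save steps.
lines = buf.getvalue().splitlines()
```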

Module-level functions to instantiate classes

Is it a good practice to instantiate a class using a module-level function in the same file? I use YAML to create instances of the object.
This is an example of what I mean.
# file.py
def newfile(path):
    with open(path) as f:
        return File(f.read())

class File(object):
    def __init__(self, content=None):
        self.content = content
# main.py
from file import newfile

file = newfile("/path/to/file.txt")
Another option could be to create a class method to do the same.
# file.py
class File(object):
    def __init__(self, content=None):
        self.content = content

    @classmethod
    def new(cls, path):
        with open(path) as f:
            return cls(f.read())

# main.py
from file import File

file = File.new("/path/to/file.txt")
The reason I need to do something like this is that I load objects from YAML to read and write lots of files, so I would like to do this in an organized, clean way. I have to give the attributes default values, since sometimes I also need to create empty objects.
This is exactly what a class method is for: providing an alternate constructor for your object.
class File(object):
    def __init__(self, content=None):
        self.content = content

    @classmethod
    def from_file(cls, path):
        with open(path) as f:
            return cls(f.read())
However, a potentially cleaner approach would be to have __init__ accept neither a file path nor raw data, but instead a file-like object. Then your class only needs to support a single use case, namely reading from a file-like object.
class File(object):
    def __init__(self, f):
        self.content = f.read()
Now, your caller is responsible for either opening a file:
with open(path) as fh:
    f = File(fh)
or creating some other file-like object:
import StringIO

f = File(StringIO.StringIO("some data here"))

Open file inside class

I have a noob question.
I need to write a class that opens a file in __init__, and a method that appends text to that opened file. How can I do this?
I tried something like the following, but it is not working, so please help.
file1.py
from logsystem import LogginSystem as logsys
file_location='/tmp/test'
file = logsys(file_location)
file.write('some message')
file2.py
class LogginSystem(object):
    def __init__(self, file_location):
        self.log_file = open(file_location, 'a+')

    def write(self, message):
        self.log_file.write(message)
Thanks
Like zwer already mentioned, you could use the __del__() method to achieve this behaviour.
__del__ is the Python equivalent of a destructor, and is called when the object is garbage collected. It is not guaranteed though that the object will actually be garbage collected (this is implementation dependent)!
Another, safer approach would be to use the __enter__ and __exit__ methods, which can be implemented in the following way:
class LogginSystem(object):
    def __init__(self, file_location):
        self.file_location = file_location

    def __enter__(self):
        # __enter__ is called with no arguments by the with statement,
        # so the file location has to come from __init__
        self.log_file = open(self.file_location, 'a+')
        return self

    def write(self, message):
        self.log_file.write(message)

    def __exit__(self, *args):
        self.log_file.close()
This allows you to use the with-statement for automatic cleanup:
from logsystem import LogginSystem as logsys

file_location = '/tmp/test'
with logsys(file_location) as file:
    file.write('some message')
You can read more about these methods and the with statement in the Python documentation.
Files provide a context manager for use in with statements, so the file is automatically closed when you are done with it.
You can leverage that in a class by deriving from contextlib.ExitStack like this:
import contextlib

class LogginSystem(contextlib.ExitStack):
    def __init__(self, file_location, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.log_file = self.enter_context(open(file_location, 'a+'))

    def write(self, message):
        self.log_file.write(message)
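Usage is then the same as any other context manager, since ExitStack's __exit__ closes everything registered with enter_context. A runnable sketch of the idea above (the temp path is illustrative, and append mode is assumed to match the question):

```python
import contextlib
import os
import tempfile

class LogginSystem(contextlib.ExitStack):
    def __init__(self, file_location, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # enter_context registers the file so ExitStack.__exit__ closes it
        self.log_file = self.enter_context(open(file_location, 'a+'))

    def write(self, message):
        self.log_file.write(message)

path = os.path.join(tempfile.mkdtemp(), 'test.log')
with LogginSystem(path) as log:
    log.write('some message')

closed = log.log_file.closed          # True: ExitStack closed the file
with open(path) as fh:
    content = fh.read()
```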

python: Organizing object model of an application

I have the following problem. My application randomly takes different files, e.g. rar, zip, 7z. And I have different processors to extract and save them locally:
Now everything looks this way:
if extension == 'zip':
    archive = zipfile.ZipFile(file_contents)
    file_name = archive.namelist()[0]
    file_contents = ContentFile(archive.read(file_name))
elif extension == '7z':
    archive = py7zlib.Archive7z(file_contents)
    file_name = archive.getnames()[0]
    file_contents = ContentFile(
        archive.getmember(file_name).read())
elif extension == '...':
elif extension == '...':
And I want to switch to more object oriented approach, with one main Processor class and subclasses responsible for specific archives.
E.g. I was thinking about:
class Processor(object):
    def __init__(self, filename, contents):
        self.filename = filename
        self.contents = contents

    def get_extension(self):
        return self.filename.split(".")[-1]

    def process(self):
        raise NotImplementedError("Need to implement something here")

class ZipProcessor(Processor):
    def process(self):
        archive = zipfile.ZipFile(self.contents)
        file_name = archive.namelist()[0]
        file_contents = ContentFile(archive.read(file_name))
etc
But I am not sure that's the correct way. E.g. I can't work out a way to call the needed processor based on the file extension with this design.
A rule of thumb is that if you have a class with two methods, one of which is __init__(), then it's not a class but a function in disguise.
Writing classes is overkill in this case, because you still have to use the correct class manually.
Since the handling of all kinds of archives will be subtly different, wrap each in a function;
def handle_zip(name):
    print name, 'is a zip file'
    return 'zip'

def handle_7z(name):
    print name, 'is a 7z file'
    return '7z'
Et cetera. Since functions are first-class objects in Python, you can use a dictionary using the extension as a key for calling the right function;
import os.path

filename = 'foo.zip'
dispatch = {'.zip': handle_zip, '.7z': handle_7z}
_, extension = os.path.splitext(filename)
try:
    rv = dispatch[extension](filename)
except KeyError:
    print 'Unknown extension', extension
    rv = None
It is important to handle the KeyError here, since dispatch doesn't contain all possible extensions.
An idea that might make sense before (or instead of) writing a custom class to perform your operations generally is making sure you offer a consistent interface to archives - wrapping zipfile.ZipFile and py7zlib.Archive7z into classes with, for example, a getfilenames method.
This approach ensures that you don't repeat yourself, without needing to "hide" your operations in a class if you don't want to.
You may want to use an abc (abstract base class) as the base class, to make things extra clear.
Then, you can simply:
archive_extractors = {'zip': MyZipExtractor, '7z': My7zExtractor}
extractor = archive_extractors[extension](file_contents)
file_name = extractor.getfilenames()[0]
# ...
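A sketch of what those adapter classes might look like; MyZipExtractor/My7zExtractor are illustrative names from the snippet above, and only the zip adapter is exercised here since py7zlib may not be installed:

```python
import io
import zipfile

class ZipExtractor:
    '''Adapter giving zipfile.ZipFile the consistent interface.'''
    def __init__(self, file_contents):
        self.archive = zipfile.ZipFile(file_contents)
    def getfilenames(self):
        return self.archive.namelist()
    def read(self, name):
        return self.archive.read(name)

# A py7zlib adapter would follow the same shape (untested sketch):
# class SevenZExtractor:
#     def __init__(self, file_contents):
#         self.archive = py7zlib.Archive7z(file_contents)
#     def getfilenames(self):
#         return self.archive.getnames()
#     def read(self, name):
#         return self.archive.getmember(name).read()

# Build an in-memory zip to demonstrate the dispatch.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as z:
    z.writestr('inner.txt', b'payload')
buf.seek(0)

archive_extractors = {'zip': ZipExtractor}
extractor = archive_extractors['zip'](buf)
first = extractor.getfilenames()[0]
data = extractor.read(first)
```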
If you want to stick to OOP, you could give Processor a static method to decide if a class can handle a certain file, and implement it in every subclass. Then, if you need to unpack a file, use the base class'es __subclasses__() method to iterate over the subclasses and create an instance of the appropriate one:
class Processor(object):
    @staticmethod
    def is_appropriate_for(name):
        raise NotImplementedError()

    def process(self, name):
        raise NotImplementedError()

class ZipProcessor(Processor):
    @staticmethod
    def is_appropriate_for(name):
        return name[-4:] == ".zip"

    def process(self, name):
        print ".. handling ", name

name = "test.zip"
handler = None
for cls in Processor.__subclasses__():
    if cls.is_appropriate_for(name):
        handler = cls()

print name, "handled by", handler

Best way to mix and match components in a python app

I have a component that uses a simple pub/sub module I wrote as a message queue. I would like to try out other implementations like RabbitMQ. However, I want to make this backend change configurable so I can switch between my implementation and 3rd party modules for cleanliness and testing.
The obvious answer seems to be to:
Read a config file
Create a modifiable settings object/dict
Modify the target component to lazily load the specified implementation.
something like :
# component.py
from test.queues import Queue

class Component:
    def __init__(self, Queue=Queue):
        self.queue = Queue()

    def publish(self, message):
        self.queue.publish(message)

# queues.py
import test.settings as settings

def Queue(*args, **kwargs):
    klass = settings.get('queue')
    return klass(*args, **kwargs)
I'm not sure whether __init__ should take in the Queue class, but I figure it would help in easily specifying the queue used while testing.
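Passing the class in does make testing simple: any object whose instances have a publish method can stand in for the real backend. A sketch with a hypothetical FakeQueue (not part of the original code):

```python
class Component:
    def __init__(self, Queue):
        self.queue = Queue()

    def publish(self, message):
        self.queue.publish(message)

class FakeQueue:
    '''Hypothetical stand-in used only for testing; records messages
    instead of sending them anywhere.'''
    def __init__(self):
        self.messages = []

    def publish(self, message):
        self.messages.append(message)

c = Component(Queue=FakeQueue)
c.publish('hello')
```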
Another thought I had was something like http://www.voidspace.org.uk/python/mock/patch.html though that seems like it would get messy. Upside would be that I wouldn't have to modify the code to support swapping component.
Any other ideas or anecdotes would be appreciated.
EDIT: Fixed indent.
One thing I've done before is to create a common class that each specific implementation inherits from. Then there's a spec that can easily be followed, and each implementation can avoid repeating certain code they'll all share.
This is a bad example, but you can see how you could make the saver object use any of the classes specified and the rest of your code wouldn't care.
class SaverTemplate(object):
    def __init__(self, name, obj):
        self.name = name
        self.obj = obj

    def save(self):
        raise NotImplementedError

import json

class JsonSaver(SaverTemplate):
    def save(self):
        file = open(self.name + '.json', 'wb')
        json.dump(self.obj, file)
        file.close()

import cPickle

class PickleSaver(SaverTemplate):
    def save(self):
        file = open(self.name + '.pickle', 'wb')
        cPickle.dump(self.obj, file, protocol=cPickle.HIGHEST_PROTOCOL)
        file.close()

import yaml

class YamlSaver(SaverTemplate):
    def save(self):
        file = open(self.name + '.yaml', 'wb')
        yaml.dump(self.obj, file)
        file.close()

saver = PickleSaver('whatever', foo)
saver.save()
