Subclassing a file object to "fake" it as an iterable

My goal was to spare users from constantly calling seek(0) to restart reading a text file.
So I wrote a MyReader class that is a collections.Iterator and offers a .reset() method in place of seek(0). Between resets it continues from where it last yielded, because it keeps the underlying generator in self.iterable.
import collections.abc

# collections.Iterator was removed in Python 3.10; the ABC now lives in collections.abc.
class MyReader(collections.abc.Iterator):
    def __init__(self, filename):
        self.filename = filename
        self.iterable = self.__iterate__()

    def __iterate__(self):
        with open(self.filename) as fin:
            for line in fin:
                yield line.strip()

    def __iter__(self):
        for line in self.iterable:
            yield line

    def __next__(self):
        return next(self.iterable)

    def reset(self):
        self.iterable = self.__iterate__()
The usage would be something like:
$ cat english.txt
abc
def
ghi
jkl
$ python
>>> data = MyReader('english.txt')
>>> print(next(data))
abc
>>> print(next(data))
def
>>> data.reset()
>>> print(next(data))
abc
My question is: does this already exist somewhere in the Python-verse? In particular, if there's already a native object that does something like this, I would like to avoid reinventing the wheel =)
And if it doesn't exist, does the object look a little unpythonic? It claims to be an iterator, but the true iterator is actually self.iterable, and the other methods just wrap around it to do "resets".

I think it depends on your real situation. If you just want to get rid of file.seek(0), it can be simple:
class MyReader:
    def __init__(self, filename, mode="r"):
        self.file = open(filename, mode)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()

    def __iter__(self):
        # Rewind on every new iteration, so callers never need seek(0)
        self.file.seek(0)
        for line in self.file:
            yield line.strip()

    def close(self):
        self.file.close()
You can even use it like a normal context manager:
with MyReader("a.txt") as a:
    for line in a:
        print(line)
    for line in a:
        print(line)
output:
sdfas
asdf
asd
fas
df
asd
f
sdfas
asdf
asd
fas
df
asd
f

I have a couple of criticisms of your MyReader class. I was going to post an alternative that's a context manager but Sraw beat me to it. ;)
You shouldn't use names that start and end with double underscores like __iterate__. Such names are essentially reserved for the language implementors, and if an official __iterate__ magic method is added to the language your code will break. If you want a private method, you could name it _iterate.
There is a little problem with that __iterate__ method: its with block is only exited when the file has been completely read for the current self.iterable, so if the MyReader instance gets reset then you have an old open file sitting around, consuming a file descriptor. Sure, it'll get closed eventually, when the program exits (or you delete the MyReader instance), but it's messy IMHO.
Also, I'm not totally happy with the yield line.strip(). Sure, it's convenient most of the time when you're reading a text file, but in some cases the caller may want to look at any leading or trailing white space, and you've taken that option away from them.
BTW, that __iter__ method is redundant: collections.Iterator already supplies an __iter__ that returns self, so your class still does what it's supposed to do if you eliminate that method.
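For what it's worth, here is a rough sketch of how those criticisms could be addressed. It is only one possible variant, not a standard-library class: the names ResettableReader, _iterate and _lines are made up for illustration, and the temporary file exists only to make the demo self-contained. It relies on the fact that closing a generator exits any with block it is paused in.

```python
import collections.abc
import os
import tempfile


class ResettableReader(collections.abc.Iterator):
    """Hypothetical variant of MyReader; all names are illustrative only."""

    def __init__(self, filename):
        self.filename = filename
        self._lines = self._iterate()

    def _iterate(self):
        # Single leading underscore: private by convention, not a "dunder" name.
        with open(self.filename) as fin:
            # Yield lines untouched so the caller decides whether to strip them.
            yield from fin

    def __next__(self):
        return next(self._lines)

    def reset(self):
        # Closing the generator raises GeneratorExit at the paused yield,
        # which leaves the with block and closes the old file right away.
        self._lines.close()
        self._lines = self._iterate()

    def close(self):
        self._lines.close()


# Small demonstration on a throwaway file.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("abc\ndef\nghi\n")

data = ResettableReader(path)
first = next(data)
second = next(data)
data.reset()
after_reset = next(data)
data.close()
os.remove(path)
```

No __iter__ is defined here because the Iterator base class already provides one that returns self.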

Related

manually open context manager

My question is, how can I execute any context manager without using with?
Python has the idea of context managers,
instead of
file = open('some_file', 'w')
try:
    file.write('Hola!')
finally:
    file.close()
# end try
you can write
with open('some_file', 'w') as opened_file:
    opened_file.write('Hola!')
# end with
While the second form is the golden solution in most cases, the first can be more convenient in unit tests and when exploring in the interactive console, because you can enter it line by line.
>>> file = open('some_file', 'w')
>>> file.write('Hola!')
>>> file.close()
My question is: how can I drive any context manager like this, in a way suited to exploring?
My actual use case follows below, but please try to give an answer that is generic and will work for other context managers too.
import flask
app = flask.Flask(__name__)
with app.test_request_context('/?name=Peter'):
    assert flask.request.path == '/'
    assert flask.request.args['name'] == 'Peter'
(Example taken from the Flask docs.)
You can still use the with syntax in the interactive console. However, a context manager is built on two magic methods, __enter__ and __exit__, so you can simply call them yourself:
class MyCtx(object):
    def __init__(self, f):
        self.f = f

    def __enter__(self):
        print("Enter")
        return self.f

    def __exit__(*args, **kwargs):
        print("Exit")

def foo():
    print("Hello")
Usually you do:
with MyCtx(foo) as f:
    f()
Same as:
ctx = MyCtx(foo)
f = ctx.__enter__()
f()
ctx.__exit__()
Remember that a context manager's __exit__ method is also used to handle errors raised inside the block, so most of them have the signature __exit__(exception_type, exception_value, traceback). If you don't need to handle errors in your tests, just pass None for each argument:
__exit__(None, None, None)
You can assign the result of app.test_request_context('/?name=Peter') to a variable (e.g. ctx), then call ctx.__enter__() to enter the context and ctx.__exit__(None, None, None) to perform the cleanup. Note that you lose the safety guarantees of the with statement unless you put the ctx.__exit__ call in a finally clause.
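As a generic illustration of that advice, the sketch below enters and exits a context manager by hand, first with try/finally and then with contextlib.ExitStack, which is a standard-library helper for exactly this kind of step-by-step use. The demo_context function and the events list are invented for the example; any real context manager (such as the Flask one above) could take their place.

```python
from contextlib import ExitStack, contextmanager

events = []  # records what happened, in order


@contextmanager
def demo_context(name):
    """Hypothetical context manager used only for this demonstration."""
    events.append("enter " + name)
    try:
        yield name.upper()
    finally:
        events.append("exit " + name)


# 1) Manual calls, with finally restoring the usual cleanup guarantee.
ctx = demo_context("a")
value = ctx.__enter__()
try:
    events.append("body " + value)
finally:
    ctx.__exit__(None, None, None)

# 2) ExitStack: enter now, do other things, clean up later with close().
stack = ExitStack()
value2 = stack.enter_context(demo_context("b"))
events.append("body " + value2)
stack.close()
```

In an interactive session, the ExitStack form lets you type each step on its own line and call stack.close() whenever you are done exploring.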

how to create a class built on file.io

I am writing a module that converts between several different file formats (e.g. VHDL to Verilog, Excel table to VHDL, etc.). It's not so hard, but there is a lot of language-specific formatting to do. It just occurred to me that an elegant way to do this would be to have a class for each file format, built on file.io. The class would inherit the methods of file but also gain the ability to read or write format-specific syntax. I could not find any examples of a file I/O superclass or how to write one. My idea was that to instantiate it (open the file) I could use:
my_lib_file = Libfile(filename, 'w')
and to write a simple parameter to the libfile I could use something like
my_lib_file.simple_parameter(param, value)
Such a class would tie together the many file specific functions I currently have in a neat way. Actually I would prefer to be able to instantiate the class as part of a with statement e.g.:
with Libfile(filename, 'w') as my_lib_file:
    for param, value in my_stuff.items():
        my_lib_file.simple_parameter(param, value)
This is the wrong way to think about it.
You inherit in order to be reused. The base class provides an interface which others can use. For file-like objects it's mainly read and write. But, you only want to call another function simple_parameter. Calling write directly could mess up the format.
Really you don't want it to be a file-like object. You want to write to a file when the user calls simple_parameter. The implementation should delegate to a member file-like object instead, e.g.:
class LibFile:
    def __init__(self, file):
        self.file = file

    def simple_parameter(self, param, value):
        self.file.write('{}: {}\n'.format(param, value))
This is easy to test as you could pass in anything that supports write:
>>> import sys
>>> lib = LibFile(sys.stdout)
>>> lib.simple_parameter('name', 'Stephen')
name: Stephen
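The same idea carries over to automated tests. As a small sketch (the parameter names and values are arbitrary), an in-memory io.StringIO buffer can stand in for the file, so the output can be checked without touching the disk:

```python
import io


class LibFile:
    def __init__(self, file):
        # Any object with a write() method will do
        self.file = file

    def simple_parameter(self, param, value):
        self.file.write('{}: {}\n'.format(param, value))


# io.StringIO behaves like a text file that lives in memory
buffer = io.StringIO()
lib = LibFile(buffer)
lib.simple_parameter('name', 'Stephen')
lib.simple_parameter('width', 8)

written = buffer.getvalue()
```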
edit:
If you really want the class to manage the lifetime of the file you can provide a close function and use the closing context manager:
class Formatter:
    def __init__(self, filename, mode):
        self.file = open(filename, mode)

    def close(self):
        self.file.close()
Usage:
class LibFormatter(Formatter):
    def simple_parameter(self, param, value):
        self.file.write('{}: {}\n'.format(param, value))

from contextlib import closing

with closing(LibFormatter('library.txt', 'w')) as lib:
    ...  # etc
2nd edit:
If you don't want to use closing, you can write your own context manager:
class ManagedFile:
    def __init__(self, filename, mode):
        self.file = open(filename, mode)

    def __enter__(self):
        return self

    def __exit__(self, *args):
        self.close()

    def close(self):
        self.file.close()
Usage:
class LibFormatter(ManagedFile):
    def simple_parameter(self, param, value):
        self.file.write('{}: {}\n'.format(param, value))

with LibFormatter('library.txt', 'w') as lib:
    ...  # etc
My two-line solution is as follows:
with open(lib_loc + '\\' + lib_name + '.lib', 'w') as lib_file_handle:
    lib_file = Liberty(lib_file_handle)
    # do stuff using lib_file
The class initialization is as follows:
def __init__(self, file):
    ''' associate this instance with the given file handle '''
    self.f = file
Now, instead of passing the raw file handle to my functions, I pass the class instance, which carries its formatting functions with it.
The simplest function is:
def wr(self, line):
    ''' write line given to file'''
    self.f.write(line + '\n')
This means I am replicating the write function built into the file.io class, which is what I was trying to avoid.
I have now found a satisfactory way of doing what I wanted. The following is my base class, which is built on the basic functions of file_io (but is not a subclass), together with a simple example for writing CSV files. I also have Formatters for HTML, Verilog and others. The code is:
class Formatter():
    ''' base class to manage the opening of a file in order to build classes which write a file
    with a specific format so as to be able to pass the formatting functions to a subroutine
    along with the file handle
    Designed to use with "with" statement and to shorten argument lists of functions which use
    the file
    '''
    def __init__(self, filename):
        ''' open filename for writing and associate this instance with the file handle
        '''
        self.f = open(filename, 'w')

    def wr(self, line, end='\n'):
        ''' write line given to file
        '''
        self.f.write(line + end)

    def wr_newline(self, num):
        ''' write num newlines to file
        '''
        self.f.write('\n'*num)

    def __enter__(self):
        ''' needed for use with with statement
        '''
        return self

    def __exit__(self, *args):
        ''' needed for use with with statement
        '''
        self.close()

    def close(self):
        ''' explicit file close for use with procedural programming style
        '''
        self.f.close()
class CSV(Formatter):
    ''' class to write items using comma separated file format string formatting
    inherits:
    => wr(line, end='\n'):
    => wr_newline(n):
    all file io functions via the self.f variable
    '''
    @staticmethod
    def pp(item):
        ''' pretty-print: can't have commas or newlines in strings used in CSV files
        '''
        return str(item).replace('\n', '/').replace(',', '-')

    def __init__(self, filename):
        ''' open the given file as a CSV file
        '''
        super().__init__(filename + '.csv')

    def wp(self, item):
        ''' write a single item to the file
        '''
        self.f.write(self.pp(item)+', ')

    def ws(self, itemlist):
        ''' write a csv list from a list variable
        '''
        self.wr(','.join([self.pp(item) for item in itemlist]))
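A side note on the pp helper: it replaces commas and newlines because a hand-built CSV line cannot contain them. The standard csv module takes a different approach and quotes such fields instead, so nothing has to be altered. The sketch below is not the author's class, just a small demonstration with made-up sample values, writing to an in-memory buffer and reading the row back.

```python
import csv
import io

# Sample row with a comma and a newline inside the fields (made-up data)
row = ["a,b", "line\nbreak", 3]

# newline="" keeps the buffer from translating line endings
buffer = io.StringIO(newline="")
csv.writer(buffer).writerow(row)
text = buffer.getvalue()

# Reading the text back recovers the original strings unchanged
# (the csv module returns every field as a string).
parsed = list(csv.reader(io.StringIO(text, newline="")))
```

Here parsed holds a single row whose first two fields still contain the comma and the newline.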

How to mock the open function for unit test

I've got two files:
REF_FILE : it's a file with changing data
TEST_FILE: it's a file with fixed data (It's simply a REF_FILE at a given moment)
Now I want to test this function:
def get_info_from_extract(mpm):
    fid = open(REF_FILE)
    all_infos = json.load(fid)
    fid.close()
    for m in all_infos:
        if m['mpm_id'] == mpm:
            break
    return m

class Test_info_magento(unittest.TestCase):
    def test_should_have_value(self):
        # GIVEN
        mpm=107
        expected_value = 1.345
        # WHEN
        #MOCK OPEN FUNCTION TO READ TEST_FILE
        m = file_info.get_info_from_extract(mpm)
        # THEN
        self.assertEqual(m['value'], expected_value)
The problem is that REF_FILE changes often, so I can't test against it reliably. I need to use TEST_FILE instead, and for that I need to mock the open function. I can't figure out how to do that and would like some help mocking it properly so that it returns my TEST_FILE.
I would recommend rewriting the function so that it accepts a file-like object (it would be easier to test and maintain).
However, if you cannot, try this context manager:
class MockOpen(object):
    def __call__(self, *args, **kwargs):
        #print('mocked')
        return self.__open(TEST_FILE) #it would be better to return a file-like object instead

    def __enter__(self):
        global open
        self.__open = open
        open = self

    def __exit__(self, exception_type, exception_value, traceback):
        global open
        open = self.__open

with MockOpen():
    # here you run your test
    ...
Within the with block, the context manager rebinds the global name open to itself. Every call to open() made from the module that defines MockOpen is then a call to its __call__() method, which ignores all its arguments and returns the opened TEST_FILE. Because global only affects that one module, a function imported from another module still sees the real open.
It is not the best possible implementation, since:
it uses an actual file, slowing your tests; a file-like object should be returned instead,
it is not configurable; a file name (or content) should be given to its constructor.
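An alternative that avoids both drawbacks is the standard library's unittest.mock, whose mock_open helper returns a fake open serving in-memory content. The following is a minimal sketch, not a drop-in for the asker's project: the REF_FILE value and the sample JSON are made up, and the function is redefined here only so the example runs on its own. When the function lives in another module and uses the built-in open, patching "builtins.open" still reaches it.

```python
import json
from unittest import mock

REF_FILE = "ref.json"  # hypothetical path; never opened for real below


def get_info_from_extract(mpm):
    # Same logic as the function under test, written with a with block
    with open(REF_FILE) as fid:
        all_infos = json.load(fid)
    for m in all_infos:
        if m["mpm_id"] == mpm:
            return m
    return None


# Fixed test data standing in for TEST_FILE (made-up content)
fake_content = json.dumps([{"mpm_id": 107, "value": 1.345}])

# Patch the built-in open so any call returns a handle over fake_content
with mock.patch("builtins.open", mock.mock_open(read_data=fake_content)) as fake_open:
    result = get_info_from_extract(107)

# The mock also records how it was called
fake_open.assert_called_once_with(REF_FILE)
```

Inside a unittest.TestCase, the same with block would go in the WHEN step of the test method.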

object has no attributes. New to classes in python

import praw
import time

class getPms():
    r = praw.Reddit(user_agent="Test Bot By /u/TheC4T")
    r.login(username='*************', password='***************')
    cache = []
    inboxMessage = []
    file = 'cache.txt'

    def __init__(self):
        cache = self.cacheRead(self, self.file)
        self.bot_run(self)
        self.cacheSave(self, self.file)
        time.sleep(5)
        return self.inboxMessage

    def getPms(self):
        def bot_run():
            inbox = self.r.get_inbox(limit=25)
            print(self.cache)
            # print(r.get_friends())#this works
            for message in inbox:
                if message.id not in self.cache:
                    # print(message.id)
                    print(message.body)
                    # print(message.subject)
                    self.cache.append(message.id)
                    self.inboxMessage.append(message.body)
                # else:
                #     print("no messages")

        def cacheSave(self, file):
            with open(file, 'w') as f:
                for s in self.cache:
                    f.write(s + '\n')

        def cacheRead(self, file):
            with open(file, 'r') as f:
                cache1 = [line.rstrip('\n') for line in f]
            return cache1

    # while True: #threading is needed in order to run this as a loop. Probably gonna do this in the main method though
    # def getInbox(self):
    #     return self.inboxMessage
The exception is:
cache = self.cacheRead(self, self.file)
AttributeError: 'getPms' object has no attribute 'cacheRead'
I am new to working with classes in Python and need help figuring out what I am doing wrong. If you need more information, I can add some. It worked when it was all functions, but it stopped working once I tried to turn it into a class.
Your cacheRead function (as well as bot_run and cacheSave) is indented too far, so it's defined in the body of your other function getPms. Thus it is only accessible inside of getPms. But you're trying to call it from __init__.
I'm not sure what you're trying to achieve here because getPms doesn't have anything else in it but three function definitions. As far as I can tell you should just take out the def getPms line and unindent the three functions it contains so they line up with the __init__ method.
Here are a few points:
Unless you're explicitly inheriting from some specific class, you can omit the parentheses:
in Python 3, class A(object):, class A(): and class A: are equivalent.
Your class and one of its methods have the same name. I'm not sure whether that confuses Python, but it will probably confuse you. You could name your class PMS and your method get, for example, so you'd end up with PMS.get(...)
With the current indentation, the cacheRead and cacheSave functions are simply inaccessible from __init__; why not move them up into the class namespace?
When calling a method on an instance, you don't need to pass self as the first argument, since the call is already bound to that object. So instead of cache = self.cacheRead(self, self.file) you should write cache = self.cacheRead(self.file)
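Putting those points together, a restructured class might look roughly like the sketch below. It deliberately leaves praw out: fetch_inbox is a hypothetical callable that returns (message_id, body) pairs, and the fake_inbox function and temporary cache file exist only to make the example runnable. One extra change not mentioned above: __init__ must not return a value, so the messages are returned from get instead.

```python
import os
import tempfile


class PMS:
    def __init__(self, fetch_inbox, cache_file):
        # fetch_inbox: hypothetical callable yielding (message_id, body) pairs
        self.fetch_inbox = fetch_inbox
        self.cache_file = cache_file
        self.cache = self.cache_read(cache_file)  # no explicit self argument
        self.inbox_messages = []

    def cache_read(self, path):
        # Methods are defined at class level, so __init__ can reach them
        try:
            with open(path) as f:
                return [line.rstrip("\n") for line in f]
        except FileNotFoundError:
            return []

    def cache_save(self, path):
        with open(path, "w") as f:
            for message_id in self.cache:
                f.write(message_id + "\n")

    def get(self):
        # Collect only messages whose ids are not in the cache yet
        for message_id, body in self.fetch_inbox():
            if message_id not in self.cache:
                self.cache.append(message_id)
                self.inbox_messages.append(body)
        self.cache_save(self.cache_file)
        return list(self.inbox_messages)


def fake_inbox():
    # Stand-in for the real inbox call (made-up data)
    return [("t1", "hello"), ("t2", "world")]


cache_path = os.path.join(tempfile.mkdtemp(), "cache.txt")
first_run = PMS(fake_inbox, cache_path).get()
second_run = PMS(fake_inbox, cache_path).get()  # ids are cached by now
```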

Accessing class file from multiple methods in Python

My question relates mostly to how you use the with keyword in a class in Python.
If you have a class which contains a file object, how do you use the with statement, if at all?
For example, I don't use with here:
import csv

class CSVLogger:
    def __init__(self, rec_queue, filename):
        self.rec_queue = rec_queue
        ## Filename specifications
        self.__file_string__ = filename
        # Python 3: open in text mode with newline=''; csv.writer itself takes no newline argument
        f = open(self.__file_string__, 'w', newline='')
        self.csv_writer = csv.writer(f, lineterminator='\n', dialect='excel')
If I then do things to the file in another method, for example:
    def write_something(self, msg):
        # a csv.writer object is not callable; rows are written with writerow()
        self.csv_writer.writerow([msg])
Is this suitable? Should I include the with somewhere? I'm just afraid that once the __init__ exits, the with exits and might close the file?
Yes, you are correct: with automatically closes the file when its block ends, so if you used a with statement inside __init__(), the write_something method would not work.
Instead, you can use the with statement in the main part of the program. Rather than opening the file in __init__(), pass the file object in as a parameter, and then do everything you need with the file inside the with block.
Example:
The class would look like this:
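class CSVLogger:
    def __init__(self, rec_queue, filename, f):
        self.rec_queue = rec_queue
        ## Filename specifications
        self.__file_string__ = filename
        # csv.writer takes no newline argument; pass newline='' to open() instead
        self.csv_writer = csv.writer(f, lineterminator='\n', dialect='excel')

    def write_something(self, msg):
        # the writer object is not callable; use writerow() with a list of fields
        self.csv_writer.writerow([msg])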
The main program may look like this:
with open('filename', 'w', newline='') as f:
    cinstance = CSVLogger(...,f) #file and other parameters
    ... #other logic
    cinstance.write_something("some message")
    ... #other logic
Though if this complicates things a lot, you are better off not using the with statement and instead making sure that you close the file once it is no longer needed.
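A third option, sketched below under my own assumptions, is to let the class own the file and make the class itself a context manager, so the with statement wraps the logger rather than the raw file. This simplified CSVLogger drops the rec_queue parameter, and write_row plus the temporary path are names invented for the example.

```python
import csv
import os
import tempfile


class CSVLogger:
    """Simplified, hypothetical variant that owns its file."""

    def __init__(self, filename):
        # newline='' belongs on open(); the csv module handles line endings
        self._file = open(filename, "w", newline="")
        self._writer = csv.writer(self._file, dialect="excel", lineterminator="\n")

    def write_row(self, row):
        self._writer.writerow(row)

    def close(self):
        self._file.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Close the file whether or not the block raised
        self.close()
        return False


path = os.path.join(tempfile.mkdtemp(), "log.csv")
with CSVLogger(path) as logger:
    logger.write_row(["time", "message"])
    logger.write_row([1, "some message"])

with open(path, newline="") as f:
    contents = f.read()
```

The file stays open for every method call inside the block and is closed exactly once when the block ends.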
