I would like to open a file using an a_reader function, and then use a second function to print the file. I want to do this so that I can call the file-opening function later without printing its contents. Any ideas on the best way to do this? Here is some sample code that I know does not work, but it may help explain what I want to do:
def main():
    a_reader = open('C:\Users\filexxx.csv', 'r')
    fileName = a_reader.read()
    a_reader.close()

    def print():
        print fileName

main()
print()
Please see this day-old thread: What is the Pythonic way to avoid reference before assignment errors in enclosing scopes?
The user in that post had the exact same issue: he wanted to define a function within another function (in your case, main). As advised there both by me and others, you shouldn't nest functions!
There's no need to use nested functions in Python; it just adds useless complexity without giving you any real practical advantage.
I would do:
def main():
    a_reader = open('C:\\Users\\filexxx.csv', 'r')
    fileName = a_reader.read()
    a_reader.close()
    return fileName

print(main())
or
class main():
    def __init__(self):
        a_reader = open('C:\\Users\\filexxx.csv', 'r')
        self.fileName = a_reader.read()
        a_reader.close()

    def _print(self):
        print(self.fileName)

a = main()
a._print()
It's never a good idea to give your functions or classes the same names as Python's built-in functions and classes, print being one of them.
But here's a solution if you really want to go with your original setup:
def main():
    a_reader = open('C:\\Users\\filexxx.csv', 'r')
    fileName = a_reader.read()
    a_reader.close()

    def _print():
        print(fileName)

    _print()

main()
Oh, and by the way: backslashes in string literals should be escaped, or you need to use a raw string, r'..' :)
First, you can't name your function print, as this is a function that already exists in Python; doing so will lead to errors.
And it seems like a class is what you are looking for to be your main(), not a function.
I don't want to rock the boat here; I actually agree with the above two answers. Perhaps a class is the right way to go, and almost certainly it would be unwise to override the native print() function.
One of Python's strengths is that it covers a whole range of programming paradigms. You want a direct, procedural implementation? Python! You want to re-use your code and build generic, reusable classes? OOP and Python! Python also allows for functional programming: its built-in functions map and zip are classic examples of functional programming.
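For instance, a quick sketch of those two built-ins in action:

squares = list(map(lambda x: x * x, [1, 2, 3]))    # [1, 4, 9]
pairs = list(zip(['a', 'b', 'c'], [1, 2, 3]))      # [('a', 1), ('b', 2), ('c', 3)]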
Not saying that this is what was being asked in this question, but you could do this functionally:
def my_name_function(n):
    return ' '.join(['This is my Name:', n])

def say_hello(x):
    print('Hello, world!', x)

say_hello(my_name_function('Nick'))
--> Hello, world! This is my Name: Nick
Again, I don't think this is what the question is really asking. I do agree that, in this case, the best implementation would be a class, in the OOP sense. (Probably the more Pythonic way to go even :p)
But to say there is no need for nested functions in Python, when Python leaves this option open to us? When Python has, over the last few years, opened the door to functional programming concepts? Nesting does have its advantages (and disadvantages); if it didn't, there is no way that Guido, as the Benevolent Dictator for Life, would have opened this box.
If you want a_reader to be a 'function', you should call it as a function rather than use it as a variable; in Python that means using a_reader().
The following implements a class Reader whose instance a_reader can be called as a function. Only at the point of calling is the file opened. As others have already indicated, you should escape backslashes in string literals ("..\\.."), or use raw strings (r"..\.."). It is also good practice to put a file's top-level code under an if __name__ == '__main__': statement; that way you can import functions/classes from the file without invoking the (test) code.
class Reader(object):
    def __init__(self, file_name):
        self._file_name = file_name
        self._fp = None

    def __call__(self):
        if self._fp is None:
            self._fp = open(self._file_name, 'r')
        return self._fp

def main():
    a_reader = Reader(r"C:\Users\filexxx.csv")
    # no file opened yet
    file_content = a_reader().read()
    a_reader().close()
    print(file_content)

if __name__ == '__main__':  # only call main() if not imported
    main()
I have a program I'm writing. It's becoming too large to manage well: it has a lot of code related to GUI functionality (PyQt5), and it's nearly 2000 lines. I understand that in Python it is acceptable to have large scripts coded completely in one file, but writing this program has become unmanageable. I can't easily organize the code in one file. I previously tried to split it into multiple files (using methods I'd found here on StackOverflow), but to no avail. I finally just realized what I might have to do for my situation.

(I know that realistically you'd never split the following into two files, but I'm using it as an example.) Here is a very simple example of a script split in two (file 1):
# TestProg.py
class Name:
    def __init__(self, name):
        self.name = name

def testfunc():
    nm.name += nm.name[int(len(nm.name)/2):]  # Add to name: second half of name
    return nm.name

if __name__ == '__main__':
    import TestModule
    print(TestModule.returnName(TestModule.nm))
    input()
File 2:
# TestModule.py
import TestProg

Name = TestProg.Name
testfunc = TestProg.testfunc

nm = Name('Henry')

def returnName(name):
    return name.name[::-1] + testfunc()
If you try running this, it won't work. I finally figured out it's because testfunc is defined in TestProg and uses a global variable, nm, which isn't defined in TestProg. To solve this problem, you must modify the globals dict of testfunc with testfunc.__globals__['nm'] = nm, placed in TestModule right after you define nm.
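For clarity, here is TestModule.py with that one-line fix applied:

# TestModule.py
import TestProg

Name = TestProg.Name
testfunc = TestProg.testfunc

nm = Name('Henry')
testfunc.__globals__['nm'] = nm  # make nm visible in TestProg's global namespace

def returnName(name):
    return name.name[::-1] + testfunc()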
I understand this is not a pretty thing to do, but it seems to be the best way I can think of. Now, I know: "Just use a local variable/parameter for testfunc instead of a global." The thing is, it is a global variable (which needs to be defined in a file separate from the function after the split). I don't want to have to pass the variable to the function every time; that's tedious, repetitive, and a waste of time. My question is: is there a better way to do it? I would assume no. A secondary question is: is this method acceptable? Is it considered a hack/highly unpythonic, and therefore unwise to use?
Please remember, I really need the file to be split, and I can't split it any more cleanly than this, which is why this method seems necessary (as far as I know).
I know the best way to open a file for reading/writing is to use with.
Instead of using
f = open('file.txt', 'w')
f.write('something')
f.close()
we should write:
with open('file.txt', 'w') as f:
    f.write('something')
But what if I want to simply read a file? I can do this:
with open('file.txt') as f:
    print(f.read())
but what is the problem with the line below?
print (open('file.txt').read())
OR
alist = open('file.txt').readlines()
print (alist)
Does it automatically close the file after executing this statement? Is this a standard way to write it? Should we write like this?
Other than this: should I open a file in one function and pass the file object to another function for writing, or should I declare it as a global variable? i.e.
def writeto(f):
    # do some stuff
    f.write('write stuff')

def main():
    f = open('file.txt', 'w')
    while somecondition:
        writeto(f)
    f.close()
OR
f = open('file.txt', 'w')

def writeto():
    # do some stuff
    f.write('write stuff')

def main():
    while somecondition:
        writeto()
    f.close()
Addressing your questions in order,
"What's wrong with print(open(file).readlines())?"
Well, you throw away the file object after using it, so you cannot close it. Yes, Python will eventually close your file automatically, but on its terms rather than yours. If you are just playing around in a shell or terminal, this is probably fine, because your session will likely be short and there usually isn't any resource competition over the files. However, in a production environment, leaving file handles open for the lifetime of your script can be devastating to performance.
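A quick sketch of the difference:

# the handle is unreachable after this line; closing it is left to the garbage collector
data = open('file.txt').read()

# with a with block, the close is explicit and deterministic
with open('file.txt') as f:
    data = f.read()
# f.closed is True here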
As far as creating a function that takes a file object and writes to it, well, that's essentially what file.write is. Consider that a file handle is an object with methods, and those methods behind the scenes take self (that is, the object) as the first argument. So write itself is already a function that takes a file handle and writes to it! You can create other functions if you want, but you're essentially duplicating the default behavior for no tangible benefit.
Consider that your first function looks sort of like this:
def write_to(file_handle, text):
    return file_handle.write(text)
But what if I named file_handle self instead?
def write_to(self, text):
    return self.write(text)
Now it looks like a method instead of a stand-alone function. In fact, if you do this:
f = open(some_file, 'w') # or 'a' -- some write mode
write_to = f.write
you have almost the same function (just bound to that particular file_handle)!
As an exercise, you can also create your own context managers in Python (to be used with the with statement). You do this by defining __enter__ and __exit__. So technically you could redefine this as well:
class FileContextManager():
    def __init__(self, filename):
        self.filename = filename
        self._file = None

    def __enter__(self):
        self._file = open(self.filename, 'w')
        return self._file  # this is what the "as" target is bound to

    def __exit__(self, type, value, traceback):
        self._file.close()
and then use it like:
with FileContextManager('hello.txt') as filename:
    filename.write('Hi!')
and it would do the same thing.
The point of all of this is just to say that Python is flexible enough to do all of this if you need to reimplement this and add to the default behavior, but in the standard case there's no real benefit to doing so.
As far as the program in your example, there's nothing wrong with virtually any of these ways in the trivial case. However, you are missing an opportunity to use the with statement in your main function:
def main():
    with open('file.txt', 'w') as filename:
        while some_condition:
            filename.write('some text')
    # file closed here after we fall off the loop and leave the with context

if __name__ == '__main__':
    main()
If you want to make a function that takes a file handle as an argument:
def write_stuff(file_handle, text):
    return file_handle.write(text)

def main():
    with open('file.txt', 'w') as filename:
        while some_condition:
            write_stuff(filename, 'some text')
    # file closed here after we fall off the loop and leave the with context

if __name__ == '__main__':
    main()
Again, you can do this numerous different ways, so what's best for what you're trying to do? What's the most readable?
"Should I open a file in a function and pass the pointer to other for writing or should I declared it as a module variable?"
Well, as you've seen, either will work. This question is highly context-dependent and normally best practice dictates having the file open for the least amount of time and in the smallest reasonable scope. Thus, what needs access to your file? If it is many things in the module, perhaps a module-level variable or class to hold it is a good idea. Again, in the trivial case, just use with as above.
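A minimal sketch of the module-level-class idea (LogWriter and its method names here are hypothetical):

class LogWriter(object):
    def __init__(self, path):
        # one shared handle for the whole module
        self._f = open(path, 'a')

    def write_line(self, text):
        self._f.write(text + '\n')

    def close(self):
        self._f.close()

writer = LogWriter('file.txt')  # module-level instance other functions can use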
The problem is you put a space between print and ().
This code works fine in Python 3:
print(open('yourfile.ext').read())
I'm curious what the relative merits are of calling functions in a program using a decorator to create an ordered map of the functions, and iterating through that map, versus directly calling the functions in the order I want. Below are two examples that yield the same result:
PROCESS_MAP = {}

def register(call_order):
    def decorator(process_func):
        PROCESS_MAP[call_order] = process_func
        return process_func
    return decorator

@register(99)
def func3():
    print('last function')

@register(1)
def func0():
    print('first function')

@register(50)
def func1():
    print('middle function')

def main():
    for func in sorted(PROCESS_MAP):
        foo = PROCESS_MAP[func]
        foo()

if __name__ == '__main__':
    main()
This prints:
first function
middle function
last function
Is this any better than doing the following?
def func2():
    print('last function')

def func0():
    print('first function')

def func1():
    print('middle function')

def main():
    func0()
    func1()
    func2()

if __name__ == '__main__':
    main()
Some more questions:
More Pythonic?
More work than it's worth?
Does it open the door to too many issues if you don't properly assign the order correctly?
Are there more intelligent designs for calling functions in the right order?
I would prefer the second (non-decorated) approach in all situations I can think of. Explicitly naming the functions is direct, obvious, and clear. Consider asking the question "where is this function called from?" and how you might answer that in each case.
There is a possible bug in your decorator where it would silently drop a function that has the same call order number as a different function.
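A quick sketch of the collision, using the register decorator from the question:

@register(50)
def setup():
    print('setup')

@register(50)  # same key: this silently replaces setup in PROCESS_MAP
def teardown():
    print('teardown')

A defensive version of the decorator could refuse duplicates instead:

def register(call_order):
    def decorator(process_func):
        if call_order in PROCESS_MAP:
            raise ValueError('call order %r already used by %s'
                             % (call_order, PROCESS_MAP[call_order].__name__))
        PROCESS_MAP[call_order] = process_func
        return process_func
    return decorator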
The first one will let you change the order dynamically if you need to, and will also let you change the order from an external script much more easily.
If your program's logic is clear and simple, or if your program will one day become more complex and the decorator will be irrelevant or will need major changes to fit, use the second one.
You can read about more of the decorations common uses here:
What are some common uses for Python decorators?
But to conclude, I would say it really depends on your design and the program's purpose.
Furthermore, if you choose the first way, I would suggest writing a better mechanism: one that logs the order and can handle two or more functions registered under the same order, as sketched below. Those improvements might make the decorator worth it.
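A minimal sketch of that improved mechanism, keeping a list of functions per order key:

from collections import defaultdict

PROCESS_MAP = defaultdict(list)

def register(call_order):
    def decorator(process_func):
        # several functions may now share one call order
        PROCESS_MAP[call_order].append(process_func)
        return process_func
    return decorator

def run_all():
    for order in sorted(PROCESS_MAP):
        for func in PROCESS_MAP[order]:
            func()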
I say the second approach, unless you somehow want to call a lot of functions and are too lazy to type (and if you're actually doing that, there might be something wrong with your structure). Not only is it more readable, which helps minimize unnecessary errors, it's also shorter.
PEAK Rules has some interesting utilities for this, with @before and @after decorators and an interesting system for making rules to enforce certain behaviors.
A simpler mechanism would be to assign a 'weight' to each function and then run them in weighted order (a sketch follows), although this assumes a single flat list of functions. If there is a hierarchy, you can do a topological sort on a DAG to get a linear array of nodes.
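A minimal sketch of the weight idea (the weighted decorator and run_all helper are hypothetical names):

def weighted(weight):
    def decorator(func):
        func.weight = weight  # stash the weight on the function object
        return func
    return decorator

@weighted(20)
def second():
    print('second')

@weighted(10)
def first():
    print('first')

def run_all(funcs):
    for f in sorted(funcs, key=lambda f: f.weight):
        f()

run_all([second, first])  # prints 'first' then 'second'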
I often do interactive work in Python that involves some expensive operations that I don't want to repeat often. I'm generally running whatever Python file I'm working on frequently.
If I write:
import functools32

@functools32.lru_cache()
def square(x):
    print "Squaring", x
    return x*x
I get this behavior:
>>> square(10)
Squaring 10
100
>>> square(10)
100
>>> runfile(...)
>>> square(10)
Squaring 10
100
That is, rerunning the file clears the cache. This works:
try:
    safe_square
except NameError:
    @functools32.lru_cache()
    def safe_square(x):
        print "Squaring", x
        return x*x
but when the function is long it feels strange to have its definition inside a try block. I can do this instead:
def _square(x):
    print "Squaring", x
    return x*x

try:
    safe_square_2
except NameError:
    safe_square_2 = functools32.lru_cache()(_square)
but it feels pretty contrived (for example, in calling the decorator without an '@' sign).
Is there a simple way to handle this, something like:
@non_resetting_lru_cache()
def square(x):
    print "Squaring", x
    return x*x
?
Writing a script to be executed repeatedly in the same session is an odd thing to do.
I can see why you'd want to do it, but it's still odd, and I don't think it's unreasonable for the code to expose that oddness by looking a little odd, and having a comment explaining it.
However, you've made things uglier than necessary.
First, you can just do this:
@functools32.lru_cache()
def _square(x):
    print "Squaring", x
    return x*x

try:
    safe_square_2
except NameError:
    safe_square_2 = _square
There is no harm in attaching a cache to the new _square definition. It won't waste any time, or more than a few bytes of storage, and, most importantly, it won't affect the cache on the previous _square definition. That's the whole point of closures.
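A quick sketch showing the two caches stay independent (simulating a re-run by redefining _square):

import functools32

@functools32.lru_cache()
def _square(x):
    return x * x

old = _square

@functools32.lru_cache()  # a brand-new function object with its own, empty cache
def _square(x):
    return x * x

old(10)
old(10)        # second call is a hit on the old cache
_square(10)    # first call on the new cache: a miss

print old.cache_info().hits      # 1 -- the old cache is untouched
print _square.cache_info().hits  # 0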
There is a potential problem here with recursive functions. It's already inherent in the way you're working, and the cache doesn't add to it in any way, but you might only notice it because of the cache, so I'll explain it and show how to fix it. Consider this function:
@lru_cache()
def _fact(n):
    if n < 2:
        return 1
    return _fact(n-1) * n
When you re-exec the script, even if you have a reference to the old _fact, it's going to end up calling the new _fact, because it's accessing _fact as a global name. It has nothing to do with @lru_cache; remove that, and the old function will still end up calling the new _fact.
But if you're using the renaming trick above, you can just call the renamed version:
@lru_cache()
def _fact(n):
    if n < 2:
        return 1
    return fact(n-1) * n
Now the old _fact will call fact, which is still the old _fact. Again, this works identically with or without the cache decorator.
Beyond that initial trick, you can factor that whole pattern out into a simple decorator. I'll explain step by step below, or see this blog post.
Anyway, even with the less-ugly version, it's still a bit ugly and verbose. And if you're doing this dozens of times, my "well, it should look a bit ugly" justification will wear thin pretty fast. So, you'll want to handle this the same way you always factor out ugliness: wrap it in a function.
You can't really pass names around as objects in Python. And you don't want to use a hideous frame hack just to deal with this. So you'll have to pass the names around as strings, like this:
globals().setdefault('fact', _fact)
The globals function just returns the current scope's global dictionary. Which is a dict, which means it has the setdefault method, which means this will set the global name fact to the value _fact if it didn't already have a value, but do nothing if it did. Which is exactly what you wanted. (You could also use setattr on the current module, but I think this way emphasizes that the script is meant to be (repeatedly) executed in someone else's scope, not used as a module.)
So, here that is wrapped up in a function:
def new_bind(name, value):
    globals().setdefault(name, value)
…which you can turn into a decorator almost trivially:
def new_bind(name):
    def wrap(func):
        globals().setdefault(name, func)
        return func
    return wrap
Which you can use like this:
@new_bind('foo')
def _foo():
    print(1)
But wait, there's more! The func that new_bind gets is going to have a __name__, right? If you stick to a naming convention, say that the "private" name must be the "public" name with a _ prefixed, we can do this:
def new_bind(func):
    assert func.__name__[0] == '_'
    globals().setdefault(func.__name__[1:], func)
    return func
And you can see where this is going:
@new_bind
@lru_cache()
def _square(x):
    print "Squaring", x
    return x*x
There is one minor problem: if you use any other decorators that don't wrap the function properly, they will break your naming convention. So… just don't do that. :)
And I think this works exactly the way you want in every edge case. In particular, if you've edited the source and want to force the new definition with a new cache, you just del square before rerunning the file, and it works.
And of course if you want to merge those two decorators into one, it's trivial to do so, and call it non_resetting_lru_cache.
However, I'd keep them separate. I think it's more obvious what they do. And if you ever want to wrap another decorator around #lru_cache, you're probably still going to want #new_bind to be the outermost decorator, right?
What if you want to put new_bind into a module that you can import? Then it's not going to work, because it will be referring to the globals of that module, not the one you're currently writing.
You can fix that by explicitly passing your globals dict, or your module object, or your module name as an argument, like @new_bind(__name__), so it can find your globals instead of its own. But that's ugly and repetitive.
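A minimal sketch of the pass-the-module-name variant, assuming the same naming convention as above:

import sys

def new_bind(module_name):
    def wrap(func):
        assert func.__name__[0] == '_'
        g = sys.modules[module_name].__dict__  # the caller's globals, not ours
        g.setdefault(func.__name__[1:], func)
        return func
    return wrap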
You can also fix it with an ugly frame hack. At least in CPython, sys._getframe() can be used to get your caller's frame, and frame objects have a reference to their globals namespace, so:
import sys

def new_bind(func):
    assert func.__name__[0] == '_'
    g = sys._getframe(1).f_globals
    g.setdefault(func.__name__[1:], func)
    return func
Notice the big box in the docs that tells you this is an "implementation detail" that may only apply to CPython and is "for internal and specialized purposes only". Take this seriously. Whenever someone has a cool idea for the stdlib or builtins that could be implemented in pure Python, but only by using _getframe, it's generally treated almost the same as an idea that can't be implemented in pure Python at all. But if you know what you're doing, and you want to use this, and you only care about present-day versions of CPython, it will work.
There is no persistent_lru_cache in the stdlib. But you can build one pretty easily.
The functools source is linked directly from the docs, because this is one of those modules that's as useful as sample code as it is to use directly.
As you can see, the cache is just a dict. If you replace that with, say, a shelf, it will become persistent automatically:
def persistent_lru_cache(filename, maxsize=128, typed=False):
    """new docstring explaining what filename does"""
    # same code as before up to here
    def decorating_function(user_function):
        cache = shelve.open(filename)
        # same code as before from here on.
Of course that only works if your arguments are strings. And it could be a little slow.
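One hedged workaround for the string-key restriction (make_key here is a hypothetical helper) is to serialize the call arguments into a string before using them as the shelf key:

def make_key(args, kwargs):
    # repr() gives a string key; crude, but enough for simple argument types
    return repr((args, tuple(sorted(kwargs.items()))))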
So, you might want to instead keep it as an in-memory dict, and just write code that pickles it to a file atexit, and restores it from a file if present at startup:
def decorating_function(user_function):
    # ...
    try:
        with open(filename, 'rb') as f:
            cache = pickle.load(f)
    except:
        cache = {}

    def cache_save():
        with lock:
            with open(filename, 'wb') as f:
                pickle.dump(cache, f)

    atexit.register(cache_save)
    # ...
    wrapper.cache_save = cache_save
    wrapper.cache_filename = filename
Or, if you want it to write every N new values (so you don't lose the whole cache on, say, an _exit or a segfault or someone pulling the cord), add this to the second and third versions of wrapper, right after the misses += 1:
if misses % N == 0:
    cache_save()
See here for a working version of everything up to this point (using save_every as the "N" argument, and defaulting to 1, which you probably don't want in real life).
If you want to be really clever, maybe copy the cache and save that in a background thread.
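A minimal sketch of that idea, assuming the same cache, lock, and filename as above:

import threading

def cache_save_async():
    with lock:
        snapshot = dict(cache)  # copy under the lock so the snapshot is stable
    def _save(data):
        with open(filename, 'wb') as f:
            pickle.dump(data, f)
    threading.Thread(target=_save, args=(snapshot,)).start()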
You might want to extend the cache_info to include something like number of cache writes, number of misses since last cache write, number of entries in the cache at startup, …
And there are probably other ways to improve this.
From a quick test, with save_every=1, this makes the cache on both get_pep and fib (from the functools docs) persistent, with no measurable slowdown to get_pep and a very small slowdown to fib the first time (note that fib(100) has 100097 hits vs. 101 misses…), and of course a large speedup to get_pep (but not fib) when you re-run it. So, just what you'd expect.
I can't say I won't just use @abarnert's "ugly frame hack", but here is the version that requires you to pass in the calling module's globals dict. I think it's worth posting, given that decorator functions with arguments are tricky and meaningfully different from those without arguments.
def create_if_not_exists_2(my_globals):
    def wrap(func):
        if "_" != func.__name__[0]:
            raise Exception("Function names used in cine must begin with '_'")
        my_globals.setdefault(func.__name__[1:], func)
        def wrapped(*args):
            return func(*args)
        return wrapped
    return wrap
Which you can then use in a different module like this:
from functools32 import lru_cache
from cine import create_if_not_exists_2

@create_if_not_exists_2(globals())
@lru_cache()
def _square(x):
    print "Squaring", x
    return x*x

assert "_square" in globals()
assert "square" in globals()
I've gained enough familiarity with decorators during this process that I was comfortable taking a swing at solving the problem another way:
from functools32 import lru_cache

try:
    my_cine
except NameError:
    class my_cine(object):
        _reg_funcs = {}

        @classmethod
        def func_key(cls, f):
            try:
                name = f.func_name
            except AttributeError:
                name = f.__name__
            return (f.__module__, name)

        def __init__(self, f):
            k = self.func_key(f)
            self._f = self._reg_funcs.setdefault(k, f)

        def __call__(self, *args, **kwargs):
            return self._f(*args, **kwargs)
if __name__ == "__main__":
    @my_cine
    @lru_cache()
    def fact_my_cine(n):
        print "In fact_my_cine for", n
        if n < 2:
            return 1
        return fact_my_cine(n-1) * n

    x = fact_my_cine(10)
    print "The answer is", x
@abarnert, if you are still watching, I'd be curious to hear your assessment of the downsides of this method. I know of two:
You have to know in advance which attributes to look in for a name to associate with the function. My first stab at it only looked at func_name, which failed when passed an lru_cache object.
Resetting a function is painful: del my_cine._reg_funcs[('__main__', 'fact_my_cine')], and the swing I took at adding a __delitem__ was unsuccessful.
I use Python CGI. I cannot call a function before it is defined.
In Oracle PL/SQL there was this trick of "forward declaration": naming all the functions on top so the order of defining doesn't matter.
Is there such a trick in Python as well?
example:
def do_something(ds_parameter):
    helper_function(ds_parameter)
    ....

def helper_function(hf_parameter):
    ....

def main():
    do_something(my_value)

main()
David is right, my example is wrong.
What about:
<start of cgi-script>

def do_something(ds_parameter):
    helper_function(ds_parameter)
    ....

def print_something():
    do_something(my_value)

print_something()

def helper_function(hf_parameter):
    ....

def main():
    ....

main()
Can I "forward declare" the functions at the top of the script?
All functions must be defined before any are used.
However, the functions can be defined in any order, as long as all are defined before any executable code uses a function.
You don't need "forward declaration" because all declarations are completely independent of each other, as long as all declarations come before all executable code.
Are you having a problem? If so, please post the code that doesn't work.
In your example, print_something() is out of place.
The rule: All functions must be defined before any code that does real work
Therefore, put all the statements that do work last.
An even better illustration of your point would be:
def main():
print_something()
....
def do_something(ds_parameter):
helper_function(ds_parameter)
....
def print_something():
do_something(my_value)
def helper_function(hf_parameter):
....
main()
In other words, you can keep your definition of main() at the top for editing convenience, avoiding frequent scrolling if most of your time is spent editing main.
Assuming you have some code snippet that calls your function main after it's been defined, then your example works as written. Because of how Python is interpreted, any functions that are called by the body of do_something do not need to be defined when the do_something function is defined.
The steps that Python will take while executing your code are as follows:
1. Define the function do_something.
2. Define the function helper_function.
3. Define the function main.
4. (Given my assumption above) Call main.
5. From main, call do_something.
6. From do_something, call helper_function.
The only time that Python cares that helper_function exists is when it gets to step six. You should be able to verify that Python makes it all the way to step six before raising an error when it tries to find helper_function so that it can call it.
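A quick sketch verifying that the name lookup really is deferred until the call:

def do_something():
    helper_function()  # helper_function does not exist yet, and that's fine

try:
    do_something()  # the lookup happens only now, at call time
except NameError as e:
    print(e)  # name 'helper_function' is not defined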
I've never come across a case where "forward function definition" is necessary. Can you not simply move print_something() inside your main function?
def do_something(ds_parameter):
    helper_function(ds_parameter)
    ....

def print_something():
    do_something(my_value)

def helper_function(hf_parameter):
    ....

def main():
    print_something()
    ....

main()
Python doesn't care that helper_function() is defined after its use on line 3 (in the do_something function).
I recommend using something like WinPDB and stepping through your code. It shows nicely how Python's parser/executor(?) works
def funB(d, c):
    return funA(d, c)

print funB(2, 3)

def funA(x, y):
    return x+y
The above code will raise a NameError. But the following code is fine...
def funB(d, c):
    return funA(d, c)

def funA(x, y):
    return x+y

print funB(2, 3)
So, even though you must definitely define a function before any real work is done, it is possible to get away with it as long as the function isn't actually called before its definition runs. I think this is somewhat similar to prototyping in other languages...
For all the people who, despite it being bad practice, want a workaround...
I had a similar problem and solved it like this:
import Configurations as this
'''Configurations.py'''

if __name__ == '__main__':
    this.conf01()

'''Test conf'''
def conf01():
    print("working")
So I can change my targeted configuration at the top of the file. The trick is to import the file into itself.
Use the multiprocessing module:
from multiprocessing import Process
p1 = Process(target=function_name, args=(arg1, arg2,))
p1.start()