Is it bad practice to have arguments passed to main() in Python?

Should main() always take no parameters, with everything it needs obtained inside the function itself, or is it acceptable to pass arguments to it, e.g. main(arg1, arg2, arg3)?
I know it works, but I'm wondering if it is poor programming practice. Apologies if this is a duplicate, but I couldn't find the question specifically answered for Python.

In most other programming languages, main takes either zero parameters or two:
int main(int argc, char *argv[])
to receive the command-line arguments. In Python, however, those arguments are accessed through the sys module:
import sys

def main():
    print(sys.argv, len(sys.argv))
You could extend this and pass argv and argc into your Python function yourself, similar to other languages:
import sys

def main(argv, argc):
    print(argv, argc)

if __name__ == '__main__':
    main(sys.argv, len(sys.argv))
But let's forget about argv/argc for now. Why would you want to pass something into main at all? You create something outside of main and want to pass it through. That can happen in two situations:
You're calling main multiple times from other functions.
You've created variables outside main that you want to pass through.
Point number 1 is definitely bad practice. main should be unique and called only once at the beginning of your program. If you have the need to call it multiple times, then the code inside main doesn't belong inside main. Split it up.
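For instance, if you find yourself wanting to call main repeatedly, the reusable part can be pulled out into its own function. A minimal sketch (process is a hypothetical name):
def process(item):
    # reusable logic that other code can call as often as needed
    return item * 2

def main():
    # main runs once, at program start, and delegates to process
    print([process(i) for i in range(3)])

if __name__ == '__main__':
    main()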
Point number 2 may seem to make sense, but look what happens when you do it in practice:
def main(a, b):
    print(a, b)

if __name__ == '__main__':
    x = 4
    y = 5
    main(x, y)
But aren't x and y then global variables? Good practice would put such values at the top of your file (along with other conventions: they're constants, and so on), in which case you wouldn't need to pass them through as arguments at all.
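If those values really belong outside main, the conventional layout keeps them at module level and lets main read them directly. A sketch with made-up names:
# module-level constants, conventionally near the top of the file
X = 4
Y = 5

def main():
    print(X, Y)

if __name__ == '__main__':
    main()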

By following the pattern:
def main():
    ...stuff...

if __name__ == '__main__':
    main()
It allows your script both to be run directly and, if packaged using setuptools, to have an executable script generated automatically when the package is installed, by specifying main as an entry point.
See: https://setuptools.readthedocs.io/en/latest/setuptools.html#automatic-script-creation
You would add to setup.py something like:
entry_points={
    'console_scripts': [
        'my_script = my_module:main'
    ]
}
And then when you build a package, people can install it in their virtual environment, and immediately get a script called my_script on their path.
Automatic script creation like this requires a function that takes no required arguments.
It's a good idea to allow your script to be imported and to expose its functionality, both for code reuse and for testing. I would recommend something like this pattern:
import argparse

def parse_args():
    parser = argparse.ArgumentParser()
    #
    # ... configure command line arguments ...
    #
    return parser.parse_args()

def do_stuff(args):
    #
    # ... main functionality goes in here ...
    #
    pass

def main():
    args = parse_args()
    do_stuff(args)

if __name__ == '__main__':
    main()
This allows you to run your script directly, have an automatically generated script that behaves the same way, and also import the script and call do_stuff to re-use or test the actual functionality.
This blog post was mentioned in the comments: https://www.artima.com/weblogs/viewpost.jsp?thread=4829. It uses a default argument on main to allow dependency injection for testing. However, it is a very old post; the getopt library it relies on has been superseded twice since then (by optparse and then argparse). The pattern above is superior and still allows dependency injection.
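One common variant (not from the linked post) keeps the dependency-injection benefit with argparse by giving main an optional argv parameter; parse_args falls back to sys.argv when it is passed None. A minimal sketch with a made-up option:
import argparse

def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument('--name', default='world')  # example option
    args = parser.parse_args(argv)  # argv=None means "use sys.argv[1:]"
    print('Hello,', args.name)

if __name__ == '__main__':
    main()
Tests (or other modules) can then call main(['--name', 'test']) without touching sys.argv, and a setuptools entry point can still call main() with no arguments.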

I would definitely prefer to see main take arguments rather than accessing sys.argv directly.
This makes the reuse of the main function by other Python modules much easier.
import sys

def main(arg):
    ...

if __name__ == "__main__":
    main(sys.argv[1])
Now if I want to use this module's functionality from another module, I can just write (in the other module):
from main_script import main
main("use this argument")
If main uses sys.argv this is tougher.
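To illustrate how much tougher: a caller would have to patch sys.argv around the call instead of simply passing a value. A sketch, assuming a variant of main_script whose main() reads sys.argv directly:
import sys
from unittest import mock

import main_script  # hypothetical module whose main() reads sys.argv itself

# the argument list has to be patched in rather than passed as a parameter:
with mock.patch.object(sys, 'argv', ['main_script.py', 'use this argument']):
    main_script.main()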

Related

How can I import the value of a variable into another Python file? [duplicate]

I have a Python program I'm building that can be run in either of 2 ways: the first is to call python main.py which prompts the user for input in a friendly manner and then runs the user input through the program. The other way is to call python batch.py -file- which will pass over all the friendly input gathering and run an entire file's worth of input through the program in a single go.
The problem is that when I run batch.py, it imports some variables/methods/etc from main.py, and when it runs this code:
import main
at the first line of the program, it immediately errors because it tries to run the code in main.py.
How can I stop Python from running the code contained in the main module which I'm importing?
Because this is just how Python works - keywords such as class and def are not declarations. Instead, they are real live statements which are executed. If they were not executed your module would be empty.
The idiomatic approach is:
# stuff to run always here such as class/def
def main():
    pass

if __name__ == "__main__":
    # stuff only to run when not called via 'import' here
    main()
It does require you to have control over the source of the module being imported, however.
Due to the way Python works, it is necessary for it to run your modules when it imports them.
To prevent code in the module from being executed when imported, but only when run directly, you can guard it with this if:
if __name__ == "__main__":
    # this won't be run when imported
You may want to put this code in a main() method, so that you can either execute the file directly, or import the module and call the main(). For example, assume this is in the file foo.py.
def main():
    print("Hello World")

if __name__ == "__main__":
    main()
This program can be run either by going python foo.py, or from another Python script:
import foo
...
foo.main()
Use the if __name__ == '__main__' idiom -- __name__ is a special variable whose value is '__main__' if the module is being run as a script, and the module name if it's imported. So you'd do something like
# imports
# class/function definitions

if __name__ == '__main__':
    # code here will only run when you invoke 'python main.py'
Unfortunately, you don't. That is part of how the import syntax works and it is important that it does so -- remember def is actually something executed, if Python did not execute the import, you'd be, well, stuck without functions.
Since you probably have access to the file, though, you might be able to look and see what causes the error. It might be possible to modify your environment to prevent the error from happening.
Put the code inside a function and it won't run until you call the function. You should have a main function in your main.py, with the statement:
if __name__ == '__main__':
    main()
Then, if you call python main.py the main() function will run. If you import main.py, it will not. Also, you should probably rename main.py to something else for clarity's sake.
There was a Python enhancement proposal, PEP 299, which aimed to replace the if __name__ == '__main__': idiom with def __main__:, but it was rejected. It's still a good read to know what to keep in mind when using if __name__ == '__main__':.
You may write your "main.py" like this:
#!/usr/bin/env python

__all__ = ["somevar", "do_something"]

somevar = ""

def do_something():
    pass  # blahblah

if __name__ == "__main__":
    do_something()
I did a simple test:
# test.py
x = 1
print("1, has it been executed?")

def t1():
    print("hello")
    print("2, has it been executed?")

def t2():
    print("world")
    print("3, has it been executed?")

def main():
    print("Hello World")
    print("4, has it been executed?")

print("5, has it been executed?")
print(x)

# while True:
#     t2()

if x == 1:
    print("6, has it been executed?")
#test2.py
import test
When test2.py is run, the output is:
1, has it been executed?
5, has it been executed?
1
6, has it been executed?
Conclusion: when the imported module does not use if __name__ == "__main__":, importing it runs the module: code at module level (not inside any function) is executed in order, while code inside functions is only executed when those functions are called.
In addition:
def main():
    # Put all the code you need to run when this script is executed directly.
    pass

if __name__ == '__main__':
    main()
else:
    # Put code you need to run only when the module is imported.
    pass
A minor error that can happen (at least it happened to me), especially when distributing Python scripts/functions that carry out a complete analysis, is to call the function directly at the end of the .py file that defines it. The only things a user needed to modify were the input files and parameters.
With that call in place, importing the file runs the function immediately. For proper behaviour, simply remove the call from inside the module and reserve it for the file/function/portion of code that actually drives the analysis, as sketched below.
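A minimal before/after sketch (run_analysis is a placeholder name):
def run_analysis():
    ...  # the complete analysis

# run_analysis()             # module-level call: runs on every import

if __name__ == '__main__':
    run_analysis()           # guarded call: runs only when executed directly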
Another option is to use a binary environment variable, e.g. call it run_code. If run_code is '0' (False), structure main.py so that the code is bypassed (the temporarily bypassed function will still be imported as part of the module). Later, when you are ready to use the imported function, set the environment variable run_code to '1' (True). Use os.environ to set and retrieve the variable, but be sure to convert it to an integer when retrieving it (or restructure the if statement to compare string values).
in main.py:
import os

# set environment variable to 0 (False):
os.environ['run_code'] = '0'

def binary_module():
    # retrieve environment variable, convert to integer
    run_code_val = int(os.environ['run_code'])
    if run_code_val == 0:
        print('nope. not doing it.')
    if run_code_val == 1:
        print('executing code...')
        # [do something]
...in whatever script is loading main.py:
import os, main

main.binary_module()
# OUTPUT: nope. not doing it.

# now flip the on switch!
os.environ['run_code'] = '1'

main.binary_module()
# OUTPUT: executing code...
*Note: The above code presumes main.py and whatever script imports it exist in the same directory.
Although you cannot use import without running the code, there is quite a swift way to hand over your variables: use numpy.savez, which stores variables as numpy arrays in a .npz file, and load them afterwards with numpy.load.
See a full description in the scipy documentation
Please note this only works for variables and arrays of variables, not for methods, etc.
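A minimal sketch of that approach (file and variable names are arbitrary):
import numpy as np

# writer.py: save the variables instead of exposing them at import time
x = np.arange(5)
y = 3.14
np.savez('shared_data.npz', x=x, y=y)

# reader.py: load them back without importing (and running) writer.py
data = np.load('shared_data.npz')
print(data['x'], data['y'])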
Try just importing the functions needed from main.py? So,
from main import SomeFunction
It could be that you've named a function in batch.py the same as one in main.py, and when you import main.py the program runs the main.py function instead of the batch.py function; doing the above should fix that. I hope.


How to execute auxiliary script with main script's variables?

I have a main python script I am running (which is quite lengthy), so I've compartmentalized sections of it into different python scripts. I thought any variables declared in the main script would be considered global to the auxiliary scripts (as long as the main script is calling the aux ones), but I'm running into issues. Here is a simplified example:
mainscript.py:
import auxscript
a = 1
auxscript.auxscript()
auxscript.py:
def auxscript():
    print("the value of a is: %s" % a)

if __name__ == '__main__':
    auxscript()
What is the proper way to do this? I would have expected auxscript to first look locally for a, and then after not finding it, look globally within the scope it's launched from (mainscript.py)
Edit: I know I could obviously pass variables to the function through inputs, but I was hoping not to have to do this. It becomes messy when there are many variables that need to be accessed by the auxiliary scripts.
You can pass a through to your function like this.
mainscript.py:
import auxscript
a = 1
auxscript.auxscript(a)
auxscript.py:
def auxscript(a):
    print("the value of a is: %s" % a)

if __name__ == '__main__':
    auxscript(1)  # supply a value directly when the script is run on its own
Functions can generally only access variables passed to them, created within them, or defined at module level in their own module.
Edit: Saw your edit. Why not use the approach above, but instead of passing a long list of variables, pass a single dictionary and unpack it with **? The entries are then received as keyword arguments by your function; see the sketch below.
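A minimal sketch of that idea, with hypothetical variable names:
# mainscript.py (sketch)
import auxscript

shared = {'a': 1, 'b': 2}       # collect the shared variables once
auxscript.auxscript(**shared)   # unpack them into keyword arguments

# auxscript.py (sketch)
def auxscript(**kwargs):
    print("the value of a is: %s" % kwargs['a'])
    print("the value of b is: %s" % kwargs['b'])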

Python getopt() in __main__

I'm a Python beginner and have successfully gotten my first program with CLI parameters passed in to run. Got lots of help from this question: Handling command line options.
My question is: why does Example 5.45 use a separate def main(argv), instead of putting the try/except block directly inside the __main__ block?
Example 5.45
def main(argv):
    grammar = "kant.xml"
    try:
        opts, args = getopt.getopt(argv, "hg:d", ["help", "grammar="])
    except getopt.GetoptError:
        usage()
        sys.exit(2)
    ...

if __name__ == "__main__":
    main(sys.argv[1:])
Hope someone versed in Python could share your wisdom.
TIA -
Ashant
There is no strict technical reason, but it is quite idiomatic to keep the code outside functions as short as possible. Specifically, putting the code into the module scope would turn the variables grammar, opts and args into global public variables, even though they are only required inside the main code. Furthermore, using a dedicated main function simplifies unit-testing this function.
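For instance, a unit test can exercise the function without any real command line. A minimal sketch, assuming the example lives in a module importable as kgp (a hypothetical name) and relying on main calling sys.exit(2) when getopt rejects the arguments:
import pytest

import kgp  # hypothetical module containing Example 5.45

def test_main_rejects_unknown_option():
    # getopt.GetoptError is caught inside main, which then calls sys.exit(2)
    with pytest.raises(SystemExit):
        kgp.main(['--no-such-option'])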
One advantage of using a main function is that it allows for easy code re-use:
import sys
import script
script.main(sys.argv[1:])
# or, e.g. script.main(['-v', 'file.txt']), etc
Any code in the script's __main__ block won't be run if it is imported as a module. So the main function acts as a simple interface providing access to all the normal functionality of the script. The __main__ block will then usually contain just a single call to main, plus any other non-essential code (such as tests).
Some tips from the author of Python on how to write a good main function can be found here.

run a function (from a .py file) from a linux console

Maybe the title is not very clear; let me elaborate.
I have a Python script that opens a PPM file, applies a chosen filter (rotations, ...) and creates a new picture. Up to here everything works fine.
But I want to do the same thing from a Linux console, like:
ppmfilter.py ROTD /path/imageIn.ppm /path/imageOut.ppm
Here ROTD is the name of the function that applies a rotation.
I don't know how to do this; I'm looking for a library that will allow me to do it.
Looking forward to your help.
P.S.: I'm using Python 2.7
There is a relatively easy way:
You can determine the global names (functions, variables, etc.) with globals(), which gives you a dictionary of all global symbols. You then just need to check whether the looked-up object is callable, and if it is, call it with the remaining arguments from sys.argv:
import sys

def ROTD(infile, outfile):
    pass  # do something

if __name__ == '__main__':
    symbol = globals().get(sys.argv[1])
    if hasattr(symbol, '__call__'):
        symbol(*sys.argv[2:])
This will pass the program arguments (excluding the filename and the command name) to the function.
EDIT: Please don't forget the error handling. I omitted it for reasons of clarity.
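A sketch of what that error handling might look like (the checks are an addition, not part of the original answer, and the filter body is a placeholder):
import sys

def ROTD(infile, outfile):
    pass  # placeholder for the real rotation filter

if __name__ == '__main__':
    if len(sys.argv) < 2:
        sys.exit("usage: ppmfilter.py COMMAND [arguments...]")
    symbol = globals().get(sys.argv[1])
    if not callable(symbol):
        sys.exit("unknown command: %s" % sys.argv[1])
    symbol(*sys.argv[2:])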
Use a main() function:
def main():
    pass  # call your function here

if __name__ == "__main__":
    main()
A nice way to do it would be to define a big dictionary {alias: function} inside your module. For instance:
actions = {
    'ROTD': ROTD,
    'REFL': reflect_image,
    'INVT': invIm,
}
You get the idea. Then take the first command-line argument and interpret it as a key of this dictionary, applying actions[k] to the rest of the arguments.
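Put together, the dispatch could look roughly like this (a sketch; the filter bodies are placeholders, and reflect_image/invIm just reuse the names from the dictionary above):
import sys

def ROTD(infile, outfile):
    pass  # rotate the image

def reflect_image(infile, outfile):
    pass  # reflect the image

def invIm(infile, outfile):
    pass  # invert the image

actions = {
    'ROTD': ROTD,
    'REFL': reflect_image,
    'INVT': invIm,
}

if __name__ == '__main__':
    action = actions[sys.argv[1]]   # first argument selects the filter
    action(*sys.argv[2:])           # remaining arguments go to the filter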
You can define a main section in your ppmfilter.py that does this:
if __name__ == "__main__":
    import sys
    ROTD(sys.argv[1], sys.argv[2])  # change according to the signature of the function
and call it: python ppmfilter.py file1 file2
You can also run python -c in the directory that contains your *.py file:
python -c "import ppmfilter; ppmfilter.ROTD('/path/to/file1', '/path/to/file2')"
