My package has the following structure:
mypackage
|-__main__.py
|-__init__.py
|-model
  |-__init__.py
  |-modelfile.py
|-simulation
  |-sim1.py
  |-sim2.py
The content of the file __main__.py is
from mypackage.simulation import sim1
if __name__ == '__main__':
    sim1
So that when I execute python -m mypackage, the script sim1.py runs.
Now I would like to add an argument to the command line, so that python -m mypackage sim1 runs sim1.py and python -m mypackage sim2 runs sim2.py.
I've tried the following:
import sys
from mypackage.simulation import sim1, sim2
if __name__ == '__main__':
    for arg in sys.argv:
        arg
But it runs both scripts instead of only the one passed as an argument.
In sim1.py and sim2.py I have the following code
from mypackage.model import modelfile
print('modelfile.ModelClass.someattr')
You can simply call __import__ with the module name as parameter, e.g.:
new_module = __import__(arg)
in your loop.
So, for example, you have your main program named example.py:
import sys
if __name__ == '__main__':
    for arg in sys.argv[1:]:
        module = __import__(arg)
        print(arg, module.foo(1))
Note that sys.argv[0] contains the program name.
You have your sim1.py:
print('sim1')
def foo(n):
    return n+1
and your sim2.py:
print('sim2')
def foo(n):
    return n+2
then you can call
python example.py sim1 sim2
output:
sim1
sim1 2
sim2
sim2 3
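On Python 3, importlib.import_module is the documented alternative to calling __import__ directly, and it accepts dotted names such as mypackage.simulation.sim1. A minimal sketch, assuming the package layout from the question; the SIMULATIONS table and the load_simulation helper are hypothetical names, not part of any library:

```python
import importlib

# Hypothetical whitelist: command-line name -> dotted module path.
# The paths assume the package layout shown in the question.
SIMULATIONS = {
    "sim1": "mypackage.simulation.sim1",
    "sim2": "mypackage.simulation.sim2",
}


def load_simulation(name, table=SIMULATIONS):
    """Import and return the module registered under `name`."""
    if name not in table:
        raise ValueError("unknown simulation: %r" % name)
    # Importing executes the module's top-level code the first time.
    return importlib.import_module(table[name])
```

In __main__.py this could be called as load_simulation(arg) for each arg in sys.argv[1:]. Since sim1.py and sim2.py do their work at import time, importing only the requested module runs only that simulation.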
Suppose you have files with the following content.
sim1.py
def simulation1():
    print("This is simulation 1")
simulation1()
main.py
import sim1
sim1.simulation1()
output
This is simulation 1
This is simulation 1
When you import sim1 into main.py and call its function simulation1, This is simulation 1 gets printed twice.
This is because simulation1 is called both inside sim1.py and in main.py.
If you want sim1.py to run that function when it is executed directly, but not when sim1 is imported, place the call inside if __name__ == "__main__":.
sim1.py
def simulation1():
    print("This is simulation 1")
if __name__ == "__main__":
    simulation1()
main.py
import sim1
sim1.simulation1()
output
This is simulation 1
Your code doesn't do what you want it to do. Just sim1 doesn't actually call the function; the syntax to do that is sim1().
You could make your Python script evaluate random strings from the command line as Python expressions, but that's really not a secure or elegant way to solve this. Instead, have the strings map to internal functions, which may or may not have the same name. For example,
if __name__ == '__main__':
    import sys
    for arg in sys.argv[1:]:
        if arg == 'sim1':
            sim1()
        elif arg == 'mustard':
            sim2()
        elif arg == 'ketchup':
            sim3(sausages=2, cucumber=user in cucumberlovers)
        else:
            raise ValueError('Anguish! Don\'t know how to handle %s' % arg)
As this should hopefully illustrate, the symbol you accept on the command line does not need to correspond to the name of the function you want to run. If you want that to be the case, you can simplify this to use a dictionary:
if __name__ == '__main__':
    import sys
    d = {fun.__name__: fun for fun in (sim1, sim2)}
    for arg in sys.argv[1:]:
        if arg in d:
            d[arg]()
        else:
            raise ValueError('Anguish! etc')
What's perhaps important to note here is that you select exactly which Python symbols you want to give the user access to from the command line, and allow no others to leak through. That would be a security problem (think what would happen if someone passed in 'import shutil; shutil.rmtree("/")' as the argument to run). This is similar in spirit to the many, many reasons to avoid eval, which you will find are easy to google (and you probably should if this is unfamiliar to you).
If sim1 is a module name you want to import only when the user specifically requests it, that's not hard to do either; see "importing a module when the module name is in a variable". Note that you then can't import it earlier on in the script.
if __name__ == '__main__':
    import sys
    modules = ['sim1', 'sim2']
    for arg in sys.argv[1:]:
        if arg in modules:
            globals()[arg] = __import__(arg)
        else:
            raise ValueError('Anguish! etc')
But generally speaking, modules should probably only define functions, and leave it to the caller to decide if and when to run them at some time after they import the module.
Perhaps tangentially look into third-party libraries like click which easily allow you to expose selected functions as "subcommands" of your Python script, vaguely similarly to how git has subcommands init, log, etc.
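If a third-party dependency is not wanted, the standard library's argparse offers a comparable subcommand mechanism (this swaps in argparse for click). A rough sketch, in which the sim1/sim2 functions and the mypackage program name are placeholders:

```python
import argparse


def sim1():
    return "ran sim1"


def sim2():
    return "ran sim2"


def build_parser():
    parser = argparse.ArgumentParser(prog="mypackage")
    # required=True (Python 3.7+) makes a missing subcommand an error.
    subparsers = parser.add_subparsers(dest="command", required=True)
    # Each subcommand stores the function to call in args.func.
    subparsers.add_parser("sim1", help="run simulation 1").set_defaults(func=sim1)
    subparsers.add_parser("sim2", help="run simulation 2").set_defaults(func=sim2)
    return parser


def main(argv=None):
    # argv=None makes argparse read sys.argv[1:].
    args = build_parser().parse_args(argv)
    return args.func()
```

Only the names registered with add_parser are accepted; anything else makes argparse print a usage message and exit, so no other Python symbol can be reached from the command line.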
I have a Python program I'm building that can be run in either of 2 ways: the first is to call python main.py which prompts the user for input in a friendly manner and then runs the user input through the program. The other way is to call python batch.py -file- which will pass over all the friendly input gathering and run an entire file's worth of input through the program in a single go.
The problem is that when I run batch.py, it imports some variables/methods/etc from main.py, and when it runs this code:
import main
at the first line of the program, it immediately errors because it tries to run the code in main.py.
How can I stop Python from running the code contained in the main module which I'm importing?
Because this is just how Python works - keywords such as class and def are not declarations. Instead, they are real live statements which are executed. If they were not executed your module would be empty.
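A small illustration of that point, as a sketch with made-up names (lookup, later): the function name only exists once the def statement has actually been executed.

```python
def lookup():
    """Report whether the name 'later' exists at the moment of the call."""
    return "later" in globals()


before = lookup()  # the def below has not been executed yet


def later():
    return "now I exist"


after = lookup()  # the def statement has run and bound the name

print(before, after)  # prints: False True
```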
The idiomatic approach is:
# stuff to run always here such as class/def
def main():
    pass
if __name__ == "__main__":
    # stuff only to run when not called via 'import' here
    main()
It does require source control over the module being imported, however.
Due to the way Python works, it is necessary for it to run your modules when it imports them.
To prevent code in the module from being executed when imported, but only when run directly, you can guard it with this if:
if __name__ == "__main__":
    # this won't be run when imported
You may want to put this code in a main() function, so that you can either execute the file directly, or import the module and call main(). For example, assume this is in the file foo.py.
def main():
    print("Hello World")
if __name__ == "__main__":
    main()
This program can be run either by running python foo.py, or from another Python script:
import foo
...
foo.main()
Use the if __name__ == '__main__' idiom -- __name__ is a special variable whose value is '__main__' if the module is being run as a script, and the module name if it's imported. So you'd do something like
# imports
# class/function definitions
if __name__ == '__main__':
    # code here will only run when you invoke 'python main.py'
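The difference can be observed directly. The sketch below writes a throwaway module to a temporary directory, loads it once the way import would and once the way python file.py would (via runpy.run_path), and compares what ran; the file name guarded_demo.py and the events list are invented for the demonstration.

```python
import importlib.util
import os
import runpy
import tempfile

SOURCE = (
    'events = []\n'
    'events.append("top level")\n'
    'if __name__ == "__main__":\n'
    '    events.append("guarded")\n'
)


def compare_import_and_run():
    """Return (events seen when imported, events seen when run as a script)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "guarded_demo.py")
        with open(path, "w") as handle:
            handle.write(SOURCE)

        # Load as a regular module: __name__ is "guarded_demo".
        spec = importlib.util.spec_from_file_location("guarded_demo", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

        # Execute as a script: __name__ is "__main__".
        script_globals = runpy.run_path(path, run_name="__main__")

    return module.events, script_globals["events"]
```

The first list contains only "top level"; the second also contains "guarded".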
Unfortunately, you don't. That is part of how the import syntax works and it is important that it does so -- remember def is actually something executed, if Python did not execute the import, you'd be, well, stuck without functions.
Since you probably have access to the file, though, you might be able to look and see what causes the error. It might be possible to modify your environment to prevent the error from happening.
Put the code inside a function and it won't run until you call the function. You should have a main function in your main.py, with the statement:
if __name__ == '__main__':
    main()
Then, if you call python main.py the main() function will run. If you import main.py, it will not. Also, you should probably rename main.py to something else for clarity's sake.
There was a Python enhancement proposal, PEP 299, which aimed to replace the if __name__ == '__main__': idiom with def __main__:, but it was rejected. It's still a good read to know what to keep in mind when using if __name__ == '__main__':.
You may write your "main.py" like this:
#!/usr/bin/env python
__all__=["somevar", "do_something"]
somevar=""
def do_something():
    pass #blahblah
if __name__=="__main__":
    do_something()
I did a simple test:
#test.py
x = 1
print("1, has it been executed?")
def t1():
    print("hello")
    print("2, has it been executed?")
def t2():
    print("world")
    print("3, has it been executed?")
def main():
    print("Hello World")
    print("4, has it been executed?")
print("5, has it been executed?")
print(x)
# while True:
#     t2()
if x == 1:
    print("6, has it been executed?")
#test2.py
import test
When executing or running test2.py, the running result:
1, has it been executed?
5, has it been executed?
1
6, has it been executed?
Conclusion: when the imported module has no if __name__=="__main__": guard, importing it runs the module. Code that is not inside a function is executed sequentially, and code inside a function is not executed unless the function is called.
In addition:
def main():
    # Put all the code you need to execute when this script is run directly.
    pass
if __name__ == '__main__':
    main()
else:
    # Put code that should be executed only when the module is imported here.
    pass
A minor error that can happen (at least it happened to me), especially when distributing Python scripts/functions that carry out a complete analysis, is calling the function directly at the end of the .py file that defines it.
The only things a user needed to modify were the input files and parameters.
If you do that, the function runs immediately when you import the file. For proper behavior, simply remove the call from inside that file and reserve it for the real calling file/function/portion of code.
Another option is to use a binary environment variable; let's call it 'run_code'. If run_code = 0 (False), structure main.py to bypass the code (the temporarily bypassed function will still be imported as part of the module). Later, when you are ready to use the imported function, set the environment variable run_code = 1 (True). Use os.environ to set and retrieve the variable, but be sure to convert it to an integer when retrieving it (or restructure the if statement to compare against a string value).
In main.py:
import os
# set environment variable to 0 (False):
os.environ['run_code'] = '0'
def binary_module():
    # retrieve environment variable, convert to integer
    run_code_val = int(os.environ['run_code'])
    if run_code_val == 0:
        print('nope. not doing it.')
    if run_code_val == 1:
        print('executing code...')
        # [do something]
...in whatever script is loading main.py:
import os,main
main.binary_module()
OUTPUT: nope. not doing it.
# now flip the on switch!
os.environ['run_code'] = '1'
main.binary_module()
OUTPUT: executing code...
*Note: The above code presumes main.py and whatever script imports it exist in the same directory.
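One fragile spot in the code above is int(os.environ['run_code']), which raises KeyError if the variable was never set. A small sketch of a more forgiving check, using os.environ.get with a default; the run_code_enabled helper is a hypothetical name:

```python
import os


def run_code_enabled(environ=None):
    """Return True only when run_code is set to the string '1'."""
    if environ is None:
        environ = os.environ
    # Environment values are always strings; an unset variable counts as '0'.
    return environ.get("run_code", "0") == "1"
```

Passing a plain dict instead of os.environ makes the check easy to exercise without touching the real environment.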
Although you cannot use import without running the code, there is quite a swift way to bring in your variables: use numpy.savez, which stores variables as NumPy arrays in a .npz file. Afterwards you can load the variables using numpy.load.
See a full description in the scipy documentation
Please note this only works for variables and arrays of variables, not for methods, etc.
Try just importing the functions needed from main.py? So,
from main import SomeFunction
It could be that you've named a function in batch.py the same as one in main.py, and when you import main.py the program runs the main.py function instead of the batch.py function; doing the above should fix that. I hope.
Should the function name main() always be empty and have arguments called within the function itself or is it acceptable to have them as inputs to the function e.g main(arg1, arg2, arg3)?
I know it works but I'm wondering if it is poor programming practice. Apologies if this is a duplicate but I couldn't see the question specifically answered for Python.
In most other programming languages, main takes either zero parameters or two:
int main(int argc, char *argv[])
to denote the arguments passed to the program. However, in Python these are accessed through the sys module:
import sys
def main():
    print(sys.argv, len(sys.argv))
But then you could extend this so that you pass argv and argc into your Python function, similar to other languages:
import sys
def main(argv, argc):
    print(argv, argc)
if __name__ == '__main__':
    main(sys.argv, len(sys.argv))
But let's forget about argv/argc for now - why would you want to pass something through to main. You create something outside of main and want to pass it through to main. And this can happen in two instances:
1. You're calling main multiple times from other functions.
2. You've created variables outside main that you want to pass through.
Point number 1 is definitely bad practice. main should be unique and called only once at the beginning of your program. If you have the need to call it multiple times, then the code inside main doesn't belong inside main. Split it up.
Point number 2 may seem like it makes sense, but then you do it in practise:
def main(a, b):
    print(a, b)
if __name__ == '__main__':
    x = 4
    y = 5
    main(x, y)
But then aren't x and y global variables? And good practice would assume that these are at the top of your file (and multiple other properties - they're constant, etc), and that you wouldn't need to pass these through as arguments.
By following the pattern:
def main():
    ...  # stuff
if __name__ == '__main__':
    main()
you allow your script both to be run directly and, if packaged using setuptools, to have an executable script generated automatically when the package is installed, by specifying main as an entry point.
See: https://setuptools.readthedocs.io/en/latest/setuptools.html#automatic-script-creation
You would add to setup.py something like:
entry_points={
    'console_scripts': [
        'my_script = my_module:main'
    ]
}
And then when you build a package, people can install it in their virtual environment, and immediately get a script called my_script on their path.
Automatic script creation like this requires a function that takes no required arguments.
It's a good idea to allow your script to be imported and to expose its functionality, both for code reuse and for testing. I would recommend something like this pattern:
import argparse
def parse_args():
    parser = argparse.ArgumentParser()
    #
    # ... configure command line arguments ...
    #
    return parser.parse_args()
def do_stuff(args):
    #
    # ... main functionality goes in here ...
    #
    pass
def main():
    args = parse_args()
    do_stuff(args)
if __name__ == '__main__':
    main()
This allows you to run your script directly, have an automatically generated script that behaves the same way, and also import the script and call do_stuff to re-use or test the actual functionality.
This blog post was mentioned in the comments: https://www.artima.com/weblogs/viewpost.jsp?thread=4829 which uses a default argument on main to allow dependency injection for testing, however, this is a very old blog post; the getopt library has been superseded twice since then. This pattern is superior and still allows dependency injection.
I would definitely prefer to see main take arguments rather than accessing sys.argv directly.
This makes the reuse of the main function by other Python modules much easier.
import sys
def main(arg):
    ...
if __name__ == "__main__":
    main(sys.argv[1])
Now if I want to execute this module as a script from another module I can just write (in my other module).
from main_script import main
main("use this argument")
If main uses sys.argv this is tougher.
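One way to reconcile this with entry points that need a callable without required arguments is to give main an optional argv parameter that falls back to sys.argv[1:]. A minimal sketch; the total helper and the summing behaviour are only illustrative:

```python
import sys


def total(values):
    """Convert each command-line string to int and add them up."""
    return sum(int(value) for value in values)


def main(argv=None):
    # With no explicit list, fall back to the real command line.
    if argv is None:
        argv = sys.argv[1:]
    print(total(argv))
    return 0  # usable as a process exit status
```

Another module or a test can call main(["1", "2"]) directly, while a script would end with the usual if __name__ == "__main__": guard calling sys.exit(main()).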
I have a number of scripts that reference a Python program via the:
python -c "execfile('myfile.py'); readFunc(param='myParam', input='blahblah')"
interface. What I'd like to do is conceptually simple: Develop a more modular system with a "main" and a normal Python CLI interface that then calls these functions, but also MAINTAINS the existing interface, so the scripts built to use it still work.
Is this possible?
Ideally, if I was to call
python myFile readFunc myParam blabblah
It'd be something like:
def main(argv):
    readFunc(argv[2], argv[3])
I've tried something like that, but it hasn't quite worked. Is it possible to keep both interfaces/methods of invocation?
Thanks!
The first idea that comes to mind stems from the optional arguments to the execfile() function. You might be able to do something like this:
def main(args):
    results = do_stuff()
    return results
if __name__ == '__main__':
    import sys
    main(sys.argv[1:])
if __name__ == 'execfile':
    main(args)
... and then when you want to call it via execfile() you supply a dictionary for its optional globals argument:
python -c 'execfile(myfile, {"__name__":"execfile", "args":(1,2,3)}); ...'
This does require a little extra work when you're calling your functionality via -c, as you have to remember to pass that dictionary and override '__name__' ... though I suppose you could actually use any valid Python identifier. It's just that __name__ is closest to what you're actually doing.
The next idea feels a little dirty but relies on the apparent handling of the __file__ global identifier. That seems to be unset when calling python -c and set if the file is being imported or executed. So this works (at least for CPython 2.7.9):
#!/usr/bin/env python
foo='foo'
if __name__ == '__main__' and '__file__' not in globals():
    print "Under -c:", foo
elif __name__ == '__main__':
    print "Executed standalone:", foo
... and if you use that please don't give me credit. It looks ...
... ... ummm ....
... just ...
.... WRONG
If I understand this one
python myFile readFunc myParam blabblah
correctly, you want to parse argv[1] as a command name to be executed.
So just do
if __name__ == '__main__':
    import sys
    if len(sys.argv) < 2 or sys.argv[1].lower() == 'nop' or sys.argv[0] == '-c': # old, legacy interface
        pass
    elif sys.argv[1].lower() == 'readfunc': # new one
        readFunc(sys.argv[2:])
where the 2nd part gets executed on a direct execution of the file (either via python file.py readFunc myParam blabblah or via python -m file readFunc myParam blabblah)
The "nop" / empty argv branch comes into play when using the "legacy" interface: in this case, you have most probably given no command-line arguments and can thus assume that you don't want to execute anything.
This makes the situation as before: the readFunc identifier is exported and can be used from within the -c script as before.
This is my python hello.py script:
def hello(a,b):
print "hello and that's your sum:"
sum=a+b
print sum
import sys
if __name__ == "__main__":
hello(sys.argv[2])
The problem is that it can't be run from the windows command line prompt, I used this command:
C:\Python27>hello 1 1
But it didn't work unfortunately, may somebody please help?
Move import sys out of the hello function.
The arguments should be converted to int.
A string literal that contains ' should be escaped or surrounded by ".
Did you invoke the program with python hello.py <some-number> <some-number> on the command line?
import sys
def hello(a,b):
    print "hello and that's your sum:", a + b
if __name__ == "__main__":
    a = int(sys.argv[1])
    b = int(sys.argv[2])
    hello(a, b)
I found this thread looking for information about dealing with parameters; this easy guide was so cool:
import argparse
parser = argparse.ArgumentParser(description='Script so useful.')
parser.add_argument("--opt1", type=int, default=1)
parser.add_argument("--opt2")
args = parser.parse_args()
opt1_value = args.opt1
opt2_value = args.opt2
runs like:
python myScript.py --opt2 hi
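parse_args also accepts an explicit list of strings, which makes it easy to check how a command line will be interpreted without opening a shell. A small sketch reusing the options above (build_parser is a hypothetical wrapper); it shows that an option's value can be given as a separate token or joined with = and no surrounding spaces:

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description='Script so useful.')
    parser.add_argument("--opt1", type=int, default=1)
    parser.add_argument("--opt2")
    return parser


parser = build_parser()
spaced = parser.parse_args(["--opt2", "hi"])  # as in: --opt2 hi
joined = parser.parse_args(["--opt2=hi"])     # as in: --opt2=hi

print(spaced.opt1, spaced.opt2, joined.opt2)  # prints: 1 hi hi
```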
Here are all of the previous answers summarized:
Modules should be imported outside of functions.
hello(sys.argv[2]) needs to be indented since it is inside an if statement.
hello takes 2 arguments, so you need to pass 2 arguments.
as far as calling the function from terminal, you need to call python .py ...
The code should look like this:
import sys
def hello(a, b):
    print "hello and that's your sum:"
    sum = a+b
    print sum
if __name__== "__main__":
    hello(int(sys.argv[1]), int(sys.argv[2]))
Then run the code with this command:
python hello.py 1 1
To execute your program from the command line, you have to call the Python interpreter, like this:
C:\Python27>python hello.py 1 1
If your code resides in another directory, you will also have to add the Python binary's path to your PATH environment variable to be able to run it. You can find detailed instructions here.
Your indentation is broken. This should fix it:
import sys
def hello(a,b):
    print 'hello and thats your sum:'
    sum=a+b
    print sum
if __name__ == "__main__":
    hello(sys.argv[1], sys.argv[2])
Obviously, if you put the if __name__ statement inside the function, it will only ever be evaluated if you run that function. The problem is: the point of said statement is to run the function in the first place.
import sys
def hello(a, b):
    print 'hello and that\'s your sum: {0}'.format(a + b)
if __name__ == '__main__':
    hello(int(sys.argv[1]), int(sys.argv[2]))
Moreover see #thibauts answer about how to call python script.
There are more than a couple of mistakes in the code.
The 'import sys' line should be outside the function, since the function itself is called with arguments fetched through sys.
If you want the correct sum, you should cast the arguments (strings) to floats. Change the sum line to --> sum = float(a) + float(b).
Since you have not defined default values for any of the function arguments, it is necessary to pass both arguments when calling the function --> hello(sys.argv[1], sys.argv[2])
import sys
def hello(a,b):
    print ("hello and that's your sum:")
    sum=float(a)+float(b)
    print (sum)
if __name__ == "__main__":
    hello(sys.argv[1], sys.argv[2])
Also, using "C:\Python27>hello 1 1" to run the code looks fine but you have to make sure that the file is in one of the directories that Python knows about (PATH env variable). So, please use the full path to validate the code.
Something like:
C:\Python34>python C:\Users\pranayk\Desktop\hello.py 1 1