What does `if __name__ == "__main__"` mean in Python? [duplicate] - python

I have written scripts in Python for quite a while now, and I study more of Python as I need it. When reading other people's code, I come across the if __name__ == "__main__": construct quite often.
What is it good for?

This allows you to use the same file either as a library (by importing it) or as the starting point for an application.
For example, consider the following file:
# hello.py
def hello(to=__name__):
    return "hello, %s" % to

if __name__ == "__main__":
    print(hello("world"))
You can use that code in two ways. For one, you can write a program that imports it. If you import the library, __name__ will be the name of the library and thus the check will fail, and the code will not execute (which is the desired behavior):
# program.py
from hello import hello  # this won't cause anything to print
print(hello("world"))
If you don't want to write this second file, you can directly run your code from the command line with something like:
$ python hello.py
hello, world
This behavior depends entirely on the special variable __name__, which Python sets based on whether the file is imported or run directly by the interpreter. If run directly, it is set to "__main__". If imported, it is set to the module name (in this case, hello).
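The two values of __name__ can be observed side by side. The following is a minimal sketch, not part of the original answer: it writes a throwaway module (the name demo_mod is made up for this illustration) to a temporary directory, imports it, and then executes the same file as a script using runpy.run_path from the standard library.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

# Hypothetical throwaway module: it just records the value of __name__.
SOURCE = "seen_name = __name__\n"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo_mod.py"
    path.write_text(SOURCE)

    # Case 1: import the file as a library.
    sys.path.insert(0, tmp)
    try:
        imported = importlib.import_module("demo_mod")
    finally:
        sys.path.remove(tmp)
    name_when_imported = imported.seen_name

    # Case 2: execute the same file the way "python demo_mod.py" would.
    globals_after_run = runpy.run_path(str(path), run_name="__main__")
    name_when_run = globals_after_run["seen_name"]

print(name_when_imported)  # demo_mod
print(name_when_run)       # __main__
```

The same source file sees a different __name__ depending on how it was started, which is all the guard relies on.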
Often this construct is used to add unit tests to your code. This way, when you write a library you can embed the testing code right in the file without worrying that it will get executed when the library is used in the normal way. When you want to test the library, you don't need any framework because you can just run the library as if it were a program.
See also __main__ in the python documentation (though it's remarkably sparse)

Basically,
There's a distinction between the "main" script file and external files which were imported or referenced in another way. If the script is the "main" script then the special variable __name__ will equal "__main__".
You can use this to protect the "execution" code from the classes and variables the script has. This enables you to import a script and use classes and variables without actually running that script's main code if it has any.
See also: What does if __name__ == "__main__": do?

Related

Is it possible to import a function from another script without importing the variables from such file?

I have two scripts, 1 and 2. I need these two scripts to work independently of one another, but I also need script 2 to be able to use the function in script 1 if needed. I tried to automate as much as possible, so each script already has a number of variables defined inside it.
Here is an oversimplified example of the scripts that I am talking about.
Script 1:
def plustwo(n):
    out = n + 2
    print(out)

m = 3
plustwo(m)
Result:
5
Script 2:
from file import plustwo
z=5
plustwo(z)
Result:
5
7
As you can see, when I import the function from script 1, even if I use from file import plustwo, the variable m seems to be imported as well. This causes script 2 to print two results: one for variable m (5) and another for variable z (7).
Do I necessarily need to create a third script, identical to script 1 but without the m variable? Is there a way to exclude such a variable from the import? (I just want the function from script 1, so that I can keep using script 1 with the variables it already has.)
In script 1, you can add if __name__ == '__main__':, so that it only executes
m=3
plustwo(m)
if the program is run directly from script1.py.
In the end, you only need to modify script 1 a little bit:
def plustwo(n):
    out = n + 2
    print(out)

if __name__ == '__main__':
    m = 3
    plustwo(m)
No, it is not.
An import statement will always execute the whole target module the first time it runs. Even when the import is written to cherry pick just a few functions and names from the target file, the file is still executed from the first to last line - this is the way the language works.
The "module object" created in this execution (basically a namespace with all the functions, classes and variables defined globally in the module) is then made available in the sys.modules dictionary. On a second import statement referencing the same file, the file will not run again: the cached version is used to pick any functions or variables from it.
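The caching described above can be checked directly. This is a small sketch under stated assumptions: the module name cache_demo and its contents are invented for the demonstration, and the file is created in a temporary directory so the example is self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module with a visible import-time side effect.
SOURCE = 'print("executing cache_demo")\nVALUE = 42\n'

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "cache_demo.py").write_text(SOURCE)
    sys.path.insert(0, tmp)
    try:
        first = importlib.import_module("cache_demo")   # runs the file, prints once
        second = importlib.import_module("cache_demo")  # served from sys.modules
    finally:
        sys.path.remove(tmp)

same_object = first is second
cached = sys.modules["cache_demo"] is first
print(same_object, cached)  # True True
```

The message "executing cache_demo" appears only once, even though the module is imported twice.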
The practice for when a module file has direct side effects, and it is still desirable that parts of it be importable, is to put the code that you don't want to execute upon importing in a block that only runs if the automatic variable __name__ contains the string "__main__". Python sets this variable so that a module can "know" whether it is running as the main program or has been imported as part of a larger system.
The idiom for this is ubiquitous in Python projects and usually looks like:
...
if __name__ == "__main__":
    # this block only runs when the code is executed as the main program
    m = 3
    plustwo(m)
(This comes, for obvious reasons, at the end of the file, after the function has been defined)
Otherwise, the only possible workaround for that would be to read the module file as a text file, parse and re-generate another source file dynamically, containing only the lines one is interested in, which would then be imported. And even so, the functions or classes imported in this way would not possibly address any other function, class or variable in the same module, of course.

For what uses do we need `sys` module in python?

I have some experience with other languages, but I'm a novice with Python. I have come across code in Jupyter notebooks where sys is imported.
I can't see any further use of the sys module in that code. Can someone help me understand the purpose of importing sys?
I do know about the module and its uses, but I can't find a concise reason why it is imported in so many code blocks without being used afterwards.
If nothing declared within sys is actually used, then there's no benefit to importing it. There's not a significant amount of cost either.
The sys module is quite useful because it lets you interact with the interpreter and its environment. For example:
You can access command-line arguments using sys.argv[1:].
You can inspect the module search path using sys.path.
You can check the version of your Python interpreter using sys.version.
You can exit the running program with sys.exit.
Mostly you will use it for accessing command-line arguments.
I'm a new Pythonista, bro. I learned to import it whenever I want to exit the program with a nice exit message in red:
import sys
name = input("What's your name? ")
if name == "Vedant":
    print(f"Hello There {name}.")
else:
    sys.exit(f"You're not {name}!")
The sys module includes functions and variables that help you inspect and change the Python environment at runtime.
Some examples of this control include:
1- Reading input data from another source via:
sys.stdin
2- Writing output data to another destination via:
sys.stdout
3- Writing error messages, such as the traceback printed automatically when an exception is not handled, to:
sys.stderr
4- Exiting the program with a message, like:
sys.exit("Finish with the calculations.")
5- Listing the directories in which the interpreter looks for modules, via the built-in variable:
sys.path
6- Getting the size of an object in bytes via:
sys.getsizeof(1)
sys.getsizeof(3.8)
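Several of the items listed in these answers can be tried in one place. The sketch below is only an illustration: the function name describe_runtime and the sample argument list are made up, and the argument list is passed in explicitly so the example does not depend on how the script was started.

```python
import sys


def describe_runtime(argv=None):
    """Collect a few of the values that the sys module exposes."""
    argv = sys.argv if argv is None else argv
    return {
        "script": argv[0] if argv else "",
        "arguments": argv[1:],                     # command-line arguments
        "python_version": sys.version_info[:3],    # (major, minor, micro)
        "search_path_entries": len(sys.path),      # where imports are looked up
        "int_size_bytes": sys.getsizeof(1),        # size of a small int object
    }


# Hypothetical command line: tool.py --verbose data.csv
info = describe_runtime(["tool.py", "--verbose", "data.csv"])
print(info["arguments"])  # ['--verbose', 'data.csv']

# Diagnostics are conventionally written to stderr rather than stdout.
print("diagnostics go to stderr", file=sys.stderr)
```

If none of these facilities is used, importing sys has no effect beyond making the name available.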

How should a Python file be written such that it can be both a module and a script with command line options and pipe capabilities?

I'm considering how a Python file could be made to be an importable module as well as a script that is capable of accepting command line options and arguments as well as pipe data. How should this be done?
My attempt seems to work, but I want to know if my approach is how such a thing should be done (if such a thing should be done). Could there be complexities (such as when importing it) that I have not considered?
#!/usr/bin/env python
"""
usage:
    program [options]

options:
    --version          display version and exit
    --datamode         engage data mode
    --data=FILENAME    input data file [default: data.txt]
"""
import sys

import docopt

# Placeholder value: the original snippet printed `version` without defining it.
version = "unknown"


def main(options):
    print("main")
    datamode = options["--datamode"]
    filename_input_data = options["--data"]
    if datamode:
        print("engage data mode")
        process_data(filename_input_data)
    if not sys.stdin.isatty():
        print("accepting pipe data")
        input_stream = sys.stdin
        input_stream_list = [line for line in input_stream]
        print("input stream: {data}".format(data=input_stream_list))


def process_data(filename):
    print("process data of file {filename}".format(filename=filename))


if __name__ == "__main__":
    options = docopt.docopt(__doc__)
    if options["--version"]:
        print(version)
        sys.exit()
    main(options)
That's it, you're good.
Nothing matters[1] except the if __name__ == '__main__', as noted elsewhere
From the docs (emphasis mine):
A module’s __name__ is set equal to '__main__' when read from standard input, a script, or from an interactive prompt. A module can discover whether or not it is running in the main scope by checking its own __name__, which allows a common idiom for conditionally executing code in a module when it is run as a script or with python -m but not when it is imported
I also like how python 2's docs poetically phrase it
It is this environment in which the idiomatic “conditional script” stanza causes a script to run:
That guard guarantees that the code underneath it is only executed when the file is run as the main program; put all your argument-grabbing code there. If there is no other top-level code except class/function definitions, the module is safe to import.
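One way to keep the argument-grabbing code importable and testable is to let main accept its inputs as parameters. The following is a rough sketch, not the asker's code: it uses argparse from the standard library instead of docopt, and the stdin parameter and the returned summary dictionary are assumptions made for the illustration.

```python
import argparse
import io
import sys


def build_parser():
    parser = argparse.ArgumentParser(prog="program")
    parser.add_argument("--datamode", action="store_true", help="engage data mode")
    parser.add_argument("--data", metavar="FILENAME", default="data.txt",
                        help="input data file")
    return parser


def main(argv=None, stdin=None):
    """Parse options, read piped input if any, and return a summary dict."""
    args = build_parser().parse_args(argv)   # argv=None means "use sys.argv[1:]"
    stdin = sys.stdin if stdin is None else stdin
    piped = []
    if not stdin.isatty():                   # something was piped in
        piped = [line.rstrip("\n") for line in stdin]
    return {"datamode": args.datamode, "data": args.data, "piped": piped}


# Calling main() from other code, with explicit arguments and a fake pipe:
result = main(["--datamode", "--data", "input.csv"], stdin=io.StringIO("a\nb\n"))
print(result)  # {'datamode': True, 'data': 'input.csv', 'piped': ['a', 'b']}
```

In a real file, the explicit call at the bottom would be replaced by a call to main() with no arguments placed under the if __name__ == "__main__": guard, so that the real command line and the real standard input are only touched when the file is run as a script.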
Other complications?
Yes:
Multiprocessing (a new interpreter is started and things are re-imported). if __name__ == '__main__' covers that
If you're used to C coding, you might be thinking you can protect your imports with ifdef's and the like. There are some analogous hacks in Python, but they're not what you're looking for.
I like having a main method like C and Java - when's that coming out? Never.
But I'm paranoid! What if someone changes my main function. Stop being friends with that person. As long as you're the user, I assume this isn't an issue.
I mentioned the -m flag. That sounds great, what's that?! Here and here, but don't worry about it.
Footnotes:
[1] Well, the fact that you put your main code in a function is nice. It means things will run slightly faster, because local variable lookups inside a function are quicker than global ones.

Does it matter whether I code top-to-bottom or use if __name__ == "__main__" conventions?

Most of my Python scripts (mostly written for web-scraping/data science apps) follow this kind of format:
# import whatever packages
import x, y, z
# do some web-scraping and data manipulation
# write some niche function I need
# make some plots and basically end the script
This is all done via an interactive editor/console (like Eclipse). I basically write the code above, and then copy-paste the code below for testing.
Is there a more "standard" way to go about this? I know C has functions defined above the main function, and I see packages in Python with if __name__ == "__main__" conventions; is this the "appropriate" way to go about development? I suppose the core question is whether you want to be able to use the functions you write in other projects, as well.
The main reason to use if __name__ == "__main__" is so that the code under it doesn't run if the file is imported from another file. When a file is imported in Python, its top-level code is simply run through the interpreter from top to bottom, but usually you just want the functions defined in the file, and not to run any other code in the file.
For example, if you want to test the functions in your file, or use them in another file, you'd put the code that actually runs the functions under the if __name__ == "__main__".
Since you're writing code for your own consumption, it's totally up to you whether you want to put your code under if __name__ == "__main__", but it's recommended if you want to write your code in a reusable way.
There is no official, universally followed standard in Python with regards to how to organize your code. In relation to your specific question, for readability, and to follow common style patterns, I would recommend you put your functions at the top, and group your web scraping, data manipulation, and plots at the bottom.
I've seen Python packages with if __name__ == "__main__" written in different places within the file before, and a main() defined elsewhere in the same file. I believe, however, that it should go at the bottom of the file you intend to have as your main.
For example, it could look something like this:
import everything
define class1
define function1
define class2
define function2
define function3
define main()
    call function3
    call function4
    do_things
define function4
define function5
if __name__ == "__main__":
    main()
The sample above is a very generic version of something I've actually got running at work. When executed, it makes calls to a few other files, and works beautifully. It even has a check in the if __name__ == "__main__": block to check for a minimum version of Python.
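A runnable version of that layout might look like the sketch below. The helper functions, the sample data, and the minimum version (3, 6) are all hypothetical placeholders, since the answer does not show its real code; only the overall shape (definitions first, main() near the end, version check inside the guard) follows the outline.

```python
import sys

MIN_PYTHON = (3, 6)  # assumed minimum; choose whatever your code needs


def python_is_recent_enough(version_info=None, minimum=MIN_PYTHON):
    """Return True if the given (or running) interpreter meets the minimum."""
    version_info = sys.version_info if version_info is None else version_info
    return tuple(version_info[:2]) >= minimum


def clean(values):
    """Hypothetical helper: drop missing entries."""
    return [v for v in values if v is not None]


def summarize(values):
    """Hypothetical helper: mean of the cleaned values."""
    cleaned = clean(values)
    return sum(cleaned) / len(cleaned)


def main():
    result = summarize([1, None, 2, 3])
    print(result)  # 2.0
    return result


if __name__ == "__main__":
    if not python_is_recent_enough():
        # %-formatting keeps this line valid even on very old interpreters.
        sys.exit("Python %d.%d or newer is required" % MIN_PYTHON)
    main()
```

Because everything except the guarded block is a definition, another project can import summarize or clean from this file without triggering main().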
Hope this helps.

What will happen if I modify a Python script while it's running?

Imagine a python script that will take a long time to run, what will happen if I modify it while it's running? Will the result be different?
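Nothing, because Python compiles your script to bytecode before it starts running it. (For the main script this bytecode lives in memory; .pyc files are only written for imported modules.)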
However, if some kind of exception occurs, you may get a slightly misleading explanation, because line X may have different code than before you started the script.
When you run a python program and the interpreter is started up, the first thing that happens is the following:
the modules sys and builtins are initialized
the __main__ module is initialized, which is the file you gave as an argument to the interpreter; this causes your code to execute
When a module is initialized, its code is run, defining classes, variables, and functions in the process. The first step of your module (i.e. main file) will probably be to import other modules, which will again be initialized in just the same way; their resulting namespaces are then made available for your module to use. The result of an importing process is in part a module (Python) object in memory. This object does have fields that point to the .py and .pyc content, but these are not evaluated anymore: module objects are cached and their source is never run twice. Hence, modifying the module afterwards on disk has no effect on the execution. It can have an effect when the source is read for introspective purposes, such as when exceptions are thrown, or via the inspect module.
This is why the check if __name__ == "__main__" is necessary when adding code that is not intended to run when the module is imported. Running the file as main is equivalent to that file being imported, with the exception of __name__ having a different value.
Sources:
What happens when a module is imported: The import system
What happens when the interpreter starts: Top Level Components
What's the __main__ module: __main__- Top-level code environment
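To illustrate the caching described in this answer, here is a minimal sketch in which a module's source is changed on disk after it has been imported. The module name live_demo and its contents are invented for the demonstration; the explicit timestamp bump is only there to make sure the edit is seen as newer than any cached bytecode.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "live_demo.py")  # hypothetical module name
    source.write_text('MESSAGE = "original"\n')
    sys.path.insert(0, tmp)
    try:
        import live_demo
        before_edit = live_demo.MESSAGE

        # Simulate editing the file while the program is running.
        source.write_text('MESSAGE = "edited on disk"\n')
        stat = source.stat()
        os.utime(source, (stat.st_atime, stat.st_mtime + 10))

        after_edit = live_demo.MESSAGE   # still the cached module object
        importlib.reload(live_demo)      # explicitly re-execute the source
        after_reload = live_demo.MESSAGE
    finally:
        sys.path.remove(tmp)

print(before_edit, after_edit, after_reload, sep=" | ")
# original | original | edited on disk
```

The edit only becomes visible after the explicit importlib.reload call; without it, the running program keeps using the module object it already has.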
This is a fun question. The answer is that "it depends".
Consider the following code:
"Example script showing bad things you can do with python."
import os
print('this is a good script')
with open(__file__, 'w') as fdesc:
    fdesc.write('print("this is a bad script")')
import bad
Try saving the above as "/tmp/bad.py" then do "cd /tmp" and finally "python3 bad.py" and see what happens.
On my ubuntu 20 system I see the output:
this is a good script
this is a bad script
So again, the answer to your question is "it depends". If you don't do anything funky then the script is in memory and you are fine. But python is a pretty dynamic language so there are a variety of ways to modify your "script" and have it affect the output.
If you aren't trying to do anything funky, then probably one of the things to watch out for are imports inside functions.
Below is another example which illustrates the idea (save as "/tmp/modify.py" and do "cd /tmp" and then "python3 modify.py" to run). The fiddle function defined below simulates you modifying the script while it is running (if desired, you could remove the fiddle function, put in a time.sleep(300) at the second to last line, and modify the file yourself).
The point is that since the show function is doing an import inside the function instead of at the top of the module, the import won't happen until the function is called. If you have modified the script before you call show, then your modified version of the script will be used.
If you are seeing surprising or unexpected behavior from modifying a running script, I would suggest looking for import statements inside functions. There are sometimes good reasons to do that sort of thing so you will see it in people's code as well as some libraries from time to time.
Below is the demonstration of how an import inside a function can cause strange effects. You can try this as is vs commenting out the call to the fiddle function to see the effect of modifying a script while it is running.
"Example showing import in a function"
import time

def yell(msg):
    "Yell a msg"
    return f'#{msg}#'

def show(msg):
    "Print a message nicely"
    import modify
    print(modify.yell(msg))

def fiddle():
    orig = open(__file__).read()
    with open(__file__, 'w') as fdesc:
        modified = orig.replace('{' + 'msg' + '}', '{msg.upper()}')
        fdesc.write(modified)

fiddle()
show('What do you think?')
No, the result will not reflect the changes once saved. The result will not change when running regular python files. You will have to save your changes and re-run your program.
If you run the following script:
from time import sleep
print("Printing hello world in: ")
for i in range(10, 0, -1):
    print(f"{i}...")
    sleep(1)
print("Hello World!")
If you then change "Hello World!" to "Hello StackOverflow!" while it's counting down, it will still output "Hello World!".
Nothing, as the other answer says. In addition, I ran an experiment with multiprocessing involved. Save the script below as x.py:
import multiprocessing
import time

def f(x):
    print(x)
    time.sleep(10)

if __name__ == '__main__':
    with multiprocessing.Pool(2) as pool:
        for _ in pool.imap(f, ['hello'] * 5):
            pass
After running python3 x.py and seeing the first two 'hello' lines printed, I changed ['hello'] to ['world'] and observed what happened. Nothing interesting happened. The result was still:
hello
hello
hello
hello
hello
Nothing happens. Once the script is loaded in memory and running, it will stay that way.
An "auto-reloading" feature can be implemented in your code anyway, like Flask and other frameworks do.
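As a rough idea of what such a feature involves, the sketch below reloads a module when its file's modification time moves forward. It is only an illustration under stated assumptions: the helper reload_if_changed and the module settings_demo are invented here, and real frameworks may work differently (for example by restarting the whole process instead of reloading in place).

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path


def reload_if_changed(module, last_mtime):
    """Reload `module` if its source file is newer than `last_mtime`.

    Returns the modification time that the caller should remember.
    """
    current = os.path.getmtime(module.__file__)
    if current > last_mtime:
        importlib.reload(module)
    return current


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "settings_demo.py")  # hypothetical module name
    source.write_text("GREETING = 'Hello World!'\n")
    sys.path.insert(0, tmp)
    try:
        import settings_demo
        seen = os.path.getmtime(source)

        seen = reload_if_changed(settings_demo, seen)  # nothing changed yet
        unchanged = settings_demo.GREETING

        # Simulate an edit and make its timestamp clearly newer.
        source.write_text("GREETING = 'Hello StackOverflow!'\n")
        os.utime(source, (seen + 5, seen + 5))
        seen = reload_if_changed(settings_demo, seen)
        changed = settings_demo.GREETING
    finally:
        sys.path.remove(tmp)

print(unchanged)  # Hello World!
print(changed)    # Hello StackOverflow!
```

A long-running program would call a helper like this periodically, for instance once per loop iteration or per request.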
This is slightly different from what you describe in your question, but it works:
my_string = "Hello World!"
line = input(">>> ")
exec(line)
print(my_string)
Test run:
>>> print("Hey")
Hey
Hello World!
>>> my_string = "Goodbye, World"
Goodbye, World
See, you can change the behavior of your "loaded" code dynamically.
It depends. If a Python script loads another file that has been modified, it will of course load the newer version. But if the source doesn't refer to any other file, the whole script just runs from the cached copy for as long as it is running. Changes will be visible the next time it runs.
As for applying changes automatically when they're made: yes, @pcbacterio was correct. It is possible to do that, but the script that does it simply remembers the last thing it was doing and checks when the file is modified in order to rerun it (so it's almost invisible).
=]
