Sorry if this is a very novice question, I was just wondering one thing.
When in python and your code is split up amongst multiple files, how would you avoid a ton of imports on the same thing?
Say I have 2 files. Main, and Content.
Main:
import pygame
from pygame.locals import *
pygame.display.init()
blah
Content:
import pygame
from pygame.locals import *
pygame.display.init()
load content and stuff
pygame is imported twice and display.init is called twice. This is problematic in other places. Is there anyway to get around this, or must it just import and import and import?
One situation I can think of is this: A script that writes to a file everytime it is imported. That way, if it's imported 3 times, it gets run 3 times, therefore writing to the file 3 times.
Thanks in advance!
You misunderstand what import does. It's not the same as include. Loaded modules are singletons and their corresponding files are not evaluated twice.
That said, a well-constructed module will not have side-effects on import. That's the purpose of the if __name__=='__main__' idiom.
Do not attempt to "clean up" your imports. Import everything you need to use from within a file. You could make less use of import *, but this is purely for code readability and maintainability.
You should avoid having anything happen at import(except, see further down). a python file is a module first, so it can and should be used by other python modules. If something "happens" in the import stage of a python file, then it may happen in an undesirable way when that file is imported by another module.
Each module should just define things to be used: classes, functions, constants, and just wait for something else to use them.
Obviously, if no script ever does at import, then it's not possible for anything to actually get used and make stuff "happen". There is a special idiom for the unusual case that a module was called directly. Each python file has a variable, __name__ automatically created with the module name it was imported as. When you run a script from the command line (or however you have started it), it wasn't imported, and there's no name for it to have, and so the __name__ variable will have a special value "__main__" that indicates that it's the script being executed. You can check for this condition and act accordingly:
# main.py
import pygame
from pygame.locals import *
import content
def init():
pygame.display.init()
def stuff():
content.morestuff()
if __name__ == '__main__':
init()
stuff()
# content.py
import pygame
from pygame.locals import *
def init():
pygame.display.init()
def morestuff():
"do some more stuff"
if __name__ == '__main__':
init()
morestuff()
This way; init() and thus pygame.display.init() are only ever called once, by the script that was run by the user. the code that runs assuming that init() has already been called is broken into another function, and called as needed by the main script (whatever that happens to be)
import statements are supposed to be declarations that you're using something from another module. They shouldn't be used to cause something to happen (like writing to a file). As noted by Francis Avila, Python will try not to execute the code of a module more than once anyway.
This has the consequence that whether a given import statement will cause anything to happen is a global property of the application at runtime; you can't tell just from the import statement and the source code of the module, because it depends on whether any other code anywhere else in the project has already imported that module.
So having a module "do something" when executed is generally a very fragile way to implement your application. There is no hard and fast definition of "do something" though, because obviously the module needs to create all the things that other modules will import from it, and that may involve reading config files, possibly even writing log files, or any number of other things. But anything that seems like the "action" of your program, rather than just "setting up" things to be imported from the module, should usually not be done in module scope.
Related
I have two python files. One is main.py which I execute. The other is test.py, from which I import a class in main.py.
Here is the sample code for main.py:
from test import Test
if __name__ == '__main__':
print("You're inside main.py")
test_object = Test()
And, here is the sample code for test.py:
class Test:
def __init__(self):
print("you're initializing the class.")
if __name__ == '__main__':
print('You executed test.py')
else:
print('You executed main.py')
Finally, here's the output, when you execute main.py:
You executed main.py
You're inside main.py
you're initializing the class.
From the order of outputs above, you can see that once you import a piece of a file, the whole file gets executed immediately. I am wondering why? what's the logic behind that?
I am coming from java language, where all files included a single class with the same name. I am just confused that why python behaves this way.
Any explanation would be appricated.
What is happening?
When you import the test-module, the interpreter runs through it, executing line by line. Since the if __name__ == '__main__' evaluates as false, it executes the else-clause. After this it continues beyond the from test import Test in main.py.
Why does python execute the imported file?
Python is an interpreted language. Being interpreted means that the program being read and evaluated one line at the time. Going through the imported module, the interpreter needs to evaluate each line, as it has no way to discern which lines are useful to the module or not. For instance, a module could have variables that need to be initialized.
Python is designed to support multiple paradigms. This behavior is used in some of the paradigms python supports, such as procedural programming.
Execution allows the designer of that module to account for different use cases. The module could be imported or run as a script. To accommodate this, some functions, classes or methods may need to be redefined. As an example, a script could output non-critical errors to the terminal, while an imported module to a log-file.
Why specify what to import?
Lets say you are importing two modules, both with a Test-class. If everything from those modules is imported, only one version of the Test-class can exist in our program. We can resolve this issue using different syntax.
import package1
import package2
package1.Test()
packade2.Test()
Alternatively, you can rename them with the as-keyword.
from package1 import Test
from package2 import Test as OtherTest
Test()
OtherTest()
Dumping everything into the global namepace (i.e from test import *) pollutes the namespace of your program with a lot of definitions you might not need and unintentionally overwrite/use.
where all files included a single class with the same name
There is not such requirement imposed in python, you can put multiple classes, functions, values in single .py file for example
class OneClass:
pass
class AnotherClass:
pass
def add(x,y):
return x+y
def diff(x,y):
return x-y
pi = 22/7
is legal python file.
According to interview with python's creator modules mechanism in python was influenced by Modula-2 and Modula-3 languages. So maybe right question is why creators of said languages elected to implement modules that way?
I know that when we do 'import module_name', then it gets loaded only once, irrespective of the number of times the code passes through the import statement.
But if we move the import statement into a function, then for each function call does the module get re-loaded? If not, then why is it a good practice to import a module at the top of the file, instead of in function?
Does this behavior change for a multi threaded or multi process app?
It does not get loaded every time.
Proof:
file.py:
print('hello')
file2.py:
def a():
import file
a()
a()
Output:
hello
Then why put it on the top?:
Because writing the imports inside a function will cause calls to that function take longer.
I know that when we do 'import module_name', then it gets loaded only once, irrespective of the number of times the code passes through the import statement.
Right!
But if we move the import statement into a function, then for each function call does the module get re-loaded?
No. But if you want, you can explicitly do it something like this:
import importlib
importlib.reload(target_module)
If not, then why is it a good practice to import a module at the top of the file, instead of in function?
When Python imports a module, it first checks the module registry (sys.modules) to see if the module is already imported. If that’s the case, Python uses the existing module object as is.
Even though it does not get reloaded, still it has to check if this module is already imported or not. So, there is some extra work done each time the function is called which is unnecessary.
It doesn't get reloaded after every function call and threading does not change this behavior. Here's how I tested it:
test.py:
print("Loaded")
testing.py:
import _thread
def call():
import test
for i in range(10):
call()
_thread.start_new_thread(call, ())
_thread.start_new_thread(call, ())
OUTPUT:
LOADED
To answer your second question, if you import the module at the top of the file, the module will be imported for all functions within the python file. This saves you from having to import the same module in multiple functions if they use the same module.
I am a python beginne, and am currently learning import modules in python.
So my question is:
Suppose I currently have three python files, which is module1.py, module2.py, and module3.py;
In module1.py:
def function1():
print('Hello')
In module2.py, in order to use those functions in module1.py:
import module1
#Also, I have some other public functions in this .py file
def function2():
print('Goodbye')
#Use the function in module1.py
if __name__ == '__main__':
module1.function1();
function2();
In module3.py, I would like to use both the functions from module1.py and module2.py.
import module1
import module2
def function3():
print('Nice yo meet you');
if __name__ == '__main__':
module1.function1()
function3()
module2.function2()
Seems like it works. But my questions are mainly on module3.py. The reason is that in module3.py, I imported both module1 and module2. However, module1 is imported by module2 already. I am just wondering if this is a good way to code? Is this effective? Should I do this? or Should I just avoid doing this and why?
Thank you so much. I am just a beginner, so if I ask stupid questions, please forgive me. Thank you!!
There will be no problem if you avoid circular imports, that is you never import a module that itself imports the current importing module.
A module does not see the importer namespace, so imports in the importer code don't become globals to the imported module.
Also module top-level code will run on first import only.
Edit 1:
I am answering Filipe's comments here because its easier.
"There will be no problem if you avoid circular imports" -> This is incorrect, python is fine with circular imports for the most part."
The fact that you sensed some misconception of mine, doesn't make that particular statement incorrect. It is correct and it is good advice.
(Saying it's fine for the most part looks a bit like saying something will run fine most of time...)
I see what you mean. I avoid it so much that I even thought your first example would give an error right away (it doesn't). You mean there is no need to avoid it because most of the time (actually given certain conditions) Python will go fine with it. I am also certain that there are cases where circular imports would be the easiest solution. That doesn't mean we should use them if we have a choice. That would promote the use of a bad architecture, where every module starts depending on every other.
It also means the coder has to be aware of the caveats.
This link I found here in SO states some of the worries about circular imports.
The previous link is somewhat old so info can be outdated by newer Python versions, but import confusion is even older and still apllies to 3.6.2.
The example you give works well because relevant or initialization module code is wrapped in a function and will not run at import time. Protecting code with an if __name__ == "__main__": also removes it from running when imported.
Something simple like this (the same example from effbot.org) won't work (remember OP says he is a beginner):
# file y.py
import x
x.func1()
# file x.py
import y
def func1():
print('printing from x.func1')
On your second comment you say:
"This is also incorrect. An imported module will become part of the namespace"
Yes. But I didn't mention that, nor its contrary. I just said that an imported module code doesn't know the namespace of the code making the import.
To eliminate the ambiguity I just meant this:
# w.py
def funcw():
print(z_var)
# z.py
import w
z_var = 'foo'
w.funcw() # error: z_var undefined in w module namespace
Running z.py gives the stated error. That's all that I meant.
Now going further, to get the access we want, we go circular...
# w.py
import z # go circular
def funcw():
'''Notice that we gain access not to the z module that imported
us but to the z module we import (yes its the same thing but
carries a different namespace). So the reference we obtain
points to a different object, because it really is in a
different namespace.'''
print(z.z_var, id(z.z_var))
...and we protect some code from running with the import:
# z.py
import w
z_var = ['foo']
if __name__ == '__main__':
print(z_var, id(z_var))
w.funcw()
By running z.py we confirm the objects are different (they can be the same with immutables, but that is python kerning - internal optimization, or implementation details - at work):
['foo'] 139791984046856
['foo'] 139791984046536
Finally I agree with your third comment about being explicit with imports.
Anyway I thank your comments. I actually improved my understanding of the problem because of them (we don't learn much about something by just avoiding it).
I have a module that wraps another module to insert some shim logic in some functions. The wrapped module uses a settings module mod.settings which I want to expose, but I don't want the users to import it from there, in case I would like to shim something there as well in the future. I want them to import wrapmod.settings.
Importing the module and exporting it works, but is a bit verbose on the client side. It results in having to write settings.thing instead of just thing.
I want the users to be able to do from wrapmod.settings import * and get the same results as if they did from mod.settings import * but right now, only from wrapmod import settings is available. How to I work around this?
If I understand the situation correctly, you're writing a module wrapmod that is intended to transform parts of an existing package mod. The specific part you're transforming is the submodule mod.settings. You've imported the settings module and made your changes to it, but even though it is available as wrapmod.settings, you can't use that module name in an from ... import ... statement.
I think the best way to fix that is to insert the modified module into sys.modules under the new dotted name. This makes Python accept that name as valid even though wrapmod isn't really a package.
So wrapmod would look something like:
import sys
from mod import settings
# modify settings here
sys.modules['wrapmod.settings'] = settings # add this line!
I ended up making a code-generator for a thin wrapper module instead, since the sys.module hacking broke all IDE integration.
from ... import mod
# this is just a pass-through wrapper around mod.settings
__all__ = mod.__all__
# generate pass-through wrapper around mod.settings; doesn't break IDE integration, unlike manual sys.modules editing.
if __name__ == "__main__":
for thing in settings.__all__:
print(thing + " = mod." + thing)
which when run as a script, outputs code that can then be appended to the end of this file.
I'm teaching myself Python (I have experience in other languages).
I found a way to import a "module". In PHP, this would just be named an include file. But I guess Python names it a module. I'm looking for a simple, best-practices approach. I can get fancy later. But right now, I'm trying to keep it simple while not developing bad habits. Here is what I did:
I created a blank file named __init__.py, which I stored in Documents (the folder on the Mac)
I created a file named myModuleFile.py, which I stored in Documents
In myModuleFile.py, I created a function:
def myFunction()
print("hello world")
I created another file: myMainFile.py, which I stored in Documents
In this file, I typed the following:
import myModuleFile.py
myModuleFile.myFunction()
This successfully printed out "hello world" to the console when I ran it on the terminal.
Is this a best-practices way to do this for my simple current workflow?
I'm not sure the dot notation means I'm onto something good or something bad. It throws an error if I try to use myFunction() instead of myModuleFile.myFunction(). I kind of think it would be good. If there were a second imported module, it would know to call myFunction() from myModuleFile rather than the other one. So the dot notation makes everybody know exactly which file you are trying to call the function from.
I think there is some advanced stuff using sys or some sort of exotic configuration stuff. But I'm hoping my simple little way of doing things is ok for now.
Thanks for any clarification on this.
For your import you don't need the ".py" extension
You can use:
import myModuleFile
myModuleFile.myFunction()
Or
from myModuleFile import myFunction
myFunction()
Last syntax is common if you import several functions or globals of your module.
Besides to use the "main" function, I'd put this on your module:
from myModuleFile import myFunction
if __name__ == '__main__':
myFunction()
Otherwise the main code could be executed in imports or other cases.
I'd use just one module for myModuleFile.py and myMainFile.py, using the previous pattern let you know if your module is called from command line or as import.
Lastly, I'd change the name of your files to avoid the CamelCase, that is, I'd replace myModuleFile.py by my_module.py. Python loves the lowercase ;-)
You only need to have init.py if you are creating a package (a package in a simple sense is a subdirectory which has one or more modules in it, but I think it may be more complex than you need right now).
If you have just one folder which has MyModule.py and MyMainFile.py - you don't need the init.py.
In MyMainFile.py you can write :
import myModuleFile
and then use
myModuleFile.MyFunction()
The reason for including the module name is that you may reuse the same function name in more than one module and you need a way of saying which module your program is using.
Module Aliases
If you want to you can do this :
import myModuleFile as MyM
and then use
MyM.MyFunction()
Here you have created MyM as an alias for myModuleFile, and created less typing.
Here Lies Dragons
You will sometimes see one other forms of IMport, which can be dangerous, especially for the beginner.
from myModuleFile import MyFunction
if you do this you can use :
MyFunction()
but this has a problem if you have used the same function name in MyMainFile, or in any other library you have used, as you now can't get to any other definition of the name MyFunction. This is often termed Contaminating the namespace - and should really be avoided unless you are absolutely certain it is safe.
there is a final form which I will show for completeness :
from myModuleFile import *
While you will now be able to access every function defined in myModuleFile without using myModuleFile in front of it, you have also now prevented your MyMainFile from using any function in any library which matches any name defined in myModuleFile.
Using this form is generally not considered to be a good idea.
I hope this helps.