'import' and 'from import' behave differently - python

As per my understanding, the difference between 'import' and 'from import' in Python is: 'import' imports the whole library, 'from import' imports a specific member or members of the library, and there should not be any behavioral difference.
As per this, I was expecting both the test1.py and test2.py to show the same result i.e. "Lucky". But it is different for test2.py. Can anyone explain why?
mymodule.py
message = "Happy"
def set_message(msg):
    global message
    message = msg
test1.py
import mymodule
mymodule.set_message("Lucky")
print(mymodule.message) #output is Lucky
test2.py
from mymodule import *
set_message("Lucky")
print(message) #output is Happy

'import' imports the whole library, 'from import' imports a specific member or members of the library, and there should not be any behavioral difference.
Both versions - assuming the module has not already been imported - cause the top-level code to execute and a module object to be created.
import ... means that the module object is assigned to the corresponding name in the local namespace. (The reason the . syntax works is that the global variables from when the module's code was running become attributes of that module object.)
from ... import * means that Python iterates over the attributes of the module object, and assigns each name into the current namespace, similarly.
For subsequent code, the module object itself is the global namespace in which its functions run.
Your test2.py calls a function from mymodule, causing it to do a global lookup of message. That lookup finds the message attribute of the module object, and the call rebinds that attribute to a new object. The message global variable in your own code is unchanged, because that name was never reassigned and the string it refers to was never modified (the module attribute was rebound, not mutated). It is the same as if you do:
# Functions are the easiest way to get an object with mutable attributes
def namespace(): pass
namespace.a = 3
a = namespace.a # "import" the name
namespace.a = 4 # replace the value; `a` does not change
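To make the behaviour concrete, here is a self-contained sketch of the test2.py situation; the module is built in memory purely so the snippet runs on its own (in the question it is just mymodule.py on disk):

```python
import sys
import types

# Build the question's mymodule in memory so this sketch is self-contained
mymodule = types.ModuleType("mymodule")
exec(
    "message = 'Happy'\n"
    "def set_message(msg):\n"
    "    global message\n"
    "    message = msg\n",
    mymodule.__dict__,
)
sys.modules["mymodule"] = mymodule

from mymodule import message, set_message  # copies the *current* values

set_message("Lucky")       # rebinds mymodule.message to a new string
print(message)             # Happy -- the local name still points at the old object
print(mymodule.message)    # Lucky -- the live module attribute was rebound
```

Reading through the module object (mymodule.message) always sees the current attribute; the name copied by from ... import is a snapshot taken at import time.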


Module namespace initialisation before execution

I'm trying to dynamically update code during runtime by reloading modules using importlib.reload. However, I need a specific module variable to be set before the module's code is executed. I could easily set it as an attribute after reloading but each module would have already executed its code (e.g., defined its default arguments).
A simple example:
# module.py
def do():
    try:
        print(a)
    except NameError:
        print('failed')
# main.py
import module
module.do() # prints failed
module.a = 'succeeded'
module.do() # prints succeeded
The desired pseudocode:
import_module_without_executing_code module
module.initialise(a = 'succeeded')
module.do()
Is there a way to control module namespace initialisation (like with classes using metaclasses)?
It's not usually a good idea to use reload other than for interactive debugging. For example, it can easily create situations where two objects of type module.A are not the same type.
What you want is execfile. Pass a globals dictionary (you don't need an explicit locals dictionary) to keep each execution isolated; anything you store in it ahead of time acts exactly like the "pre-set" variables you want. If you do want to have a "real" module interface change, you can have a wrapper module that calls (or just holds as an attribute) the most recently loaded function from your changing file.
Of course, since you're using Python 3, you'll have to use one of the replacements for execfile.
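A sketch of one such replacement: exec a compiled file in a dictionary that already contains the pre-set variables. load_with_presets is an illustrative name, not a standard API, and the temporary module.py here stands in for the question's changing file:

```python
import os
import tempfile

# A stand-in for the changing module.py from the question, written to a
# temporary file only so the sketch runs on its own
path = os.path.join(tempfile.mkdtemp(), "module.py")
with open(path, "w") as f:
    f.write("def do():\n    print(a)\n")

def load_with_presets(path, **presets):
    """Execute a file in a fresh namespace; presets are visible while it runs."""
    namespace = dict(presets)
    with open(path) as f:
        exec(compile(f.read(), path, "exec"), namespace)
    return namespace

ns = load_with_presets(path, a="succeeded")
ns["do"]()  # prints: succeeded
```

Because the functions are defined by exec with the presets dictionary as their globals, they see the pre-set names without any post-load patching, and each call to load_with_presets gives a fresh, isolated namespace.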
Strictly speaking, I don't believe there is a way to do what you're describing in Python natively. However, assuming you own the module you're trying to import, a common approach with Python modules that need some initializing input is to use an init function.
If all you need is some internal variables to be set, like a in your example above, that's easy: just declare some module-global variables and set them in your init function:
Demo: https://repl.it/MyK0
Module:
## mymodule.py
a = None
def do():
    print(a)
def init(_a):
    global a
    a = _a
Main:
## main.py
import mymodule
mymodule.init(123)
mymodule.do()
mymodule.init('foo')
mymodule.do()
Output:
123
foo
Where things can get trickier is if you need to actually redefine some functions because some dynamic internal something is dependent on the input you give. Here's one solution, borrowed from https://stackoverflow.com/a/1676860. Basically, the idea is to grab a reference to the current module by using the magic variable __name__ to index into the system module dictionary, sys.modules, and then define or overwrite the functions that need it. We can define the functions locally as inner functions, then add them to the module:
Demo: https://repl.it/MyHT/2
Module:
## mymodule.py
import sys
def init(a):
    current_module = sys.modules[__name__]
    def _do():
        try:
            print(a)
        except NameError:
            print('failed')
    current_module.do = _do
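Used from a caller, the pattern behaves like this; the module body is built in memory here only so the sketch is self-contained (it corresponds to the mymodule.py above):

```python
import sys
import types

# The answer's mymodule, created in memory so this snippet runs on its own
body = """
import sys
def init(a):
    current_module = sys.modules[__name__]
    def _do():
        print(a)          # 'a' is captured in init's closure
    current_module.do = _do
"""
mymodule = types.ModuleType("mymodule")
sys.modules["mymodule"] = mymodule
exec(body, mymodule.__dict__)

mymodule.init("hello")
mymodule.do()   # prints: hello
mymodule.init("world")
mymodule.do()   # prints: world
```

Each call to init builds a fresh _do closure over the new argument and overwrites the module's do attribute, which is how the "dynamic internal something" gets rebuilt from the input.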

Python - Unexpected Import Occurring

I'm hoping someone can provide some insight on some extra name bindings that Python3 is creating during an import. Here's the test case:
I created a test package called spam (original, I know). It contains three files, with the following contents:
__init__.py:
from .foo import Foo
from .bar import Bar
foo.py:
def Foo():
    pass
bar.py:
def Bar():
    pass
Pretty simple stuff. When I import the spam package, I can see that it creates name bindings to the Foo() and Bar() functions in the spam namespace, which is expected. What isn't expected is that it also binds names for the foo and bar modules in the spam namespace.
What's even more interesting is that these extra name bindings to the modules don't occur if I import the Foo() and Bar() functions directly in __main__.
Reading through the documentation on the import statement (language ref and tutorial), I don't see anything that would cause this to be.
Can anyone shed some light on why, when importing a function from a module inside a package, it also binds a name to the module containing the function?
Yes - that is correct, and part of Python's import mechanism.
When you import a module a lot of things happen, but we can focus on a few:
1) Python checks whether the module is already loaded - that is, it checks whether its qualified name (the name with dots) is in sys.modules.
2) If not, it actually loads the module: that includes checking for pre-compiled cached bytecode files, otherwise parsing and compiling the .py file, etc.
3) It actually makes the name bindings as written in the import statement: that is, "from .foo import Foo" creates a variable "Foo" in the current namespace that points to the "spam.foo.Foo" object.
Note that the module is always loaded as a whole and registered in the sys.modules dictionary. Besides that, the import process binds every sub-module it loads as an attribute of its parent package - that is what causes the names "foo" and "bar" to be visible in your spam package.
You could, at the end of your __init__.py file, delete the names "foo" and "bar" - but that would break the expected import and usage of spam.foo in fundamental ways - basically: sys.modules["spam.foo"] will exist, but sys.modules["spam"].foo won't - meaning that after one tries to do:
import spam.foo
spam.foo.Foo()
Python will yield an AttributeError on "foo".
The import machinery will report the module as existing (it is in sys.modules), so it does nothing. But the "foo" attribute of "spam" has been removed, so it can't be reached.
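The broken state can be simulated without any files by registering in-memory modules in sys.modules (a sketch; spam and foo are the names from the question):

```python
import sys
import types

# Register both modules, as the import machinery would, but deliberately
# leave the 'foo' attribute off the package -- mimicking `del foo` in __init__.py
spam = types.ModuleType("spam")
spam.__path__ = []                    # mark spam as a package
foo_mod = types.ModuleType("spam.foo")
foo_mod.Foo = lambda: "hello"
sys.modules["spam"] = spam
sys.modules["spam.foo"] = foo_mod

import spam.foo                       # succeeds: sys.modules already has it
try:
    spam.foo.Foo()
except AttributeError as exc:
    print("unreachable:", exc)
```

The import statement itself is satisfied by the sys.modules entry, but the subsequent attribute access on the package object fails because the binding was never (re-)created.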

How do I detect if a class / variable was imported in Python 3?

This is the contents of script_one.py:
x = "Hello World"
This is the contents of script_two.py:
from script_one import x
print(x)
Now, if I ran script_two.py the output would be:
>>> Hello World
What I need is a way to detect if x was imported.
This is what I imagine the source code of script_one.py would look like:
x = "Hello World"
if x.has_been_imported:
    print("You've just imported \"x\"!")
Then if I ran script_two.py the output "should" be:
>>> Hello World
>>> You've just imported "x"!
What is this called, does this feature exist in Python 3 and how do you use it?
You can't. Effort expended on trying to detect this is a waste of time, I'm afraid.
Python imports consist of the following steps:
Check if the module is already loaded by looking at sys.modules.
If the module hasn't been loaded yet, load it. This creates a new module object that is added to sys.modules, containing all objects resulting from executing the top-level code.
Bind names in the importing namespace. How names are bound depends on the exact import variant chosen.
import module binds the name module to the sys.modules[module] object
import module as othername binds the name othername to the sys.modules[module] object
from module import attribute binds the name attribute to the sys.modules[module].attribute object
from module import attribute as othername binds the name othername to the sys.modules[module].attribute object
In this context it is important to realise that Python names are just references; all Python objects (including modules) live on a heap and stand or fall with the number of references to them. See this great article by Ned Batchelder on Python names if you need a primer on how this works.
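The "names are just references" point can be checked directly; modules are ordinary objects like any other:

```python
import sys

a = "hello"
b = a                          # a second name for the same object, not a copy
assert b is a

# 'sys' itself is just a name referencing the single module object
# stored in sys.modules
assert sys is sys.modules["sys"]
```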
Your question then can be interpreted in two ways:
You want to know the module has been imported. The moment code in the module is executed (like x = "Hello World"), it has been imported. All of it. Python doesn't load just x here, it's all or nothing.
You want to know if other code is using a specific name. You'd have to track what other references exist to the object. This is a mammoth task involving recursively checking the gc.get_referrers() object chain to see what other Python objects might now refer to x.
The latter goal is made all the harder in any of the following scenarios:
import script_one, then use script_one.x; references like these could be too short-lived for you to detect.
from script_one import x, then del x. Unless something else still references the same string object within the imported namespace, that reference is now gone and can't be detected anymore.
import sys; sys.modules['script_one'].x is a legitimate way of referencing the same string object, but does this count as an import?
import script_one, then list(vars(script_one).values()) would create a list of all objects defined in the module, but these references are indices in a list, not named. Does this count as an import?
This used to look impossible. But since Python 3.7 introduced __getattr__ at module level, it looks possible now. At least we can distinguish whether a variable was imported by from module import variable or accessed by import module; module.variable.
The idea is to detect the AST node in the previous frame, whether it is an Attribute:
script_one.py
def _variables():
    # keep the variables behind a function call,
    # so that access doesn't bypass __getattr__
    return {'x': 'Hello world!'}

def __getattr__(name):
    try:
        out = _variables()[name]
    except KeyError as kerr:
        raise ImportError(kerr)
    import ast, sys
    from executing import Source
    frame = sys._getframe(1)
    node = Source.executing(frame).node
    if node is None:
        print('`x` is imported')
    else:
        print('`x` is accessed via `script_one.x`')
    return out
script_two.py
from script_one import x
print(x)
# `x` is imported
# 'Hello world!'
import script_one
print(script_one.x)
# `x` is accessed via `script_one.x`
# 'Hello world!'

Python error importing a child module

parent/__init__.py:
favorite_numbers = [1]
def my_favorite_numbers():
    for num in favorite_numbers:
        num
my_favorite_numbers()
from .child import *
my_favorite_numbers()
parent/child.py:
print favorite_numbers
favorite_numbers.append(7)
I then created a file one directory up from parent directory named tst.py:
import parent
So the directory structure looks like this:
parent (directory)
__init__.py (file)
child.py (file)
tst.py (file)
And I get this error upon execution:
NameError: name 'favorite_numbers' is not defined
How can I add a value to favorite_numbers within child.py so that when I execute the my_favorite_numbers() function, I get 1 and 7.
In Python, each module has its own separate globals. That's actually the whole point of modules (as opposed to, say, C preprocessor-style text inserts).
When you do from .child import *, that imports .child, then copies all of its globals into the current module's globals. They're still separate modules, with their own globals.
If you want to pass values between code in different modules, you probably want to wrap that code up in functions, then pass the values as function arguments and return values. For example:
parent/__init__.py:
from .child import *
favorite_numbers = [1]
def my_favorite_numbers():
    for num in favorite_numbers:
        num
my_favorite_numbers()
child_stuff(favorite_numbers)
my_favorite_numbers()
parent/child.py:
def child_stuff(favorite_numbers):
    print favorite_numbers
    favorite_numbers.append(7)
In fact, you almost always want to wrap up any code besides initialization (defining functions and classes, creating constants and other singletons, etc.) in a function anyway. When you import a module (including from … import), that only runs its top-level code the first time. If you import again, the module object already exists in memory (inside sys.modules), so Python will just use that, instead of running the code to build it again.
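That caching is easy to observe (json is used here only as a convenient stdlib module):

```python
import sys

import json                  # first import: runs json's top-level code
import json as json_again    # later imports: just fetched from sys.modules
assert json_again is json is sys.modules["json"]
```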
If you really want to push a value into another module's namespace, you can, but you have to do it explicitly. And this means you have to have the module object available by importing it, not just importing from it:
from . import child
child.favorite_numbers = favorite_numbers
But this is rarely a good idea.
Did you ever run setup.py or a way of "building" your library?
I would create a setup.py file and likely run it in develop mode. Python setup.py develop vs install

importing with * (asterisk) versus as a namespace in python

I know that it's bad form to use import * in Python, and I don't plan to make a habit of it. However I recently came across some curious behaviour that I don't understand, and wondered if someone could explain it to me.
Lets say I have three python scripts. The first, first_script.py, comprises:
MESSAGE = 'this is from the first script'
def print_message():
    print MESSAGE
if __name__ == '__main__':
    print_message()
Obviously running this script gives me the contents of MESSAGE. I have a second script called second_script.py, comprising:
import first_script
first_script.MESSAGE = 'this is from the second script'
if __name__ == '__main__':
    first_script.print_message()
The behaviour (prints this is from the second script) makes sense to me. I've imported first_script.py, but overwritten a variable within its namespace, so when I call print_message() I get the new contents of that variable.
However, I also have third_script.py, comprising:
from first_script import *
MESSAGE = 'this is from the third script'
if __name__ == '__main__':
    print MESSAGE
    print_message()
The first line this produces is understandable, but the second doesn't make sense to me. My intuition was that because I've imported into my main namespace via * in the first line, I have a global variable
called MESSAGE. Then in the second line I overwrite MESSAGE. Why then does the function (imported from the first script) produce the OLD output, especially given the output of second_script.py? Any ideas?
import module, from module import smth and from module import * can have different use cases.
The simpler:
import tools
loads the tools module and adds a reference to it in the local namespace (also named tools). After that you can access any of the tools references by prefixing them with tools, for example tools.var1.
Variant:
import tools as sloot
Does exactly the same, but you use the alias to access the references from the module (eg: sloot.var1). It is mainly used for modules having well-known aliases, like import numpy as np.
The other way
from tools import foo
directly imports some symbols from the tools module into the current namespace. That means you can use only the specified symbols, and they do not need to be qualified. A nice use case is when you can import a symbol from different modules providing the same functionality. For example
try:
    from mod1 import foo
except ImportError:
    from mod2 import foo
...
foo() # actually calls foo from mod1 if available else foo from mod2
This is commonly used as a portability trick.
The danger:
from tools import *
It is a common idiom, but may not do what you expect if the module does not document it. In fact, it imports all the public symbols from the module - by default, all the symbols whose names have no initial _ - which can include unwanted things. In addition, a module can declare a special variable __all__ which is assumed to declare the public interface, and in that case only the symbols contained in __all__ will be imported.
Example:
mod.py
__all__ = ['foo', 'bar']
def baz(x):
    return x * 2
def foo():
    return baz('FOO')
def bar():
    return baz('BAR')
You can use (assuming mod.py is accessible)
from mod import *
print(foo()) # should print FOOFOO
# ERROR HERE
x = baz("test") # will choke with NameError: baz is not defined
while
import mod
print(mod.baz("test")) # will display as expected testtest
So you should only use from tools import * if the documentation of the tools module declares it to be safe and lists the actually imported symbols.
This has to do with Scope. For a very excellent description of this, please see Short Description of the Scoping Rules?
For a detailed breakdown with tons of examples, see http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/scope_resolution_legb_rule.ipynb
Here's the details on your specific case:
The print_message function being called from your third test file is being asked to print out some MESSAGE object. This function will use the standard LEGB resolution order to identify which object this refers to. LEGB refers to Local, Enclosing function locals, Global, Builtins.
Local - Here, there is no MESSAGE defined within the print_message function.
Enclosing function locals - There are no functions wrapping this function, so this is skipped.
Global - Any explicitly declared variables in the outer code. It finds MESSAGE defined in the global scope of the first_script module. Resolution then stops, but I'll include the others for completeness.
Built-ins - The list of python built-ins, found here.
So, you can see that resolution of the variable MESSAGE will cease immediately in Global, since there was something defined there.
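The same LEGB walk can be reproduced in a few lines (a sketch with illustrative names, not the question's actual files):

```python
MESSAGE = 'global'

def outer():
    MESSAGE = 'enclosing'
    def print_message():
        # L: no local MESSAGE here; E: found in outer -> resolution stops
        print(MESSAGE)
    print_message()

def lone():
    # L: none; E: no enclosing function; G: found in module globals
    print(MESSAGE)

outer()   # prints: enclosing
lone()    # prints: global
```

The inner print_message never sees a later rebinding of the global MESSAGE, just as the imported function in third_script.py never sees the third script's MESSAGE.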
Another resource that was pointed out to me for this is Lexical scope vs Dynamic scope, which may help you understand scope better.
HTH
Direct assignment changes the reference of an object, but modification does not. For example,
a = []
print(id(a))
a = [0]
print(id(a))
prints two different IDs, but
a = []
print(id(a))
a.append(0)
print(id(a))
prints the same ID.
In second_script.py, the assignment merely modifies the first_script module object (rebinding one of its attributes), which is why both first_script.py and second_script.py see the same attribute MESSAGE of first_script. In third_script.py, the direct assignment rebinds the name MESSAGE in the third script's own namespace; therefore, after the assignment, the variable MESSAGE in third_script.py is a different variable from MESSAGE in first_script.py.
Consider the related example:
first_script.py
MESSAGE = ['this is from the first script']
def print_message():
    print(MESSAGE)
if __name__ == '__main__':
    print_message()
third_script.py
from first_script import *
MESSAGE.append('this is from the third script')
if __name__ == '__main__':
    print(MESSAGE)
    print_message()
In this case, third_script.py prints two identical messages, demonstrating that objects bound by import * can still be mutated in place (as opposed to rebound).
