Import as statement working differently for different modules? - python

I am learning Python, and right now I am learning about the import statements in Python. I was testing out some code, and I came across something unusual. Here is the code I was testing.
from math import pow as power
import random as x
print(pow(2, 3))
print(power(2, 3))
print(x.randint(0, 5))
print(random.randint(0, 5))
I learned that in Python, you can reassign the names of modules using as, so I reassigned pow to power. I expected both pow(2, 3) and power(2, 3) to output the exact same stuff because all I did was change the name. However, pow(2, 3) outputs 8, which is an integer, while power(2, 3) outputs 8.0, which is a float. Why is that?
Furthermore, I imported the random module as well, and set its name to be x. In the case of the pow and power, both the old name, pow, and the new name, power, worked. But with this random module, only the new name, x, works, and the old name, random, doesn't work. print(x.randint(0, 5)) works, but random.randint(0, 5) doesn't work. Why is this so?
Can anyone please explain to a Python newbie such as myself why my code is not working the way I expect it to? I am using Python version 3.6.2, if that helps.

That's because when you import pow from math as power and then call pow, the pow you are calling is the built-in function, not the one from the math module.
For random there is no built-in function of that name in Python, so only the name you imported it under, x, is bound.
See the documentation for the built-in pow function.
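A quick session makes this visible. This is a minimal sketch using only the standard library: the built-in pow returns an int for integer arguments, while math.pow always converts its arguments to float.

```python
import builtins
from math import pow as power

# The bare name `pow` still refers to the built-in, which returns an int
# for integer arguments; math.pow always converts its arguments to float.
print(pow(2, 3))            # 8
print(power(2, 3))          # 8.0
print(pow is builtins.pow)  # True: importing math.pow as `power` left `pow` alone
```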

When you use pow, you are actually using the built-in pow function.
But there is no built-in function called random, thus that name does not work.
Normally in Python, if you use as, you can only refer to the module by the name you imported it as, not by what it was originally called.
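You can check this yourself by inspecting the builtins module: pow is a built-in name that exists without any import, while random is only a module name, never a built-in function.

```python
import builtins

# `pow` exists as a built-in even with no imports at all;
# `random` is only a module name, not a built-in function.
print('pow' in dir(builtins))     # True
print('random' in dir(builtins))  # False
```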

Related

What does random.seed([list of int]) do?

I know what random.seed(int) does, like below:
random.seed(10)
But I saw a code which uses random.seed([list of int]), like below:
random.seed([1, 2, 1000])
What is the difference between passing a list and an int to random.seed?
The answer is basically in the comments, but putting it together: it appears the code you found imports random from numpy, instead of importing the standard Python random module:
from numpy import random
random.seed([1, 2, 1000])
Not recommended, to avoid exactly the confusion you're running into.
numpy can use a 1-d array of integers as a seed, presumably because it uses a different pseudo-random generator than Python itself (one that can accept a more complex seed), as described in the documentation for numpy.random.RandomState.
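You can confirm which random you have by what it accepts. The standard-library random.seed only takes scalar seeds, so passing a list raises a TypeError; the numpy call is shown in a comment only, since it assumes numpy is installed.

```python
import random

# The standard-library random.seed accepts only scalar seeds
# (None, int, float, str, bytes, bytearray); a list raises TypeError.
try:
    random.seed([1, 2, 1000])
    raised = False
except TypeError:
    raised = True
print(raised)  # True

# numpy.random.seed, by contrast, accepts a 1-d sequence of integers:
#   from numpy import random
#   random.seed([1, 2, 1000])  # works, seeding numpy's RandomState
```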

In python, what's the difference between math.sin(x) and sin(x)?

Which one is better:
import math
math.sin(x)
or
from math import *
sin(x)
And what's the difference?
Try to avoid using a wildcard import (from module import *), since you don't know what exactly you have imported into your current namespace. This can lead to confusion if the names of such items conflict.
You should use from math import sin since it would make it very clear that you only require sin.
Choosing between import math (then calling math.sin) and from math import sin is more of a personal choice, unless there is another variable or function with the same name in the namespace. If that is the case, then using math.sin is the better way.
It's mostly about reading it nicer, so that you know where sin came from.
However there is the danger of importing modules with clashing names. If another module has something named 'sin' that you also import with the * wildcard, then you actually will only have one of them.
Using math.sin is extra explicit to avoid those cases. And when you write larger programs where you've imported whole modules indiscriminately, it's easier to miss that you have clashing names.
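Here is a sketch of how such a clash can bite. The degree-based sin helper is hypothetical, invented purely to illustrate the shadowing; after the wildcard import the same call silently means something different.

```python
import math

# A hypothetical local helper: sin() that works in degrees.
def sin(degrees):
    return math.sin(math.radians(degrees))

print(sin(90))  # 1.0 -- our degree-based version

from math import *  # silently rebinds `sin` to the radians-based math.sin
print(sin(90))      # 0.8939... -- same call, different meaning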

When is 'from SomeFile import *' okay

This is sort of a best practice question. If I have a large number of custom classes, but don't want them in my main program, is it acceptable to stick `from someFile import *` at the top? I know that I won't be doing anything to edit/redefine the classes, functions and variables from that file.
from someFile import *
a = someFunction()
#Other stuff
someFile would just contain various custom classes and functions that I know work and that I don't need to scroll past every time I'm working in the program. As long as I'm careful, is there any reason not to do this?
If you are using a lot of classes, it is usually safer to avoid this syntax, especially with third-party code, because different modules may define functions with the same names (e.g. sin, cos, etc.) and you can get strange behavior.
In my opinion, it is acceptable to use this syntax when you provide an example of how to use your code; importing that way keeps the example focused on showing the functionality clearly.
Personally I try to avoid this syntax. I prefer to explicitly call the "right" class. If you don't like to write long class/modules names, try just to load them as aliases like
import LongModuleName as LM
http://www.python.org/dev/peps/pep-0008/#imports
According to pep-8:
There is one defensible use case for a wildcard import, which is to republish an internal interface as part of a public API (for example, overwriting a pure Python implementation of an interface with the definitions from an optional accelerator module and exactly which definitions will be overwritten isn't known in advance).
In my opinion, probably the most useful case of the from someFile import * syntax is when you are using the Python interpreter interactively. Imagine you want to do some quick math:
$ python
>>> from math import *
>>> sin(4.0)
This is especially useful when using pylab and ipython to turn your ipython session into a MATLAB clone with just the line from pylab import *.
One problem with from module import * is that it makes it difficult to see where different functions came from, compare:
import random
random.shuffle(...)
from random import shuffle
shuffle(...)
from random import *
shuffle(...)
Another is risk of name collision; if two modules have classes or functions with the same name, import * will shadow one with the other. With from x import y you can see right at the top that you might need from z import y as y2.
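To sketch that last point: aliased imports make a collision visible and resolvable right at the top of the file. json and pickle are just an illustrative pair of modules that both export a dumps function.

```python
# Two modules both define `dumps`; explicit aliases keep both usable
# and make the potential collision obvious at the top of the file.
from json import dumps as json_dumps
from pickle import dumps as pickle_dumps

print(json_dumps({"a": 1}))          # '{"a": 1}'
print(type(pickle_dumps({"a": 1})))  # <class 'bytes'> -- clearly a different dumps
```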
Using from someFile import * is fine as long as you're aware you're importing everything into the namespace of that module. This could lead to clashes if you happen to be reusing names.
Consider only importing the classes and functions that you really use in that module as it's more explicit for others who may be reading the source code. It also makes it easier to figure out where a function is defined: if I see a function and can't locate an explicit import line in the same file then I'm forced to grep / search more broadly to see where the function is defined.

Why is "import" implemented this way?

>>> import math
>>> math.pi
3.141592653589793
>>> math.pi = 3
>>> math.pi
3
>>> import math
>>> math.pi
3
Initial question: Why can't I get math.pi back?
I thought import would import all the defined variables and functions to the current scope. And if a variable name already exists in current scope, then it would replace it.
Yes, it does replace it:
>>> pi = 3
>>> from math import *
>>> pi
3.141592653589793
Then I thought maybe the math.pi = 3 assignment actually changed the property in the math class (or is it the math module?), which the import math imported.
I was right:
>>> import math
>>> math.pi
3.141592653589793
>>> math.pi = 3
>>> from math import *
>>> pi
3
So, it seems that:
If you do import x, then it imports x as a class-like thing. And if you make changes to x.property, the change would persist in the module so that every time you import it again, it's a modified version.
Real question:
Why is import implemented this way? Why not let every import math import a fresh, unmodified copy of math? Why leave the imported math open to change?
Is there any workaround to get math.pi back after doing math.pi = 3 (except math.pi = 3.141592653589793, of course)?
Originally I thought import math is preferred over from math import *. But this behaviour leaves me worrying someone else might be modifying my imported module if I do it this way...How should I do the import?
Python only creates one copy of any given module. Importing a module repeatedly reuses the original. This is because if modules A and B imported C and D, which imported E and F, etc., C and D would get loaded twice, and E and F would get loaded 4 times, etc. With any but the most trivial of dependency graphs, you'd spend a few minutes loading redundant modules before running out of memory. Also, if A imported B and B imported A, you'd get stuck in a recursive loop and, again, run out of memory without doing anything useful.
The solution: Don't screw with the contents of other modules. If you do, that's an interpreter-wide change. There are occasionally situations where you'd want to do this, so Python lets you, but it's usually a bad idea.
A module may be imported many times. An import statement just loads the reference from sys.modules. If the import statement also reloaded the module from disk, it would be quite slow. Modifying a module like this is very unusual and is only done under rare, documented circumstances, so there’s no need to worry.
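The sys.modules caching can be observed directly. This sketch mutates math.pi and then restores it, showing that a second import hands back the same cached object:

```python
import sys
import math

# After the first import, `math` lives in sys.modules; later imports
# just fetch that same cached object, so modifications persist.
original_pi = math.pi
math.pi = 3
import math as math_again
print(math_again is sys.modules['math'])  # True: one shared module object
print(math_again.pi)                      # 3

math.pi = original_pi  # undo the change for the rest of the session
```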
How to reload a module (the old imp module is deprecated; use importlib instead):
>>> import importlib
>>> importlib.reload(math)
<module 'math' (built-in)>
>>> math.pi
3.141592653589793
The import behavior is intended to allow modules to have state. For example, a module that runs initialization code may behave quite differently based on what happens at init time (a good example is the os module, which transparently loads a different version of the path submodule depending on which OS you're on). The usual behavior exists to allow lots of different code to access the module without re-running the initialization over and over. Moreover, modules function sort of like static classes in other languages: they can maintain state and are often used as an alternative to global variables. For example, you might use the locale module to set locale-specific settings (currency format, etc.); calling locale.setlocale in one part of your code and locale.getlocale in another is a nice alternative to making a global variable.
Your example, of course, points out the weakness. One of the classic Python principles is
We're all adults here
The language does not provide much of the privacy management features you'd find in, say, Java or C# which let the author lock down the contents of a module or class. You can, if you're feeling malicious (or just suicidal) do exactly the sort of thing done in your example: change pi to equal 3, or turn a function into a variable, or all sorts of other nasty stuff. The language is not designed to make that hard -- it's up to coders to be responsible.
Josh Lee's answer shows how to use reload, which is the correct way of refreshing a module to its on-disk state. The wisdom of using reload depends mostly on how much init code is in the module, and also on the web of other modules that import, or are imported by, the module in question.

Why does built-in sum behave wrongly after "from numpy import *"?

I have some code like:
import math, csv, sys, re, time, datetime, pickle, os, gzip
from numpy import *
x = [1, 2, 3, ... ]
y = sum(x)
The sum of the actual values in x is 2165496761, which is larger than the limit of 32bit integer. The reported y value is -2129470535, implying integer overflow.
Why did this happen? I thought the built-in sum was supposed to use Python's arbitrary-size integers?
See How to restore a builtin that I overwrote by accident? if you've accidentally done something like this at the REPL (interpreter prompt).
Doing from numpy import * causes the built-in sum function to be replaced with numpy.sum:
>>> sum(xrange(10**7))
49999995000000L
>>> from numpy import sum
>>> sum(xrange(10**7)) # assuming a 32-bit platform
-2014260032
To verify that numpy.sum is in use, try to check the type of the result:
>>> sum([721832253, 721832254, 721832254])
-2129470535
>>> type(sum([721832253, 721832254, 721832254]))
<type 'numpy.int32'>
To avoid this problem, don't use star import.
If you must use numpy.sum and want an arbitrary-sized integer result, specify a dtype for the result like so:
>>> sum([721832253, 721832254, 721832254],dtype=object)
2165496761L
or refer to the builtin sum explicitly (possibly giving it a more convenient binding):
>>> __builtins__.sum([721832253, 721832254, 721832254])
2165496761L
The reason you get this invalid value is that the sum is being accumulated as np.int32, which overflows. Nothing prevents you from using a wider dtype such as np.int64 to represent your data. You could, for example, just use
np.sum(x, dtype=np.int64)
On a side note, please make sure that you never use from numpy import *. It's a terrible practice and a habit you should get rid of as soon as possible. When you use from ... import *, you might be overwriting some Python built-ins, which makes it very difficult to debug. A typical example is overwriting functions like sum or max.
Python handles large numbers with arbitrary precision:
>>> sum([721832253, 721832254, 721832254])
2165496761
Just sum them up!
To make sure you don't use numpy.sum, try __builtins__.sum() instead.
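The recovery trick can be sketched without numpy at all. Here a deliberately defined stand-in function plays the role of numpy.sum shadowing the built-in; the genuine built-in stays reachable through the builtins module.

```python
import builtins

# Shadow `sum` deliberately (standing in for what `from numpy import *`
# does) to show the real built-in stays reachable via `builtins`.
def sum(iterable):
    return "shadowed!"

print(sum([1, 2, 3]))           # shadowed!
print(builtins.sum([1, 2, 3]))  # 6 -- the genuine built-in
```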
