Why does mock patching work with random but not with np? - python

I have a module where a number of different functions use random numbers or random choices.
I am trying to use mock and patch to inject pre-chosen values in place of these random selections but can't understand an error I am receiving.
In the function I am testing, I use np.random.randint. When I use the code

from unittest import mock
import random

mocked_random_int = lambda: 7

with mock.patch('np.random.randint', mocked_random_int):
    ...
I get the error message No module named np. However, numpy is imported as np, and other functions are calling it just fine.
Even more perplexingly, if I edit the code above to remove the 'np.' at the front, it does what I want:

with mock.patch('random.randint', mocked_random_int):
    ...
But I want to understand why the code works without the np. Thank you!

There is a difference between a module or package name and the variable it is assigned to in any given namespace. A simple import

import numpy

tells Python to check its imported-module list, import numpy if necessary, and assign the module to the variable "numpy".

import numpy as np

is almost the same, except that the module is assigned to the variable "np". It's still the same numpy package; you've just aliased it differently.
mock.patch will import and patch the module regardless of whether you've already imported it, but you must give it the real module name, 'numpy.random.randint', not your current module's alias for it, 'np.random.randint'.
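Concretely, patching by the real module path also takes effect through your np alias, because np and numpy refer to the same module object. A minimal sketch (the lambda stands in for whatever canned value you want; it accepts and ignores randint's arguments):

from unittest import mock
import numpy as np

mocked_random_int = lambda *args, **kwargs: 7

with mock.patch('numpy.random.randint', mocked_random_int):
    print(np.random.randint(0, 10))  # 7 -- the patched function is used
print(np.random.randint(0, 10))      # back to a real random draw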

Related

Why does import package.module.function fail in python?

I've tried googling (to little avail) to more clearly understand the different meanings the period . has during an import statement vs. once a module has already been imported.
For example, these all work:
import numpy
X = numpy.random.standard_normal
from numpy.random import standard_normal
import numpy.random
but this doesn't work:
import numpy.random.standard_normal
I'm a bit confused as to why this is. Why is there a difference in what the period . does when accessing a module before vs. after an import?
It's because standard_normal is a method, not a module:
<built-in method standard_normal of numpy.random.mtrand.RandomState object at 0x0000029D722FBD40>
When you do from numpy.random import standard_normal, you are importing that method as an attribute of the numpy.random module.
But import numpy.random.standard_normal cannot work: every dotted name in an import statement must refer to a module (or package), and there is no standard_normal module or standard_normal.py file inside numpy.random. You might expect the statement to import the method, but what it really does is try to import a module of that name, and no such module exists.
You can confirm this in an interpreter: typing standard_normal echoes <built-in method standard_normal of numpy.random.mtrand.RandomState object at ...>, and dir(standard_normal) lists the attributes of a method, not of a module.
Accessing it with the . operator after an import is a different story: once you import numpy.random, the expression numpy.random.standard_normal makes sense, because the numpy.random module has standard_normal as one of its attributes.
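To summarize with a runnable sketch (the exact exception differs by version: Python 3 raises ModuleNotFoundError, Python 2 a plain ImportError):

import numpy.random                        # works: numpy.random is a module
from numpy.random import standard_normal   # works: imports an attribute of that module

import numpy.random.standard_normal        # fails: standard_normal is not a module
# ModuleNotFoundError: No module named 'numpy.random.standard_normal'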

Correct way to create shareable module (package imports)

I want to create a module that I will share with others however I am quite new to this and am having issues with the final step of tidying it up for other's use. Imagine it is called something like my_module.py and looks like this:
import pandas as pd

def function_1(a, b):
    return a*b

def function_2(c, d):
    return pd.DataFrame(data=c, columns=d)
I want this to be able to be imported by someone else, so that they can use the underlying functions like:
my_module.function_1(a=5,b=2)
and so on. However, if I do import my_module then my_module.pd also appears in the autocomplete (as in the pandas import that my_module.py made).
This seems like terrible practice to me. So, what is the correct way to load these imports?
Ideally, this would be shareable so that someone could install it the way someone would install a stats module. I'm fine if the solution is just some kind of thing that checks to make sure things are imported in certain ways.
There's nothing inherently wrong with what you are doing. Your module requires pandas, so you must import it, and PEP 8 specifies that imports should go at the top, not nested within the functions. Doing this adds pandas as an attribute of my_module when someone imports it. Also, because you are building upon pandas, you cannot just share your module by itself: you also need to share pandas (or check that your users already have pandas installed, at a correct or sufficient version).
Still, it might be overkill to import the entire pandas library when you have a single function that only uses the DataFrame class. In that case you can do:
from pandas import DataFrame

def function_1(a, b):
    return a*b

def function_2(c, d):
    return DataFrame(data=c, columns=d)
Now my_module will only have the .DataFrame class attached to it, not the entire pandas library. If you do wind up using more and more of the pandas library in your module, then importing separate parts is more of a nuisance, so just import pandas.
And to use pandas itself as an example: it's built upon numpy. Underlying every DataFrame is a numpy.ndarray, so you may not have noticed it, but numpy is there (the pd.np shortcut shown below has since been deprecated and removed in newer pandas versions, but the point stands):
import pandas as pd
pd.np?
Type: module
String form: <module 'numpy' from 'c:\\program files\\python36\\lib\\site-packages\\numpy\\__init__.py'>
File: c:\program files\python36\lib\site-packages\numpy\__init__.py
Docstring:
NumPy
=====
You can make the pandas attribute much more difficult to reach, but you need to reorganize how you distribute your library. Say you want to share a library called MyLibrary, composed of several modules that we will put in a modules folder. They can each have their own functions, with names that should not overlap, and we import those functions in a separate Python script (api.py). Then you would use:
MyLibrary/
    __init__.py
    modules/
        MyModule1.py
        api.py
where we have the files:

__init__.py

from MyLibrary.modules.api import *

api.py

from MyLibrary.modules.MyModule1 import function_1, function_2

MyModule1.py

import pandas as pd

def function_1(a, b):
    return a*b

def function_2(c, d):
    return pd.DataFrame(data=c, columns=d)
Now we have access to the functions, but pd is no longer there:

import MyLibrary

MyLibrary.function_2([1], ['a'])
#    a
# 0  1

MyLibrary.pd
# AttributeError: module 'MyLibrary' has no attribute 'pd'
To be fair, pd is there, it's just hidden away much further down in MyLibrary.modules.MyModule1.pd. But then again, pandas has numpy everywhere. It's in pd.core.reshape.concat.np, pd.core.reshape.merge.np, pd.core.common.np and really almost every file, you cannot avoid it.
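A lighter-weight alternative, if reorganizing the package is overkill: alias the import with a leading underscore inside my_module.py. The attribute is still reachable, but by convention underscore-prefixed names are private, and most autocomplete tools hide them. A sketch:

import pandas as _pd  # underscore prefix: private by convention, hidden by most autocompleters

def function_1(a, b):
    return a*b

def function_2(c, d):
    return _pd.DataFrame(data=c, columns=d)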

What's the correct way for importing the whole module as well as a couple of its functions in Python?

I know that from module import * will import all the functions in current namespace but it is a bad practice. I want to use two functions directly and use module.function when I have to use any other function from the module. What I am doing currently is:
import module
from module import func1, func2
# DO REST OF MY STUFF
Is it a good practice? Does the order of first two statements matter?
Is there a better way using which I can use these two functions directly and use rest of the functions as usual with the module's name prepended to them?
Using just import module results in very long statements with a lot of repetition if I use the same function from the given module five times in a single statement. That's what I want to avoid.
The order doesn't matter, but doing both is not Pythonic. When you import the module, there is no need to import some of its functions separately as well. If you are not sure how many of its functions you will end up needing, just import the module and access the functions on demand with a simple attribute reference:
# The only import you need
import module
# Use module.funcX when you need any of its functions
That said, if you use some functions (much) more than the others, the cost of repeated attribute access is greater than the cost of importing those functions separately, so you would do well to import them directly as you've done.
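A related trick for hot paths is to bind the attribute to a local name once, which avoids the repeated lookups without another import (a sketch using math purely for illustration):

import math

def total_roots(values):
    sqrt = math.sqrt  # one attribute lookup, reused for every element
    return sum(sqrt(v) for v in values)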
And still, the order doesn't matter. You can do:
import module
from module import func1, func2
For more info read the documentation https://www.python.org/dev/peps/pep-0008/#imports
It is not good (though this may be opinion-based) to do:
import module
from module import func1, func2 # `func1` and `func2` are already part of module
Because you already hold a reference to module.
If I were you, I would import it in the form import module. Since your issue is that module.func1() becomes too long, I would import the module with as to create an alias for the name. For example:
import module as mo
# ^ for illustration purposes only: the name of your actual
#   module won't be `module`, and the alias should be
#   self-explanatory. For example:
import database_manager as db_manager
Now I may access the functions as:
mo.func1()
mo.func2()
Edit: based on the edit in the actual question.
If you are calling the same function several times in the same line, there is a chance you are already doing something wrong, and it would help if you shared what that function does. For example, do you want the return values of those calls to be passed as arguments to another function, as in:
test_func(mo.func1(x), mo.func1(y), mo.func1(z))
This could be done as:
params_list = [x, y, z]
results = [mo.func1(param) for param in params_list]
test_func(*results)

When importing my class I lose access to functions from other modules

I'm trying to learn how to do object oriented coding for scientific computing running a simulation; I'm using using numpy, etc. I've created my first class, WC_unit, which is located at ./classes/WC_class.py (a subdirectory). I've created an __init__.py file (which is empty) in the classes directory.
The methods for the WC_unit class require some numpy functions, like exp
When I run the code (in ipython) from the terminal, using
%run WC_class.py
I can generate an instance of the class E1 = WC_unit() and I can run the associated methods on it, ie E1.update()
I can't really tell if it's working. I wrote some outer code in a script test.py located at . (above ./classes) to test the objects I'm generating and I'm trying to import the class by using
from classes.WC_class import WC_unit
Now, when I create an instance E1 of the class and run E1.update(), I get the error message global name 'exp' is not defined.
I've tried calling from numpy import * or also import numpy as np and changing the function call to np.exp() and I continue to get the error. Thinking that I had some sort of scoping problem or issues with namespace I've put this same import function at various locations, including in the test.py file, the top of the class file WC_class.py, even in the method:
class WC_unit:
    def __init__(self):
        [assign default pars from a dict including r, dt, tau, and Iapp]...
    def update(self):
        from numpy import *
        self.r += self.dt/self.tau * (-self.r + exp(self.Iapp))
I would really like to up my game and figure out how to write my own classes and use them with the awesome computing tools. I guess I'd like to know:
What am I doing wrong (probably a lot, I suspect). I think it's something with how I'm importing my class? but perhaps also scoping in the class itself.
Why does my class lose access to the numpy functions when I import it, but not when I run it like a script in the terminal?
I guess I also generally don't understand why people are so protective of their namespaces, i.e. why do so many code examples show import numpy as np and use all of the functions as np.exp(x), etc. I don't have much of a computer science background so I could benefit a lot from any explanations you could provide- the documentation is kind of cryptic to me.
Python version: 2.7.8 |Anaconda 2.1.0 (x86_64)| (default, Aug 21 2014, 15:21:46)
[GCC 4.2.1 (Apple Inc. build 5577)]
On Mac OSX 10.6.8
When you call %run WC_class.py in IPython, what you are doing is loading the contents of that source file directly into the interactive namespace. Because you've already called from numpy import * within your IPython session, exp is defined as numpy.exp within the set of globals for the current 'module' (which, in this case, is just the IPython interactive namespace), so when you call exp() in WC_unit.update() (or anywhere else within WC_class.py) it will work fine.
However, you do not do a from numpy import * at the top of test.py, therefore when you import WC_unit into your script exp has not been defined within the scope of the current module (which is now the test script).
You've tried from numpy import * within the WC_unit.update() method itself, but this will fail because import * is only allowed at a module level (in fact you should have seen a SyntaxWarning about this when you tried to import WC_unit!). Since the import fails, exp is still undefined and the WC_unit.update() method will raise the NameError you're seeing.
What you ought to do is have a single import line at the top of any source file that uses numpy functions:
import numpy as np
then refer to any numpy functions via the np. namespace.
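For example, WC_class.py becomes self-contained (a sketch: the parameter defaults are placeholders, since the original defaults dict was elided from the question):

import numpy as np

class WC_unit(object):
    def __init__(self, r=0.0, dt=0.1, tau=1.0, Iapp=0.0):
        # placeholder defaults; substitute your real parameter dict
        self.r = r
        self.dt = dt
        self.tau = tau
        self.Iapp = Iapp

    def update(self):
        self.r += self.dt / self.tau * (-self.r + np.exp(self.Iapp))

With this, from classes.WC_class import WC_unit works from test.py with no numpy import there at all, because update() resolves np inside its own module.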
Regarding your third point, the main reason to do
import numpy as np
x = np.exp(y) # etc.
rather than
from numpy import *
x = exp(y) # etc.
is that the latter method pollutes your global namespace.
Suppose you had already defined your own function called exp. When you do from numpy import *, you will be overwriting your own function called exp with numpy.exp, so when you later call exp(y) it might not do what you expect it to. For example, this is exactly what happens to some of the built-in Python functions such as sum and all:
print(sum.__module__)
# __builtin__
from numpy import *
print(sum.__module__)
# numpy.core.fromnumeric
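The same silent replacement happens to your own definitions (a sketch):

def exp(x):
    return 'my exp of %s' % x

from numpy import *   # silently rebinds exp to numpy.exp
print(exp(1.0))       # 2.718281828... -- not 'my exp of 1.0'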
What's more, this is more-or-less irreversible - once you've done a from module import * there's no easy way to get rid of the stuff you've imported to your namespace (or restore any old modules or variables you've clobbered by importing over the top of them).
As long as you keep all of the contents of each module in its own separate namespace there is no risk of namespace collisions, and no ambiguity about where each function or class comes from. By convention we use np to refer to the namespace for numpy, plt for matplotlib.pyplot etc.

python: importing module in package namespace

I wonder if there is some standard way to do something like
import scipy as sp
from scipy import interpolate as sp.interpolate
that is not allowed.
Specifically:
I'd like to know if there is some reason why the above is not allowed. If I'm developing my own package foo, it seems reasonable to pollute its namespace as little as possible.
Things like
import scipy as sp
__import__('scipy.interpolate')
do the job, but are not all that nice, and the docs recommend not to use __import__ unless strictly necessary. Similarly,
import importlib
import scipy as sp
importlib.import_module('scipy.interpolate',sp)
does the job, but it is still ugly, even longer and puts importlib in the namespace...
Imported modules are treated like regular objects, so if you really want to, you can import a module and assign it to an arbitrary variable, like so:
import scipy as sp
from scipy import interpolate
sp.interpolate = interpolate
sp.interpolate will behave as expected; it simply points to the interpolate module whenever you use sp.interpolate. They are the same object underneath, i.e.

print(sp.interpolate is interpolate)
# True
Then, to finally remove the original interpolate name, call:
del interpolate
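Worth noting: the explicit assignment is usually redundant, because the import machinery already binds a submodule as an attribute of its parent package whenever that submodule is imported in any form. A minimal check:

import scipy as sp
from scipy import interpolate   # this import also sets the attribute on the scipy package

print(sp.interpolate is interpolate)
# True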
