When I need to use numpy within a python function I'm defining, which method is correct/better/preferred/more pythonic?
Method 1
def do_something(arg):
    import numpy as np
    y = np.array(arg)
    return y
or
Method 2
import numpy as np
def do_something(arg):
    y = np.array(arg)
    return y
My expectation is that method 2 is correct because it does not execute the import statement every time the function is called. Also, I would expect that importing within the function only makes numpy available within the scope of that function, which also seems bad.
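Yes, method 2 is preferred, and your reasoning is mostly right. Importing in Python is loosely comparable to #include header_file in C/C++: it belongs at the top of the file, where readers expect to find a module's dependencies. Method 1 is not dramatically slower, though. Only the first import of a module does real work; later import statements find the module already cached in sys.modules, so each call costs little more than a dictionary lookup and a name binding. That cost is small but not zero, so put imports at the top unless you have a specific reason not to.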
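To make the caching point concrete, here is a minimal sketch. It uses the standard-library json module as a stand-in for numpy so that it runs anywhere, and the function name local_import is made up for illustration.

```python
import sys

def local_import():
    # This statement runs on every call, but after the first call it is
    # just a lookup in the sys.modules cache plus a local name binding.
    import json
    return json.dumps([1, 2])

first = local_import()
second = local_import()

# The module object is cached globally after the first import...
is_cached = "json" in sys.modules
# ...while the name `json` is an ordinary local variable of the function.
is_local_name = "json" in local_import.__code__.co_varnames

print(first, second, is_cached, is_local_name)
```

The second call still executes the import statement, but it reuses the cached module object instead of loading the module again.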
Related
I'm writing a program which dynamically detects and imports Python functions and determines which input parameters and outputs each one will expect/generate.
Like so:
from inspect import getmembers, isfunction

def importFunctions(self, filename):
    moduleImport = __import__(filename)
    members = getmembers(moduleImport, isfunction)
    functions = []
    for m in members:
        function = getattr(moduleImport, m[0])
        number_of_inputs = function.__code__.co_argcount
        # co_varnames lists the arguments first, followed by the other locals
        inputs = list(function.__code__.co_varnames[:number_of_inputs])
        outputs = function.__annotations__["return"]
        functions.append([function, inputs, outputs])
    return functions
This works only when I properly annotate the function, an example function could look something like this:
from numba import jit

@jit
def subtraction(a, b) -> ["difference"]:
    a = float(a)
    b = float(b)
    difference = a - b
    return (difference,)
This works perfectly fine without the decorator, but when I add the numba "jit" decorator to a function, I get an error saying that the imported function is missing the "return" annotation.
UPDATE
Having tried to access the original function by using "func.py_func" as suggested by @Rutger Kassies, my suspicion is that either getmembers or getattr is not properly importing the numba to-be-compiled function.
It seems that getmembers finds "jit" as a separate member and doesn't associate it with the original function. The way it's written above, the member named "jit" is of type function, as it should be. However, calling it returns a "<function _jit.<locals>.wrapper". This has me scratching my head quite a bit, but I suppose getattr is somehow behind this.
My guess is that I will have to find another approach to dynamically importing functions that doesn't rely on getattr.
If you're dealing with the numba.jit or numba.njit decorators, you can access the original function, in all its annotated glory, through the .py_func attribute. A simple example:
import numpy as np
import numba
from typing import get_type_hints, Annotated, Any

custom_output_type = Annotated[Any, "something"]

@numba.njit
def func(x: float) -> custom_output_type:
    return x**2

# trigger compilation, not required
func(1.2)

get_type_hints(func.py_func, include_extras=True)
Which returns what you would expect from a regular Python function:
{'x': float, 'return': typing.Annotated[typing.Any, 'something']}
It would be similar when using the inspect module.
Unfortunately, it gets more complicated when you use the other decorators, like vectorize and guvectorize. See for example:
https://numba.discourse.group/t/using-annotations-with-numba-gu-vectorize-functions/1008
It's probably best to rely as much as possible on the inspect & typing modules over accessing the private attributes of a function.
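As a rough sketch of that advice applied to the discovery code in the question: inspect.isfunction only accepts plain Python functions, so a filter based on it will pick up helpers that were merely imported into the module and skip wrapper objects, which would explain why jit shows up as a member. The version below unwraps .py_func when it is present and keeps only functions defined in the module itself. The names unwrap, describe_functions and FakeDispatcher are made up for illustration, and FakeDispatcher is only a stand-in for a jitted function so that the example runs without numba.

```python
import inspect
import types

def unwrap(obj):
    """Return the plain Python function behind obj, or None if there is none."""
    # Assumption: numba-style wrappers expose the original function as .py_func.
    target = getattr(obj, "py_func", obj)
    return target if inspect.isfunction(target) else None

def describe_functions(module):
    """List (name, input names, return annotation) for functions defined in module."""
    found = []
    for name, obj in vars(module).items():
        func = unwrap(obj)
        # Skip non-functions and helpers that were only imported into the module.
        if func is None or func.__module__ != module.__name__:
            continue
        inputs = list(inspect.signature(func).parameters)
        outputs = func.__annotations__.get("return")
        found.append((name, inputs, outputs))
    return found

# Build a throwaway module in memory; FakeDispatcher is a hypothetical
# stand-in for a jitted function so the sketch runs without numba.
demo_source = '''
from inspect import getmembers  # imported helper, should be ignored

class FakeDispatcher:
    def __init__(self, py_func):
        self.py_func = py_func

def subtraction(a, b) -> ["difference"]:
    return (float(a) - float(b),)

wrapped = FakeDispatcher(subtraction)
'''
demo = types.ModuleType("demo")
exec(demo_source, demo.__dict__)

result = describe_functions(demo)
print(result)
```

Both the plain function and the wrapped one are reported with the same inputs and return annotation, while the imported getmembers helper is ignored.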
Is it okay to call any numpy function without using the library name before the function (example: numpy.linspace())? Can we call it simply
linspace()
instead of calling
numpy.linspace()
You can import it like this
from numpy import linspace
and then use it like this
a = linspace(1, 10)
Yes, it's completely fine when you import the function from numpy directly, such as
from numpy import linspace
# you can call the function by just writing its name
result = linspace(3, 50)
but the convention is to import the package under the alias np
import numpy as np
# then call the function through the short name
result = np.linspace(3, 50)
An alias can be helpful when working with a large number of libraries, and it also improves code readability.
If you import the function from the library directly there is nothing wrong with calling said function directly.
i.e.
from numpy import linspace
# Then call linspace by itself
a = linspace(1, 10)
That being said, many find that having numpy (often shortened to np) in front of function names helps improve code readability. As almost everyone does this with certain libraries (TensorFlow as tf, NumPy as np, pandas as pd), some may view it in a poor light if you simply import and use the function directly.
I would recommend importing the library as the shortened name and then using it appropriately.
i.e.
import numpy as np
# Then call np.linspace
a = np.linspace(1, 10)
I need to create a class which takes in a random number generator (i.e. a numpy.random.RandomState object) as a parameter. In the case this argument is not specified, I would like to assign it to the random generator that numpy uses when we run numpy.random.<random-method>. How do I access this global generator? Currently I am doing this by just assigning the module object as the random generator (since they share methods / duck typing). However this causes issues when pickling (unable to pickle module object) and deep-copying. I would like to use the RandomState object behind numpy.random
PS: I'm using python-3.4
As well as what kazemakase suggests, we can take advantage of the fact that module-level functions like numpy.random.random are really methods of a hidden numpy.random.RandomState by pulling the __self__ directly from one of those methods:
numpy_default_rng = numpy.random.random.__self__
numpy.random imports * from numpy.random.mtrand, which is an extension module written in Cython. The source code shows that the global state is stored in the variable _rand. This variable is not imported into the numpy.random scope but you can get it directly from mtrand.
import numpy as np
from numpy.random.mtrand import _rand as global_randstate
np.random.seed(42)
print(np.random.rand())
# 0.3745401188473625
np.random.RandomState().seed(42) # Different object, does not influence global state
print(np.random.rand())
# 0.9507143064099162
global_randstate.seed(42) # this changes the global state
print(np.random.rand())
# 0.3745401188473625
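Applied to the original question, a class could fall back to this hidden global RandomState when no generator is passed in. The following is only a sketch: the class name Simulation and its draw method are made up, and _rand is a private name that may change between numpy versions.

```python
import pickle

import numpy as np
from numpy.random.mtrand import _rand as global_randstate  # private name, may change

class Simulation:
    """Hypothetical class that draws numbers from a RandomState."""

    def __init__(self, random_state=None):
        # Fall back to the hidden global RandomState instead of the module object.
        self.random_state = global_randstate if random_state is None else random_state

    def draw(self):
        return self.random_state.rand()

np.random.seed(42)
expected = np.random.rand()

np.random.seed(42)
sim = Simulation()          # shares the global state
shared_draw = sim.draw()    # same value as `expected`

# A RandomState instance can be pickled, unlike the numpy.random module.
# The unpickled copy no longer shares state with the global generator.
clone_state = pickle.loads(pickle.dumps(sim.random_state))

# Passing an explicit generator keeps the instance independent of global state.
private_sim = Simulation(np.random.RandomState(3))
```

Because the attribute is a real RandomState object rather than the module, pickling and deep-copying work, at the cost of the copy becoming independent of the global generator.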
I don't know how to access the global state. However, you can use a RandomState object and pass it along. Random distributions are attached to it, so you call them as methods.
Example:
import numpy as np
def computation(parameter, rs):
    return parameter*np.sum(rs.uniform(size=5)-0.5)
my_state = np.random.RandomState(seed=3)
print(computation(3, my_state))
I know that from module import * will import all the functions in current namespace but it is a bad practice. I want to use two functions directly and use module.function when I have to use any other function from the module. What I am doing currently is:
import module
from module import func1, func2
# DO REST OF MY STUFF
Is it a good practice? Does the order of first two statements matter?
Is there a better way using which I can use these two functions directly and use rest of the functions as usual with the module's name prepended to them?
Using just import module results in very long statements with a lot of repetition if I use the same function from the given module five times in a single statement. That's what I want to avoid.
The order doesn't matter, but doing both is not very Pythonic. Once you have imported the module, there is usually no need to import some of its functions separately as well. If you are not sure how many of its functions you will need, just import the module and access them through it as needed.
# The only import you need
import module
# Use module.funcX when you need any of its functions
That said, if you use some functions much more often than the others, importing them by name is reasonable: it shortens the call sites and avoids an attribute lookup on each call, so importing them as you have done is fine.
And again, the order doesn't matter. You can do:
import module
from module import func1, func2
For more info read the documentation https://www.python.org/dev/peps/pep-0008/#imports
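A small sketch of the combined form, using the standard-library math module as a stand-in for the unnamed module in the question:

```python
import math
from math import sqrt, floor  # order relative to `import math` does not matter

# Both spellings refer to the very same function object.
same_object = sqrt is math.sqrt

# Frequently used names can be called directly; everything else goes
# through the module name.
hypotenuse = sqrt(3 ** 2 + 4 ** 2)
rounded_pi = floor(math.pi)

print(same_object, hypotenuse, rounded_pi)
```

The second import does not load the module again; it only binds two extra names to objects the module already holds.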
It is not good to do (may be opinion based):
import module
from module import func1, func2 # `func1` and `func2` are already part of module
Because you already hold a reference to module.
If I were you, I would use the plain import module form. Since your issue is that module.func1() becomes too long, I would import the module and use as to create an alias for the name. For example:
import module as mo
#                ^ for illustration purposes. Your actual module
#                  won't be named `module`.
# The alias should also be self-explanatory.
# For example:
import database_manager as db_manager
Now I may access the functions as:
mo.func1()
mo.func2()
Edit: based on the edit to the question
If you are calling the same function several times in one statement, there is a chance that the code could be structured better. It would help if you could share what that function does.
For example, do you want the return values of those calls to be passed as arguments to another function? As in:
test_func(mo.func1(x), mo.func1(y), mo.func1(z))
could be done as:
params_list = [x, y, z]
func_list = [mo.func1(param) for param in params_list]
test_func(*func_list)
I'm trying to learn how to do object-oriented coding for scientific computing while running a simulation; I'm using numpy, etc. I've created my first class, WC_unit, which is located at ./classes/WC_class.py (a subdirectory). I've created an __init__.py file (which is empty) in the classes directory.
The methods for the WC_unit class require some numpy functions, like exp
When I run the code (in ipython) from the terminal, using
%run WC_class.py
I can generate an instance of the class E1 = WC_unit() and I can run the associated methods on it, ie E1.update()
I can't really tell if it's working. I wrote some outer code in a script test.py located at . (above ./classes) to test the objects I'm generating and I'm trying to import the class by using
from classes.WC_class import WC_unit
Now, when I create an instance E1 of the class and run E1.update(), I get the error message global name 'exp' is not defined.
I've tried calling from numpy import * or also import numpy as np and changing the function call to np.exp() and I continue to get the error. Thinking that I had some sort of scoping problem or issues with namespace I've put this same import function at various locations, including in the test.py file, the top of the class file WC_class.py, even in the method:
class WC_unit:
    def __init__(self):
        pass  # [assign default pars from a dict including r, dt, tau, and Iapp]...
    def update(self):
        from numpy import *
        self.r += self.dt/self.tau * (-self.r + exp(self.Iapp))
I would really like to up my game and figure out how to write my own classes and use them with the awesome computing tools. I guess I'd like to know:
What am I doing wrong (probably a lot, I suspect). I think it's something with how I'm importing my class? but perhaps also scoping in the class itself.
Why does my class lose access to the numpy functions when I import it, but not when I run it like a script in the terminal?
I guess I also generally don't understand why people are so protective of their namespaces, i.e. why do so many code examples show import numpy as np and use all of the functions as np.exp(x), etc. I don't have much of a computer science background so I could benefit a lot from any explanations you could provide- the documentation is kind of cryptic to me.
Python version: 2.7.8 |Anaconda 2.1.0 (x86_64)| (default, Aug 21 2014, 15:21:46)
[GCC 4.2.1 (Apple Inc. build 5577)]
On Mac OSX 10.6.8
When you call %run WC_class.py in IPython, what you are doing is loading the contents of that source file directly into the interactive namespace. Because you've already called from numpy import * within your IPython session, exp is defined as numpy.exp within the set of globals for the current 'module' (which, in this case, is just the IPython interactive namespace), so when you call exp() in WC_unit.update() (or anywhere else within WC_class.py) it will work fine.
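However, when test.py imports WC_unit, the methods of the class look up global names in the module where they were defined (WC_class.py), not in the interactive session or in the importing script. Unless from numpy import * (or an equivalent import) has been executed at the top of WC_class.py, exp is not defined in that module's namespace, no matter what test.py itself imports.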
You've tried from numpy import * within the WC_unit.update() method itself, but this will fail because import * is only allowed at a module level (in fact you should have seen a SyntaxWarning about this when you tried to import WC_unit!). Since the import fails, exp is still undefined and the WC_unit.update() method will raise the NameError you're seeing.
What you ought to do is have a single import line at the top of any source file that uses numpy functions:
import numpy as np
then refer to any numpy functions via the np. namespace.
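The import has to sit in the file that uses the function because a function resolves global names in the module where it was defined, not in the script that imports it. Here is a minimal sketch of that behaviour, using an in-memory module (the name wc_class_demo is made up) and math.exp as a stand-in for numpy.exp:

```python
import types
from math import exp  # the calling script has exp, much like test.py might

# Simulate a separate source file that uses exp without importing it.
wc_module = types.ModuleType("wc_class_demo")
exec("def update(x):\n    return exp(x)\n", wc_module.__dict__)

try:
    wc_module.update(0.0)
    outcome = "ok"
except NameError:
    # exp is looked up in wc_class_demo's globals, where it does not exist.
    outcome = "NameError"

# Adding the import to the defining module fixes the lookup.
exec("from math import exp", wc_module.__dict__)
fixed = wc_module.update(0.0)

print(outcome, fixed)
```

The caller's own import of exp makes no difference to update; only an import executed in the defining module's namespace does.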
Regarding your third point, the main reason to do
import numpy as np
x = np.exp(y) # etc.
rather than
from numpy import *
x = exp(y) # etc.
is that the latter method pollutes your global namespace.
Suppose you had already defined your own function called exp. When you do from numpy import *, you will be overwriting your own function called exp with numpy.exp, so when you later call exp(y) it might not do what you expect it to. For example, this is exactly what happens to some of the built-in Python functions such as sum and all:
print(sum.__module__)
# __builtin__
from numpy import *
print(sum.__module__)
# numpy.core.fromnumeric
What's more, this is more-or-less irreversible - once you've done a from module import * there's no easy way to get rid of the stuff you've imported to your namespace (or restore any old modules or variables you've clobbered by importing over the top of them).
As long as you keep all of the contents of each module in its own separate namespace there is no risk of namespace collisions, and no ambiguity about where each function or class comes from. By convention we use np to refer to the namespace for numpy, plt for matplotlib.pyplot etc.