Say I want to calculate a value for every point on a grid. I would define some function func that takes two values x and y as parameters and returns a third value. In the example below, calculating this value requires a look-up in an external dictionary. I would then generate a grid of points and evaluate func on each of them to get my desired result.
The code below does precisely this, but in a somewhat roundabout way. First I reshape both the X and Y coordinate matrices into one-dimensional arrays, calculate all the values, and then reshape the result back into a matrix. My question is, can this be done in a more elegant manner?
import numpy as np
import collections as c
# some arbitrary lookup table
a = c.defaultdict(int)
a[1] = 2
a[2] = 3
a[3] = 2
a[4] = 3
def func(x,y):
    # some arbitrary function
    return a[x] + a[y]
X,Y = np.mgrid[1:3, 1:4]
X = X.T
Y = Y.T
Z = np.array([func(x,y) for (x,y) in zip(X.ravel(), Y.ravel())]).reshape(X.shape)
print(Z)
The purpose of this code is to generate a set of values that I can use with pcolor in matplotlib to create a heatmap-type plot.
I'd use numpy.vectorize to "vectorize" your function. Note that despite the name, vectorize is not intended to make your code run faster; it just simplifies it a bit.
Here are some examples:
>>> import numpy as np
>>> @np.vectorize
... def foo(a, b):
...     return a + b
...
>>> foo([1,3,5], [2,4,6])
array([ 3,  7, 11])
>>> foo(np.arange(9).reshape(3,3), np.arange(9).reshape(3,3))
array([[ 0,  2,  4],
       [ 6,  8, 10],
       [12, 14, 16]])
With your code, it should be enough to decorate func with np.vectorize, and then you can probably just call it as func(X, Y), with no raveling or reshaping necessary:
import numpy as np
import collections as c
# some arbitrary lookup table
a = c.defaultdict(int)
a[1] = 2
a[2] = 3
a[3] = 2
a[4] = 3
@np.vectorize
def func(x,y):
    # some arbitrary function
    return a[x] + a[y]
X,Y = np.mgrid[1:3, 1:4]
X = X.T
Y = Y.T
Z = func(X, Y)
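Since np.vectorize still calls the Python function once per element, a different technique may be worth considering when the lookup keys are small non-negative integers, as they are in this example: turn the dictionary into an array and index it with the whole grid. The sketch below is only an illustration under that assumption. The names table and lut are made up here, and keys larger than the biggest table key would raise an IndexError instead of returning 0 as defaultdict(int) would.

```python
import numpy as np

# Same lookup values as in the question (assumption: keys are small non-negative ints)
table = {1: 2, 2: 3, 3: 2, 4: 3}

# Hypothetical helper array: lut[k] holds the table value for key k, 0 for missing keys
lut = np.zeros(max(table) + 1, dtype=int)
for key, value in table.items():
    lut[key] = value

X, Y = np.mgrid[1:3, 1:4]
X = X.T
Y = Y.T

# Indexing with integer arrays looks up every grid point at once
Z = lut[X] + lut[Y]
print(Z)
```

Here Z has the same shape as X and Y, so it can be passed to pcolor directly.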
I am wondering if it is possible to compute the greatest common divisor for more than 2 arrays using numpy.gcd().
Using the following arrays for x, y, z:
import numpy as np
x = np.array([[4,6,28],[2,5,6]])
y = np.array([[2,1,7],[7,23,6]])
z = np.array([[3,0,4],[7,4,3]])
Here is the gcd call taking the three arrays:
result = np.gcd(x,y,z)
Which leads to:
array([[2, 1, 7],
       [1, 1, 6]])
result[0,2]
7
Instead of 7 shouldn't this be 1? Given the numbers 28, 7, 4, the following returns 1.
numpy.gcd.reduce([28, 7, 4])
So my question is: am I making a mistake at some point, or is numpy.gcd unable to take three arrays as input, so that it simply computes the gcd over the first two arrays it receives?
From https://numpy.org/doc/stable/reference/generated/numpy.gcd.html
The function signature is this:
numpy.gcd(x1, x2, /, out=None, *, ...)
If you call gcd with three positional arguments, the third one is taken as out, so you are basically doing z = gcd(x, y): the gcd of x and y is computed and written into z. Therefore you need to come up with your own function. It could be something like
def my_gcd(x, y, z):
    return np.gcd(np.gcd(x, y), z)
which would return
[[1 1 1]
[1 1 3]]
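The question already uses numpy.gcd.reduce on a plain list, and the same ufunc method should extend to whole arrays if they are stacked first. The following is a minimal sketch of that idea using the arrays from the question. It assumes all inputs share one shape.

```python
import numpy as np

x = np.array([[4, 6, 28], [2, 5, 6]])
y = np.array([[2, 1, 7], [7, 23, 6]])
z = np.array([[3, 0, 4], [7, 4, 3]])

# Stack into shape (3, 2, 3), then fold gcd along the new leading axis
stacked = np.stack((x, y, z))
result = np.gcd.reduce(stacked, axis=0)
print(result)
```

This generalizes to any number of arrays without nesting np.gcd calls by hand.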
I have the following python function:
def npnearest(u: np.ndarray, X: np.ndarray, Y: np.ndarray, distance: 'callable'=npdistance):
    '''
    Finds x1 so that x1 is in X and u and x1 have a minimal distance (according to the
    provided distance function) compared to all other data points in X. Returns the label of x1
    Args:
        u (np.ndarray): The vector (ndim=1) we want to classify
        X (np.ndarray): A matrix (ndim=2) with training data points (vectors)
        Y (np.ndarray): A vector containing the label of each data point in X
        distance (callable): A function that receives two inputs and defines the distance function used
    Returns:
        int: The label of the data point which is closest to `u`
    '''
    xbest = None
    ybest = None
    dbest = float('inf')
    for x, y in zip(X, Y):
        d = distance(u, x)
        if d < dbest:
            ybest = y
            xbest = x
            dbest = d
    return ybest
where npdistance simply gives the squared distance between two points, i.e.
def npdistance(x1, x2):
    return np.sum((x1-x2)**2)
I want to optimize npnearest by performing nearest neighbor search directly in numpy. This means that the function cannot use for/while loops.
Thanks
Since you don't need to use that exact function, you can simply change the sum to work over a particular axis. This returns a new array with one squared distance per row of X, and you can call argmin on it to get the index of the minimum value. Use that index to look up your label:
import numpy as np
def npdistance_idx(x1, x2):
    return np.argmin(np.sum((x1-x2)**2, axis=1))
Y = ["label 0", "label 1", "label 2", "label 3"]
u = np.array([[1, 5.5]])
X = np.array([[1,2], [1, 5], [0, 0], [7, 7]])
idx = npdistance_idx(X, u)
print(Y[idx]) # label 1
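To connect this back to the original signature, the same idea can be wrapped so that it takes u, X and Y and returns the label directly. This is only a sketch: the name npnearest_vectorized is made up for illustration, and it hard-codes the squared Euclidean distance instead of accepting an arbitrary distance callable.

```python
import numpy as np

def npnearest_vectorized(u, X, Y):
    """Return the label in Y of the row of X closest to u (squared Euclidean distance)."""
    X = np.asarray(X)
    Y = np.asarray(Y)
    # Broadcasting subtracts u from every row of X; the sum over axis=1 gives one distance per row
    distances = np.sum((X - u) ** 2, axis=1)
    # argmin returns the first minimum, matching the strict `d < dbest` test in the loop version
    return Y[np.argmin(distances)]

X = np.array([[1, 2], [1, 5], [0, 0], [7, 7]])
Y = np.array([0, 1, 2, 3])
u = np.array([1, 5.5])
print(npnearest_vectorized(u, X, Y))
```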
NumPy supports vectorized operations (broadcasting).
This means you can pass in arrays and operations will be applied to entire arrays in an optimized way (SIMD - single instruction, multiple data).
You can then get the index of the array minimum using .argmin()
Hope this helps
In [9]: numbers = np.arange(10); numbers
Out[9]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [10]: numbers -= 5; numbers
Out[10]: array([-5, -4, -3, -2, -1, 0, 1, 2, 3, 4])
In [11]: numbers = np.power(numbers, 2); numbers
Out[11]: array([25, 16, 9, 4, 1, 0, 1, 4, 9, 16])
In [12]: numbers.argmin()
Out[12]: 5
I'll try to explain my issue here without going into too much detail on the actual application so that we can stay grounded in the code. Basically, I need to perform operations on a vector field. My first step is to generate the field as
x,y,z = np.meshgrid(np.linspace(-5,5,10),np.linspace(-5,5,10),np.linspace(-5,5,10))
Keep in mind that this is a generalized case; in the program, the bounds of the vector field are not all the same. In the general run of things, I would expect to say something along the lines of
u,v,w = f(x,y,z).
Unfortunately, this case requires some more difficult operations. I need to use a formula similar to
where the vector r is defined in the program as np.array([xgrid-x,ygrid-y,zgrid-z]) divided by its own norm. Basically, this is a vector pointing from every point in space to the position (x,y,z)
Now Numpy has implemented a cross product function using np.cross(), but I can't seem to create a "meshgrid of vectors" like I need.
I have a lambda function that is essentially
xgrid,ygrid,zgrid=np.meshgrid(np.linspace(-5,5,10),np.linspace(-5,5,10),np.linspace(-5,5,10))
B = lambda x,y,z: np.cross(v,np.array([xgrid-x,ygrid-y,zgrid-z]))
Now the array v is imported from another class and seems to work just fine, but the second array, np.array([xgrid-x,ygrid-y,zgrid-z]) is not a proper shape because it is a "vector of meshgrids" instead of a "meshgrid of vectors". My big issue is that I cannot seem to find a method by which to format the meshgrid in such a way that the np.cross() function can use the position vector. Is there a way to do this?
Originally I thought that I could do something along the lines of:
x,y,z = np.meshgrid(np.linspace(-2,2,5),np.linspace(-2,2,5),np.linspace(-2,2,5))
A = np.array([x,y,z])
cross_result = np.cross(np.array(v),A)
This, however, returns the following error, which I cannot seem to circumvent:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\numpy\core\numeric.py", line 1682, in cross
    raise ValueError(msg)
ValueError: incompatible dimensions for cross product
(dimension must be 2 or 3)
There's a workaround with reshape and broadcasting:
A = np.array([x_grid, y_grid, z_grid])
# A.shape == (3,5,5,5)
def B(v, p):
    '''
    v.shape = (3,)
    p.shape = (3,)
    '''
    shape = A.shape
    # flatten the grid to (3, N) and shift by the reference point p
    Ap = A.reshape(3,-1) - p[:,None]
    # cross product on the (N, 3) rows, then transpose back to (3, N) before restoring the grid shape
    return np.cross(v[None,:], Ap.T).T.reshape(shape)
print(B(v,p).shape)
# (3, 5, 5, 5)
I think your original attempt only lacks the specification of the axis along which the cross product should be executed.
x, y, z = np.meshgrid(np.linspace(-2, 2, 5),np.linspace(-2, 2, 5), np.linspace(-2, 2, 5))
A = np.array([x, y, z])
cross_result = np.cross(np.array(v), A, axis=0)
I tested this with the code below. As an alternative to np.array([x, y, z]), you can also use np.stack((x, y, z), axis=0), which clearly shows along which axis the meshgrids are stacked to form a meshgrid of vectors, the vectors being aligned with axis 0. I also printed the shape each time and used random input for testing. In the test, the output of the formula is compared at a random index to the cross product of vector v with the input vector at the same index.
import numpy as np
x, y, z = np.meshgrid(np.linspace(-5, 5, 10), np.linspace(-5, 5, 10), np.linspace(-5, 5, 10))
p = np.random.rand(3) # random reference point
A = np.array([x-p[0], y-p[1], z-p[2]]) # vectors from the reference point to the grid positions
A_bis = np.stack((x-p[0], y-p[1], z-p[2]), axis=0)
print(f"A equals A_bis? {np.allclose(A, A_bis)}") # the two methods of stacking yield the same
v = -1 + 2*np.random.rand(3) # random vector v
B = np.cross(v, A, axis=0) # cross-product for all points along correct axis
print(f"Shape of v: {v.shape}")
print(f"Shape of A: {A.shape}")
print(f"Shape of B: {B.shape}")
print("\nComparison for random locations: ")
point = np.random.randint(0, 9, 3) # generate random multi-index
a = A[:, point[0], point[1], point[2]] # look up input-vector corresponding to index
b = B[:, point[0], point[1], point[2]] # look up output-vector corresponding to index
print(f"A[:, {point[0]}, {point[1]}, {point[2]}] = {a}")
print(f"v = {v}")
print(f"Cross-product as v x a: {np.cross(v, a)}")
print(f"Cross-product from B (= v x A): {b}")
The resulting output looks like:
A equals A_bis? True
Shape of v: (3,)
Shape of A: (3, 10, 10, 10)
Shape of B: (3, 10, 10, 10)
Comparison for random locations:
A[:, 8, 1, 1] = [-4.03607312 3.72661831 -4.87453077]
v = [-0.90817859 0.10110274 -0.17848181]
Cross-product as v x a: [ 0.17230515 -3.70657882 -2.97637688]
Cross-product from B (= v x A): [ 0.17230515 -3.70657882 -2.97637688]
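If a literal "meshgrid of vectors" is preferred, with the three components on the last axis, stacking along axis=-1 should also work, because np.cross takes the vectors from the last axis by default. The snippet below is a small sketch of that variant. The vector v is an arbitrary example value, not the one from the question.

```python
import numpy as np

x, y, z = np.meshgrid(np.linspace(-2, 2, 5), np.linspace(-2, 2, 5), np.linspace(-2, 2, 5))
v = np.array([0.0, 0.0, 1.0])  # arbitrary example vector (assumption)

# Components on the last axis: shape (5, 5, 5, 3)
A_last = np.stack((x, y, z), axis=-1)
B_last = np.cross(v, A_last)  # default: vectors are read from the last axis

# Components on the first axis, as in the answer above: shape (3, 5, 5, 5)
A_first = np.stack((x, y, z), axis=0)
B_first = np.cross(v, A_first, axis=0)

# Both layouts hold the same numbers, only the component axis differs
print(B_last.shape, B_first.shape)
print(np.allclose(np.moveaxis(B_first, 0, -1), B_last))
```

Which layout is more convenient depends on the later steps: axis 0 unpacks neatly into u, v, w components, while the last-axis layout lets B_last[i, j, k] return the full vector at one grid point.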
I have two object arrays not necessarily of the same length:
import numpy as np
class Obj_A:
    def __init__(self,n):
        self.type = 'a'+str(n)
    def __eq__(self,other):
        return self.type==other.type
class Obj_B:
    def __init__(self,n):
        self.type = 'b'+str(n)
    def __eq__(self,other):
        return self.type==other.type
a = np.array([Obj_A(n) for n in range(2)])
b = np.array([Obj_B(n) for n in range(3)])
I would like to generate the matrix
mat = np.array([[[a[0],b[0]],[a[0],b[1]],[a[0],b[2]]],
[[a[1],b[0]],[a[1],b[1]],[a[1],b[2]]]])
This matrix has shape (len(a),len(b),2). Its elements are
mat[i,j] = [a[i],b[j]]
A solution is
mat = np.empty((len(a),len(b),2),dtype='object')
for i,aa in enumerate(a):
    for j,bb in enumerate(b):
        mat[i,j] = np.array([aa,bb],dtype='object')
but this is too expensive for my problem, which has O(len(a)) = O(len(b)) = 1e5.
I suspect there is a clean numpy solution involving np.repeat, np.tile and np.transpose, similar to the accepted answer here, but the output in this case does not simply reshape to the desired result.
I would suggest using np.meshgrid(), which takes two input arrays and repeats both along different axes so that looking at corresponding positions of the outputs gets you all possible combinations. For example:
>>> x, y = np.meshgrid([1, 2, 3], [4, 5])
>>> x
array([[1, 2, 3],
       [1, 2, 3]])
>>> y
array([[4, 4, 4],
       [5, 5, 5]])
In your case, you can put the two arrays together and transpose them into the proper configuration. Based on some experimentation I think this should work for you:
>>> np.transpose(np.meshgrid(a, b), (2, 1, 0))
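Another way to build the same (len(a), len(b), 2) layout, shown here only as a sketch, is to allocate the object array once and fill its two slots by broadcasting. Plain strings stand in for the Obj_A and Obj_B instances to keep the example short. That substitution is an assumption made for illustration.

```python
import numpy as np

# Stand-ins for the Obj_A / Obj_B instances (assumption: any Python objects work the same way)
a = np.array(["a0", "a1"], dtype=object)
b = np.array(["b0", "b1", "b2"], dtype=object)

mat = np.empty((len(a), len(b), 2), dtype=object)
mat[..., 0] = a[:, None]  # a[i] repeated along the b axis
mat[..., 1] = b[None, :]  # b[j] repeated along the a axis

print(mat.shape)
print(mat[1, 2])
```

Either way the result has len(a) * len(b) * 2 entries, so for very large inputs memory rather than the Python loop may become the limiting factor.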
Is there a way to perform a dot product of two lists of values without using NumPy or the operator module in Python, so that the code is as simple as it can get?
For example:
V_1=[1,2,3]
V_2=[4,5,6]
Dot(V_1,V_2)
Answer: 32
Without numpy, you can write yourself a function for the dot product which uses zip and sum.
>>> def dot(v1, v2):
...     return sum(x*y for x, y in zip(v1, v2))
...
>>> dot([1, 2, 3], [4, 5, 6])
32
As of Python 3.10, you can use zip(v1, v2, strict=True) to ensure that v1 and v2 have the same length.
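The length check matters because plain zip silently stops at the shorter input. On interpreters older than 3.10, a similar guard can be written by hand. The sketch below uses a made-up name, dot_checked, and raises ValueError on mismatched lengths. This is one possible choice, not the only one.

```python
def dot_checked(v1, v2):
    """Dot product of two sequences that refuses inputs of different lengths."""
    if len(v1) != len(v2):
        raise ValueError("dot_checked: sequences must have the same length")
    return sum(x * y for x, y in zip(v1, v2))

print(dot_checked([1, 2, 3], [4, 5, 6]))
```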
def dot_product(x, y):
    dp = 0
    for i in range(len(x)):
        dp += (x[i]*y[i])
    return dp
sample1 = [1,2,3,4,5]
sample2 = [2,1,1,1,1]
dot_product(sample1, sample2) #16
We can simply use the @ operator (matrix multiplication, available since Python 3.5) on NumPy arrays.
For example:
import numpy as np
x = np.array([25, 2, 5])
y = np.array([0, 1, 2])
print(x@y)
12