Numpy.gcd using more than 2 arrays - python

I am wondering if it is possible to compute the greatest common divisor for more than 2 arrays using numpy.gcd().
Using the following arrays for x, y, z:
import numpy as np
x = np.array([[4,6,28],[2,5,6]])
y = np.array([[2,1,7],[7,23,6]])
z = np.array([[3,0,4],[7,4,3]])
Here is the gcd call taking the 3 arrays:
result = np.gcd(x,y,z)
Which leads to:
array([[2, 1, 7],
       [1, 1, 6]])
result[0,2]
7
Instead of 7 shouldn't this be 1? Given the numbers 28, 7, 4, the following returns 1.
numpy.gcd.reduce([28, 7, 4])
So my question is whether I am making a mistake at some point, or whether numpy.gcd cannot take 3 arrays as input and simply computes the gcd over the first two arrays it receives.

From https://numpy.org/doc/stable/reference/generated/numpy.gcd.html
The function signature is this:
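numpy.gcd(x1, x2, /, out=None, *, ...)
The third positional argument is out, so if you call gcd with three arrays as np.gcd(x, y, z), you are basically doing z = gcd(x, y): the gcd of the first two arrays is computed and written into z. Therefore you need to come up with your own function. It could be something like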
def my_gcd(x, y, z):
    return np.gcd(np.gcd(x, y), z)
which would return
[[1 1 1]
 [1 1 3]]
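As a sketch of an alternative that scales to any number of arrays, the inputs can be stacked along a new leading axis and reduced with np.gcd.reduce, the same ufunc method the question already uses on a plain list. The variable names stacked, result and out are only illustrative, and the arrays are the ones from the question (redefined here so z is not overwritten).

```python
import numpy as np

x = np.array([[4, 6, 28], [2, 5, 6]])
y = np.array([[2, 1, 7], [7, 23, 6]])
z = np.array([[3, 0, 4], [7, 4, 3]])

# Stack the inputs along a new leading axis: shape (3, 2, 3).
stacked = np.stack([x, y, z])

# Reduce with gcd over that axis, i.e. element-wise gcd(gcd(x, y), z).
result = np.gcd.reduce(stacked, axis=0)
print(result)

# For comparison: a third positional argument is treated as `out`,
# so this call stores gcd(x, y) in `out` and never reads its contents.
out = np.empty_like(x)
np.gcd(x, y, out)
print(out)
```

The first print should match the my_gcd result above, and the second should match the output the question observed.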

Related

How to perform a vectorized function on a 2D numpy array?

vecs = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])
def find_len(vector):
    return (vector[0] ** 2 + vector[1] ** 2 + vector[2] ** 2) ** 0.5
vec_len = np.vectorize(find_len)
I want to apply find_len to every vector in the 2d array and create a new numpy array with the values returned. How can I do this?
Try this:
res = []
for i in range(vecs.shape[0]):
    res.append(find_len(vecs[i]))
res = np.array(res)
results in
array([ 3.74165739, 8.77496439, 13.92838828])
You can also do this in one line, since iterating over a 2D array yields its rows:
res = np.array([find_len(x) for x in vecs])
Are you just looking for this result:
array([ 3.74165739, 8.77496439, 13.92838828])
because you can achieve that without vectorize, just use:
(vecs**2).sum(axis=1)**0.5
This also has the advantage of not being specific to vectors of length 3.
Operations are already applied element-wise, so you can handle the squaring and square rooting normally. sum(axis=1) says to sum along the rows.
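As a small sketch of two other ways to get the same row lengths: np.linalg.norm with axis=1 computes the Euclidean norm of each row, and np.apply_along_axis applies the original find_len to each row (convenient, though it still loops in Python). The names lengths and lengths_slow are only illustrative.

```python
import numpy as np

vecs = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

def find_len(vector):
    return (vector[0] ** 2 + vector[1] ** 2 + vector[2] ** 2) ** 0.5

# Euclidean norm of every row, computed in compiled code.
lengths = np.linalg.norm(vecs, axis=1)

# Applies find_len to each 1-D slice taken along axis 1 (one call per row).
lengths_slow = np.apply_along_axis(find_len, 1, vecs)

print(lengths)
print(lengths_slow)
```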

Optimize the python function with numpy without using the for loop

I have the following python function:
def npnearest(u: np.ndarray, X: np.ndarray, Y: np.ndarray, distance: 'callable'=npdistance):
    '''
    Finds x1 so that x1 is in X and u and x1 have a minimal distance (according to the
    provided distance function) compared to all other data points in X. Returns the label of x1
    Args:
        u (np.ndarray): The vector (ndim=1) we want to classify
        X (np.ndarray): A matrix (ndim=2) with training data points (vectors)
        Y (np.ndarray): A vector containing the label of each data point in X
        distance (callable): A function that receives two inputs and defines the distance function used
    Returns:
        int: The label of the data point which is closest to `u`
    '''
    xbest = None
    ybest = None
    dbest = float('inf')
    for x, y in zip(X, Y):
        d = distance(u, x)
        if d < dbest:
            ybest = y
            xbest = x
            dbest = d
    return ybest
where npdistance simply gives the (squared Euclidean) distance between two points, i.e.
def npdistance(x1, x2):
    return np.sum((x1 - x2) ** 2)
I want to optimize npnearest by performing nearest neighbor search directly in numpy. This means that the function cannot use for/while loops.
Thanks
Since you don't need to use that exact function, you can simply change the sum to work over a particular axis. This returns a new array with one squared distance per row, and you can call argmin to get the index of the minimum value. Use that index to look up your label:
import numpy as np
def npdistance_idx(x1, x2):
    return np.argmin(np.sum((x1 - x2) ** 2, axis=1))
Y = ["label 0", "label 1", "label 2", "label 3"]
u = np.array([[1, 5.5]])
X = np.array([[1,2], [1, 5], [0, 0], [7, 7]])
idx = npdistance_idx(X, u)
print(Y[idx]) # label 1
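Putting this together, a loop-free version of npnearest might look like the sketch below. The name npnearest_vectorized is hypothetical, the squared Euclidean distance is hard-coded instead of being passed in as a callable, and Y is assumed to be a NumPy array so it can be indexed with the argmin result.

```python
import numpy as np

def npnearest_vectorized(u, X, Y):
    """Return the label of the row of X closest to u (squared Euclidean distance)."""
    # X has shape (n, d) and u has shape (d,); broadcasting subtracts u from every row.
    distances = np.sum((X - u) ** 2, axis=1)
    # argmin returns the first index of the minimum, like the strict `<` in the loop version.
    return Y[np.argmin(distances)]

X = np.array([[1, 2], [1, 5], [0, 0], [7, 7]])
Y = np.array(["label 0", "label 1", "label 2", "label 3"])
u = np.array([1, 5.5])

label = npnearest_vectorized(u, X, Y)
print(label)
```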
NumPy supports vectorized operations, with broadcasting to line up arrays of different shapes.
This means you can pass in whole arrays and the operation is applied to every element in optimized compiled code, which can use SIMD (single instruction, multiple data) where available.
You can then get the index of the array minimum using .argmin().
Hope this helps.
In [9]: numbers = np.arange(10); numbers
Out[9]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [10]: numbers -= 5; numbers
Out[10]: array([-5, -4, -3, -2, -1, 0, 1, 2, 3, 4])
In [11]: numbers = np.power(numbers, 2); numbers
Out[11]: array([25, 16, 9, 4, 1, 0, 1, 4, 9, 16])
In [12]: numbers.argmin()
Out[12]: 5

List of tuples of vectors --> two matrices

In Python, I have a list of tuples, each of them containing two nx1 vectors.
data = [(np.array([0,0,3]), np.array([0,1])),
(np.array([1,0,4]), np.array([1,1])),
(np.array([2,0,5]), np.array([2,1]))]
Now, I want to split this list into two matrices, with the vectors as columns.
So I'd want:
x = np.array([[0,1,2],
              [0,0,0],
              [3,4,5]])
y = np.array([[0,1,2],
              [1,1,1]])
Right now, I have the following:
def split(data):
    x, y = zip(*data)
    x = np.asarray(x).T  # keep the converted array; vectors become columns
    y = np.asarray(y).T
    return (x, y)
This works fine, but I was wondering whether a cleaner method exists, one which doesn't use the zip(*) function and/or doesn't require converting and transposing the x and y matrices.
This is for pure entertainment, since I'd go with the zip solution if I were to do what you're trying to do.
But a way without zipping would be vstack along your axis 1.
a = np.array(data, dtype=object)  # ragged input: recent NumPy versions need dtype=object here
f = lambda axis: np.vstack(a[:, axis]).T
x,y = f(0), f(1)
>>> x
array([[0, 1, 2],
       [0, 0, 0],
       [3, 4, 5]])
>>> y
array([[0, 1, 2],
       [1, 1, 1]])
Comparing the best elements of all previously proposed methods, I think it's best as follows*:
def split(data):
    x, y = zip(*data)    # splits the list into two tuples of 1-D arrays, x and y
    x = np.vstack(x).T   # stacks the arrays in x as rows, then transposes so they become columns
    y = np.vstack(y).T   # same for y
    return (x, y)
* this is a snippet of my code
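As a sketch of a variant that avoids both zip(*) and the explicit transpose, np.column_stack places each 1-D array as a column directly. The function name split_columns is only illustrative, and the data is the list from the question.

```python
import numpy as np

data = [(np.array([0, 0, 3]), np.array([0, 1])),
        (np.array([1, 0, 4]), np.array([1, 1])),
        (np.array([2, 0, 5]), np.array([2, 1]))]

def split_columns(data):
    # column_stack turns each 1-D array into one column of the result.
    x = np.column_stack([pair[0] for pair in data])
    y = np.column_stack([pair[1] for pair in data])
    return x, y

x, y = split_columns(data)
print(x)
print(y)
```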

generating matrix of pairs from two object vectors using numpy

I have two object arrays not necessarily of the same length:
import numpy as np
class Obj_A:
    def __init__(self, n):
        self.type = 'a' + str(n)
    def __eq__(self, other):
        return self.type == other.type
class Obj_B:
    def __init__(self, n):
        self.type = 'b' + str(n)
    def __eq__(self, other):
        return self.type == other.type
a = np.array([Obj_A(n) for n in range(2)])
b = np.array([Obj_B(n) for n in range(3)])
I would like to generate the matrix
mat = np.array([[[a[0],b[0]],[a[0],b[1]],[a[0],b[2]]],
                [[a[1],b[0]],[a[1],b[1]],[a[1],b[2]]]])
this matrix has shape (len(a),len(b),2). Its elements are
mat[i,j] = [a[i],b[j]]
A solution is
mat = np.empty((len(a), len(b), 2), dtype='object')
for i, aa in enumerate(a):
    for j, bb in enumerate(b):
        mat[i, j] = np.array([aa, bb], dtype='object')
but this is too expensive for my problem, where len(a) and len(b) are both on the order of 1e5.
I suspect there is a clean numpy solution involving np.repeat, np.tile and np.transpose, similar to the accepted answer here, but the output in this case does not simply reshape to the desired result.
I would suggest using np.meshgrid(), which takes two input arrays and repeats both along different axes so that looking at corresponding positions of the outputs gets you all possible combinations. For example:
>>> x, y = np.meshgrid([1, 2, 3], [4, 5])
>>> x
array([[1, 2, 3],
[1, 2, 3]])
>>> y
array([[4, 4, 4],
[5, 5, 5]])
In your case, you can put the two arrays together and transpose them into the proper configuration. Based on some experimentation I think this should work for you:
>>> np.transpose(np.meshgrid(a, b), (2, 1, 0))
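Another way to fill the same (len(a), len(b), 2) layout, sketched here under the assumption that a plain object array is acceptable, is to allocate the result once and let broadcasting fill each slot. The Item class is a hypothetical stand-in for Obj_A and Obj_B.

```python
import numpy as np

class Item:
    """Hypothetical stand-in for the question's Obj_A / Obj_B."""
    def __init__(self, name):
        self.name = name

a = np.array([Item(f"a{n}") for n in range(2)], dtype=object)
b = np.array([Item(f"b{n}") for n in range(3)], dtype=object)

mat = np.empty((len(a), len(b), 2), dtype=object)
mat[..., 0] = a[:, None]   # a[i] is repeated along axis 1
mat[..., 1] = b[None, :]   # b[j] is repeated along axis 0

print(mat.shape)
print(mat[1, 2, 0].name, mat[1, 2, 1].name)
```

Whichever construction is used, a full (1e5, 1e5, 2) object array holds 2e10 references, which is roughly 160 GB at 8 bytes per reference, so at that size it may be more practical to index a and b on demand than to materialize every pair.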

How do I apply some function to a python meshgrid?

Say I want to calculate a value for every point on a grid. I would define some function func that takes two values x and y as parameters and returns a third value. In the example below, calculating this value requires a look-up in an external dictionary. I would then generate a grid of points and evaluate func on each of them to get my desired result.
The code below does precisely this, but in a somewhat roundabout way. First I reshape both the X and Y coordinate matrices into one-dimensional arrays, calculate all the values, and then reshape the result back into a matrix. My question is, can this be done in a more elegant manner?
import collections as c
import numpy as np
# some arbitrary lookup table
a = c.defaultdict(int)
a[1] = 2
a[2] = 3
a[3] = 2
a[4] = 3
def func(x, y):
    # some arbitrary function
    return a[x] + a[y]
X, Y = np.mgrid[1:3, 1:4]
X = X.T
Y = Y.T
Z = np.array([func(x, y) for (x, y) in zip(X.ravel(), Y.ravel())]).reshape(X.shape)
print(Z)
The purpose of this code is to generate a set of values that I can use with pcolor in matplotlib to create a heatmap-type plot.
I'd use numpy.vectorize to "vectorize" your function. Note that despite the name, vectorize is not intended to make your code run faster, just to simplify it a bit.
Here are some examples:
>>> import numpy as np
>>> @np.vectorize
... def foo(a, b):
...     return a + b
...
>>> foo([1,3,5], [2,4,6])
array([ 3, 7, 11])
>>> foo(np.arange(9).reshape(3,3), np.arange(9).reshape(3,3))
array([[ 0, 2, 4],
[ 6, 8, 10],
[12, 14, 16]])
With your code, it should be enough to decorate func with np.vectorize and then you can probably just call it as func(X, Y) -- no raveling or reshaping necessary:
import numpy as np
import collections as c
# some arbitrary lookup table
a = c.defaultdict(int)
a[1] = 2
a[2] = 3
a[3] = 2
a[4] = 3
@np.vectorize
def func(x, y):
    # some arbitrary function
    return a[x] + a[y]
X,Y = np.mgrid[1:3, 1:4]
X = X.T
Y = Y.T
Z = func(X, Y)
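If speed matters, one possible alternative is to turn the dictionary into a NumPy lookup array and use integer indexing, which avoids calling a Python function per grid point. This is only a sketch and assumes the keys are small non-negative integers; the name lut is illustrative.

```python
import numpy as np

# The same arbitrary lookup table as above, as a plain dict.
a = {1: 2, 2: 3, 3: 2, 4: 3}

# Dense lookup array: lut[k] == a[k], and 0 for keys in range that are missing
# (mirroring the defaultdict(int) default).
lut = np.zeros(max(a) + 1, dtype=int)
for key, value in a.items():
    lut[key] = value

X, Y = np.mgrid[1:3, 1:4]
X = X.T
Y = Y.T

# Integer-array indexing looks up every grid point at once.
Z = lut[X] + lut[Y]
print(Z)
```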
