Let's say I have two NumPy arrays:
import numpy as np
x = np.array([1,2,3])
y = np.array([1,2,3,4])
With this, I want to create a 2-dimensional array as below
Is there any method available to directly achieve this?
Your problem is about computing the Cartesian product. In NumPy, you can build it using repeat and tile:
out = np.c_[np.repeat(x, len(y)), np.tile(y, len(x))]
Python's built-in itertools module has a function designed for this, product:
from itertools import product
out = np.array(list(product(x,y)))
Output:
array([[1, 1],
[1, 2],
[1, 3],
[1, 4],
[2, 1],
[2, 2],
[2, 3],
[2, 4],
[3, 1],
[3, 2],
[3, 3],
[3, 4]])
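As a further NumPy-only option, here is a minimal sketch that builds the same pairs with np.meshgrid. The names xx, yy and pairs are just illustrative, and this is one possible approach rather than the canonical one.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([1, 2, 3, 4])

# indexing="ij" gives xx[i, j] == x[i] and yy[i, j] == y[j]
xx, yy = np.meshgrid(x, y, indexing="ij")

# Stack the two grids on a new last axis, then flatten to one (x, y) pair per row
pairs = np.stack([xx, yy], axis=-1).reshape(-1, 2)

print(pairs.shape)  # (12, 2)
print(pairs[:5])
```

The rows come out in the same order as the repeat/tile version, because the reshape walks the grid row by row.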
Related
Is there any way to add two NumPy arrays of different lengths in a Cartesian fashion without iterating over the rows of a? See example below.
a = np.array([[1, 2], [3, 4]])
b = np.array([[1, 1], [2, 2], [3, 3]])
c = dec_sum(a, b) # c = np.array([[[2, 3], [3, 4], [4, 5]], [[4, 5], [5, 6], [6, 7]]])
Given a 2x2 NumPy array a and a 3x2 NumPy array b, c = dec_sum(a, b) should be a 2x3x2 array.
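One way this could be done without an explicit loop is broadcasting. The sketch below assumes the intended meaning is c[i, j] = a[i] + b[j]; the body of dec_sum is a guess at what the asker wants, not an existing NumPy function.

```python
import numpy as np

def dec_sum(a, b):
    # a[:, None, :] has shape (n, 1, k); b[None, :, :] has shape (1, m, k).
    # Broadcasting expands both to (n, m, k), so c[i, j] == a[i] + b[j].
    return a[:, None, :] + b[None, :, :]

a = np.array([[1, 2], [3, 4]])
b = np.array([[1, 1], [2, 2], [3, 3]])
c = dec_sum(a, b)

print(c.shape)  # (2, 3, 2)
print(c)
```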
I have a NumPy array of XY coordinates like the one below:
2d_coords = [
[1,2]
[1,1]
[2,1]
[3,1]
...
]
Either [1,1] or [1,2] needs to go (I don't care which one); only one point per X coordinate is allowed.
How can I do that?
numpy.unique would be helpful. For example,
import numpy as np
l = np.asarray([
[1, 2],
[1, 1],
[2, 1],
[3, 1],
])
_, unique_indices = np.unique(l[:, 0], return_index=True) # get the indices with unique x coordinates
print(l[unique_indices])
The example output:
[[1 2]
[2 1]
[3 1]]
You can use NumPy and matplotlib:
import numpy as np
import matplotlib.pyplot as plt
coords = np.array([[1, 2], [1, 1], [2, 1], [3, 1]])
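# return_index=True yields row indices; plain np.unique would return x values, not indices
plot_coords = coords[np.unique(coords[:, 0], return_index=True)[1]].T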
plt.plot(plot_coords[0], plot_coords[1])
plt.show()
What about pandas?
import pandas as pd
pd.DataFrame(coords).drop_duplicates(subset=0).values
array([[1, 2],
[2, 1],
[3, 1]])
Without using any external library, you can use a conditional list comprehension:
d_coords = [[1,2],[1,1],[2,1],[3,1]]
new_list = [d_coords[i] for i in range(len(d_coords)) if d_coords[i][0] not in [k[0] for k in d_coords[:i]]]
# new_list: [[1, 2], [2, 1], [3, 1]]
NOTE: Python variable names cannot start with a digit, which is why 2d_coords is written as d_coords above.
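The comprehension above rescans the earlier points for every element. For longer lists, a sketch along the following lines keeps the first point seen for each x value in a single pass; seen_x is a helper name introduced here for illustration.

```python
d_coords = [[1, 2], [1, 1], [2, 1], [3, 1]]

seen_x = set()  # x values that already have a point in the result
new_list = []
for point in d_coords:
    if point[0] not in seen_x:
        seen_x.add(point[0])
        new_list.append(point)

print(new_list)  # [[1, 2], [2, 1], [3, 1]]
```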
I am learning Python and solving a machine learning problem.
class_ids=np.arange(self.x.shape[0])
np.random.shuffle(class_ids)
self.x=self.x[class_ids]
This is a shuffle function written with NumPy, but I can't understand what self.x=self.x[class_ids] means, because I thought it just assigns the value of the array to a variable.
It's a roundabout way to shuffle self.x along its first axis. For example:
>>> x = np.array([[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]])
>>> x
array([[1, 1],
[2, 2],
[3, 3],
[4, 4],
[5, 5]])
Then using the mentioned approach
>>> class_ids=np.arange(x.shape[0]) # create an array [0, 1, 2, 3, 4]
>>> np.random.shuffle(class_ids) # shuffle the array
>>> x[class_ids] # use integer array indexing to shuffle x
array([[5, 5],
[3, 3],
[1, 1],
[4, 4],
[2, 2]])
Note that the same could be achieved just by calling np.random.shuffle on the array itself (it shuffles in place), because the docstring explicitly mentions:
This function only shuffles the array along the first axis of a multi-dimensional array. The order of sub-arrays is changed but their contents remains the same.
>>> np.random.shuffle(x)
>>> x
array([[5, 5],
[3, 3],
[1, 1],
[2, 2],
[4, 4]])
or by using np.random.permutation:
>>> class_ids = np.random.permutation(x.shape[0]) # a shuffled array of the first-axis indices
>>> x[class_ids]
array([[2, 2],
[4, 4],
[3, 3],
[5, 5],
[1, 1]])
Assuming self.x is a numpy array:
class_ids is a 1-D NumPy array that is used as an integer array index in the expression self.x[class_ids]. Because the previous line shuffled class_ids, self.x[class_ids] evaluates to a copy of self.x with its rows shuffled.
The assignment self.x=self.x[class_ids] then rebinds self.x to that shuffled array.
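One plausible reason to shuffle through an index array instead of shuffling in place is that the same indices can reorder a second array identically. The sketch below assumes a hypothetical label array y that is not in the original code, and uses a seeded generator only so the run is repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable

x = np.array([[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]])
y = np.array([10, 20, 30, 40, 50])  # hypothetical labels; y[i] belongs to x[i]

class_ids = rng.permutation(x.shape[0])  # shuffled row indices
x_shuffled = x[class_ids]  # integer array indexing returns a new array
y_shuffled = y[class_ids]  # same indices, so rows and labels stay paired

# Each row still lines up with its label after shuffling
print(np.array_equal(x_shuffled[:, 0] * 10, y_shuffled))  # True
```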
Is there a function in numpy/scipy to over-sample a 2D numpy array?
example:
>>> x = [[1,2],
[3,4]]
>>>
>>> y = oversample(x, (2, 3))
would return
y = [[1,1,2,2],
[1,1,2,2],
[1,1,2,2],
[3,3,4,4],
[3,3,4,4],
[3,3,4,4]]
At the moment I've implemented my own function:
index_x = np.arange(newdim) / olddim
index_y = np.arange(newdim) / olddim
xx, yy = np.meshgrid(index_x, index_y)
return x[yy, xx, ...]
but it doesn't look like the best way, as it only works for 2D arrays and is a bit slow...
Any suggestions?
Thank you very much
EDIT: Didn't see the comment until after posting; delete if needed.
Original
Check np.repeat to repeat patterns. Shown verbosely:
>>> import numpy as np
>>> a = np.array([[1,2],[3,4]])
>>> a
array([[1, 2],
[3, 4]])
>>> b=a.repeat(3,axis=0)
>>> b
array([[1, 2],
[1, 2],
[1, 2],
[3, 4],
[3, 4],
[3, 4]])
>>> c = b.repeat(2,axis=1)
>>> c
array([[1, 1, 2, 2],
[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4],
[3, 3, 4, 4]])
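The two repeat calls above can be wrapped into a small helper that handles any number of axes. This is a sketch rather than a library function: the name oversample comes from the question, and the convention that factors[i] is the repeat count along axis i is an assumption made here (so the 6x4 result in the question corresponds to factors (3, 2)).

```python
import numpy as np

def oversample(x, factors):
    # Assumed convention: factors[i] is how many times to repeat along axis i
    x = np.asarray(x)
    for axis, factor in enumerate(factors):
        x = np.repeat(x, factor, axis=axis)
    return x

y = oversample([[1, 2], [3, 4]], (3, 2))
print(y.shape)  # (6, 4)
print(y)
```

For the 2-D case, np.kron(a, np.ones((3, 2), dtype=int)) produces the same array.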
I have two arrays of the form:
a = np.array([1,2,3])
b = np.array([4,5,6])
Is there a NumPy function which I can apply to these arrays to get the following output?
[[1,4],[2,5],[3,6]]
np.vstack((a,b)).T
returns
array([[1, 4],
[2, 5],
[3, 6]])
and
np.vstack((a,b)).T.tolist()
returns exactly what you need:
[[1, 4], [2, 5], [3, 6]]
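Two other NumPy calls give the same pairing without the transpose. A short sketch, where out and same are illustrative names:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# column_stack treats each 1-D input as one column of the result
out = np.column_stack((a, b))

# np.stack with axis=1 builds the same (3, 2) array
same = np.stack((a, b), axis=1)

print(out.tolist())  # [[1, 4], [2, 5], [3, 6]]
```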