I have been trying to concatenate two 1D arrays using np.concatenate but it doesn't work as expected. Can someone please let me know where I'm making a mistake?
My code is as follows:
x = np.array([1.13793103, 0.24137931, 0.48275862, 1.24137931, 1.00000000, 1.89655172])
y = np.array([0.03666667, 0.00888889, 0.01555556, 0.04 , 0.03222222, 0.06111111])
z = np.concatenate((x,y), axis=0)
print(z)
array([1.13793103, 0.24137931, 0.48275862, ... 0.04, 0.03222222, 0.06111111])
print(f'{type(x)} {type(y)} {type(z)}')
<class 'numpy.ndarray'> <class 'numpy.ndarray'> <class 'numpy.ndarray'>
print(f'{x.shape} {y.shape} {z.shape}')
(6,) (6,) (12,)
So, instead of adding y as a new array, it's joining the two arrays which isn't my intention. I am looking for something as follows:
array([1.13793103, 0.24137931, 0.48275862, 1.24137931, 1.00000000, 1.89655172],
[0.03666667, 0.00888889, 0.01555556, 0.04 , 0.03222222, 0.06111111])
You can use np.concatenate to concatenate along some axis if that dimension exists in the arrays that you want to concatenate:
x = np.array([1,2,3])
y = np.array([4,5,6])
here, x and y have shape (3,) so only one axis.
This means you can only concatenate along that axis (i.e. axis=0):
z = np.concatenate((x,y))
z.shape
out : (6,)
concatenating along axis=1 will throw an error:
z = np.concatenate((x,y), axis=1)
AxisError: axis 1 is out of bounds for array of dimension 1
You can make np.concatenate work if you reshape x and y:
x, y = x.reshape(-1,1), y.reshape(-1,1)
Now both have shape (3,1) and can be concatenated along axis 1:
z = np.concatenate((x, y), axis=1)
z.shape
(3, 2)
Alternatively, you can reshape to (1,3) and concatenate along axis 0:
z = np.concatenate((x.reshape(1,-1),y.reshape(1,-1)),axis=0)
z.shape
(2, 3)
Or use np.vstack, which does not require the reshaping.
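A minimal sketch of the stacking route, using the six-element arrays from the question. np.stack is not mentioned in the answer; it is included here as an assumption about what reads most directly, since it creates the new axis itself.

```python
import numpy as np

x = np.array([1.13793103, 0.24137931, 0.48275862, 1.24137931, 1.0, 1.89655172])
y = np.array([0.03666667, 0.00888889, 0.01555556, 0.04, 0.03222222, 0.06111111])

# vstack treats each 1-D array as one row of the result.
rows = np.vstack((x, y))

# stack inserts a new axis at the position given by `axis`.
stacked_rows = np.stack((x, y), axis=0)
stacked_cols = np.stack((x, y), axis=1)

print(rows.shape)          # (2, 6)
print(stacked_rows.shape)  # (2, 6)
print(stacked_cols.shape)  # (6, 2)
```

The (2, 6) result is the layout the question asks for; the (6, 2) variant is the same data with one array per column.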
I have to write a Python function that computes the vector
r_n = A xn - (xn^T A xn) xn
where A is n by n and xn is n by 1.
I'm using numpy, but .T doesn't work on vectors, and when I just write
r_n = A@xn - (xn@A@xn)@xn
the term xn@A@xn gives me a scalar.
I've tried swapping A and xn, but nothing seems to work.
Creating a numpy array like this (which is 1-D with shape (3,), not 3x1)...
import numpy as np
a = np.array([1, 2, 3])
...and then attempting to take its transpose like this...
a_transpose = a.T
...will, confusingly, return this:
# [1 2 3]
If you want to define a (column) vector whose transpose you can meaningfully take, and get a row vector in return, you need to define it like this:
a = np.reshape(np.array([1, 2, 3]), (3, 1))
print(a)
# [[1]
# [2]
# [3]]
a_transpose = a.T
print(a_transpose)
# [[1 2 3]]
If you want to define a 1 x n array whose transpose you can take to get an n x 1 array, you can do it like this:
a = np.array([[1, 2, 3]])
and then get its transpose by calling a.T.
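The behaviour described above can be checked directly from the shapes. This is only a sketch; indexing with np.newaxis is an alternative to np.reshape that the answer does not use, shown here as an assumption about what is convenient.

```python
import numpy as np

a = np.array([1, 2, 3])      # 1-D, shape (3,)
same = a.T                   # transposing a 1-D array changes nothing

col = a[:, np.newaxis]       # column vector, shape (3, 1)
row = a[np.newaxis, :]       # row vector, shape (1, 3)

print(same.shape, col.shape, row.shape, col.T.shape)
```

Only the 2-D versions have two axes to swap, which is why .T is meaningful for col and row but not for a.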
If A is (n,n) and xn is (n,1):
A@xn - (xn@A@xn)@xn
(n,n)@(n,1) - ((n,1)@(n,n)@(n,1)) @ (n,1)
(n,1) error (1 does not match n)
If xn@A@xn gives a scalar, that's because xn has shape (n,); per the np.matmul docs, that is a 2d array multiplied with two 1d arrays:
(n,)@(n,n)@(n,) => (n,)@(n,) -> scalar
I think you want
(1,n) @ (n,n) @ (n,1) => (1,1)
Come to think of it, that (1,1) array should hold the same single value as the scalar.
Sample calculation; 1st with the (n,) shape:
In [6]: A = np.arange(1,10).reshape(3,3); x = np.arange(1,4)
In [7]: A@x
Out[7]: array([14, 32, 50]) # (3,3)@(3,)=>(3,)
In [8]: x@A@x # scalar
Out[8]: 228
In [9]: (x@A@x)@x
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[9], line 1
----> 1 (x@A@x)@x
ValueError: matmul: Input operand 0 does not have enough dimensions (has 0, gufunc core with signature (n?,k),(k,m?)->(n?,m?) requires 1)
matmul does not like to work with scalars. But we can use np.dot instead, or simply multiply:
In [10]: (x@A@x)*x
Out[10]: array([228, 456, 684]) # (3,)
In [11]: A@x - (x@A@x)*x
Out[11]: array([-214, -424, -634])
Change the array to (3,1):
In [12]: xn = x[:,None]; xn.shape
Out[12]: (3, 1)
In [13]: A@xn - (xn.T@A@xn)*xn
Out[13]:
array([[-214],
       [-424],
       [-634]]) # same numbers but in (3,1) shape
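The calculation above can be wrapped in one function that accepts either shape. This is a sketch, not the answerer's code: the name residual is hypothetical, and np.vdot is used on the assumption that the inputs are real-valued (it conjugates its first argument for complex data).

```python
import numpy as np

def residual(A, x):
    """Return A x - (x^T A x) x for x of shape (n,) or (n, 1)."""
    x = np.asarray(x)
    Ax = A @ x
    # np.vdot flattens both arguments, so it yields a scalar for either shape.
    quad = np.vdot(x, Ax)
    return Ax - quad * x

A = np.arange(1, 10).reshape(3, 3)
x = np.arange(1, 4)

r_flat = residual(A, x)           # shape (3,)
r_col = residual(A, x[:, None])   # shape (3, 1)
print(r_flat)
print(r_col.shape)
```

Multiplying by the scalar with * sidesteps the matmul restriction on zero-dimensional operands shown in the traceback.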
I've split an image up into 16 figures to plot regression and now I want to join it back together into one image.
I've written a for loop to do this but I'm having trouble understanding the advice from previous questions and where I'm going wrong. Please could someone explain why my input arrays do not have the same number of dimensions.
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate
from skimage import color  # assumed source of color.rgb2gray

allArrays = np.array([])
for i in range(len(a)):
    fig = plt.figure()
    ax = fig.add_axes([0.,0.,1.,1.])
    if np.amax(a[i]) > 0:
        # tile contains data: interpolate and plot the curve
        x, y = np.where(a[i]>0)
        f = interpolate.interp1d(y, x)
        xnew = np.linspace(min(y), max(y), num=40)
        ynew = f(xnew)
        plt.plot(xnew, ynew, '-')
        plt.ylim(256, 0)
        plt.xlim(0,256)
        fig.canvas.draw()
        X = np.array(fig.canvas.renderer._renderer)
        myArray = color.rgb2gray(X)
        print(myArray.shape)
        allArrays = np.concatenate([allArrays, myArray])
        print(allArrays.shape)
    else:
        # empty tile: render a blank figure of the same size
        plt.xlim(0,256)
        plt.ylim(0,256)
        fig.canvas.draw()
        X = np.array(fig.canvas.renderer._renderer)
        myArray = color.rgb2gray(X)
        print(myArray.shape)
        allArrays = np.concatenate([allArrays, myArray])
        print(allArrays.shape)
    i += 1
Output: myArray.shape (480, 640)
Error message: all the input arrays must have same number of dimensions
I'm sure it's really simple but I can't figure it out. Thanks.
In [226]: allArrays = np.array([])
In [227]: allArrays.shape
Out[227]: (0,)
In [228]: allArrays.ndim
Out[228]: 1
In [229]: myArray=np.ones((480,640))
In [230]: myArray.shape
Out[230]: (480, 640)
In [231]: myArray.ndim
Out[231]: 2
1 does not equal 2 in most worlds!
To concatenate with myArray on the default axis 0, allArrays would have to start as np.zeros((0,640), myArray.dtype). After n iterations it would grow to (n*480, 640).
In the linked answer, the new arrays are all 1d, so starting with shape (0,) is ok. But wim's answer is better - collect all arrays in a list, and do one concatenate at the end.
Repeated concatenate in a loop is hard to get right (you have to understand shapes and dimensions), and slower than list appends.
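The list-then-concatenate approach might look like the sketch below. The frames are small stand-in arrays rather than rendered figures, and the 4x4 row-major grid layout is an assumption about how the 16 tiles were split; np.block is one way to reassemble such a grid.

```python
import numpy as np

# Stand-ins for the 16 grayscale frames; each is filled with its own index
# so the final layout is easy to check. Real frames would be (480, 640).
frames = [np.full((48, 64), i, dtype=float) for i in range(16)]

collected = []
for frame in frames:
    collected.append(frame)        # cheap list append inside the loop

# One concatenate at the end, along the default axis 0.
tall = np.concatenate(collected, axis=0)     # (16 * 48, 64)

# Keep the frames separate along a new leading axis instead.
stacked = np.stack(collected)                # (16, 48, 64)

# Reassemble a 4x4 grid, assuming row-major tile order.
grid = np.block([collected[r * 4:(r + 1) * 4] for r in range(4)])

print(tall.shape, stacked.shape, grid.shape)
```

Because every array in the list already has two dimensions, no empty starting array is needed and the dimension mismatch cannot arise.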
I have three numpy arrays:
X1.shape = (5000,)
X2.shape = (500,)
Y.shape = (5000,500)
I can run Y - X2 without a problem.
But Y - X1 results in:
ValueError: operands could not be broadcast together with shapes (5000,500) (5000,)
If I change to Y - X1[:,None] this seems to work while Y - X2[:,None] gives the error:
ValueError: operands could not be broadcast together with shapes (5000,500) (500,1)
Please clarify!
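These errors follow from NumPy's broadcasting rule: shapes are compared starting from the last axis, and each pair of sizes must be equal or one of them must be 1. A small sketch with the same sizes; the names per_row and per_col are hypothetical and chosen only to describe what each vector lines up with.

```python
import numpy as np

Y = np.zeros((5000, 500))
per_row = np.arange(5000.0)   # one value per row of Y, shape (5000,)
per_col = np.arange(500.0)    # one value per column of Y, shape (500,)

# (5000, 500) vs (500,): the trailing axes match, so this broadcasts.
ok_cols = Y - per_col

# (5000, 500) vs (5000, 1): the size-1 axis stretches to 500.
ok_rows = Y - per_row[:, None]

# (5000, 500) vs (5000,): the trailing sizes 500 and 5000 clash.
try:
    Y - per_row
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

Adding [:, None] moves a vector's length onto the first axis, which helps the 5000-element vector and breaks the 500-element one.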
In the context of unsupervised nearest neighbors with scikit-learn, I have implemented my own distance function to deal with my uncertain points (i.e. a point is represented as a normal distribution):
def my_mahalanobis_distance(x, y):
    '''
    x: array of shape (4,) x[0]: mu_x_1, x[1]: mu_x_2,
    x[2]: cov_x_11, x[3]: cov_x_22
    y: array of shape (4,) y[0]: mu_y_1, y[1]: mu_y_2,
    y[2]: cov_y_11, y[3]: cov_y_22
    '''
    # the covariance terms are the last two entries, per the docstring
    cov_inv = np.linalg.inv(np.diag(x[2:])+np.diag(y[2:]))
    return sp.spatial.distance.mahalanobis(x[:2], y[:2], cov_inv)
However, when I set my nearest neighbors:
nnbrs = NearestNeighbors(n_neighbors=1, metric='pyfunc', func=my_mahalanobis_distance)
nearest_neighbors = nnbrs.fit(X)
where X is a (N, 4) (n_samples, n_features) array, if I print x and y in my my_mahalanobis_distance, I get shapes of (10,) instead of (4,) as I would expect.
Example:
I add the following line to my_mahalanobis_distance:
print(x.shape)
Then in my main:
n_features = 4
n_samples = 10
# generate X array:
X = np.random.rand(n_samples, n_features)
nnbrs = NearestNeighbors(n_neighbors=1, metric='pyfunc', func=my_mahalanobis_distance)
nearest_neighbors = nnbrs.fit(X)
The result is:
(10,)
ValueError: shapes (2,) and (8,8) not aligned: 2 (dim 0) != 8 (dim 0)
I perfectly understand the error, but I do not understand why my x.shape is (10,) while my number of features is 4 in X.
I am using Python 2.7.10 and scikit-learn 0.16.1.
EDIT:
Replacing return sp.spatial.distance.mahalanobis(x[:2], y[:2], cov_inv) with return 1, just for testing, prints:
(10,)
(4,)
(4,)
(4,)
(4,)
(4,)
(4,)
(4,)
(4,)
(4,)
(4,)
So only the first call to my_mahalanobis_distance is wrong. Looking at the x and y values at this first iteration, my observations are:
x and y are identical
if I run my code multiple times, x and y are still identical, but their values have changed compared to the previous run.
these values seem to come from a numpy.random function.
I would conclude that such a first call is a debugging piece of code which has not been removed.
This is not an answer, yet too long for a comment. I can not reproduce the error.
Using:
Python 3.5.2 and
Sklearn 0.18.1
with the code:
from sklearn.neighbors import NearestNeighbors
import numpy as np
import scipy as sp
import scipy.spatial  # makes sp.spatial.distance available

def my_mahalanobis_distance(x, y):
    # the covariance terms are the last two entries of each point
    cov_inv = np.linalg.inv(np.diag(x[2:])+np.diag(y[2:]))
    print(x.shape)
    return sp.spatial.distance.mahalanobis(x[:2], y[:2], cov_inv)

n_features = 4
n_samples = 10
# generate X array:
X = np.random.rand(n_samples, n_features)
nnbrs = NearestNeighbors(n_neighbors=1, metric=my_mahalanobis_distance)
nearest_neighbors = nnbrs.fit(X)
The output is
(4,)
(4,)
(4,)
(4,)
(4,)
(4,)
(4,)
(4,)
(4,)
(4,)
I customized my_mahalanobis_distance to handle this issue:
import warnings

def my_mahalanobis_distance(x, y):
    '''
    x: array of shape (4,) x[0]: mu_x_1, x[1]: mu_x_2,
    x[2]: cov_x_11, x[3]: cov_x_22
    y: array of shape (4,) y[0]: mu_y_1, y[1]: mu_y_2,
    y[2]: cov_y_11, y[3]: cov_y_22
    '''
    if (x.size, y.size) == (4, 4):
        return sp.spatial.distance.mahalanobis(x[:2], y[:2],
                                               np.linalg.inv(np.diag(x[2:])
                                                             + np.diag(y[2:])))
    else:
        # to handle the buggy first call when calling NearestNeighbors.fit()
        warnings.warn('x and y are respectively of size %i and %i' % (x.size, y.size))
        return sp.spatial.distance.euclidean(x, y)
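Since both covariance matrices are diagonal here, the same distance can be computed without building or inverting a matrix. A sketch only, assuming the layout from the docstring (means in the first two slots, variances in the last two); diag_mahalanobis is a hypothetical name and the sample points are made up.

```python
import numpy as np

def diag_mahalanobis(x, y):
    """Mahalanobis distance between the means, using summed diagonal covariances."""
    diff = x[:2] - y[:2]
    var = x[2:] + y[2:]            # diagonal of cov_x + cov_y
    return np.sqrt(np.sum(diff ** 2 / var))

# Made-up points: unit variances, means 5 apart in Euclidean terms.
x = np.array([0.0, 0.0, 1.0, 1.0])
y = np.array([3.0, 4.0, 1.0, 1.0])
d = diag_mahalanobis(x, y)

# Reference value through the explicit inverse, as in the function above.
cov_inv = np.linalg.inv(np.diag(x[2:]) + np.diag(y[2:]))
delta = x[:2] - y[:2]
d_ref = np.sqrt(delta @ cov_inv @ delta)

print(d, d_ref)
```

Dividing by the summed variances is equivalent to multiplying by the inverse of a diagonal matrix, so the two values agree.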
I am a Matlab/Octave user. The NumPy documentation says that arrays are much more advisable to use than matrices. Is there a convenient way to deal with rank-1 arrays without constantly reshaping them?
Example:
data = np.loadtxt("ex1data1.txt", usecols=(0,1), delimiter=',',dtype=None)
X = data[:, 0]
y = data[:, 1]
m = len(y)
print X.shape, y.shape
>>> (97L, ) (97L, )
Without reshaping X, I can't add a new column to it using concatenate, vstack, or append; only np.c_ works, and it is slower:
X = np.concatenate((np.ones((m, 1)), X), axis = 1)
>>> ValueError: all the input arrays must have same number of dimensions
X - y also couldn't be done without reshaping y with np.reshape(y, (-1, 1)).
A simpler equivalent to np.reshape(y, (-1, 1)) is y[:, np.newaxis]. Since np.newaxis is an alias for None, y[:, None] also works. It's also worth mentioning np.expand_dims(y, axis=1).
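The four spellings can be compared side by side. The data below is a made-up stand-in for the file in the question, and np.column_stack is an addition not mentioned above: it accepts 1-D inputs and turns each into a column, so it may avoid the explicit reshape when adding a column of ones.

```python
import numpy as np

y = np.arange(5.0)                    # stand-in for the 1-D column from the file

col_a = np.reshape(y, (-1, 1))
col_b = y[:, np.newaxis]
col_c = y[:, None]
col_d = np.expand_dims(y, axis=1)
print(col_a.shape, col_b.shape, col_c.shape, col_d.shape)

# Adding a column of ones in front of a 1-D X without reshaping it first.
X = np.arange(5.0)
design = np.column_stack((np.ones(len(X)), X))
print(design.shape)
```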