Numpy 3D array to 2D row major array - python

I have a numpy array with dimensions (28, 28, 60000), containing 60000 28x28 images, represented as pixel brightness. I'm trying to transform it so that I have a 60000 x 784 array, with the 784 representing the original 28x28 image in row major format. How do I do this? I'm assuming I use numpy.reshape, but I'm unsure as to how it rearranges things. Example:
[[1,2],
 [3,4]]
...
[[5,6],
 [7,8]]
->
[[1,2,3,4],
 [5,6,7,8]]

This code:
import numpy
a = numpy.array([[[1,2],[3,4]],[[5,6],[7,8]]])
print(numpy.reshape(a, (2,4)))
Returns:
[[1 2 3 4]
[5 6 7 8]]

Try experimenting with something like this:
import numpy as np
a = np.arange(6).reshape((3, 2))
b = np.reshape(a, (1, 6))
print(a)
print(b)
which gives:
a =
[[0 1]
 [2 3]
 [4 5]]
b = [[0 1 2 3 4 5]]
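Note that in the snippets above the flattened blocks are indexed by the first axis. In the (28, 28, 60000) layout from the question the image index is the last axis, so a bare reshape to (60000, 784) would interleave pixels from different images. A minimal sketch, assuming image k is stored as a[:, :, k] (the small sizes h, w, n are stand-ins for 28, 28, 60000): move the image axis to the front first, then reshape.

```python
import numpy as np

# Stand-in for the (28, 28, 60000) array: 4 "images" of size 2x3,
# stacked along the last axis (assumed layout: image k is a[:, :, k]).
h, w, n = 2, 3, 4
a = np.arange(h * w * n).reshape(h, w, n)

# Bring the image axis to the front -> shape (n, h, w), then flatten
# each image in row-major (C) order -> shape (n, h * w).
flat = np.moveaxis(a, -1, 0).reshape(n, h * w)

# Row k of `flat` is image k read row by row.
print(flat.shape)
print(np.array_equal(flat[1], a[:, :, 1].ravel()))
```

For the real data this would read np.moveaxis(a, -1, 0).reshape(60000, 784); a.transpose(2, 0, 1).reshape(-1, 784) should be equivalent.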

Related

Generalized version of np.roll

I have a 2D array
a = np.array([[0,1,2,3],[4,5,6,7]])
that is a 2x4 array. I need to shift the elements of each of the two rows (the subarrays along axis 0) by a different number of steps, say 1 for the first and 2 for the second, so that the output will be
np.array([[1,2,3,0],[6,7,4,5]])
This doesn't seem possible with np.roll; at least, I don't see any useful hint in the documentation. Is there another function that does this?
This is an attempt at a generalized version of numpy.roll.
import numpy as np
a = np.array([[0,1,2,3],[4,5,6,7]])
def roll(a, shifts, axis):
    # One shift amount is required per slice along `axis`
    assert a.shape[axis] == len(shifts)
    return np.stack([
        np.roll(np.take(a, i, axis), shifts[i]) for i in range(len(shifts))
    ], axis)
print(a)
print(roll(a, [-1, -2], 0))
print(roll(a, [1, 2, 1, 0], 1))
prints
[[0 1 2 3]
[4 5 6 7]]
[[1 2 3 0]
[6 7 4 5]]
[[4 1 6 3]
[0 5 2 7]]
Here, the parameter a is a numpy.array, shifts is an iterable containing one shift amount per slice, and axis is the axis whose slices are rolled individually (with axis=0, each row gets its own shift). Note that this was only tested on two-dimensional arrays.
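A loop-free variant for the two-dimensional, per-row case can be sketched with index arithmetic. This is only an illustration: roll_rows is a hypothetical helper name, and it relies on the identity np.roll(row, s)[j] == row[(j - s) % len(row)].

```python
import numpy as np

def roll_rows(a, shifts):
    """Roll each row of a 2-D array by its own shift (hypothetical helper)."""
    a = np.asarray(a)
    shifts = np.asarray(shifts)
    n_cols = a.shape[1]
    # For output column j of a row shifted by s, the source column is (j - s) % n_cols
    cols = (np.arange(n_cols)[None, :] - shifts[:, None]) % n_cols
    return np.take_along_axis(a, cols, axis=1)

a = np.array([[0, 1, 2, 3], [4, 5, 6, 7]])
rolled = roll_rows(a, [-1, -2])
print(rolled)
```

To shift columns instead, the same helper can be applied to a.T and the result transposed back.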

Get part of np array with parameters

I am using Python and NumPy with an n-dimensional array.
I want to select all elements with an index like
arr[a,b,:,c]
but I want to be able to pass the position of the slice as a parameter. For example:
#pos =2
arr[a,b,:,c]
#pos =1
arr[a,:,b,c]
I would move the axis of interest (at pos) to the front with numpy.moveaxis(array, pos, 0) and then simply index with [:, a, b, c].
There is also numpy.take, but in your case you would still need to loop over each dimension a, b, c, so I think moveaxis is more convenient. Maybe there is an even more direct way to do this.
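The moveaxis idea can be sketched as follows. The function name select_along and the index values are made up for illustration, and a, b, c are assumed to be scalars.

```python
import numpy as np

arr = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
a, b, c = 1, 2, 3  # example scalar indices (hypothetical values)

def select_along(arr, pos, a, b, c):
    # Move the axis to keep to the front, then index the remaining axes.
    return np.moveaxis(arr, pos, 0)[:, a, b, c]

print(select_along(arr, 2, a, b, c).shape)  # same elements as arr[a, b, :, c]
print(select_along(arr, 1, a, b, c).shape)  # same elements as arr[a, :, b, c]
```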
The idea of moving the slicing axis to one end is a good one. Various numpy functions use that idea.
In [171]: arr = np.ones((2,3,4,5),int)
In [172]: arr[0,0,:,0].shape
Out[172]: (4,)
In [173]: arr[0,:,0,0].shape
Out[173]: (3,)
Another idea is to build an indexing tuple:
In [176]: idx = (0,0,slice(None),0)
In [177]: arr[idx].shape
Out[177]: (4,)
In [178]: idx = (0,slice(None),0,0)
In [179]: arr[idx].shape
Out[179]: (3,)
To do this programmatically it may be easier to start with a list or array that can be modified, and then convert it to a tuple for indexing. Details will vary depending on how you prefer to specify the axis and variables.
If any of a,b,c are arrays (or lists), you may get some shape surprises, since it's a case of mixing advanced and basic indexing. But as long as they are scalars, that's not an issue.
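One way to build such an indexing tuple programmatically might look like this. It is a sketch only: select_with_tuple is a hypothetical helper, and the other indices are assumed to be scalars given in axis order.

```python
import numpy as np

def select_with_tuple(arr, pos, others):
    # Start from a mutable list of the scalar indices, insert a full
    # slice at the requested position, then convert to a tuple.
    idx = list(others)
    idx.insert(pos, slice(None))
    return arr[tuple(idx)]

arr = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
print(select_with_tuple(arr, 2, (1, 2, 3)))  # like arr[1, 2, :, 3]
print(select_with_tuple(arr, 1, (1, 2, 3)))  # like arr[1, :, 2, 3]
```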
You could np.transpose the array arr before slicing it, so that the axis of interest (i.e. the :) is moved "to the back". This way you can rearrange arr such that you can always call arr[a,b,c], with the trailing axis returned in full.
Example with only a and b:
import numpy as np
a = 0
b = 2
target_axis = 1
# Generate some random data
arr = np.random.randint(10, size=[3, 3, 3], dtype=int)
print(arr)
#[[[0 8 2]
# [3 9 4]
# [0 3 6]]
#
# [[8 5 4]
# [9 8 5]
# [8 6 1]]
#
# [[2 2 5]
# [5 3 3]
# [9 1 8]]]
# Define transpose s.t. target_axis is the last axis
transposed_shape = np.arange(arr.ndim)
transposed_shape = np.delete(transposed_shape, target_axis)
transposed_shape = np.append(transposed_shape, target_axis)
print(transposed_shape)
#[0 2 1]
# Caution! These 0 and 2 above do not come from a or b.
# Instead they are the indices of the axes.
# Transpose arr
arr_T = np.transpose(arr, transposed_shape)
print(arr_T)
#[[[0 3 0]
# [8 9 3]
# [2 4 6]]
#
# [[8 9 8]
# [5 8 6]
# [4 5 1]]
#
# [[2 5 9]
# [2 3 1]
# [5 3 8]]]
print(arr_T[a,b])
#[2 4 6]

best matrix full of same number

I would like to create an n x n matrix, with n being a large number, in which every entry is the same number. What is the fastest way to do this in Sage?
For example, for n = 3 I would like [(3,3,3),(3,3,3),(3,3,3)].
I currently do this with
ones_matrix(n) * somenumber
However, this takes a while for big n. Is there a faster way to do this in Sage?
Thanks for helping.
You can use the numpy.full() function like so:
>>> import numpy as np
>>> arr = np.full((3, 3), 3)
>>> arr
array([[3, 3, 3],
       [3, 3, 3],
       [3, 3, 3]])
>>> arr2 = np.full((3, 3), 7)
>>> arr2
array([[7, 7, 7],
       [7, 7, 7],
       [7, 7, 7]])
A shortcut way would be:
n = int(input())
tup = (n,) * n
# gives (n, n, ..., n), n times
ar = list((tup,) * n)
# gives ar = [(n, n, ..., n), (n, n, ..., n), ..., (n, n, ..., n)]
or, in a single stroke:
ar = list(((n,) * n,) * n)
If you want to work with Sage matrices, you can do this:
sage: M = MatrixSpace(ZZ, 10000, 10000)
sage: a = M.matrix([3]*10000*10000)
On my computer, this takes about 6 seconds, the same time as ones_matrix(10000), faster than 3 * ones_matrix(10000). Not nearly as fast as the numpy solution, but then the result is a Sage matrix. Note that if you want to work with noninteger entries, you should change the ZZ to the appropriate ring.

How does slice indexing work in numpy array

Suppose we have an array
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
Now I have below
row_r1 = a[1, :]
row_r2 = a[1:2, :]
print(row_r1.shape)
print(row_r2.shape)
I don't understand why row_r1.shape is (4,) and row_r2.shape is (1,4).
Shouldn't both shapes be (4,)?
I like to think of it this way. The first form, a[1, :], says "get me all values in row 1":
Returning:
array([5, 6, 7, 8])
shape
(4,) Four values in a one-dimensional numpy array.
Whereas the second form, a[1:2, :], says "get me the slice of rows from index 1 up to (but not including) index 2":
Returning:
array([[5, 6, 7, 8]]) Note: the double brackets
shape
(1,4) Four values in one row of a two-dimensional np.array.
Their shapes are different because they aren't the same thing. You can verify by printing them:
import numpy as np
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
row_r1 = a[1, :]
row_r2 = a[1:2, :]
print("{} is shape {}".format(row_r1, row_r1.shape))
print("{} is shape {}".format(row_r2, row_r2.shape))
Yields:
[5 6 7 8] is shape (4,)
[[5 6 7 8]] is shape (1, 4)
This is because an integer index removes that axis from the result, whereas a slice keeps it (here with length 1). You can, however, give them the same shape using the .resize() method available on numpy arrays.
The code:
import numpy as np
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
row_r1 = a[1, :]
row_r2 = a[1:2, :]
print("{} is shape {}".format(row_r1, row_r1.shape))
print("{} is shape {}".format(row_r2, row_r2.shape))
# Now resize row_r1 to be the same shape
row_r1.resize((1, 4))
print("{} is shape {}".format(row_r1, row_r1.shape))
print("{} is shape {}".format(row_r2, row_r2.shape))
Yields
[5 6 7 8] is shape (4,)
[[5 6 7 8]] is shape (1, 4)
[[5 6 7 8]] is shape (1, 4)
[[5 6 7 8]] is shape (1, 4)
Showing that you are in fact now dealing with the same shaped object. Hope this helps clear it up!
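As an alternative to the in-place .resize(), the same (1, 4) shape can be obtained without modifying row_r1 itself. A small sketch using reshape and np.newaxis (the variable names are chosen here for illustration):

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

row_int = a[1, :]      # integer index: axis 0 is dropped
row_slice = a[1:2, :]  # slice: axis 0 is kept with length 1

# Two non-mutating ways to turn the 1-D row into a (1, 4) array
as_2d_reshape = row_int.reshape(1, -1)
as_2d_newaxis = row_int[np.newaxis, :]

print(row_int.shape, row_slice.shape)
print(as_2d_reshape.shape, as_2d_newaxis.shape)
```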

which numpy command could I use to subtract vectors with different dimensions many times?

I have to write this function:
in which x is an array with dimensions [150,2] and c is [N,2] (let's suppose N=20). From the components of each sample (xi1, xi2) I have to subtract the components of every row of c, in this way: ([x11-c11, x12-c12]) ... ([x11-cN1, x12-cN2]), for all 150 samples.
I've transformed them so that they have the same dimensions and I can subtract them, but the result of the function should be a vector. How can I write this in numpy?
Thank you
Ok, let's suppose x has shape (5,2) and c has shape (3,2)
This is what I obtained by transforming the dimensions of the two arrays. The problem is that I have to do this with a "for" loop, because the exp function should give me a vector as a result, so I have to obtain a sort of matrix divided into N blocks.
From what I understand of the issue, the problem seems to be in the way you are calculating the vector norm, not in the subtraction. Using your example, but calculating exp(-||x-c||), try:
import numpy as np
x = np.linspace(8,17,10).reshape((5,2))
c = np.linspace(1,6,6).reshape((3,2))
sub = np.linalg.norm(x[:,None] - c, axis=-1)
np.exp(-sub)
array([[ 5.02000299e-05, 8.49325705e-04, 1.43695961e-02],
[ 2.96711024e-06, 5.02000299e-05, 8.49325705e-04],
[ 1.75373266e-07, 2.96711024e-06, 5.02000299e-05],
[ 1.03655678e-08, 1.75373266e-07, 2.96711024e-06],
[ 6.12664624e-10, 1.03655678e-08, 1.75373266e-07]])
np.exp(-sub).shape
(5, 3)
numpy.linalg.norm will try to return some kind of matrix norm across all the dimensions of its input unless you tell it explicitly which axis represents the vector components.
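To see what the broadcasting does at the sizes from the question, here is a sketch with random stand-in data (the real x and c are not given, so the values and variable names are assumptions). It compares the broadcast result with an explicit double loop computing exp(-||x_i - c_j||).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(150, 2))  # 150 samples (stand-in data)
c = rng.normal(size=(20, 2))   # N = 20 rows of c (stand-in data)

# (150, 1, 2) - (1, 20, 2) broadcasts to (150, 20, 2):
# diff[i, j] is the vector x[i] - c[j]
diff = x[:, None, :] - c[None, :, :]
dist = np.linalg.norm(diff, axis=-1)  # norm over the 2 components -> (150, 20)
result = np.exp(-dist)

# Reference computation with explicit loops
expected = np.empty((150, 20))
for i in range(150):
    for j in range(20):
        expected[i, j] = np.exp(-np.sqrt(np.sum((x[i] - c[j]) ** 2)))

print(diff.shape, result.shape)
print(np.allclose(result, expected))
```

Each row result[i] is then the vector of N values for sample i.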
If I understand correctly, check whether this gives the expected result; note, though, that the result still has the same shape as x:
import numpy as np
x = np.arange(10).reshape(5,2)
c = np.arange(6).reshape(3,2)
# Sum of each column of c
c_col_sum = np.sum(c, axis=0)
# Subtract every row of c from each element of x, accumulated in place
for (h,k), value in np.ndenumerate(x):
    x[h,k] = c.shape[0] * x[h,k] - c_col_sum[k]
Initially x is:
[[0 1]
[2 3]
[4 5]
[6 7]
[8 9]]
And c is:
[[0 1]
[2 3]
[4 5]]
After the function x becomes:
[[-6 -6]
[ 0 0]
[ 6 6]
[12 12]
[18 18]]
