slicing numpy array into two parts - python

I have a 2D numpy array, something like this:
[[1 2 3 4]
[4 5 6 7]
...]
Now I want to divide this into two parts.
Let's say the first numpy array has the first two rows,
and the second numpy array has the rest of the rows,
something like this:
B = [[1 2 3 4]
[4 5 6 7]]
C = [[rest of the elements]]
How do I do this?
Thanks

This is covered in the Indexing, Slicing, and Iterating portion of the tutorial:
>>> import numpy as np
>>> A = np.array([[1,2,3,4],[4,5,6,7],[7,8,9,10]])
>>> B = A[:2]
>>> C = A[2:]
>>> B
array([[1, 2, 3, 4],
[4, 5, 6, 7]])
>>> C
array([[ 7, 8, 9, 10]])
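One point the slicing answer does not spell out is that basic slices are views on the original array. The sketch below reuses the array from the answer; np.split is shown only as an alternative spelling of the same split, and the name B_independent is made up for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3, 4],
              [4, 5, 6, 7],
              [7, 8, 9, 10]])

# np.split with a list of indices cuts along axis 0 before row 2,
# which gives the same two pieces as A[:2] and A[2:].
B, C = np.split(A, [2])
print(B.shape, C.shape)  # (2, 4) (1, 4)

# Take an independent copy before modifying anything.
B_independent = A[:2].copy()

# B is a view, so writing to it also changes A.
B[0, 0] = 100
print(A[0, 0])              # 100
print(B_independent[0, 0])  # 1
```

If the two parts must not affect the original array, calling .copy() on each slice is the usual way to decouple them.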

Related

How to modify every third element in matrix?

I had to make a matrix using numpy.array method. How can I now update every third element of my matrix? I have made a for loop for the problem but that is not the optimal solution. Is there a way to avoid loops? For example if I have this matrix:
matrix = np.array([[1,2,3,4],
[5,6,7,8],
[4,7,6,9]])
is there a way to add 1 to every third element and get this matrix:
[[2,2,3,5],[5,6,8,8],[4,8,6,9]]
Solution:
matrix = np.ascontiguousarray(matrix)
matrix.ravel()[::3] += 1
Why is ascontiguousarray needed? Because matrix may not be C-contiguous (for example, it may be in Fortran order, i.e. column-major). In that case ravel returns a copy instead of a view, so the in-place operation matrix.ravel()[::3] += 1 only modifies the temporary copy and leaves matrix unchanged.
Example 1
import numpy as np
arr = np.array([
[1, 2, 3, 4],
[5, 6, 7, 8],
[4, 7, 6, 9]])
arr.ravel()[::3] += 1
print(arr)
Works as expected:
[[2 2 3 5]
[5 6 8 8]
[4 8 6 9]]
Example 2
But with fortran-order
import numpy as np
arr = np.array([
[1, 2, 3, 4],
[5, 6, 7, 8],
[4, 7, 6, 9]])
arr = np.asfortranarray(arr)
arr.ravel()[::3] += 1
print(arr)
produces:
[[1 2 3 4]
[5 6 7 8]
[4 7 6 9]]
Example 3
Will work as expected in both cases
import numpy as np
arr = np.array([
[1, 2, 3, 4],
[5, 6, 7, 8],
[4, 7, 6, 9]])
# arr = np.asfortranarray(arr)
arr = np.ascontiguousarray(arr)
arr.ravel()[::3] += 1
print(arr)
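As a possible alternative to converting the array first, the flat attribute indexes the array in row-major order whatever its memory layout is, and writes through it go to the original array. This is a sketch of that variant, not part of the original answer:

```python
import numpy as np

# Start from a Fortran-ordered array, the case where ravel() returns a copy.
arr = np.asfortranarray(np.array([[1, 2, 3, 4],
                                  [5, 6, 7, 8],
                                  [4, 7, 6, 9]]))

# arr.flat[::3] reads every third element in row-major order, and the
# augmented assignment writes the incremented values back into arr itself.
arr.flat[::3] += 1
print(arr)
# [[2 2 3 5]
#  [5 6 8 8]
#  [4 8 6 9]]
```

Unlike rebinding the name with np.ascontiguousarray, this keeps the same array object, so other references to it see the update too.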

Can not flatten a numpy array

Why isn't flatten working? I have looked at example code and I am doing exactly what they are doing in the example. I've even copied their code and ran it but the array still doesn't come out as a flattened array.
I don't know if it matters but I am running Python 3.7.4.
code:
import numpy as np
array1 = np.array([[1, 2, 3, 2, 5, 8], [9, 5, 1, 7, 5, 3]])
array1.flatten()
print(array1)
output:
[[1 2 3 2 5 8]
[9 5 1 7 5 3]]
desired output:
[1 2 3 2 5 8 9 5 1 7 5 3]
array1.flatten() returns the flattened array but does not modify array1 in place. Assigning the result back to array1 should work.
Code:
import numpy as np
array1 = np.array([[1, 2, 3, 2, 5, 8], [9, 5, 1, 7, 5, 3]])
array1 = array1.flatten()
print(array1)
You have to assign the result of array1.flatten() to a variable, for example array2 = array1.flatten().
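To make the difference concrete, here is a small sketch comparing flatten with ravel and reshape(-1), two related calls that are not mentioned in the answers above. The variable names are illustrative only.

```python
import numpy as np

array1 = np.array([[1, 2, 3, 2, 5, 8], [9, 5, 1, 7, 5, 3]])

flat_copy = array1.flatten()        # always a new, independent array
flat_view = array1.ravel()          # a view when the memory layout allows it
flat_reshaped = array1.reshape(-1)  # likewise a view when possible

print(flat_copy)                            # [1 2 3 2 5 8 9 5 1 7 5 3]
print(array1.shape)                         # (2, 6), array1 itself is unchanged
print(np.shares_memory(array1, flat_copy))  # False
print(np.shares_memory(array1, flat_view))  # True
```

None of the three changes array1 in place, so the result always has to be assigned to a name.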

How to append a "label" to a numpy array

I have a numpy array created such as:
x = np.array([[1,2,3,4],[5,6,7,8]])
y = np.asarray([x])
which prints out
x=[[1 2 3 4]
[5 6 7 8]]
y=[[[1 2 3 4]
[5 6 7 8]]]
What I would like is an array such as
[0 [[1 2 3 4]
[5 6 7 8]]]
What's the easiest way to go about this?
Thanks!
To do what you're asking, just use
labeledArray = [0, x]
This way, you will get a standard list with 0 as the first element and a Numpy array as the second element.
However, in practice, you are probably trying to label for the purpose of later recall. In that case, I'd recommend you use a dictionary, as it is less confusing to keep track of:
myArrays = {}
myArrays[0] = x
Which can be used as follows:
>>> myArrays
{0: array([[1, 2, 3, 4],
[5, 6, 7, 8]])}
>>> myArrays[0]
array([[1, 2, 3, 4],
[5, 6, 7, 8]])

Concatenate numpy matrices to get an array with dimension 3

I want to concatenate numpy matrices that have different shapes in order to get an array with dimension=3.
example :
A= [[2 1 3 4]
[2 4 0 6]
[9 5 7 4]]
B= [[7 2 8 4]
[8 6 8 6]]
and result what I need should be like that:
C=[[[2 1 3 4]
[2 4 0 6]
[9 5 7 4]]
[[7 2 8 4]
[8 6 8 6]]]
Thanks for help
If I understand your question correctly, a 3dim numpy array is probably not the way to represent your data, because there's no definitive shape.
A 3dim numpy array should have a shape of the form N1 x N2 x N3, whereas in your case each "2dim row" has a different shape.
Alternatives would be to keep your data in lists (or a list of arrays), or to use masked arrays, if that happens to be reasonable in your case.
You can only convert to a 3D np.ndarray in a useful manner if A.shape == B.shape. In that case all you need to do is e.g. C = np.array([A, B]).
import numpy as np
A = np.array([[2, 1, 3, 4],
[9, 5, 7, 4]])
B = np.array([[7, 2, 8, 4],
[8, 6, 8, 6]])
C = np.array([A, B])
print(C)  # C.shape is (2, 2, 4)
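Because A and B have different sizes (number of rows), the best you can do is make an array of shape (2,) and dtype object. Or at least that's what a simple construction gives you (recent NumPy versions raise a ValueError for such ragged input unless dtype=object is passed explicitly):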
In [9]: np.array([A,B])
Out[9]:
array([array([[2, 1, 3, 4],
[2, 4, 0, 6],
[9, 5, 7, 4]]),
array([[7, 2, 8, 4],
[8, 6, 8, 6]])], dtype=object)
But constructing an array like this doesn't help much. Just use the list [A,B].
np.vstack([A,B]) produces a (5,4) array.
np.array([A[:2,:],B]) gives a (2,2,4) array. Or you could pad B so they are both (3,4).
So one way or other you need to redefine your problem.
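The padding option mentioned above, combined with the masked-array suggestion from the earlier answer, could look roughly like this. The fill value -1 and the names B_padded and C_masked are arbitrary choices for this sketch, not something the original answers specify.

```python
import numpy as np

A = np.array([[2, 1, 3, 4],
              [2, 4, 0, 6],
              [9, 5, 7, 4]])
B = np.array([[7, 2, 8, 4],
              [8, 6, 8, 6]])

# Pad B with extra rows at the bottom so that it matches A's shape.
# The fill value -1 is an assumed placeholder; pick one that suits your data.
fill = -1
rows_missing = A.shape[0] - B.shape[0]
B_padded = np.pad(B, ((0, rows_missing), (0, 0)),
                  mode="constant", constant_values=fill)

# Both arrays are now (3, 4), so they can be stacked into a (2, 3, 4) array.
C = np.stack([A, B_padded])
print(C.shape)  # (2, 3, 4)

# Optionally mask the padded cells so that reductions ignore them.
mask = np.zeros(C.shape, dtype=bool)
mask[1, B.shape[0]:, :] = True
C_masked = np.ma.masked_array(C, mask=mask)
print(C_masked.sum())  # 96, the padding is not counted
```

Whether padding is acceptable depends on what the third dimension is meant to represent; if the rows of A and B are unrelated, a plain list [A, B] is likely still the simpler choice.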

Count number of elements in numpy ndarray

How do I count the number of occurrences of each data point in an ndarray?
What I want to do is run a OneHotEncoder on all the values that are present at least N times in my ndarray.
I also want to replace all the values that appear fewer than N times with another element that doesn't appear in the array (let's call it new_value).
So for example I have :
import numpy as np
a = np.array([[[2], [2,3], [3,34]],
[[3], [4,5], [3,34]],
[[3], [2,3], [3,4]]], dtype=object)
with threshold N=2 I want something like:
b = [OneHotEncoder(a[:,[i]])[0] if count(a[:,[i]])>=N
else OneHotEncoder(new_value) for i in range(a.shape[1])]
So, just to show the substitutions I want (ignoring the OneHotEncoder and using new_value=10), my array should look like:
a = np.array([[[10], [2,3], [3,34]],
[[3], [10], [3,34]],
[[3], [2,3], [10]]], dtype=object)
How about something like this?
First count how many times each unique element appears in the array:
>>> a=np.random.randint(0,5,(3,3))
>>> a
array([[0, 1, 4],
[0, 2, 4],
[2, 4, 0]])
>>> ua,uind=np.unique(a,return_inverse=True)
>>> count=np.bincount(uind)
>>> ua
array([0, 1, 2, 4])
>>> count
array([3, 1, 2, 3])
The ua and count arrays show that 0 appears 3 times, 1 appears once, and so on.
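The same counts can also be obtained in a single call with the return_counts option of np.unique. The sketch below uses the array from the session above as fixed input; the threshold of 2 and the name frequent are only for illustration.

```python
import numpy as np

a = np.array([[0, 1, 4],
              [0, 2, 4],
              [2, 4, 0]])

# values holds the sorted unique elements, counts how often each one occurs.
values, counts = np.unique(a, return_counts=True)
print(values)  # [0 1 2 4]
print(counts)  # [3 1 2 3]

# Values that occur at least twice (the threshold is an example choice).
frequent = values[counts >= 2]
print(frequent)  # [0 2 4]
```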
import numpy as np

def mask_fewest(arr, thresh, replace):
    # ua holds the unique elements; uind maps each element of arr to its index in ua.
    ua, uind = np.unique(arr, return_inverse=True)
    # Newer NumPy versions may return uind with the shape of arr, so flatten it.
    uind = uind.ravel()
    # count[i] is the number of times ua[i] appears in arr.
    count = np.bincount(uind)
    # Jamie's suggestion to make rep_mask faster: mark the elements whose value
    # does not appear at least `thresh` times (np.isin replaces the older np.in1d).
    rep_mask = np.isin(uind, np.where(count < thresh)[0])
    # Replace the marked elements in place.
    arr.flat[rep_mask] = replace
    return arr
>>> a=np.random.randint(2,8,(4,4))
>>> print(a)
[[6 7 7 3]
[7 5 4 3]
[3 5 2 3]
[3 3 7 7]]
>>> print(mask_fewest(a,2,10))
[[10 7 7 3]
[ 7 5 10 3]
[ 3 5 10 3]
[ 3 3 7 7]]
For the example in the question (let me know whether you intended a 2D array or a 3D array):
>>> a
[[[2] [2, 3] [3, 34]]
[[3] [4, 5] [3, 34]]
[[3] [2, 3] [3, 4]]]
>>> mask_fewest(a,2,10)
[[10 [2, 3] [3, 34]]
[[3] 10 [3, 34]]
[[3] [2, 3] 10]]
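Note that mask_fewest modifies its input in place. If the original array should be kept, a non-mutating variant along the following lines might be preferable. The function name replace_rare is made up for this sketch, and it assumes a plain numeric array rather than an object array of lists.

```python
import numpy as np

def replace_rare(arr, thresh, replace):
    """Return a copy of arr where values seen fewer than thresh times are replaced."""
    # inverse maps every element of arr to its position in the unique values,
    # counts gives the number of occurrences of each unique value.
    _, inverse, counts = np.unique(arr, return_inverse=True, return_counts=True)
    # Look up each element's own count; ravel() guards against NumPy versions
    # that return inverse with the shape of arr.
    rare = counts[inverse.ravel()] < thresh
    # np.where builds a new array and leaves arr untouched.
    return np.where(rare.reshape(arr.shape), replace, arr)

a = np.array([[6, 7, 7, 3],
              [7, 5, 4, 3],
              [3, 5, 2, 3],
              [3, 3, 7, 7]])
result = replace_rare(a, 2, 10)
print(result)
# [[10  7  7  3]
#  [ 7  5 10  3]
#  [ 3  5 10  3]
#  [ 3  3  7  7]]
print(a[0, 0])  # 6, the input is unchanged
```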
