I have a requirement where I want to convert a 2D matrix to 3D by separating 3 unique values across 3 dimensions.
For Example:
convert
A = [1 2 3 3
1 1 2 1
3 2 2 3
1 3 3 2]
to
A = [[1 0 0 0
1 1 0 1
0 0 0 0
1 0 0 0]
[0 1 0 0
0 0 1 0
0 1 1 0
0 0 0 1]
[0 0 1 1
0 0 0 0
1 0 0 1
0 1 1 0]]
Pardon me if the syntax of matrix representation is not correct.
Use broadcasting with outer-equality for a vectorized solution:
# Input array
In [8]: A
Out[8]:
array([[1, 2, 3, 3],
[1, 1, 2, 1],
[3, 2, 2, 3],
[1, 3, 3, 2]])
In [11]: np.equal.outer(np.unique(A),A).view('i1')
Out[11]:
array([[[1, 0, 0, 0],
[1, 1, 0, 1],
[0, 0, 0, 0],
[1, 0, 0, 0]],
[[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 1, 1, 0],
[0, 0, 0, 1]],
[[0, 0, 1, 1],
[0, 0, 0, 0],
[1, 0, 0, 1],
[0, 1, 1, 0]]], dtype=int8)
The same result with explicit dimension extension followed by a comparison would be:
(A == np.unique(A)[:,None,None]).view('i1')
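A minimal sketch of why the two expressions agree, assuming the sample A from the question. Indexing the unique values with [:, None, None] gives them shape (3, 1, 1), which broadcasts against the (4, 4) input to produce a (3, 4, 4) result. The .astype(np.int8) call is shown only as an alternative to .view('i1'): it makes a copy, but it does not rely on booleans being stored in one byte.

```python
import numpy as np

A = np.array([[1, 2, 3, 3],
              [1, 1, 2, 1],
              [3, 2, 2, 3],
              [1, 3, 3, 2]])

vals = np.unique(A)                         # sorted unique values: 1, 2, 3
mask = A == vals[:, None, None]             # (3, 1, 1) against (4, 4) -> (3, 4, 4) booleans
one_hot = mask.astype(np.int8)              # copy of the mask as int8
outer = np.equal.outer(vals, A).view('i1')  # the outer-equality form from the answer

print(one_hot.shape)                        # (3, 4, 4)
print(np.array_equal(one_hot, outer))       # True
```

Each position of A is set in exactly one of the three layers, so one_hot.sum(axis=0) is an array of ones.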
You can use np.unique, compare the array against each unique value to get boolean arrays, and cast those to int using numpy.ndarray.astype.
import numpy as np
a=np.array([[1, 2, 3, 3], [1, 1, 2, 1], [3, 2, 2, 3], [1, 3, 3, 2]])
[(a == i).astype(int) for i in np.unique(a)]
Output:
[array([[1, 0, 0, 0],
[1, 1, 0, 1],
[0, 0, 0, 0],
[1, 0, 0, 0]]),
array([[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 1, 1, 0],
[0, 0, 0, 1]]),
array([[0, 0, 1, 1],
[0, 0, 0, 0],
[1, 0, 0, 1],
[0, 1, 1, 0]])]
EDIT: Ch3steR's answer is better
import numpy as np
A = np.array([[1,2,3,3], [1,1,2,1], [3,2,2,3], [1,3,3,2]])
unique_values = np.unique(A)
# one all-zero layer per unique value
B = np.array([np.zeros_like(A) for i in range(len(unique_values))])
for idx, value in enumerate(unique_values):
    # mark the positions where A equals this value
    B[idx][A == value] = 1
I started learning NumPy yesterday.
My aim is:
Extract the odd-index elements and the even-index elements from a NumPy array and merge them side by side vertically.
Let's say I have the array
mat = np.array([[1, 1, 0, 0, 0],
[0, 1, 0, 0, 1],
[1, 0, 0, 1, 1],
[0, 0, 0, 0, 0],
[1, 0, 1, 0, 1]])
What I've tried.
I transposed the array, since I have to merge side by side vertically.
mat = np.transpose(mat)
Which gives me
[[1 0 1 0 1]
[1 1 0 0 0]
[0 0 0 0 1]
[0 0 1 0 0]
[0 1 1 0 1]]
I've tried accessing odd index elements
odd = mat[1::2]
print(odd)
Gives me
[[1 1 0 0 0] ----> wrong...should be [0,1,0,0,1] right? I'm confused
[0 0 1 0 0]] --->wrong...Should be [0,0,0,0,0] right? Where these are coming from?
My final output should look like
[[0 0 1 1 1]
[1 0 1 0 0]
[0 0 0 0 1]
[0 0 0 1 0]
[1 0 0 1 1]]
Type: np.ndarray
Looks like you want:
mat[np.r_[1:mat.shape[0]:2,:mat.shape[0]:2]].T
Output:
array([[0, 0, 1, 1, 1],
[1, 0, 1, 0, 0],
[0, 0, 0, 0, 1],
[0, 0, 0, 1, 0],
[1, 0, 0, 1, 1]])
Intermediate:
np.r_[1:mat.shape[0]:2,:mat.shape[0]:2]
output: array([1, 3, 0, 2, 4])
While the selection of rows is straightforward, there are various ways of combining them.
In [244]: mat = np.array([[1, 1, 0, 0, 0],
...: [0, 1, 0, 0, 1],
...: [1, 0, 0, 1, 1],
...: [0, 0, 0, 0, 0],
...: [1, 0, 1, 0, 1]])
The odd rows:
In [245]: mat[1::2,:] # or mat[1::2]
Out[245]:
array([[0, 1, 0, 0, 1],
[0, 0, 0, 0, 0]])
The even rows:
In [246]: mat[0::2,:]
Out[246]:
array([[1, 1, 0, 0, 0],
[1, 0, 0, 1, 1],
[1, 0, 1, 0, 1]])
Joining the rows vertically (np.vstack can also be used):
In [247]: np.concatenate((mat[1::2,:], mat[0::2,:]), axis=0)
Out[247]:
array([[0, 1, 0, 0, 1],
[0, 0, 0, 0, 0],
[1, 1, 0, 0, 0],
[1, 0, 0, 1, 1],
[1, 0, 1, 0, 1]])
But since you want columns, transpose:
In [248]: np.concatenate((mat[1::2,:], mat[0::2,:]), axis=0).transpose()
Out[248]:
array([[0, 0, 1, 1, 1],
[1, 0, 1, 0, 0],
[0, 0, 0, 0, 1],
[0, 0, 0, 1, 0],
[1, 0, 0, 1, 1]])
We could transpose the selections first:
np.concatenate((mat[1::2,:].T, mat[0::2,:].T), axis=1)
or transpose before indexing (note the change in the ':' slice position):
np.concatenate((mat.T[:,1::2], mat.T[:,0::2]), axis=1)
The r_ in the other answer converts the slices into arrays and concatenates them, to make one row indexing array. That's equally valid.
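As a quick check, here is a sketch that compares the variants directly, using the mat from the question. The names order, via_index, via_concat and via_columns are made up for this illustration.

```python
import numpy as np

mat = np.array([[1, 1, 0, 0, 0],
                [0, 1, 0, 0, 1],
                [1, 0, 0, 1, 1],
                [0, 0, 0, 0, 0],
                [1, 0, 1, 0, 1]])

n = mat.shape[0]
order = np.r_[1:n:2, 0:n:2]        # odd row indices first, then even ones

via_index = mat[order].T           # single index array, then transpose
via_concat = np.concatenate((mat[1::2], mat[0::2]), axis=0).T
via_columns = np.concatenate((mat.T[:, 1::2], mat.T[:, 0::2]), axis=1)

print(order)
print(np.array_equal(via_index, via_concat))
print(np.array_equal(via_index, via_columns))
```

All three build the same 5x5 array, so the choice is mostly about which reads more clearly.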
Here is an alternative approach you can use.
1. Convert the array to a list (optional).
2. Select the nested lists with mat[1::2] for the odd rows and mat[::2] for the even rows.
3. Concatenate them vertically using np.concatenate with `axis=0`.
4. Transpose the result.
Implementation.
mat = np.array([[1, 1, 0, 0, 0],
[0, 1, 0, 0, 1],
[1, 0, 0, 1, 1],
[0, 0, 0, 0, 0],
[1, 0, 1, 0, 1]])
mat_list = mat.tolist()  # optional: the same slices also work on the ndarray
l_odd = mat_list[1::2]
l_even= mat_list[::2]
mask = np.concatenate((l_odd, l_even), axis=0)
mask = np.transpose(mask)
print(mask)
Output:
[[0 0 1 1 1]
[1 0 1 0 0]
[0 0 0 0 1]
[0 0 0 1 0]
[1 0 0 1 1]]
Checking Type
print(type(mask))
Gives
<class 'numpy.ndarray'>
For editors: this is NOT stripping all strings in an array but stripping the array itself
So suppose i have an array like this:
[[0, 1, 8, 4, 0, 0],
[1, 2, 3, 0, 0, 0],
[3, 2, 3, 0, 5, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]
I want a function stripArray(0, array) where the first argument is the "empty" value. After applying this function I want the returned array to look like this:
[[0, 1, 8, 4, 0],
[1, 2, 3, 0, 0],
[3, 2, 3, 0, 5]]
Values that were marked as empty (in this case 0) were stripped from the right and bottom sides. How would I go about implementing such a function?
In the real use case, the array holds dictionaries instead of numbers.
It is better to do this vectorized
import numpy as np
arr = np.array([[0, 1, 8, 4, 0, 0],
[1, 2, 3, 0, 0, 0],
[3, 2, 3, 0, 5, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]])
def stripArray(e, arr):
    return arr[(arr!=e).any(axis = 1), :][:, (arr!=e).any(axis = 0)]
stripArray(0, arr)
array([[0, 1, 8, 4, 0],
[1, 2, 3, 0, 0],
[3, 2, 3, 0, 5]])
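Note that the boolean masks above drop every all-empty row or column, wherever it sits in the array. If only the trailing ones should go, as the question describes, one possible sketch is to cut at the last row and column that contain a non-empty value. The helper name strip_trailing is hypothetical, and the sample array differs from the question's so that it includes an interior empty row.

```python
import numpy as np

def strip_trailing(empty, arr):
    """Drop only trailing rows/columns made up entirely of `empty` (hypothetical helper)."""
    filled = arr != empty
    rows = np.flatnonzero(filled.any(axis=1))   # indices of rows holding a non-empty value
    cols = np.flatnonzero(filled.any(axis=0))   # same for columns
    if rows.size == 0:                          # the array holds nothing but `empty`
        return arr[:0, :0]
    return arr[:rows[-1] + 1, :cols[-1] + 1]

arr = np.array([[0, 1, 8, 4, 0, 0],
                [0, 0, 0, 0, 0, 0],   # interior empty row: kept
                [3, 2, 3, 0, 5, 0],
                [0, 0, 0, 0, 0, 0]])  # trailing empty row: removed

print(strip_trailing(0, arr))
```

On the question's own input, both approaches give the same 3x5 result.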
Here is an answer which doesn't need NumPy:
from typing import List, Any

def all_value(value: Any, arr: List[float]) -> bool:
    return all(map(lambda x: x==value, arr))

def transpose_array(arr: List[List[float]]) -> List[List[float]]:
    return list(map(list, zip(*arr)))

def strip_array(value: Any, arr: List[List[float]]) -> List[List[float]]:
    # delete empty rows
    arr = [row for row in arr if not all_value(value, row)]
    # transpose and delete empty columns
    arr = transpose_array(arr)
    arr = [col for col in arr if not all_value(value, col)]
    # transpose back
    arr = transpose_array(arr)
    return arr
test = [[0, 1, 8, 4, 0, 0],
[1, 2, 3, 0, 0, 0],
[3, 2, 3, 0, 5, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]
result = strip_array(0, test)
Output:
result
[[0, 1, 8, 4, 0],
[1, 2, 3, 0, 0],
[3, 2, 3, 0, 5]]
Code:
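import numpy as np

def strip_array(array, empty_val=0):
    n_rows, n_cols = array.shape
    # count trailing columns that contain only empty_val
    num_bad_columns = 0
    while num_bad_columns < n_cols and np.all(array[:, -(num_bad_columns + 1)] == empty_val):
        num_bad_columns += 1
    # slice with an explicit end index: array[:, :-0] would return an empty array
    array = array[:, :n_cols - num_bad_columns]
    # count trailing rows that contain only empty_val
    num_bad_rows = 0
    while num_bad_rows < n_rows and np.all(array[-(num_bad_rows + 1), :] == empty_val):
        num_bad_rows += 1
    array = array[:n_rows - num_bad_rows, :]
    return array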
array = np.array(
[[0, 1, 8, 4, 0, 0],
[1, 2, 3, 0, 0, 0],
[3, 2, 3, 0, 5, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]
)
print(array)
print(strip_array(array, 0))
Output:
[[0 1 8 4 0 0]
[1 2 3 0 0 0]
[3 2 3 0 5 0]
[0 0 0 0 0 0]
[0 0 0 0 0 0]]
[[0 1 8 4 0]
[1 2 3 0 0]
[3 2 3 0 5]]
Try using np.delete to remove the unwanted rows or columns:
data=[[0, 1, 8, 4, 0, 0],
[1, 2, 3, 0, 0, 0],
[3, 2, 3, 0, 5, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]
import numpy as np

def drop_row(data):
    lstIdx = []
    for i in range(len(data)):
        count = 0
        for j in range(len(data[i])):
            if data[i][j] == 0:
                count += 1
        # every entry in row i is zero: mark the row for deletion
        if count == len(data[i]):
            lstIdx.append(i)
    data = np.delete(data, lstIdx, axis=0)
    return data

def drop_column(data):
    lstIdx = []
    if len(data) == 0:
        return data
    for j in range(len(data[0])):
        count = 0
        for i in range(len(data)):
            if data[i][j] == 0:
                count += 1
        # every entry in column j is zero: mark the column for deletion
        if count == len(data):
            lstIdx.append(j)
    data = np.delete(data, lstIdx, axis=1)
    return data
data=drop_row(data)
data=drop_column(data)
print(data)
output:
[[0 1 8 4 0]
[1 2 3 0 0]
[3 2 3 0 5]]
I would like to stack several 2D arrays into a single 3D array, so that it looks like this:
[[[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]]
[[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]]
[[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]]]
I can do something like this:
import numpy as np
a = np.zeros((5, 5), dtype=int)
b = np.zeros((5, 5), dtype=int)
c = np.stack((a, b), 0)
print(c)
To get this:
[[[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]]
[[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]]]
But I can't figure out how to add a third 2D array to such a stack, or how to build such a stack of 2D arrays iteratively in a loop. Append, stack and concatenate just don't keep the needed shape.
So, any suggestions?
Thank you!
Conclusion:
Thanks to Tom and Mozway we've got two answers
Tom's:
data_x_train = x_train[np.where((y_train==0) | (y_train==1))]
Mozway's:
out = np.empty((0,5,5))
while condition:
    # get new array
    a = XXX
    out = np.r_[out, a[None]]
out
Assuming the following arrays:
a = np.ones((5, 5), dtype=int)
b = np.ones((5, 5), dtype=int)*2
c = np.ones((5, 5), dtype=int)*3
You can stack all at once using:
np.stack((a, b, c), 0)
If you really need to add the arrays iteratively, you can use np.r_:
out = a[None]
for i in (b,c):
    out = np.r_[out, i[None]]
output:
array([[[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1]],
[[2, 2, 2, 2, 2],
[2, 2, 2, 2, 2],
[2, 2, 2, 2, 2],
[2, 2, 2, 2, 2],
[2, 2, 2, 2, 2]],
[[3, 3, 3, 3, 3],
[3, 3, 3, 3, 3],
[3, 3, 3, 3, 3],
[3, 3, 3, 3, 3],
[3, 3, 3, 3, 3]]])
edit: if you do not know the arrays in advance
out = np.empty((0,5,5))
while condition:
    # get new array
    a = XXX
    out = np.r_[out, a[None]]
out
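Each np.r_ call in the loop copies everything accumulated so far into a new array. A common alternative, sketched here under the assumption that all arrays share the same shape, is to collect them in a plain Python list and call np.stack once at the end. The range loop and the np.full call are stand-ins for the real loop condition and the real source of arrays.

```python
import numpy as np

frames = []                            # plain Python list of 2D arrays
for k in range(1, 4):                  # stand-in for the real loop condition
    new_array = np.full((5, 5), k, dtype=int)   # stand-in for "get new array"
    frames.append(new_array)

out = np.stack(frames, axis=0)         # single allocation at the end
print(out.shape)                       # (3, 5, 5)
```

This also keeps the integer dtype of the inputs, whereas starting from np.empty((0,5,5)) yields a float array.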
Do you mean something like:
np.tile(a, (3, 1, 1))
array([[[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]],
[[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]],
[[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]]])
Edit:
Do you mean something like:
test = np.tile(a, (3000, 1, 1))
filtered_subset = test[[1, 10, 100], :, :]
array([[[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]],
[[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]],
[[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]]])
I would like to take a list of values and transform them to a table (2D-list) of 0's and 1's, with one column for each unique number in the source list and an equal number of rows to the original. Each row will have a 1 if that column index matches the original value-1.
I have code that accomplishes this task, but I'm wondering if there is a better/faster way to do it. (The actual dataset has millions of entries vs. the simplified set below)
Sample Input:
value_list = [1, 2, 1, 3, 6, 5, 4, 3]
Desired output:
output_table = [[1, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 1, 0],
[0, 0, 0, 1, 0, 0],
[0, 0, 1, 0, 0, 0]]
Current Solution:
value_list = [1, 2, 1, 3, 6, 5, 4, 3]
max_val = max(value_list)
# initialize to table of 0's
a = [([0] * max_val) for i in range(len(value_list))]
# overwrite with 1's where required
for i in range(len(value_list)):
    j = value_list[i] - 1
    a[i][j] = 1
print('a = ')
for row in a:
    print(row)
You can do:
import numpy as np
value_list = [1, 2, 1, 3, 6, 5, 4, 3]
# create matrix of zeros
x = np.zeros(shape=(len(value_list), max(value_list)), dtype='int')
for i,v in enumerate(value_list):
    x[i,v-1] = 1
print(x)
Output:
[[1 0 0 0 0 0]
[0 1 0 0 0 0]
[1 0 0 0 0 0]
[0 0 1 0 0 0]
[0 0 0 0 0 1]
[0 0 0 0 1 0]
[0 0 0 1 0 0]
[0 0 1 0 0 0]]
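Since the real dataset has millions of entries, the Python-level loop can be replaced by a single indexed assignment. This is a sketch that assumes, as in the sample, that the values are positive integers starting at 1; the name values is introduced here for illustration.

```python
import numpy as np

value_list = [1, 2, 1, 3, 6, 5, 4, 3]
values = np.asarray(value_list)

# one row per entry, one column per possible value
x = np.zeros((values.size, values.max()), dtype=int)
# pair row i with column values[i] - 1 and set all those cells at once
x[np.arange(values.size), values - 1] = 1

print(x)
```

The two index arrays are matched element by element, so exactly one cell per row is set.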
You can try this:
dummy_list = [0] * max(value_list)  # one column per possible value
output_table = [dummy_list[:i-1] + [1] + dummy_list[i:] for i in value_list]
Output:
output_table = [[1, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 1, 0],
[0, 0, 0, 1, 0, 0],
[0, 0, 1, 0, 0, 0]]
So I have a list of lists in python that is something like this:
[[[0, 1, 0, 1, 0]]
[[1, 1, 1, 1, 1]]
[[1, 0, 0, 1, 1]]
[[0, 1, 0, 0, 0]]]
I want to flatten this list and end up with this:
[[0, 1, 0, 1, 0]
[1, 1, 1, 1, 1]
[1, 0, 0, 1, 1]
[0, 1, 0, 0, 0]]
Is there a straightforward way to do this in python?
Using numpy.squeeze you can do what you want:
import numpy as np
a = np.array([[[0, 1, 0, 1, 0]],
[[1, 1, 1, 1, 1]],
[[1, 0, 0, 1, 1]],
[[0, 1, 0, 0, 0]]])
a.squeeze()
[[0 1 0 1 0]
[1 1 1 1 1]
[1 0 0 1 1]
[0 1 0 0 0]]
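One caveat, shown as a small sketch: squeeze() without arguments removes every length-1 axis, so an input with a single row would collapse to 1D. Passing axis=1 removes only the middle axis. The names flat and single are made up for this example.

```python
import numpy as np

a = np.array([[[0, 1, 0, 1, 0]],
              [[1, 1, 1, 1, 1]],
              [[1, 0, 0, 1, 1]],
              [[0, 1, 0, 0, 0]]])

flat = a.squeeze(axis=1)             # (4, 1, 5) -> (4, 5); raises if axis 1 is not length 1
single = a[:1]                       # a one-row input, shape (1, 1, 5)

print(flat.shape)                    # (4, 5)
print(single.squeeze().shape)        # (5,)  both length-1 axes are removed
print(single.squeeze(axis=1).shape)  # (1, 5)
```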
a = [[[0, 1, 0, 1, 0]],
[[1, 1, 1, 1, 1]],
[[1, 0, 0, 1, 1]],
[[0, 1, 0, 0, 0]]]
[i[0] for i in a]
output
[[0, 1, 0, 1, 0], [1, 1, 1, 1, 1], [1, 0, 0, 1, 1], [0, 1, 0, 0, 0]]