how to make several arrays by merging two other ones - python

Hope you are doing well.
Is it possible to merge two big numpy arrays to make several smaller ones? The two arrays have the same number of rows. One array contains some names:
name_arr = [[sub_1],
[sub_2],
[sub_3],
...
[sub_n]]
The other one has some values:
value_arr = [[1, 2, 3],
[5, 2, -1],
[0, 0, 4],
...
[6, 18,200]]
Now, I want to extract numpy arrays using both the name_arr and value_arr. To be clear, I want to extract arrays with the names coming from name_arr and values coming from value_arr:
sub_1 = [[1, 2, 3]]
sub_2 = [[5, 2, -1]]
sub_3 = [[0, 0, 4]]
...
sub_n = [[6, 18, 200]]
I tried to use a for loop:
for i in name_arr:
    for j in value_arr:
        if i == j:
            name_arr [0, i] = value_arr [0, j]
but it was not successful at all.
FYI, I made the arrays by splitting a dictionary,
Dict_data = {'sub_1' : [1, 2, 3],
'sub_2' : [5, 2, -1],
'sub_3' : [0, 0, 4],
... ,
'sub_n' : [6, 18,200]}
If there is a way to do the extraction directly from the dictionary, I would deeply appreciate it. I would definitely prefer a way to extract numpy arrays named after my keys and holding the related data.
In advance, I appreciate any feedback.
Regards

I assume your name array contains strings as follows:
name_arr = ['sub_1',
'sub_2',
'sub_3',
...
'sub_n']
In this case you can simply do it with a for loop:
my_dict = {}
for i in range(len(name_arr)):
    my_dict[name_arr[i]] = value_arr[i, :]
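Since the arrays originally came from a dictionary, the same mapping can probably be built without the loop. The following is a minimal sketch, assuming Dict_data has the shape shown in the question; the three entries are sample data, and arrays_by_name and from_arrays are names chosen here for illustration.

```python
import numpy as np

# Sample data in the shape described in the question (assumed values).
Dict_data = {'sub_1': [1, 2, 3],
             'sub_2': [5, 2, -1],
             'sub_3': [0, 0, 4]}

# One NumPy array per key, looked up by name rather than by variable.
arrays_by_name = {name: np.array(values) for name, values in Dict_data.items()}

# Starting from the two split arrays instead: pair each name with its row.
name_arr = np.array(list(Dict_data.keys()))
value_arr = np.array(list(Dict_data.values()))
from_arrays = dict(zip(name_arr, value_arr))

print(arrays_by_name['sub_2'])
print(from_arrays['sub_3'])
```

Keeping the arrays in a dictionary and looking them up by key is usually easier to work with than creating separate variables named sub_1, sub_2, and so on.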

Related

Use numpy to mask a row containing only zeros

I have a large array of point cloud data which is generated using the Azure Kinect. All erroneous measurements are assigned the coordinate [0,0,0]. I want to remove all coordinates with the value [0,0,0]. Since my array is rather large (1 million points) and since I need to do this process in real time, speed is of the essence.
In my current approach I try to use numpy to mask out all rows that contain three zeroes ([0,0,0]). However, the np.ma.masked_equal function does not evaluate an entire row, but only evaluates single elements. As a result, rows that contain at least one 0 are already filtered by this approach. I only want rows to be filtered when all values in the row are 0. Find an example of my code below:
my_data = np.array([[1,2,3],[0,0,0],[3,4,5],[2,5,7],[0,0,1]])
my_data = np.ma.masked_equal(my_data, [0,0,0])
my_data = np.ma.compress_rows(my_data)
output
array([[1, 2, 3],
[3, 4, 5],
[2, 5, 7]])
desired output
array([[1, 2, 3],
[3, 4, 5],
[2, 5, 7],
[0, 0, 1]])
Find all data points that are 0 (doesn't require np.ma module) and then select all rows that do not contain all zeros:
import numpy as np
my_data = np.array([[1, 2, 3], [0, 0, 0], [3, 4, 5], [2, 5, 7], [0, 0, 1]])
my_data[~(my_data == 0).all(axis= 1)]
Output:
array([[1, 2, 3],
[3, 4, 5],
[2, 5, 7],
[0, 0, 1]])
Instead of using the np.ma.masked_equal and np.ma.compress_rows functions, you can use the np.all function to check if all values in a row are equal to [0, 0, 0]. This should be faster than your method as it evaluates all values in a row at once.
mask = np.all(my_data == [0, 0, 0], axis=1)
my_data = my_data[~mask]
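An equivalent formulation, sketched here under the assumption that the rows to drop are exactly the all-zero ones: any(axis=1) is True for every row with at least one non-zero entry, so it can serve directly as the mask of rows to keep. The names keep and filtered are illustrative.

```python
import numpy as np

my_data = np.array([[1, 2, 3], [0, 0, 0], [3, 4, 5], [2, 5, 7], [0, 0, 1]])

# True for rows with at least one non-zero value, i.e. the rows to keep.
keep = my_data.any(axis=1)
filtered = my_data[keep]
print(filtered)
```

Whether this is measurably faster than the comparison-based mask would have to be timed on the real data.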

Finding the Unique Arrays in a List of Arrays

I have a list of arrays, say
List = [A,B,C,D,E,...]
where each A,B,C etc. is an nxn array.
I wish to have the most efficient algorithm to find the unique nxn arrays in the list. That is, say if all entries of A and B are equal, then we discard one of them and generate the list
UniqueList = [A,C,D,E,...]
Not sure if there is a faster way, but I think this should be pretty fast. It uses numpy's built-in unique function with axis=0 to look for unique nxn arrays (more detail in the numpy docs):
[i for i in np.unique(np.array(List),axis=0)]
Example:
A = np.array([[1,1],[1,1]])
B = np.array([[1,1],[1,2]])
List = [A,B,A]
[array([[1, 1],
[1, 1]]),
array([[1, 1],
[1, 2]]),
array([[1, 1],
[1, 1]])]
Output:
[array([[1, 1],
[1, 1]]),
array([[1, 1],
[1, 2]])]
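Note that np.unique returns its results in sorted order, not in the order of the original list. If the order of first appearance matters, as in the UniqueList example in the question, a sketch along these lines may help. It assumes all arrays have the same shape; the array C and the names stacked and first_idx are added here for illustration.

```python
import numpy as np

A = np.array([[1, 1], [1, 1]])
B = np.array([[1, 1], [1, 2]])
C = np.array([[0, 0], [0, 0]])
List = [B, A, B, C]

# Stack into one array of shape (4, 2, 2) so np.unique can compare whole matrices.
stacked = np.array(List)

# return_index gives the position of the first occurrence of each unique matrix.
_, first_idx = np.unique(stacked, axis=0, return_index=True)

# Sorting those positions restores the order of first appearance.
UniqueList = [List[i] for i in sorted(first_idx)]
```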

Append indices of element to each element

So basically I want to create a new array for each element and append the coordinates of the element to the original value (so adding the x and y position to the original element):
[ [7,2,4],[1,5,3] ]
then becomes
[ [[0,0,7],[0,1,2],[0,2,4]],
[[1,0,1],[1,1,5],[1,2,3]] ]
I've been looking for different ways to make this work with the axis system in NumPy but I'm probably overlooking some more obvious way.
You can try np.meshgrid to create a grid and then np.stack to combine it with input array:
import numpy as np
a = np.asarray([[7,2,4],[1,5,3]])
# Unpacking works whether meshgrid returns a list or a tuple (newer NumPy versions return a tuple)
result = np.stack([*np.meshgrid(range(a.shape[1]), range(a.shape[0]))[::-1], a], axis=-1)
Output:
array([[[0, 0, 7],
[0, 1, 2],
[0, 2, 4]],
[[1, 0, 1],
[1, 1, 5],
[1, 2, 3]]])
Let me know if it helps.
Without numpy you could use list comprehension:
old_list = [ [7,2,4],[1,5,3] ]
new_list = [ [[i,j,old_list[i][j]] for j in range(len(old_list[i]))] for i in range(len(old_list)) ]
I'd assume that numpy is faster but the sublists are not required to have equal length in this solution.
Another approach using enumerate (val is the input list of lists):
In [38]: merge = list()
    ...: for i,j in enumerate(val):
    ...:     merge.append([[i, m, n] for m, n in enumerate(j)])
    ...:
In [39]: merge
Out[39]: [[[0, 0, 7], [0, 1, 2], [0, 2, 4]], [[1, 0, 1], [1, 1, 5], [1, 2, 3]]]
Hope it's useful.
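a = np.array([[7,2,4], [1,5,3]])
# np.argwhere returns the indices of non-zero elements only, so this relies on a containing no zeros
idx = np.argwhere(a)
idx = idx.reshape((*(a.shape), -1))
a = np.expand_dims(a, axis=-1)
a = np.concatenate((idx, a), axis=-1)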
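If the input may contain zeros, a variant based on np.indices does not depend on the values at all. This is a sketch rather than a drop-in replacement for any of the answers above; the sample array deliberately contains a zero, and rows, cols and result are illustrative names.

```python
import numpy as np

a = np.array([[7, 0, 4], [1, 5, 3]])   # contains a zero on purpose

# np.indices returns one index grid per axis, each with the same shape as a.
rows, cols = np.indices(a.shape)

# Stack along a new last axis so each element becomes [row, col, value].
result = np.stack([rows, cols, a], axis=-1)
print(result.shape)
```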

Replace numpy subarray when element matches a condition

I have an n x m x 3 numpy array. This represents a middle-step towards an RGB representation of a complex-function plotter. When the function being plotted takes infinite values or has singularities, parts of the RGB data become NaNs.
I'm looking for an efficient way to replace a row containing a NaN with a row of my choice, perhaps [0, 0, 0] or [1, 1, 1]. In terms of the RGB values, this has the effect of replacing poorly-behaving pixels with white or black pixels. By efficient, I mean some way that takes advantage of numpy's vectorization and speed.
Please note that I am not looking to merely replace the NaN values with 0 (which I know how to do with numpy.where); if a row contains a NaN, I want to replace the whole row. I suspect this can be done nicely in numpy, but I'm not sure how.
Concrete Question
Suppose we are given a 2 x 2 x 3 array arr. If a row contains a 5, I want to replace the row with [0, 0, 0]. Trivial code that does this slowly is as follows.
import numpy as np
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 3, 5], [2, 4, 6]]])
# so arr is
# array([[[1, 2, 3],
# [4, 5, 6]],
#
# [[1, 3, 5],
# [2, 4, 6]]])
# Trivial and slow version to replace rows containing 5 with [0,0,0]
for i in range(len(arr)):
    for j in range(len(arr[i])):
        if 5 in arr[i][j]:
            arr[i][j] = np.array([0, 0, 0])
# Now arr is
#
# array([[[1, 2, 3],
# [0, 0, 0]],
#
# [[0, 0, 0],
# [2, 4, 6]]])
How can we accomplish this taking advantage of numpy?
A simpler way would be -
arr[np.isin(arr,5).any(-1)] = 0
If it's just a single value that you are looking for, then we could simplify to -
arr[(arr==5).any(-1)] = 0
If you are looking to match against NaN, we need to do the comparison differently and use np.isnan instead -
arr[np.isnan(arr).any(-1)] = 0
If you are looking to assign array values, instead of just 0, the solutions stay the same. Hence it would be -
arr[(arr==5).any(-1)] = new_array
Using np.broadcast_to
arr[np.broadcast_to((arr == 5).any(-1)[..., None], arr.shape)] = 0
array([[[1, 2, 3],
[0, 0, 0]],
[[0, 0, 0],
[2, 4, 6]]])
Just as FYI, based on your description, if you want to find np.nans instead of integers like 5, you shouldn't use ==, but rather np.isnan
arr[np.broadcast_to((np.isnan(arr)).any(-1)[..., None], arr.shape)] = 0
You can do it using the in1d function like below (newer NumPy versions deprecate np.in1d in favor of np.isin):
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 3, 5], [2, 4, 6]]])
arr[np.in1d(arr,5).reshape(arr.shape).any(axis=2)] = [0,0,0]
arr
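For the NaN case from the original question, the same any(-1) mask can be combined with np.where to fill in a replacement row without modifying the input. This is a minimal sketch under assumed data: the float array rgb and the replacement row white are made up for illustration.

```python
import numpy as np

# Hypothetical 2 x 2 x 3 float RGB data with a NaN in two of the pixels.
rgb = np.array([[[0.1, 0.2, 0.3], [np.nan, 0.5, 0.6]],
                [[0.7, 0.8, np.nan], [0.2, 0.4, 0.6]]])

white = np.array([1.0, 1.0, 1.0])   # replacement row (an assumed choice)

# Shape (2, 2): True where a pixel contains at least one NaN.
bad = np.isnan(rgb).any(axis=-1)

# bad[..., None] has shape (2, 2, 1) and broadcasts against the last axis;
# np.where builds a new array and leaves rgb itself untouched.
cleaned = np.where(bad[..., None], white, rgb)
```

Using np.where instead of in-place assignment keeps the raw data available, which may or may not matter for the plotter.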

Removing a specific list from an array

I have this list
list = [[0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2], [2, 1], [2, 2], [2, 0]]
I want to take 2 integers
row = 2 and column = 1
Combine them
thing = (str(row) + str(", ") + str(column))
then I want to remove the list
[2, 1]
from the array. How would I do this?
EDIT: The language is Python
First of all, don't name your list list. It will overwrite the builtin function list() and potentially mess with your code later.
Secondly, finding and removing elements in a list is done like
data.remove(value)
or in your case
data.remove([2, 1])
Specifically, where you are looking for an entry [row, column], you would do
data.remove([row, column])
where row and column are your two variables.
It may be a bit confusing to name them row and column, though, because your data could be interpreted as a matrix/2D array, where "row" and "column" have a different meaning.
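Put together, a small sketch might look like this. Combining the two integers into a string is not needed, since the list elements are themselves lists. The names target and without_target are chosen here for illustration.

```python
data = [[0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2], [2, 1], [2, 2], [2, 0]]
row, column = 2, 1

# list.remove raises ValueError if the value is absent, so check first.
target = [row, column]
if target in data:
    data.remove(target)

# Alternative that drops every occurrence rather than just the first one.
without_target = [pair for pair in data if pair != target]
```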
