Convert List into Numpy 2D Array - python

How do I convert a list into a 2D numpy ndarray? For example:
lst = [20, 30, 40, 50, 60]
Expected result:
>>> arr
array([[20],
       [30],
       [40],
       [50],
       [60]])
>>> arr.shape
(5, 1)

Convert the list to an array and reshape:
import numpy as np
arr = np.array(lst).reshape(-1, 1)
reshape adds the column dimension; the -1 tells NumPy to infer the correct number of rows from the array's size.
output:
[[20]
 [30]
 [40]
 [50]
 [60]]
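To illustrate how the -1 works, a small sketch (same list as above): NumPy infers the -1 dimension from the total element count.

```python
import numpy as np

lst = [20, 30, 40, 50, 60]

# -1 tells NumPy to infer that dimension from the array's size:
# 5 elements / 1 column = 5 rows.
arr = np.array(lst).reshape(-1, 1)
print(arr.shape)   # (5, 1)

# The same works in the other direction: one row, inferred columns.
row = np.array(lst).reshape(1, -1)
print(row.shape)   # (1, 5)
```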

If you need your calculations to be more efficient, use numpy arrays instead of list comprehensions. Here is an alternative that adds a new axis by indexing with None:
x = [20, 30, 40, 50, 60]
x = np.array(x)  # convert your list to a numpy array
result = x[:, None]  # add a new axis (None is an alias for np.newaxis)
If you still need a list type at the end, you can convert the result efficiently using result.tolist()
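As a side note, a small sketch showing that `None` and `np.newaxis` are interchangeable here, and the `tolist()` round-trip:

```python
import numpy as np

x = np.array([20, 30, 40, 50, 60])

# None and np.newaxis are the same object; both insert a new axis of length 1.
col = x[:, np.newaxis]
print(col.shape)     # (5, 1)

# tolist() converts back to nested Python lists.
print(col.tolist())  # [[20], [30], [40], [50], [60]]
```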

You may use a list comprehension and then convert it to a numpy array:
import numpy as np
x = [20, 30, 40, 50, 60]
x_nested = [[item] for item in x]
x_numpy = np.array(x_nested)

Related

Get specific indices of Array

import numpy as np
X = [-10000, -1000, -100, -10, -1, 0, 1, 10, 100, 1000, 10000]
l = np.where(np.array(X) > 100)[0]
So I have:
l = array([ 9, 10], dtype=int64)
Now I want to get a new array X using the elements of l as indices. I want to get:
X = [1000, 10000]
I thought of:
X = X[l]
But it does not work. What is the proper function to use in this case? I don't want to use a for loop.
You need to convert your list X into a numpy array before you can index it with a list of indices:
>>> X = np.array(X)[l]
>>> X
array([ 1000, 10000])
Alternatively, you can convert list X to a numpy array
x = np.array(X)
and then use boolean indexing
x = x[x > 100]
This gets the result you need without using the where() function.
The expression x > 100 creates a boolean array whose True elements correspond to the elements of x that are larger than 100. This array can be used as an index to extract the elements satisfying the condition.
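A minimal sketch tying the two answers together (same data as above): the boolean mask and np.where identify the same positions.

```python
import numpy as np

X = np.array([-10000, -1000, -100, -10, -1, 0, 1, 10, 100, 1000, 10000])

mask = X > 100              # boolean array, True where the condition holds
print(mask.nonzero()[0])    # [ 9 10] -- the same indices np.where returns
print(X[mask])              # [ 1000 10000]
```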

How to remove specific values in a multi level numpy array given matrix of indices

Suppose x = np.array([[30,60,70],[100,20,80]]) and I wish to remove all elements that are < 60. That is, the resulting array should be x = np.array([[60,70],[100,80]]).
I use np.where(x < 60) to find the indices of the elements to remove, and I get indices = (array([0, 1]), array([0, 1])). However, when I try to delete the elements in x via np.delete(x, indices), I get array([ 70, 100, 20, 80]) rather than what I was hoping for.
What can I do to achieve the desired result?
import numpy as np
x = np.array([[30, 60, 70],
              [100, 20, 80]])
new_x = np.array([np.delete(i, np.where(i < 60)[0]) for i in x])
print(new_x)
I got it this way, though I don't know whether it is too slow for large arrays:
import numpy as np
d = np.array([[30, 60, 70],
              [100, 20, 80]])
f = lambda x: x >= 60  # keep elements >= 60, i.e. remove those < 60
a = np.array([row[f(row)] for row in d])
print(a)
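A caveat worth noting for both answers above: per-row filtering only stacks back into a regular 2-D array when every row keeps the same number of elements; otherwise recent NumPy versions refuse to build a ragged array. A sketch with the example data, where each row happens to keep two elements:

```python
import numpy as np

x = np.array([[30, 60, 70],
              [100, 20, 80]])

rows = [row[row >= 60] for row in x]  # keep elements >= 60 in each row

# Both rows kept two elements, so stacking back into a 2-D array is safe here.
new_x = np.array(rows)
print(new_x)
```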

Python Numpy optimized calculate mean of an array on indices given by other array

How can I calculate the mean of the values in a numpy array y, grouped by the corresponding values in another array x?
import numpy
x = numpy.array([100, 100, 20000, 20000, 100, 13, 100, numpy.nan])
y = numpy.array([10, 20, 30, 40, numpy.nan, 50, 60, 70])
expected result:
result[13] = 50 / 1
result[100] = (10 + 20 + 60) / 3
result[20000] = (30 + 40) / 2
The following code works but is not efficient due to the size of my real datasets:
result = {}
unique = numpy.unique(x[~numpy.isnan(x)]).astype(int)
for elem in unique:
    pos = numpy.where(x == elem)
    avg = numpy.nanmean(y[pos])
    result[elem] = avg
print(result)
I've read about numpy.bincount, but couldn't work out how to use it.
Here is how to use bincount:
>>> nn = ~(np.isnan(x) | np.isnan(y))
>>> xr, yr = x[nn], y[nn]
>>> unq, idx, cnt = np.unique(xr, return_inverse=True, return_counts=True)
>>> dict(zip(unq, np.bincount(idx, yr) / cnt))
{13.0: 50.0, 100.0: 30.0, 20000.0: 35.0}
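To unpack that one-liner, a sketch with the same data: return_inverse gives each element's group id in unq, and bincount with weights= sums y per group, so dividing by the group counts yields the means.

```python
import numpy as np

x = np.array([100, 100, 20000, 20000, 100, 13, 100, np.nan])
y = np.array([10, 20, 30, 40, np.nan, 50, 60, 70])

nn = ~(np.isnan(x) | np.isnan(y))   # drop pairs where either value is NaN
xr, yr = x[nn], y[nn]

unq, idx, cnt = np.unique(xr, return_inverse=True, return_counts=True)
print(idx)                           # [1 1 2 2 0 1] -- group id per element

sums = np.bincount(idx, weights=yr)  # per-group sums of yr
print(dict(zip(unq, sums / cnt)))    # {13.0: 50.0, 100.0: 30.0, 20000.0: 35.0}
```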

merging multiple numpy arrays

I have 3 numpy arrays which store image data of shape (4, 100, 100).
arr1= np.load(r'C:\Users\x\Desktop\py\output\a1.npy')
arr2= np.load(r'C:\Users\x\Desktop\py\output\a2.npy')
arr3= np.load(r'C:\Users\x\Desktop\py\output\a3.npy')
I want to merge all 3 arrays into 1 array.
I have tried in this way:
merg_arr = np.zeros((len(arr1)+len(arr2)+len(arr3), 4,100,100), dtype=input_img.dtype)
Now this makes an array of the required length, but I don't know how to copy all the data into this array. Maybe using a loop?
This will do the trick:
merge_arr = np.concatenate([arr1, arr2, arr3], axis=0)
np.concatenate joins the arrays along an existing axis (here axis 0); their shapes must match on all the other axes.
Demo:
arr1 = np.empty((60, 4, 10, 10))
arr2 = np.empty((14, 4, 10, 10))
arr3 = np.empty((6, 4, 10, 10))
merge_arr = np.concatenate([arr1, arr2, arr3], axis=0)
print(merge_arr.shape) # (80, 4, 10, 10)
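For contrast, a small sketch of np.stack, which adds a new leading axis and therefore needs all input shapes (including the first axis) to match exactly:

```python
import numpy as np

a = np.empty((6, 4, 10, 10))
b = np.empty((6, 4, 10, 10))

# concatenate extends an existing axis; stack creates a new one.
print(np.concatenate([a, b], axis=0).shape)  # (12, 4, 10, 10)
print(np.stack([a, b], axis=0).shape)        # (2, 6, 4, 10, 10)
```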

Removing every nth element in an array

How do I remove every nth element in an array?
import numpy as np
x = np.array([0,10,27,35,44,32,56,35,87,22,47,17])
n = 3 # remove every 3rd element
...something like the opposite of x[0::n]? I've tried this, but of course it doesn't work:
for i in np.arange(0, len(x), n):
    x = np.delete(x, i)
You're close... Pass the entire arange as the subslice to np.delete instead of deleting one element at a time (each deletion shifts the indices of the remaining elements, so the loop removes the wrong ones), e.g.:
import numpy as np
x = np.array([0,10,27,35,44,32,56,35,87,22,47,17])
x = np.delete(x, np.arange(0, x.size, 3))
# [10 27 44 32 35 87 47 17]
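Another common idiom (not from the answer above, but equivalent in effect) is a boolean mask, which avoids building the index array:

```python
import numpy as np

x = np.array([0, 10, 27, 35, 44, 32, 56, 35, 87, 22, 47, 17])
n = 3

mask = np.ones(x.size, dtype=bool)
mask[::n] = False          # mark every n-th element (starting at index 0) for removal
print(x[mask])             # [10 27 44 32 35 87 47 17]
```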
Here is another way, using reshaping, if the length of your array is a multiple of n:
import numpy as np
x = np.array([0,10,27,35,44,32,56,35,87,22,47,17])
x = x.reshape(-1,3)[:,1:].flatten()
# [10 27 44 32 35 87 47 17]
On my computer it runs almost twice as fast as the solution with np.delete (between 1.8x and 1.9x, to be honest).
You can also easily perform fancy operations, like m deletions every n values, etc.
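A sketch of the "m deletions every n values" idea, assuming len(x) is a multiple of n and m < n: reshape into rows of n and drop the first m columns of each row.

```python
import numpy as np

x = np.arange(12)          # [0 .. 11]
n, m = 4, 2                # in every group of 4, drop the first 2

kept = x.reshape(-1, n)[:, m:].flatten()
print(kept)                # [ 2  3  6  7 10 11]
```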
Here's a super fast version for 2D arrays: Remove every m-th row and n-th column from a 2D array (assuming the shape of the array is a multiple of (n, m)):
array2d = np.arange(60).reshape(6, 10)
m, n = (3, 5)
remove = lambda x, q: x.reshape(x.shape[0], -1, q)[..., 1:].reshape(x.shape[0], -1).T
remove(remove(array2d, n), m)
returns:
array([[11, 12, 13, 14, 16, 17, 18, 19],
       [21, 22, 23, 24, 26, 27, 28, 29],
       [41, 42, 43, 44, 46, 47, 48, 49],
       [51, 52, 53, 54, 56, 57, 58, 59]])
To generalize for any shape use padding or reduce the input array depending on your situation.
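One way to sketch the "reduce the input array" option: trim each axis down to the nearest multiple of (m, n) before reshaping (padding with dummy values would work similarly).

```python
import numpy as np

array2d = np.arange(7 * 11).reshape(7, 11)
m, n = 3, 5

# Trim to the largest shape that is a multiple of (m, n).
trimmed = array2d[: array2d.shape[0] // m * m,
                  : array2d.shape[1] // n * n]
print(trimmed.shape)       # (6, 10)
```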
Speed comparison:
from time import time

# remove
start = time()
for _ in range(100000):
    res = remove(remove(array2d, n), m)
print('remove:', time() - start)

# delete
start = time()
for _ in range(100000):
    tmp = np.delete(array2d, np.arange(0, array2d.shape[0], m), axis=0)
    res = np.delete(tmp, np.arange(0, array2d.shape[1], n), axis=1)
print('delete:', time() - start)

"""
remove: 0.3835930824279785
delete: 3.173515558242798
"""
So, compared to numpy.delete the above method is significantly faster.
