how are you?
I have a distance matrix and need to perform a filter based on another list before applying some functions.
The matrix has 10 elements that represent machines and the distances between them. I need to filter it to keep only the distances between some chosen machines.
matrix = [[0, 1, 3, 17, 24, 12, 18, 16, 17, 15],
[1, 0, 2, 2, 5, 6, 13, 11, 12, 10],
[3, 2, 0, 1, 6, 12, 18, 12, 17, 15],
[17, 2, 1, 0, 3, 12, 17, 15, 16, 14],
[24, 5, 6, 3, 0, 1, 24, 22, 23, 21],
[12, 6, 12, 12, 1, 0, 12, 10, 11, 9],
[18, 13, 18, 17, 24, 12, 0, 3, 4, 5],
[16, 11, 12, 15, 22, 10, 3, 0, 1, 2],
[17, 12, 17, 16, 23, 11, 4, 1, 0, 1],
[15, 10, 15, 14, 21, 9, 5, 2, 1, 0]]
The list used for filtering, for example, is:
filter_list = [1, 2, 7, 10]
The idea is to use this list to filter both the rows and the columns (the indices within each sublist) to get the final matrix:
final_matrix = [[0, 1, 18, 15],
[1, 0, 13, 10],
[18, 13, 0, 5],
[15, 10, 5, 0]]
It is worth noting that the filter list elements vary. Can someone please help me?
That's what I tried:
final_matrix = []
for i in range(0, len(filter_list)):
    for j in range(0, len(filter_list[i])):
        a = filter_list[i][j]
        final_matrix.append(matrix[a-1])
print(final_matrix)
I loop this way because the filter_list can have sublists. This is what I get:
final_matrix = [[0, 1, 3, 17, 24, 12, 18, 16, 17, 15],
[1, 0, 2, 2, 5, 6, 13, 11, 12, 10],
[18, 13, 18, 17, 24, 12, 0, 3, 4, 5],
[15, 10, 15, 14, 21, 9, 5, 2, 1, 0]]
I could not remove the spare elements.
You forgot to filter by column ids. You can do this using nested list comprehensions.
final_matrix = [[matrix[row-1][col-1] for col in filter_list] for row in filter_list]
final_matrix = []
for i in filter_list:
    to_append = []
    for j in filter_list:
        to_append.append(matrix[i-1][j-1])
    final_matrix.append(to_append)
Or, with a list comprehension:
final_matrix = [[matrix[i-1][j-1] for j in filter_list] for i in filter_list]
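If the matrix is held as a NumPy array, the same row-and-column selection can be sketched with np.ix_, which builds an open mesh from the index lists. This is an alternative to the list comprehensions above, not something the answers used. The name idx is introduced here for illustration, and the filter list is assumed to be flat and 1-based, as in the question.

```python
import numpy as np

matrix = np.array([[0, 1, 3, 17, 24, 12, 18, 16, 17, 15],
                   [1, 0, 2, 2, 5, 6, 13, 11, 12, 10],
                   [3, 2, 0, 1, 6, 12, 18, 12, 17, 15],
                   [17, 2, 1, 0, 3, 12, 17, 15, 16, 14],
                   [24, 5, 6, 3, 0, 1, 24, 22, 23, 21],
                   [12, 6, 12, 12, 1, 0, 12, 10, 11, 9],
                   [18, 13, 18, 17, 24, 12, 0, 3, 4, 5],
                   [16, 11, 12, 15, 22, 10, 3, 0, 1, 2],
                   [17, 12, 17, 16, 23, 11, 4, 1, 0, 1],
                   [15, 10, 15, 14, 21, 9, 5, 2, 1, 0]])
filter_list = [1, 2, 7, 10]

# Convert the 1-based machine numbers to 0-based indices
# (idx is a name assumed here, not taken from the question).
idx = np.array(filter_list) - 1

# np.ix_ pairs every selected row with every selected column.
final_matrix = matrix[np.ix_(idx, idx)]
print(final_matrix.tolist())
```

Plain matrix[idx, idx] would instead pick only the diagonal entries, which is why the mesh is needed.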
I have an array with two rows, in which each value is repeated across 4 consecutive columns.
a = np.array([[ 0, 0, 0, 0, 4, 4, 4, 4, 7, 7, 7, 7, 1, 1, 1, 1],
[ 10, 10, 10, 10, 14, 14, 14, 14, 17, 17, 17, 17, 21, 21, 21, 21]])
I want to keep one value for every 4 columns. For example, a single 0 for the first 4 columns of the first row. I cannot use unique(). The desired output is:
b = np.array([[ 0,4, 7, 1],
[ 10,14, 17, 21]])
You can simply take every 4th column like so:
>>> a = np.array([[ 0, 0, 0, 0, 4, 4, 4, 4, 7, 7, 7, 7, 1, 1, 1, 1],
... [ 10, 10, 10, 10, 14, 14, 14, 14, 17, 17, 17, 17, 21, 21, 21, 21]])
>>> a[:,::4]
array([[ 0, 4, 7, 1],
[10, 14, 17, 21]])
For more info, see numpy slicing.
You can remove consecutive duplicates from a list (one row at a time):
def remove_duplicates(arr):
    """
    Remove consecutive duplicates from a list, in place.
    """
    if len(arr) == 0:
        return arr
    else:
        i = 0
        while i < len(arr) - 1:
            if arr[i] == arr[i + 1]:
                del arr[i]
            else:
                i += 1
        return arr
print(remove_duplicates([0,0,0,0,1,1,1,1,0,0,0,0]))
[0, 1, 0]
print(remove_duplicates([0,0,0,0,4,4,4,4,7,7,7,7,1,1,1,1]))
[0, 4, 7, 1]
Use np.apply_along_axis, which applies a function to each 1-D slice along the given axis (here, each row):
>>> np.apply_along_axis(lambda x: x[::4], axis=1, arr=a)
array([[ 0, 4, 7, 1],
[10, 14, 17, 21]])
Here, the function we pass in just takes every 4th element of the row (this assumes 4 is always static).
You could use itertools.groupby:
>>> import numpy as np
>>> from itertools import groupby
>>> a = np.array([[0, 0, 0, 0, 4, 4, 4, 4, 7, 7, 7, 7, 1, 1, 1, 1], [10, 10, 10, 10, 14, 14, 14, 14, 17, 17, 17, 17, 21, 21, 21, 21]])
>>> a
array([[ 0, 0, 0, 0, 4, 4, 4, 4, 7, 7, 7, 7, 1, 1, 1, 1],
[10, 10, 10, 10, 14, 14, 14, 14, 17, 17, 17, 17, 21, 21, 21, 21]])
>>> b = np.array([[k for k, _ in groupby(arr)] for arr in a])
>>> b
array([[ 0, 4, 7, 1],
[10, 14, 17, 21]])
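For runs whose length is not fixed at 4, a vectorized variant of the same idea can be sketched with a boolean mask that marks where a new run starts. This is only a sketch: it assumes the runs line up across all rows, as they do in the question, and the names changed and keep are introduced here for illustration.

```python
import numpy as np

a = np.array([[0, 0, 0, 0, 4, 4, 4, 4, 7, 7, 7, 7, 1, 1, 1, 1],
              [10, 10, 10, 10, 14, 14, 14, 14, 17, 17, 17, 17, 21, 21, 21, 21]])

# True wherever a column differs from the column to its left in any row.
changed = np.any(a[:, 1:] != a[:, :-1], axis=0)

# The first column always starts a run, so it is always kept.
keep = np.concatenate(([True], changed))

b = a[:, keep]
print(b)
```

If the runs did not line up between rows, some rows would keep repeated values, and a per-row approach such as groupby would be the safer choice.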
I have an array like this:
A = np.array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20]])
What I want to do is add 1 to each value in the first and last columns. I want to understand broadcasting (to avoid loops) by using an appropriate vector, but what I have tried doesn't work. Expected result:
A = np.array([[ 2, 2, 3, 4, 6],
[ 7, 7, 8, 9, 11],
[12, 12, 13, 14, 16],
[17, 17, 18, 19, 21]])
You can use numpy indexing to do this. Try this:
# 0 is the first and -1 is the last column
A[:,[0,-1]] = A[:,[0,-1]]+1
Or
A[:,(0,-1)] = A[:,(0,-1)]+1
Or
A[:,[0,-1]]+=1
Or
A[:,(0,-1)]+=1
Output in each case:
array([[ 2, 2, 3, 4, 6],
[ 7, 7, 8, 9, 11],
[12, 12, 13, 14, 16],
[17, 17, 18, 19, 21]])
You can use the vector [1,0,0,0,1] and NumPy will do the broadcasting for you.
b = np.array([1,0,0,0,1])
A + b
array([[ 2, 2, 3, 4, 6],
[ 7, 7, 8, 9, 11],
[12, 12, 13, 14, 16],
[17, 17, 18, 19, 21]])
If you would like to know how broadcasting works, you can simply try to broadcast once by yourself.
b = np.array([1,0,0,0,1])
B = np.tile(b,(A.shape[0],1))
array([[1, 0, 0, 0, 1],
[1, 0, 0, 0, 1],
[1, 0, 0, 0, 1],
[1, 0, 0, 0, 1]])
A + B
Same result.
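To see what NumPy does internally without building the tiled copy, a small sketch with np.broadcast_to may help. It returns a read-only view with the broadcast shape. The array A is rebuilt here with np.arange so the snippet is self-contained, and result is a name assumed for illustration.

```python
import numpy as np

A = np.arange(1, 21).reshape(4, 5)
b = np.array([1, 0, 0, 0, 1])

# Shapes are compared from the trailing axis: (4, 5) against (5,) is compatible,
# so b is treated as if it were repeated along the first axis.
B = np.broadcast_to(b, A.shape)  # read-only view, no data is copied

result = A + b
assert np.array_equal(result, A + B)
print(result)
```

A vector of length 4 would not broadcast against A this way, because the trailing dimensions (5 and 4) would not match.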
I've got K feature vectors that all share dimension n but have a variable dimension m (n x m). They all live in a list together.
to_be_padded = []
to_be_padded.append(np.reshape(np.arange(9),(3,3)))
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
to_be_padded.append(np.reshape(np.arange(18),(3,6)))
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17]])
to_be_padded.append(np.reshape(np.arange(15),(3,5)))
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
What I am looking for is a smart way to zero pad the rows of these np.arrays such that they all share the same dimension m. I've tried solving it with np.pad but I have not been able to come up with a pretty solution. Any help or nudges in the right direction would be greatly appreciated!
The result should leave the arrays looking like this:
array([[0, 1, 2, 0, 0, 0],
[3, 4, 5, 0, 0, 0],
[6, 7, 8, 0, 0, 0]])
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17]])
array([[ 0, 1, 2, 3, 4, 0],
[ 5, 6, 7, 8, 9, 0],
[10, 11, 12, 13, 14, 0]])
You could use np.pad for that, which can also pad 2-D arrays using a tuple of values specifying the padding width, ((top, bottom), (left, right)). For that you could define:
def pad_to_length(x, m):
    return np.pad(x, ((0, 0), (0, m - x.shape[1])), mode='constant')
Usage
You could start by finding the ndarray with the largest number of columns. Say you have two of them, a and b:
a = np.array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
b = np.array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
m = max(i.shape[1] for i in [a,b])
# 5
And then use this parameter to pad the ndarrays:
pad_to_length(a, m)
array([[0, 1, 2, 0, 0],
[3, 4, 5, 0, 0],
[6, 7, 8, 0, 0]])
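Applied to the whole list from the question, the same helper could be used as sketched below. The helper is repeated so the snippet runs on its own. The names padded and stacked are assumptions made here, and the final np.stack step is optional and not part of the question.

```python
import numpy as np

def pad_to_length(x, m):
    # Same helper as above: pad with zeros only on the right of axis 1.
    return np.pad(x, ((0, 0), (0, m - x.shape[1])), mode='constant')

to_be_padded = [np.arange(9).reshape(3, 3),
                np.arange(18).reshape(3, 6),
                np.arange(15).reshape(3, 5)]

# Widest array in the list decides the target number of columns.
m = max(x.shape[1] for x in to_be_padded)
padded = [pad_to_length(x, m) for x in to_be_padded]

# Optional: with equal shapes the arrays can be stacked into one (K, n, m) array.
stacked = np.stack(padded)
print(stacked.shape)
```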
I believe there is no very efficient solution for this. I think you will need to loop over the list with a for loop and treat every array individually:
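n = to_be_padded[0].shape[0]
maxM = max(arr.shape[1] for arr in to_be_padded)
for i in range(len(to_be_padded)):
    # start from zeros, then copy the original values into the left columns
    padded = np.zeros((n, maxM), dtype=to_be_padded[i].dtype)
    padded[:, :to_be_padded[i].shape[1]] = to_be_padded[i]
    to_be_padded[i] = padded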
where maxM is the longest m of the matrices in your list.
This is my first time handling multidimensional arrays and I'm having problems accessing elements. I'm trying to get the red pixels of a picture but just the first 8 elements within the array. Here's the code
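from PIL import Image  # the bare "import Image" only works with the old PIL package
import numpy as np
im = Image.open(r"C:\Users\Jones\Pictures\1.jpg")  # raw string so backslashes are not treated as escapes
pix = im.load()
r, g, b = np.array(im).T
print(r[0:8])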
Since you're dealing with images, r is a 2-D array. To get the first 8 pixels in the image, try
r.flatten()[:8]
This will wrap around to the following rows automatically if the first row has fewer than 8 pixels.
Do you want the first 8 columns of every row? Try r[:,:8]
Do you only want the first row? Try r[0,:8]
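The difference between these three expressions is easiest to see on a small stand-in array whose rows are shorter than 8. The 4x3 array below is made up for illustration and is not the image from the question.

```python
import numpy as np

# Stand-in for the red channel: 4 rows of 3 values each.
r = np.arange(12).reshape(4, 3)

first_eight = r.flatten()[:8]  # flattens row by row, so it continues into later rows
first_row = r[0, :8]           # only row 0; the slice is cut off at the row length
all_rows = r[:, :8]            # up to 8 columns of every row

print(first_eight)
print(first_row)
print(all_rows.shape)
```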
You can do it like this:
r[0][:8]
Note, however, that this will return fewer than 8 values if the first row has fewer than 8 pixels. To fix that, do this:
from itertools import chain
r = list(chain.from_iterable(r))
r[:8]
or (if you don't want to import an entire module):
r = [val for element in r for val in element]
r[:8]
I think it could be simpler. This example uses a random matrix (this will be your r matrix):
In [7]: from pylab import * # convention
In [8]: r = randint(0,10,(10,10)) # this is your image
In [9]: r
array([[7, 9, 5, 5, 6, 8, 1, 4, 3, 4],
[5, 4, 4, 4, 2, 6, 2, 6, 4, 2],
[1, 4, 9, 9, 2, 6, 1, 9, 0, 6],
[5, 9, 0, 7, 9, 9, 5, 2, 0, 7],
[8, 3, 3, 9, 0, 0, 5, 9, 2, 2],
[5, 3, 7, 8, 8, 1, 6, 3, 2, 0],
[0, 2, 5, 7, 0, 1, 0, 2, 1, 2],
[4, 0, 4, 5, 9, 9, 3, 8, 3, 7],
[4, 6, 9, 9, 5, 9, 3, 0, 5, 1],
[6, 9, 9, 0, 3, 4, 9, 7, 9, 6]])
Then, extract first 8 columns and do something
In [17]: r_8 = r[:,:8] # extract columns
In [18]: r_8
Out[18]:
array([[7, 9, 5, 5, 6, 8, 1, 4],
[5, 4, 4, 4, 2, 6, 2, 6],
[1, 4, 9, 9, 2, 6, 1, 9],
[5, 9, 0, 7, 9, 9, 5, 2],
[8, 3, 3, 9, 0, 0, 5, 9],
[5, 3, 7, 8, 8, 1, 6, 3],
[0, 2, 5, 7, 0, 1, 0, 2],
[4, 0, 4, 5, 9, 9, 3, 8],
[4, 6, 9, 9, 5, 9, 3, 0],
[6, 9, 9, 0, 3, 4, 9, 7]])
In [19]: r_8 = r_8 * 2 # do something
In [20]: r_8
Out[20]:
array([[14, 18, 10, 10, 12, 16, 2, 8],
[10, 8, 8, 8, 4, 12, 4, 12],
[ 2, 8, 18, 18, 4, 12, 2, 18],
[10, 18, 0, 14, 18, 18, 10, 4],
[16, 6, 6, 18, 0, 0, 10, 18],
[10, 6, 14, 16, 16, 2, 12, 6],
[ 0, 4, 10, 14, 0, 2, 0, 4],
[ 8, 0, 8, 10, 18, 18, 6, 16],
[ 8, 12, 18, 18, 10, 18, 6, 0],
[12, 18, 18, 0, 6, 8, 18, 14]])
Now, this is the trick. Replace the first 8 columns in r using hstack:
In [21]: r = hstack((r_8, r[:,8:])) # it replaces the FIRST 8 columns, note the indexing notation
In [22]: r
Out[22]:
array([[14, 18, 10, 10, 12, 16, 2, 8, 3, 4], # it does not touch the last 2 columns
[10, 8, 8, 8, 4, 12, 4, 12, 4, 2],
[ 2, 8, 18, 18, 4, 12, 2, 18, 0, 6],
[10, 18, 0, 14, 18, 18, 10, 4, 0, 7],
[16, 6, 6, 18, 0, 0, 10, 18, 2, 2],
[10, 6, 14, 16, 16, 2, 12, 6, 2, 0],
[ 0, 4, 10, 14, 0, 2, 0, 4, 1, 2],
[ 8, 0, 8, 10, 18, 18, 6, 16, 3, 7],
[ 8, 12, 18, 18, 10, 18, 6, 0, 5, 1],
[12, 18, 18, 0, 6, 8, 18, 14, 9, 6]])
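As an alternative to rebuilding the array with hstack, the first 8 columns can also be updated in place through slice assignment. The sketch below uses plain numpy rather than pylab, and a seeded random generator that is an assumption made here so the snippet is reproducible. The name expected is introduced only to compare the two approaches.

```python
import numpy as np

rng = np.random.default_rng(0)
r = rng.integers(0, 10, size=(10, 10))  # stand-in for the image channel

# Result of the hstack approach, computed before r is modified.
expected = np.hstack((r[:, :8] * 2, r[:, 8:]))

# In-place version: a basic slice is a view, so this writes into r directly.
r[:, :8] *= 2

assert np.array_equal(r, expected)
```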
EDIT: As DSM pointed out, the OP is in fact using a NumPy array.
I retract my answer, as nneonneo's is correct.