How can I convert an ndarray to a matrix in scipy? - python

How can I convert an ndarray to a matrix in numpy? I'm trying to import data from a csv and turn it into a matrix.
from numpy import array, matrix, recfromcsv
my_vars = ['docid','coderid','answer1','answer2']
toy_data = matrix( array( recfromcsv('toy_data.csv', names=True)[my_vars] ) )
print toy_data
print toy_data.shape
But I get this:
[[(1, 1, 3, 3) (1, 2, 4, 1) (1, 3, 7, 2) (2, 1, 3, 3) (2, 2, 4, 4)
(2, 4, 3, 1) (3, 1, 3, 3) (3, 2, 4, 3) (3, 3, 3, 4) (4, 4, 5, 1)
(4, 5, 6, 2) (4, 2, 4, 3) (5, 2, 5, 4) (5, 3, 3, 1) (5, 4, 7, 2)
(6, 1, 3, 3) (6, 5, 4, 1) (6, 2, 5, 2)]]
(1, 18)
What do I have to do to get a 4 by 18 matrix out of this code? There's got to be an easy answer to this question, but I just can't find it.

If the ultimate goal is to make a matrix, there's no need to create a recarray with named columns. You could use np.loadtxt to load the csv into an ndarray, then use np.asmatrix to convert it to a matrix:
import numpy as np
toy_data = np.asmatrix(np.loadtxt('toy_data.csv', delimiter=',', skiprows=1))
print(toy_data)
print(toy_data.shape)
yields
[[ 1. 1. 3. 3.]
[ 1. 2. 4. 1.]
[ 1. 3. 7. 2.]
[ 2. 1. 3. 3.]
[ 2. 2. 4. 4.]
[ 2. 4. 3. 1.]
[ 3. 1. 3. 3.]
[ 3. 2. 4. 3.]
[ 3. 3. 3. 4.]
[ 4. 4. 5. 1.]
[ 4. 5. 6. 2.]
[ 4. 2. 4. 3.]
[ 5. 2. 5. 4.]
[ 5. 3. 3. 1.]
[ 5. 4. 7. 2.]
[ 6. 1. 3. 3.]
[ 6. 5. 4. 1.]
[ 6. 2. 5. 2.]]
(18, 4)
Note: the skiprows argument is used to skip over the header in the csv.
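As a minimal, self-contained sketch of the same approach, the snippet below runs a few of the rows shown above through np.loadtxt from an in-memory string instead of a file. The csv_text variable and the use of io.StringIO are assumptions made only so the example runs without toy_data.csv.

```python
import io
import numpy as np

# Hypothetical stand-in for toy_data.csv: a header line plus three data rows.
csv_text = "docid,coderid,answer1,answer2\n1,1,3,3\n1,2,4,1\n1,3,7,2\n"

# skiprows=1 skips the header; the result is a plain 2-D float ndarray.
data = np.loadtxt(io.StringIO(csv_text), delimiter=',', skiprows=1)

# asmatrix wraps the ndarray as a matrix without copying the data.
toy_matrix = np.asmatrix(data)
print(toy_matrix.shape)  # (3, 4)
```

Each CSV row becomes one matrix row, so 18 data rows give an (18, 4) matrix; use toy_matrix.T if you need the 4 by 18 layout.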

You can just read all your values into a vector, then reshape it.
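import numpy
def _ReadCSV(fileobj):
    # Yield every remaining cell of the file as a float.
    for line in fileobj:
        for el in line.split(","):
            yield float(el)
with open("toy_data.csv") as fo:
    # The first line holds the column names; a list is needed so len() works.
    header = [name.strip() for name in fo.readline().split(",")]
    a = numpy.fromiter(_ReadCSV(fo), numpy.float64)
a.shape = (-1, len(header))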
But there may be an even more direct way with newer numpy.
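One candidate for such a more direct route, offered only as a sketch: np.genfromtxt can read the header itself, and numpy.lib.recfunctions.structured_to_unstructured (available in NumPy 1.16 and later) turns the resulting structured array into an ordinary 2-D array. The csv_text string is a hypothetical stand-in for the file.

```python
import io
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical stand-in for toy_data.csv.
csv_text = "docid,coderid,answer1,answer2\n1,1,3,3\n1,2,4,1\n1,3,7,2\n"

# names=True takes the field names from the first line; each row becomes one record.
records = np.genfromtxt(io.StringIO(csv_text), delimiter=',', names=True)

# Collapse the named fields into the columns of a regular float array.
plain = rfn.structured_to_unstructured(records)
print(records.dtype.names)
print(plain.shape)  # (3, 4)
```

This also shows why the original attempt produced a (1, 18) result: a structured array is one-dimensional, with each whole record counting as a single element.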

Related

How to initialize locations of numpy array using dictionary keys and values?

I have the following numpy array which is basically a 3 channel image:
arr = np.zeros((6, 4, 3), dtype=np.float32)
# dictionary of values, key is array location
values_of_channel_0 = {
(0, 2) : 1,
(1, 0) : 1,
(1, 3) : 5,
(2, 1) : 2,
(2, 2) : 3,
(2, 3) : 1,
(3, 0) : 1,
(3, 2) : 2,
(4, 0) : 2,
(4, 2) : 20,
(5, 0) : 1,
(5, 2) : 10,
(5, 3) : 1
}
I am trying to find the most elegant way to set all the values of the 3rd channel according to the dictionary. Here is what I tried:
locations = list(values_of_channel_0.keys())
values = list(values_of_channel_0.values())
arr[lc,0] = values # trying to set the 3rd channel
But this fails.
Is there a way in which this can be done without looping over keys and values?
What's wrong with a simple loop? Something will have to iterate over the key/value-pairs you provide in your dictionary in any case?
import numpy as np
arr = np.zeros((6, 4, 3), dtype=np.float32)
# dictionary of values, key is array location
values_of_channel_0 = {
(0, 2) : 1,
(1, 0) : 1,
(1, 3) : 5,
(2, 1) : 2,
(2, 2) : 3,
(2, 3) : 1,
(3, 0) : 1,
(3, 2) : 2,
(4, 0) : 2,
(4, 2) : 20,
(5, 0) : 1,
(5, 2) : 10,
(5, 3) : 1
}
for (a, b), v in values_of_channel_0.items():
    arr[a, b, 0] = v
print(arr)
Result:
[[[ 0. 0. 0.]
[ 0. 0. 0.]
[ 1. 0. 0.]
[ 0. 0. 0.]]
[[ 1. 0. 0.]
[ 0. 0. 0.]
[ 0. 0. 0.]
[ 5. 0. 0.]]
[[ 0. 0. 0.]
[ 2. 0. 0.]
[ 3. 0. 0.]
[ 1. 0. 0.]]
[[ 1. 0. 0.]
[ 0. 0. 0.]
[ 2. 0. 0.]
[ 0. 0. 0.]]
[[ 2. 0. 0.]
[ 0. 0. 0.]
[20. 0. 0.]
[ 0. 0. 0.]]
[[ 1. 0. 0.]
[ 0. 0. 0.]
[10. 0. 0.]
[ 1. 0. 0.]]]
If you insist on not looping for the assignment, you can construct a data structure that can be assigned at once:
channel_0 = [[values_of_channel_0[b, a] if (b, a) in values_of_channel_0 else 0 for a in range(4)] for b in range(6)]
arr[..., 0] = channel_0
But this is clearly rather pointless and not even more efficient. If you have some control over how values_of_channel_0 is constructed, you could consider constructing it as a nested list or array of the right dimensions immediately, to allow for this type of assignment.
Users @mechanicpig and @michaelszczesny offer a very clean alternative (which will be more efficient since it relies on the efficient implementation of zip()):
arr[(*zip(*values_of_channel_0), 0)] = list(values_of_channel_0.values())
Edit: you asked for an explanation of the left-hand side.
This hinges on the unpacking operator *. *values_of_channel_0 spreads all the keys of the dictionary values_of_channel_0 into a call to zip(). Since these keys are all 2-tuples of int, zip will yield two tuples, one with all the first coordinates (0, 1, 1, ...) and the second with the second coordinates (2, 0, 3, ...).
Since the call to zip() is also preceded by *, these two values will be spread to index arr[], together with a final coordinate 0. So this:
arr[(*zip(*values_of_channel_0), 0)] = ...
Is essentially the same as:
arr[((0, 1, 1, ...), (2, 0, 3, ...), 0)] = ...
That's a selection from arr (made with integer array indexing rather than a true slice) with exactly the same number of elements as the dictionary, covering all the elements with the needed coordinates. Dictionaries keep keys and values in the same order, so assigning list(values_of_channel_0.values()) to it works and has the desired effect of assigning the matching values to the correct coordinates.
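To make the unpacking step concrete, here is a small sketch with a three-entry dictionary. The names values, rows and cols and the reduced array shape are chosen for illustration and are not part of the original answer.

```python
import numpy as np

# Hypothetical, smaller version of the dictionary from the question.
values = {(0, 2): 1, (1, 0): 4, (1, 3): 5}

# zip(*values) transposes the keys: one tuple of row indices, one of column indices.
rows, cols = zip(*values)
print(rows)  # (0, 1, 1)
print(cols)  # (2, 0, 3)

arr = np.zeros((2, 4, 3), dtype=np.float32)

# Equivalent to arr[(*zip(*values), 0)] = ..., written out in two steps.
arr[rows, cols, 0] = list(values.values())
print(arr[..., 0])
```

Splitting the index into named pieces costs one extra line but makes it easier to see which coordinates are being written.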

How to do an index increment for my case?

I have to find the minimum for each column in a matrix, but there are two rules;
Each column will start from the "index+1" of the previous column, except for the first column
If one of the columns has exactly the index equal to the total number of rows of the matrix, then the rest indices for all the columns will be equal to the number of rows
As an example;
[[ "-1". -11. 0. 8. 1. ]
[ 2. 1. 0. 5. 1. ]
[ 4. 1. -2. 6. 7. ]
[ 8. 3. 1. 3. 0. ]
[ 5. "0". 1. 0. 8. ]
[ 9. 3. "-1". -1. 6.5]
[ 5. 3. 2. 5. 3. ]
[ 10. 3. 7. "1". "-1". ]]
The indices are inside quotations, [0,4,5,7,7]
Another example;
[[ 1. 1. 0. 0. 1.]
[ 2. 1. 0. 5. 1.]
[-4. -1. 2. 6. 7.]
['-5' 3. 1. 1. 0.]
[ 5. '0'. 1. 0. 8.]
[ 5. 3. '-1'. -1. 0.]
[ 5. 3. 1. '1'. 0.]
[ 5. 3. 1. 1. 0.]]
The indices here are [3,4,5,6,7]
I tried to do the following, but I am having errors. Could you please tell me how to do so?
def lst_min(matrix, columnindex, minIndex):
    if minIndex == matrix.shape[0]:
        return matrix.shape[0]
    else:
        return np.argmin(matrix[minIndex:, columnindex]) + minIndex
currentMinIndex = 0
lst = []
for i in range(a.shape[1]):
    w = lst_min(matrix=a, columnindex=i, minIndex=currentMinIndex)
    if w > a.shape[0]:
        w = a.shape[0]
    lst.append(w)
    if w == 0:
        c = 1
    currentMinIndex = w + c
    if currentMinIndex > a.shape[0]:
        currentMinIndex = a.shape[0]
Your code uses lst.append(w) with some strange logic...
You use lst_min to find the index of the minimum, but it simply returns matrix.shape[0] once minIndex == matrix.shape[0], and that value is one past the last valid row index.
FYI,
# post source code next time if you can,
# it will be really helpful to others to run your question easily
# and focus on the main problem quickly.
import numpy as np
a = np.array(
[[ -1, -11, 0, 8, 1. ],
[ 2, 1, 0, 5, 1. ],
[ 4, 1, -2, 6, 7. ],
[ 8, 3, 1, 3, 0. ],
[ 5, 0, 1, 0, 8. ],
[ 9, 3, -1, -1, 6.5],
[ 5, 3, 2, 5, 3. ],
[ 10, 3, 7, 1, -1. ]]
)
b = np.array(
[[ 1, 1, 0, 0, 1],
[ 2, 1, 0, 5, 1],
[-4, -1, 2, 6, 7],
[-5, 3, 1, 1, 0],
[ 5, 0, 1, 0, 8],
[ 5, 3, -1, -1, 0],
[ 5, 3, 1, 1, 0],
[ 5, 3, 1, 1, 0]]
)
lst = []
start_idx = 0
for vec in a.T: # forloop column-wise
    if start_idx >= vec.shape[0]-1: # index compare shape, should -1
        lst.append(vec.shape[0]-1) # put the last index
    else: # find minimum index
        min_idx = np.argmin(vec[start_idx:]) # slice it
        start_idx += min_idx # add back the true index
        lst.append(start_idx) # append to result
        start_idx += 1 # next column, use index + 1 (your rule 1)
        if start_idx >= vec.shape[0]-1: # but if it is larger or equal, fit it back for next column use
            start_idx = vec.shape[0]-1
The results should be:
>>>lst
[0, 4, 5, 7, 7]
# code change to b.T
>>>lst
[3, 4, 5, 6, 7]
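The same loop can be wrapped in a function so that it runs on a and b without editing the code. This is only a sketch of the answer's logic; the name chained_argmin is made up for illustration.

```python
import numpy as np

def chained_argmin(m):
    """For each column, find the row of the minimum, searching only below
    the row picked for the previous column (clamped to the last row)."""
    last = m.shape[0] - 1
    result = []
    start = 0
    for col in m.T:                      # iterate column-wise
        if start >= last:                # nothing left to search: use the last row
            result.append(last)
            continue
        idx = start + int(np.argmin(col[start:]))  # offset back to a full-column index
        result.append(idx)
        start = min(idx + 1, last)       # rule 1: next column starts one row lower
    return result

a = np.array(
    [[-1, -11,  0,  8,  1. ],
     [ 2,   1,  0,  5,  1. ],
     [ 4,   1, -2,  6,  7. ],
     [ 8,   3,  1,  3,  0. ],
     [ 5,   0,  1,  0,  8. ],
     [ 9,   3, -1, -1,  6.5],
     [ 5,   3,  2,  5,  3. ],
     [10,   3,  7,  1, -1. ]]
)
print(chained_argmin(a))  # [0, 4, 5, 7, 7]
```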

Python3: Remove array elements with same coordinate (x,y)

I have this array (x,y,f(x,y)):
a=np.array([[ 1, 5, 3],
[ 4, 5, 6],
[ 4, 5, 6.1],
[ 1, 3, 42]])
I want to remove the duplicates with same x,y. In my array I have (4,5,6) and (4,5,6.1) and I want to remove one of them (no criterion).
If I had 2 columns (x,y) I could use
np.unique(a[:,:2], axis = 0)
But my array has 3 columns and I don't see how to do this in a simple way.
I can do a loop but my arrays can be very large.
Is there a way to do this more efficiently?
If I understand correctly, you need this:
a[np.unique(a[:,:2],axis=0,return_index=True)[1]]
output:
[[ 1. 3. 42.]
[ 1. 5. 3.]
[ 4. 5. 6.]]
Please be mindful that it does not keep the original order of rows in a. If you want to keep the order, simply sort the indices:
a[np.sort(np.unique(a[:,:2],axis=0,return_index=True)[1])]
output:
[[ 1. 5. 3.]
[ 4. 5. 6.]
[ 1. 3. 42.]]
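Both variants can be folded into one helper, sketched below. The function name drop_duplicate_xy and the keep_order flag are illustrative choices, not part of the answer above.

```python
import numpy as np

def drop_duplicate_xy(points, keep_order=True):
    """Keep one row per distinct (x, y) pair, judged on the first two columns."""
    # return_index gives, for each unique (x, y), the index of its first occurrence.
    _, first_idx = np.unique(points[:, :2], axis=0, return_index=True)
    if keep_order:
        # Sorting the indices restores the original row order.
        first_idx = np.sort(first_idx)
    return points[first_idx]

a = np.array([[1, 5, 3],
              [4, 5, 6],
              [4, 5, 6.1],
              [1, 3, 42]])
deduped = drop_duplicate_xy(a)
print(deduped)
```

Because the comparison only looks at the x and y columns, the f(x,y) values are passed through unchanged.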
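I think you want to do this?
np.rint rounds every value to the nearest integer, so the row (4, 5, 6.1) becomes (4, 5, 6) and np.unique then drops it as a duplicate row. Note that this also rounds the f(x,y) column of every other row.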
import numpy as np
a = np.array([
[ 1, 5, 3],
[ 4, 5, 6],
[ 4, 5, 6.1],
[ 1, 3, 42]
])
a = np.unique(np.rint(a), axis = 0)
print(a)
# result:
[[ 1. 3. 42.]
[ 1. 5. 3.]
[ 4. 5. 6.]]

Compare neighbours boolean numpy array in grid

I want to write a function that compares a node in my grid with its 8 neighbours. When at least 3 of the neighbours have the same value as the central node, the node counts as happy.
For example, in this array the central node has the value 0 and 3 of its neighbours are also 0, so the node is happy:
array([[ 1, 0, 1],
[ 1, 0, 1],
[-1, 0, 0]])
I expect a boolean output, True or False.
Can I think of something like this or can I use easily numpy for this?
def nodehappiness(grid, i, j, drempel=3):
    if i,j => 3:
        node == True
Thanks in advance
Try this:
def neighbours(grid, i, j):
    rows = np.array([-1, -1, -1, 0, 0, 1, 1, 1])
    cols = np.array([-1, 0, 1, -1, 1, -1, 0, 1])
    return grid[rows+i,cols+j]
Edit: Example:
grid = np.arange(25).reshape((5,5))
#array([[ 0, 1, 2, 3, 4],
# [ 5, 6, 7, 8, 9],
# [10, 11, 12, 13, 14],
# [15, 16, 17, 18, 19],
# [20, 21, 22, 23, 24]])
neighbours(grid, 0, 0)
# array([24, 20, 21, 4, 1, 9, 5, 6])
Explanation:
With numpy you can use negative indices allowing you to easily access the last entries of an array. This will also work for multiple dimensions:
x = np.array([0,1,2,3])
x[-1]
# 3
x = x.reshape((2,2))
#array([[0, 1],
# [2, 3]])
x[-1,-1]
# 3
You are interested in 8 entries of the matrix.
left above -> row - 1, column - 1
above -> row - 1, column + 0
right above -> row - 1, column + 1
left -> row + 0, column - 1
...
That's what the arrays rows and cols represent. By adding i and j you get all the entries around these coordinates.
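Building on the offset arrays, the happiness test from the question might be sketched as follows. The name node_is_happy and the threshold parameter (the question calls it drempel) are assumptions, as is the choice to wrap around at every edge with a modulo. Negative indices alone only wrap at the top and left edges, while an index one past the last row or column raises an IndexError.

```python
import numpy as np

# Offsets of the 8 neighbours relative to the central node.
ROWS = np.array([-1, -1, -1, 0, 0, 1, 1, 1])
COLS = np.array([-1, 0, 1, -1, 1, -1, 0, 1])

def node_is_happy(grid, i, j, threshold=3):
    """Return True if at least `threshold` neighbours share the node's value."""
    n_rows, n_cols = grid.shape
    # The modulo wraps indices at all four edges (an assumption about the grid).
    neigh = grid[(ROWS + i) % n_rows, (COLS + j) % n_cols]
    return bool(np.count_nonzero(neigh == grid[i, j]) >= threshold)

grid = np.array([[ 1, 0, 1],
                 [ 1, 0, 1],
                 [-1, 0, 0]])
print(node_is_happy(grid, 1, 1))  # True: three neighbours are 0
```

If nodes on the border should simply have fewer neighbours instead of wrapping, the out-of-range offsets would need to be masked out rather than taken modulo the shape.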
Try this.
y=[]
l= len(x)
for i in range(0,l):
    for j in range(0,l):
        if i==int(l/2) and j==int(l/2):
            continue
        y.append(x[j,i])
Are you looking for something like this?
def neighbour(grid, i, j):
    return np.delete((grid[i-1:i+2,j-1:j+2]).reshape(1,9),4)
# Test code
grid = np.arange(16).reshape(4,4)
b = neighbour(grid, 2, 2)
Some hackery using ndimage.generic_filter:
import numpy as np
from scipy import ndimage
def get_neighbors(arr):
    output = []
    def f(x):
        output.append(x)
        return 0
    t = tuple(int((x - 1) / 2) for x in arr.shape)
    footprint = np.ones_like(arr)
    footprint[t] = 0
    ndimage.generic_filter(arr, f, footprint=footprint, mode='wrap')
    return np.array(output)
arr = np.arange(9).reshape(3, 3)
neighbors = get_neighbors(arr)
neighbors_grid = neighbors.reshape(*arr.shape, -1)
print(neighbors)
print(neighbors_grid)
Which prints:
# neighbors
[[8. 6. 7. 2. 1. 5. 3. 4.]
[6. 7. 8. 0. 2. 3. 4. 5.]
[7. 8. 6. 1. 0. 4. 5. 3.]
[2. 0. 1. 5. 4. 8. 6. 7.]
[0. 1. 2. 3. 5. 6. 7. 8.]
[1. 2. 0. 4. 3. 7. 8. 6.]
[5. 3. 4. 8. 7. 2. 0. 1.]
[3. 4. 5. 6. 8. 0. 1. 2.]
[4. 5. 3. 7. 6. 1. 2. 0.]]
# neighbors_grid
[[[8. 6. 7. 2. 1. 5. 3. 4.]
[6. 7. 8. 0. 2. 3. 4. 5.]
[7. 8. 6. 1. 0. 4. 5. 3.]]
[[2. 0. 1. 5. 4. 8. 6. 7.]
[0. 1. 2. 3. 5. 6. 7. 8.]
[1. 2. 0. 4. 3. 7. 8. 6.]]
[[5. 3. 4. 8. 7. 2. 0. 1.]
[3. 4. 5. 6. 8. 0. 1. 2.]
[4. 5. 3. 7. 6. 1. 2. 0.]]]
If you merely want the padded array:
padded = np.pad(arr, pad_width=1, mode='wrap')
print(padded)
Which of course gives:
[[8 6 7 8 6]
[2 0 1 2 0]
[5 3 4 5 3]
[8 6 7 8 6]
[2 0 1 2 0]]

I want to convert a matrix to a list python

Hi there
I need to convert a matrix to a list as the example below
Matrix:
[[ 1. 6. 13. 10. 2.]
[ 2. 9. 10. 13. 15.]
[ 3. 15. 13. 14. 16.]
[ 4. 5. 14. 13. 6.]
[ 5. 18. 16. 4. 3.]
[ 6. 7. 12. 18. 3.]
[ 7. 1. 8. 17. 11.]
[ 8. 14. 5. 4. 16.]
[ 9. 16. 18. 17. 15.]
[ 10. 8. 9. 15. 17.]
[ 11. 11. 17. 18. 12.]]
List:
[(1, 6, 13, 10, 2), (2, 9, 10, 13, 15), (3, 15, 13, 14, 16),
(4, 5, 14, 13, 6), (5, 18, 16, 4, 3), (6, 7, 12, 18, 3),
(7, 1, 8, 17, 11), (8, 14, 5, 4, 16), (9, 16, 18, 17, 15),
(10, 8, 9, 15, 17), (11, 11, 17, 18, 12)]
Thx in advance
Is this a numpy matrix? If so, just use the tolist() method. E.g.:
import numpy as np
x = np.matrix([[1,2,3],
[7,1,3],
[9,4,3]])
y = x.tolist()
This yields:
y --> [[1, 2, 3], [7, 1, 3], [9, 4, 3]]
If you are using numpy and you just want to traverse the matrix as a flat sequence, you can iterate over .flat:
from numpy import array
m = [[ 1., 6., 13., 10., 2.],
     [ 2., 9., 10., 13., 15.],
     [ 3., 15., 13., 14., 16.],
     [ 4., 5., 14., 13., 6.],
     [ 5., 18., 16., 4., 3.],
     [ 6., 7., 12., 18., 3.],
     [ 7., 1., 8., 17., 11.],
     [ 8., 14., 5., 4., 16.],
     [ 9., 16., 18., 17., 15.],
     [ 10., 8., 9., 15., 17.],
     [ 11., 11., 17., 18., 12.]]
for x in array(m).flat:
    print(x)
This will not consume extra memory for a flattened copy, since .flat is an iterator over the array.
The best way to do it is:
result = list(map(tuple, Matrix)) # list() is needed in Python 3, where map returns an iterator
Or, if a flat list is enough, you can use one of these:
1- li = list(i for j in yourMatrix for i in j)
2- li = sum(yourMatrix, [])
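For the exact output asked for in the question (a list of tuples of integers), one possible sketch is shown below. It assumes the matrix is a float NumPy array holding whole numbers; the two-row matrix is a shortened stand-in for the one in the question.

```python
import numpy as np

# Shortened stand-in for the matrix in the question.
matrix = np.array([[1., 6., 13., 10., 2.],
                   [2., 9., 10., 13., 15.]])

# astype(int) drops the ".0", tolist() yields plain Python ints,
# and tuple() turns each row into a tuple.
rows_as_tuples = [tuple(row) for row in matrix.astype(int).tolist()]
print(rows_as_tuples)  # [(1, 6, 13, 10, 2), (2, 9, 10, 13, 15)]
```

Note that astype(int) truncates toward zero, so this is only safe when the values really are whole numbers.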
