I want to convert a matrix to a list in Python

Hi there
I need to convert a matrix to a list as the example below
Matrix:
[[ 1. 6. 13. 10. 2.]
[ 2. 9. 10. 13. 15.]
[ 3. 15. 13. 14. 16.]
[ 4. 5. 14. 13. 6.]
[ 5. 18. 16. 4. 3.]
[ 6. 7. 12. 18. 3.]
[ 7. 1. 8. 17. 11.]
[ 8. 14. 5. 4. 16.]
[ 9. 16. 18. 17. 15.]
[ 10. 8. 9. 15. 17.]
[ 11. 11. 17. 18. 12.]]
List:
[(1, 6, 13, 10, 2), (2, 9, 10, 13, 15), (3, 15, 13, 14, 16),
(4, 5, 14, 13, 6), (5, 18, 16, 4, 3), (6, 7, 12, 18, 3),
(7, 1, 8, 17, 11), (8, 14, 5, 4, 16), (9, 16, 18, 17, 15),
(10, 8, 9, 15, 17), (11, 11, 17, 18, 12)]
Thanks in advance

Is this a numpy matrix? If so, just use the tolist() method. E.g.:
import numpy as np
x = np.matrix([[1,2,3],
[7,1,3],
[9,4,3]])
y = x.tolist()
This yields:
y --> [[1, 2, 3], [7, 1, 3], [9, 4, 3]]

If you are using numpy and only want to traverse the matrix element by element, you can iterate over its flat attribute:
import numpy as np
m = np.array([[ 1.,  6., 13., 10.,  2.],
              [ 2.,  9., 10., 13., 15.],
              [ 3., 15., 13., 14., 16.],
              [ 4.,  5., 14., 13.,  6.],
              [ 5., 18., 16.,  4.,  3.],
              [ 6.,  7., 12., 18.,  3.],
              [ 7.,  1.,  8., 17., 11.],
              [ 8., 14.,  5.,  4., 16.],
              [ 9., 16., 18., 17., 15.],
              [10.,  8.,  9., 15., 17.],
              [11., 11., 17., 18., 12.]])
for x in m.flat:
    print(x)
Iterating over flat does not build an intermediate list, so it uses no extra memory beyond the array itself.

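The best way to do it is:
result = list(map(tuple, Matrix))  # list() is needed in Python 3, where map returns an iterator
This works when Matrix is a list of lists or a 2-D ndarray. For a np.matrix, call Matrix.tolist() first, because iterating over a np.matrix yields 1-row matrices rather than flat rows.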
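The list in the question holds tuples of plain integers, while the matrix holds floats. A minimal sketch of that conversion, assuming the values are whole numbers as in the example (the name `as_tuples` and the shortened two-row matrix are only for illustration):

```python
import numpy as np

# First two rows of the matrix from the question
matrix = np.array([[1., 6., 13., 10., 2.],
                   [2., 9., 10., 13., 15.]])

# tolist() gives nested Python lists; int() drops the trailing ".0"
as_tuples = [tuple(int(v) for v in row) for row in matrix.tolist()]
print(as_tuples)  # [(1, 6, 13, 10, 2), (2, 9, 10, 13, 15)]
```

Going through tolist() first means the same line also works for a np.matrix.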

Or, to flatten the matrix into one flat list, you can use one of these:
1- li = list(i for j in yourMatrix for i in j)
2- li = sum(yourMatrix, [])  # yourMatrix must be a list of lists
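Since sum(yourMatrix, []) builds a new list at every step, it gets slow for large inputs. A small sketch of a linear-time alternative using itertools.chain from the standard library (the name `rows` is just a stand-in for the nested list):

```python
from itertools import chain

rows = [[1, 2, 3], [7, 1, 3], [9, 4, 3]]

# chain.from_iterable walks each inner list once, without intermediate copies
flat = list(chain.from_iterable(rows))
print(flat)  # [1, 2, 3, 7, 1, 3, 9, 4, 3]
```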

Related

Formatting numpy arrays to find the sum between 2 values Python

I am trying to change the output of the function below to the expected values. The function is meant to sum all the values of Numbers that lie between 2 consecutive elements of limits. None of the values in Numbers are between 0 and 2, so the result is 0. The values between 2 and 5 are 3 and 4, so the result is 3+4=7. The function comes from another question.
import numpy as np
def formating(a, b):
    # Formating goes here
    x = np.sort(b)
    # digitize: index of the bin each value of a falls into
    l = np.digitize(a, x)
    # output: sum of the values in each bin
    result = np.bincount(l, weights=a)
    return result
Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
limit1 = np.array([0, 2, 5, 12, 15])
limit2 = np.array([0, 2, 5, 12])
limit3 = np.array([0, 2, 5, 12, 15, 22])
result1 = formating(Numbers, limit1)
result2 = formating(Numbers, limit2)
result3 = formating(Numbers, limit3)
Current output
result1: [ 0. 0. 7. 30. 0. 20.]
result2: [ 0. 0. 7. 30. 20.]
result3: [ 0. 0. 7. 30. 0. 20.]
Wanted Output:
result1: [ 0. 7. 30. 0.]
result2: [ 0. 7. 30. ]
result3: [ 0. 7. 30. 0. 20.]
So just throw out the bins for numbers off the end.
result1 = result1[1:len(limit1)]
result2 = result2[1:len(limit2)]
result3 = result3[1:len(limit3)]
Or, for smarter results, end the function with:
    result = np.bincount(l, weights=a)
    return result[1:len(b)]
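One caveat with slicing: np.bincount only allocates slots up to the largest bin index it sees, so the slice can come up short when the upper bins are empty. A sketch of the complete function that guards against this with the minlength argument (the name `sums` is only for illustration):

```python
import numpy as np

def formating(a, b):
    x = np.sort(b)
    bins = np.digitize(a, x)
    # minlength keeps one slot per bin even when the upper bins are empty
    sums = np.bincount(bins, weights=a, minlength=len(x) + 1)
    # drop slot 0 (values below x[0]) and the last slot (values >= x[-1])
    return sums[1:len(x)]

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
print(formating(Numbers, np.array([0, 2, 5, 12, 15])))      # [ 0.  7. 30.  0.]
print(formating(Numbers, np.array([0, 2, 5, 12])))          # [ 0.  7. 30.]
print(formating(Numbers, np.array([0, 2, 5, 12, 15, 22])))  # [ 0.  7. 30.  0. 20.]
```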

Compare neighbours boolean numpy array in grid

I want to write a function which compares a node in my grid with its 8 neighbours. When at least 3 of the neighbours have the same value as the central node, the node is considered happy.
For example, in this array the central node has value 0 and 3 of its neighbours are also 0, so the node is happy:
array([[ 1, 0, 1],
[ 1, 0, 1],
[-1, 0, 0]])
I expect a boolean output, True or False.
Could I do something like this, or is there an easy way to do it with numpy?
def nodehappiness(grid, i, j, drempel=3):
if i,j => 3:
node == True
Thanks in advance
Try this:
import numpy as np
def neighbours(grid, i, j):
    # row and column offsets of the 8 surrounding cells
    rows = np.array([-1, -1, -1, 0, 0, 1, 1, 1])
    cols = np.array([-1, 0, 1, -1, 1, -1, 0, 1])
    return grid[rows+i, cols+j]
Edit: Example:
grid = np.arange(25).reshape((5,5))
#array([[ 0, 1, 2, 3, 4],
# [ 5, 6, 7, 8, 9],
# [10, 11, 12, 13, 14],
# [15, 16, 17, 18, 19],
# [20, 21, 22, 23, 24]])
neighbours(grid, 0, 0)
# array([24, 20, 21, 4, 1, 9, 5, 6])
Explanation:
With numpy you can use negative indices to access the last entries of an array. This also works in multiple dimensions:
x = np.array([0,1,2,3])
x[-1]
# 3
x = x.reshape((2,2))  # reshape returns a new array, so assign it back
x
#array([[0, 1],
#       [2, 3]])
x[-1,-1]
# 3
You are interested in 8 entries of the matrix.
left above -> row - 1, column - 1
above -> row - 1, column + 0
right above -> row - 1, column + 1
left -> row + 0, column - 1
...
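That's what the arrays rows and cols represent. By adding i and j you get all the entries around these coordinates. Note that negative indices only wrap around at the first row and column; for a node in the last row or column, rows+i or cols+j runs past the end of the array and raises an IndexError.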
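Putting this together with the question, here is a minimal sketch of the happiness check. It assumes the grid should wrap around at every edge, which the question does not state; the modulo is an addition so that the last row and column wrap as well.

```python
import numpy as np

def neighbours(grid, i, j):
    """Return the 8 neighbours of grid[i, j], wrapping around every edge."""
    rows = np.array([-1, -1, -1, 0, 0, 1, 1, 1])
    cols = np.array([-1, 0, 1, -1, 1, -1, 0, 1])
    # The modulo keeps indices in range for the last row and column too
    return grid[(rows + i) % grid.shape[0], (cols + j) % grid.shape[1]]

def nodehappiness(grid, i, j, drempel=3):
    """True when at least `drempel` neighbours equal the central node."""
    return bool(np.sum(neighbours(grid, i, j) == grid[i, j]) >= drempel)

grid = np.array([[ 1, 0, 1],
                 [ 1, 0, 1],
                 [-1, 0, 0]])
print(nodehappiness(grid, 1, 1))  # True
```

If the grid should not wrap, the neighbour lookup would need explicit bounds checks instead of the modulo.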
Try this, where x is the square grid:
y = []
l = len(x)
for i in range(l):
    for j in range(l):
        # skip the central node
        if i == l // 2 and j == l // 2:
            continue
        y.append(x[j, i])
Are you looking for something like this?
def neighbour(grid, i, j):
    # take the 3x3 block around (i, j) and drop its centre element
    return np.delete((grid[i-1:i+2, j-1:j+2]).reshape(1, 9), 4)
# Test code
grid = np.arange(16).reshape(4, 4)
b = neighbour(grid, 2, 2)
Some hackery using ndimage.generic_filter:
import numpy as np
from scipy import ndimage
def get_neighbors(arr):
    output = []
    def f(x):
        # x holds the neighbours selected by the footprint
        output.append(x)
        return 0
    # index of the central element of the footprint
    t = tuple(int((x - 1) / 2) for x in arr.shape)
    footprint = np.ones_like(arr)
    footprint[t] = 0
    ndimage.generic_filter(arr, f, footprint=footprint, mode='wrap')
    return np.array(output)
arr = np.arange(9).reshape(3, 3)
neighbors = get_neighbors(arr)
neighbors_grid = neighbors.reshape(*arr.shape, -1)
print(neighbors)
print(neighbors_grid)
Which prints:
# neighbors
[[8. 6. 7. 2. 1. 5. 3. 4.]
[6. 7. 8. 0. 2. 3. 4. 5.]
[7. 8. 6. 1. 0. 4. 5. 3.]
[2. 0. 1. 5. 4. 8. 6. 7.]
[0. 1. 2. 3. 5. 6. 7. 8.]
[1. 2. 0. 4. 3. 7. 8. 6.]
[5. 3. 4. 8. 7. 2. 0. 1.]
[3. 4. 5. 6. 8. 0. 1. 2.]
[4. 5. 3. 7. 6. 1. 2. 0.]]
# neighbors_grid
[[[8. 6. 7. 2. 1. 5. 3. 4.]
[6. 7. 8. 0. 2. 3. 4. 5.]
[7. 8. 6. 1. 0. 4. 5. 3.]]
[[2. 0. 1. 5. 4. 8. 6. 7.]
[0. 1. 2. 3. 5. 6. 7. 8.]
[1. 2. 0. 4. 3. 7. 8. 6.]]
[[5. 3. 4. 8. 7. 2. 0. 1.]
[3. 4. 5. 6. 8. 0. 1. 2.]
[4. 5. 3. 7. 6. 1. 2. 0.]]]
If you merely want the padded array:
padded = np.pad(arr, pad_width=1, mode='wrap')
print(padded)
Which of course gives:
[[8 6 7 8 6]
[2 0 1 2 0]
[5 3 4 5 3]
[8 6 7 8 6]
[2 0 1 2 0]]
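With the same wrap-around convention, the happiness of every node can also be computed at once. This is a sketch using np.roll rather than the filter or padding shown above; the function name `happy_nodes` is hypothetical, and `drempel` is the threshold name from the question.

```python
import numpy as np

def happy_nodes(grid, drempel=3):
    """Boolean array: True where at least `drempel` of the 8 wrapped neighbours match."""
    same = np.zeros(grid.shape, dtype=int)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue  # skip the node itself
            # np.roll shifts the grid with wrap-around, lining each neighbour up with its node
            same += (np.roll(grid, (di, dj), axis=(0, 1)) == grid)
    return same >= drempel

grid = np.array([[ 1, 0, 1],
                 [ 1, 0, 1],
                 [-1, 0, 0]])
happy = happy_nodes(grid)
print(happy[1, 1])  # True
```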

How to convert a dendrogram to a tree object in python?

I'm trying to use the scipy.cluster.hierarchy module to hierarchically cluster some text. I've done the following:
l = linkage(model.wv.syn0, method='complete', metric='cosine')
den = dendrogram(
    l,
    leaf_rotation=0.,
    leaf_font_size=16.,
    orientation='left',
    leaf_label_func=lambda v: str(model.wv.index2word[v])
)
The dendrogram function returns a dict containing a representation of the tree where:
den['ivl'] is a list of labels corresponding to the leaves:
['politics', 'protest', 'characterfirstvo', 'machine', 'writing', 'learning', 'healthcare', 'climate', 'of', 'rights', 'activism', 'resistance', 'apk', 'week', 'challenge', 'water', 'obamacare', 'colorado', 'change', 'voiceovers', '52', 'acting', 'android']
den['leaves'] is a list of the position of each leaf in the left-to-right traversal of the leaves:[0, 18, 5, 6, 2, 7, 12, 16, 21, 20, 22, 3, 10, 14, 15, 19, 11, 1, 17, 4, 13, 8, 9]
I know that scipy's to_tree() method converts a hierarchical clustering represented by a linkage matrix into a tree object by returning a reference to the root node (a ClusterNode object) - but I'm not sure how this root node corresponds to my leaves/labels. For example, the ids returned by the get_id() method in this case are root = 44, left = 41, right = 43:
rootnode, nodelist = to_tree(l, rd=True)
rootID = rootnode.get_id()
leftID = rootnode.get_left().get_id()
rightID = rootnode.get_right().get_id()
My question essentially is, how can I traverse this tree and get the corresponding position in den['leaves'] and label in den['ivl'] for each ClusterNode?
Thank you in advance for any help!
For reference, this is the linkage matrix l:
[[20. 22. 0.72081252 2. ]
[12. 16. 0.78620636 2. ]
[ 3. 10. 0.79635815 2. ]
[ 0. 18. 0.80193474 2. ]
[15. 19. 0.82297097 2. ]
[ 2. 7. 0.84152483 2. ]
[ 1. 17. 0.84453892 2. ]
[ 4. 13. 0.86098654 2. ]
[ 8. 9. 0.88163748 2. ]
[14. 27. 0.91252009 3. ]
[11. 29. 0.92034739 3. ]
[21. 23. 0.92406542 3. ]
[ 5. 6. 0.93213108 2. ]
[25. 32. 0.98555722 5. ]
[26. 35. 0.99214198 4. ]
[30. 31. 1.05624908 4. ]
[24. 34. 1.0606247 5. ]
[28. 39. 1.06322889 7. ]
[37. 40. 1.1455562 11. ]
[33. 38. 1.15171714 7. ]
[36. 42. 1.17330334 12. ]
[41. 43. 1.25056073 23. ]]
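One way to relate the ClusterNode objects to the dendrogram output, sketched on hypothetical toy data rather than the word vectors above: the id of a leaf node is the row index of the original observation, and den['leaves'] lists those same ids in plotting order. A lookup table built from den['leaves'] therefore gives the position, and den['ivl'] at that position gives the label. The names `points`, `labels`, `position_of` and `describe_leaves` are assumptions for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, to_tree

# Hypothetical toy data standing in for model.wv.syn0 and its vocabulary
points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0]])
labels = ['a', 'b', 'c', 'd', 'e']

Z = linkage(points, method='complete')
den = dendrogram(Z, no_plot=True, labels=labels)  # no_plot avoids needing matplotlib
root = to_tree(Z)

# Map each leaf id to its position in den['leaves']
position_of = {leaf_id: pos for pos, leaf_id in enumerate(den['leaves'])}

def describe_leaves(node):
    """Yield (leaf id, position in den['leaves'], label) for every leaf under node."""
    if node.is_leaf():
        pos = position_of[node.get_id()]
        yield node.get_id(), pos, den['ivl'][pos]
    else:
        yield from describe_leaves(node.get_left())
        yield from describe_leaves(node.get_right())

leaf_info = list(describe_leaves(root))
for leaf_id, pos, label in leaf_info:
    print(leaf_id, pos, label)
```

Nodes with an id of len(labels) or higher are internal clusters (id n + i is the cluster formed in row i of the linkage matrix), so they have no entry in den['leaves'] or den['ivl'] of their own.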

Extract non-main diagonal from scipy sparse matrix?

Say that I have a sparse matrix in scipy.sparse format. How can I extract a diagonal other than the main diagonal? For a numpy array, you can use numpy.diag. Is there a scipy sparse equivalent?
For example:
import numpy as np
from scipy import sparse
A = sparse.diags(np.ones(5), 1)
How would I get back the vector of ones without converting to a numpy array?
When the sparse array is in dia format, the data along the diagonals is recorded in the offsets and data attributes:
import scipy.sparse as sparse
import numpy as np
def make_sparse_array():
    A = np.arange(ncol*nrow).reshape(nrow, ncol)
    row, col = zip(*np.ndindex(nrow, ncol))
    val = A.ravel()
    A = sparse.coo_matrix(
        (val, (row, col)), shape=(nrow, ncol), dtype='float')
    A = A.todia()
    # A = sparse.diags(np.ones(5), 1)
    # A = sparse.diags([np.ones(4),np.ones(3)*2,], [2,3])
    print(A.toarray())
    return A
nrow, ncol = 10, 5
A = make_sparse_array()
# Trim each stored row of A.data down to the entries that lie inside the matrix
diags = {offset:(diag[offset:nrow+offset] if 0<=offset<=ncol else
                 diag if offset+nrow-ncol>=0 else
                 diag[:offset+nrow-ncol])
         for offset, diag in zip(A.offsets, A.data)}
for offset, diag in sorted(diags.items()):  # items() replaces Python 2's iteritems()
    print('{o}: {d}'.format(o=offset, d=diag))
Thus for the array
[[ 0. 1. 2. 3. 4.]
[ 5. 6. 7. 8. 9.]
[ 10. 11. 12. 13. 14.]
[ 15. 16. 17. 18. 19.]
[ 20. 21. 22. 23. 24.]
[ 25. 26. 27. 28. 29.]
[ 30. 31. 32. 33. 34.]
[ 35. 36. 37. 38. 39.]
[ 40. 41. 42. 43. 44.]
[ 45. 46. 47. 48. 49.]]
the code above yields
-9: [ 45.]
-8: [ 40. 46.]
-7: [ 35. 41. 47.]
-6: [ 30. 36. 42. 48.]
-5: [ 25. 31. 37. 43. 49.]
-4: [ 20. 26. 32. 38. 44.]
-3: [ 15. 21. 27. 33. 39.]
-2: [ 10. 16. 22. 28. 34.]
-1: [ 5. 11. 17. 23. 29.]
0: [ 0. 6. 12. 18. 24.]
1: [ 1. 7. 13. 19.]
2: [ 2. 8. 14.]
3: [ 3. 9.]
4: [ 4.]
The output above is printing the offset followed by the diagonal at that offset.
The code above should work for any sparse array. I used a fully populated sparse array only to make it easier to check that the output is correct.
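For the single diagonal asked about in the question, recent SciPy versions (1.0 and later, as far as I know) also provide a diagonal method on sparse matrices that takes the offset k directly. A short sketch using the matrix from the question (the name `first_super` is only for illustration):

```python
import numpy as np
from scipy import sparse

# 6x6 matrix with ones on the first superdiagonal, as in the question
A = sparse.diags(np.ones(5), 1)

# k=1 selects the first diagonal above the main one; no dense conversion of A is needed
first_super = A.diagonal(k=1)
print(first_super)  # [1. 1. 1. 1. 1.]
```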

How can I convert an ndarray to a matrix in scipy?

How can I convert an ndarray to a matrix in numpy? I'm trying to import data from a csv and turn it into a matrix.
from numpy import array, matrix, recfromcsv
my_vars = ['docid','coderid','answer1','answer2']
toy_data = matrix( array( recfromcsv('toy_data.csv', names=True)[my_vars] ) )
print toy_data
print toy_data.shape
But I get this:
[[(1, 1, 3, 3) (1, 2, 4, 1) (1, 3, 7, 2) (2, 1, 3, 3) (2, 2, 4, 4)
(2, 4, 3, 1) (3, 1, 3, 3) (3, 2, 4, 3) (3, 3, 3, 4) (4, 4, 5, 1)
(4, 5, 6, 2) (4, 2, 4, 3) (5, 2, 5, 4) (5, 3, 3, 1) (5, 4, 7, 2)
(6, 1, 3, 3) (6, 5, 4, 1) (6, 2, 5, 2)]]
(1, 18)
What do I have to do to get a 4 by 18 matrix out of this code? There's got to be an easy answer to this question, but I just can't find it.
If the ultimate goal is to make a matrix, there's no need to create a recarray with named columns. You could use np.loadtxt to load the csv into an ndarray, then use np.asmatrix to convert it to a matrix:
import numpy as np
toy_data = np.asmatrix(np.loadtxt('toy_data.csv', delimiter=',', skiprows=1))
print(toy_data)
print(toy_data.shape)
yields
[[ 1. 1. 3. 3.]
[ 1. 2. 4. 1.]
[ 1. 3. 7. 2.]
[ 2. 1. 3. 3.]
[ 2. 2. 4. 4.]
[ 2. 4. 3. 1.]
[ 3. 1. 3. 3.]
[ 3. 2. 4. 3.]
[ 3. 3. 3. 4.]
[ 4. 4. 5. 1.]
[ 4. 5. 6. 2.]
[ 4. 2. 4. 3.]
[ 5. 2. 5. 4.]
[ 5. 3. 3. 1.]
[ 5. 4. 7. 2.]
[ 6. 1. 3. 3.]
[ 6. 5. 4. 1.]
[ 6. 2. 5. 2.]]
(18, 4)
Note: the skiprows argument is used to skip over the header in the csv.
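To try this without the file, the same call can be run on an in-memory buffer. This sketch assumes the csv has the header from the question followed by comma-separated rows; `csv_text` is a hypothetical stand-in holding the first three records.

```python
import io
import numpy as np

# Stand-in for toy_data.csv: header line plus the first three records
csv_text = "docid,coderid,answer1,answer2\n1,1,3,3\n1,2,4,1\n1,3,7,2\n"

# loadtxt accepts file-like objects as well as file names
data = np.loadtxt(io.StringIO(csv_text), delimiter=',', skiprows=1)
print(data.shape)  # (3, 4)
```

Each csv record becomes one row, which is why the full file gives a shape of (18, 4).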
You can just read all your values into a vector, then reshape it.
import numpy
def _ReadCSV(fileobj):
    for line in fileobj:
        for el in line.split(","):
            yield float(el)
with open("toy_data.csv") as fo:
    # A list (not a lazy map object) so that len(header) works in Python 3
    header = [name.strip() for name in fo.readline().split(",")]
    a = numpy.fromiter(_ReadCSV(fo), numpy.float64)
a.shape = (-1, len(header))
But there may be an even more direct way with newer numpy.
