Cannot create non-empty csr_matrix() in scipy - python

I need a sparse matrix to solve my problem, and according to the description of csr_matrix() in the scipy docs (http://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html#scipy.sparse.csr_matrix) it fits my use case perfectly.
However, I cannot even initialize it.
When I use the empty-matrix example from the docs (http://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.sparse.csr_matrix.html), it works fine, exactly as documented:
>>> import numpy as np
>>> from scipy.sparse import csr_matrix
>>> csr_matrix((3, 4), dtype=np.int8).toarray()
array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]], dtype=int8)
but when I use the example of a non-empty matrix, or try to fill it with my own data,
>>> row = np.array([0, 0, 1, 2, 2, 2])
>>> col = np.array([0, 2, 2, 0, 1, 2])
>>> data = np.array([1, 2, 3, 4, 5, 6])
>>> csr_matrix((data, (row, col)), shape=(3, 3)).toarray()
I always get this message:
/Library/Python/2.7/site-packages/numpy-1.9.2-py2.7-macosx-10.10-intel.egg/numpy/core/fromnumeric.py:2507: VisibleDeprecationWarning:
`rank` is deprecated; use the `ndim` attribute or function instead.
To find the rank of a matrix see `numpy.linalg.matrix_rank`.
  VisibleDeprecationWarning)
What does it mean? I am completely stuck. Excuse me for the question; I'm new to scipy and need help.

It is only a warning; I expect your matrix was still created.
Scipy is calling a deprecated numpy function (`rank`). This was fixed in scipy in April 2014.
The scipy change is at:
https://github.com/scipy/scipy/commit/fa1782e04fdab91f672ccf7a4ebfb887de50f01c
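A quick way to confirm this is to build the matrix from the question and compare it with the dense result you expect. This is only a sketch that assumes a working numpy/scipy install; the name `expected` is introduced here for illustration and was written out by hand from `row`, `col`, and `data`.

```python
import numpy as np
from scipy.sparse import csr_matrix

row = np.array([0, 0, 1, 2, 2, 2])
col = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])

# Entry data[i] is placed at position (row[i], col[i]).
dense = csr_matrix((data, (row, col)), shape=(3, 3)).toarray()

# Hypothetical reference array, derived by hand from row/col/data.
expected = np.array([[1, 0, 2],
                     [0, 0, 3],
                     [4, 5, 6]])

# True means the matrix was built correctly, warning or not.
print(np.array_equal(dense, expected))
```

If the comparison holds, the warning can be ignored until scipy is upgraded.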

Related

How does scipy.ndimage.filters.convolve work when the mode is reflect

I am trying to figure out how to do this with numpy, so I can then convert it to c++ from scratch. I have figured out how to do it when the mode is constant. The way that is done is shown below.
import numpy as np
from scipy import signal
a = np.array([[1, 2, 0, 0], [5, 3, 0, 4], [0, 0, 0, 7], [9, 3, 0, 0]])
k = np.array([[1,0,0],[0,1,0],[0,0,0]])
a = np.pad(a, 1)
k = np.flip(k)
output = signal.convolve(a, k, 'valid')
This comes out to the same output as scipy.ndimage.filters.convolve(a, k, mode='constant'), so I thought that mode='reflect' would work the same way, except that the line a = np.pad(a, 1) would change to a = np.pad(a, 1, mode='reflect'). However, that does not seem to be the case. Could someone explain how to do it from scratch using numpy and scipy.signal.convolve? Thank you.
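One likely cause is a naming mismatch: the boundary handling that scipy.ndimage calls 'reflect' repeats the edge sample (d c b a | a b c d | d c b a), which is the layout np.pad calls 'symmetric', while np.pad's 'reflect' corresponds to ndimage's 'mirror'. The sketch below checks this on the arrays from the question. Two assumptions to flag: the kernel is passed to signal.convolve unflipped, on the understanding that both routines compute a true convolution for odd-sized kernels, and the names `padded`, `manual`, and `reference` are introduced here for illustration.

```python
import numpy as np
from scipy import ndimage, signal

a = np.array([[1, 2, 0, 0], [5, 3, 0, 4], [0, 0, 0, 7], [9, 3, 0, 0]])
k = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 0]])

# ndimage 'reflect' duplicates the border sample, which np.pad calls 'symmetric'.
padded = np.pad(a, 1, mode='symmetric')

# Kernel is NOT flipped here (assumption stated above).
# method='direct' keeps integer arithmetic exact by avoiding the FFT path.
manual = signal.convolve(padded, k, mode='valid', method='direct')

reference = ndimage.convolve(a, k, mode='reflect')

print(np.array_equal(manual, reference))
```

If the comparison fails on your install, the flip convention is the first thing to re-check.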

How do I find the row echelon form (REF)

import numpy as np
import sympy as sp
Vec = np.matrix([[1,1,1,5],[1,2,0,3],[2,1,3,12]])
Vec_rref = sp.Matrix(Vec).rref()
print(Vec_rref) ##<-- this code prints the RREF, but i am looking for the code for REF (See below)
I have found plenty of code that computes the RREF, but none for the REF, if that makes sense. The code I have written gives the following:
(Matrix([
[1, 0, 2, 7],
[0, 1, -1, -2],
[0, 0, 0, 0]]), (0, 1))
I am looking for code that produces the following:
      1XXX
REF = 01XX
      001X
and not
       100X
RREF = 010X
       001X
New here, so bear with me guys. Thanks in advance :-)
You are using sympy's rref function, which computes the "reduced row-echelon form". You might want to use .echelon_form() instead:
import numpy as np
import sympy as sp
Vec = np.matrix([[1,1,1,5],
                 [1,2,0,3],
                 [2,1,3,12]])
Vec_ref = sp.Matrix(Vec).echelon_form()
print(Vec_ref)
which outputs:
Matrix([[1, 1, 1, 5], [0, 1, -1, -2], [0, 0, 0, 0]])
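For readers who want to see the elimination itself, here is a minimal hand-rolled sketch of forward elimination that scales every pivot to 1. It is not sympy's implementation; the function name `row_echelon` is hypothetical, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def row_echelon(rows):
    """Return a row echelon form of `rows` with leading entries scaled to 1."""
    m = [[Fraction(x) for x in r] for r in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        src = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if src is None:
            continue  # no pivot in this column
        m[pivot_row], m[src] = m[src], m[pivot_row]
        # Scale the pivot row so its leading entry becomes 1.
        lead = m[pivot_row][col]
        m[pivot_row] = [x / lead for x in m[pivot_row]]
        # Eliminate the entries below the pivot only (REF, not RREF).
        for r in range(pivot_row + 1, n_rows):
            factor = m[r][col]
            m[r] = [x - factor * p for x, p in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m


ref = row_echelon([[1, 1, 1, 5], [1, 2, 0, 3], [2, 1, 3, 12]])
for r in ref:
    print([str(x) for x in r])
```

Because entries above each pivot are left alone, the result stays in REF rather than being reduced further to RREF.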

Optimizing execution time for mapping array to value with dictionary and numpy

I am trying to implement a simple mapping from the rows of a 2-D numpy array to a set of values.
For each row in the array I need to choose the value that corresponds to it in the set and add it to an array.
For example:
[0, 1, 0, 0] -> 3
...
[1, 0, 1, 0] -> 2
However, my first implementation made me wonder whether I was doing something wrong or inefficient given the size of my dataset, so I wrote this workaround, which avoids explicit for loops and tries to speed up execution with a dictionary lookup.
import numpy as np
# function to perform the search and return the index accordingly (it is supposed to be fast because of data structure)
def get_val(n):
    map_list = {0: [0, 1, 0], 1: [0, 1, 0], 2: [1, 0, 0], 3: [0, 0, 1]}
    map_vals = list(map_list.values())
    index = map_vals.index(list(n))
    return(index)
# set of arbitrary arrays
li = np.array([[0, 1, 0], [0, 0, 1]])
# here is the performance improvement attempt with the help of the function above
arr = [get_val(n) for n in li]
print(arr)
I'm not completely sure if this is the correct way to do it for getting the needed value for a set like this. If there is a better way, please let me know.
Otherwise, I refer to my main question:
what is the best way possible to optimize the code?
Thanks so much for your help.
You can try using matrix multiplication (dot product):
a = np.array([[0, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]])  # dict values
c = np.array([0, 1, 2, 3])  # dict keys
li = np.array([[0, 1, 0], [0, 0, 1]])
b = np.linalg.pinv(a) @ c  # decoding table
result = li @ b
print(result)
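Another option, sketched here under the assumption that every row contains only 0s and 1s, is to read each row as a base-2 number and use it as an index into a precomputed lookup table. The names `patterns`, `keys`, `weights`, and `table` are introduced for illustration, and -1 is an arbitrary marker for rows that match no pattern.

```python
import numpy as np

patterns = np.array([[0, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]])  # dict values
keys = np.array([0, 1, 2, 3])                                      # dict keys
li = np.array([[0, 1, 0], [0, 0, 1]])

# Read each binary row as a base-2 number: [b2, b1, b0] -> 4*b2 + 2*b1 + b0.
weights = 2 ** np.arange(patterns.shape[1])[::-1]
codes = patterns @ weights

# Lookup table indexed by the integer code; -1 marks unknown patterns.
table = np.full(2 ** patterns.shape[1], -1)
table[codes] = keys

# One vectorized lookup for all rows, with no Python-level loop.
result = table[li @ weights]
print(result)
```

The table has 2**n entries for rows of length n, so this only suits short rows; for long rows a dict keyed by tuple(row) would be the more practical variant.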

Numpy reshape - automatic filling or removal

I would like to find a reshape function that can transform my arrays of different dimensions into arrays of the same dimension. Let me explain:
import numpy as np
a = np.array([[[1,2,3,3],[1,2,3,3]],[[1,2,3,3],[1,2,3,3]]])
b = np.array([[[1,2,3,3],[1,2,3,3]],[[1,2,3,3],[1,2,3,3]],[[1,2,3,3],[1,2,3,4]]])
c = np.array([[[1,2,3,3],[1,2,3,3]]])
I would like to make the shapes of b and c equal to the shape of a. However, np.reshape throws an error because, as explained here (Numpy resize or Numpy reshape), the function is explicitly made to handle arrays with the same number of elements.
I would like some version of that function that adds zeros at the start of the first dimension if the shape is smaller, or removes the start if the shape is bigger. For my example the result would look like this:
b = np.array([[[1,2,3,3],[1,2,3,3]],[[1,2,3,3],[1,2,3,4]]])
c = np.array([[[0,0,0,0],[0,0,0,0]],[[1,2,3,3],[1,2,3,3]]])
Do I need to write my own function to do that?
This is similar to the other solution, but it also works if the lower dimensions don't match:
def custom_reshape(a, b):
    result = np.zeros_like(a).ravel()
    result[-min(a.size, b.size):] = b.ravel()[-min(a.size, b.size):]
    return result.reshape(a.shape)
custom_reshape(a,b)
I would write a function like this:
def align(a,b):
    out = np.zeros_like(a)
    x = min(a.shape[0], b.shape[0])
    out[-x:] = b[-x:]
    return out
Output:
align(a,b)
# array([[[1, 2, 3, 3],
#         [1, 2, 3, 3]],
#        [[1, 2, 3, 3],
#         [1, 2, 3, 4]]])
align(a,c)
# array([[[0, 0, 0, 0],
#         [0, 0, 0, 0]],
#        [[1, 2, 3, 3],
#         [1, 2, 3, 3]]])
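The difference between the two helpers shows up when the trailing dimensions differ. The sketch below repeats both functions so that it runs on its own and feeds them a hypothetical array `d` whose trailing shape does not match the target; `d` and the flag `align_ok` are introduced here for illustration only.

```python
import numpy as np


def custom_reshape(a, b):
    # Works on the flattened data, so only the element count matters.
    n = min(a.size, b.size)
    result = np.zeros_like(a).ravel()
    result[-n:] = b.ravel()[-n:]
    return result.reshape(a.shape)


def align(a, b):
    # Pads or trims along the first axis only.
    out = np.zeros_like(a)
    x = min(a.shape[0], b.shape[0])
    out[-x:] = b[-x:]
    return out


a = np.zeros((2, 2, 4), dtype=int)       # target shape
d = np.arange(1, 13).reshape(3, 2, 2)    # hypothetical input, trailing dims differ

flat = custom_reshape(a, d)              # keeps the last 12 values, zero-fills the front

try:
    align(a, d)
    align_ok = True
except ValueError:
    # (2, 2, 2) cannot be broadcast into a (2, 2, 4) slice.
    align_ok = False

print(flat.shape, align_ok)
```

Note that the flattened approach keeps the values but not their original positions, so it is only appropriate when the element order, rather than the grid layout, is what matters.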

Scikit image: proper way of counting cells in the objects of an image

Say you have an image in the form of a numpy.array:
vals=numpy.array([[3,24,25,6,2],[8,7,6,3,2],[1,4,23,23,1],[45,4,6,7,8],[17,11,2,86,84]])
And you want to compute how many cells are inside each object, given a threshold value of 17 (example):
from scipy import ndimage
from skimage.measure import regionprops
blobs = numpy.where(vals>17, 1, 0)
labels, no_objects = ndimage.label(blobs)
props = regionprops(blobs)
If you check, this gives an image with 4 distinct objects over the threshold:
In[1]: blobs
Out[1]:
array([[0, 1, 1, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 1, 1, 0],
[1, 0, 0, 0, 0],
[0, 0, 0, 1, 1]])
In fact:
In[2]: no_objects
Out[2]: 4
I want to compute the number of cells (or area) of each object. The intended outcome is a dictionary with the object ID: number of cells format:
size={0:2,1:2,2:1,3:2}
My attempt:
size={}
for label in props:
    size[label]=props[label].area
Returns an error:
Traceback (most recent call last):
  File "<ipython-input-76-e7744547aa17>", line 3, in <module>
    size[label]=props[label].area
TypeError: list indices must be integers, not _RegionProperties
I understand I am using label incorrectly, but the intent is to iterate over the objects. How to do this?
A bit of testing and research sometimes goes a long way.
The problem is both with blobs, because it is not carrying the different labels but only 0,1 values, and label, which needs to be replaced by an iterator looping over range(0,no_objects).
This solution seems to be working:
import skimage.measure as measure
import numpy
from scipy import ndimage
from skimage.measure import regionprops
vals=numpy.array([[3,24,25,6,2],[8,7,6,3,2],[1,4,23,23,1],[45,4,6,7,8],[17,11,2,86,84]])
blobs = numpy.where(vals>17, 1, 0)
labels, no_objects = ndimage.label(blobs)
#blobs is not in an amicable type to be processed right now, so:
labelled=ndimage.label(blobs)
resh_labelled=labelled[0].reshape((vals.shape[0],vals.shape[1])) #labelled is a tuple: only the first element matters
#here come the props
props=measure.regionprops(resh_labelled)
#here come the sought-after areas
size={i:props[i].area for i in range (0, no_objects)}
Result:
In[1]: size
Out[1]: {0: 2, 1: 2, 2: 1, 3: 2}
And if anyone wants to check for the labels:
In[2]: labels
Out[2]:
array([[0, 1, 1, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 2, 2, 0],
[3, 0, 0, 0, 0],
[0, 0, 0, 4, 4]])
And if anyone wants to plot the 4 objects found:
import matplotlib.pyplot as plt
plt.set_cmap('OrRd')
plt.imshow(labels,origin='upper')
To answer the original question:
You have to apply regionprops to the labeled image: props = regionprops(labels)
You can then construct the dictionary using:
size = {r.label: r.area for r in props}
which yields
{1: 2, 2: 2, 3: 1, 4: 2}
regionprops generates a lot more information than just the area of each blob. So, if you only need the pixel count of each blob, an alternative with a focus on performance is to use np.bincount on the labels obtained from ndimage.label, like so -
np.bincount(labels.ravel())[1:]
Thus, for the given sample -
In [53]: labeled_areas = np.bincount(labels.ravel())[1:]
In [54]: labeled_areas
Out[54]: array([2, 2, 1, 2])
To have these results in a dictionary, one additional step would be -
In [55]: dict(zip(range(no_objects), labeled_areas))
Out[55]: {0: 2, 1: 2, 2: 1, 3: 2}
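The dictionary above is keyed from 0, while ndimage.label numbers the objects from 1 (0 is the background). If keys that match the label image are preferred, a self-contained sketch could look like this; the name `counts` is introduced here for illustration, and the default 4-connectivity of ndimage.label is assumed.

```python
import numpy as np
from scipy import ndimage

vals = np.array([[3, 24, 25, 6, 2],
                 [8, 7, 6, 3, 2],
                 [1, 4, 23, 23, 1],
                 [45, 4, 6, 7, 8],
                 [17, 11, 2, 86, 84]])

# Label connected regions above the threshold; label 0 is the background.
labels, no_objects = ndimage.label(vals > 17)

# counts[i] is the number of cells carrying label i (index 0 = background).
counts = np.bincount(labels.ravel())

# Key the dictionary by the actual label values 1..no_objects.
size = {label: int(counts[label]) for label in range(1, no_objects + 1)}
print(size)
```

Keeping the keys aligned with the label image makes it straightforward to go back from a dictionary entry to the cells of that object with labels == key.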
