I want to iteratively fill a scipy sparse coo_matrix
# Constructing an empty matrix
import numpy as np
from scipy.sparse import coo_matrix
m = coo_matrix((3, 4), dtype=np.int8)
by a for loop and some rules, e.g. all ones (I know there's a constructor with data but I can't use it since the rule is more complex). How can I do that? I haven't found any documentation about it.
You cannot fill a scipy.sparse.coo_matrix directly in a for loop, because the COO format does not support item assignment.
Your best option is to convert the coo_matrix to a dok_matrix that can be indexed like a dictionary, and later revert it back to coo_matrix if needed.
import numpy as np
from scipy.sparse import coo_matrix
m = coo_matrix((3, 4), dtype=np.int8)
m = m.todok() # convert to dok
for i in range(m.shape[0]):
    for j in range(m.shape[1]):
        m[i, j] = i + j # update the matrix, note the particular access format
m = m.tocoo() # convert back to coo
This is the fastest way, as mentioned in this answer.
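If the round trip through dok_matrix is not wanted, another common pattern is to collect the coordinates and values in plain Python lists inside the loop and call the coo_matrix constructor once at the end. A minimal sketch, assuming the rule can be evaluated one entry at a time; the `rule` function below is a hypothetical stand-in for the more complex rule in the question:

```python
import numpy as np
from scipy.sparse import coo_matrix

def rule(i, j):
    # Hypothetical rule: store i + j wherever the sum is even, else nothing.
    value = i + j
    return value if value % 2 == 0 else 0

n_rows, n_cols = 3, 4
rows, cols, vals = [], [], []
for i in range(n_rows):
    for j in range(n_cols):
        v = rule(i, j)
        if v != 0:  # only keep non-zero entries
            rows.append(i)
            cols.append(j)
            vals.append(v)

# Build the matrix once from the collected (value, (row, col)) triplets.
m = coo_matrix((vals, (rows, cols)), shape=(n_rows, n_cols), dtype=np.int8)
dense = m.toarray()
```

Only the non-zero entries are ever stored, so the lists grow with the number of stored values rather than with the full matrix size.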
How would I efficiently initialise a fixed-size pyarrow.ListArray from a suitably prepared numpy array?
The documentation of pyarrow.array indicates that a nested iterable input structure works, but in practice that does not work if the outer iterable is a numpy array:
import numpy as np
import pyarrow as pa
n = 1000
w = 3
data = np.arange(n*w,dtype="i2").reshape(-1,w)
# this works:
pa.array(list(data),pa.list_(pa.int16(),w))
# this fails:
pa.array(data,pa.list_(pa.int16(),w))
# -> ArrowInvalid: only handle 1-dimensional arrays
It seems ridiculous to split an input array that directly matches the Arrow specification into n separate arrays and then re-assemble it from there.
pyarrow.ListArray.from_arrays seems to require an offsets argument, which only has a meaning for variable-size lists.
I believe you are looking for pyarrow.FixedSizeListArray.from_arrays which, regrettably, appears undocumented (I went ahead and filed a JIRA ticket).
You'll want to flatten your numpy array into a contiguous 1D array first.
import numpy as np
import pyarrow as pa
length = 10  # named "length" to avoid shadowing the built-in len
width = 3
# The initial reshape could be skipped; it is kept to simulate real 2D data
arr = np.arange(length*width,dtype="i2").reshape(-1,width)
arr.shape = -1
pa.FixedSizeListArray.from_arrays(arr, width)
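Why the flattening step is needed can be illustrated with NumPy alone: a fixed-size list array keeps all values in one flat child buffer, with row i occupying positions i*width up to (i+1)*width. A small sketch of that layout, assuming a C-contiguous input (np.ascontiguousarray is used here only as a guard for inputs that are not, such as transposed views):

```python
import numpy as np

width = 3
arr = np.arange(4 * width, dtype="i2").reshape(-1, width)

# reshape(-1) on a C-contiguous array returns a view, so no data is copied.
flat = np.ascontiguousarray(arr).reshape(-1)
shares_memory = np.shares_memory(arr, flat)

# Row i of the original array is the i-th chunk of `width` values.
row_2 = flat[2 * width:3 * width]
```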
I have a 3D numpy array that is a stack of 2D (m,n) images at certain timestamps, t. So my array is of shape (t, m, n). I want to plot the value of one of the pixels as a function of time.
e.g.:
import numpy as np
import matplotlib.pyplot as plt
data_cube = []
for i in range(10):
    a = np.random.rand(100,100)
    data_cube.append(a)
So my (t, m, n) now has shape (10,100,100). Say I wanted a 1D plot of the value at index [12][12] at each of the 10 steps. I would do:
plt.plot(data_cube[:][12][12])
plt.show()
But I'm getting index out of range errors. I thought I might have my indices mixed up, but every plot I generate seems to be in the 'wrong' axis, i.e. across one of the 2D arrays, but instead I want it 'through' the vertical stack. Thanks in advance!
Here is the solution: since you are already using numpy, convert your final list to an array and just use slicing. The problem in your case was two-fold:
First: Your final data_cube was not an array. For a list, you will have to iterate over the values
Second: Slicing was incorrect.
import numpy as np
import matplotlib.pyplot as plt
data_cube = []
for i in range(10):
    a = np.random.rand(100,100)
    data_cube.append(a)
data_cube = np.array(data_cube) # Added this step
plt.plot(data_cube[:,12,12]) # Modified the slicing
A less verbose version that avoids iteration:
data_cube = np.random.rand(10, 100,100)
plt.plot(data_cube[:,12,12])
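To see why the original indexing failed: on a Python list, data_cube[:] is just a shallow copy of the whole list, so data_cube[:][12] asks for the 13th image of a 10-image stack. A short sketch contrasting the two, using a smaller, deterministic stack than in the question (the sizes and fill values are illustrative):

```python
import numpy as np

# Stack of 10 "images", each 20x20; image t is filled with the value t.
frames = [np.full((20, 20), t) for t in range(10)]

try:
    frames[:][12][12]  # frames[:] is still a list of only 10 items
    list_error = None
except IndexError as exc:
    list_error = exc

cube = np.array(frames)          # shape (10, 20, 20)
pixel_series = cube[:, 12, 12]   # pixel (12, 12) at every time step
```

With a single bracket and comma-separated indices, NumPy applies each index to its own axis, which is what gives the slice "through" the stack.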
In my program I currently create a numpy array full of zeros and then loop through each element, replacing it with the desired value. Is there a more efficient way of doing this?
Below is an example of what I am doing. However, instead of an int, I have a list for each row which needs to be put into the numpy array. Is there a way to replace whole rows, and is that more efficient?
import numpy as np
from tifffile import imsave
image = np.zeros((5, 2160, 2560), 'uint16')
num =0
for pixel in np.nditer(image, op_flags=['readwrite']):
    pixel = num
    num += 1
imsave('multipage.tif', image)
Just assign to the whole row using slicing
import numpy as np
from tifffile import imsave
list_of_rows = ... # all items in list should have same length
image = np.zeros((len(list_of_rows), len(list_of_rows[0])), 'uint16')
for row_idx, row in enumerate(list_of_rows):
    image[row_idx, :] = row
imsave('multipage.tif', image)
Numpy slicing is extremely powerful and nice. I recommend reading through this documentation to get a feeling of what is possible.
You could simply generate a vector of length 5*2160*2560 and apply this to image.
image=np.arange(5*2160*2560)
image.shape=5,2160,-1
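If all rows are already available as equal-length sequences, the loop can be dropped altogether by handing the nested list to np.array. A minimal sketch with a small made-up list_of_rows (the values and the uint16 dtype are assumptions chosen to match the question); the arange variant is shown with an explicit dtype so that the result is not a default 64-bit integer array:

```python
import numpy as np

# Hypothetical input: three rows of equal length.
list_of_rows = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]

# One call converts all rows at once.
image = np.array(list_of_rows, dtype=np.uint16)

# Counting pattern from the question, on a small (2, 3, 4) volume.
counter = np.arange(2 * 3 * 4, dtype=np.uint16).reshape(2, 3, 4)
```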
I am using np.fromfunction to create an array of a specific size based on a function. It looks like this:
import numpy as np
test = [[1,0],[0,2]]
f = lambda i, j: sum(test[i])
matrix = np.fromfunction(f, (len(test), len(test)), dtype=int)
However, I receive the following error:
TypeError: only integer arrays with one element can be converted to an index
The function needs to handle numpy arrays. An easy way to get this working is:
import numpy as np
test = [[1,0],[0,2]]
f = lambda i, j: sum(test[i])
matrix = np.fromfunction(np.vectorize(f), (len(test), len(test)), dtype=int)
np.vectorize returns a vectorized version of f, which will handle the arrays correctly.
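np.vectorize is essentially a Python-level loop, so for larger inputs it may be worth writing the function so that it accepts the index arrays directly. A sketch under the assumption that test can be converted to a rectangular NumPy array; row_sums and f_arrays are names introduced here for illustration:

```python
import numpy as np

test = [[1, 0], [0, 2]]
row_sums = np.array(test).sum(axis=1)  # one sum per row of test

def f_arrays(i, j):
    # i and j arrive as integer index arrays; fancy indexing does the lookup.
    return row_sums[i]

matrix = np.fromfunction(f_arrays, (len(test), len(test)), dtype=int)
```

Each entry of matrix holds the sum of the row of test selected by its row index, the same values the vectorized lambda computes.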
I have the following code in Python using Numpy:
p = np.diag(1.0 / np.array(x))
How can I transform it to get the sparse matrix p2 with the same values as p without creating p first?
Use scipy.sparse.spdiags (which does a lot, and so may be confusing at first), scipy.sparse.dia_matrix and/or scipy.sparse.lil_diags, depending on the format you want the sparse matrix in.
E.g. using spdiags:
import numpy as np
import scipy as sp
import scipy.sparse
x = np.arange(10)
# "0" here indicates the main diagonal...
# "y" will be a dia_matrix type of sparse array, by default
y = sp.sparse.spdiags(x, 0, x.size, x.size)
Using the scipy.sparse module,
from scipy import sparse
p = sparse.dia_matrix((1.0 / np.array(x), [0]), shape=(len(x), len(x)))  # (data, offsets) form; offset 0 is the main diagonal
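A quick way to check any of these constructions is to compare against the dense np.diag result on a small input. The sketch below also uses scipy.sparse.diags, a convenience constructor not mentioned above, and assumes x has no zero entries so that 1.0 / x is finite:

```python
import numpy as np
from scipy import sparse

x = np.arange(1, 6)          # 1..5, no zeros
inv = 1.0 / x

p = np.diag(inv)             # dense reference, built for comparison only
p2 = sparse.diags(inv, 0)    # sparse, built without forming p
p3 = sparse.spdiags(inv, 0, x.size, x.size)

same = np.allclose(p2.toarray(), p) and np.allclose(p3.toarray(), p)
```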