I'm trying to handle a memory problem in my application by using memory-mapped arrays. However, as part of my computation I need to set some values in my array to 0. Unfortunately, the boolean mask over the array will require additional memory. Is there a way to do the following such that the mask is handled cleanly?
import numpy
source_array = numpy.memmap(filename, dtype='float32', mode='w+', shape=shape)
#Load data into memory mapped numpy array
band.ReadAsArray(buf_obj = source_array)
# set values >= 255 to 0
numpy.putmask(source_array, source_array >= 255.0, 0.0)
I believe the last line, with source_array >= 255.0, must create a large boolean array in memory, right? Aside from manually looping through each element, is there a memory-efficient mechanism to set all my 255 values in source_array to 0?
Sorry, I realized that memmapping the mask is of course not an optimal solution here. NumPy does not have much built in to help loop through the array in chunks (which would be the cleanest way), though you can of course do that by hand. You might actually have some success with numexpr, which always does its calculations in chunks to speed up NumPy, but I did not try this.
I guess this wasn't quite what you wanted:
You can always use the out parameter of ufuncs and many other functions to ask NumPy to store the result into that array directly (this also generally saves memory). This means that if you create an empty memory-mapped array, you can do this:
# You could use tempfile.NamedTemporaryFile to get a temporary file; I will leave that to you:
import numpy as np

mask = np.memmap(mask_path, shape=source_array.shape, dtype=bool, mode='w+')  # mask_path: path to that temporary file
np.greater_equal(source_array, 255.0, out=mask)
And then use the mask array in putmask. This should solve the problem.
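For completeness, using the memmapped mask then looks like the first line below, and after it is a minimal sketch of the hand-rolled chunking mentioned above (the chunk size is an arbitrary choice of mine):

import numpy as np

# Zero out the flagged values using the on-disk mask:
np.putmask(source_array, mask, 0.0)

# Alternative: process the memmap in fixed-size chunks so any temporary
# boolean mask is only ever chunk-sized.
chunk = 1 << 20  # number of elements per chunk; tune as needed
flat = source_array.reshape(-1)  # a view onto the same memmapped buffer
for start in range(0, flat.size, chunk):
    block = flat[start:start + chunk]
    block[block >= 255.0] = 0.0  # writes through to the file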
I am trying to assign provinces to an area for use in a game mod. I have two separate maps for area and provinces.
provinces file, area file.
Currently I am reading in an image in Python and storing it in an array using PIL like this:
from PIL import Image
import numpy as np

land_prov_pic = Image.open(INPUT_FILES_DIR + land_prov_str)
land_prov_array = np.array(land_prov_pic)
image_size = land_prov_pic.size

for x in range(image_size[0]):
    if x % 100 == 0:
        print(x)
    for y in range(image_size[1]):
        land_prov_array[x][y] = land_prov_pic.getpixel((x,y))
Where you end up with land_prov_array[x][y] = (R,G,B)
However, this gets really slow, especially for large images. I tried reading it in using OpenCV like this:
import cv2
land_prov_array = cv2.imread(INPUT_FILES_DIR + land_prov_str)
land_prov_array = cv2.cvtColor(land_prov_array, cv2.COLOR_BGR2RGB) #Convert from BGR to RGB
But now land_prov_array[x][y] = [R G B] which is an ndarray and can't be inserted into a set. But it's way faster than the previous for loop. How do I convert [R G B] to (R,G,B) for every element in the array without for loops or, better yet, read it in that way?
EDIT: Added pictures, more description, and code blocks for readability.
It is best to convert the [R,G,B] array to tuple when you need it to be a tuple, rather than converting the whole image to this form. An array of tuples takes up a lot more memory, and will be a lot slower to process, than a numeric array.
The answer by isCzech shows how to create a NumPy view over a 3D array that presents the data as if it were a 2D array of tuples. This might not require the additional memory of an actual array of tuples, but it is still a lot slower to process.
Most importantly, most NumPy functions (such as np.mean) and operators (such as +) cannot be applied to such an array. Thus, one is obliged to iterate over the array in Python code (or with an np.vectorize-wrapped function), which is a lot less efficient than using NumPy functions and operators that work on the array as a whole.
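For instance, a minimal sketch of converting only at the point of use (assuming land_prov_array is the array read with cv2.imread above; the pixel coordinates and set name are just illustrative):

province_colours = set()
for x, y in [(0, 0), (10, 20)]:  # hypothetical pixels of interest
    # tuple() is applied only where a hashable value is actually needed
    province_colours.add(tuple(land_prov_array[x, y]))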
For transformation from a 3D array (data3D) to a 2D array (data2D), I've used this approach:
import numpy as np
dt = np.dtype([('x', 'u1'), ('y', 'u1'), ('z', 'u1')])
data2D = data3D.view(dtype=dt).squeeze()
The .view call changes the data type and still returns a 3D array, with a last dimension of size 1, which can then be removed with .squeeze. Alternatively, you can use .squeeze(axis=-1) to squeeze only the last dimension (in case some of your other dimensions are of size 1 too).
Please note I've used uint8 ('u1') - your type may be different.
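A quick usage sketch of that view (the dummy array and values are mine), showing that its elements behave like tuples:

import numpy as np

data3D = np.zeros((2, 2, 3), dtype='u1')  # tiny dummy RGB image
data3D[0, 0] = (10, 20, 30)

dt = np.dtype([('x', 'u1'), ('y', 'u1'), ('z', 'u1')])
data2D = data3D.view(dtype=dt).squeeze()

print(data2D[0, 0])        # prints like a tuple: (10, 20, 30)
rgb = data2D[0, 0].item()  # a plain Python tuple, usable e.g. as a set element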
Trying to do this using a loop is very slow, indeed (compared to this approach at least).
Similar question here: Show a 2d numpy array where contents are tuples as an image
I'm dealing with large dense square matrices of size NxN ~(100k x 100k) that are too large to fit into memory.
After doing some research, I've found that most people handle large matrices by using either numpy's memmap or the pytables package. However, both packages seem to have major limitations: neither appears to support assigning values to list slices of the on-disk matrix along more than one dimension.
I am looking for an efficient way to assign values to a large dense square matrix M with something like:
M[0, [1,2,3], [8,15,30]] = np.zeros((3, 3)) # or
M[0, [1,2,3,1,2,3,1,2,3], [8,8,8,15,15,15,30,30,30]] = 0 # for memmap
With memmap, the expression M[0, [1,2,3], [8,15,30]] would always copy the slice into RAM hence assignment doesn't seem to work.
With pytables, list slicing along more than one dimension is not supported. Currently I'm just slicing along one dimension followed by the other (i.e. M[0, [1,2,3]][:, [8,15,30]]). RAM usage of this solution scales with N, which is better than dealing with the whole array (N^2) but is still not ideal.
In addition, it appears that pytables isn't the most efficient way of handling matrices with lots of rows (or could there be a way of specifying the chunksize to get rid of this message?). I am getting the following warning:
The Leaf ``/M`` is exceeding the maximum recommended rowsize (104857600 bytes);
be ready to see PyTables asking for *lots* of memory and possibly slow
I/O. You may want to reduce the rowsize by trimming the value of
dimensions that are orthogonal (and preferably close) to the *main*
dimension of this leave. Alternatively, in case you have specified a
very small/large chunksize, you may want to increase/decrease it.
I'm just wondering whether there are better solutions for assigning values to arbitrary 2D slices of large matrices?
First of all, note that in numpy (not sure about pytables) M[0, [1,2,3], [8,15,30]] will return an array of shape (3,) corresponding to elements M[0,1,8], M[0,2,15] and M[0,3,30], so assigning np.zeros((3,3)) to that will raise an error.
Now, the following works fine for me:
import numpy as np

np.save('M.npy', np.random.randn(5,5,5)) # create some dummy matrix
M = np.load('M.npy', mmap_mode='r+') # load such matrix as a memmap
M[[0,1,2],[1,2,3],[2,3,4]] = 0
M.flush() # make sure thing is updated on disk
del M
M = np.load('M.npy', mmap_mode='r+') # re-load matrix
print(M[[0,1,2],[1,2,3],[2,3,4]]) # should show array([0., 0., 0.])
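If the intent was actually to assign a full 3x3 block to the cross product of those row and column indices (as the np.zeros((3, 3)) in the question suggests), np.ix_ builds the appropriate open mesh. A sketch with dummy dimensions of my own choosing:

import numpy as np

np.save('M.npy', np.random.randn(5, 40, 40))  # dummy matrix large enough for the indices
M = np.load('M.npy', mmap_mode='r+')

rows = [1, 2, 3]
cols = [8, 15, 30]
M[0][np.ix_(rows, cols)] = np.zeros((3, 3))  # assigns the whole 3x3 block
M.flush()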
I am trying to get rid of the for loop and instead do an array-matrix multiplication to decrease the processing time when the weights array is very large:
import numpy as np
sequence = [np.random.random(10), np.random.random(10), np.random.random(10)]
weights = np.array([[0.1,0.3,0.6],[0.5,0.2,0.3],[0.1,0.8,0.1]])
Cov_matrix = np.matrix(np.cov(sequence))
results = []
for w in weights:
    result = np.matrix(w)*Cov_matrix*np.matrix(w).T
    results.append(result.A)
Where:
Cov_matrix is a 3x3 matrix
weights is an array of length n with n 1x3 matrices in it.
Is there a way to multiply/map weights to Cov_matrix and bypass the for loop? I am not very familiar with all the numpy functions.
I'd like to reiterate what's already been said in another answer: the np.matrix class has many more disadvantages than advantages these days, and I suggest moving to the use of the np.array class alone. Matrix multiplication of arrays can easily be written using the @ operator, so the notation is in most cases as elegant as for the matrix class (and arrays don't have several restrictions that matrices do).
With that out of the way, what you need can be done in terms of a call to np.einsum. We need to contract certain indices of three matrices while keeping one index alone in two matrices. That is, we want to perform w_{ij} * Cov_{jk} * w.T_{ki} with a summation over j, k, giving us an array with i indices. The following call to einsum will do:
res = np.einsum('ij,jk,ik->i', weights, Cov_matrix, weights)
Note that the above will give you a single 1d array, whereas you originally had a list of arrays with shape (1,1). I suspect the above result actually makes more sense. Also, note that I omitted the transpose in the second weights argument, which is why the corresponding summation indices appear as ik rather than ki. This should be marginally faster.
To prove that the above gives the same result:
In [8]: results # original
Out[8]: [array([[0.02803215]]), array([[0.02280609]]), array([[0.0318784]])]
In [9]: res # einsum
Out[9]: array([0.02803215, 0.02280609, 0.0318784 ])
The same can be achieved by working with the weights as a matrix and then looking at the diagonal elements of the result. Namely:
np.diag(weights.dot(Cov_matrix).dot(weights.transpose()))
which gives:
array([0.03553664, 0.02394509, 0.03765553])
This does more calculations than necessary (calculates off-diagonals) so maybe someone will suggest a more efficient method.
Note: I'd suggest slowly moving away from np.matrix and instead work with np.array. It takes a bit of getting used to not being able to do A*b but will pay dividends in the long run. Here is a related discussion.
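For reference, a sketch of the same computation written with plain arrays and the @ operator (reusing the setup from the question, no np.matrix involved):

import numpy as np

sequence = [np.random.random(10), np.random.random(10), np.random.random(10)]
weights = np.array([[0.1, 0.3, 0.6], [0.5, 0.2, 0.3], [0.1, 0.8, 0.1]])

cov = np.cov(sequence)                          # plain (3, 3) ndarray
res_einsum = np.einsum('ij,jk,ik->i', weights, cov, weights)
res_diag = np.diag(weights @ cov @ weights.T)   # same values, computes off-diagonals too

assert np.allclose(res_einsum, res_diag)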
I'm doing some work where I have to load and manipulate CT images in a format called the Analyze 7.5 file format.
Part of this manipulation - which takes absolutely ages with large images - is loading the raw binary data to a numpy array and reshaping it to the correct dimensions. Here is an example:
import numpy as np

headshape = (512,512,245) # The shape the image should be
headdata = np.fromfile("Analyze_CT_Head.img", dtype=np.int16) # loads the image as a flat array, 64225280 long. For testing, a large array of random numbers would do
head_shaped = np.zeros(shape=headshape) # Array to hold the reshaped data

# This set of loops is the problem
for ux in range(0, headshape[0]):
    for uy in range(0, headshape[1]):
        for uz in range(0, headshape[2]):
            head_shaped[ux][uy][uz] = headdata[ux + headshape[0]*uy + (headshape[0]*headshape[1])*uz] # Note the weird indexing of the flat array - this is the pixel ordering I have to work with
I know numpy can do reshaping of arrays quickly, but I can't figure out the correct combination of transformations needed to replicate the effect of the nested loops.
Is there a way to replicate that strange indexing with some combination of numpy.reshape/numpy.ravel etc?
Take a look at nibabel, a Python library that implements readers/writers for the 'Analyze' format. It may have already solved this for you.
You could use reshape in combination with swapaxes:
import numpy as np

headshape = (2,3,4)
headdata = np.random.rand(2*3*4)
head_shaped_short = headdata.reshape(headshape[::-1]).swapaxes(0,2)
worked fine in my case.
NumPy stores arrays flat in memory. The strides attribute contains the information needed to map multidimensional indices to flat indices in memory.
Here is some further reading about numpy's memory layout.
This should work for you:
# get the number of bytes of the specified dtype
dtype = headdata.dtype
byte_count = dtype.itemsize
headdata = headdata.reshape(headshape)
x, y, z = headshape
headdata.strides = (byte_count, byte_count * x, byte_count * x * y)
# copy data to get back to standard memory layout
data = headdata.copy()
The code works by setting the strides attribute to reflect your custom memory layout, which creates the (hopefully) correct multidimensional array. After that, it copies the whole array into data in order to get back to a standard memory layout.
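For what it's worth, the indexing scheme in the question (ux + X*uy + X*Y*uz) is exactly a Fortran (column-major) layout, so both answers should be equivalent to a one-line Fortran-order reshape. A small sketch on a tiny dummy array:

import numpy as np

X, Y, Z = 4, 3, 2                 # tiny stand-in for (512, 512, 245)
headdata = np.arange(X * Y * Z)   # stand-in for the data read with np.fromfile

# Interpret the flat buffer in Fortran (column-major) order:
head_shaped = headdata.reshape((X, Y, Z), order='F')

# Spot check against the indexing scheme from the question:
ux, uy, uz = 3, 1, 1
assert head_shaped[ux, uy, uz] == headdata[ux + X*uy + X*Y*uz]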
In Numpy 1.4.1, what is the simplest or most efficient way of calculating the histogram of a masked array? numpy.histogram and pyplot.hist do count the masked elements, by default!
The only simple solution I can think of right now involves creating a new array with the non-masked values:
histogram(m_arr[~m_arr.mask])
This is not very efficient, though, as this unnecessarily creates a new array. I'd be happy to read about better ideas!
(Undeleting this as per discussion above...)
I'm not sure whether or not the numpy developers would consider this a bug or expected behavior. I asked on the mailing list, so I guess we'll see what they say.
Either way, it's an easy fix. Patching numpy/lib/function_base.py to use numpy.asanyarray rather than numpy.asarray on the inputs to the function will allow it to properly use masked arrays (or any other subclass of an ndarray) without creating a copy.
Edit: It seems like it is expected behavior. As discussed here:
If you want to ignore masked data it's just one extra function call:

histogram(m_arr.compressed())

I don't think the fact that this makes an extra copy will be relevant, because I guess full masked array handling inside histogram will be a lot more expensive.

Using asanyarray would also allow matrices in and other subtypes that might not be handled correctly by the histogram calculations.

For anything else besides dropping masked observations, it would be necessary to figure out what the masked array definition of a histogram is, as Bruce pointed out.
Try hist(m_arr.compressed()).
This is a super old question, but these days I just use:
numpy.histogram(m_arr, bins=.., range=.., density=False, weights=m_arr_mask)
Where m_arr_mask is an array with the same shape as m_arr, consisting of 0 values for elements of m_arr to be excluded from the histogram and 1 values for elements that are to be included.
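A small sketch of building that weights array from a masked array's own mask (the dummy data and bin choices are mine; note that range should be given explicitly, since the values hidden behind the mask would still influence an automatic range):

import numpy as np
import numpy.ma as ma

m_arr = ma.masked_greater(np.random.random(1000), 0.9)  # dummy masked data
m_arr_mask = (~ma.getmaskarray(m_arr)).astype(int)      # 1 = include, 0 = exclude

hist, edges = np.histogram(m_arr, bins=10, range=(0.0, 1.0),
                           density=False, weights=m_arr_mask)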
After running into casting issues when trying Erik's solution (see https://github.com/numpy/numpy/issues/16616), I decided to write a numba function to achieve this behavior.
Some of the code was inspired by https://numba.pydata.org/numba-examples/examples/density_estimation/histogram/results.html. I added the mask bit.
import numpy
import numba

@numba.jit(nopython=True)
def compute_bin(x, bin_edges):
    # assuming uniform bins for now
    n = bin_edges.shape[0] - 1
    a_min = bin_edges[0]
    a_max = bin_edges[-1]

    # special case to mirror NumPy behavior for last bin
    if x == a_max:
        return n - 1  # a_max always in last bin

    bin = int(n * (x - a_min) / (a_max - a_min))

    if bin < 0 or bin >= n:
        return None
    else:
        return bin


@numba.jit(nopython=True)
def masked_histogram(img, bin_edges, mask):
    hist = numpy.zeros(len(bin_edges) - 1, dtype=numpy.intp)
    for i, value in enumerate(img.flat):
        if mask.flat[i]:
            bin = compute_bin(value, bin_edges)
            if bin is not None:
                hist[int(bin)] += 1
    return hist  # , bin_edges
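A quick usage sketch (the image, mask and bin edges here are dummies of my own making):

import numpy

img = numpy.random.random((1000, 1000))
mask = img > 0.5                          # keep only these pixels
bin_edges = numpy.linspace(0.0, 1.0, 51)  # 50 uniform bins

hist = masked_histogram(img, bin_edges, mask)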
The speedup is significant on a (1000, 1000) image.