Suppose we only have images stored as .npy files. Is it possible to resize the images without converting them back to image files? (I'm looking for an approach that runs fast.)
For more info: I asked for a way that avoids converting back to image files. I have the images, but I don't want to use them directly in the code, because my dataset is too large and running on the image files is very slow. On the other hand, I'm not sure which size is best for my images. So I'm looking for a way to first convert the images to .npy, save the .npy file, and then preprocess the .npy file, for example by resizing the images.
Try PIL, maybe it's fast enough for you.
import numpy as np
from PIL import Image

arr = np.load('img.npy')
img = Image.fromarray(arr)
img = img.resize((100, 100))   # resize() returns a new image rather than working in place
arr_resized = np.asarray(img)  # back to a NumPy array if you need one
Note that you have to compute the new size yourself if you want to keep the aspect ratio. Or you can use Image.thumbnail(), which preserves the aspect ratio and can take an antialiasing filter.
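For instance (a minimal sketch, assuming arr holds a uint8 image):

img = Image.fromarray(arr)
img.thumbnail((100, 100), Image.LANCZOS)  # in-place: fits within 100x100 and keeps the aspect ratio
arr_small = np.asarray(img)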
There's also scikit-image, which has its own resize implementation rather than going through PIL. It works on NumPy arrays:
import skimage.transform as st
st.resize(arr, (100, 100))
I guess the other option is OpenCV.
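If you go that route, something like this should do it (a sketch; note that cv2.resize takes the target size as (width, height)):

import cv2
arr_small = cv2.resize(arr, (100, 100), interpolation=cv2.INTER_AREA)  # INTER_AREA is a good default for shrinking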
If you are only dealing with numpy arrays, I think slicing would be enough
Say, the shape of the loaded numpy array is (m, n) (one channel), and the target shape is (a, b). Then, the stride can be (s1, s2) = (m // a, n // b)
So the original array can be sliced by
new_array = old_array[::s1, ::s2]
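For example (a quick sketch with made-up sizes):

import numpy as np

m, n, a, b = 600, 800, 300, 400
s1, s2 = m // a, n // b                     # (2, 2)
old_array = np.arange(m * n).reshape(m, n)
new_array = old_array[::s1, ::s2]
print(new_array.shape)                      # (300, 400)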
EDIT
Scaling up an array is also quite straightforward if you use masks for advanced indexing. For example, the shape of the original array is (m, n), and the target shape is (a, b). Then, as an example
a, b = 300, 200
m, n = 3, 4
original = np.linspace(1, 12, 12).reshape(3, 4)
canvas = np.zeros((a, b))
(s1, s2) = (a // m, b // n) # the scale factors
# the two masks
mask_x = np.concatenate([np.ones(s1) * ind for ind in range(m)])
mask_y = np.concatenate([np.ones(s2) * ind for ind in range(n)])
# make sure the residuals are taken into account
if len(mask_x) < a: mask_x = np.concatenate([mask_x, np.ones(a - len(mask_x)) * (m - 1)])
if len(mask_y) < b: mask_y = np.concatenate([mask_y, np.ones(b - len(mask_y)) * (n - 1)])
mask_x = mask_x.astype(int)  # int8 would overflow for sources with more than 128 rows
mask_y = mask_y.astype(int)
canvas = original[mask_x, :]
canvas = canvas[:, mask_y]
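As an aside, when a and b are exact multiples of m and n, the same nearest-neighbour upscaling can be written more compactly with np.repeat (a sketch of my own, not part of the mask approach above):

upscaled = np.repeat(np.repeat(original, s1, axis=0), s2, axis=1)  # shape (a, b) when a == s1 * m and b == s2 * n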
Is it possible to use non-C/Fortran ordering in Halide? Given dimensions x, y, c, I want x to vary the fastest, then c to vary the second fastest (the strides in NumPy, at least, would be .strides = (W*C, 1, W)). Our memory layout is a stack of images where the channels of each image are stacked by scanline.
(Sorry if the layout still isn't clear enough; I can try to clarify.) Using the Python bindings, I always get ValueError: ndarray is not contiguous when trying to pass in my NumPy array with .strides set.
I've tried changing the NumPy array to use contiguous strides (without changing the memory layout) just to get it into Halide, then calling .set_stride on the Halide side, but no luck. I just want to make sure I'm not trying to do something that can't or shouldn't be done.
I think this is similar to the line-by-line layout mentioned at https://halide-lang.org/tutorials/tutorial_lesson_16_rgb_generate.html, except with more dimensions in c, since the images are "stacked" along the channel axis (to produce a W x H x (C*image_count) tensor).
Any advice would be much appreciated.
Thanks!
This is more of a numpy question than a Halide one. The following Halide code illustrates use of an array in the shape you are looking for (I think):
import halide as hl
import numpy as np
x, y, c = hl.Var('x'), hl.Var('y'), hl.Var('c')
f = hl.Func('f')
f[x, y, c] = (x * 3) + (y * 12) + c
# This would be necessary for internally allocated buffers
# f.reorder_storage(x, c, y)
# These control output layout
f.output_buffer().dim(1).set_stride(12)
f.output_buffer().dim(2).set_stride(3)
# Probably wanted for efficiency
f.reorder(x, c, y)
result = f.realize(4, 5, 3)
print(result, result[0, 1, 1])
np_result = np.array(result)
print(np_result, np_result[0, 1, 1])
print(np_result.shape, " ", np_result.strides, " ", np_result.flags)
I'm not well versed in NumPy and not sure how you would allocate an array in that layout from scratch, but the answer might have to be something like np.lib.stride_tricks.as_strided.
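Something along these lines might do it (a rough sketch of my own using np.lib.stride_tricks.as_strided, assuming element strides of x=1, c=W, y=W*C as described in the question; the W, H, C values are made up):

import numpy as np

W, H, C = 4, 5, 3  # hypothetical sizes
flat = np.zeros(W * H * C, dtype=np.float32)  # one contiguous allocation
itemsize = flat.itemsize
# View the flat memory with logical (x, y, c) indexing: x varies fastest
# in memory (stride 1), then c (stride W), then y (stride W*C).
view = np.lib.stride_tricks.as_strided(
    flat,
    shape=(W, H, C),
    strides=(1 * itemsize, W * C * itemsize, W * itemsize),
)
print(view.shape, view.strides)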
What is the best way to transform a 1D array that contains RGB data into a 3D RGB array?
If the array were in this order, it would be easy (a single reshape):
RGB RGB RGB RGB...
However, my array is in the form
RRRR...GGGG....BBBB
or sometimes even,
GGGG....RRRR....BBBB (the result should still be RGB, not GRB)
I could of course derive some Python way to achieve this. I even tried a NumPy solution; it works, but it is obviously a bad solution. I wonder what the best way is. Maybe a built-in NumPy function?
My solution:
for i in range(len(video_string) // 921600 - 1):  # Consecutive frames iterated over.
    frame = video_string[921600 * i: 921600 * (i + 1)]  # One frame
    array = numpy.frombuffer(frame, dtype=numpy.uint8)  # NumPy array from one frame (fromstring is deprecated)
    r = array[:307200].reshape(480, 640)
    g = array[307200:614400].reshape(480, 640)
    b = array[614400:].reshape(480, 640)
    rgb = numpy.dstack((b, r, g))  # Bring the channels together as the 3rd dimension
Don't let the for loop confuse you; I just have frames concatenated to each other in a string, like a video, which is not part of the question.
What did not help me: in this question, the r, g, b values are already 2D arrays, so it doesn't help my situation.
Edit1: Desired array shape is 640 x 480 x 3
Reshape to 2D, transpose and then reshape back to 3D for RRRR...GGGG....BBBB form -
a1D.reshape(3,-1).T.reshape(height,-1,3) # assuming height is given
Or use reshape with Fortran order and then swap axes -
a1D.reshape(-1,height,3,order='F').swapaxes(0,1)
Sample run -
In [146]: np.random.seed(0)
In [147]: a = np.random.randint(11,99,(4,2,3)) # original rgb image
In [148]: a1D = np.ravel([a[...,0].ravel(), a[...,1].ravel(), a[...,2].ravel()])
In [149]: height = 4
In [150]: np.allclose(a, a1D.reshape(3,-1).T.reshape(height,-1,3))
Out[150]: True
In [151]: np.allclose(a, a1D.reshape(-1,height,3,order='F').swapaxes(0,1))
Out[151]: True
For the GGGG....RRRR....BBBB form, simply append [..., [1, 0, 2]].
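For example, with the first approach (a1D_grb being a hypothetical array holding the data in G, R, B channel order):

rgb = a1D_grb.reshape(3, -1).T.reshape(height, -1, 3)[..., [1, 0, 2]]  # reorder channels G,R,B -> R,G,B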
I have a three dimensional numpy array of images (CIFAR-10 dataset). The image array shape is like below:
a = np.random.rand(32, 32, 3)
Before I do any deep learning, I want to normalize the data to get better results. With a 1D array, I know we can do min-max normalization like this:
v = np.random.rand(6)
(v - v.min())/(v.max() - v.min())
Out[68]:
array([ 0.89502294, 0. , 1. , 0.65069468, 0.63657915,
0.08932196])
However, when it comes to a 3D array, I am totally lost. Specifically, I have the following questions:
Along which axis do we take the min and max?
How do we implement this with the 3D array?
I appreciate your help!
EDIT:
It turns out I need to work with a 4D Numpy array with shape (202, 32, 32, 3), so the first dimension would be the index for the image, and the last 3 dimensions are the actual image. It'll be great if someone can provide me with the code to normalize such a 4D array. Thanks!
EDIT 2:
Thanks to @Eric's code below, I've figured it out:
x_min = x.min(axis=(1, 2), keepdims=True)
x_max = x.max(axis=(1, 2), keepdims=True)
x = (x - x_min)/(x_max-x_min)
Assuming you're working with image data of shape (W, H, 3), you should probably normalize over each channel (axis=2) separately, as mentioned in the other answer.
You can do this with:
# keepdims makes the result shape (1, 1, 3) instead of (3,). This doesn't matter here, but
# would matter if you wanted to normalize over a different axis.
v_min = v.min(axis=(0, 1), keepdims=True)
v_max = v.max(axis=(0, 1), keepdims=True)
(v - v_min)/(v_max - v_min)
Along which axis do we take the min and max?
To answer this we probably need more information about your data, but in general, when discussing 3-channel images for example, we would normalize using the per-channel min and max. This means that we would perform the normalization 3 times: once per channel.
Here's an example:
img = numpy.random.randint(0, 100, size=(10, 10, 3)) # Generating some random numbers
img = img.astype(numpy.float32) # converting array of ints to floats
img_a = img[:, :, 0]
img_b = img[:, :, 1]
img_c = img[:, :, 2] # Extracting single channels from 3 channel image
# The above code could also be replaced with cv2.split(img), which returns 3 numpy arrays (using OpenCV)
# normalizing per channel data:
img_a = (img_a - numpy.min(img_a)) / (numpy.max(img_a) - numpy.min(img_a))
img_b = (img_b - numpy.min(img_b)) / (numpy.max(img_b) - numpy.min(img_b))
img_c = (img_c - numpy.min(img_c)) / (numpy.max(img_c) - numpy.min(img_c))
# putting the 3 channels back together:
img_norm = numpy.empty((10, 10, 3), dtype=numpy.float32)
img_norm[:, :, 0] = img_a
img_norm[:, :, 1] = img_b
img_norm[:, :, 2] = img_c
Edit: It just occurred to me that once you have the one-channel data (a 32x32 image, for instance) you can simply use:
from sklearn.preprocessing import normalize
img_a_norm = normalize(img_a)
Note, though, that sklearn's normalize scales each row to unit norm by default, which is not the same as the min-max scaling above.
How do we work with the 3D array?
Well, this is a bit of a big question. If you need functions like array-wise min and max, I would use the NumPy versions. Indexing single channels, for instance, is achieved by slicing along the channel axis, as you can see in my example above.
Also, please refer to NumPy's documentation of ndarray at https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.html to learn more. They really have an amazing set of tools for n-dimensional arrays.
There are different approaches here. You can either decide to normalize over the whole batch of images or normalize per single image. To do that you can either use the mean of a single image or use the mean of the whole batch of images or use a fixed mean from another dataset - e.g. you can use the ImageNet mean value.
If you want to do the same as TensorFlow's tf.image.per_image_standardization, you should standardize each image with its own mean and standard deviation. So you loop through all the images and do the normalization over all axes of each single image, like this:
import math
import numpy as np
from PIL import Image
# open images
image_1 = Image.open("your_image_1.jpg")
image_2 = Image.open("your_image_2.jpg")
images = [image_1, image_2]
images = np.array(images)
standardized_images = []
# standardize images
for image in images:
    mean = image.mean()
    stddev = image.std()
    adjusted_stddev = max(stddev, 1.0 / math.sqrt(image.size))
    standardized_image = (image - mean) / adjusted_stddev
    standardized_images.append(standardized_image)
standardized_images = np.array(standardized_images)
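A vectorized equivalent without the Python loop (my sketch, assuming all images share the same shape, so images has shape (N, H, W, C)):

means = images.mean(axis=(1, 2, 3), keepdims=True)
stds = images.std(axis=(1, 2, 3), keepdims=True)
num_elements = np.prod(images.shape[1:])                   # elements per image
adjusted_stds = np.maximum(stds, 1.0 / np.sqrt(num_elements))
standardized_images = (images - means) / adjusted_stds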
I'm trying to speed up my code to perform some numerical calculations where I need to multiply 3 matrices with an array. The structure of the problem is the following:
The array has a shape of (N, 10)
The first matrix is constant along the dynamic dimension of the array and has a shape of (10, 10)
The other two matrices vary along the first dimension of the array and have a (N, 10, 10) shape
The result of the calculation should be an array with a (N, 10) shape
I've implemented a solution using for loops that is working, but I'd like to have a better performance so I'm trying to use the numpy functions. I've tried using numpy.tensordot but when multiplying the dynamic matrices with the array I get a shape of (N, 10, N) instead of (N, 10)
My for loop is the following:
res = np.zeros(temp_rho.shape, dtype=np.complex128)
for i in range(temp_rho.shape[0]):
    res[i] = np.dot(self.constMatrix, temp_rho[i])
    res[i] += np.dot(self.dinMat1[i], temp_rho[i])
    res[i] += np.dot(self.dinMat2[i], np.conj(temp_rho[i]))
#temp_rho.shape = (N, 10)
#res.shape = (N, 10)
#self.constMatrix.shape = (10, 10)
#self.dinMat1.shape = (N, 10, 10)
#self.dinMat2.shape = (N, 10, 10)
How should this code be implemented with NumPy dot products so that it returns the correct dimensions?
Here's an approach using a combination of np.dot and np.einsum -
parte1 = constMatrix.dot(temp_rho.T).T
parte2 = np.einsum('ijk,ik->ij',dinMat1, temp_rho)
parte3 = np.einsum('ijk,ik->ij',dinMat2, np.conj(temp_rho))
out = parte1 + parte2 + parte3
Alternative way to get parte1 would be with np.tensordot -
parte1 = np.tensordot(temp_rho, constMatrix, axes=([1],[1]))
Why doesn't numpy.tensordot work for the later two sum-reductions?
Well, we need to keep the first axis of dinMat1/dinMat2 aligned with the first axis of temp_rho/np.conj(temp_rho). That isn't possible with tensordot, because the axes that are not sum-reduced are kept as separate output axes (an outer product) rather than aligned elementwise. Therefore, with np.tensordot we end up with two axes of length N, one from each of the two inputs.
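A quick demonstration of that shape blow-up, with made-up inputs:

import numpy as np

N = 5
dinMat1 = np.random.rand(N, 10, 10)
temp_rho = np.random.rand(N, 10)

out_td = np.tensordot(dinMat1, temp_rho, axes=([2], [1]))
print(out_td.shape)  # (5, 10, 5): both N axes are kept, not aligned

out_es = np.einsum('ijk,ik->ij', dinMat1, temp_rho)
print(out_es.shape)  # (5, 10)

# The einsum result is the "diagonal" of the tensordot result:
print(np.allclose(out_es, out_td[np.arange(N), :, np.arange(N)]))  # True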
I have a Numpy array of shape (4320,8640). I would like to have an array of shape (2160,4320).
You'll notice that each cell of the new array maps to a 2x2 set of cells in the old array. I would like a cell's value in the new array to be the sum of the values in this block in the old array.
I can achieve this as follows:
import numpy
#Generate an example array
arr = numpy.random.randint(10,size=(4320,8640))
#Perform the transformation
arrtrans = numpy.array([ [ arr[y][x]+arr[y+1][x]+arr[y][x+1]+arr[y+1][x+1] for x in range(0,8640,2)] for y in range(0,4320,2)])
But this is slow and more than a little ugly.
Is there a way to do this using Numpy (or an interoperable package)?
When the window fits exactly into the array, reshaping to more dimensions and collapsing the extra dimensions with np.sum is sort of the canonical way of doing this with numpy:
>>> a = np.random.rand(4320,8640)
>>> a.shape
(4320, 8640)
>>> a_small = a.reshape(2160, 2, 4320, 2).sum(axis=(1, 3))
>>> a_small.shape
(2160, 4320)
>>> np.allclose(a_small[100, 203], a[200:202, 406:408].sum())
True
I'm not sure the package you want exists, but this code will run much faster.
>>> arrtrans2 = arr[::2, ::2] + arr[::2, 1::2] + arr[1::2, ::2] + arr[1::2, 1::2]
>>> numpy.allclose(arrtrans, arrtrans2)
True
Here ::2 and 1::2 select the indices 0, 2, 4, ... and 1, 3, 5, ... respectively.
You are operating on sliding windows of the original array. There are numerous questions and answers on SO regarding sliding windows and numpy. By manipulating the strides of an array, this process can be sped up considerably. Here is a generic function that will return (x, y) windows of the array, with or without overlap. Using this stride trick appears to be just a hair slower than @mskimm's solution, but it's a nice thing to have in your toolkit. This function is not mine - it was found at Efficient Overlapping Windows with Numpy.
import numpy as np
from numpy.lib.stride_tricks import as_strided as ast
def norm_shape(shape):
    '''
    Normalize numpy array shapes so they're always expressed as a tuple,
    even for one-dimensional shapes.

    Parameters
        shape - an int, or a tuple of ints

    Returns
        a shape tuple

    from http://www.johnvinyard.com/blog/?p=268
    '''
    try:
        i = int(shape)
        return (i,)
    except TypeError:
        # shape was not a number
        pass

    try:
        t = tuple(shape)
        return t
    except TypeError:
        # shape was not iterable
        pass

    raise TypeError('shape must be an int, or a tuple of ints')

def sliding_window(a, ws, ss=None, flatten=True):
    '''
    Return a sliding window over a in any number of dimensions

    Parameters:
        a  - an n-dimensional numpy array
        ws - an int (a is 1D) or tuple (a is 2D or greater) representing the size
             of each dimension of the window
        ss - an int (a is 1D) or tuple (a is 2D or greater) representing the
             amount to slide the window in each dimension. If not specified, it
             defaults to ws.
        flatten - if True, all slices are flattened; otherwise, there is an
             extra dimension for each dimension of the input.

    Returns
        an array containing each n-dimensional window from a

    from http://www.johnvinyard.com/blog/?p=268
    '''
    if ss is None:
        # ss was not provided: the windows will not overlap in any direction.
        ss = ws
    ws = norm_shape(ws)
    ss = norm_shape(ss)

    # convert ws, ss, and a.shape to numpy arrays so that we can do math in
    # every dimension at once.
    ws = np.array(ws)
    ss = np.array(ss)
    shape = np.array(a.shape)

    # ensure that ws, ss, and a.shape all have the same number of dimensions
    ls = [len(shape), len(ws), len(ss)]
    if 1 != len(set(ls)):
        error_string = 'a.shape, ws and ss must all have the same length. They were {}'
        raise ValueError(error_string.format(str(ls)))

    # ensure that ws is smaller than a in every dimension
    if np.any(ws > shape):
        error_string = 'ws cannot be larger than a in any dimension. a.shape was {} and ws was {}'
        raise ValueError(error_string.format(str(a.shape), str(ws)))

    # how many slices will there be in each dimension?
    newshape = norm_shape(((shape - ws) // ss) + 1)
    # the shape of the strided array will be the number of slices in each
    # dimension plus the shape of the window (tuple addition)
    newshape += norm_shape(ws)

    # the strides tuple will be the array's strides multiplied by step size,
    # plus the array's strides (tuple addition)
    newstrides = norm_shape(np.array(a.strides) * ss) + a.strides
    strided = ast(a, shape=newshape, strides=newstrides)
    if not flatten:
        return strided

    # Collapse strided so that it has one more dimension than the window,
    # i.e. the new array is a flat list of slices.
    meat = len(ws) if ws.shape else 0
    firstdim = (np.prod(newshape[:-meat]),) if ws.shape else ()
    dim = firstdim + newshape[-meat:]
    # remove any dimensions with size 1 (tuple() is needed on Python 3,
    # where filter returns an iterator that reshape cannot consume)
    dim = tuple(filter(lambda i: i != 1, dim))
    return strided.reshape(dim)
Usage:
# 2x2 windows with NO overlap
b = sliding_window(arr, (2, 2), flatten=False)  # shape (2160, 4320, 2, 2)
c = b.sum((2, 3))                               # sum over the two window axes
An approximately 24% performance improvement comes from using numpy.einsum instead:
c = np.einsum('ijkl -> ij', b)
One SO Q&A example is How can I efficiently process a numpy array in blocks similar to Matlab's blkproc (blockproc) function; the selected answer there would work for you.