After doing some processing on an audio or image array, it needs to be normalized within a range before it can be written back to a file. This can be done like so:
# Normalize audio channels to between -1.0 and +1.0
audio[:,0] = audio[:,0]/abs(audio[:,0]).max()
audio[:,1] = audio[:,1]/abs(audio[:,1]).max()
# Normalize image to between 0 and 255
image = image/(image.max()/255.0)
Is there a less verbose way to do this, ideally with a convenience function? matplotlib.colors.Normalize() doesn't seem to be related.
# Normalize audio channels to between -1.0 and +1.0
audio /= np.max(np.abs(audio),axis=0)
# Normalize image to between 0 and 255
image *= (255.0/image.max())
Using /= and *= allows you to eliminate an intermediate temporary array, thus saving some memory. Multiplication is less expensive than division, so
image *= 255.0/image.max() # Uses 1 division and image.size multiplications
is marginally faster than
image /= image.max()/255.0 # Uses 1+image.size divisions
Since we are using basic numpy methods here, I think this is about as efficient a solution in numpy as can be.
In-place operations do not change the dtype of the container array. Since the desired normalized values are floats, the audio and image arrays need to have a floating-point dtype before the in-place operations are performed.
If they are not already of floating-point dtype, you'll need to convert them using astype. For example,
image = image.astype('float64')
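Putting the pieces together, a minimal sketch of the full in-place pipeline (the input arrays here are hypothetical stand-ins):
import numpy as np
# hypothetical int16 stereo audio and 8-bit image, converted to float first
audio = np.random.randint(-2**15, 2**15, size=(1000, 2)).astype('float64')
image = np.random.randint(0, 200, size=(64, 64)).astype('float64')
audio /= np.max(np.abs(audio), axis=0)  # each channel now peaks at -1.0 or +1.0
image *= 255.0 / image.max()            # brightest pixel becomes 255.0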
If the array contains both positive and negative data, I'd go with:
import numpy as np
a = np.random.rand(3,2)
# Normalised [0,1]
b = (a - np.min(a))/np.ptp(a)
# Normalised [0,255] as integer: don't forget the parentheses before astype(int)
c = (255*(a - np.min(a))/np.ptp(a)).astype(int)
# Normalised [-1,1]
d = 2.*(a - np.min(a))/np.ptp(a)-1
If the array contains nan, one solution could be to just remove them as:
def nan_ptp(a):
    return np.ptp(a[np.isfinite(a)])

b = (a - np.nanmin(a))/nan_ptp(a)
However, depending on the context you might want to treat nan differently, e.g. interpolate the value, replace it with e.g. 0, or raise an error.
Finally, worth mentioning even if it's not OP's question, standardization:
e = (a - np.mean(a)) / np.std(a)
You can also rescale using sklearn. The advantages are that you can normalize the standard deviation in addition to mean-centering the data, and that you can do this along either axis, by features, or by records.
from sklearn.preprocessing import scale
X = scale( X, axis=0, with_mean=True, with_std=True, copy=True )
The keyword arguments axis, with_mean, with_std are self-explanatory, and are shown in their default state. The argument copy performs the operation in place if it is set to False. See the scikit-learn documentation for details.
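For instance, a small sketch of the axis and with_std options on toy data:
import numpy as np
from sklearn.preprocessing import scale
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_centered = scale(X, with_std=False)  # mean-center each column, keep its spread
X_rows = scale(X, axis=1)              # standardize each row (record) instead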
You are trying to min-max scale the values of audio between -1 and +1 and image between 0 and 255.
Using sklearn.preprocessing.minmax_scale should easily solve your problem.
e.g.:
from sklearn.preprocessing import minmax_scale
audio_scaled = minmax_scale(audio, feature_range=(-1,1))
and
shape = image.shape
image_scaled = minmax_scale(image.ravel(), feature_range=(0,255)).reshape(shape)
note: Not to be confused with the operation that scales the norm (length) of a vector to a certain value (usually 1), which is also commonly referred to as normalization.
This answer to a similar question solved the problem for me with
np.interp(a, (a.min(), a.max()), (-1, +1))
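Note that np.interp uses the array's global minimum and maximum, so for per-channel audio normalization you would apply it column by column; a sketch with hypothetical data:
import numpy as np
audio = np.random.randn(1000, 2)  # hypothetical stereo signal
normalized = np.column_stack(
    [np.interp(ch, (ch.min(), ch.max()), (-1, +1)) for ch in audio.T]
)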
You can use the "i" (as in idiv, imul..) version, and it doesn't look half bad:
image /= (image.max()/255.0)
For the other case, you can write a function to normalize a two-dimensional array by columns:
def normalize_columns(arr):
    rows, cols = arr.shape
    for col in range(cols):
        arr[:, col] /= abs(arr[:, col]).max()
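Usage, assuming a 2-D float array:
import numpy as np
arr = np.random.randn(1000, 2)
normalize_columns(arr)  # in place: each column now peaks at -1.0 or +1.0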
A simple solution is using the scalers offered by the sklearn.preprocessing library.
import sklearn.preprocessing as sk

scaler = sk.MinMaxScaler(feature_range=(0, 250))
scaler = scaler.fit(X)
X_scaled = scaler.transform(X)
# Checking reconstruction
X_rec = scaler.inverse_transform(X_scaled)
The error X_rec-X will be zero. You can adjust the feature_range for your needs, or even use a standard scaler, sk.StandardScaler().
I tried following this, and got the error
TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''
The numpy array I was trying to normalize was an integer array. It seems NumPy deprecated this kind of implicit type casting in versions > 1.10, and you have to use numpy.true_divide() to resolve that.
arr = np.array(img)
arr = np.true_divide(arr,[255.0],out=None)
img was a PIL.Image object.
Related
I have a numpy array (let's say 100x64x64).
My goal is to scale each 64x64 layer independently and store a scaler for later use.
This is how it can be achieved with a for-loop solution:
from sklearn.preprocessing import MinMaxScaler
import joblib

scalers_dict = {}
for i in range(X.shape[0]):
    scalers_dict[i] = MinMaxScaler()
    # fit the scaler and transform this layer in place
    X[i, :, :] = scalers_dict[i].fit_transform(X[i, :, :])
# saving dict of scalers
joblib.dump(value=scalers_dict, filename="dict_of_scalers.scaler")
My real array is much bigger, and it takes quite a while to iterate through it.
Do you have in mind a more vectorized solution, or is a for-loop the only way?
If I understand correctly how MinMaxScaler works, it scales each column independently, computing its statistics along axis=0.
To make this useful for your case, you'd need to transform X into a (64 * 64, 100) array:
s = X.shape
X = np.moveaxis(X, 0, -1).reshape(-1, s[0])
Alternatively, you can write
X = X.reshape(s[0], -1).T
Now you can do the scaling with
M = MinMaxScaler()
X = M.fit_transform(X)
Since the fit statistics are computed along the first dimension, there will be 100 of them, one per layer. The transform then broadcasts correctly because the last dimension has that same size.
To get the original shape back, invert the original transformation:
X = X.T.reshape(s)
When you are done, M will be a scaler calibrated for 100 features. There is no need for a dictionary here. Remember that a dictionary keyed by a sequence of integers can better be expressed as a list or array, which is what happens here.
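Putting it together, a minimal sketch of the full round trip using the reshape variant (X here is stand-in random data):
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.random.rand(100, 64, 64)
s = X.shape

M = MinMaxScaler()
X_flat = M.fit_transform(X.reshape(s[0], -1).T)  # (64*64, 100): one column per layer
X_scaled = X_flat.T.reshape(s)                   # back to (100, 64, 64)

# later, undo the scaling with the same single scaler
X_restored = M.inverse_transform(X_scaled.reshape(s[0], -1).T).T.reshape(s)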
IIUC, you can manually scale:
import joblib

mm, MM = inputs.min(axis=(1,2)), inputs.max(axis=(1,2))
# save these for later use
joblib.dump((mm,MM), 'minmax.joblib')
def scale(inputs, mm, MM):
    return (inputs - mm[:, None, None]) / (MM - mm)[:, None, None]
# load pre-saved min & max
mm, MM = joblib.load('minmax.joblib')
# scaled inputs
scale(inputs, mm, MM)
I'm importing grayscale images that are RGBA (4-channels) formatted using scikit-image.
import matplotlib.pyplot as plt
from skimage import io
example = io.imread("example.png", as_gray=True)
print(example.shape)
print(example)
plt.imshow(example)
I was expecting to get an array with values in the range 0-255. However, I found in the docs that the above method returns an array of 64-bit floating-point values.
Does this mean the values are already normalized (X / 255)? Or do I need to be aware of something else? Thanks in advance.
Min-Max Feature Scaling aka Min-Max Normalization / Unity-based Normalization is a technique that brings all values in a set into the range [0, 1] (or an arbitrary range [a, b]).
The mathematical definition of min-max normalization is:
X' = a + (X - X_min) * (b - a) / (X_max - X_min)
where X_min and X_max are the minimum and maximum of the data and [a, b] is the target range; for [0, 1] this reduces to X' = (X - X_min) / (X_max - X_min).
Notice that calling np.max(example) will result in a value less than or equal to 1.0.
Notice that calling np.min(example) will return a value greater than or equal to 0.0.
Yes, the values have been normalized: the loader applies the equation above with X_min = 0, X_max = 255, a = 0 and b = 1, which reduces to X' = X / 255.
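If you do need 0-255 values again, you can scale back up yourself or use skimage's converter; a small sketch:
import numpy as np
from skimage import io, img_as_ubyte
example = io.imread("example.png", as_gray=True)  # float64 in [0, 1]
as_bytes = img_as_ubyte(example)                  # uint8 in [0, 255]
# roughly equivalent: (example * 255).round().astype(np.uint8)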
I have an image I've read from file with shape (m,n,3) (i.e. it has 3 channels). I also have a matrix to convert the color space with dimensions (3,3). I've already arrived at a few different ways of applying this matrix to each vector in the image; for example,
np.einsum('ij,...j',transform,image)
appears to make for the same results as the following (far slower) implementation.
def convert(im: np.ndarray, transform: np.ndarray) -> np.ndarray:
    """ Convert an image array to another colorspace """
    dimensions = len(im.shape)
    axes = im.shape[:dimensions - 1]
    # Create a new array (respecting mutability)
    new_ = np.empty(im.shape)
    for coordinate in np.ndindex(axes):
        pixel = im[coordinate]
        pixel_prime = transform @ pixel
        new_[coordinate] = pixel_prime
    return new_
However, I found that the following is even more efficient while testing on the example image with line_profiler.
np.moveaxis(np.tensordot(transform, image, axes=((-1),(-1))), 0, 2)
The problem I'm having here is using just a np.tensordot, i.e. removing the need for np.moveaxis. I've spent a few hours attempting to find a solution (I'm guessing it resides in choosing the correct axes), so I thought I'd ask others for help.
You can do it concisely with tensordot if you make image the first argument:
np.tensordot(image, transform, axes=(-1, 1))
You can get better performance from einsum by using the argument optimize=True (requires numpy 1.12 or later):
np.einsum('ij,...j', transform, image, optimize=True)
Or (as Paul Panzer pointed out in a comment), you can simply use matrix multiplication:
image @ transform.T
They all take about the same time on my computer.
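A quick sanity check that the three approaches agree, on random stand-in data:
import numpy as np

image = np.random.rand(480, 640, 3)  # hypothetical (m, n, 3) image
transform = np.random.rand(3, 3)

a = np.einsum('ij,...j', transform, image, optimize=True)
b = np.tensordot(image, transform, axes=(-1, 1))
c = image @ transform.T
print(np.allclose(a, b) and np.allclose(b, c))  # True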
In Numpy 1.4.1, what is the simplest or most efficient way of calculating the histogram of a masked array? numpy.histogram and pyplot.hist do count the masked elements, by default!
The only simple solution I can think of right now involves creating a new array with the non-masked values:
histogram(m_arr[~m_arr.mask])
This is not very efficient, though, as this unnecessarily creates a new array. I'd be happy to read about better ideas!
I'm not sure whether or not the numpy developers would consider this a bug or expected behavior. I asked on the mailing list, so I guess we'll see what they say.
Either way, it's an easy fix. Patching numpy/lib/function_base.py to use numpy.asanyarray rather than numpy.asarray on the inputs to the function will allow it to properly use masked arrays (or any other subclass of an ndarray) without creating a copy.
Edit: It seems like it is expected behavior. As discussed here:
"If you want to ignore masked data it's just one extra function call: histogram(m_arr.compressed()). I don't think the fact that this makes an extra copy will be relevant, because I guess full masked array handling inside histogram will be a lot more expensive. Using asanyarray would also allow matrices in and other subtypes that might not be handled correctly by the histogram calculations. For anything else besides dropping masked observations, it would be necessary to figure out what the masked array definition of a histogram is, as Bruce pointed out."
Try hist(m_arr.compressed()).
This is a super old question, but these days I just use:
numpy.histogram(m_arr, bins=.., range=.., density=False, weights=m_arr_mask)
Where m_arr_mask is an array with the same shape as m_arr, consisting of 0 values for elements of m_arr to be excluded from the histogram and 1 values for elements that are to be included.
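A small sketch comparing the weights trick with the compressed() approach, on a toy masked array:
import numpy as np
import numpy.ma as ma

m_arr = ma.masked_greater(np.random.rand(1000), 0.5)
m_arr_mask = (~ma.getmaskarray(m_arr)).astype(int)  # 1 = include, 0 = exclude

h1, edges = np.histogram(m_arr.data, bins=10, range=(0, 1), weights=m_arr_mask)
h2, _ = np.histogram(m_arr.compressed(), bins=10, range=(0, 1))
print(np.array_equal(h1, h2))  # True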
After running into casting issues by trying Erik's solution (see https://github.com/numpy/numpy/issues/16616), I decided to write a numba function to achieve this behavior.
Some of the code was inspired by https://numba.pydata.org/numba-examples/examples/density_estimation/histogram/results.html. I added the mask bit.
import numpy
import numba

@numba.jit(nopython=True)
def compute_bin(x, bin_edges):
    # assuming uniform bins for now
    n = bin_edges.shape[0] - 1
    a_min = bin_edges[0]
    a_max = bin_edges[-1]
    # special case to mirror NumPy behavior for last bin
    if x == a_max:
        return n - 1  # a_max always in last bin
    bin = int(n * (x - a_min) / (a_max - a_min))
    if bin < 0 or bin >= n:
        return None
    else:
        return bin

@numba.jit(nopython=True)
def masked_histogram(img, bin_edges, mask):
    hist = numpy.zeros(len(bin_edges) - 1, dtype=numpy.intp)
    for i, value in enumerate(img.flat):
        if mask.flat[i]:
            bin = compute_bin(value, bin_edges)
            if bin is not None:
                hist[int(bin)] += 1
    return hist  # , bin_edges
The speedup is significant on a (1000, 1000) image.
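A usage sketch with hypothetical data, checked against the compressed-array baseline:
import numpy

img = numpy.random.rand(1000, 1000)
mask = img > 0.3                           # keep roughly 70% of the pixels
bin_edges = numpy.linspace(0.0, 1.0, 11)   # 10 uniform bins

hist = masked_histogram(img, bin_edges, mask)
reference, _ = numpy.histogram(img[mask], bins=bin_edges)
print(numpy.array_equal(hist, reference))  # expected: True for uniform bins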
I have a label matrix with dimension (100*100), stored as a numpy array, and I would like to display the matrix with pyglet.
My original idea is to use this matrix to form a new pyglet image using the function pyglet.image.ImageData(). It requires a buffer of the image data as input, but I have no idea how to get a correctly formatted buffer from the numpy array.
Anyone have any idea?
ps. my current solution:
label_3d = numpy.empty([100,100,3])
label_3d[:,:,0] = label * 255 # value range of label is [0,1]
label_3d[:,:,1] = label * 255
label_3d[:,:,2] = label * 255
image_data = ctypes.string_at(id(label_3d.tostring())+20, 100*100*3)
image = pyglet.image.ImageData(100, 100, 'RGB', image_data, -100*3)
Any better way to construct a [100,100,3] array from three [100,100] arrays with numpy?
I think what you are looking for is np.dstack (or more generally, np.concatenate):
label255=label*255
label3=numpy.dstack((label255,label255,label255))
This shows dstack produces the same array (label3) as your construction for label_3d:
import numpy as np
label=np.random.random((100,100))
label255=label*255
label3=np.dstack((label255,label255,label255))
label_3d = np.empty([100,100,3])
label_3d[:,:,0] = label * 255 # value range of label is [0,1]
label_3d[:,:,1] = label * 255
label_3d[:,:,2] = label * 255
print(np.all(label3==label_3d))
# True
PS. I'm not sure, but have you tried using label3.data instead of ctypes.string_at(id(label3.tostring())+20, 100*100*3) ?
You can get the memory representation of your array with label_3d.tostring().
The tostring() method allows you to change the memory ordering of the elements:
Parameters
----------
order : {'C', 'F', None}, optional
    Order of the data for multidimensional arrays:
    C, Fortran, or the same as for the original array.
PS: The label_3d.data of ~unutbu requires less memory, since no string is constructed. However, it does not allow you to change the order in which the elements are output.
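Putting the thread together, a sketch that feeds the dstack result straight into pyglet (tobytes() is the modern name for tostring()):
import numpy as np
import pyglet

label = np.random.random((100, 100))
label_3d = (np.dstack((label, label, label)) * 255).astype(np.uint8)
# negative pitch tells pyglet the rows run top-to-bottom, as in the question
image = pyglet.image.ImageData(100, 100, 'RGB', label_3d.tobytes(), -100 * 3)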