I have a bunch of 3D stacks of tomographic data, and in each of them I have deduced a certain 3D coordinate around which I need to cut out a spherical region.
My code produces the following image, which gives an overview of what I do.
I calculate the orange and green points, based on the dashed white and dashed green region.
Around the midpoint of these, I'd like to cut out a spherical region, the representation of it is now marked with a circle in the image (also drawn by my code).
Constructing a sphere with skimage.morphology.ball and multiplying this with the original image is easy to do, but how can I set the center of the sphere at the desired 3D location in the image?
The ~30 3D stacks of images are all of different sizes, with different regions, but I have all the necessary coordinates ready for further use.
You have some radius r and an index (i, j, k) into the data.
kernel = skimage.morphology.ball(r) returns a cube-shaped mask/kernel whose side length is a = 2*r + 1.
Take a cube-shaped slice, the size of your kernel, out of the tomograph. The starting indices depend on where you need the center to be and on the kernel's radius.
piece = data[i-r:i-r+a, j-r:j-r+a, k-r:k-r+a]
Apply the binary "ball" mask/kernel to the slice.
piece *= kernel
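Note that piece is a view into data, so piece *= kernel modifies the original array in place; multiply into a new array if you want to keep data intact. Below is a minimal sketch of both variants, assuming data is your 3D numpy array and the centre (i, j, k) lies at least r voxels from every edge (the radius and coordinate values are placeholders):
import numpy as np
import skimage.morphology

r = 10
i, j, k = 40, 50, 60                      # placeholder centre coordinates
kernel = skimage.morphology.ball(r)       # cube of side a = 2*r + 1
a = 2 * r + 1

# masked copy of just the cube around the centre (data stays untouched)
piece = data[i-r:i-r+a, j-r:j-r+a, k-r:k-r+a] * kernel

# or zero out everything outside the sphere in a full-size copy
mask = np.zeros(data.shape, dtype=bool)
mask[i-r:i-r+a, j-r:j-r+a, k-r:k-r+a] = kernel.astype(bool)
cut_out = np.where(mask, data, 0)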
Here are two approaches I use for 'operating' on (calculating values within) a sub-region of an array. Say you wanted to calculate the mean of only the values in your spherical region:
1 - Specify the coordinates of the region directly as a 'slice':
data[region_coordinates].mean()
2 - Use a masked version of your array, where the mask specifies the region:
data_masked.mean()
Which is better depends on what you want to do with the values in the region. Both can be used interchangeably; just choose whichever makes your code clearer/easier/faster.
In my work, I use both approaches, but more commonly the first approach (where you specify a region as a 'slice' of coordinates).
For me, the coordinate slice approach has advantages:
1 - It's more explicit about what is going on
2 - You can more easily apply geometric operations to your 'region' if you need to. (e.g. rotate, translate, scale, ...)
Here is example code, and methods you can use for either approach:
import numpy as np
import skimage.morphology as mo
from typing import Tuple
def get_ball_coords(radius: int, center: Tuple[int]) -> Tuple[np.ndarray]:
    """
    Use radius and center to return the coordinates within that 3d region
    as a 'slice'.
    """
    coords = np.nonzero(mo.ball(radius))
    # 'coords' is a tuple of 1d arrays - to move center using pure numpy,
    # first convert to a 2d array
    coords_array = np.array(coords)
    center_array = np.array([center]).T
    # transform coordinates to be centered at 'center'
    coords_array = coords_array - radius + center_array
    # convert coordinates back to tuple of 1d arrays, which can be used
    # directly as a slice specification
    coords_tuple = (
        coords_array[0, :],
        coords_array[1, :],
        coords_array[2, :]
    )
    return coords_tuple
def get_masked_array(data: np.ndarray, radius: int, center: Tuple[int]) -> np.ndarray:
    """
    Return a masked version of the data array, where all values are masked
    except for the values within the sphere specified by radius and center.
    """
    # get 'ball' as 3d array of booleans
    ball = np.array(mo.ball(radius), dtype=bool)
    # create full mask over entire data array
    mask = np.full_like(data, True, dtype=bool)
    # un-mask the 'ball' region, translated to the 'center'
    mask[
        center[0]-radius: center[0]+radius+1,
        center[1]-radius: center[1]+radius+1,
        center[2]-radius: center[2]+radius+1
    ] = ~ball
    # mask is now True everywhere, except inside the 'ball'
    # at 'center' - create masked array version of data using this.
    masked_data = np.ma.array(data=data, mask=mask)
    return masked_data
# make some 3D data
data_size = (100,100,100)
data = np.random.rand(*data_size)
# define some spherical region by radius and center
region_radius = 2
region_center = (23, 41, 53)
# get coordinates of this region
region_coords = get_ball_coords(region_radius, region_center)
# get masked version of the data, based on this region
data_masked = get_masked_array(data, region_radius, region_center)
# now you can use 'region_coords' as a single 'index' (slice)
# to specify only the points with those coordinates
print('\nUSING COORDINATES:')
print('here is mean value in the region:')
print(data[region_coords].mean())
print('here is the total data mean:')
print(data.mean())
# or you can use the masked array as-is:
print('\nUSING MASKED DATA:')
print('here is mean value in the region:')
print(data_masked.mean())
print('here is the total data mean:')
print(data.mean())
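As a quick follow-up on usage: the coordinate slice also works on the left-hand side of an assignment, so you can modify or extract the spherical region directly. A small sketch using the names defined above:
# zero out everything inside the sphere (the slice works for assignment too)
data[region_coords] = 0.0

# or keep only the sphere and zero everything else, using the mask
only_sphere = np.where(~data_masked.mask, data, 0.0)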
I am using a neural network to do semantic segmentation (human parsing): it takes a photo of people as input and, for every pixel, tells whether it is most likely head, leg, background, or some other part of the human body. The algorithm runs smoothly and gives a numpy.ndarray as output. The shape of the array is (1, 23, 600, 400), where 600*400 is the resolution of the input image and 23 is the number of categories. The 3d matrix looks like 23 stacked 2d matrices, where each layer is a matrix of floats giving the probability that each pixel belongs to that category.
To visualize the matrix like the following figure, I used numpy.argmax to squash the 3d matrix into a 2d matrix that holds the index of the most likely category. But I don't have any idea how to proceed to get the visualization I want.
EDIT
Actually, I can do it in a trivial way: use a for loop to traverse every pixel and assign a color to it to get an image. However, this is not vectorized code, and numpy has built-in ways to speed up matrix manipulation. I need to save CPU cycles for real-time segmentation.
It's fairly easy. All you need is a lookup table mapping the 23 labels to unique colors. The easiest way is a 23-by-3 numpy array with each row storing the RGB values for the corresponding label:
import numpy as np
import matplotlib.pyplot as plt
lut = np.random.rand(23, 3) # using random mapping - but you can do better
lb = np.argmax(prediction, axis=1) # converting probabilities to discrete labels
rgb = lut[lb[0, ...], :] # this is all it takes to do the mapping.
plt.imshow(rgb)
plt.show()
Alternatively, if you are only interested in the colormap for display purposes, you can use the cmap argument of plt.imshow, but this requires you to transform lut into a "colormap":
from matplotlib.colors import LinearSegmentedColormap
cmap = LinearSegmentedColormap.from_list('new_map', lut, N=23)
plt.imshow(lb[0, ...], cmap=cmap)
plt.show()
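If you also want to write the colour-mapped result to disk without going through matplotlib (e.g. for the real-time use case), here is a minimal sketch using Pillow and the lut/lb arrays from above; the output filename is a placeholder:
from PIL import Image

# map labels to colours and convert to 8-bit RGB in one vectorized step
rgb_u8 = (lut[lb[0]] * 255).astype(np.uint8)   # shape (600, 400, 3)
Image.fromarray(rgb_u8).save('segmentation.png')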
I am trying to zero-center and whiten CIFAR10 dataset, but the result I get looks like random noise!
The CIFAR10 dataset contains 60,000 color images of size 32x32; the training set contains 50,000 images and the test set 10,000.
The following snippets of code show the process I used to whiten the dataset:
# zero-center
mean = np.mean(data_train, axis=(0, 2, 3))
for i in range(data_train.shape[0]):
    for j in range(data_train.shape[1]):
        data_train[i, j, :, :] -= mean[j]
first_dim = data_train.shape[0] #50,000
second_dim = data_train.shape[1] * data_train.shape[2] * data_train.shape[3] # 3*32*32
shape = (first_dim, second_dim) # (50000, 3072)
# compute the covariance matrix
cov = np.dot(data_train.reshape(shape).T, data_train.reshape(shape)) / data_train.shape[0]
# compute the SVD factorization of the data covariance matrix
U,S,V = np.linalg.svd(cov)
print 'cov.shape = ',cov.shape
print U.shape, S.shape, V.shape
Xrot = np.dot(data_train.reshape(shape), U) # decorrelate the data
Xwhite = Xrot / np.sqrt(S + 1e-5)
print Xwhite.shape
data_whitened = Xwhite.reshape(-1,32,32,3)
print data_whitened.shape
outputs:
cov.shape = (3072L, 3072L)
(3072L, 3072L) (3072L,) (3072L, 3072L)
(50000L, 3072L)
(50000L, 32L, 32L, 3L)
(32L, 32L, 3L)
and trying to show the resulting image:
import matplotlib.pyplot as plt
%matplotlib inline
from scipy.misc import imshow
print data_whitened[0].shape
fig = plt.figure()
plt.subplot(221)
plt.imshow(data_whitened[0])
plt.subplot(222)
plt.imshow(data_whitened[100])
plt.show()
By the way, data_train[0].shape is (3,32,32),
but if I reshape the whitened image according to that I get
TypeError: Invalid dimensions for image data
Could this be a visualization issue only? If so, how can I make sure that's the case?
Update:
Thanks to @AndrasDeak, I fixed the visualization code this way, but the output still looks random:
data_whitened = Xwhite.reshape(-1,3,32,32).transpose(0,2,3,1)
print data_whitened.shape
fig = plt.figure()
plt.subplot(221)
plt.imshow(data_whitened[0])
Update 2:
This is what I get when I run some of the commands given below:
As can be seen below, toimage can show the image just fine, but trying to reshape it messes up the image.
# output is of shape (N, 3, 32, 32)
X = X.reshape((-1,3,32,32))
# output is of shape (N, 32, 32, 3)
X = X.transpose(0,2,3,1)
# put data back into a design matrix (N, 3072)
X = X.reshape(-1, 3072)
plt.imshow(X[6].reshape(32,32,3))
plt.show()
For some weird reason, this was what I got at first, but then after several tries it changed to the previous image.
Let's walk through this. As you point out, CIFAR contains images which are stored in a matrix; each image is a row, and each row has 3072 columns of uint8 numbers (0-255). Images are 32x32 pixels and pixels are RGB (three channel colour).
# https://www.cs.toronto.edu/~kriz/cifar.html
# wget https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
# tar xf cifar-10-python.tar.gz
import numpy as np
import cPickle
with open('cifar-10-batches-py/data_batch_1') as input_file:
    X = cPickle.load(input_file)
X = X['data'] # shape is (N, 3072)
It turns out that the columns are ordered a bit funny: all the red pixel values come first, then all the green pixels, then all the blue pixels. This makes it tricky to have a look at the images. This:
import matplotlib.pyplot as plt
plt.imshow(X[6].reshape(32,32,3))
plt.show()
gives this:
So, just for ease of viewing, let's shuffle the dimensions of our matrix around with reshape and transpose:
# output is of shape (N, 3, 32, 32)
X = X.reshape((-1,3,32,32))
# output is of shape (N, 32, 32, 3)
X = X.transpose(0,2,3,1)
# put data back into a design matrix (N, 3072)
X = X.reshape(-1, 3072)
Now:
plt.imshow(X[6].reshape(32,32,3))
plt.show()
gives:
OK, on to ZCA whitening. We're frequently reminded that it's super important to zero-center the data before whitening it. At this point, an observation about the code you include. From what I can tell, computer vision views color channels as just another feature dimension; there's nothing special about the separate RGB values in an image, just like there's nothing special about the separate pixel values. They're all just numeric features. So, whereas you're computing the average pixel value, respecting colour channels (i.e., your mean is a tuple of r,g,b values), we'll just compute the average image value. Note that X is a big matrix with N rows and 3072 columns. We'll treat every column as being "the same kind of thing" as every other column.
# zero-centre the data (this calculates the mean separately across
# pixels and colour channels)
X = X - X.mean(axis=0)
At this point, let's also do Global Contrast Normalization, which is quite often applied to image data. I'll use the L2 norm, which makes every image have vector magnitude 1:
X = X / np.sqrt((X ** 2).sum(axis=1))[:,None]
One could easily use something else, like the standard deviation (X = X / np.std(X, axis=0)) or min-max scaling to some interval like [-1,1].
Nearly there. At this point, we haven't greatly modified our data, since we've just shifted and scaled it (a linear transform). To display it, we need to get image data back into the range [0,1], so let's use a helper function:
def show(i):
    i = i.reshape((32,32,3))
    m,M = i.min(), i.max()
    plt.imshow((i - m) / (M - m))
    plt.show()
show(X[6])
The peacock looks slightly brighter here, but that's just because we've stretched its pixel values to fill the interval [0,1]:
ZCA whitening:
# compute the covariance of the image data
cov = np.cov(X, rowvar=True) # cov is (N, N)
# singular value decomposition
U,S,V = np.linalg.svd(cov) # U is (N, N), S is (N,)
# build the ZCA matrix
epsilon = 1e-5
zca_matrix = np.dot(U, np.dot(np.diag(1.0/np.sqrt(S + epsilon)), U.T))
# transform the image data; zca_matrix is (N, N)
zca = np.dot(zca_matrix, X) # zca is (N, 3072)
Taking a look (show(zca[6])):
Now the peacock definitely looks different. You can see that the ZCA has rotated the image through colour space, so it looks like a picture on an old TV with the Tone setting out of whack. Still recognisable, though.
Presumably because of the epsilon value I used, the covariance of my transformed data isn't exactly identity, but it's fairly close:
>>> (np.cov(zca, rowvar=True).argmax(axis=1) == np.arange(zca.shape[0])).all()
True
Update 29 January
I'm not entirely sure how to sort out the issues you're having; your trouble seems to lie in the shape of your raw data at the moment, so I would advise you to sort that out first before you try to move on to zero-centring and ZCA.
On the one hand, the first of the four plots in your update looks good, suggesting that you've loaded up the CIFAR data in the correct way. The second plot is produced by toimage, I think, which will automagically figure out which dimension has the colour data, which is a nice trick. On the other hand, the stuff that comes after that looks weird, so it seems something is going wrong somewhere. I confess I can't quite follow the state of your script, because I suspect you're working interactively (notebook), retrying things when they don't work (more on this in a second), and that you're using code that you haven't shown in your question. In particular, I'm not sure how you're loading the CIFAR data; your screenshot shows output from some print statements (Reading training data..., etc.), and then when you copy train_data into X and print the shape of X, the shape has already been reshaped into (N, 3, 32, 32). Like I say, update plot 1 would tend to suggest that the reshape has happened correctly. From plots 3 and 4, I think you're getting mixed up about matrix dimensions somewhere, so I'm not sure how you're doing the reshape and transpose.
Note that it's important to be careful with the reshape and transpose, for the following reason. The X = X.reshape(...) and X = X.transpose(...) code overwrites X with its reshaped/transposed version. If you run this multiple times (say by accident in a jupyter notebook), you will shuffle the axes of your matrix over and over, and plotting the data will start to look really weird. This image shows the progression as we iterate the reshape and transpose operations:
This progression does not cycle back, or at least, it doesn't cycle quickly. Because of periodic regularities in the data (like the 32-pixel row structure of the images), you tend to get banding in these improperly reshape-transposed images. I'm wondering if that's what's going on in the third of your four plots in your update, which looks a lot less random than the images in the original version of your question.
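One way to avoid this trap is never to rebind X at all: keep the raw (N, 3072) rows as loaded from the pickle under their own name (X_raw below is a placeholder for that) and derive the display ordering through a pure function, so re-running a notebook cell can't scramble the axes. A sketch, not part of the original code:
def to_displayable(raw_rows):
    # raw CIFAR rows are ordered R-plane, G-plane, B-plane; reshape to
    # (N, 3, 32, 32) and move the colour channel last for plotting
    return raw_rows.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)

plt.imshow(to_displayable(X_raw)[6])
plt.show()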
The fourth plot of your update is a colour negative of the peacock. I'm not sure how you're getting that, but I can reproduce your output with:
plt.imshow(255 - X[6].reshape(32,32,3))
plt.show()
which gives:
One way you could get this is if you were using my show helper function, and you mixed up m and M, like this:
def show(i):
    i = i.reshape((32,32,3))
    m,M = i.min(), i.max()
    plt.imshow((i - M) / (m - M)) # this will produce a negative img
    plt.show()
I had the same issue: the resulting projected values are off. A float image is supposed to hold values in [0, 1.0] for each channel, so rescale it for display:
def toimage(data):
    min_ = np.min(data)
    max_ = np.max(data)
    return (data - min_) / (max_ - min_)
NOTICE: use this function only for visualization!
However, notice how the "decorrelation" or "whitening" matrix is computed by @wildwilhelm:
zca_matrix = np.dot(U, np.dot(np.diag(1.0/np.sqrt(S + epsilon)), U.T))
This is because U, the matrix of eigenvectors of the correlation matrix, comes from the decomposition SVD(X) = U,S,V, where U is the eigenbasis of X*X rather than of X itself (https://en.wikipedia.org/wiki/Singular-value_decomposition).
As a final note, I would rather treat the pixels as the statistical units and the RGB channels as their modalities, instead of treating the images as statistical units and the pixels as modalities.
I've tried this on the CIFAR 10 database and it works quite nicely.
IMAGE EXAMPLE: Top image has RGB values "whitened", bottom is the original
IMAGE EXAMPLE2: NO ZCA transform performances in train and loss
IMAGE EXAMPLE3: ZCA transform performances in train and loss
If you want to linearly scale the image to have zero mean and unit norm, you can do the same image whitening as TensorFlow's tf.image.per_image_standardization. According to the documentation, you need to use the following formula to normalize each image independently:
(image - image_mean) / max(image_stddev, 1.0/sqrt(image_num_elements))
Keep in mind that the mean and the standard deviation should be computed over all values in the image. This means that we don't need to specify the axis/axes along which they are computed.
One way to implement this without TensorFlow is to use numpy, as follows:
import math
import numpy as np
from PIL import Image
# open image
image = Image.open("your_image.jpg")
image = np.array(image)
# standardize image
mean = image.mean()
stddev = image.std()
adjusted_stddev = max(stddev, 1.0/math.sqrt(image.size))
standardized_image = (image - mean) / adjusted_stddev
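For what it's worth, the same standardization can be applied to a whole batch at once with broadcasting; a sketch, assuming batch is a float array of shape (N, 32, 32, 3):
# per-image mean/std over all pixels and channels, kept as (N, 1, 1, 1)
# so they broadcast against the batch
means = batch.mean(axis=(1, 2, 3), keepdims=True)
stds = batch.std(axis=(1, 2, 3), keepdims=True)
adjusted = np.maximum(stds, 1.0 / np.sqrt(batch[0].size))
standardized_batch = (batch - means) / adjusted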
I'm trying to use KMeans centroids to label/clump pixels for a land cover analysis. I'm hoping to do this only using sklearn and matplotlib. At the moment my code looks like this:
kmeans.fit(band_5)
centroids = kmeans.cluster_centers_
plt.scatter(centroids[:, 0], centroids[:, 1])
The shape of band_5 is (713, 1163), yet from the scatter plot I can tell that the centroid coordinates have values well in excess of that shape.
From my understanding, the centroids that KMeans provides need to be converted into the correct coordinates and then a shapefile, which would then be used in a supervised process to label/clump pixels.
How do I convert those centroids to the correct coordinates and then export to a shapefile? Also, do I need to create a shapefile?
I tried to adapt some of the code from this post, but I could not get it to work: http://scikit-learn.org/stable/auto_examples/cluster/plot_color_quantization.html#sphx-glr-auto-examples-cluster-plot-color-quantization-py
A couple of points:
scikit-learn expects data in columns (think of a table in a spreadsheet), so simply passing in an array representing a raster band will actually try to classify the data as if you had 1163 sample points and 713 values (bands) for each sample. Instead you'll need to flatten the array; what kmeans returns will then be equivalent to a quantile classification of your raster if you look at it in something like ArcGIS, with centroids in the range of the band's minimum value to its maximum value (not in cell coordinates). See the sketch after the example below.
Looking at the example you provide, they have a three-band jpeg, which they reshape into three long columns:
image_array = np.reshape(china, (w * h, d))
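To make the first point concrete, here is a minimal sketch of clustering a single band after flattening it (band_5 as in your question; the number of clusters is an assumption):
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans

# flatten the (713, 1163) band into one sample per cell with a single feature
flat = band_5.reshape(-1, 1)
labels = KMeans(n_clusters=5).fit_predict(flat)   # 5 clusters is a placeholder

# reshape the labels back to the raster's shape to view the result
plt.imshow(labels.reshape(band_5.shape))
plt.show()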
If you need spatially constrained pixels, then you have two choices: choose a connectivity-constrained clustering method such as Agglomerative Clustering or Affinity Propagation, or add the normalised cell coordinates to your sample set, e.g.:
xs, ys = np.meshgrid(
    np.linspace(0, 1, 1163),  # x
    np.linspace(0, 1, 713),   # y
)
data_with_coordinates = np.column_stack([
    band_5.flatten(),
    xs.flatten(),
    ys.flatten()
])
# And on with the clustering
Once you've done the clustering with scikit-learn, assuming you use fit_predict you'll get a label back for each value by cluster, and you can reshape back to the original shape of the band to plot the clustered results.
labels = classifier.fit_predict(data_with_coordinates)
plt.imshow(labels.reshape(band_5.shape))
Do you actually need the cluster centroids, given that you have labelled points? And do you need them in real-world spatial coordinates? If yes, then you need to look at rasterio and its affine methods to transform from map coordinates to array coordinates and vice versa, and then look into fiona to write the points to a shapefile.
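If you do end up needing real-world coordinates and a shapefile, here is a rough sketch of that last step with rasterio and fiona; the raster path, pixel positions, CRS handling, and attribute schema are all assumptions to adapt to your data:
import fiona
import rasterio
from rasterio.transform import xy

# read the georeferencing (affine transform and CRS) from the source raster
with rasterio.open('band_5.tif') as src:          # placeholder path
    transform = src.transform
    crs = src.crs

# convert array (row, col) positions to map coordinates
rows, cols = [100, 200], [150, 250]               # placeholder pixel positions
xs, ys = xy(transform, rows, cols)

# write the points out as a shapefile
schema = {'geometry': 'Point', 'properties': {'label': 'int'}}
with fiona.open('points.shp', 'w', driver='ESRI Shapefile',
                crs=crs.to_dict(), schema=schema) as shp:
    for label, (x, y) in enumerate(zip(xs, ys)):
        shp.write({'geometry': {'type': 'Point', 'coordinates': (x, y)},
                   'properties': {'label': label}})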
I'm using Mayavi to render some imaging data that consists of multiple 2D planes within a 3D volume, the position, orientation, and scale of which are defined by 4x4 rigid body affine transformation matrices. Each plane consists of:
An array of 2D image data, which I display using mayavi.mlab.imshow
A set of ROIs consisting of points and lines, which I draw using mayavi.mlab.points3d and mayavi.mlab.plot3d respectively.
I transform my points and line vertices from a 2D reference plane into the 3D space by dotting their coordinates with my affine matrix. Based on my previous question/answer here, I figured out that I could set the positions and orientations of the ImageActor objects individually, using:
obj = mlab.imshow(img)
obj.actor.orientation = [pitch, roll, yaw] # the required orientation (deg)
obj.actor.position = [dx, dy, dz] # the required position
obj.actor.scale = [sx, sy, sz] # the required scale
Now the plot looks like this:
Everything lines up nicely, but it's very difficult to interpret because the planes are so densely spaced in z. What I'd now like to be able to do is 'stretch out' the z-axis by some scaling factor. In the case of the points and lines, this is very easy to do - all I do is multiply all of the transformed z-coordinates by a scaling factor.
However, I can't figure out how to apply the same transformation to the images. If I just scale the z-position, the rotation and scaling of the images will of course be wrong, and my plotted points/lines will no longer fall on the same plane as the image:
What I need to do is apply a non-rigid affine transformation that incorporates shear as well as rotation, translation, and scaling to my images.
Is there any way I can manually apply shear to an ImageActor, or even better just directly apply an arbitrary 4x4 affine matrix that I've precomputed?
ImageActor, which ultimately is a wrapper for tvtk.ImageActor, has a user_matrix property, which lets you assign a 4x4 transformation matrix.
Starting with a random image,
import numpy as np
from mayavi.mlab import imshow
s = np.random.random((10, 10))
image = imshow(s, colormap='gist_earth', interpolate=False)
gives us the following ...
Creating a transformation matrix and setting a term to give it some shear ...
from tvtk.api import tvtk
transform_matrix = tvtk.Matrix4x4()
transform_matrix.set_element(0, 1, 2.5)
image.actor.user_matrix = transform_matrix
gives us ...
set_element has the signature (row, col, value), so you should be able to set elements on that matrix as needed.
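To address the second half of your question (applying an arbitrary precomputed 4x4 affine), one option is to copy the matrix into the Matrix4x4 element by element. A sketch, with placeholder matrix values:
import numpy as np
from tvtk.api import tvtk

M = np.eye(4)          # placeholder: your precomputed 4x4 affine matrix
M[0, 1] = 2.5          # e.g. the shear term from above
M[2, 2] = 5.0          # e.g. an extra z-scaling factor

transform_matrix = tvtk.Matrix4x4()
for row in range(4):
    for col in range(4):
        transform_matrix.set_element(row, col, M[row, col])
image.actor.user_matrix = transform_matrix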