Python image analysis: reading a multidimensional TIFF file from confocal microscopy

I have a TIFF image file from a confocal microscope which I can open in ImageJ, but which I would like to get into Python.
The format of the TIFF is as follows:
There are 30 stacks in the Z dimension. Each Z layer has three channels from different fluorescent markers. Each channel has a depth of 8 bits. The image dimensions are 1024x1024.
I can, in principle, read the file with skimage (which I plan to use to further analyse the data) using the tifffile plugin. However, what I get is not quite what I expect.
merged = io.imread("merge.tif", plugin="tifffile")
merged.shape
# (30, 3, 3, 1024, 1024)
# (zslice, RGB?, channel?, height, width)
merged.dtype
# dtype('uint16')
What confused me initially was the fact that I get two axes of length 3. I think this is because tifffile treats each channel as a separate RGB image, but I can work around this by subsetting or by using skimage.color.rgb2grey on the individual channels. What concerns me more is that the file is imported as a 16-bit image. I can convert it back using skimage.img_as_ubyte, but afterwards the histogram no longer matches the one I see in ImageJ.
I am not fixated on using skimage to import the file, but I would like to get the image into a numpy array eventually to use skimage's functionality on it.
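For reference, a minimal sketch of the subsetting workaround described above (the axis interpretation is an assumption to verify against your data):
import numpy as np
from skimage import io
merged = io.imread("merge.tif", plugin="tifffile")  # (30, 3, 3, 1024, 1024)
# hypothetical: keep fluorescence channel 0 and drop the (assumed redundant) RGB axis
channel0 = merged[:, 0, 0, :, :]  # -> (30, 1024, 1024)
# if the pixel values already fit in 8 bits, a plain cast preserves the histogram,
# whereas img_as_ubyte rescales the full uint16 range
if channel0.max() <= 255:
    channel0 = channel0.astype(np.uint8)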

I've encountered the same issue working with .tif files. I recommend using the python-bioformats package.
import javabridge
import bioformats
javabridge.start_vm(class_path=bioformats.JARS)
path_to_data = '/path/to/data/file_name.tif'
# get XML metadata of complete file
xml_string = bioformats.get_omexml_metadata(path_to_data)
ome = bioformats.OMEXML(xml_string) # be sure everything is ascii
print(ome.image_count)
Depending on the data, one file can hold multiple images. Each image can be accessed as follows:
# read some metadata
iome = ome.image(0) # e.g. first image
print(iome.get_Name())
print(iome.get_ID())
# get pixel metadata
print(iome.Pixels.get_DimensionOrder())
print(iome.Pixels.get_PixelType())
print(iome.Pixels.get_SizeX())
print(iome.Pixels.get_SizeY())
print(iome.Pixels.get_SizeZ())
print(iome.Pixels.get_SizeT())
print(iome.Pixels.get_SizeC())
print(iome.Pixels.DimensionOrder)
Loading the raw data of image 0 into a numpy array is done like this:
import numpy as np
reader = bioformats.ImageReader(path_to_data)
raw_data = []
for z in range(iome.Pixels.get_SizeZ()):
    # reader.read returns a SizeY x SizeX x SizeC array (SizeC = number of channels)
    raw_image = reader.read(z=z, series=0, rescale=False)
    raw_data.append(raw_image)
raw_data = np.array(raw_data)  # SizeZ x SizeY x SizeX x SizeC array
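When you are done reading, shut the Java VM down again before the interpreter exits:
javabridge.kill_vm()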
Hope this helps processing .tif files, Cheers!

I am not sure if the 'hyperstack to stack' function is what you want. Hyperstacks are simply multidimensional images; they can be 4D or 5D (width, height, slices, channels (e.g. 3 for RGB) and time frames). In ImageJ you have a slider for each dimension of a hyperstack.
Stacks are just stacked 2D images that are somehow related, and you have only one slider; in the simplest case it represents the z-slices in a 3D data set.
The 'hyperstack to stack' function stacks all dimensions in your hyperstack. So if you have a hyperstack with 3 channels, 4 slices and 5 time frames (3 sliders) you will get a stack of 3x4x5 = 60 images (one slider). Basically the same thing as you mentioned above with sliding through the focal planes on a per-channel basis. You can go the other way around using 'stack to hyperstack' and make a hyperstack by defining which slices from your stack represent which dimension. In the example file I mentioned above just select order xyzct, 3 channels and 7 time points.
So if your tiff file has 2 sliders, it seems that it is a 4D hyperstack with height, width, 30 slices and 3 channels. 'hyperstack to stack' would stack all dimensions on top of each other, so you will get 3x30 = 90 slices.
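In numpy terms, 'hyperstack to stack' is essentially a reshape that merges the channel and slice axes (a sketch, assuming an array of shape (30, 3, 1024, 1024)):
import numpy as np
hyperstack = np.zeros((30, 3, 1024, 1024), dtype=np.uint8)  # z, channel, height, width
stack = hyperstack.reshape(-1, 1024, 1024)                  # -> (90, 1024, 1024), channel varies fastest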
However, according to the skimage tiff reader, it seems that your tiff file is some kind of 5D hyperstack: width, height (1024x1024), 30 z-slices, 3 channels (RGB) and another dimension with 3 entries (e.g. time frames).
In order to find out what is wrong, I would suggest comparing the two length-3 dimensions of the array you get from skimage. Find out which of them represents the RGB channels and what the other one is. You can, for example, use pyqtgraph's image function:
import pyqtgraph as pg
from skimage import io
merged = io.imread("merge.tif", plugin="tifffile")
# pg.image takes the dimensions in the following order: z-slider, x, y, RGB channel
# if merged.shape == (30, 3, 3, 1024, 1024), you have to compare the 1st and 2nd dimension
pg.image(merged[:,0,:,:,:].transpose(0, 2, 3, 1))
pg.image(merged[:,1,:,:,:].transpose(0, 2, 3, 1))
pg.image(merged[:,2,:,:,:].transpose(0, 2, 3, 1))
pg.image(merged[:,:,0,:,:].transpose(0, 2, 3, 1))
pg.image(merged[:,:,1,:,:].transpose(0, 2, 3, 1))
pg.image(merged[:,:,2,:,:].transpose(0, 2, 3, 1))
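Alternatively, a quick numpy check (an extra step beyond the pyqtgraph inspection) can tell you whether one of the length-3 axes merely holds identical copies, which is what a grey channel duplicated into RGB planes would look like:
import numpy as np
# if all three slices along an axis are equal, that axis is likely the redundant RGB axis
print(all(np.array_equal(merged[:, 0], merged[:, i]) for i in (1, 2)))        # axis 1
print(all(np.array_equal(merged[:, :, 0], merged[:, :, i]) for i in (1, 2)))  # axis 2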

Related

Convert tiled image array into single image array with numpy

I followed the process here: How to Split Image Into Multiple Pieces in Python for splitting an image into MxN sub-images. I have a 5490x5490 array that I split into 100 pieces using the following:
M = im.shape[0]//10
N = im.shape[1]//10
tiles = [im[x:x+M,y:y+N] for x in range(0,im.shape[0],M) for y in range(0,im.shape[1],N)]
The shape of tiles is:
np.array(tiles).shape
(100,549,549)
I cannot figure out how to put them back together as one image; reshape alone does not put them back in the right order.
Found it after taking a break: How to mosaic arrays using numpy?
nd_arr = np.array(tiles)
nd_arr = nd_arr.reshape(10, 10, 549, 549)  # (tile_row, tile_col, y, x)
nd_arr = nd_arr.swapaxes(1, 2)             # (tile_row, y, tile_col, x)
final = nd_arr.reshape(5490, 5490)         # rows of tiles now stitch in the right order
I can verify by checking against the original array (5490 x 5490 = 30140100 pixels all equal):
(final == NDVI).sum()
30140100

Can't save a 4D array into a .txt file

I am applying the sliding window technique to an image and extracting the mean pixel values of each window. The results look something like this: [[[[215.015625][123.55036272][111.66057478]]]]. Now the question is: how can I save all these values, one per window, into a txt or CSV file, since I want to use them later to compare similarities? Whatever I tried, the error is the same: that it is a 4D array and not a 1D or 2D one. I'll appreciate any help, really! Thank you in advance.
import cv2
import matplotlib.pyplot as plt
import numpy as np
# read the image and define the stepSize and window size
# (width,height)
image2 = cv2.imread("bird.jpg")# your image path
image = cv2.resize(image2, (224, 224))
tmp = image # for drawing a rectangle
stepSize = 10
(w_width, w_height) = (60, 60 ) # window size
for x in range(0, image.shape[1] - w_width, stepSize):
    for y in range(0, image.shape[0] - w_height, stepSize):
        window = image[x:x + w_width, y:y + w_height, :]
        # classify content of the window with your classifier and
        # determine if the window includes an object (cell) or not
        # draw window on image
        cv2.rectangle(tmp, (x, y), (x + w_width, y + w_height), (255, 0, 0), 2)  # draw rectangle on image
        plt.imshow(np.array(tmp).astype('uint8'))
# show all windows
plt.show()
mean_values=[]
mean_val, std_dev = cv2.meanStdDev(image)
mean_val = mean_val[:3]
mean_values.append([mean_val])
mean_values = np.asarray(mean_values)
print(mean_values)
Human Readable Option
Assuming that you want the data to be human readable, saving the data takes a little bit more work. My search showed me that there's this solution for saving 3D data to a text file. However, it's pretty simple to extend this example to 4D for your use case. This code is taken and adapted from that post, thank you Joe Kington and David Cheung.
import numpy as np
data = np.arange(2*3*4*5).reshape((2,3,4,5))
with open('test.csv', 'w') as outfile:
    # We write this header for readability; the pound symbol
    # will cause numpy to ignore it
    outfile.write('# Array shape: {0}\n'.format(data.shape))
    # Iterating through an n-dimensional array produces slices along
    # the first axis. This is equivalent to data[i,:,:,:] in this case.
    # Because we are dealing with 4D data instead of 3D data,
    # we need to add another for loop that's nested inside of the
    # previous one.
    for threeD_data_slice in data:
        for twoD_data_slice in threeD_data_slice:
            # The formatting string indicates that I'm writing out
            # the values in left-justified columns 7 characters in width
            # with 2 decimal places.
            np.savetxt(outfile, twoD_data_slice, fmt='%-7.2f')
            # Writing out a break to indicate different slices...
            outfile.write('# New slice\n')
And then once the data has been saved all you need to do is load it and reshape it: np.loadtxt() will default to reading in the data as a 2D array, but np.reshape() will allow us to recover the structure. Again, this code is adapted from the previous post.
new_data = np.loadtxt('test.csv')
# Note that this returned a 2D array!
print(new_data.shape)
# However, going back to 4D is easy if we know the
# original shape of the array
new_data = new_data.reshape((2,3,4,5))
# Just to check that they're the same...
assert np.all(new_data == data)
Binary Option
Assuming that human readability is not necessary, I would recommend using the built-in *.npy format which is described here. This stores the data in a binary format.
You can save the array by doing np.save('NAME_OF_ARRAY.npy', ARRAY_TO_BE_SAVED) and then load it with SAVED_ARRAY = np.load('NAME_OF_ARRAY.npy').
You can also save several numpy arrays in a single zip file with the np.savez() function, like so: np.savez('MANY_ARRAYS.npz', ARRAY_ONE, ARRAY_TWO). You load the zipped arrays in a similar fashion: SEVERAL_ARRAYS = np.load('MANY_ARRAYS.npz').
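Note that loading an .npz archive gives you a dict-like object: positionally saved arrays get the keys 'arr_0', 'arr_1', and so on, and you can instead pass keyword arguments to np.savez to name them yourself. A minimal sketch:
import numpy as np
a = np.arange(6).reshape(2, 3)
b = np.zeros((4, 5))
np.savez('MANY_ARRAYS.npz', first=a, second=b)
archive = np.load('MANY_ARRAYS.npz')
print(archive['first'].shape)  # (2, 3)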

Moving/running window of a Multi-dimensional image array

I am trying to work out an efficient numpy solution to perform a running average of an array of color images across the 4th dimension. A set of color images in a directory is read in a loop, and I would like to average them in subsets of 3, i.e. if there are n = 5 color images in the directory, I would like to average [1,2,3], [2,3,4], [3,4,5], [4,5,1] and [5,1,2], thus writing 5 output average images.
from os import listdir
from os.path import isfile, join
import numpy as np
import cv2
from matplotlib import pyplot as plt
mypath = 'C:/path/to/5_image/dir'
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]
img = np.empty(len(onlyfiles), dtype=object)
temp = np.zeros((960, 1280, 3, 3), dtype='uint8')
temp_avg = np.zeros((960, 1280, 3), dtype='uint8')
for n in range(0, len(onlyfiles)):
    img[n] = cv2.imread(join(mypath, onlyfiles[n]))
for n in range(0, len(img)):
    if (n+2) < len(img)-1:
        temp[:, :, :, 0] = img[n]
        temp[:, :, :, 1] = img[n + 1]
        temp[:, :, :, 2] = img[n + 2]
        temp_avg = np.mean(temp, axis=3)
        plt.imshow(temp_avg)
        plt.show()
    else:
        break
This script is in no way complete or elegant. The issues I am having are that while plotting the average images the color space seems distorted and appears like CMYK, and that I am not accounting for the last two moving windows, [4,5,1] and [5,1,2]. Critique and suggestions welcome.
For performing local operations (such as a running average) across the pixels of an image (or across multiple images), convolution with a kernel is usually a good approach.
Here's how this could be done in your case.
Generating Some Example Data
I used the following to generate 10 images containing random noise to work with:
for i in range(10):
    an_img = np.random.randint(0, 256, (960, 1280, 3))
    cv2.imwrite("img_" + str(i) + ".png", an_img)
Preparing the Images
This is how I load the images back in:
import os
# Get target file names
mypath = os.getcwd()  # or whatever path you like
fnames = [f for f in listdir(mypath) if f.endswith('.png')]
# Create an array to hold all the images
first_img = cv2.imread(join(mypath, fnames[0]))
y, x, c = first_img.shape
all_imgs = np.empty((len(fnames), y, x, c), dtype=np.uint8)
# Load all the images
for i, fname in enumerate(fnames):
    all_imgs[i, ...] = cv2.imread(join(mypath, fname))
Some notes:
I use f.endswith('.png') to be a bit more specific with how I generate the list of filenames, allowing other files to be in the same directory without causing problems.
I place all of the images in a single 4D uint8 array of shape (image,y,x,c) instead of the object array you were using. This is necessary to employ the convolution approach below.
I use the first image to get the dimensions of the images, which makes the code just a little bit more general.
Performing Local Averaging by Kernel Convolution
This is all it takes.
from scipy.ndimage import uniform_filter
done = uniform_filter(all_imgs, size=(3, 1, 1, 1), origin=-1, mode='wrap')
Some notes:
I am using scipy.ndimage because it readily allows for its convolution filters to be applied to images with many dimensions (4 in your case). For cv2, I am only aware of cv2.filter2D, which does not have that functionality as far as I know. However, I am not very familiar with cv2, so I may be wrong about this (will edit if someone corrects me in a comment).
The size kwarg specifies the size of the kernel to use along each dimension of the array. By using (3, 1, 1, 1), I make sure that only the first dimension (= the different images) is used for the averaging; a size of 1 along an axis leaves that axis untouched.
By default, the running window (or rather the kernel) is used to compute the value of its central pixel. To match this more closely with your code, I used origin=-1, so the kernel computes the value of the pixel one to the left of its center.
By default, the edge cases (the two last images in this case) are handled by padding with a reflection. Your question suggests that what you want is to use the first images again instead. This is done using mode='wrap'.
By default, the filter returns the result in the same dtype as the input, here np.uint8. This is probably desirable, but your example code produces floats, so perhaps you want the filter to return floats as well, which you can do by simply changing the dtype of the input, i.e. done = uniform_filter(all_imgs.astype(float), size....
As for the distorted color space when you plot your averages; I cannot reproduce that. Your approach seems to produce the correct output for my random noise example images (after correction of the issue I pointed out in my comment to your question). Perhaps you could try plt.imshow(temp_avg, interpolation='none') to avoid possible artefacting from imshow's interpolation?

Resample all images in the database to the same voxel size

I have 3 dicom stacks of size 512x512x133, 512x512x155 and 512x512x277. I would like to resample all the stack to make the sizes 512x512x277, 512x512x277 and 512x512x277. How to do that?
I know I can resample using the slice thickness and pixel spacing, but that would not ensure the same number of slices in each case.
You can use scipy.ndimage.zoom, specifying the array of zoom factors for each axis, like this:
import numpy as np
import scipy.ndimage
# example for the first image
zoomArray = np.array(desiredshape, dtype=float) / original.shape
zoomed = scipy.ndimage.zoom(original, zoomArray)
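For the stacks in the question this would keep the in-plane resolution and stretch only the z axis; a sketch for the first stack (the array here is a stand-in for the real DICOM data):
import numpy as np
import scipy.ndimage
original = np.zeros((512, 512, 133))  # stand-in for the 512x512x133 stack
zoomArray = np.array([512, 512, 277], dtype=float) / original.shape  # (1.0, 1.0, ~2.08)
zoomed = scipy.ndimage.zoom(original, zoomArray)
print(zoomed.shape)  # (512, 512, 277)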
UPDATE:
If that is too slow, you could try to create separate images from the vertical slices of your "image cube", process them with some high-speed image library (some folks love ImageMagick; there are also PIL, opencv, etc.), and stack them together again, as in the sketch below. That way, you'd take 512 images of size 512x133 and resize them to 512x277, then stack them again into the 512x512x277 array that is your final desired size. This separation would also allow for parallelization. One thing to consider: this only works if the transversal axis (the one along which you slice out the 2D images) is not itself resized!
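A minimal sketch of that idea with opencv (assuming a 512x512x133 volume, resizing only within each vertical slice):
import numpy as np
import cv2
cube = np.random.rand(512, 512, 133).astype(np.float32)  # stand-in volume
# each cube[i] is a 512x133 slice; cv2.resize takes dsize as (width, height)
slices = [cv2.resize(cube[i], (277, 512)) for i in range(cube.shape[0])]
resampled = np.stack(slices, axis=0)  # stacked back together as 512 x 512 x 277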
You can use the Resample transform in TorchIO.
import torchio as tio
small, medium, large = dicom_dirs # the folders of your three DICOMs
reference = tio.ScalarImage(large)
resample = tio.Resample(reference)
small_resampled = resample(small)
medium_resampled = resample(medium)
The three images now have the same shape, 512 x 512 x 277.
Disclaimer: I am the main developer of TorchIO.

How to reduce an image size in image processing (scipy/numpy/python)

Hello, I have an image (1024 x 1024) and I used numpy's "fromfile" command to put every pixel of that image into a matrix.
How can I reduce the size of the image (e.g. to 512 x 512) by modifying that matrix a?
a = numpy.fromfile(- path -, 'uint8').reshape((1024, 1024))
I have no idea how to modify the matrix a to reduce the size of the image, so if somebody has an idea, please share your knowledge and I will appreciate it. Thanks.
EDIT:
When I looked at the result, I found that the reader read the image and put it into a "matrix", so I changed "array" to "matrix" above.
Jose told me I can take only the even columns and even rows and put them into a new matrix. That would reduce the image to half its size. What command in scipy/numpy do I need to use to do that?
Thanks
If you want to resize to a specific resolution, use scipy.misc.imresize:
import scipy.misc
i_width = 640
i_height = 480
scipy.misc.imresize(original_image, (i_height, i_width))
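Note that scipy.misc.imresize was deprecated and removed in SciPy 1.3. On current versions, the equivalent is to go through Pillow directly, which is what imresize wrapped internally (a sketch, assuming original_image is a uint8 array):
from PIL import Image
import numpy as np
# PIL's resize takes the target size as (width, height)
resized = np.array(Image.fromarray(original_image).resize((i_width, i_height)))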
Use the zoom function from scipy:
http://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.zoom.html#scipy.ndimage.zoom
import numpy as np
from scipy.ndimage import zoom
a = np.ones((1024, 1024))
small_a = zoom(a, 0.5)
I think the easiest way is to take only some columns and some rows of the image, making a subsample of the array. Take, for example, only the even rows and the even columns, put them in a new array, and you will have a half-size new image.
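In numpy, that subsampling is a single slicing step:
# a is the 1024 x 1024 array from the question; keep every second row and column
half = a[::2, ::2]  # -> 512 x 512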
