My problem is a bit wired. I am working on Prostate MRI dataset which contains dicom images. When I load dicom files using Simple ITK the output numpy array's dtype will be float64 . But when I load same dicom files using pydicom , the output numpy array's dtype will be uint16 And the problem isn't just this. Pixel intensities would be different when using different module. So my question is why they look different and which one is correct and why these modules load data differently?
this is the code I'm using to load dcm files.
import pydicom
import SimpleITK as sitk
path = 'dicoms/1.dcm'
def read_using_sitk():
reader = sitk.ImageFileReader()
reader.SetFileName(path)
image = reader.Execute()
numpy_array = sitk.GetArrayFromImage(image)
return numpy_array.dtype
def read_using_pydicom():
dataset = pydicom.dcmread(path)
numpy_array = dataset.pixel_array
return numpy_array.dtype
The difference is that pydicom loads the original data as saved in the dataset (which is usually uint16 for MR data), while SimpleITK does some preprocessing (most likely applying the LUT) and returns the processed data as a float array.
In pydicom, to get data suitable for display, you have to apply some lookup table yourself, usually the one coming with the image.
If you have a modality LUT (not very common for MR data), you first have to apply that using apply_modality_lut, while for the VOI LUT you use apply_voi_lut. This will apply both modality and VOI LUT as found in the dataset:
ds = dcmread(fname)
arr = ds.pixel_array
out = apply_modality_lut(arr, ds)
display_data = apply_voi_lut(out, ds, index=0)
This can be savely used even if no modality or VOI LUT is present in the dataset - in this case just the input data is returned.
Note that there can be more than one VOI LUT in a DICOM image, for example to show different kinds of tissue - thus the index argument, though that is also not very common in MR images.
Related
I want to convert .nii images to .tif to train my model using U-Net.
1-I looped through all images in the folder.
2-I looped through all slices within each image.
3-I saved each slice as .tif.
The training images are converted successfully. However, the labels (masks) are all saved as black images. I want to successfully convert those masks from .nii to .tif, but I don't know how. I read that it could be something with brightness, but I didn't get the idea clearly, so I couldn't solve the problem until now.
The only reason for this conversion is to be able to train my model. Feel free to suggest a better idea, if anyone can share a way to feed the network with the .nii format directly.
import nibabel as nib
import matplotlib.pyplot as plt
import imageio
import numpy as np
import glob
import os
import nibabel as nib
import numpy as np
from tifffile import imsave
import tifffile as tiff
for filepath in glob.iglob('data/Task04_Hippocampus/labelsTr/*.nii.gz'):
a = nib.load(filepath).get_fdata()
a = a.astype('int8')
base = Path(filepath).stem
base = re.sub('.nii', '', base)
x,y,z = a.shape
for i in range(0,z):
newimage = a[:, :, i]
imageio.imwrite('data/Task04_Hippocampus/masks/'+base+'_'+str(i)+'.tif', newimage)
Unless you absolutely have to use TIFF, I would strongly suggest using the NiFTI format for a number of important reasons:
Image values are often not arbitrary. For example, in CT images the values correspond to x-ray attenuation (check out this Wikipedia page). TIFF, which is likely to scale the values in some way, is not suitable for this.
NIfTI also contains a header which has crucial geometric information needed to correctly interpret the image, such as the resolution, slice thickness, and direction.
You can directly extract a numpy.ndarray from NIfTI images using SimpleITK. Here is a code snippet:
import SimpleITK as sitk
import numpy as np
img = sitk.ReadImage("your_image.nii")
arr = sitk.GetArrayFromImage(img)
slice_0 = arr[0,:,:] # this is a 2D axial slice as a np.ndarray
As an aside: the reason the images where you stored your masks look black is because in NIfTI format labels have a value of 1 (and background is 0). If you directly convert to TIFF, a value of 1 is very close to black when interpreted as an RGB value - another reason to avoid TIFF!
I am trying to import CT scan data into ImageJ/FIJI (There is HDF5 plugin in ImageJ/Fiji, however the synchrotron CT data has so large datasets.. so it was failed to open). The scan data (Image dataset) is saved as dataset into the hdf5 file. So I have to extract image dataset from the hdf5 file, then converted it into the Tiff file.
HdF5 File path is "F:/New_ESRF/SNT_BTO4/SNT_BTO4_S1/SNT_BTO4_S1_1_1pag_db0005_vol.hdf5"
Herein, 'SNT_BTO4_S1_1_1pag_db0005_vol.hdf5' is divided into several datasets, and the image dataset is in here:/entry0000/reconstruction/results/data
At the moment, I accessed to the image dataset using h5py. However, after that, I am in stuck to extract/save the dataset separately from the hdf5 file.
Which code is required to extract the image dataset from the hdf5 file?
After that, I am thinking of using from PIL to Image then convert the image into Tiff file. Can I get any advice on the code for this?
import numpy as np
import h5py
filename = "F:/New_ESRF/SNT_BTO4/SNT_BTO4_S1/SNT_BTO4_S1_1_1pag_db0005_vol.hdf5"
with h5py.File(filename,'r') as hdf:
base_items = list (hdf.items())
print('#Items in the base directory:', base_items)
#entry0000
G1 = hdf.get ('entry0000')
G1_items = list (G1.items())
print('#Items in entry0000', G1_items)
#reconstruction
G11 = G1.get ('/entry0000/reconstruction')
G11_items = list (G11.items())
print('#Items in reconstruction', G11_items)
#results_data
G12 = G11.get ('/entry0000/reconstruction/results')
G12_items = list (G12.items())
print('#Items in results', G12_items)
Extracting image data from an HDF5 file and converting to an image is a "relatively straight forward" 2 step process:
Access the data in the HDF5 file
Convert to an image with cv2 (or PIL)
A simple example is available here: How to extract individual JPEG images from a HDF5 file.
You can apply the same process to your file. Here is some pseudo-code. It's not complete because you don't show the shape of the image dataset (and the shape affects how to read the data). Also, you didn't say how many images are in dataset /entry0000/reconstruction/results/data --- does it have a single image or multiple images. If multiple images, which axis is the image counter?
import h5py
import cv2 ## for image conversion
filename = "F:/New_ESRF/SNT_BTO4/SNT_BTO4_S1/SNT_BTO4_S1_1_1pag_db0005_vol.hdf5"
with h5py.File(filename,'r') as hdf:
# get image dataset
img_ds = hdf['/entry0000/reconstruction/results/data']
print(f'Image Dataset info: Shape={img_ds.shape},Dtype={img_ds.dtype}')
## following depends on dataset shape/schema
## code below assumes images are along axis=0
for i in range(img_ds.shape[0]):
cv2.imwrite(f'test_img_{i:03}.tiff',img_ds[i,:]) # uses slice notation
# alternately load to a numpy array first
img_arr = img_ds[i,:] # slice notation gets [i,:,:,:]
cv2.imwrite(f'test_img_{i:03}.tiff',img_arr)
Note: you don't need to use .get() to get a dataset. You can simply reference the dataset path. Also, when you use a group object, use the relative path from the dataset to the group, not the absolute path. (You should modify your code to reflect these changes.) For example, the following are equivalent
G1 = hdf['entry0000']
## is the same as G1 = hdf.get('entry0000')
G11 = hdf['entry0000/reconstruction']
## is the same as G11 = hdf.get('entry0000/reconstruction')
## OR referencing G1 group object:
G11 = G1['reconstruction']
## is the same as G11 = G1.get('reconstruction')
I'm trying to extract the images (and its label and such) from an RGB-D dataset called NYUV2 dataset. (I downloaded the labelled dataset)
It's a matlab file so I tried using hdf5 to read it but I don't know how to proceed from here. How do I save the images and its corresponding labels and depths into a different folder??
Here's the script that I used and its corresponding output.
import numpy as np
import h5py
f = h5py.File('nyu_depth_v2_labeled.mat','r')
k = list(f.keys())
print(k)
Output is
['#refs#', '#subsystem#', 'accelData', 'depths', 'images', 'instances', 'labels', 'names', 'namesToIds', 'rawDepthFilenames', 'rawDepths', 'rawRgbFilenames', 'sceneTypes', 'scenes']
I hope this helps.
I suppose you are using the PIL package The function fromarray expects the "mode of the image" see https://pillow.readthedocs.io/en/3.1.x/handbook/concepts.html#concept-modes
I suppose your image is in RGB. I believe the image souhld be under group 'images' and dataset image_name
Therefore
import h5py
import numpy as np
from PIL import Image
hdf = h5py.File('nyu_depth_v2_labeled.mat','r')
array = np.array(list(hdf.get("images/image_name")))
img = Image.fromarray(array.astype('uint8'), 'RGB')
img.show()
You can also look at another answer I gave to know how to save images
Images saved as HDF5 arent colored
To view the content of the h5 file, download HDFview, it will help navigate through it.
I was looking at this Tensorflow tutorial.
In the tutorial the images are magically read like this:
mnist = learn.datasets.load_dataset("mnist")
train_data = mnist.train.images
My images are placed in two directories:
../input/test/
../input/train/
They all have a *.jpg ending.
So how can read them into my program?
I don't think I can use learn.datasets.load_dataset because this seems to take in a specialized dataset structure, while I only have folders with images.
mnist.train.images is essentially a numpy array of shape [55000, 784]. Where, 55000 is the number of images and 784 is the number of pixels in each image (each image is 28x28)
You need to create a similar numpy array from your data in case you want to run this exact code. So, you'll need to iterate over all your images, read image as a numpy array, flatten it and create a matrix of size [num_examples, image_size]
The following code snippet should do it:
import os
import cv2
import numpy as np
def load_data(img_dir):
return np.array([cv2.imread(os.path.join(img_dir, img)).flatten() for img in os.listdir(img_dir) if img.endswith(".jpg")])
A more comprehensive code to enable debugging:
import os
list_of_imgs = []
img_dir = "../input/train/"
for img in os.listdir("."):
img = os.path.join(img_dir, img)
if not img.endswith(".jpg"):
continue
a = cv2.imread(img)
if a is None:
print "Unable to read image", img
continue
list_of_imgs.append(a.flatten())
train_data = np.array(list_of_imgs)
Note:
If your images are not 28x28x1 (B/W images), you will need to change the neural network architecture (defined in cnn_model_fn). The architecture in the tutorial is a toy architecture which only works for simple images like MNIST. Alexnet may be a good place to start for RGB images.
You can check the answers given in How do I convert a directory of jpeg images to TFRecords file in tensorflow?. Easiest way is to use the utility provided by tensor flow :build_image_data.py, which does exactly the thing you want to do.
I have an array of pixel values from a greyscale image. I want to export these values into a text file or CSV. I have been trying to to this using various function including: xlsxwrite, write, CSV but I have so far been unsuccessful in doing this. This is what I am working with:
from PIL import Image
import numpy as np
import sys
# load the original image
img_file = Image.open("seismic.png")
img_file.show()
# get original image parameters...
width, height = img_file.size
format = img_file.format
mode = img_file.mode
# Make image Greyscale
img_grey = img_file.convert('L')
img_grey.save('result.png')
img_grey.show()
# Save Greyscale values
value = img_grey.load()
Essentially I would like to save 'value' to use elsewhere.
Instead of using the .load() method, you could use this (with 'x' as your grey-scale image):
value = np.asarray(x.getdata(),dtype=np.float64).reshape((x.size[1],x.size[0]))
This is from: Converting an RGB image to grayscale and manipulating the pixel data in python
Once you have done this, it is straightforward to save the array on file using numpy.savetxt
numpy.savetxt("img_pixels.csv", value, delimiter=',')
Note: x.load() yields an instance of a Pixel Access class which cannot be manipulated using the csv writers you mentioned. The methods available for extracting and manipulating individual pixels in a Pixel Access class can be found here: http://pillow.readthedocs.io/en/3.1.x/reference/PixelAccess.html