I am taking a Pattern recognition subject in this semester. I have a project to do face detection system from 3000++ images. I am using python for this project.
What I have done so far is convert the image into numpy array and store inside a list using code below:
# convert to numpy array, then grayscale, then resize, then vectorize, finally store in
# a list
for file in sorted(img_path):
img = cv2.imread(file)
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_gray = cv2.resize(img_gray, dsize=(150, 150), interpolation=cv2.INTER_CUBIC)
img_gray = img_gray.reshape(-1)
imagesData.append(img_gray)
# save to .h5 file, not yet do for label dataset
hf = h5py.File(save_path, 'a')
dset = hf.create_dataset('dataset',data=imagesData)
hf.close()
There is a small question here, is reshape(-1) mean vectorize? I try imagesData.shape, it print out (22500,), originally (150,150)
print(imagesData[0].shape)
The images are from a google drive folder(consisit of .png image). I am using sorted in looping because I want to arrange and store the numpy array in list from first to last images (1223 - 5222). Why I do this because I was given a text file containing some features that arranged from (1223-5222) and I going to store both dataset (imagesData) and label datasets (features) inside a .h5 file. The features text file as below:
text file
Am I right? because after store both dataset and label datasets into .h5 file, I will load them out and start some machine algorithm for my project, so I have to make sure each row of sample match correct label.
Related
My problem is a bit wired. I am working on Prostate MRI dataset which contains dicom images. When I load dicom files using Simple ITK the output numpy array's dtype will be float64 . But when I load same dicom files using pydicom , the output numpy array's dtype will be uint16 And the problem isn't just this. Pixel intensities would be different when using different module. So my question is why they look different and which one is correct and why these modules load data differently?
this is the code I'm using to load dcm files.
import pydicom
import SimpleITK as sitk
path = 'dicoms/1.dcm'
def read_using_sitk():
reader = sitk.ImageFileReader()
reader.SetFileName(path)
image = reader.Execute()
numpy_array = sitk.GetArrayFromImage(image)
return numpy_array.dtype
def read_using_pydicom():
dataset = pydicom.dcmread(path)
numpy_array = dataset.pixel_array
return numpy_array.dtype
The difference is that pydicom loads the original data as saved in the dataset (which is usually uint16 for MR data), while SimpleITK does some preprocessing (most likely applying the LUT) and returns the processed data as a float array.
In pydicom, to get data suitable for display, you have to apply some lookup table yourself, usually the one coming with the image.
If you have a modality LUT (not very common for MR data), you first have to apply that using apply_modality_lut, while for the VOI LUT you use apply_voi_lut. This will apply both modality and VOI LUT as found in the dataset:
ds = dcmread(fname)
arr = ds.pixel_array
out = apply_modality_lut(arr, ds)
display_data = apply_voi_lut(out, ds, index=0)
This can be savely used even if no modality or VOI LUT is present in the dataset - in this case just the input data is returned.
Note that there can be more than one VOI LUT in a DICOM image, for example to show different kinds of tissue - thus the index argument, though that is also not very common in MR images.
I am trying to import CT scan data into ImageJ/FIJI (There is HDF5 plugin in ImageJ/Fiji, however the synchrotron CT data has so large datasets.. so it was failed to open). The scan data (Image dataset) is saved as dataset into the hdf5 file. So I have to extract image dataset from the hdf5 file, then converted it into the Tiff file.
HdF5 File path is "F:/New_ESRF/SNT_BTO4/SNT_BTO4_S1/SNT_BTO4_S1_1_1pag_db0005_vol.hdf5"
Herein, 'SNT_BTO4_S1_1_1pag_db0005_vol.hdf5' is divided into several datasets, and the image dataset is in here:/entry0000/reconstruction/results/data
At the moment, I accessed to the image dataset using h5py. However, after that, I am in stuck to extract/save the dataset separately from the hdf5 file.
Which code is required to extract the image dataset from the hdf5 file?
After that, I am thinking of using from PIL to Image then convert the image into Tiff file. Can I get any advice on the code for this?
import numpy as np
import h5py
filename = "F:/New_ESRF/SNT_BTO4/SNT_BTO4_S1/SNT_BTO4_S1_1_1pag_db0005_vol.hdf5"
with h5py.File(filename,'r') as hdf:
base_items = list (hdf.items())
print('#Items in the base directory:', base_items)
#entry0000
G1 = hdf.get ('entry0000')
G1_items = list (G1.items())
print('#Items in entry0000', G1_items)
#reconstruction
G11 = G1.get ('/entry0000/reconstruction')
G11_items = list (G11.items())
print('#Items in reconstruction', G11_items)
#results_data
G12 = G11.get ('/entry0000/reconstruction/results')
G12_items = list (G12.items())
print('#Items in results', G12_items)
Extracting image data from an HDF5 file and converting to an image is a "relatively straight forward" 2 step process:
Access the data in the HDF5 file
Convert to an image with cv2 (or PIL)
A simple example is available here: How to extract individual JPEG images from a HDF5 file.
You can apply the same process to your file. Here is some pseudo-code. It's not complete because you don't show the shape of the image dataset (and the shape affects how to read the data). Also, you didn't say how many images are in dataset /entry0000/reconstruction/results/data --- does it have a single image or multiple images. If multiple images, which axis is the image counter?
import h5py
import cv2 ## for image conversion
filename = "F:/New_ESRF/SNT_BTO4/SNT_BTO4_S1/SNT_BTO4_S1_1_1pag_db0005_vol.hdf5"
with h5py.File(filename,'r') as hdf:
# get image dataset
img_ds = hdf['/entry0000/reconstruction/results/data']
print(f'Image Dataset info: Shape={img_ds.shape},Dtype={img_ds.dtype}')
## following depends on dataset shape/schema
## code below assumes images are along axis=0
for i in range(img_ds.shape[0]):
cv2.imwrite(f'test_img_{i:03}.tiff',img_ds[i,:]) # uses slice notation
# alternately load to a numpy array first
img_arr = img_ds[i,:] # slice notation gets [i,:,:,:]
cv2.imwrite(f'test_img_{i:03}.tiff',img_arr)
Note: you don't need to use .get() to get a dataset. You can simply reference the dataset path. Also, when you use a group object, use the relative path from the dataset to the group, not the absolute path. (You should modify your code to reflect these changes.) For example, the following are equivalent
G1 = hdf['entry0000']
## is the same as G1 = hdf.get('entry0000')
G11 = hdf['entry0000/reconstruction']
## is the same as G11 = hdf.get('entry0000/reconstruction')
## OR referencing G1 group object:
G11 = G1['reconstruction']
## is the same as G11 = G1.get('reconstruction')
I have a text file that contains the paths to jpeg images I want to import into my script. I am using an example code provided by a Udemy course: "Deep Learning with Python - Novice to Pro!" that detects smiles in images. The function I am having trouble with is converting images into matrixes/two-dimensional arrays:
def img2array(f, detection=False, ii_size=(64, 64)):
"""
Convert images into matrixes/two-dimensional arrays.
detection - if True we will resize an image to fit the
shape of a data that our first convolutional
layer is accepting which is 32x32 array,
used only on detection.
ii_size - this is the size that our input images have.
"""
rf=None
if detection:
rf=f.rsplit('.')
rf=rf[0]+'-resampled.'+rf[1]
im = Image.open(f)
# Create a smaller scalled down thumbnail
# of our image.
im.thumbnail(ii_size)
# Our thumbnail might not be of a perfect
# dimensions, so we need to create a new
# image and paste the thumbnail in.
newi = Image.new('L', ii_size)
newi.paste(im, (0,0))
newi.save(rf, "JPEG")
f=rf
# Turn images into an array.
data=imread(f, as_gray=True)
# Downsample it from 64x64 to 32x32
# (that's what we need to feed into our first convolutional layer).
data=block_reduce(data, block_size=(2, 2), func=np.mean)
if rf:
remove(rf)
return data
The function is called in another script:
img_data=prep_array(img2array(filename, detection=True), detection=True)
I am not sure what to name 'filename' in order for this code to run correctly. When I give it the text file path I get an error that says:
UnidentifiedImageError: cannot identify image file 'filepath\imagelist.txt
I am brand new to Python, and I need help importing the correct 'filename' variable to make this function work.
From the looks of the error message, you're passing in the file path of the textfile (containing paths to images) as filename
Parse the text file for file paths to images and pass it into your function.
with open("path/to/imagelist.txt", "r") as fp:
filepaths = fp.read().splitlines()
for filename in filepaths:
img_data=prep_array(img2array(filename, detection=True), detection=True)
I have a data set of images in an image processing project. I want to input an image and scan through the data set to recognize the given image. What module/ library/ approach( eg: ML) should I use to identify my image in my python- opencv code?
To find exactly the same image, you don't need any kind of ML. The image is just an array of pixels, so you can check if the array of the input image equals that of an image in your dataset.
import glob
import cv2
import numpy as np
# Read in source image (the one you want to match to others in the dataset)
source = cv2.imread('test.jpg')
# Make a list of all the images in the dataset (I assume they are images in a directory)
filelist = glob.glob(r'C:\Users\...\Images\*.JPG')
# Loop through the images, read them in and check if an image is equal to your source
for file in filelist:
img = cv2.imread(file)
if np.array_equal(source, img):
print("%s is the same image as source" %(file))
break
I was looking at this Tensorflow tutorial.
In the tutorial the images are magically read like this:
mnist = learn.datasets.load_dataset("mnist")
train_data = mnist.train.images
My images are placed in two directories:
../input/test/
../input/train/
They all have a *.jpg ending.
So how can read them into my program?
I don't think I can use learn.datasets.load_dataset because this seems to take in a specialized dataset structure, while I only have folders with images.
mnist.train.images is essentially a numpy array of shape [55000, 784]. Where, 55000 is the number of images and 784 is the number of pixels in each image (each image is 28x28)
You need to create a similar numpy array from your data in case you want to run this exact code. So, you'll need to iterate over all your images, read image as a numpy array, flatten it and create a matrix of size [num_examples, image_size]
The following code snippet should do it:
import os
import cv2
import numpy as np
def load_data(img_dir):
return np.array([cv2.imread(os.path.join(img_dir, img)).flatten() for img in os.listdir(img_dir) if img.endswith(".jpg")])
A more comprehensive code to enable debugging:
import os
list_of_imgs = []
img_dir = "../input/train/"
for img in os.listdir("."):
img = os.path.join(img_dir, img)
if not img.endswith(".jpg"):
continue
a = cv2.imread(img)
if a is None:
print "Unable to read image", img
continue
list_of_imgs.append(a.flatten())
train_data = np.array(list_of_imgs)
Note:
If your images are not 28x28x1 (B/W images), you will need to change the neural network architecture (defined in cnn_model_fn). The architecture in the tutorial is a toy architecture which only works for simple images like MNIST. Alexnet may be a good place to start for RGB images.
You can check the answers given in How do I convert a directory of jpeg images to TFRecords file in tensorflow?. Easiest way is to use the utility provided by tensor flow :build_image_data.py, which does exactly the thing you want to do.