I have a CSV file that contains three columns: image file names, labels (in the form 0, 1, ...), and class names. Another folder contains all the images. I want to read the images using this CSV file and use them for an image classification task with deep learning models in Python.
You can try this:
import pandas as pd
import cv2
import numpy as np
from pathlib import Path

df = pd.read_csv('path_to_csv.csv')
path = Path('path/to/img/dir')  # relative path is OK

# apply() with axis=1 passes each row; the first column holds the file name,
# the second the integer label; cv2.imread expects a string path
dataset = df.apply(
    lambda row: (cv2.imread(str(path / row.iloc[0])), row.iloc[1]),
    axis=1
)
dataset = np.asarray(dataset.tolist(), dtype=object)
Now you'll have dataset, a NumPy array with the images in column 1 and the labels in column 2.
I'm using OpenCV to read the images. Of course you can use other libraries like Pillow too, but OpenCV is generally faster than Pillow.
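If you then want arrays you can feed straight into a model, here is a short follow-up sketch (assuming every image loaded successfully and has the same dimensions, and that the label column holds integers):
import numpy as np

# split the (image, label) pairs into model-ready arrays
X = np.stack([img for img, _ in dataset])            # shape: (n_samples, H, W, 3)
y = np.asarray([label for _, label in dataset])      # shape: (n_samples,)
X = X.astype('float32') / 255.0                      # scale pixel values to [0, 1]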
I have some left atrium data in .h5 form and I was wondering if it's possible to convert it into NIfTI files so I can train my model.
I have already done this for DICOM to NIfTI using the dicom2nifti package, however I couldn't find a package for h5-to-NIfTI conversion.
from Heart_segmentation.Prepering_data import Processing
from glob import glob
# STEP 2
import dicom2nifti
import os

# will be used to change dicom files into niftis
# dicom series to nifti parameters
in_path_images = 'C:/Users/Omar/Task02_Heart/decomGroups/images/*'
# contains images and labels
in_path_labels = 'C:/Users/Omar/Task02_Heart/decomGroups/labels/*'
out_path_images = 'C:/Users/Omar/Task02_Heart/nifti_files/images'
out_path_labels = 'C:/Users/Omar/Task02_Heart/nifti_files/labels'

list_of_images = glob(in_path_images)
list_of_labels = glob(in_path_labels)
print(list_of_labels)
print(list_of_images)
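As far as I know there is no dedicated h5-to-NIfTI package, but a minimal sketch with h5py and nibabel might look like the following. The dataset key 'image', the file names, and the identity affine are all assumptions: inspect your files with list(f.keys()) and supply the real voxel spacing if you have it.
import h5py
import nibabel as nib
import numpy as np

def h5_to_nifti(h5_path, out_path, dataset_key='image'):
    # 'image' is a placeholder key; check your file with list(f.keys()) first
    with h5py.File(h5_path, 'r') as f:
        volume = np.asarray(f[dataset_key])
    # identity affine is a placeholder; use the real voxel spacing if you know it
    nib.save(nib.Nifti1Image(volume, affine=np.eye(4)), out_path)

h5_to_nifti('la_volume.h5', 'la_volume.nii.gz')   # hypothetical file names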
I have uploaded the fairface dataset (https://github.com/joojs/fairface) into my google drive and I'm trying to convert the images to a dataset of arrays that I can use in a CNN.
First, I created a list of the files for the validation set. Now I am trying to convert the images to arrays. This is what I am trying, but it says my directory does not exist.
val is the folder of validation images.
import os
from PIL import Image
from numpy import asarray

val_items = os.listdir('/content/val')
train_items = os.listdir('/content/train')

val_img_array = []
# load the image and convert into
# numpy array
for i in range(len(val_items)):
    img = Image.open('/content/val/*.jpg')
    numpydata = asarray(img)
    val_img_array.append(numpydata)

print(val_img_array)
Please give me any guidance you have. Thanks!
You are not referencing the mounted Drive correctly. Once Drive is mounted in Colab, the files you uploaded live under /content/drive/MyDrive/, and Image.open() needs an individual file name rather than a wildcard, so the call should look like this:
Image.open('/content/drive/MyDrive/val/' + val_items[i])
I am trying to import CT scan data into ImageJ/Fiji (there is an HDF5 plugin for ImageJ/Fiji, however the synchrotron CT datasets are so large that opening them failed). The scan data (the image dataset) is saved as a dataset inside the HDF5 file, so I have to extract the image dataset from the HDF5 file and then convert it into TIFF files.
The HDF5 file path is "F:/New_ESRF/SNT_BTO4/SNT_BTO4_S1/SNT_BTO4_S1_1_1pag_db0005_vol.hdf5"
Here, 'SNT_BTO4_S1_1_1pag_db0005_vol.hdf5' is divided into several datasets, and the image dataset is here: /entry0000/reconstruction/results/data
At the moment, I have accessed the image dataset using h5py. However, after that, I am stuck on how to extract/save the dataset separately from the HDF5 file.
What code is required to extract the image dataset from the HDF5 file?
After that, I am thinking of using PIL's Image to convert the data into TIFF files. Can I get any advice on the code for this?
import numpy as np
import h5py
filename = "F:/New_ESRF/SNT_BTO4/SNT_BTO4_S1/SNT_BTO4_S1_1_1pag_db0005_vol.hdf5"
with h5py.File(filename, 'r') as hdf:
    base_items = list(hdf.items())
    print('#Items in the base directory:', base_items)
    # entry0000
    G1 = hdf.get('entry0000')
    G1_items = list(G1.items())
    print('#Items in entry0000:', G1_items)
    # reconstruction
    G11 = G1.get('/entry0000/reconstruction')
    G11_items = list(G11.items())
    print('#Items in reconstruction:', G11_items)
    # results_data
    G12 = G11.get('/entry0000/reconstruction/results')
    G12_items = list(G12.items())
    print('#Items in results:', G12_items)
Extracting image data from an HDF5 file and converting it to an image is a "relatively straightforward" two-step process:
Access the data in the HDF5 file
Convert to an image with cv2 (or PIL)
A simple example is available here: How to extract individual JPEG images from a HDF5 file.
You can apply the same process to your file. Here is some pseudo-code. It's not complete because you don't show the shape of the image dataset (and the shape affects how to read the data). Also, you didn't say how many images are in dataset /entry0000/reconstruction/results/data: does it hold a single image or multiple images? If there are multiple images, which axis is the image counter?
import h5py
import cv2 ## for image conversion
filename = "F:/New_ESRF/SNT_BTO4/SNT_BTO4_S1/SNT_BTO4_S1_1_1pag_db0005_vol.hdf5"
with h5py.File(filename, 'r') as hdf:
    # get image dataset
    img_ds = hdf['/entry0000/reconstruction/results/data']
    print(f'Image Dataset info: Shape={img_ds.shape}, Dtype={img_ds.dtype}')
    ## following depends on dataset shape/schema
    ## code below assumes images are along axis=0
    for i in range(img_ds.shape[0]):
        cv2.imwrite(f'test_img_{i:03}.tiff', img_ds[i,:])  # uses slice notation
        # alternately, load to a numpy array first
        img_arr = img_ds[i,:]  # slice notation gets [i,:,:,:]
        cv2.imwrite(f'test_img_{i:03}.tiff', img_arr)
Note: you don't need to use .get() to access a dataset; you can simply reference the dataset path directly. Also, when you index from a group object, use the path relative to that group, not the absolute path. (You should modify your code to reflect these changes.) For example, the following are equivalent:
G1 = hdf['entry0000']
## is the same as G1 = hdf.get('entry0000')
G11 = hdf['entry0000/reconstruction']
## is the same as G11 = hdf.get('entry0000/reconstruction')
## OR referencing G1 group object:
G11 = G1['reconstruction']
## is the same as G11 = G1.get('reconstruction')
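If you prefer PIL (as mentioned in the question), here is a minimal alternative sketch of the same loop, assuming each slice along axis 0 is a 2D array of a dtype Pillow can write to TIFF (e.g. uint8 or float32; otherwise convert with astype first):
import h5py
import numpy as np
from PIL import Image

filename = "F:/New_ESRF/SNT_BTO4/SNT_BTO4_S1/SNT_BTO4_S1_1_1pag_db0005_vol.hdf5"
with h5py.File(filename, 'r') as hdf:
    img_ds = hdf['/entry0000/reconstruction/results/data']
    for i in range(img_ds.shape[0]):
        slice_2d = np.asarray(img_ds[i])                  # one 2D slice, dtype as stored
        Image.fromarray(slice_2d).save(f'test_img_{i:03}.tiff')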
I want to train a deep neural network on an MRI slices dataset. Here is my code:
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import matplotlib
file_dir = 'C:\\Users\\adam\\Downloads\\MRI_Images\\'
import glob
import cv2
images = [cv2.imread(file) for file in glob.glob("C:\\Users\\adam\\Downloads\\MRI_Images\\*.png")]
(X_train_full, y_train_full), (X_test, y_test) = images
And Python raises "not enough values to unpack". I don't know why. Is there a problem with putting all the images in one folder?
I don't know the structure of your dataset directory, but I do know that glob.glob() will return all the images directly inside the 'C:\\Users\\adam\\Downloads\\MRI_Images\\' folder (subfolders are not included).
That is, what you get in images is a flat list of read-in images (as numpy arrays), like:
[image_0, image_1, ...]
A single list cannot be unpacked into two tuples of two elements each, which is why the error occurs.
Reading your train and test images separately might help:
images_trainx = [cv2.imread(file) for file in glob.glob("C:\\Users\\adam\\Downloads\\MRI_Images\\trainx\\*.png")]
images_trainy = [cv2.imread(file) for file in glob.glob("C:\\Users\\adam\\Downloads\\MRI_Images\\trainy\\*.png")]
images_testx = [cv2.imread(file) for file in glob.glob("C:\\Users\\adam\\Downloads\\MRI_Images\\testx\\*.png")]
images_testy = [cv2.imread(file) for file in glob.glob("C:\\Users\\adam\\Downloads\\MRI_Images\\testy\\*.png")]
This approach is clunky, but it is hard to get wrong.
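If you then want arrays you can pass to a network, a short follow-up sketch (assuming the hypothetical trainx/testx folders above and that every image has the same height, width, and channel count):
import numpy as np

# stack the lists of equally-sized images into single arrays
X_train_full = np.stack(images_trainx)   # shape: (n_train, H, W, 3)
X_test = np.stack(images_testx)          # shape: (n_test, H, W, 3)
print(X_train_full.shape, X_test.shape)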
I'm trying to extract the images (and their labels and so on) from an RGB-D dataset called NYUv2 (https://github.com/joojs/fairface is not related; I downloaded the labelled NYUV2 dataset).
It's a MATLAB file, so I tried reading it with h5py, but I don't know how to proceed from here. How do I save the images and their corresponding labels and depths into a different folder?
Here's the script that I used and its corresponding output.
import numpy as np
import h5py
f = h5py.File('nyu_depth_v2_labeled.mat','r')
k = list(f.keys())
print(k)
Output is
['#refs#', '#subsystem#', 'accelData', 'depths', 'images', 'instances', 'labels', 'names', 'namesToIds', 'rawDepthFilenames', 'rawDepths', 'rawRgbFilenames', 'sceneTypes', 'scenes']
I hope this helps.
I suppose you are using the PIL package. The function fromarray expects the "mode" of the image, see https://pillow.readthedocs.io/en/3.1.x/handbook/concepts.html#concept-modes
I suppose your image is in RGB. I believe the image should be under the group 'images' and the dataset image_name.
Therefore
import h5py
import numpy as np
from PIL import Image
hdf = h5py.File('nyu_depth_v2_labeled.mat','r')
array = np.array(list(hdf.get("images/image_name")))
img = Image.fromarray(array.astype('uint8'), 'RGB')
img.show()
You can also look at another answer I gave to see how to save images: Images saved as HDF5 aren't colored
To view the content of the h5 file, download HDFView; it will help you navigate through it.
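A minimal sketch of pulling a single image out by index and saving it, under the assumption that 'images' is one large dataset indexed by image number and that MATLAB stored each image with the channel axis first (print the shape and adjust the transpose if your file differs):
import h5py
import numpy as np
from PIL import Image

with h5py.File('nyu_depth_v2_labeled.mat', 'r') as f:
    images = f['images']
    print(images.shape)                  # check the axis order first
    img = np.asarray(images[0])          # first image, e.g. shape (3, W, H)
    img = img.transpose(2, 1, 0)         # reorder to (H, W, 3); adjust to the shape printed above
    Image.fromarray(img.astype('uint8'), 'RGB').save('image_000.png')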