How to use from_tensor_slices properly on MRI images? - python

I'm working with MRI images and would like to use from_tensor_slices to build a dataset from the file paths, but I don't know how to use it properly. Below are my code, the error message, and a link to the dataset.
First I arrange my data: 484 images and 484 labels.
image_data_path = './drive/MyDrive/Brain Tumour/Task01_BrainTumour/imagesTr/'
label_data_path = './drive/MyDrive/Brain Tumour/Task01_BrainTumour/labelsTr/'
image_paths = [image_data_path + name
               for name in os.listdir(image_data_path)
               if not name.startswith(".")]
label_paths = [label_data_path + name
               for name in os.listdir(label_data_path)
               if not name.startswith(".")]
image_paths = sorted(image_paths)
label_paths = sorted(label_paths)
Then the function to load one example (I use nibabel to load the .nii files):
def load_one_sample(image_path, label_path):
    image = nib.load(image_path).get_fdata()
    image = tf.convert_to_tensor(image, dtype = 'float32')
    label = nib.load(label_path).get_fdata()
    label = tf.convert_to_tensor(label, dtype = 'uint8')
    return image, label
Next, I tried using from_tensor_slices
image_filenames = tf.constant(image_paths)
label_filenames = tf.constant(label_paths)
dataset = tf.data.Dataset.from_tensor_slices((image_filenames, label_filenames))
all_data = dataset.map(load_one_sample)
This raises the error: TypeError: stat: path should be string, bytes, os.PathLike or integer, not Tensor
What is wrong and how can I fix it?
Datalink: https://drive.google.com/drive/folders/1HqEgzS8BV2c7xYNrZdEAnrHk7osJJ--2 (task 1 - Brain Tumour)
Please tell me if you need more information.

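nib.load is not a TensorFlow function.
Dataset.map traces the mapped function as a graph, so load_one_sample receives symbolic tensors rather than Python strings. To use anything in a tf.data pipeline that is not a TensorFlow function, you have to wrap it with tf.py_function.
Code: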
import os
import nibabel as nib
import tensorflow as tf
image_data_path = 'Task01_BrainTumour/imagesTr/'
label_data_path = 'Task01_BrainTumour/labelsTr/'
image_paths = [image_data_path + name
               for name in os.listdir(image_data_path)
               if not name.startswith(".")]
label_paths = [label_data_path + name
               for name in os.listdir(label_data_path)
               if not name.startswith(".")]
image_paths = sorted(image_paths)
label_paths = sorted(label_paths)
def load_one_sample(image_path, label_path):
    # Inside tf.py_function the arguments are eager tensors, so .numpy()
    # returns bytes; decode them to get the str path that nib.load expects.
    image = nib.load(image_path.numpy().decode()).get_fdata()
    image = tf.convert_to_tensor(image, dtype = 'float32')
    label = nib.load(label_path.numpy().decode()).get_fdata()
    label = tf.convert_to_tensor(label, dtype = 'uint8')
    return image, label
def wrapper_load(img_path, label_path):
    # Tout lists the dtypes returned by load_one_sample, in order.
    img, label = tf.py_function(func = load_one_sample, inp = [img_path, label_path], Tout = [tf.float32, tf.uint8])
    return img, label
dataset = tf.data.Dataset.from_tensor_slices((image_paths, label_paths)).map(wrapper_load)
The error is not caused by from_tensor_slices; it arises because nib.load expects a string but receives a tensor.
A better approach, however, would be to write the data to TFRecords and train the model from those.
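To illustrate the decoding step on its own: inside tf.py_function, calling .numpy() on an eager string tensor returns bytes, which still has to become a str before a path-based loader can use it. The helper below is a minimal sketch of that conversion. to_path_string is a hypothetical name, not part of TensorFlow or nibabel, and it duck-types on a .numpy() method so that it runs without TensorFlow installed.

```python
import os


def to_path_string(value):
    """Return a plain str path from a str, bytes, os.PathLike, or tensor-like value.

    Hypothetical helper: anything exposing .numpy() is treated as an eager
    string tensor whose .numpy() call yields the raw bytes of the path.
    """
    if hasattr(value, "numpy"):
        value = value.numpy()
    # os.fsdecode accepts str, bytes and os.PathLike and always returns str.
    return os.fsdecode(value)
```

In load_one_sample this would replace image_path.numpy().decode() with to_path_string(image_path).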

Related

Error when using a custom dataset with fastai

I am getting an error when trying to use my custom fastai dataset
The error:
Exception: Can't infer the type of your targets.
It's either because your data source is empty or because your labeling function raised an error.
The code:
from fastai import *
from fastai.vision import *
class URL:
    MURDERHORNETS = f"https://superdata.quinniboi10.repl.co/MurderHornetImages"
path = untar_data(URL.MURDERHORNETS)
'''
path = untar_data(URLs.PETS)
files = get_image_files(path)
import PIL
img = PIL.Image.open(files[0])
img
'''
fnames = get_image_files(path)
fnames[:5]
np.random.seed(2)
pat = r'/([^/]+)_\d+\.(png|jpg|jpeg)$'
data = ImageDataBunch.from_folder(path, train=path, test=None, valid_pct=0.2,
                                  ds_tfms=get_transforms(),
                                  size=160)
data.normalize(imagenet_stats)
data.show_batch(rows=3, figsize=(7,6))
print(data.classes)
len(data.classes), data.c
learn = cnn_learner(data, models.resnet50, metrics=error_rate)
learn.fit_one_cycle(5)
learn.save('stage-1')
The dataset is here; please don't comment on the name, I don't know why I chose it :/
Get the zip file of the dataset here
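The error message names two possible causes, and the first (an empty data source) can be checked without fastai. The sketch below counts image files per parent folder, which is roughly what a folder-based labeling scheme would see. count_images_by_folder is a hypothetical helper, and the suffix list is an assumption taken from the regular expression in the code above. It is a diagnostic only, not a fix.

```python
from collections import Counter
from pathlib import Path

# Assumed image suffixes, mirroring the (png|jpg|jpeg) pattern in the question.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def count_images_by_folder(root):
    """Map each parent folder name to the number of image files found in it."""
    counts = Counter()
    for entry in Path(root).rglob("*"):
        if entry.is_file() and entry.suffix.lower() in IMAGE_SUFFIXES:
            counts[entry.parent.name] += 1
    return dict(counts)


# Example call, assuming `path` is the folder returned by untar_data:
# print(count_images_by_folder(path))
```

An empty result would mean no images were found under the root at all; a single key would mean every image shares one parent folder.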

FaceAlign AttributeError: 'str' object has no attribute 'shape'

I'm using Face Align and getting the following error for every image:
AttributeError: 'str' object has no attribute 'shape'
I interpret this to mean that my code is expecting an image object and instead receiving a string, correct?
The offending code:
def getAligns(self,
              img,
              use_cnn = False,
              savepath = None,
              return_info = False):
    """
    get face alignment picture
    :param img: original BGR image or a path to it
    :param use_cnn: using CNN to extract aligned faces, if set, dlib
                    must be compiled with cuda support
    :param savepath: savepath, format "xx/xx/xx.png"
    :param return_info: if set, return face positions [(x, y, w, h)]
    :return: aligned faces, (opt) rects
    """
    print(img.shape)
    if type(img) == str:
        try:
            img = cv2.imread(img)
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        except:
            shutil.copy2(img, 'temp.jpg')
            img = cv2.imread('temp.jpg')
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            os.remove('temp.jpg')
Relevant code from Align_Faces:
if clean: clear_dir(config.aligned)
os.chdir(config.labeled)
jobs = glob.glob("*.jpg")
print(len(jobs))
## # un-parallel
for picname in jobs:
print(picname)
aligned = FL.getAligns(picname)
if len(aligned) != 1:
print(config.aligned)
print(picname)
print(aligned[0])
return cv2.imwrite(config.aligned + picname, aligned[0])
Full output:
Look at your data flow:
jobs = glob.glob("*.jpg")
## # un-parallel
for picname in jobs:
    print(picname)
    aligned = FL.getAligns(picname)
def getAligns(self,
              img,
              use_cnn = False,
              savepath = None,
              return_info = False):
    print(img.shape)
Your posted output shows that you've maintained the file name as a string 0_0_ErinMurphy.jpg, and passed that string into getAligns. A string has no shape attribute. You've missed a conversion step, such as reading in the image.
You are passing the string with the image name to the function and then asking for the shape of that string.
print(picname)
returns 0_0_ErinMurphy.jpg, which is a string.
You need to load the image first (for example with cv2.imread, as the function does further down) so that you have pixel data whose shape you can read.
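A minimal sketch of doing the type check before touching .shape is below. as_image and read_image are hypothetical names. In the code above read_image would be cv2.imread, which returns None rather than raising when a file cannot be read, so the sketch checks for that as well.

```python
def as_image(img, read_image):
    """Return pixel data, loading it first when img is a file path.

    read_image is any callable that takes a path and returns pixel data,
    or None when the file cannot be read (as cv2.imread does).
    """
    if isinstance(img, str):
        loaded = read_image(img)
        if loaded is None:
            raise FileNotFoundError("could not read image: " + img)
        return loaded
    # Already pixel data; hand it back unchanged.
    return img
```

With this in place, getAligns could call img = as_image(img, cv2.imread) first and only then print(img.shape).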

Python decoding of particular image

I am trying to decode and display a planetary image encoded according to the PDS standard. I am using Pillow and need to implement a bit decoder, but after various attempts I have no good results. Could anyone help me implement this particular Pillow decoder? I can provide more information if needed.
Here is the encoding information for the PDS file I want to use. These are the important lines from the file header:
POINTERS TO DATA OBJECTS
BROWSE_IMAGE = 20480 <BYTES>
IMAGE = 53248 <BYTES>
OBJECT DESCRIPTION
OBJECT = IMAGE
FIRST_LINE = 1
LINE_PREFIX_BYTES = 0
LINE_SUFFIX_BYTES = 0
LINES = 1024
LINE_SAMPLES = 1024
SAMPLE_TYPE = MSB_UNSIGNED_INTEGER
SAMPLE_BITS = 16
SAMPLE_BIT_MASK = "2#0000001111111111#"
END_OBJECT = IMAGE
Is that clear enough? That is the part of the header defining how the image is encoded. The file contains both the header and the image itself. I need a Pillow decoder, specifically a bit decoder, but mine does not work.
This is my code for the bit decoder:
from PIL import Image, ImageFile
class DarkImageFile( ImageFile.ImageFile ) :
    format = 'IMG'
    format_description = 'IMG dark frame'
    def _open( self ) :
        self.size = (1024,1024)
        self.mode = 'F' # data representation mode
        self.tile = [ ("bit", ( 0,0 ) + self.size, 53248, (10,6, 0, 3) ) ]
Image.register_open( "IMG", DarkImageFile )
Image.register_extension( "IMG", ".img" )
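Independent of Pillow, the header above describes a simple layout: LINES x LINE_SAMPLES samples, each a big-endian (MSB first) unsigned 16-bit integer of which only the low 10 bits are significant according to SAMPLE_BIT_MASK. A minimal sketch of decoding such samples with the standard library follows. decode_msb_uint16 is a hypothetical helper, the caller is assumed to have already sliced the raw bytes at the image offset, and whether the 53248 pointer counts from 0 or from 1 is something to check against the PDS standard.

```python
import struct


def decode_msb_uint16(raw, lines, line_samples, bit_mask=0x03FF):
    """Decode big-endian unsigned 16-bit samples into rows of masked integers.

    raw is assumed to start at the first image sample. The default bit_mask
    corresponds to SAMPLE_BIT_MASK = 2#0000001111111111#.
    """
    count = lines * line_samples
    needed = count * 2  # two bytes per 16-bit sample
    if len(raw) < needed:
        raise ValueError("expected at least %d bytes, got %d" % (needed, len(raw)))
    # ">" selects big-endian byte order, "H" is an unsigned 16-bit integer.
    values = struct.unpack(">%dH" % count, raw[:needed])
    return [
        [values[row * line_samples + col] & bit_mask for col in range(line_samples)]
        for row in range(lines)
    ]
```

For the file in question the call would be decode_msb_uint16(image_bytes, 1024, 1024), where image_bytes is the part of the file after the header.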

How to classify images using Spark and Caffe

I am using Caffe for image classification on Mac OS X with Python.
Right now I know how to classify a list of images using Caffe with plain Python, but to make it faster I want to use Spark.
Therefore, I tried to apply the image classification to each element of an RDD created from a list of image paths. However, Spark does not allow me to do so.
Here is my code.
This is the code for image classification:
# display image name, class number, predicted label
def classify_image(image_path, transformer, net):
    image = caffe.io.load_image(image_path)
    transformed_image = transformer.preprocess('data', image)
    net.blobs['data'].data[...] = transformed_image
    output = net.forward()
    output_prob = output['prob'][0]
    pred = output_prob.argmax()
    labels_file = caffe_root + 'data/ilsvrc12/synset_words.txt'
    labels = np.loadtxt(labels_file, str, delimiter='\t')
    lb = labels[pred]
    image_name = image_path.split(images_folder_path)[1]
    result_str = 'image: '+image_name+' prediction: '+str(pred)+' label: '+lb
    return result_str
This is the code that sets up the Caffe parameters and applies the classify_image method to each element of the RDD:
def main():
    sys.path.insert(0, caffe_root + 'python')
    caffe.set_mode_cpu()
    model_def = caffe_root + 'models/bvlc_reference_caffenet/deploy.prototxt'
    model_weights = caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'
    net = caffe.Net(model_def,
                    model_weights,
                    caffe.TEST)
    mu = np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
    mu = mu.mean(1).mean(1)
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    transformer.set_transpose('data', (2,0,1))
    transformer.set_mean('data', mu)
    transformer.set_raw_scale('data', 255)
    transformer.set_channel_swap('data', (2,1,0))
    net.blobs['data'].reshape(50,
                              3,
                              227, 227)
    image_list= []
    for image_path in glob.glob(images_folder_path+'*.jpg'):
        image_list.append(image_path)
    images_rdd = sc.parallelize(image_list)
    transformer_bc = sc.broadcast(transformer)
    net_bc = sc.broadcast(net)
    image_predictions = images_rdd.map(lambda image_path: classify_image(image_path, transformer_bc, net_bc))
    print image_predictions
if __name__ == '__main__':
    main()
As you can see, I tried to broadcast the Caffe objects with transformer_bc = sc.broadcast(transformer) and net_bc = sc.broadcast(net).
The error is:
RuntimeError: Pickling of "caffe._caffe.Net" instances is not enabled
Before I added the broadcast, the error was:
Driver stacktrace.... Caused by: org.apache.spark.api.python.PythonException: Traceback (most recent call last):....
So, is there any way to classify images using Caffe that also takes advantage of Spark?
When you work with complex, non-native objects, initialization has to be moved directly to the workers, for example with a singleton module:
net_builder.py:
import caffe
net = None
def build_net(*args, **kwargs):
    ... # Initialize net here
    return net
def get_net(*args, **kwargs):
    global net
    if net is None:
        net = build_net(*args, **kwargs)
    return net
main.py:
import net_builder
sc.addPyFile("net_builder.py")
def classify_image(image_path, transformer, *args, **kwargs):
    net = net_builder.get_net(*args, **kwargs)
This means you'll have to distribute all required files as well, either manually or with the SparkFiles mechanism.
On a side note, you should take a look at the SparkNet package.

NameError when running a function where the name of an array is a parameter of the function

I am attempting to write a function which imports a specified CDF data file, formats the image as a numpy array, and does some slight refinements on the image to remove background. This works fine; however, I encounter an error when I try to pass the name of the array as one of the parameters of my function:
from netCDF4 import Dataset
import numpy as np
import scipy.signal
def importfunction(datafile, imagelabel):
    f = Dataset(datafile)
    locationfloatfield = f.variables['FloatField']
    floatfield = locationfloatfield[:]
    img = floatfield.flatten()
    img = scipy.signal.detrend(img)
    imagelabel = np.reshape(img, (256, 256))
    imagelabel += abs(imagelabel.min())
    imagelabel *= (1.0/imagelabel.max())
I want the array to be named by imagelabel, which I specify when I call the function. However, when I call the function I get this error:
importfunction('..../CS191mk2153-M-Xp-Topo.nc', label)
NameError: name 'label' is not defined
I am unsure how to fix this.
Maybe what you want to do is:
def importfunction(datafile):
    f = Dataset(datafile)
    locationfloatfield = f.variables['FloatField']
    floatfield = locationfloatfield[:]
    img = floatfield.flatten()
    img = scipy.signal.detrend(img)
    imagelabel = np.reshape(img, (256, 256))
    imagelabel += abs(imagelabel.min())
    imagelabel *= (1.0/imagelabel.max())
    return imagelabel
then call:
label = importfunction('..../CS191mk2153-M-Xp-Topo.nc')
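The underlying point is that a function cannot create a variable in the caller's scope from a parameter name; it returns a value and the caller chooses the name. If the label really has to be chosen at run time, a dictionary keyed by a string is one option. The sketch below uses plain lists instead of numpy arrays to stay self-contained. normalize and images are hypothetical names, and normalize assumes the minimum is non-positive and the values are not all equal, as would be expected after detrending.

```python
def normalize(values):
    """Shift values so the minimum is 0, then scale so the maximum is 1."""
    shift = abs(min(values))          # mirrors imagelabel += abs(imagelabel.min())
    shifted = [v + shift for v in values]
    scale = 1.0 / max(shifted)        # mirrors imagelabel *= (1.0/imagelabel.max())
    return [v * scale for v in shifted]


# The caller decides the name, here as a dictionary key.
images = {}
images["label"] = normalize([-2.0, 0.0, 2.0])
print(images["label"])  # [0.0, 0.5, 1.0]
```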
