Faster performance for normalizing a numpy array? - python

I am currently normalizing a numpy array in Python, created by slicing an image into windows with a stride, which produces about 20K patches. The current normalization implementation is a big pain point in my runtime, and I'm trying to replace it with the same functionality done in a C extension or something similar. I'm looking for advice from the community on how to get this done easily and simply.
Current runtime is about 0.34 s just for the normalization part, and I'm trying to get below 0.1 s or better. You can see that creating the patches is extremely efficient with view_as_windows, and I'm looking for something similar for normalization. Note you can simply comment/uncomment the lines labeled "# ---- Normalization" to see the runtimes of the different implementations for yourself.
Here is the current implementation:
import gc
import os       # added: os.path is used below but was missing from the snippet
import cv2, time
import numpy    # added: numpy is used throughout but was missing from the snippet
from libraries import GCN
from skimage.util.shape import view_as_windows

def create_imageArray(patch_list):
    returnImageArray = numpy.zeros(shape=(len(patch_list), 1, 40, 60))
    idx = 0
    for patch, name, coords in patch_list:
        imgArray = numpy.asarray(patch[:,:], dtype=numpy.float32)
        imgArray = imgArray[numpy.newaxis, ...]
        returnImageArray[idx] = imgArray
        idx += 1
    return returnImageArray
    # print "normImgArray[0]:",normImgArray[0]

def NormalizeData(imageArray):
    tempImageArray = imageArray

    # Normalize the data in batches
    batchSize = 25000
    dataSize = tempImageArray.shape[0]
    imageChannels = tempImageArray.shape[1]
    imageHeight = tempImageArray.shape[2]
    imageWidth = tempImageArray.shape[3]

    for i in xrange(0, dataSize, batchSize):
        stop = i + batchSize
        print("Normalizing data [{0} to {1}]...".format(i, stop))
        dataTemp = tempImageArray[i:stop]
        dataTemp = dataTemp.reshape(dataTemp.shape[0], imageChannels * imageHeight * imageWidth)
        #print("Performing GCN [{0} to {1}]...".format(i, stop))
        dataTemp = GCN(dataTemp)
        #print("Reshaping data again [{0} to {1}]...".format(i, stop))
        dataTemp = dataTemp.reshape(dataTemp.shape[0], imageChannels, imageHeight, imageWidth)
        #print("Updating data with new values [{0} to {1}]...".format(i, stop))
        tempImageArray[i:stop] = dataTemp
        del dataTemp
        gc.collect()
    return tempImageArray

start_time = time.time()
img1_path = "777628-1032-0048.jpg"
img_list = ["images/1.jpg", "images/2.jpg", "images/3.jpg", "images/4.jpg", "images/5.jpg"]
patchWidth = 60
patchHeight = 40
channels = 1
stride = patchWidth/6
multiplier = 1.31

finalImgArray = []
vaw_time = 0
norm_time = 0
array_time = 0

for im_path in img_list:
    start = time.time()
    baseFileWithExt = os.path.basename(im_path)
    baseFile = os.path.splitext(baseFileWithExt)[0]
    img = cv2.imread(im_path, cv2.IMREAD_GRAYSCALE)
    nxtWidth = 800
    nxtHeight = 1200

    patchesList = []
    for i in xrange(7):
        img = cv2.resize(img, (nxtWidth, nxtHeight))
        nxtWidth = int(nxtWidth//multiplier)
        nxtHeight = int(nxtHeight//multiplier)

        patches = view_as_windows(img, (patchHeight, patchWidth), stride)
        cols = patches.shape[0]
        rows = patches.shape[1]
        patchCount = cols*rows
        print "patchCount:",patchCount, " patches.shape:",patches.shape

        returnImageArray = numpy.zeros(shape=(patchCount, channels, patchHeight, patchWidth))
        idx = 0
        for col in xrange(cols):
            for row in xrange(rows):
                patch = patches[col][row]
                imageName = "{0}-patch{1}-{2}.jpg".format(baseFile, i, idx)
                patchCoodrinates = (0, 1, 2, 3) # don't need these for example
                patchesList.append((patch, imageName, patchCoodrinates))

                # ---- Normalization inside 7 iterations <> Part 1
                # imgArray = numpy.asarray(patch[:,:], dtype=numpy.float32)
                # imgArray = patch.astype(numpy.float32)
                # imgArray = imgArray[numpy.newaxis, ...] # Add a new axis for channel so goes from shape [40,60] to [1,40,60]
                # returnImageArray[idx] = imgArray

                idx += 1

        # if i == 0: finalImgArray = returnImageArray
        # else: finalImgArray = numpy.concatenate((finalImgArray, returnImageArray), axis=0)

    vaw_time += time.time() - start

    # ---- Normalization inside 7 iterations <> Part 2
    # start = time.time()
    # normImageArray = NormalizeData(finalImgArray)
    # norm_time += time.time() - start
    # print "returnImageArray.shape:", finalImgArray.shape

    # ---- Normalization outside 7 iterations
    start = time.time()
    imgArray = create_imageArray(patchesList)
    array_time += time.time() - start

    start = time.time()
    normImgArray = NormalizeData(imgArray)
    norm_time += time.time() - start

print "len(patchesList):",len(patchesList)
total_time = (time.time() - start_time)/len(img_list)
print "\npatches_time per img: {0:.3f} s".format(vaw_time/len(img_list))
print "create imgArray per img: {0:.3f} s".format(array_time/len(img_list))
print "normalization_time per img: {0:.3f} s".format(norm_time/len(img_list))
print "total time per image: {0:.3f} s \n".format(total_time)
Here is the GCN code in case you need to download it to use it: http://pastebin.com/RdVMD2P3
Details on the code inside GCN
I am calling GCN with the default params.
At a high level, it takes the average of all of the pixels and divides each pixel by that average. So if an image array looks like [1 2 3], the average is 2; dividing each number by 2 gives [0.5, 1, 1.5]. That's what the normalization does. One thing I forgot to highlight: the mean is taken per patch, mean = X.mean(axis=1).
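For reference, here is a minimal vectorized sketch of just that mean-division step (my own illustration, not the full GCN from the pastebin, which may also subtract the mean and scale by a norm). Flattening the whole batch to (N, C*H*W) and dividing in place normalizes all ~20K patches at once, with no batching loop:
def normalize_fast(imgArray):
    flat = imgArray.reshape(imgArray.shape[0], -1)  # (N, C*H*W) view, no copy
    means = flat.mean(axis=1, keepdims=True)        # one mean per patch
    flat /= means                                   # in place, so no batch loop or gc.collect() needed
    return imgArray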
Notes:
If you are wondering why I re-iterate and create a new imgArray to normalize, instead of doing it during the original patch creation, it is to keep data transfer to a minimum. I am implementing this with the multiprocessing library, and serializing data takes a LOOONG time, so I am trying to keep the data serialization to a minimum (meaning pass as little data back from the process as possible). I have measured the difference between doing it inside the 7 loops and outside, and the numbers are below, so I can deal with that trade-off. However, if you know of a faster implementation, do let me know.
Runtimes for creating imageArray inside 7 loops:
patches_time per img: 0.560 s
normalization_time per img: 0.336 s
total time per image: 0.896 s
Runtimes for creating imageArray and normalizing outside of 7 iterations:
patches_time per img: 0.040 s
create imgArray per img: 0.146 s
normalization_time per img: 0.339 s
total time per image: 0.524 s
I didn't see this before, but it seems creating the array is also taking some time.
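A side note on that: since view_as_windows already returns the patches as one (cols, rows, 40, 60) array, the per-patch Python loop in create_imageArray could probably be replaced by a single reshape/astype per scale. A sketch, untested against the exact code above:
patches = view_as_windows(img, (patchHeight, patchWidth), stride)
# one vectorized copy instead of thousands of per-patch assignments;
# reshape on the strided view forces exactly one contiguous copy
imgArray = patches.reshape(-1, patchHeight, patchWidth)[:, numpy.newaxis, :, :].astype(numpy.float32)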

Related

Dlib training error: error about aspect ratio and area size when all conditions are actually true

I use dlib.train_simple_object_detector to create a detector for steel bars in a bundle. This is my sample:
It has the SAME box for every bar (created via Duplicate RectBox in labelImg), and the size of each box is 122x118 (14396 in area).
This is my training code:
import dlib
import cv2.cv2 as cv2
import os
import time
import sys
from xml.dom import minidom

if len(sys.argv) != 4:
    print("Usage: python train.py /path/to/images/ /path/to/boxes/ /path/to/result.svm")
    print("Images and boxes are named like 1.jpg and 1.xml")
    exit(1)

data = {}
image_indexes = [int(img_name.split(".")[0]) for img_name in os.listdir(sys.argv[1])]
# np.random.shuffle(image_indexes)
image_indexes.sort()

# parse rectangle data
for index in image_indexes:
    if index in [0]:
        continue
    rects = minidom.parse("{}/{}.xml".format(sys.argv[2], index)).getElementsByTagName("bndbox")
    img = cv2.imread(os.path.join(sys.argv[1], str(index) + ".jpg"))
    for rect in rects:
        xmin = int(rect.getElementsByTagName("xmin")[0].firstChild.data)
        xmax = int(rect.getElementsByTagName("xmax")[0].firstChild.data)
        ymin = int(rect.getElementsByTagName("ymin")[0].firstChild.data)
        ymax = int(rect.getElementsByTagName("ymax")[0].firstChild.data)
        dlib_box = dlib.rectangle(left=xmin, top=ymin, right=xmax, bottom=ymax)
        if index in data:
            data[index][1].append(dlib_box)
        else:
            data[index] = (img, [dlib_box])

# train
percent = 0.8
split = int(len(data) * percent)
images = [tuple_value[0] for tuple_value in data.values()]
bounding_boxes = [tuple_value[1] for tuple_value in data.values()]
options = dlib.simple_object_detector_training_options()
options.add_left_right_image_flips = False
options.C = 5
options.num_threads = 16
options.epsilon = 0.01
# options.be_verbose = True
st = time.time()
detector = dlib.train_simple_object_detector(images[:split], bounding_boxes[:split], options)
print("Training complete. Time taken: {:.2f} seconds.".format(time.time() - st))
print("Training Metrics: {}".format(dlib.test_simple_object_detector(images[:split], bounding_boxes[:split], detector)))
detector.save(sys.argv[3])
When I run it with this sample it gives an error:
Error! An impossible set of object boxes was given for training. All the boxes
need to have a similar aspect ratio and also not be smaller than about 400
pixels in area.
But that's not true. They definitely have the same aspect ratio, since they are identical boxes, and they do have an area well over 400 (about 14000, actually). Why does this happen?
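For what it's worth, that claim can be checked programmatically before training; a small hypothetical snippet using dlib.rectangle's width()/height() accessors to print what the trainer will actually see:
for index, (img, boxes) in data.items():
    for box in boxes:
        w, h = box.width(), box.height()
        print(index, 'aspect ratio:', w / h, 'area:', w * h)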

Why converting npy files (containing video frames) to tfrecords consumes too much disk space?

I am working on a violence detection service. I am trying to develop software based on the code in this repo. My dataset consists of videos residing in two directories, "Violence" and "Non-Violence".
I used this code to generate npy files out of the RGB channels and optical flow features. The output of this part is 2 folders containing npy arrays with shape 244x244x5 (np.float32 dtype), so each file has the video frames' RGB in the first 3 channels (npy[..., :3]) and the optical flow features in the next two channels (npy[..., 3:]).
Now I am trying to convert them to tfrecords and use tf.data.TFRecordDataset to speed up the training process. Since my model input has to be a cube tensor, each training element has to be 64 frames of a video, meaning the data point shape has to be 64x244x244x5.
So I used this code to convert the npy files to tfrecords.
from pathlib import Path
from os.path import join
import tensorflow as tf
import numpy as np
import cv2
from tqdm import tqdm

def normalize(data):
    mean = np.mean(data)
    std = np.std(data)
    return (data - mean) / std

def random_flip(video, prob):
    s = np.random.rand()
    if s < prob:
        video = np.flip(m=video, axis=2)
    return video

def color_jitter(video):
    # range of s-component: 0-1
    # range of v-component: 0-255
    s_jitter = np.random.uniform(-0.2, 0.2)
    v_jitter = np.random.uniform(-30, 30)
    for i in range(len(video)):
        hsv = cv2.cvtColor(video[i], cv2.COLOR_RGB2HSV)
        s = hsv[..., 1] + s_jitter
        v = hsv[..., 2] + v_jitter
        s[s < 0] = 0
        s[s > 1] = 1
        v[v < 0] = 0
        v[v > 255] = 255
        hsv[..., 1] = s
        hsv[..., 2] = v
        video[i] = cv2.cvtColor(hsv, cv2.COLOR_HSV2RGB)
    return video

def uniform_sample(video: np.ndarray, target_frames: int = 64) -> np.ndarray:
    """Samples target_frames frames from the video at a uniform interval,
    padding with the last frames if the video is too short."""
    len_frames = int(len(video))  # was len(data), which only worked via the global in __main__
    interval = int(np.ceil(len_frames / target_frames))
    # init empty list for sampled video
    sampled_video = []
    for i in range(0, len_frames, interval):
        sampled_video.append(video[i])
    # calculate number of padded frames and fix it
    num_pad = target_frames - len(sampled_video)
    if num_pad > 0:
        padding = [video[i] for i in range(-num_pad, 0)]
        sampled_video += padding
    return np.array(sampled_video, dtype=np.float32)

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

if __name__ == '__main__':
    path = Path('transformed/')
    npy_files = list(path.rglob('*.npy'))[:100]
    aug = True
    # one_hots = to_categorical(range(2), dtype=np.int8)
    path_to_save = 'data_tfrecords'
    tfrecord_path = join(path_to_save, 'all_data.tfrecord')
    with tf.io.TFRecordWriter(tfrecord_path) as writer:
        for file in tqdm(npy_files, desc='files converted'):
            # load npy files
            npy = np.load(file.as_posix(), mmap_mode='r')
            data = np.float32(npy)
            del npy
            # Uniform sampling
            data = uniform_sample(data, target_frames=64)
            # Add augmentation
            if aug:
                data[..., :3] = color_jitter(data[..., :3])
                data = random_flip(data, prob=0.5)
            # Normalization
            data[..., :3] = normalize(data[..., :3])
            data[..., 3:] = normalize(data[..., 3:])
            # Label one hot encoding
            label = 1 if file.parent.stem.startswith('F') else 0
            # label = one_hots[label]
            feature = {'image': _bytes_feature(tf.compat.as_bytes(data.tobytes())),
                       'label': _int64_feature(int(label))}
            example = tf.train.Example(features=tf.train.Features(feature=feature))
            writer.write(example.SerializeToString())
The code works fine, but the real problem is that it consumes too much disk space. My whole dataset of 2000 videos takes 12 GB; converted to npy files it became around 80 GB, and as tfrecords it is now over 120 GB. How can I convert the data more efficiently to reduce the storage required?
The answer might be too late, but I see you are still saving the raw video frames in your tfrecords file, and that is why the file takes so much space. Try removing the "image" feature from your features dict and saving the frames separately by their height, width, channels, and so forth:
feature = {'label': _int64_feature(int(label))}
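Independent of that answer: the raw float32 tobytes() payload is large by construction (64 x 244 x 244 x 5 x 4 bytes is roughly 76 MB per video before any container overhead), so compression helps a lot. A sketch of my own, using TensorFlow's standard TFRecord compression options (casting to float16, or storing uint8 frames and normalizing at load time, would shrink things further):
# write GZIP-compressed records
options = tf.io.TFRecordOptions(compression_type='GZIP')
with tf.io.TFRecordWriter(tfrecord_path, options=options) as writer:
    ...  # write examples exactly as before
# readers must then pass the same compression type:
dataset = tf.data.TFRecordDataset(tfrecord_path, compression_type='GZIP')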

Python Apply Low Pass Filter On Some Data

I have a .txt file which includes some frame difference of a video.
The project is to remove noise and stabilize a video using these frame differences and a low pass filter.
The Vibrated2.txt file is:
0.341486, -0.258215
0.121945, 1.27605
-0.0811261, 0.78985
-0.0269414, 1.59913
-0.103227, 0.518159
0.274445, 1.69945
, ...
How can I apply a low pass filter on this data?
I tried this, but it didn't work:
import cv2
import numpy as np
from scipy.signal import butter, lfilter
video= cv2.VideoCapture('Vibrated2.avi')
freq = (video.get(cv2.CAP_PROP_FPS))
cutoff = 5
data = np.loadtxt('Vibrated2.txt', delimiter=',')
b, a = butter(5, (cutoff/freq), btype='low', analog=False)
data = lfilter(b, a, data)
Any help? Any idea?
I am not exactly sure how your txt file is structured, but if you want to apply a low pass filter on your frame differencing output, I guess you want to make it binary?
def icv_check_threshold(pixel_value, desired_minimum_value=25):
    # default added: the frame-differencing code below calls this with a
    # single argument, which would raise a TypeError without it
    if pixel_value < desired_minimum_value:
        return False
    else:
        return True
And for the frame differencing:
def icv_pixel_frame_differencing(frame_1, frame_2):
    # first convert frames to numpy arrays to make it easier to work with
    first_frame = np.asarray(frame_1, dtype=np.float32)
    second_frame = np.asarray(frame_2, dtype=np.float32)
    # then compute frame dimensions
    frame_width = int(first_frame[0].size)
    frame_height = int(first_frame.size/frame_width)
    # we then create a stock image for differencing output
    frame_difference = np.zeros((frame_height, frame_width), np.uint8)
    for i in range(0, frame_width - 1):
        for j in range(0, frame_height - 1):
            # compute the absolute difference between the current frame and first frame
            frame_difference[j, i] = abs(first_frame[j, i] - second_frame[j, i])
            # check if the threshold = 25 is satisfied, if not set pixel value to 0, else to 255
            # comment out code below to obtain result without threshold / non-binary
            if icv_check_threshold(frame_difference[j, i]):
                frame_difference[j, i] = 255
            else:
                frame_difference[j, i] = 0
    cv2.imwrite("differenceC.jpg", frame_difference)
    cv2.imwrite("frame50.jpg", first_frame)
    cv2.imwrite("frame51.jpg", second_frame)
    return frame_difference
I hope that this helps. Also here is a link to a project with frame differencing I was working on.
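For the low-pass question itself, two things in the original attempt look off to me: scipy's butter expects the cutoff normalized to the Nyquist frequency (fs/2), not to fs, and lfilter filters along the last axis by default, which here runs across the two columns instead of down the frames. A minimal sketch of what I believe was intended (the frame rate here is an assumption; take it from the video as in the question):
import numpy as np
from scipy.signal import butter, filtfilt

data = np.loadtxt('Vibrated2.txt', delimiter=',')  # shape (n_frames, 2)
fs = 30.0                                          # assumed; use video.get(cv2.CAP_PROP_FPS)
cutoff = 5.0
b, a = butter(5, cutoff / (fs / 2), btype='low')   # Wn is relative to Nyquist
smoothed = filtfilt(b, a, data, axis=0)            # zero-phase filter down each column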

could not broadcast input array from shape (20,310,310) into shape (20)

I'm trying to detect lung cancer nodules using DICOM files. The main steps in cancer detection include the following:
1) Preprocessing
* Converting the pixel values to Hounsfield Units (HU)
* Resampling to an isotropic resolution to remove variance in scanner resolution
* Lung segmentation
2) Training the data set using the preprocessed images in a Tensorflow CNN
3) Testing and validation
I followed a few online tutorials to do this. I need to combine the solutions given in:
1) https://www.kaggle.com/gzuidhof/full-preprocessing-tutorial
2) https://www.kaggle.com/sentdex/first-pass-through-data-w-3d-convnet
I could implement the example in link two, but since it lacks lung segmentation and a few other preprocessing steps, I need to combine the steps in link one with link two. I'm getting a number of errors while doing so. Since I'm new to Python, can someone please help me solve them?
There are 20 patient folders, and each patient folder has a number of slices, which are DICOM files.
For the process_data method, the slices of each patient and the patient number are passed in.
def process_data(slices,patient,labels_df,img_px_size,hm_slices):
    try:
        label=labels_df.get_value(patient,'cancer')
        patient_pixels = get_pixels_hu(slices)
        segmented_lungs2, spacing = resample(patient_pixels, slices, [1,1,1])
        new_slices=[]
        segmented_lung = segment_lung_mask(segmented_lungs2, False)
        segmented_lungs_fill = segment_lung_mask(segmented_lungs2, True)
        segmented_lungs=segmented_lungs_fill-segmented_lung

        #This method returns smallest integer not less than x.
        chunk_sizes =math.ceil(len(segmented_lungs)/HM_SLICES)
        for slice_chunk in chunks(segmented_lungs,chunk_sizes):
            slice_chunk=list(map(mean,zip(*slice_chunk))) #list - []
            #print (slice_chunk)
            new_slices.append(slice_chunk)
        print(len(segmented_lungs), len(new_slices))

        if len(new_slices)==HM_SLICES-1:
            new_slices.append(new_slices[-1])
        if len(new_slices)==HM_SLICES-2:
            new_slices.append(new_slices[-1])
            new_slices.append(new_slices[-1])
        if len(new_slices)==HM_SLICES+2:
            new_val =list(map(mean, zip(*[new_slices[HM_SLICES-1],new_slices[HM_SLICES],])))
            del new_slices[HM_SLICES]
            new_slices[HM_SLICES-1]=new_val
        if len(new_slices)==HM_SLICES+1:
            new_val =list(map(mean, zip(*[new_slices[HM_SLICES-1],new_slices[HM_SLICES],])))
            del new_slices[HM_SLICES]
            new_slices[HM_SLICES-1]=new_val
        print('LENGTH ',len(segmented_lungs), len(new_slices))
    except Exception as e:
        # again, some patients are not labeled, but JIC we still want the error if something
        # else is wrong with our code
        print(str(e))

    #print(len(new_slices))
    if label==1: label=np.array([0,1])
    elif label==0: label=np.array([1,0])
    return np.array(new_slices),label
Main method
# Some constants
#data_dir = '../../CT_SCAN_IMAGE_SET/IMAGES/'
#patients = os.listdir(data_dir)
#labels_df=pd.read_csv('../../CT_SCAN_IMAGE_SET/stage1_labels.csv',index_col=0)
#patients.sort()
#print (labels_df.head())
much_data=[]
much_data2=[]
for num,patient in enumerate(patients):
    if num%100==0:
        print (num)
    try:
        slices = load_scan(data_dir + patients[num])
        img_data,label=process_data(slices,patients[num],labels_df,IMG_PX_SIZE,HM_SLICES)
        much_data.append([img_data,label])
        #much_data2.append([processed,label])
    except:
        print ('This is unlabeled data')
np.save('muchdata-{}-{}-{}.npy'.format(IMG_PX_SIZE,IMG_PX_SIZE,HM_SLICES),much_data)
#np.save('muchdata-{}-{}-{}.npy'.format(IMG_PX_SIZE,IMG_PX_SIZE,HM_SLICES),much_data2)
The preprocessing part works fine, but when I try to feed the final output into a convolutional NN and train the data set, the following is the error I receive (including some of the debug prints I had added):
0
shape hu
(113, 512, 512)
Resize factor
[ 2.49557522 0.6015625 0.6015625 ]
shape
(282, 308, 308)
chunk size
15
282 19
LENGTH 282 20
Tensor("Placeholder:0", dtype=float32)
..........1.........
..........2.........
..........3.........
..........4.........
WARNING:tensorflow:From C:\Research\Python_installation\lib\site-packages\tensorflow\python\util\tf_should_use.py:170: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.
..........5.........
..........6.........
Epoch 1 completed out of 20 loss: 0
..........7.........
Traceback (most recent call last):
File "C:\Research\LungCancerDetaction\sendbox2.py", line 436, in <module>
train_neural_network(x)
File "C:\Research\LungCancerDetaction\sendbox2.py", line 424, in train_neural_network
print('Accuracy:',accuracy.eval({x:[i[0] for i in validation_data], y:[i[1] for i in validation_data]}))
File "C:\Research\Python_installation\lib\site-packages\tensorflow\python\framework\ops.py", line 606, in eval
return _eval_using_default_session(self, feed_dict, self.graph, session)
File "C:\Research\Python_installation\lib\site-packages\tensorflow\python\framework\ops.py", line 3928, in _eval_using_default_session
return session.run(tensors, feed_dict)
File "C:\Research\Python_installation\lib\site-packages\tensorflow\python\client\session.py", line 789, in run
run_metadata_ptr)
File "C:\Research\Python_installation\lib\site-packages\tensorflow\python\client\session.py", line 968, in _run
np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
File "C:\Research\Python_installation\lib\site-packages\numpy\core\numeric.py", line 531, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: could not broadcast input array from shape (20,310,310) into shape (20)
I think the issue is with 'segmented_lungs=segmented_lungs_fill-segmented_lung'.
In the working example,
segmented_lungs=[cv2.resize(each_slice,(IMG_PX_SIZE,IMG_PX_SIZE)) for each_slice in patient_pixels]
Please help me solve this; I've been unable to proceed for some time. If anything is unclear, please let me know.
Following is the whole code that I have tried:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import dicom
import os
import scipy.ndimage
import matplotlib.pyplot as plt
import cv2
import math
import tensorflow as tf
from skimage import measure, morphology
from mpl_toolkits.mplot3d.art3d import Poly3DCollection

# Some constants
data_dir = '../../CT_SCAN_IMAGE_SET/IMAGES/'
patients = os.listdir(data_dir)
labels_df=pd.read_csv('../../CT_SCAN_IMAGE_SET/stage1_labels.csv',index_col=0)
patients.sort()
print (labels_df.head())

#Image pixel array watching
for patient in patients[:10]:
    #label gets the label of the patient. This is what the .get_value method does.
    label=labels_df.get_value(patient,'cancer')
    path=data_dir+patient
    slices = [dicom.read_file(path + '/' + s) for s in os.listdir(path)]
    #You have dicom files and they have attributes.
    slices.sort(key = lambda x: float(x.ImagePositionPatient[2]))
    print (len(slices),slices[0].pixel_array.shape)

#If you need to see many slices and resize the large pixelated 2D images into smaller pixelated images
IMG_PX_SIZE=50
HM_SLICES=20

for patient in patients[:1]:
    #label gets the label of the patient. This is what the .get_value method does.
    label=labels_df.get_value(patient,'cancer')
    path=data_dir+patient
    slices = [dicom.read_file(path + '/' + s) for s in os.listdir(path)]
    #You have dicom files and they have attributes.
    slices.sort(key = lambda x: float(x.ImagePositionPatient[2]))
    #This shows the pixel arrayed image related to the second slice of each patient
    #subplot
    fig=plt.figure()
    for num,each_slice in enumerate(slices[:16]):
        print (num)
        y=fig.add_subplot(4,4,num+1)
        #down sizing everything. Resize the image size as their pixel values are 512*512
        new_image=cv2.resize(np.array(each_slice.pixel_array),(IMG_PX_SIZE,IMG_PX_SIZE))
        y.imshow(new_image)
    plt.show()

print (len(patients))
###################################################################################
def get_pixels_hu(slices):
    image = np.array([s.pixel_array for s in slices])
    # Convert to int16,
    # should be possible as values should always be low enough (<32k)
    image = image.astype(np.int16)
    # Set outside-of-scan pixels to 0
    # The intercept is usually -1024, so air is approximately 0
    image[image == -2000] = 0
    # Convert to Hounsfield units (HU)
    for slice_number in range(len(slices)):
        intercept = slices[slice_number].RescaleIntercept
        slope = slices[slice_number].RescaleSlope
        if slope != 1:
            image[slice_number] = slope * image[slice_number].astype(np.float64)
            image[slice_number] = image[slice_number].astype(np.int16)
        image[slice_number] += np.int16(intercept)
    return np.array(image, dtype=np.int16)

#The next problem is that each patient has a different number of slices. This is a performance issue.
#Take the slices, put them into a list, chunk that list into a fixed number
#of chunks of slices, and average those chunks.

#yield is like 'return'. It returns a generator
def chunks(l,n):
    for i in range(0,len(l),n):
        #print ('Inside yield')
        #print (i)
        yield l[i:i+n]

def mean(l):
    return sum(l)/len(l)

def largest_label_volume(im, bg=-1):
    vals, counts = np.unique(im, return_counts=True)
    counts = counts[vals != bg]
    vals = vals[vals != bg]
    if len(counts) > 0:
        return vals[np.argmax(counts)]
    else:
        return None

def segment_lung_mask(image, fill_lung_structures=True):
    # not actually binary, but 1 and 2.
    # 0 is treated as background, which we do not want
    binary_image = np.array(image > -320, dtype=np.int8)+1
    labels = measure.label(binary_image)
    # Pick the pixel in the very corner to determine which label is air.
    # Improvement: Pick multiple background labels from around the patient
    # More resistant to "trays" on which the patient lays cutting the air
    # around the person in half
    background_label = labels[0,0,0]
    #Fill the air around the person
    binary_image[background_label == labels] = 2
    # Method of filling the lung structures (that is superior to something like
    # morphological closing)
    if fill_lung_structures:
        # For every slice we determine the largest solid structure
        for i, axial_slice in enumerate(binary_image):
            axial_slice = axial_slice - 1
            labeling = measure.label(axial_slice)
            l_max = largest_label_volume(labeling, bg=0)
            if l_max is not None: #This slice contains some lung
                binary_image[i][labeling != l_max] = 1
    binary_image -= 1 #Make the image actual binary
    binary_image = 1-binary_image # Invert it, lungs are now 1
    # Remove other air pockets inside the body
    labels = measure.label(binary_image, background=0)
    l_max = largest_label_volume(labels, bg=0)
    if l_max is not None: # There are air pockets
        binary_image[labels != l_max] = 0
    return binary_image

#Loading the files
#Load the scans in the given folder path
def load_scan(path):
    slices = [dicom.read_file(path + '/' + s) for s in os.listdir(path)]
    slices.sort(key = lambda x: float(x.ImagePositionPatient[2]))
    try:
        slice_thickness = np.abs(slices[0].ImagePositionPatient[2] - slices[1].ImagePositionPatient[2])
    except:
        slice_thickness = np.abs(slices[0].SliceLocation - slices[1].SliceLocation)
    for s in slices:
        s.SliceThickness = slice_thickness
    return slices

def resample(image, scan, new_spacing=[1,1,1]):
    # Determine current pixel spacing
    spacing = np.array([scan[0].SliceThickness] + scan[0].PixelSpacing, dtype=np.float32)
    resize_factor = spacing / new_spacing
    new_real_shape = image.shape * resize_factor
    new_shape = np.round(new_real_shape)
    real_resize_factor = new_shape / image.shape
    new_spacing = spacing / real_resize_factor
    print ('Resize factor')
    print (real_resize_factor)
    image = scipy.ndimage.interpolation.zoom(image, real_resize_factor, mode='nearest')
    print ('shape')
    print (image.shape)
    return image, new_spacing

'''def chunks(l,n):
    for i in range(0,len(l),n):
        #print ('Inside yield')
        #print (i)
        yield l[i:i+n]

def mean(l):
    return sum(l)/len(l)'''
#processing data
def process_data(slices,patient,labels_df,img_px_size,hm_slices):
    #for patient in patients[:10]:
    #label gets the label of the patient. This is what the .get_value method does.
    try:
        label=labels_df.get_value(patient,'cancer')
        print ('label process data')
        print (label)
        #path=data_dir+patient
        #slices = [dicom.read_file(path + '/' + s) for s in os.listdir(path)]
        #You have dicom files and they have attributes.
        slices.sort(key = lambda x: float(x.ImagePositionPatient[2]))
        #This shows the pixel arrayed image related to the second slice of each patient
        patient_pixels = get_pixels_hu(slices)
        print ('shape hu')
        print (patient_pixels.shape)
        segmented_lungs2, spacing = resample(patient_pixels, slices, [1,1,1])
        #print ('Pix shape')
        #print (segmented_lungs2.shape)
        #segmented_lungs=np.array(segmented_lungs2).tolist()
        new_slices=[]
        segmented_lung = segment_lung_mask(segmented_lungs2, False)
        segmented_lungs_fill = segment_lung_mask(segmented_lungs2, True)
        segmented_lungs=segmented_lungs_fill-segmented_lung
        #print ('length of segmented lungs')
        #print (len(segmented_lungs))
        #print ('Shape of segmented lungs......................................')
        #print (segmented_lungs.shape)
        #print ('hiiii')
        #segmented_lungs=[cv2.resize(each_slice,(IMG_PX_SIZE,IMG_PX_SIZE)) for each_slice in segmented_lungs3]
        #print ('bye')
        #print ('length of slices')
        #print (len(slices))
        #print ('shape of slices')
        #print (slices.shape)
        #print (each_slice.pixel_array)

        #This method returns smallest integer not less than x.
        chunk_sizes =math.ceil(len(segmented_lungs)/HM_SLICES)
        print ('chunk size ')
        print (chunk_sizes)
        for slice_chunk in chunks(segmented_lungs,chunk_sizes):
            slice_chunk=list(map(mean,zip(*slice_chunk))) #list - []
            #print (slice_chunk)
            new_slices.append(slice_chunk)
        print(len(segmented_lungs), len(new_slices))

        if len(new_slices)==HM_SLICES-1:
            new_slices.append(new_slices[-1])
        if len(new_slices)==HM_SLICES-2:
            new_slices.append(new_slices[-1])
            new_slices.append(new_slices[-1])
        if len(new_slices)==HM_SLICES-3:
            new_slices.append(new_slices[-1])
            new_slices.append(new_slices[-1])
            new_slices.append(new_slices[-1])
        if len(new_slices)==HM_SLICES+2:
            new_val =list(map(mean, zip(*[new_slices[HM_SLICES-1],new_slices[HM_SLICES],])))
            del new_slices[HM_SLICES]
            new_slices[HM_SLICES-1]=new_val
        if len(new_slices)==HM_SLICES+1:
            new_val =list(map(mean, zip(*[new_slices[HM_SLICES-1],new_slices[HM_SLICES],])))
            del new_slices[HM_SLICES]
            new_slices[HM_SLICES-1]=new_val
        if len(new_slices)==HM_SLICES+3:
            new_val =list(map(mean, zip(*[new_slices[HM_SLICES-1],new_slices[HM_SLICES],])))
            del new_slices[HM_SLICES]
            new_slices[HM_SLICES-1]=new_val
        print('LENGTH ',len(segmented_lungs), len(new_slices))
    except Exception as e:
        # again, some patients are not labeled, but JIC we still want the error if something
        # else is wrong with our code
        print(str(e))

    #print(len(new_slices))
    if label==1: label=np.array([0,1])
    elif label==0: label=np.array([1,0])
    return np.array(new_slices),label
# Some constants
#data_dir = '../../CT_SCAN_IMAGE_SET/IMAGES/'
#patients = os.listdir(data_dir)
#labels_df=pd.read_csv('../../CT_SCAN_IMAGE_SET/stage1_labels.csv',index_col=0)
#patients.sort()
#print (labels_df.head())

much_data=[]
much_data2=[]
for num,patient in enumerate(patients):
    if num%100==0:
        print (num)
    try:
        slices = load_scan(data_dir + patients[num])
        img_data,label=process_data(slices,patients[num],labels_df,IMG_PX_SIZE,HM_SLICES)
        much_data.append([img_data,label])
        #much_data2.append([processed,label])
    except:
        print ('This is unlabeled data')
np.save('muchdata-{}-{}-{}.npy'.format(IMG_PX_SIZE,IMG_PX_SIZE,HM_SLICES),much_data)
#np.save('muchdata-{}-{}-{}.npy'.format(IMG_PX_SIZE,IMG_PX_SIZE,HM_SLICES),much_data2)
IMG_SIZE_PX = 50
SLICE_COUNT = 20
n_classes=2
batch_size=10

x = tf.placeholder('float')
y = tf.placeholder('float')

keep_rate = 0.8

def conv3d(x, W):
    return tf.nn.conv3d(x, W, strides=[1,1,1,1,1], padding='SAME')

def maxpool3d(x):
    # size of window / movement of window as you slide about
    return tf.nn.max_pool3d(x, ksize=[1,2,2,2,1], strides=[1,2,2,2,1], padding='SAME')

def convolutional_neural_network(x):
    # 5 x 5 x 5 patches, 1 channel, 32 features to compute.
    weights = {'W_conv1':tf.Variable(tf.random_normal([3,3,3,1,32])),
               # 5 x 5 x 5 patches, 32 channels, 64 features to compute.
               'W_conv2':tf.Variable(tf.random_normal([3,3,3,32,64])),
               # 64 features
               'W_fc':tf.Variable(tf.random_normal([54080,1024])),
               'out':tf.Variable(tf.random_normal([1024, n_classes]))}

    biases = {'b_conv1':tf.Variable(tf.random_normal([32])),
              'b_conv2':tf.Variable(tf.random_normal([64])),
              'b_fc':tf.Variable(tf.random_normal([1024])),
              'out':tf.Variable(tf.random_normal([n_classes]))}

    #                            image X      image Y      image Z
    x = tf.reshape(x, shape=[-1, IMG_SIZE_PX, IMG_SIZE_PX, SLICE_COUNT, 1])

    conv1 = tf.nn.relu(conv3d(x, weights['W_conv1']) + biases['b_conv1'])
    conv1 = maxpool3d(conv1)

    conv2 = tf.nn.relu(conv3d(conv1, weights['W_conv2']) + biases['b_conv2'])
    conv2 = maxpool3d(conv2)

    fc = tf.reshape(conv2,[-1, 54080])
    fc = tf.nn.relu(tf.matmul(fc, weights['W_fc'])+biases['b_fc'])
    fc = tf.nn.dropout(fc, keep_rate)

    output = tf.matmul(fc, weights['out'])+biases['out']
    return output
much_data = np.load('muchdata-50-50-20.npy')
# If you are working with the basic sample data, use maybe 2 instead of 100 here... you don't have enough data to really do this
train_data = much_data[:-4]
validation_data = much_data[-4:]

def train_neural_network(x):
    print ('..........1.........')
    prediction = convolutional_neural_network(x)
    print ('..........2.........')
    #cost = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(prediction,y) )
    cost = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(logits=prediction,labels=y))
    print ('..........3.........')
    optimizer = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(cost)
    print ('..........4.........')
    hm_epochs = 20
    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())
        successful_runs = 0
        total_runs = 0
        print ('..........5.........')
        for epoch in range(hm_epochs):
            epoch_loss = 0
            for data in train_data:
                total_runs += 1
                try:
                    X = data[0]
                    Y = data[1]
                    _, c = sess.run([optimizer, cost], feed_dict={x: X, y: Y})
                    epoch_loss += c
                    successful_runs += 1
                except Exception as e:
                    # I am passing for the sake of notebook space, but we are getting 1 shaping issue from one
                    # input tensor. Not sure why, will have to look into it. Guessing it's
                    # one of the depths that doesn't come to 20.
                    pass
                    #print(str(e))
            print ('..........6.........')
            print('Epoch', epoch+1, 'completed out of',hm_epochs,'loss:',epoch_loss)
        print ('..........7.........')
        correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
        print('Accuracy:',accuracy.eval({x:[i[0] for i in validation_data], y:[i[1] for i in validation_data]}))
        print('Done. Finishing accuracy:')
        print('Accuracy:',accuracy.eval({x:[i[0] for i in validation_data], y:[i[1] for i in validation_data]}))
        print('fitment percent:',successful_runs/total_runs)
        print (x)

# Run this locally:
train_neural_network(x)
P.S.: the resample() and segment_lung_mask() methods can be found in link 1.
For training you have
for data in train_data:
    total_runs += 1
    try:
        X = data[0]
        Y = data[1]
        _, c = sess.run([optimizer, cost], feed_dict={x: X, y: Y})
So x and y are fed, respectively, the first two elements of a single row of train_data.
However, when calculating the accuracy you have
print('Accuracy:',accuracy.eval({x:[i[0] for i in validation_data], y:[i[1] for i in validation_data]}))
So x is fed the first element of all rows of validation_data, which gives it dimensions of (20,310,310), which can't be broadcast to a placeholder of dimension (20). Ditto for y. (Broadcasting means that if you gave it a tensor of dimensions (20, 310), it would know to take each of the 310 columns and feed it to the placeholder separately. It can't figure out what to do with a tensor of (20, 310, 310).)
Incidentally, when you declare your placeholders it's a good idea to specify their dimensions, using None for the dimension depending on the number of separate examples. This way the program can warn you when dimensions don't match up.
The error message seems to indicate that the placeholder tensors x and y have not been defined correctly. They should have the same shape as the input values X = data[0] and Y = data[1], such as
x = tf.placeholder(shape=[20,310,310], dtype=tf.float32)
# if y is a scalar:
y = tf.placeholder(shape=[], dtype=tf.float32)
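Combining the two answers, a sketch that keeps the per-example shape explicit while leaving the batch axis as None, assuming each example really is a (20, 310, 310) volume and y is a length-2 one-hot label:
x = tf.placeholder(tf.float32, shape=[None, 20, 310, 310])
y = tf.placeholder(tf.float32, shape=[None, 2])
With these shapes, TensorFlow raises a clear shape error at feed time instead of the broadcast failure above.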

Bottleneck for Numpy-based Recursive Function to Undo Max Pooling

I've devised a recursive function to handle a specific problem within the deep learning community. It seems to work quickly and well for most cases, but then takes ~20 minutes for other cases for seemingly no reason. The function, in the simplest case, can be abstracted as simply numpy's "repeat" function on two axes. Here's the code I used to test this function:
import time
import numpy as np  # imports added: the snippet uses both but did not show them

def recursive_upsample(fMap, index, dims):
    if index == 0:
        return fMap
    else:
        start = time.time()
        upscale = np.zeros((dims[index-1][0],dims[index-1][1],fMap.shape[-1]))
        if dims[index-1][0] % 2 == 1 and dims[index-1][1] % 2 == 1:
            crop = fMap[:fMap.shape[0]-1,:fMap.shape[1]-1]
            consX = fMap[-1,:][:-1]
            consY = fMap[:,-1][:-1]
            corner = fMap[-1,-1]
            crop = crop.repeat(2, axis=0).repeat(2, axis=1)
            upscale[:crop.shape[0],:crop.shape[1]] = crop
            upscale[-1,:][:-1] = consX.repeat(2,axis=0)
            upscale[:,-1][:-1] = consY.repeat(2,axis=0)
            upscale[-1,-1] = corner
        elif dims[index-1][0] % 2 == 1:
            crop = fMap[:fMap.shape[0]-1]
            consX = fMap[-1:,]
            crop = crop.repeat(2, axis=0).repeat(2, axis=1)
            upscale[:crop.shape[0]] = crop
            upscale[-1:,] = consX.repeat(2,axis=1)
        elif dims[index-1][1] % 2 == 1:
            crop = fMap[:,:fMap.shape[1]-1]
            consY = fMap[:,-1]
            crop = crop.repeat(2, axis=0).repeat(2, axis=1)
            upscale[:,:crop.shape[1]] = crop
            upscale[:,-1] = consY.repeat(2,axis=0)
        else:
            upscale = fMap.repeat(2, axis=0).repeat(2, axis=1)
        print('Upscaling from {} to {} took {} seconds'.format(fMap.shape,upscale.shape,time.time() - start))
        fMap = upscale
        return recursive_upsample(fMap,index-1,dims)

if __name__ == '__main__':
    dims = [(634,1020,64),(317,510,128),(159,255,256),(80,128,512),(40,64,512)]
    images = []
    for dim in dims:
        image = np.random.rand(dim[0],dim[1],dim[2])
        images.append(image)
    start = time.time()
    upsampled = []
    for index,image in enumerate(images):
        upsampled.append(recursive_upsample(image,index,dims))
    print('Upsampling took {} seconds'.format(time.time() - start))
For some odd reason, the step in the recursion where the feature map that started at shape (40,64,512) is upsampled from (317,510,512) to (634,1020,512) takes an egregious 941 seconds! I'm starting to rewrite this code with Theano, but should I be looking for some underlying problem in my code? My reasoning right now is that computing this on the CPU is unwieldy, but I'm not sure what the holdup is with such a simple function. Any tips on how to make this function faster would also be appreciated!
There's no need to do the recursion. E.g. for the (40,64,512) image you can directly do:
upsampled = image.repeat(16, axis=0).repeat(16, axis=1)[:634,:1020]
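The same one-liner covers every level of the dims list in the question, since level i sits 2**i poolings below full resolution; a sketch:
upsampled = []
for index, image in enumerate(images):
    factor = 2 ** index
    up = image.repeat(factor, axis=0).repeat(factor, axis=1)[:dims[0][0], :dims[0][1]]
    upsampled.append(up)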
