I have a lot of images (pydicom files). I would like to divide in half. From 1 image, I would like 2 images: part left and part right.
Input: 1000x1000
Output: 500x1000 (width x height).
Currently, I can only read a file.
ds = pydicom.read_file(image_fps[0]) # read dicom image from filepath
First part, I would like to put half in one folder and the other half to second.
This is what I have:
enter image description here
This is what I want:
enter image description here
I use Mask-RCNN to object localization problem. I would like crop 50% of image size (pydicom file).
EDIT1:
import SimpleITK as sitk
filtered_image = sitk.GetImageFromArray(left_part)
sitk.WriteImage(filtered_image, '/home/wojtek/Mask/nnna.dcm', True)
I have dicom file, but I can't display it.
this transfer syntax JPEG 2000 Image Compression (Lossless Only), can not be read because Pillow lacks the jpeg 2000 decoder plugin
Once you have executed pydicom.dcm_read() your pixel data is available at ds.pixel_array. You can just slice the data you want and save it with any suitable library. In this example I will be using matplotlib as I also use that for verifying whether my slicing is correct. Adjust to your needs obviously, one thing you need to do is generate the correct path/filenames for saving. Have fun!
(this script assumes the filepaths are available in a paths variable)
import pydicom
import matplotlib
# for testing if the slice is correct
from matplotlib import pyplot as plt
for path in paths:
# read the dicom file
ds = pydicom.dcmread(path)
# find the shape of your pixel data
shape = ds.pixel_array.shape
# get the half of the x dimension. For the y dimension use shape[0]
half_x = int(shape[1] / 2)
# slice the halves
# [first_axis, second_axis] so [:,:half_x] means slice all from first axis, slice 0 to half_x from second axis
left_part = ds.pixel_array[:, :half_x]
right_part = ds.pixel_array[:,half_x:]
# to check whether the slices are correct, matplotlib can be convenient
# plt.imshow(left_part); do not do this in the loop
# save the files, see the documentation for matplotlib if you want a different format
# bmp, png are surely supported
path_to_left_image = 'generate\the\path\and\filename\for\the\left\image.bmp'
path_to_right_image = 'generate\the\path\and\filename\for\the\right\image.bmp'
matplotlib.image.imsave(path_to_left_image, left_part)
matplotlib.image.imsave(path_to_right_image, right_part)
If you want to save the DICOM files keep in mind that they may not be valid DICOM if you do not update the appropriate data. For instance the SOP Instance UID is technically not allowed to be the same as in the original DICOM file, or any other SOP Instance UID for that matter. How important that is, is up to you.
With a script like below you can define named slices and split any dicom image file it finds in the supplied path into the appropriate slices.
import os
import pydicom
import numpy as np
def save_partials(parts, path_to_directory):
"""
parts: list of tuples, each tuple specifying a name and a list of four slice offsets
path_to_directory: path to directory containing dicom files
any file with a .dcm extension will have its image data split into the specified slices and saved accordingly.
original file will not be modified
"""
dir_content = [os.path.join(path_to_directory, item) for item in os.listdir(path_to_directory)]
files = [i for i in dir_content if os.path.isfile(os.path.join(path_to_directory, i))]
for file in files:
root, extension = os.path.splitext(file)
if extension.lower() != '.dcm':
# not a .dcm file, continue with next iteration of loop
continue
for part in parts:
ds = pydicom.read_file(file)
if not isinstance(ds.pixel_array, np.ndarray):
# no image data available
continue
part_name = part[0]
p = part[1] # slice list
ds.PixelData = ds.pixel_array[p[0]:p[1], p[2]:p[3]].tobytes()
ds.Rows = p[1] - p[0]
ds.Columns = p[3] - p[2]
##
## Here you can modify any tags using ds.KeyWord
##
new_file_name = "{r}-{pn}{ext}".format(r=root, pn=part_name, ext=extension)
ds.save_as(new_file_name)
print('saved {}'.format(new_file_name))
dir_path = '/home/wojtek/Mask'
parts = [('left', [0,512,0,256]),
('right', [0,512,256,512])]
save_partials(parts, dir_path)
Related
I have cobbled together some code on python, to try and work through a folder of dicom files, splitting each image in two.
All my dicom files are X-rays of both the left and right feet, and I need to separate them.
To do this I am adapting some code produced by #g_unit seen here
Unfortunately - this attempt results in two unaltered copies of the original file - unsplit. It does work when writing the files as PNG or JPG, but not when writing as dicoms. My test image in the console also looks good.
In my below example, I am using a folder with only one file in it. I will adapt to write the new files and filenames after I get my single sample to work.
import matplotlib.pyplot as plt
import pydicom
import pydicom as pd
import os
def main():
path = 'C:/.../test_block_out/'
# iterate through the names of contents of the folder
for file in os.listdir(path):
# create the full input path and read the file
input_path = os.path.join(path, file)
dataset = pd.dcmread(input_path)
shape = dataset.pixel_array.shape
# get the half of the x dimension. For the y dimension use shape[0]
half_x = int(shape[1] / 2)
# slice the halves
# [first_axis, second_axis] so [:,:half_x] means slice all from first axis, slice 0 to half_x from second axis
left_part = dataset.pixel_array[:, :half_x].tobytes()
right_part = dataset.pixel_array[:,half_x:].tobytes()
#Save halves
path_to_left_image = 'C:.../test_file/left.dcm'
path_to_right_image = 'C:.../test_file/right.dcm'
dataset.save_as(path_to_left_image, left_part)
dataset.save_as(path_to_right_image, right_part)
#print test image
plt.imshow(dataset.pixel_array[:, :half_x])
#plt.imshow(dataset.pixel_array[:,half_x:])
if __name__ == '__main__':
main()
I have tried to write the pixel array to dataset.PixelData - but this throws the error:
ValueError: The length of the pixel data in the dataset (5120000 bytes) doesn't match the expected length (10240000 bytes). The dataset may be corrupted or there may be an issue with the pixel data handler.
Which makes sense, since its half my original dimensions. It will write a DCM, but I cannot load this DCM into any dicom viewer tools ('Decode error!')
Is there a way to get this to write the files as DCMs, not PNGs? Or will the DCMs always bug if the dimensions are incorrect?
A kind colleague has helped by providing the answer.
The issue was that I was saving "dataset", not "left_part".
The solution was to create a new pydicom object , deep copying the dcm file, and then modifying the copy.
Code below:
# iterate through the names of contents of the folder
for file in os.listdir(path):
# create the full input path and read the file
input_path = os.path.join(path, file)
dataset = pd.dcmread(input_path)
left_part = copy.deepcopy(dataset)
right_part = copy.deepcopy(dataset)
shape = dataset.pixel_array.shape
# get the half of the x dimension. For the y dimension use shape[0]
half_x = int(shape[1] / 2)
# slice the halves
# [first_axis, second_axis] so [:,:half_x] means slice all from first axis, slice 0 to half_x from second axis
left_part.PixelData = dataset.pixel_array[:, :half_x].tobytes()
left_part['Columns'].value=half_x
right_part.PixelData = dataset.pixel_array[:,half_x:].tobytes()
right_part['Columns'].value=shape[1]-half_x
#Save halves
path_to_left_image = os.path.join(path, 'left_'+file)
path_to_right_image = os.path.join(path, 'right_'+file)
left_part.save_as(path_to_left_image)
right_part.save_as(path_to_right_image)
#print test image
plt.imshow(left_part.pixel_array)
plt.show()
I am using the sliding window technic to an image and i am extracting the mean values of pixels of each one window. So the results are someting like this [[[[215.015625][123.55036272][111.66057478]]]].now the question is how could i save all these values for every one window into a txt file or at a CSV because i want to use them for further compare similarities? whatever i tried the error is same..that it is a 4D array and not an 1D or 2D. I ll appreciate any help really.! Thank you in advance
import cv2
import matplotlib.pyplot as plt
import numpy as np
# read the image and define the stepSize and window size
# (width,height)
image2 = cv2.imread("bird.jpg")# your image path
image = cv2.resize(image2, (224, 224))
tmp = image # for drawing a rectangle
stepSize = 10
(w_width, w_height) = (60, 60 ) # window size
for x in range(0, image.shape[1] - w_width, stepSize):
for y in range(0, image.shape[0] - w_height, stepSize):
window = image[x:x + w_width, y:y + w_height, :]
# classify content of the window with your classifier and
# determine if the window includes an object (cell) or not
# draw window on image
cv2.rectangle(tmp, (x, y), (x + w_width, y + w_height), (255, 0, 0), 2) # draw rectangle on image
plt.imshow(np.array(tmp).astype('uint8'))
# show all windows
plt.show()
mean_values=[]
mean_val, std_dev = cv2.meanStdDev(image)
mean_val = mean_val[:3]
mean_values.append([mean_val])
mean_values = np.asarray(mean_values)
print(mean_values)
Human Readable Option
Assuming that you want the data to be human readable, saving the data takes a little bit more work. My search showed me that there's this solution for saving 3D data to a text file. However, it's pretty simple to extend this example to 4D for your use case. This code is taken and adapted from that post, thank you Joe Kington and David Cheung.
import numpy as np
data = np.arange(2*3*4*5).reshape((2,3,4,5))
with open('test.csv', 'w') as outfile:
# We write this header for readable, the pound symbol
# will cause numpy to ignore it
outfile.write('# Array shape: {0}\n'.format(data.shape))
# Iterating through a ndimensional array produces slices along
# the last axis. This is equivalent to data[i,:,:] in this case.
# Because we are dealing with 4D data instead of 3D data,
# we need to add another for loop that's nested inside of the
# previous one.
for threeD_data_slice in data:
for twoD_data_slice in threeD_data_slice:
# The formatting string indicates that I'm writing out
# the values in left-justified columns 7 characters in width
# with 2 decimal places.
np.savetxt(outfile, twoD_data_slice, fmt='%-7.2f')
# Writing out a break to indicate different slices...
outfile.write('# New slice\n')
And then once the data has been saved all you need to do is load it and reshape it (np.load()) will default to reading in the data as a 2D array but np.reshape() will allow us to recover the structure. Again, this code is adapted from the previous post.
new_data = np.loadtxt('test.csv')
# Note that this returned a 2D array!
print(new_data.shape)
# However, going back to 3D is easy if we know the
# original shape of the array
new_data = new_data.reshape((2,3,4,5))
# Just to check that they're the same...
assert np.all(new_data == data)
Binary Option
Assuming that human readability is not necessary, I would recommend using the built-in *.npy format which is described here. This stores the data in a binary format.
You can save the array by doing np.save('NAME_OF_ARRAY.npy', ARRAY_TO_BE_SAVED) and then load it with SAVED_ARRAY = np.load('NAME_OF_ARRAY.npy').
You can also save several numpy array in a single zip file with the np.savez() function like so np.savez('MANY_ARRAYS.npz', ARRAY_ONE, ARRAY_TWO). And you load the zipped arrays in a similar fashion SEVERAL_ARRAYS = np.load('MANY_ARRAYS.npz').
I have the following snippet:
from aicspylibczi import CziFile
from pathlib import Path
pth = Path('/Volumes/USB/20x_HE.czi')
czi = CziFile(pth)
image, shp = czi.read_image(C=0, M=0) # very slow
The parameters C und M are there to slice the big array in to little numpy pieces.
The File is 3,4GB big and it is taking to long(with 8GB RAM Macbook) so I abort it always.
I think thats not okay because I want to have the first slice of the array, not the whole matrix.
You can try slideio python package (http://slideio.com). It makes use of internal image pyramids. You can read the image partially with high resolution or the whole image with low resolution.
The code below rescales the image so that the width of the delivered raster will be 500 pixels (the height is computed to keep the image size ratio).
import slideio
slide = slideio.open_slidei(file_path="/data/a.czi",driver_id="CZI")
scene = slide.get_scene(0)
block = scene.read_block(size=(500,0))
By slice do you mean the first slice of a z-stack? The package you are using, aicspylibczi, allows you to specify a z coordinate e.g. to read the first z-slice:
image, shp = czi.read_image(C=0, M=0, Z=0)
To best understand, please reproduce the code in a Jupyternotebook:
I have two files: img.jpg and img.txt. Img.jpg is the image and img.txt is the face landmarks....If you plot them both, it will look like this:
I rotated the image by 24.5 degree....but how to do I also rotate the coordinates?
import cv2
img = cv2.imread('img.jpg')
plt.imshow(img)
plt.show()
# In[130]:
landmarks = []
with open('img.txt') as f:
for line in f:
landmarks.extend([float(number) for number in line.split()])
landmarks.pop(0) #Remove first line.
#Store all points inside the variable.
landmarkPoints = [] #Store the points in this
for j in range(int(len(landmarks))):
if j%2 == 1:
continue
landmarkPoints.append([int(landmarks[j]),int(landmarks[j+1])])
# In[ ]:
def rotate_bound(image, angle):
# grab the dimensions of the image and then determine the
# center
(h, w) = image.shape[:2]
(cX, cY) = (w // 2, h // 2)
# grab the rotation matrix (applying the negative of the
# angle to rotate clockwise), then grab the sine and cosine
# (i.e., the rotation components of the matrix)
M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
cos = np.abs(M[0, 0])
sin = np.abs(M[0, 1])
# compute the new bounding dimensions of the image
nW = int((h * sin) + (w * cos))
nH = int((h * cos) + (w * sin))
# adjust the rotation matrix to take into account translation
M[0, 2] += (nW / 2) - cX
M[1, 2] += (nH / 2) - cY
# perform the actual rotation and return the image
return cv2.warpAffine(image, M, (nW, nH))
# In[131]:
imgcopy = img.copy()
for i in range(len(landmarkPoints)):
cv2.circle(imgcopy, (landmarkPoints[i][0], landmarkPoints[i][1]), 5, (0, 255, 0), -1)
plt.imshow(imgcopy)
plt.show()
landmarkPoints
# In[146]:
print(img.shape)
print(rotatedImage.shape)
# In[153]:
face_angle = 24.5
rotatedImage = rotate_bound(img, -face_angle)
for i in range(len(landmarkPoints)):
x,y = (landmarkPoints[i][0], landmarkPoints[i][1])
cv2.circle(rotatedImage, (int(x),int(y)), 5, (0, 255, 0), -1)
plt.imshow(rotatedImage)
plt.show()
Please download img.jpg and img.txt for reproducing this: https://drive.google.com/file/d/1FhQUFvoKi3t7TrIepx2Es0mBGAfT755w/view?usp=sharing
I tried this function, but y-axis is wrong
def rotatePoint(angle, pt):
a = np.radians(angle)
cosa = np.cos(a)
sina = np.sin(a)
return pt[0]*cosa - pt[1]*sina, pt[0] * sina + pt[1] * cosa
Edit: The above function gives me this result:
Although it has been long time since the question was asked. But I have decided to answer it as it has no accepted answer yet, even if it is a well accepted question. I have added a lot of comments to make the implementation clear. So, the code is hopefully self-explanatory. But I am also describing the ImageAugmentation's parameters for further clarification:
Here, original_data_dir is the directory to the parent folder, where all of the image's folders exists (yes it can read from multiple image folders). This parameter is compulsory.
augmentation_data_dir is the folder directory where you want to save the outputs. The program will automatically create all sub-folders inside of the output directory just like they appear in input directory. It is totally optional, it can generate the output directory by mimicking the input directory by appending the string _augmentation after the input folder name.
keep_original is another optional parameter. In many cases you may want to keep the original image with the augmented images in the output folder. If you want so, make it True (default).
num_of_augmentations_per_image is the total number of augmented images to be generated from each image. Although you wanted only rotation, but this program is designed to do other augmentations as well, change them, add or remove them as you need. I have also added a link to documentation where you will find other augmentations which can be introduced here in this code. It is defaulted to 3, if you keep the original image, there will be 3 + 1 = 4 images will be generated in the output.
discard_overflow_and_underflow is for handling the case where due to spatial transformation, the augmented points along with the image underneath can go outside of image's resolution, you can optionally keep them. But it is discarded here by default. Again, it will also discard images having width or height values <= 0. Defaulted to True.
put_landmarks means if you want the landmarks to be shown in the output. Make it True or False as required. It is False by default.
Hope you like it!
import logging
import imgaug as ia
import imgaug.augmenters as iaa
from imgaug.augmentables import Keypoint
from imgaug.augmentables import KeypointsOnImage
import os
import cv2
import re
SEED = 31 # To reproduce the result
class ImageAugmentation:
def __init__(self, original_data_dir, augmentation_data_dir = None, keep_original = True, num_of_augmentations_per_image = 3, discard_overflow_and_underflow = True, put_landmarks = False):
self.original_data_dir = original_data_dir
if augmentation_data_dir != None:
self.augmentation_data_dir = augmentation_data_dir
else:
self.augmentation_data_dir = self.original_data_dir + '_augmentation'
# Most of the time you will want to keep the original images along with the augmented images
self.keep_original = keep_original
# For example for self.num_of_augmentations_per_image = 3, from 1 image we will get 3 more images, totaling 4 images.
self.num_of_augmentations_per_image = num_of_augmentations_per_image
# if discard_overflow_and_underflow is True, the program will discard all augmentation where landmark (and image underneath) goes outside of image resolution
self.discard_overflow_and_underflow = discard_overflow_and_underflow
# Optionally put landmarks on output images
self.put_landmarks = put_landmarks
def get_base_annotations(self):
"""This method reads all the annotation files (.txt) and make a list
of annotations to be used by other methods.
"""
# base_annotations are the annotations which has come with the original images.
base_annotations = []
def get_info(content):
"""This utility function reads the content of a single annotation
file and returns the count of total number of points and a list of coordinates
of the points inside a dictionary.
As you have provided in your question, the annotation file looks like the following:
106
282.000000 292.000000
270.000000 311.000000
259.000000 330.000000
.....
.....
Here, the first line is the number of points.
The second and the following lines gives their coordinates.
"""
# As all the lines newline separated, hence splitting them
# accordingly first
lines = content.split('\n')
# The first line is the total count of the point, we can easily get it just by counting the points
# so we are not taking this information.
# From the second line to the end all lines are basically the coordinate values
# of each point (in each line). So, going to each of the lines (from the second line)
# and taking the coordinates as tuples.
# We will end up with a list of tuples and which will be inserted to the dict "info"
# under the key "point_coordinates"
points = []
for line in lines[1:]:
# Now each of the line can be splitted into two numbers representing coordinates
try:
# Keeping inside try block, as some of the lines might be accidentally contain
# a single number, or it can be the case that there might be some extra newlines
# where there is no number.
col, row = line.split(' ')
points.append((float(col), float(row)))
except:
pass
# Returns: List of tuples
return points
for subdir, dirs, files in os.walk(self.original_data_dir):
for file in files:
ext = os.path.splitext(file)[-1].lower()
# Looping through image files (instead of annotation files which are in '.txt' format)
# because image files can have very different extensions and we have to preserve them.
# Whereas, all the annotation files are assumed to be in '.txt' format.
# Annotation file's (.txt) directory will be generated from here.
if ext not in ['.txt']:
input_image_file_dir = os.path.join(subdir, file)
# As the image filenames and associated annotation text filenames are the same,
# so getting the common portion of them, it will be used to generate the annotation
# file's directory.
# Also assuming, there are no dots (.) in the input_annotation_file_dir except before the file extension.
image_annotation_base_dir = self.split_extension(input_image_file_dir)[0]
# Generating annotation file's directory
input_annotation_file_dir = image_annotation_base_dir + '.txt'
try:
with open(input_annotation_file_dir, 'r') as f:
content = f.read()
image_annotation_base_dir = os.path.splitext(input_annotation_file_dir)[0]
if os.path.isfile(input_image_file_dir):
image = cv2.imread(input_image_file_dir)
# Taking image's shape is basically surving dual purposes.
# First of all, we will need the image's shape for sanity checking after augmentation
# Again, if any of the input image is corrupt this following line will through exception
# and we will be able to skip that corrput image.
image_shape = image.shape # height (y), width (x), channels (depth)
# Collecting the directories of original annotation files and their contents.
# The same folder structure will be used to save the augmented data.
# As the image filenames and associated annotation text filenames are the same, so
base_annotations.append({'image_file_dir': input_image_file_dir,
'annotation_data': get_info(content = content),
'image_resolution': image_shape})
except:
logging.error(f"Unable to read the file: {input_annotation_file_dir}...SKIPPED")
return base_annotations
def get_augmentation(self, base_annotation, seed):
image_file_dir = base_annotation['image_file_dir']
image_resolution = base_annotation['image_resolution']
list_of_coordinates = base_annotation['annotation_data']
ia.seed(seed)
# We have to provide the landmarks in specific format as imgaug requires
landmarks = []
for coordinate in list_of_coordinates:
# coordinate[0] is along x axis (horizontal axis) and coordinate[1] is along y axis (vertical axis) and (left, top) corner is (0, 0)
landmarks.append(Keypoint(x = coordinate[0], y = coordinate[1]))
landmarks_on_original_img = KeypointsOnImage(landmarks, shape = image_resolution)
original_image = cv2.imread(image_file_dir)
"""
Here the magic happens. If you only want rotation then remove other transformations from here.
You can even add other various types of augmentation, see documentation here:
# Documentation for image augmentation with keypoints
https://imgaug.readthedocs.io/en/latest/source/examples_keypoints.html
# Here you will find other possible transformations
https://imgaug.readthedocs.io/en/latest/source/examples_basics.html
"""
seq = iaa.Sequential([
iaa.Affine(
scale={"x": (0.8, 1.2), "y": (0.8, 1.2)}, # scale images to 80-120% of their size, individually per axis
translate_percent={"x": (-0.2, 0.2), "y": (-0.2, 0.2)}, # translate by -20 to +20 percent (per axis)
rotate=(-90, 90), # rotate by -90 to +90 degrees; for specific angle (say 30 degree) use rotate = (30)
shear=(-16, 16), # shear by -16 to +16 degrees
)
], random_order=True) # Apply augmentations in random order
augmented_image, _landmarks_on_augmented_img = seq(image = original_image, keypoints = landmarks_on_original_img)
# Now for maintaining consistency, making the augmented landmarks to maintain same data structure like base_annotation
# i.e, making it a list of tuples.
landmarks_on_augmented_img = []
for index in range(len(landmarks_on_original_img)):
landmarks_on_augmented_img.append((_landmarks_on_augmented_img[index].x,
_landmarks_on_augmented_img[index].y))
return augmented_image, landmarks_on_augmented_img
def split_extension(self, path):
# Assuming there is no dots (.) except just before extension
# Returns [directory_of_file_without_extension, extension]
return os.path.splitext(path)
def sanity_check(self, landmarks_aug, image_resolution):
# Returns false if the landmark is outside of image resolution.
# Or, if the resolution is faulty.
for index in range(len(landmarks_aug)):
if landmarks_aug[index][0] < 0 or landmarks_aug[index][1] < 0:
return False
if landmarks_aug[index][0] >= image_resolution[1] or landmarks_aug[index][1] >= image_resolution[0]:
return False
if image_resolution[0] <= 0:
return False
if image_resolution[1] <= 0:
return False
return True
def serialize(self, serialization_data, image):
"""This method to write the annotation file and the corresponding image.
"""
# Now it is time to actually writing the image file and the annotation file!
# We have to make sure the output folder exists
# and "head" is the folder's directory here.
image_file_dir = serialization_data['image_file_dir']
annotation_file_dir = self.split_extension(image_file_dir)[0] + '.txt'
point_coordinates = serialization_data['annotation_data'] # List of tuples
total_points = len(point_coordinates)
# Getting the corresponding output folder for current image
head, tail = os.path.split(image_file_dir)
# Creating the folder if it doesn't exist
if not os.path.isdir(head):
os.makedirs(head)
# Writing annotation file
with open(annotation_file_dir, 'w') as f:
s = ""
s += str(total_points)
s += '\n'
for point in point_coordinates:
s += "{:.6f}".format(point[0]) + ' ' + "{:6f}".format(point[1]) + '\n'
f.write(s)
if self.put_landmarks:
# Optionally put landmarks in the output images.
for index in range(total_points):
cv2.circle(image, (int(point_coordinates[index][0]), int(point_coordinates[index][1])), 2, (255, 255, 0), 2)
cv2.imwrite(image_file_dir, image)
def augmentat_with_landmarks(self):
base_annotations = self.get_base_annotations()
for base_annotation in base_annotations:
if self.keep_original == True:
# As we are basically copying the same original data in new directory, changing the original image's directory with the new one with re.sub()
base_data = {'image_file_dir': re.sub(self.original_data_dir, self.augmentation_data_dir, base_annotation['image_file_dir']),
'annotation_data': base_annotation['annotation_data']}
self.serialize(serialization_data = base_data, image = cv2.imread(base_annotation['image_file_dir']))
for index in range(self.num_of_augmentations_per_image):
# Getting a new augmented image in each iteration from the same base image.
# Seeding (SEED) for reproducing same result across all execution in the future.
# Also seed must be different for each iteration, otherwise same looking augmentation will be generated.
image_aug, landmarks_aug = self.get_augmentation(base_annotation, seed = SEED + index)
# As for spatial transformations for some images, the landmarks can go outside of the image.
# So, we have to discard those cases (optionally).
if self.sanity_check(landmarks_aug, base_annotation['image_resolution']) or not self.discard_overflow_and_underflow:
# Getting the filename without extension to insert an index number in between to generate a new filename for augmented image
filepath_without_ext, ext = self.split_extension(base_annotation['image_file_dir'])
# As we are writing newly generated images to similar sub folders (just in different base directory)
# that is replacing original_data_dir with augmentation_data_dir.
# So, to do this we are using, re.sub(what_to_replace, with_which_to_replace, from_where_to_replace)
filepath_for_aug_img_without_ext = re.sub(self.original_data_dir, self.augmentation_data_dir, filepath_without_ext)
new_filepath_wo_ext = filepath_for_aug_img_without_ext + '_' + str(index)
augmentation_data = {
'image_file_dir': new_filepath_wo_ext + ext,
'annotation_data': landmarks_aug
}
self.serialize(serialization_data = augmentation_data, image = image_aug)
# Make put_landmarks = False if you do not want landmarks to be shown in output
# original_data_dir is the single parent folder directory inside of which all image folder(s) exist.
img_aug = ImageAugmentation(original_data_dir = 'parent/folder/directory/of/img/folder', put_landmarks = True)
img_aug.augmentat_with_landmarks()
Following is a snapshot of sample output of the code:
Please note that, I have used a package imgaug. I will suggest you to install the 0.4.0 version, as I have found it to be working. See the reason here and it's accepted answer.
When you try things like that it's very important to choose the proper coordinate system. In your case you have to put the origin (0,0) point in the center of the image.
Once you apply the rotation to the coordinates with the origin point in the center, the face points will be properly aligned on the new image.
I am trying to work on an efficient numpy solution to perform a running average of an array of color images across the 4th dimension. A set of color images in a directory is read in a loop and I would like to average in subsets of 3. ie. If there are n = 5 color images in the directory I would like to average [1,2,3],[2,3,4], [3,4,5], [4,5,1], and [5,1,2] thus writing 5 output average images.
from os import listdir
from os.path import isfile, join
import numpy as np
import cv2
from matplotlib import pyplot as plt
mypath = 'C:/path/to/5_image/dir'
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]
img = np.empty(len(onlyfiles), dtype=object)
temp = np.zeros((960, 1280, 3, 3), dtype='uint8')
temp_avg = np.zeros((960, 1280, 3), dtype='uint8')
for n in range(0, len(onlyfiles)):
img[n] = cv2.imread(join(mypath, onlyfiles[n]))
for n in range(0, len(img)):
if (n+2) < len(img)-1:
temp[:, :, :, 0] = img[n]
temp[:, :, :, 1] = img[n + 1]
temp[:, :, :, 2] = img[n + 2]
temp_avg = np.mean(temp,axis=3)
plt.imshow(temp_avg)
plt.show()
else:
break
This script is in no way complete or elegant. The issues i am having is while plotting the average images the color space seems distorted and appears like CMKY. I am not accounting for the last two moving windows [4,5,1] and [5,1,2]. Critique and suggestions welcome.
For performing local operations (such as a running average) across the pixels of an image (or across multiple images), convolution with a kernel is usually a good approach.
Here's how this could be done in your case.
Generating Some Example Data
I used the following to generate 10 images containing random noise to work with:
for i in range(10):
an_img = np.random.randint(0, 256, (960,1280,3))
cv2.imwrite("img_"+str(i)+".png", an_img)
Preparing the Images
This is how I load the images back in:
# Get target file names
mypath = os.getcwd() # or whatever path you like
fnames = [f for f in listdir(mypath) if f.endswith('.png')]
# Create an array to hold all the images
first_img = cv2.imread(join(mypath, fnames[0]))
y,x,c = first_img.shape
all_imgs = np.empty((len(fnames),y,x,c), dtype=np.uint8)
# Load all the images
for i,fname in enumerate(fnames):
all_imgs[i,...] = cv2.imread(join(mypath, fnames[i]))
Some notes:
I use f.endswith('.png') to be a bit more specific with how I generate the list of filenames, allowing other files to be in the same directory without causing problems.
I place all of the images in a single 4D uint8 array of shape (image,y,x,c) instead of the object array you were using. This is necessary to employ the convolution approach below.
I use the first image to get the dimensions of the images, which makes the code just a little bit more general.
Performing Local Averaging by Kernel Convolution
This is all it takes.
from scipy.ndimage import uniform_filter
done = uniform_filter(all_imgs, size=(3,0,0,0), origin=-1, mode='wrap')
Some notes:
I am using scipy.ndimage because it readily allows for its convolution filters to be applied to images with many dimensions (4 in your case). For cv2, I am only aware of cv2.filter2D, which does not have that functionality as far as I know. However, I am not very familiar with cv2, so I may be wrong about this (will edit if someone corrects me in a comment).
The size kwarg specifies the size of the kernel to use along each dimension of the array. By using (3,0,0,0), I make sure that only the first dimension (=the different images) is used for the averaging.
By default, the running window (or rather the kernel) is used to compute the value of its central pixel. To match this more closely with your code, I used origin=-1, so the kernel computes the value of the pixel one to the left of its center.
By default, the edge cases (the two last images in this case) are handled by padding with a reflection. Your question suggests that what you want is to use the first images again instead. This is done using mode='wrap'.
By default, the filter returns the result in the same dtype as the input, here np.uint8. This is probably desirable, but your example code produces floats, so perhaps you want the filter to return floats as well, which you can do by simply changing the dtype of the input, i.e. done = uniform_filter(all_imgs.astype(np.float), size....
As for the distorted color space when you plot your averages; I cannot reproduce that. Your approach seems to produce the correct output for my random noise example images (after correction of the issue I pointed out in my comment to your question). Perhaps you could try plt.imshow(temp_avg, interpolation='none') to avoid possible artefacting from imshow's interpolation?