I would like to create two pydicom files from one, but I can't save the result in *.dcm format with annotations.
import pydicom
from pydicom.data import get_testdata_files
# read the dicom file
ds = pydicom.dcmread(test_image_fps[0])
# find the shape of your pixel data
shape = ds.pixel_array.shape
# get the half of the x dimension. For the y dimension use shape[0]
half_x = int(shape[1] / 2)
# slice the halves
# [first_axis, second_axis] so [:,:half_x] means slice all from first axis, slice 0 to half_x from second axis
data = ds.pixel_array[:, :half_x]
print('The image has {} x {}'.format(data.shape[0],
                                     data.shape[1]))
# print the image information given in the dataset
print(data)
data.save_as("/my/path/after.dcm")
AttributeError: 'numpy.ndarray' object has no attribute 'save_as'
Information on this can be found in the pydicom documentation.
A remark on "your" ;) code: data = ds.pixel_array[:, :half_x] assigns a view of the numpy.ndarray ds.pixel_array to data. Calling data.save_as() fails, as expected, because save_as is a method of the dataset ds, not of the array data. As the documentation describes, you need to write the modified pixels back to the ds.PixelData attribute like so:
ds.PixelData = data.tobytes() # where data is a numpy.ndarray or a view of a numpy.ndarray
# if the shape of your pixel data changes, ds.Rows and ds.Columns must be updated,
# otherwise later calls to ds.pixel_array will fail
ds.Rows, ds.Columns = data.shape # (rows, columns) of a single-frame grayscale image
ds.save_as("/my/path/after.dcm")
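To see why Rows and Columns have to follow the cropped array: for uncompressed data, the flat PixelData bytes are essentially reshaped using those two values, so the byte count has to match. Here is a minimal NumPy-only sketch of that relationship, using a small made-up array in place of real DICOM pixels (the names full, left, raw and restored are illustrative and not part of pydicom):

```python
import numpy as np

# Stand-in for ds.pixel_array: 4 rows x 6 columns of 16-bit pixels (made-up data).
full = np.arange(4 * 6, dtype=np.uint16).reshape(4, 6)

# Left half: a non-contiguous view into `full`.
left = full[:, : full.shape[1] // 2]

# tobytes() copies the view into a contiguous, row-major byte string.
raw = left.tobytes()
rows, cols = left.shape

# The byte count must equal rows * cols * bytes-per-pixel, which is why
# ds.Rows and ds.Columns have to be set to the shape of the cropped array.
assert len(raw) == rows * cols * left.dtype.itemsize

# Reading the bytes back with the new shape recovers the cropped image.
restored = np.frombuffer(raw, dtype=left.dtype).reshape(rows, cols)
```

If the original file uses a compressed transfer syntax, writing raw bytes like this probably also requires switching the dataset to an uncompressed transfer syntax; that step is not shown here.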
I am applying the sliding window technique to an image and extracting the mean pixel values of each window. The results look something like this: [[[[215.015625][123.55036272][111.66057478]]]]. How can I save these values for every window to a text or CSV file, so that I can later compare them for similarity? Whatever I try, I get the same error: the array is 4D, not 1D or 2D. I would really appreciate any help. Thank you in advance.
import cv2
import matplotlib.pyplot as plt
import numpy as np
# read the image and define the stepSize and window size
# (width,height)
image2 = cv2.imread("bird.jpg")# your image path
image = cv2.resize(image2, (224, 224))
tmp = image # for drawing a rectangle
stepSize = 10
(w_width, w_height) = (60, 60 ) # window size
for x in range(0, image.shape[1] - w_width, stepSize):
    for y in range(0, image.shape[0] - w_height, stepSize):
        # NumPy images are indexed [row, column], i.e. [y, x]
        window = image[y:y + w_height, x:x + w_width, :]
        # classify content of the window with your classifier and
        # determine if the window includes an object (cell) or not
        # draw window on image
        cv2.rectangle(tmp, (x, y), (x + w_width, y + w_height), (255, 0, 0), 2) # draw rectangle on image
        plt.imshow(np.array(tmp).astype('uint8'))
# show all windows
plt.show()
mean_values=[]
mean_val, std_dev = cv2.meanStdDev(image)
mean_val = mean_val[:3]
mean_values.append([mean_val])
mean_values = np.asarray(mean_values)
print(mean_values)
Human Readable Option
Assuming that you want the data to be human readable, saving it takes a little more work. My search turned up this solution for saving 3D data to a text file, and it is simple to extend that example to 4D for your use case. The code below is taken and adapted from that post; thank you, Joe Kington and David Cheung.
import numpy as np
data = np.arange(2*3*4*5).reshape((2,3,4,5))
with open('test.csv', 'w') as outfile:
    # We write this header for readability; the pound symbol
    # will cause numpy to ignore it
    outfile.write('# Array shape: {0}\n'.format(data.shape))
    # Iterating through an n-dimensional array produces slices along
    # the first axis. This is equivalent to data[i,:,:,:] in this case.
    # Because we are dealing with 4D data instead of 3D data,
    # we need to add another for loop that's nested inside of the
    # previous one.
    for threeD_data_slice in data:
        for twoD_data_slice in threeD_data_slice:
            # The formatting string indicates that I'm writing out
            # the values in left-justified columns 7 characters in width
            # with 2 decimal places.
            np.savetxt(outfile, twoD_data_slice, fmt='%-7.2f')
            # Writing out a break to indicate different slices...
            outfile.write('# New slice\n')
And then once the data has been saved, all you need to do is load it and reshape it. np.loadtxt() will read the data back as a 2D array, but np.reshape() lets us recover the original structure. Again, this code is adapted from the previous post.
new_data = np.loadtxt('test.csv')
# Note that this returned a 2D array!
print(new_data.shape)
# However, going back to 4D is easy if we know the
# original shape of the array
new_data = new_data.reshape((2,3,4,5))
# Just to check that they're the same...
assert np.all(new_data == data)
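If the slice markers are not needed, a shorter variant is to collapse all leading axes into one before writing. The following is a sketch under that assumption, not the approach from the linked post; the file name flat.csv and the temporary directory are made up for illustration:

```python
import os
import tempfile

import numpy as np

data = np.arange(2 * 3 * 4 * 5, dtype=float).reshape((2, 3, 4, 5))

# Hypothetical output location, used only for this example.
path = os.path.join(tempfile.mkdtemp(), "flat.csv")

# Collapse every axis except the last one: (2, 3, 4, 5) -> (24, 5).
flat = data.reshape(-1, data.shape[-1])

# savetxt prefixes the header with '# ', so loadtxt will skip it.
np.savetxt(path, flat, delimiter=",", header="shape: {}".format(data.shape))

# loadtxt returns the 2D table; reshape restores the 4D structure.
loaded = np.loadtxt(path, delimiter=",")
restored = loaded.reshape(data.shape)
```

The original shape still has to be known when loading, either from the header comment or from the code that produced the file.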
Binary Option
Assuming that human readability is not necessary, I would recommend using the built-in *.npy format which is described here. This stores the data in a binary format.
You can save the array by doing np.save('NAME_OF_ARRAY.npy', ARRAY_TO_BE_SAVED) and then load it with SAVED_ARRAY = np.load('NAME_OF_ARRAY.npy').
You can also save several NumPy arrays in a single zip file with the np.savez() function, like so: np.savez('MANY_ARRAYS.npz', ARRAY_ONE, ARRAY_TWO). You load the zipped arrays in a similar fashion: SEVERAL_ARRAYS = np.load('MANY_ARRAYS.npz'). Arrays saved positionally are then available as SEVERAL_ARRAYS['arr_0'], SEVERAL_ARRAYS['arr_1'], and so on.
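A small round trip may make the two binary options concrete. The arrays, file names and keyword names (means, stds) below are made up for illustration; passing keyword arguments to np.savez is one way to get readable keys instead of the default positional ones:

```python
import os
import tempfile

import numpy as np

tmp_dir = tempfile.mkdtemp()  # hypothetical scratch directory

# Made-up 4D arrays standing in for per-window statistics.
means = np.random.rand(3, 1, 3, 1)
stds = np.random.rand(3, 1, 3, 1)

# Single array: .npy keeps dtype and shape, so no reshape is needed on load.
npy_path = os.path.join(tmp_dir, "means.npy")
np.save(npy_path, means)
means_back = np.load(npy_path)

# Several arrays: keyword names become the keys inside the .npz archive.
npz_path = os.path.join(tmp_dir, "stats.npz")
np.savez(npz_path, means=means, stds=stds)
with np.load(npz_path) as archive:
    names = sorted(archive.files)
    stds_back = archive["stds"]
```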
Following this formula for alpha blending two color values, I wish to apply it to n numpy arrays of RGBA image data (though in practice the expected use case will have a very low upper bound on the number of arrays, probably < 5). In context, this process will be constrained to arrays of identical shape.
I could in theory achieve this through iteration, but expect that this would be computationally intensive and terribly inefficient.
What is the most efficient way to apply a function between two elements in the same position between two arrays across the entire array?
A loose example:
# in context, the numpy arrays come from here, as either numpy data in the
# first place or a path
def import_data(source):
    # first test for an extant numpy array
    try:
        assert(type(source) is np.ndarray)
        data = source
    except AssertionError:
        try:
            exists(source)
            data = add_alpha_channel(np.array(Image.open(source)))
        except IOError:
            raise IOError("Cannot identify image data in file '{0}'".format(source))
        except TypeError:
            raise TypeError("Cannot identify image data from source.")
    return data

# and here is the in-progress method that will, in theory, composite the stack of
# arrays; in context this is a bit more elaborate; self.width & height are just what
# they appear to be: the final size of the composited output of all layers
def render(self):
    render_surface = np.zeros((self.height, self.width, 4))
    for l in self.__layers:
        foreground = l.render() # basically this just returns an np array
        # the next four lines just find the regions between two layers to
        # be composited
        l_x1, l_y1 = l.origin
        l_x2 = l_x1 + foreground.shape[1]
        l_y2 = l_y1 + foreground.shape[0]
        background = render_surface[l_y1: l_y2, l_x1: l_x2]
        # at this point, foreground & background contain two identically shaped
        # arrays to be composited; next line is where the function i'm seeking
        # ought to go
        render_surface[l_y1: l_y2, l_x1: l_x2] = ?
Starting with these two RGBA images:
I implemented the formula you linked to and came up with this:
#!/usr/local/bin/python3
from PIL import Image
import numpy as np
# Open input images, and make Numpy array versions
src = Image.open("a.png")
dst = Image.open("b.png")
nsrc = np.array(src, dtype=float)  # the np.float alias was removed in NumPy 1.24
ndst = np.array(dst, dtype=float)
# Extract the RGB channels
srcRGB = nsrc[...,:3]
dstRGB = ndst[...,:3]
# Extract the alpha channels and normalise to range 0..1
srcA = nsrc[...,3]/255.0
dstA = ndst[...,3]/255.0
# Work out resultant alpha channel
outA = srcA + dstA*(1-srcA)
# Work out resultant RGB
outRGB = (srcRGB*srcA[...,np.newaxis] + dstRGB*dstA[...,np.newaxis]*(1-srcA[...,np.newaxis])) / outA[...,np.newaxis]
# Merge RGB and alpha (scaled back up to 0..255) back into single image
outRGBA = np.dstack((outRGB,outA*255)).astype(np.uint8)
# Make into a PIL Image, just to save it
Image.fromarray(outRGBA).save('result.png')
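Since the question asks about n layers, the same formula can be wrapped in a function and folded over a list. This is a sketch rather than a tuned implementation: the names alpha_over and composite are made up here, inputs are assumed to be equally shaped RGBA arrays with values in 0..255, and pixels whose resulting alpha is 0 are given RGB 0 to avoid dividing by zero.

```python
from functools import reduce

import numpy as np


def alpha_over(src, dst):
    """Composite RGBA `src` over RGBA `dst` (values in 0..255); returns floats."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Keep a trailing axis of length 1 so the alphas broadcast over RGB.
    src_a = src[..., 3:4] / 255.0
    dst_a = dst[..., 3:4] / 255.0
    out_a = src_a + dst_a * (1.0 - src_a)
    premult = src[..., :3] * src_a + dst[..., :3] * dst_a * (1.0 - src_a)
    # Divide only where the result is not fully transparent.
    out_rgb = np.divide(premult, out_a, out=np.zeros_like(premult), where=out_a > 0)
    return np.concatenate((out_rgb, out_a * 255.0), axis=-1)


def composite(layers):
    """Blend a list of layers, first element at the bottom."""
    return reduce(lambda below, above: alpha_over(above, below), layers)
```

The result is a float array; something like composite(layers).astype(np.uint8) would convert it back for saving. Every step is a whole-array operation, so only the loop over the handful of layers runs in Python.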
Output image
I'm trying to adjust values of images in a pandas dataframe.
Each row of the dataframe (images) holds an image of shape (7,7,3): 7x7 pixels and 3 colours.
When I try to adjust the top left pixel of the first image as shown below, all other images (rows) are affected as well.
print(images.loc[0,'image'][0][0], images.loc[1,'image'][0][0])
images.loc[0,'image'][0][0]=[1,2,3]
print(images.loc[0,'image'][0][0], images.loc[1,'image'][0][0])
[0,0,0] [0,0,0]
[1,2,3] [1,2,3]
This only happens when I adjust a single pixel.
If I edit the image in its entirety, the other images/rows are not affected.
images[0,'image']=[image]
does work properly
Added MCVE:
import numpy as np
import pandas as pd
images = pd.DataFrame(columns=['image'])
image = np.zeros([2, 2, 2])
images.loc[0, 'image'] = image
images = pd.concat([images] * 2)
images = images.reset_index(drop=True)
print(images.loc[0, 'image'][0][0], '\n')
images.loc[0, 'image'][0][0] = [1, 1]
print(images.loc[0, 'image'][0][0], images.loc[1, 'image'][0][0])
The problem is in the lines
image=np.zeros([2,2,2])
and
images=pd.concat([images]*2)
You create a single numpy object. This object is referenced twice in the final dataframe. To illustrate, if you explicitly make a copy of the object, the problem disappears:
import copy
images=pd.DataFrame(columns=['image'])
image=np.zeros([2,2,2])
images.loc[0,'image']=image
images=pd.concat([copy.deepcopy(images), copy.deepcopy(images)]) # explicitly duplicate the object to avoid reference to the same object
images=images.reset_index(drop=True)
print(images.loc[0,'image'][0][0],'\n')
images.loc[0,'image'][0][0]=[1,1]
print(images.loc[0,'image'][0][0],images.loc[1,'image'][0][0])
Edit: to address your comment about how to create many copies, you could try:
images = [np.zeros([2,2,2]) for lv in range(10000)] # create list containing independent instances of numpy arrays
images = pd.Series(images, index = range(10000))
images = images.to_frame('image')
images # should now be a dataframe containing independent numpy arrays in its 'image' column.
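The difference between a shared reference and independent copies can be checked directly. The following sketch uses made-up names (template, shared, independent) and builds the frames from a Series of arrays, as in the snippet above:

```python
import numpy as np
import pandas as pd

template = np.zeros((2, 2, 2))

# The same array object is stored in both rows.
shared = pd.Series([template, template]).to_frame("image")

# Each row gets its own copy of the array.
independent = pd.Series([template.copy() for _ in range(2)]).to_frame("image")

# Rows of `shared` point at the same memory; rows of `independent` do not.
shared_rows_alias = np.shares_memory(shared.loc[0, "image"], shared.loc[1, "image"])
independent_rows_alias = np.shares_memory(
    independent.loc[0, "image"], independent.loc[1, "image"]
)

# Editing one pixel in row 0 of `independent` leaves row 1 untouched.
independent.loc[0, "image"][0][0] = [1, 1]
```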
I have a lot of images (DICOM files read with pydicom). I would like to divide each one in half: from 1 image, I would like 2 images, a left part and a right part.
Input: 1000x1000
Output: 500x1000 (width x height).
Currently, I can only read a file.
ds = pydicom.read_file(image_fps[0]) # read dicom image from filepath
As a first step, I would like to save the left halves in one folder and the right halves in a second one.
This is what I have:
[image not preserved]
This is what I want:
[image not preserved]
I use Mask-RCNN for an object localization problem. I would like to crop the image (pydicom file) to 50% of its size.
EDIT1:
import SimpleITK as sitk
filtered_image = sitk.GetImageFromArray(left_part)
sitk.WriteImage(filtered_image, '/home/wojtek/Mask/nnna.dcm', True)
I now have a DICOM file, but I can't display it:
this transfer syntax JPEG 2000 Image Compression (Lossless Only), can not be read because Pillow lacks the jpeg 2000 decoder plugin
Once you have executed pydicom.dcmread(), your pixel data is available at ds.pixel_array. You can just slice the data you want and save it with any suitable library. In this example I will be using matplotlib, as I also use it to verify that my slicing is correct. Adjust to your needs; one thing you do need to do is generate the correct paths/filenames for saving. Have fun!
(this script assumes the filepaths are available in a paths variable)
import pydicom
import matplotlib.image
# for testing if the slice is correct
from matplotlib import pyplot as plt
for path in paths:
    # read the dicom file
    ds = pydicom.dcmread(path)
    # find the shape of your pixel data
    shape = ds.pixel_array.shape
    # get the half of the x dimension. For the y dimension use shape[0]
    half_x = shape[1] // 2
    # slice the halves
    # [first_axis, second_axis] so [:,:half_x] means slice all from first axis, slice 0 to half_x from second axis
    left_part = ds.pixel_array[:, :half_x]
    right_part = ds.pixel_array[:, half_x:]
    # to check whether the slices are correct, matplotlib can be convenient
    # plt.imshow(left_part); do not do this in the loop
    # save the files, see the documentation for matplotlib if you want a different format
    # bmp, png are surely supported
    # raw strings keep backslashes such as \t and \r from being read as escape sequences
    path_to_left_image = r'generate\the\path\and\filename\for\the\left\image.bmp'
    path_to_right_image = r'generate\the\path\and\filename\for\the\right\image.bmp'
    matplotlib.image.imsave(path_to_left_image, left_part)
    matplotlib.image.imsave(path_to_right_image, right_part)
If you want to save the DICOM files keep in mind that they may not be valid DICOM if you do not update the appropriate data. For instance the SOP Instance UID is technically not allowed to be the same as in the original DICOM file, or any other SOP Instance UID for that matter. How important that is, is up to you.
With a script like below you can define named slices and split any dicom image file it finds in the supplied path into the appropriate slices.
import os
import pydicom
import numpy as np
def save_partials(parts, path_to_directory):
    """
    parts: list of tuples, each tuple specifying a name and a list of four slice offsets
    path_to_directory: path to directory containing dicom files
    any file with a .dcm extension will have its image data split into the specified slices and saved accordingly.
    original file will not be modified
    """
    dir_content = [os.path.join(path_to_directory, item) for item in os.listdir(path_to_directory)]
    # dir_content already holds full paths, so test them directly
    files = [i for i in dir_content if os.path.isfile(i)]
    for file in files:
        root, extension = os.path.splitext(file)
        if extension.lower() != '.dcm':
            # not a .dcm file, continue with next iteration of loop
            continue
        for part in parts:
            # re-read the file for every part so each crop starts from the original pixels
            ds = pydicom.dcmread(file)
            if not isinstance(ds.pixel_array, np.ndarray):
                # no image data available
                continue
            part_name = part[0]
            p = part[1] # slice list: [row_start, row_stop, column_start, column_stop]
            ds.PixelData = ds.pixel_array[p[0]:p[1], p[2]:p[3]].tobytes()
            ds.Rows = p[1] - p[0]
            ds.Columns = p[3] - p[2]
            ##
            ## Here you can modify any tags using ds.KeyWord
            ##
            new_file_name = "{r}-{pn}{ext}".format(r=root, pn=part_name, ext=extension)
            ds.save_as(new_file_name)
            print('saved {}'.format(new_file_name))

dir_path = '/home/wojtek/Mask'
parts = [('left', [0,512,0,256]),
         ('right', [0,512,256,512])]
save_partials(parts, dir_path)
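The parts list above hard-codes a 512x512 image. For other sizes, such as the 1000x1000 images in the question, the offsets could be derived from the image dimensions instead. The helper below is a hypothetical addition, not part of the script above; it only builds the list in the format save_partials expects:

```python
def half_parts(rows, columns):
    """Return the left/right slice definitions for an image of the given size.

    Each entry is (name, [row_start, row_stop, column_start, column_stop]).
    """
    mid = columns // 2
    return [
        ("left", [0, rows, 0, mid]),
        ("right", [0, rows, mid, columns]),
    ]
```

For example, half_parts(ds.Rows, ds.Columns) could be computed from one representative file, assuming all files in the directory share the same dimensions.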
I am trying to create a training data file which is structured as follows:
[Rows = Samples, Columns = features]
So if I have 100 samples and 2 features the shape of my np.array would be (100,2) etc.
Data
The list below contains the path strings to the .nrrd 3D sample patch-data files which have been processed using method 01.
['/Users/FK/Documents/image/01/subject1F_200.nrrd',
'/Users/FK/Documents/image/01/subject2F_201.nrrd']
Let's call the directory dir_01.
For testing purposes the following 3D patch can be used. It has the same shape as the .nrrd file when read:
subject1F_200_PP01 = np.random.rand(128,128, 128)
subject1F_201_PP01 = np.random.rand(128,128, 128)
# and so on...
The list below contains the path strings to the .nrrd 3D sample patch-data files which have been processed using method 02.
['/Users/FK/Documents/image/02/subject1F_200.nrrd',
'/Users/FK/Documents/image/02/subject2F_201.nrrd']
Let's call the directory dir_02.
For testing purposes the following 3D patch can be used. It has the same shape as the .nrrd file when read:
subject1F_200_PP02 = np.random.rand(128,128, 128)
subject1F_201_PP02 = np.random.rand(128,128, 128)
# and so on...
The subjects are the same in both directories, but the patch data has been pre-processed differently.
Feature Functions
In order to calculate the features I need to use the following functions:
np.median (regular Python function; returns a single value)
my_own_function1 (regular Python function; returns a np.array)
my_own_function2 (only accessible through a MATLAB engine; returns a np.array)
In this scenario my final numpy array should have a (2,251) shape, since I have two samples (rows) and 251 features (columns) from my 3 functions.
Here is my code (credits to M.Fabré)
Read the patches
# Helps me read the files for features 1. and 2. Uses a python .nrrd reader
def read_patches_multi1(files_1):
    for file_1 in files_1:
        yield nrrd.read(str(file_1))
# Helps me read the files for features 3. Uses a matlab .nrrd reader
def read_patches_multi2(files_2):
    for file_2 in files_2:
        yield eng.nrrdread(str(file_2))
Calculate
def parse_patch_multi(patch1, patch2):
    # Structure for python .nrrd reader
    data_1 , option = patch1
    # Structure for matlab .nrrd reader
    data_2 = patch2
    # Uses itertools to combine single float32 value with np.array values
    return [i for i in itertools.chain(np.median(data_1), my_own_function1(data_1), my_own_function2(data_2))]
Execution
# Directories
dir_01 = '/Users/FK/Documents/image/01/'
dir_02 = '/Users/FK/Documents/image/02/'
# Method 01 patch data
file_dir_1 = Path(dir_01)
files_1 = file_dir_1.glob('*.nrrd')
patches_1 = read_patches_multi1(files_1)
# Method 02 patch data
file_dir_2 = Path(dir_02)
files_2 = file_dir_2.glob('*.nrrd')
patches_2 = read_patches_multi2(files_2)
# I think the error lies here...
training_file_multi = np.array([parse_patch_multi(patch1,patch2) for (patch1, patch2) in (patches_1, patches_2)], dtype=np.float32)
I have tried multiple approaches but I keep getting a syntax error or the wrong structure, or the following type error:
TypeError: unsupported Python data type: numpy.ndarray
I found a solution, but it does not seem too elegant.
I create two functions:
def parse_patch_multi1(patch1):
    # Structure for python .nrrd reader
    data_1 , option = patch1
    # Uses itertools to combine single float32 value with np.array values
    return [i for i in itertools.chain([np.median(data_1), 0], my_own_function1(data_1))]
def parse_patch_multi2(patch2):
    # Structure for matlab .nrrd reader
    data_2 = patch2
    # Uses itertools to flatten the np.array values into a list
    return [i for i in itertools.chain(my_own_function2(data_2))]
Execution
# Directories
dir_01 = '/Users/FK/Documents/image/01/'
dir_02 = '/Users/FK/Documents/image/02/'
# Method 01 patch data
file_dir_1 = Path(dir_01)
files_1 = file_dir_1.glob('*.nrrd')
patches_1 = read_patches_multi1(files_1)
# Method 02 patch data
file_dir_2 = Path(dir_02)
files_2 = file_dir_2.glob('*.nrrd')
patches_2 = read_patches_multi2(files_2)
training_file_multi1 = np.array([parse_patch_multi1(patch1) for patch1 in patches_1], dtype=np.float32)
training_file_multi2 = np.array([parse_patch_multi2(patch2) for patch2 in patches_2], dtype=np.float32)
The trick
Concatenate the two np.arrays along axis 1:
training_file_combined = np.concatenate((training_file_multi1, training_file_multi2), axis=1)
Shape of the matrix: (2,252)
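A single-pass variant is also possible. Two details in the original attempt stand out: np.median returns a scalar, which itertools.chain cannot iterate over, and looping over the tuple (patches_1, patches_2) does not pair the patches, whereas zip does. The sketch below uses hypothetical stand-ins (feature_a, feature_b) for my_own_function1 and my_own_function2 and small random volumes in place of the .nrrd data. It does not involve the MATLAB engine, so it may not address the TypeError quoted above if that error originates there.

```python
import numpy as np


def feature_a(volume):
    """Hypothetical stand-in for my_own_function1: returns a 1D array."""
    return volume.mean(axis=(1, 2))[:3]


def feature_b(volume):
    """Hypothetical stand-in for my_own_function2: returns a 1D array."""
    return volume.std(axis=(1, 2))[:2]


def feature_row(volume_1, volume_2):
    """Build one sample row from the two differently pre-processed volumes."""
    return np.concatenate([
        np.atleast_1d(np.median(volume_1)),  # scalar -> array of length 1
        np.ravel(feature_a(volume_1)),
        np.ravel(feature_b(volume_2)),
    ])


# Made-up generators standing in for read_patches_multi1 / read_patches_multi2.
rng = np.random.default_rng(0)
patches_1 = (rng.random((8, 8, 8)) for _ in range(2))
patches_2 = (rng.random((8, 8, 8)) for _ in range(2))

# zip pairs the n-th patch of method 01 with the n-th patch of method 02.
training = np.array(
    [feature_row(p1, p2) for p1, p2 in zip(patches_1, patches_2)],
    dtype=np.float32,
)
```

Here each row has 1 + 3 + 2 = 6 features; with the real functions the column count would be whatever they return. Pairing by position assumes both directories list the same subjects in the same order, so sorting the file lists first would be a sensible precaution.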