I am trying to create a JPEG-compressed DICOM image using pydicom. Good source material on color DICOM images can be found here, but it's mostly theory and C++. In the code example below I create a pale blue ellipse inside output-raw.dcm (uncompressed), which looks fine:
import io
from PIL import Image, ImageDraw
from pydicom.dataset import Dataset
from pydicom.uid import generate_uid, JPEGExtended
from pydicom._storage_sopclass_uids import SecondaryCaptureImageStorage
WIDTH = 100
HEIGHT = 100
def ensure_even(stream):
    # Very important for some viewers
    if len(stream) % 2:
        return stream + b"\x00"
    return stream

def bob_ross_magic():
    image = Image.new("RGB", (WIDTH, HEIGHT), color="red")
    draw = ImageDraw.Draw(image)
    draw.rectangle([10, 10, 90, 90], fill="black")
    draw.ellipse([30, 20, 70, 80], fill="cyan")
    draw.text((11, 11), "Hello", fill=(255, 255, 0))
    return image

ds = Dataset()
ds.is_little_endian = True
ds.is_implicit_VR = True
ds.SOPClassUID = SecondaryCaptureImageStorage
ds.SOPInstanceUID = generate_uid()
ds.Modality = "OT"
ds.SamplesPerPixel = 3
ds.BitsAllocated = 8
ds.BitsStored = 8
ds.HighBit = 7
ds.PixelRepresentation = 0
ds.PhotometricInterpretation = "RGB"
ds.Rows = HEIGHT
ds.Columns = WIDTH
image = bob_ross_magic()
ds.PixelData = ensure_even(image.tobytes())
ds.save_as("output-raw.dcm", write_like_original=False) # File is OK
# Create compressed image
output = io.BytesIO()
image.save(output, format="JPEG")
ds.PixelData = ensure_even(output.getvalue())
ds.PhotometricInterpretation = "YBR_FULL_422"
ds.file_meta.TransferSyntaxUID = JPEGExtended
ds.save_as("output-jpeg.dcm", write_like_original=False) # File is corrupt
At the very end I am trying to create compressed DICOM: I tried setting various transfer syntaxes, compressions with PIL, but no luck. I believe the generated DICOM file is corrupt. If I were to convert the raw DICOM file to JPEG compressed with gdcm-tools:
$ gdcmconv -J output-raw.dcm output-jpeg.dcm
By doing a dcmdump on this converted file we could see an interesting structure, which I don't know how to reproduce using pydicom:
$ dcmdump output-jpeg.dcm
# Dicom-File-Format
# Dicom-Meta-Information-Header
# Used TransferSyntax: Little Endian Explicit
(0002,0000) UL 240 # 4, 1 FileMetaInformationGroupLength
(0002,0001) OB 00\01 # 2, 1 FileMetaInformationVersion
(0002,0002) UI =SecondaryCaptureImageStorage # 26, 1 MediaStorageSOPClassUID
(0002,0003) UI [1.2.826.0.1.3680043.8.498.57577581978474188964358168197934098358] # 64, 1 MediaStorageSOPInstanceUID
(0002,0010) UI =JPEGLossless:Non-hierarchical-1stOrderPrediction # 22, 1 TransferSyntaxUID
(0002,0012) UI [1.2.826.0.1.3680043.2.1143.] # 48, 1 ImplementationClassUID
(0002,0013) SH [GDCM 2.8.4] # 10, 1 ImplementationVersionName
(0002,0016) AE [gdcmconv] # 8, 1 SourceApplicationEntityTitle
# Dicom-Data-Set
# Used TransferSyntax: JPEG Lossless, Non-hierarchical, 1st Order Prediction
... ### How to do the magic below?
(7fe0,0010) OB (PixelSequence #=2) # u/l, 1 PixelData
(fffe,e000) pi (no value available) # 0, 1 Item
(fffe,e000) pi ff\d8\ff\ee\00\0e\41\64\6f\62\65\00\64\00\00\00\00\00\ff\c3\00\11... # 4492, 1 Item
(fffe,e0dd) na (SequenceDelimitationItem) # 0, 0 SequenceDelimitationItem
I tried to use pydicom's encaps module, but I think it's mostly for reading data, not writing. Anyone else have any ideas how to deal with this issue, how to create/encode these PixelSequences? Would love to create JPEG compressed DICOMs in plain Python without running external tools.
DICOM requires that compressed Pixel Data be encapsulated (see the tables especially). Once you have your compressed image data you can use the encaps.encapsulate() function to create bytes suitable for use with Pixel Data:
from pydicom.encaps import encapsulate
# encapsulate() requires a list of bytes, one item per frame
ds.PixelData = encapsulate([ensure_even(output.getvalue())])
# Need to set this flag to indicate the Pixel Data is compressed
ds['PixelData'].is_undefined_length = True # Only needed for < v1.4
ds.PhotometricInterpretation = "YBR_FULL_422"
ds.file_meta.TransferSyntaxUID = JPEGExtended
ds.save_as("output-jpeg.dcm", write_like_original=False)
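For intuition about what encapsulate() produces, the PixelSequence shown in the dcmdump output above can be approximated by hand. This is only an illustrative sketch, not pydicom's implementation: the helper name encapsulate_by_hand is hypothetical, it writes an empty Basic Offset Table item (as in the gdcmconv output), and it assumes one fragment per frame.

```python
import struct

ITEM_TAG = struct.pack("<HH", 0xFFFE, 0xE000)          # (FFFE,E000) Item
SEQ_DELIM = struct.pack("<HHL", 0xFFFE, 0xE0DD, 0)     # (FFFE,E0DD), length 0


def encapsulate_by_hand(frames):
    """Hypothetical helper: wrap each compressed frame in a single item."""
    out = bytearray()
    # First item: Basic Offset Table, left empty in this sketch
    out += ITEM_TAG + struct.pack("<L", 0)
    for frame in frames:
        if len(frame) % 2:
            frame += b"\x00"  # item values must have even length
        out += ITEM_TAG + struct.pack("<L", len(frame)) + frame
    out += SEQ_DELIM
    return bytes(out)


# Stand-in for a JPEG byte stream (three bytes, so it gets padded)
encapsulated = encapsulate_by_hand([b"\xff\xd8\xff"])
```

In practice encapsulate() is the better choice, since it also handles details this sketch leaves out.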
Trying the solution from @scaramallion in more detail, this appears to work:
import numpy as np
from PIL import Image
import io
from pydicom.encaps import encapsulate
# set some parameters
num_frames = 4
img_size = 10
# Create a fake RGB dataset
random_image_array = (np.random.random((num_frames, img_size, img_size, 3))*255).astype('uint8')
# Convert to PIL
imlist = []
for i in range(num_frames): # convert the multiframe image into RGB of single frames (Required for compression)
    imlist.append(Image.fromarray(random_image_array[i]))

# Save the multipage tiff with jpeg compression
f = io.BytesIO()
imlist[0].save(f, format='tiff', append_images=imlist[1:], save_all=True, compression='jpeg')
# The BytesIO object cursor is at the end of the object, so I need to tell it to go back to the front
f.seek(0)
img = Image.open(f)

# Get each one of the frames converted to even numbered bytes
img_byte_list = []
for i in range(num_frames):
    try:
        img.seek(i)  # move to frame i of the multipage TIFF
        with io.BytesIO() as output:
            img.save(output, format='jpeg')
            img_byte_list.append(output.getvalue())
    except EOFError:
        # Not enough frames in img
        break

ds.PixelData = encapsulate([x for x in img_byte_list])
ds['PixelData'].is_undefined_length = True
ds.is_implicit_VR = False
ds.LossyImageCompression = '01'
ds.LossyImageCompressionRatio = 10 # default jpeg
ds.LossyImageCompressionMethod = 'ISO_10918_1'
ds.file_meta.TransferSyntaxUID = '1.2.840.10008.'
ds.save_as("output-jpeg.dcm", write_like_original=False)
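The hard-coded LossyImageCompressionRatio of 10 above is a placeholder. A minimal sketch of estimating it from the data instead, assuming the ratio is taken as uncompressed size divided by compressed size; the helper name and the stand-in frames are hypothetical:

```python
def lossy_compression_ratio(rows, columns, samples_per_pixel, bits_allocated, compressed_frames):
    """Hypothetical helper: uncompressed bytes divided by compressed bytes."""
    bytes_per_frame = rows * columns * samples_per_pixel * bits_allocated // 8
    uncompressed = bytes_per_frame * len(compressed_frames)
    compressed = sum(len(frame) for frame in compressed_frames)
    return uncompressed / compressed


# Stand-in data: four 10x10 RGB frames, each "compressed" to 100 bytes
fake_frames = [bytes(100)] * 4
ratio = lossy_compression_ratio(10, 10, 3, 8, fake_frames)
```

With real data, img_byte_list would take the place of fake_frames.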
I have a list of image frames, frames, that I would like to display in a Streamlit application: st.video(frames_converted).
Streamlit plays video through HTML5, which requires H.264 encoding.
I want to complete all processing in memory (as opposed to the much more common approach of saving to a temporary file).
Current attempt:
## Convert frames to video for streamlit
height, width, layers = frames[0].shape
codec = cv.VideoWriter_fourcc(*'H264')
fps = 1
video = cv.VideoWriter("temp_video",codec, fps, (width,height)) # Initialize video object
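for frame in frames:
    video.write(frame)
video.release()
st.video(video) # passing the VideoWriter object itself raises the error below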
Current Blocker
RuntimeError: Invalid binary data format: <class 'cv2.VideoWriter'>
We may encode an "in memory" MP4 video using PyAV as described in my following answer - the video is stored in BytesIO object.
We may pass the BytesIO object as input to Streamlit (or convert the BytesIO object to bytes array and use the array as input).
Code sample:
import numpy as np
import cv2 # OpenCV is used only for writing text on image (for testing).
import av
import io
import streamlit as st
n_frames = 100 # Select number of frames (for testing).
width, height, fps = 192, 108, 10 # Select video resolution and framerate.
output_memory_file = io.BytesIO() # Create BytesIO "in memory file".
output = av.open(output_memory_file, 'w', format="mp4") # Open "in memory file" as MP4 video output
stream = output.add_stream('h264', str(fps)) # Add H.264 video stream to the MP4 container, with framerate = fps.
stream.width = width # Set frame width
stream.height = height # Set frame height
#stream.pix_fmt = 'yuv444p' # Select yuv444p pixel format (better quality than default yuv420p).
stream.pix_fmt = 'yuv420p' # Select yuv420p pixel format for wider compatibility.
stream.options = {'crf': '17'} # Select low crf for high quality (the price is larger file size).

def make_sample_image(i):
    """ Build synthetic "raw BGR" image for testing """
    p = width//60
    img = np.full((height, width, 3), 60, np.uint8)
    cv2.putText(img, str(i+1), (width//2-p*10*len(str(i+1)), height//2+p*10), cv2.FONT_HERSHEY_DUPLEX, p, (255, 30, 30), p*2) # Blue number
    return img

# Iterate the created images, encode and write to MP4 memory file.
for i in range(n_frames):
    img = make_sample_image(i) # Create OpenCV image for testing (resolution 192x108, pixel format BGR).
    frame = av.VideoFrame.from_ndarray(img, format='bgr24') # Convert image from NumPy Array to frame.
    packet = stream.encode(frame) # Encode video frame
    output.mux(packet) # "Mux" the encoded frame (add the encoded frame to MP4 file).

# Flush the encoder
packet = stream.encode(None)
output.mux(packet)
output.close()

output_memory_file.seek(0) # Seek to the beginning of the BytesIO.
#video_bytes = output_memory_file.read() # Convert BytesIO to bytes array
st.video(output_memory_file) # Streamlit supports BytesIO object - we don't have to convert it to bytes array.
# Write BytesIO from RAM to file, for testing:
#with open("output.mp4", "wb") as f:
# f.write(output_memory_file.getbuffer())
#video_file = open('output.mp4', 'rb')
#video_bytes = video_file.read()
We can't use cv.VideoWriter, because it does not support in-memory video encoding (cv.VideoWriter requires a "true file").
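The seek(0) call in the sample matters because a BytesIO keeps a read/write cursor, and after writing it sits at the end of the buffer. A small standard-library sketch of that behaviour, using stand-in bytes rather than a real MP4:

```python
import io

buffer = io.BytesIO()
buffer.write(b"stand-in for encoded video")

read_at_end = buffer.read()       # cursor is at the end, so nothing is returned
buffer.seek(0)                    # rewind to the start
read_from_start = buffer.read()   # now the full content is returned

# getvalue() returns the whole buffer regardless of the cursor position
whole_buffer = buffer.getvalue()
```

A consumer that calls read() on the object needs the rewind, while getvalue() gives the bytes either way.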
I have a function that returns a frame as result. I wanted to know how to make a video out of a for-loop with this function without saving every frame and then creating the video.
What I have from now is something similar to:
import cv2
out = cv2.VideoWriter('video.mp4',cv2.VideoWriter_fourcc(*'DIVX'), 14.25,(500,258))
for frame in frames:
    img_result = MyImageTreatmentFunction(frame) # returns a numpy array image
    out.write(img_result)
out.release()
Then the video will be created as video.mp4 and I can access it on memory. I'm asking myself if there's a way to have this video in a variable that I can easily convert to bytes later. My purpose for that is to send the video via HTTP post.
I've looked on ffmpeg-python and opencv but I didn't find anything that applies to my case.
We may use PyAV for encoding "in memory file".
PyAV is a Pythonic binding for the FFmpeg libraries.
The interface is relatively low level, but it allows us to do things that are not possible using other FFmpeg bindings.
Here are the main stages for creating MP4 in memory using PyAV:
Create BytesIO "in memory file":
output_memory_file = io.BytesIO()
Use PyAV to open "in memory file" as MP4 video output file:
output = av.open(output_memory_file, 'w', format="mp4")
Add H.264 video stream to the MP4 container, and set codec parameters:
stream = output.add_stream('h264', str(fps))
stream.width = width
stream.height = height
stream.pix_fmt = 'yuv444p'
stream.options = {'crf': '17'}
Iterate the OpenCV images, convert image to PyAV VideoFrame, encode, and "Mux":
for i in range(n_frames):
    img = make_sample_image(i) # Create OpenCV image for testing (resolution 192x108, pixel format BGR).
    frame = av.VideoFrame.from_ndarray(img, format='bgr24')
    packet = stream.encode(frame)
    output.mux(packet)
Flush the encoder and close the "in memory" file:
packet = stream.encode(None)
output.mux(packet)
output.close()
The following code sample encodes 100 synthetic images to an "in memory" MP4 file.
Each synthetic image is an OpenCV image showing a sequential blue frame number (used for testing).
At the end, the memory file is written to output.mp4 for testing.
import numpy as np
import cv2
import av
import io

n_frames = 100 # Select number of frames (for testing).
width, height, fps = 192, 108, 23.976 # Select video resolution and framerate.
output_memory_file = io.BytesIO() # Create BytesIO "in memory file".
output = av.open(output_memory_file, 'w', format="mp4") # Open "in memory file" as MP4 video output
stream = output.add_stream('h264', str(fps)) # Add H.264 video stream to the MP4 container, with framerate = fps.
stream.width = width # Set frame width
stream.height = height # Set frame height
stream.pix_fmt = 'yuv444p' # Select yuv444p pixel format (better quality than default yuv420p).
stream.options = {'crf': '17'} # Select low crf for high quality (the price is larger file size).

def make_sample_image(i):
    """ Build synthetic "raw BGR" image for testing """
    p = width//60
    img = np.full((height, width, 3), 60, np.uint8)
    cv2.putText(img, str(i+1), (width//2-p*10*len(str(i+1)), height//2+p*10), cv2.FONT_HERSHEY_DUPLEX, p, (255, 30, 30), p*2) # Blue number
    return img

# Iterate the created images, encode and write to MP4 memory file.
for i in range(n_frames):
    img = make_sample_image(i) # Create OpenCV image for testing (resolution 192x108, pixel format BGR).
    frame = av.VideoFrame.from_ndarray(img, format='bgr24') # Convert image from NumPy Array to frame.
    packet = stream.encode(frame) # Encode video frame
    output.mux(packet) # "Mux" the encoded frame (add the encoded frame to MP4 file).

# Flush the encoder
packet = stream.encode(None)
output.mux(packet)
output.close()

# Write BytesIO from RAM to file, for testing
with open("output.mp4", "wb") as f:
    f.write(output_memory_file.getbuffer())
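Since the stated goal is to send the video via HTTP POST, here is a hedged sketch of turning the in-memory file into a request body using only the standard library. The endpoint URL is hypothetical, the bytes are a stand-in for the encoded video, and the request is built but not sent:

```python
import io
import urllib.request

# Stand-in for the encoded video; in the answer this would be output_memory_file
output_memory_file = io.BytesIO(b"\x00\x00\x00\x18ftypmp42")
video_bytes = output_memory_file.getvalue()  # whole buffer as bytes

request = urllib.request.Request(
    "http://localhost:8000/upload",  # hypothetical endpoint
    data=video_bytes,
    headers={"Content-Type": "video/mp4"},
    method="POST",
)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline
```

Whether the server expects a raw body or a multipart form depends on the receiving API, so this shape is an assumption.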
I have 14 videos of 30 minutes each (7 hours of video data). I read in every video separately, perform some morphological processing on each frame and then use cv2.imwrite() to save each processed frame. I'd like to make one big video file of 7 hours from all processed frames. So far, I've been trying to use this code:
import cv2
import numpy as np
import glob

img_array = []
for filename in glob.glob('C:/New folder/Images/*.jpg'):
    img = cv2.imread(filename)
    height, width, layers = img.shape
    size = (width,height)
    img_array.append(img)

out = cv2.VideoWriter('project.avi',cv2.VideoWriter_fourcc(*'DIVX'), 15, size)
for i in range(len(img_array)):
    out.write(img_array[i])
out.release()
But an error is raised while building img_array (memory overload). Is there any other way to make a 7-hour video from 250,000+ frames?
Thank you.
Check that all pictures are the same size.
As stated by others, don't read all pictures at once. It's not necessary.
Usually I'd prefer to create the VideoWriter before the loop, but you need the size for that, and you only know it after you've read the first image. That's why I initialize that variable to None and create the VideoWriter once I have the first image.
Also: DIVX and .avi may work, but that's not the best option. The built-in option is MJPG (with .avi), which is always available in OpenCV. I would, however, recommend .mkv and avc1 (H.264) for general video, or you could look for a lossless codec that stores data in RGB instead of YUV (which may distort color information from screenshots, as well as drawn lines and other hard edges). You could try the 'rle ' codec (note the trailing space), which is a lossless codec based on run-length encoding.
import cv2 # `import cv2 as cv` is preferred these days
import numpy as np
import glob

out = None # VideoWriter initialized after reading the first image
outsize = None
for filename in glob.glob('C:/New folder/Images/*.jpg'):
    img = cv2.imread(filename)
    assert img is not None, filename # file could not be read
    (height, width, layers) = img.shape
    thissize = (width, height)
    if out is None: # this happens once at the beginning
        outsize = thissize
        out = cv2.VideoWriter('project.avi', cv2.VideoWriter_fourcc(*'DIVX'), 15, outsize)
        assert out.isOpened()
    else: # error checking for every following image
        assert thissize == outsize, (outsize, thissize, filename)
    out.write(img) # write this frame immediately instead of storing it

# finalize the video file (write headers/footers)
out.release()
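One detail the loop relies on: glob.glob() returns paths in arbitrary order, so the frames may need sorting before they are written. A minimal sketch of a natural sort, assuming the frame files carry a number in their names; the file names and the natural_key helper are hypothetical:

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so 'frame10' sorts after 'frame2'."""
    return [int(chunk) if chunk.isdigit() else chunk.lower()
            for chunk in re.split(r"(\d+)", name)]


names = ["frame10.jpg", "frame2.jpg", "frame1.jpg"]
ordered = sorted(names, key=natural_key)
```

In the loop above this would become sorted(glob.glob(...), key=natural_key). If the files are zero-padded (frame000001.jpg), a plain sorted() is enough.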
You could also do this with an invocation of ffmpeg on the command line (or from your program):
How to create a video from images with FFmpeg?
You don't need to store each frame inside an array.
You can read the frame and write it to the video directly.
You can modify your code as:
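import cv2
import numpy as np
import glob

out = None
for filename in glob.glob('C:/New folder/Images/*.jpg'):
    img = cv2.imread(filename)
    if out is None: # create the writer once, from the first frame's size
        height, width, layers = img.shape
        size = (width,height)
        out = cv2.VideoWriter('project.avi',cv2.VideoWriter_fourcc(*'DIVX'), 15, size)
    out.write(img) # write the frame directly, no list needed
out.release()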
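For a sense of scale, a rough estimate of what holding every decoded frame in a list would cost. The question does not state the frame size, so the 1920x1080 resolution below is purely an assumption; the frame count and video length come from the question:

```python
frame_count = 250_000                      # from the question
width, height, channels = 1920, 1080, 3    # assumed resolution, 8-bit BGR

bytes_per_frame = width * height * channels
total_bytes = frame_count * bytes_per_frame
total_gib = total_bytes / 2**30

# 14 videos of 30 minutes each
total_seconds = 14 * 30 * 60
implied_fps = frame_count / total_seconds
```

Under that assumption the list would need well over a terabyte of RAM, whereas writing each frame as it is read keeps only one frame in memory at a time.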
I ran into this problem while figuring out how to export external images in a Blender script. But I guess it is no longer directly related to Blender, and more about NumPy and how to handle arrays. Here is a post about the first problem.
The problem is that when a NumPy array is saved to an image, the result is distorted and contains multiple copies of the same image. See the image below for a better understanding.
The goal is to figure out how to make this work with NumPy and Python using Blender's own pixel data, avoiding libraries like PIL or cv2 that are not included in Blender's Python.
When the saved data consists of images that are all already the final size, it works correctly. But when trying to merge 4 smaller pieces into the final larger image, it is not exported correctly.
I've written an example Python script in Blender to demonstrate the problem:
# Example script to show how to merge external images in Blender
# using numpy. In this example we use 4 images (2x2) that should
# be merged to one actual final image.
# Regular (not cropped render borders) seems to work fine but
# how to merge cropped images properly???
# Usage: Just run script and it will export image named "MERGED_IMAGE"
# to root of this project folder and you'll see what's the problem.
import bpy, os
import numpy as np
ctx = bpy.context
scn = ctx.scene
# Get all image files
def get_files_in_folder(path):
    path = bpy.path.abspath(path)
    render_files = []
    for root, dirs, files in os.walk(path):
        for file in files:
            if (file.lower().endswith(('.png', '.jpg', '.jpeg', '.tiff', '.bmp', '.gif'))):
                render_files.append(file)
    return render_files

def merge_images(image_files, image_cropped = True):
    image_pixels = []
    final_image_pixels = 0
    for file in image_files:
        if image_cropped is True:
            filepath = bpy.path.abspath('//Cropped\\' + file)
        else:
            filepath = bpy.path.abspath('//Regular\\' + file)
        loaded_pixels = bpy.data.images.load(filepath, check_existing=True).pixels
        image_pixels.append(loaded_pixels)
    np_array = np.array(image_pixels)
    # Merge images
    if image_cropped:
        final_image_pixels = np_array
    else:
        for arr in np_array:
            final_image_pixels += arr
    # Save output image
    output_image = bpy.data.images.new('MERGED_IMAGE', alpha=True, width=256, height=256)
    output_image.file_format = 'PNG'
    output_image.alpha_mode = 'STRAIGHT'
    output_image.pixels = final_image_pixels.ravel()
    output_image.filepath_raw = bpy.path.abspath("//MERGED_IMAGE.png")
    output_image.save()

images_cropped = get_files_in_folder("//Cropped")
images_regular = get_files_in_folder('//Regular')
# Change between these to get different example
#merge_images(images_regular, False)
merge_images(images_cropped, True)
So I guess the problem is related to how to handle image pixel data and arrays with numpy.
Here is the project folder as a zip file. It contains a working test script, so you can test how this behaves in Blender. https://drive.google.com/file/d/1R4G_fubEzFWbHZMLtAAES-QsRhKyLKWb/view?usp=sharing
Since all of your images are the same dimension of 128x128, and since OpenCV images are Numpy arrays, here are three methods. You can save the image using cv2.imwrite.
Input images:
Method #1: np.hstack + np.vstack
hstack1 = np.hstack((image1, image2))
hstack2 = np.hstack((image3, image4))
hstack_result = np.vstack((hstack1, hstack2))
Method #2: np.concatenate
concatenate1 = np.concatenate((image1, image2), axis=1)
concatenate2 = np.concatenate((image3, image4), axis=1)
concatenate_result = np.concatenate((concatenate1, concatenate2), axis=0)
Method #3: cv2.hconcat + cv2.vconcat
hconcat1 = cv2.hconcat([image1, image2])
hconcat2 = cv2.hconcat([image3, image4])
hconcat_result = cv2.vconcat([hconcat1, hconcat2])
Result should be the same for all methods
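The claim that the NumPy methods agree can be checked without any image files. A small sketch using random 128x128 arrays as stand-ins for the four tiles (cv2 is left out so that it runs with NumPy alone):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for the four 128x128 BGR tiles
image1, image2, image3, image4 = (
    rng.integers(0, 256, size=(128, 128, 3), dtype=np.uint8) for _ in range(4)
)

# Method #1: hstack the rows, then vstack them
hstack_result = np.vstack((np.hstack((image1, image2)),
                           np.hstack((image3, image4))))

# Method #2: the same thing spelled with explicit axes
concatenate_result = np.concatenate(
    (np.concatenate((image1, image2), axis=1),
     np.concatenate((image3, image4), axis=1)),
    axis=0,
)

same = np.array_equal(hstack_result, concatenate_result)
```

For 3-channel images, axis=1 is the width and axis=0 is the height, which is why the rows are joined first and then stacked.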
Full code
import cv2
import numpy as np
# Load images
image1 = cv2.imread('Fart_1_2.png')
image2 = cv2.imread('Fart_2_2.png')
image3 = cv2.imread('Fart_1_1.png')
image4 = cv2.imread('Fart_2_1.png')
# Method #1
hstack1 = np.hstack((image1, image2))
hstack2 = np.hstack((image3, image4))
hstack_result = np.vstack((hstack1, hstack2))
# Method #2
concatenate1 = np.concatenate((image1, image2), axis=1)
concatenate2 = np.concatenate((image3, image4), axis=1)
concatenate_result = np.concatenate((concatenate1, concatenate2), axis=0)
# Method #3
hconcat1 = cv2.hconcat([image1, image2])
hconcat2 = cv2.hconcat([image3, image4])
hconcat_result = cv2.vconcat([hconcat1, hconcat2])
# Display
cv2.imshow('concatenate_result', concatenate_result)
cv2.imshow('hstack_result', hstack_result)
cv2.imshow('hconcat_result', hconcat_result)
cv2.waitKey() # keep the windows open until a key is pressed
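Because the question asks for a route that works inside Blender without cv2, the same idea can be sketched with NumPy alone. This assumes each tile's pixels arrive as a flat sequence of RGBA floats in row-major order starting at the bottom row, which is how Blender's Image.pixels is laid out. The helper name merge_2x2 and the constant-valued tiles are hypothetical:

```python
import numpy as np


def merge_2x2(bottom_left, bottom_right, top_left, top_right, tile_w, tile_h):
    """Hypothetical helper: tile four flat RGBA buffers into one flat buffer."""
    tiles = [
        # Restore the 2-D layout first; adding or stacking flat buffers scrambles rows
        np.asarray(flat, dtype=np.float32).reshape(tile_h, tile_w, 4)
        for flat in (bottom_left, bottom_right, top_left, top_right)
    ]
    bottom_row = np.concatenate(tiles[0:2], axis=1)   # side by side
    top_row = np.concatenate(tiles[2:4], axis=1)
    # Rows are stored bottom to top, so the bottom row comes first
    return np.concatenate((bottom_row, top_row), axis=0).ravel()


tile_w = tile_h = 2
flat_tiles = [np.full(tile_w * tile_h * 4, value, dtype=np.float32)
              for value in (1, 2, 3, 4)]
merged = merge_2x2(*flat_tiles, tile_w, tile_h)
```

The flattened result could then be assigned to output_image.pixels for an image created with twice the tile width and height.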
I'm trying to extract images from a PDF using PyPDF2, but the image my code extracts looks very different from what it should actually look like. See the example below:
But this is how it should really look:
Here's the pdf I'm using:
Here's my code:
import PyPDF2

pdf_filename = "SAMPLE.pdf"
pdf_file = open(pdf_filename, 'rb')
cond_scan_reader = PyPDF2.PdfFileReader(pdf_file)
page = cond_scan_reader.getPage(0)
xObject = page['/Resources']['/XObject'].getObject()
i = 0
for obj in xObject:
    # print(xObject[obj])
    if xObject[obj]['/Subtype'] == '/Image':
        if xObject[obj]['/Filter'] == '/DCTDecode':
            data = xObject[obj]._data
            img = open("{}".format(i) + ".jpg", "wb")
            img.write(data) # DCTDecode data is already a JPEG stream
            img.close()
            i += 1
And since I need to keep the image in its colour mode, I can't just convert it to RGB if it was CMYK, because I need that information.
Also, I'm trying to get the DPI of the images I extract from a PDF. Is that information always stored in the image?
Thanks in advance
I used pdfreader to extract the image from your example.
The image uses ICCBased colorspace with the value of N=4 and Intent value of RelativeColorimetric. This means that the "closest" PDF colorspace is DeviceCMYK.
All you need is to convert the image to RGB and invert the colors.
Here is the code:
from pdfreader import SimplePDFViewer
import PIL.ImageOps
fd = open("SAMPLE PDF.pdf", "rb")
viewer = SimplePDFViewer(fd)
viewer.render() # the canvas is only populated after rendering the page
img = viewer.canvas.images['Im0']
# this displays ICCBased 4 RelativeColorimetric
print(img.ColorSpace[0], img.ColorSpace[1].N, img.Intent)
pil_image = img.to_Pillow()
pil_image = pil_image.convert("RGB")
inverted = PIL.ImageOps.invert(pil_image)
inverted.save("inverted.png") # output file name is arbitrary
Read more on PDF objects: Image (sec. 8.9.5), InlineImage (sec. 8.9.7)
Hope this works: you probably need to use another library such as Pillow:
Here is an example:
from PIL import Image
image = Image.open("path_to_image")
if image.mode == 'CMYK':
    image = image.convert('RGB')
Reference: Convert from CMYK to RGB
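To make the CMYK to RGB step less of a black box, here is a sketch of the naive, profile-free conversion formula in NumPy. It is an illustration only: it ignores the ICC profile mentioned above, it is not claimed to match what any particular library does internally, and the helper name naive_cmyk_to_rgb is hypothetical.

```python
import numpy as np


def naive_cmyk_to_rgb(cmyk):
    """Hypothetical helper: R = 255 * (1 - C) * (1 - K), likewise for G and B."""
    scaled = np.asarray(cmyk, dtype=np.float64) / 255.0
    c, m, y, k = np.moveaxis(scaled, -1, 0)   # split the last axis into channels
    rgb = np.stack([(1 - c) * (1 - k),
                    (1 - m) * (1 - k),
                    (1 - y) * (1 - k)], axis=-1)
    return np.rint(rgb * 255).astype(np.uint8)


# Three CMYK pixels: pure cyan, pure black, and no ink at all
pixels = np.array([[255, 0, 0, 0],
                   [0, 0, 0, 255],
                   [0, 0, 0, 0]], dtype=np.uint8)
rgb = naive_cmyk_to_rgb(pixels)
```

If the stored channel values are inverted, as the answer above suggests for this file, applying 255 - value to the channels before a conversion like this is one way to compensate.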