I want to convert a PDF file to PNG, manipulate it within Python, and then save it back as a PDF, but in the process a grey zone gets created around the fonts (my image is a simple black-and-white typed document). It's very faint, a bit hard to see on screen, but when printed it becomes fairly visible.
Here's the specific command I use:
PDF to PNG (in greyscale, super-sampling to preserve image quality):
convert -density 500 -alpha off file_in.pdf -scale 1700x2200 -bordercolor black -border 1x1 -fuzz 20% -trim +repage -colorspace Gray -depth 4 file_out.png
Within Python:
from PIL import Image
img = Image.open('file_out.png')
img.save('file_out2.pdf')
I also tried converting pdf to png with Ghostscript:
gs -sDEVICE=png16m -sOutputFile=file.png -dNOPAUSE -dBATCH -r300 file_out.pdf
with the same result.
Here's part of what
identify -verbose file.png
gives for the ImageMagick png:
Format: PNG (Portable Network Graphics)
Class: PseudoClass
Geometry: 1700x2200+0+0
Resolution: 500x500
Print size: 3.4x4.4
Units: Undefined
Type: Grayscale
Base type: Grayscale
Endianess: Undefined
Colorspace: Gray
Depth: 8/4-bit
Channel depth:
gray: 4-bit
Does anyone have a solution, or at least an explanation?
Edit:
I found that using '-sample 1700x2200' instead of '-scale 1700x2200' fixed the grey around the fonts, but then the thin lines almost disappear and the font suffers from aliasing...
The pdf format is basically a vector format that can also include bitmapped ("raster") images.
If the original pdf contains a scanned document, it will usually only contain a bitmapped image (often in tiff or jpeg format) and then converting it to png is fine (if you stick to the original resolution of the image).
But if the original contains vector graphics (including text strings), converting those to a bitmap will generally introduce sampling errors. To avoid those, you can use 1-bit color depth ("black-and-white" format) and a resolution that at least matches the printer. This will produce quite a large png file, though. Using the tiff format might yield a smaller file. The "tiff-inside-pdf" format is something you see often when large drawings are scanned. According to ImageMagick's identify program, such a tiff file looks something like this:
Format: TIFF (Tagged Image File Format)
Class: DirectClass
Geometry: 13231x9355+0+0
Resolution: 400x400
Print size: 33.0775x23.3875
Units: PixelsPerInch
Type: Bilevel
Base type: Bilevel
Endianess: MSB
Colorspace: Gray
Depth: 1-bit
Channel depth:
gray: 1-bit
Despite the huge pixel dimensions, the tiff file is only 144 kB. The tiff2pdf program (part of the tiff package) can convert these to nice and small pdf files.
But the best way to preserve the document's format is to edit the pdf file itself, instead of converting it to another format.
There is a Python module for manipulating pdf documents: PyPDF2. But since you don't specify what you want to do with the document, it is impossible to say whether it can do what you want. There is also ReportLab, but that's more for generating pdf files. If you have the cairo library installed on your system, pycairo is a less heavyweight option for generating pdf documents.
An excellent utility in general for manipulating pdf files is pdftk (written in java).
Edit: Sampling in grayscale will always introduce sampling artefacts. These are not errors in themselves, just a consequence of the sampling process.
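The grey halo can be reproduced with a toy calculation in plain Python (no imaging library; the numbers are illustrative, not taken from the actual files): averaging neighbouring black and white pixels, as -scale does when downsampling, produces mid-grey, while a 1-bit threshold snaps the result back to pure black or white.

```python
# One row of super-sampled pixels across a glyph edge: half black, half white.
block = [0, 0, 255, 255]

# Roughly what -scale does when shrinking: average the block -> mid-grey halo.
avg = sum(block) // len(block)

# Roughly what 1-bit depth does: threshold -> pure black or white, no halo.
bw = 0 if avg < 128 else 255
```

This is also why -sample (which picks single pixels instead of averaging) removed the grey but brought back aliasing: it skips the averaging step entirely.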
Decompiling the pdf file into PostScript as Ben Jackson mentions can be done. There are a couple of utilities that can help you with that; pdftops from the poppler-utils package, and pdf2ps that comes with ghostscript. In my experience, pdftops tends to produce better usable output.
But I haven't found a good way to automate this process. Below is a fragment from the Numpy User Guide decompiled with pdftops:
(At)
[7.192997
0
2.769603
0] Tj
-314 TJm
(the)
[2.769603
0
4.9813
0
4.423394
0] Tj
-313 TJm
(core)
[4.423394
0
4.9813
0
3.317546
0
4.423394
0] Tj
-314 TJm
(of)
[4.9813
0
3.317546
0] Tj
-313 TJm
(the)
[2.769603
0
4.9813
0
4.423394
0] Tj
-314 TJm
(NumPy)
[7.192997
0
4.9813
0
7.750903
0
5.539206
0
4.9813
0] Tj
-314 TJm
(package,)
[4.9813
0
4.423394
0
4.423394
0
4.9813
0
4.423394
0
4.9813
0
4.423394
0
2.49065
0] Tj
-329 TJm
This produces the sentence "At the core of the NumPy package,". So if you search the PostScript file for anything between parentheses, you'll get the strings.
So changing individual words or removing short pieces is not that hard:
1. Find the correct word(s) in the decompiled PostScript.
2. Edit them (and the surrounding parameters!).
3. Re-compile to pdf (with ghostscript).
But you would have to look into the beginning of the document and see what the functions Tj and TJm do. If you want to replace text, you'll have to remove them and put in new text and code with the correct parameters for Tj and TJm. This requires an understanding of PostScript. And if you are replacing a sentence, you usually cannot replace it with a longer sentence; there will not be enough space...
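For instance, the visible strings can be pulled out of the decompiled PostScript with a regular expression. This is only a sketch: real PostScript strings may contain escaped or nested parentheses, which the pattern below ignores, and the fragment is a condensed, hypothetical version of the pdftops output above.

```python
import re

# Condensed stand-in for the pdftops output shown above.
fragment = """(At)
[7.192997
0] Tj
-314 TJm
(the)
[2.769603
0] Tj
(core)
[4.423394
0] Tj
(of)
[4.9813
0] Tj
(the)
[2.769603
0] Tj
(NumPy)
[7.192997
0] Tj
(package,)
[4.9813
0] Tj"""

# Grab every parenthesised string that is followed by a width array.
words = re.findall(r'\(([^()]*)\)\s*\[', fragment)
sentence = ' '.join(words)
print(sentence)
```

This recovers the readable text, but to *edit* it you would still have to adjust the accompanying width arrays by hand.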
Therefore it is generally advisable to try and get the original application to change the output.
Is there no way to get good sampling in greyscale? What I want to do is open the file with PIL, add some text, and overlay an image.
A PDF is a compressed PostScript document (plus metadata). PostScript is a programming language. If you use pdf2ps you can then add code to the PostScript to draw over any existing parts of the PDF. Then convert back with ps2pdf.
Here's another question that deals with that idea directly: Is it possible in Ghostscript to add watermark to every page in PDF
I'm trying to minimize the file size of PNG images written with pyvips.Image.pngsave(). The original files, written with just .pngsave(output), are at https://github.com/CDDA-Tilesets/UltimateCataclysm and we'll look at giant.png, which is 119536 bytes.
ImgBot was able to reduce the file size to 50672.
pngsave(output, compression=9, palette=True, strip=True) got it down to 58722.
But the convert command from ImageMagick is still able to reduce the file size further, to 42833 with default options:
$ convert giant_pyvips_c9.png giant_pyvips_magick.png
The question is whether it's possible to fit the same image into 42833 bytes using only pyvips to avoid adding another step to our workflow?
Update: Warning
The palette size is limited to 256 colors, and pyvips doesn't warn you if the conversion becomes lossy.
Try turning off filtering:
$ vips copy giant.png x.png[palette,compression=9,strip,filter=0]
$ ls -l x.png
-rw-r--r-- 1 john john 41147 Feb 14 10:58 x.png
Background: PNG filters put the image through a difference filter before compression. Compressing differences to neighbouring pixels rather than absolute pixel values can boost the compression ratio if there is some local pattern in the values. pyvips uses an adaptive filter by default.
Palette images encode an index into a lookup table rather than anything related to luminance, so there is much less local correlation. In this case, filtering actually hurts compression.
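The effect of difference filtering is easy to demonstrate with the standard library alone (a toy sketch: real PNG filtering works per scanline with several filter types, but the principle is the same). On smooth data, differences are highly repetitive and compress far better than the raw values.

```python
import zlib

# A smooth 8-bit ramp: neighbouring bytes differ by a constant, so the
# Sub-style filter (store the difference to the previous byte, mod 256)
# turns the stream into almost nothing but repeated 0x01 bytes.
ramp = bytes(range(256)) * 4
sub = bytes([ramp[0]]) + bytes((ramp[i] - ramp[i - 1]) % 256
                               for i in range(1, len(ramp)))

plain_size = len(zlib.compress(ramp, 9))
filtered_size = len(zlib.compress(sub, 9))
print(plain_size, filtered_size)  # filtering wins easily on smooth data
```

Palette indices usually have no such local pattern, so filtering them gains nothing and the filter-type overhead makes the file larger, which is why filter=0 helps here.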
http://www.w3.org/TR/PNG-Filters.html
You can see the values allowed for the filter= parameter here:
https://github.com/libvips/libvips/blob/master/libvips/include/vips/foreign.h#L579-L598
I have a DICOM file with an image that I would like to visualize. The image looks fine when I open it with DICOM software. However, when I try to visualize the image in Python with pydicom using the following code:
from pydicom import dcmread
import matplotlib.pyplot as plt

ds = dcmread("1.2.826.0.1.3680043.214.533944214201189834015365840769154197151.dcm")
pix = ds.pixel_array
plt.imshow(pix, cmap=plt.cm.bone)
plt.show()
The image turns greenish.
Here is the image exported in JPEG with the correct color from the same DICOM file.
These are the metadata of this DICOM file:
(0022, 001b) Refractive State Sequence 0 item(s) ----
(0028, 0002) Samples per Pixel US: 3
(0028, 0003) Samples per Pixel Used US: 2
(0028, 0004) Photometric Interpretation CS: 'RGB'
(0028, 0006) Planar Configuration US: 0
(0028, 0008) Number of Frames IS: "1"
Could you please help me visualize the DICOM image with the correct colors?
It appears that you are trying to visualize a JPEG (ITU-T T.81) compressed DICOM file. According to the values you gave us, it looks like it is lossless compressed.
In any case, the fact that the image is displayed with a greenish aspect is most likely caused by an inconsistency between the color space indicated by the DICOM Photometric Interpretation and the actual color space of the encapsulated Pixel Data JPEG stream.
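A toy illustration of why such a mismatch distorts colors, in pure Python (using the standard JPEG full-range conversion formulas, which is what YBR_FULL denotes): encode a pure red RGB pixel as YCbCr, and the three stored bytes look nothing like red if a viewer displays them as R, G and B again.

```python
def rgb_to_ycbcr(r, g, b):
    # JPEG full-range (YBR_FULL) conversion, clamped to byte range.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return tuple(max(0, min(255, round(v))) for v in (y, cb, cr))

# The bytes actually stored for a pure red (255, 0, 0) pixel.
stored = rgb_to_ycbcr(255, 0, 0)
print(stored)
```

A viewer that trusts a wrong Photometric Interpretation and shows these bytes directly as RGB renders a blue-tinged pixel instead of red; the same class of mismatch produces the greenish cast here.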
Here is a small experiment you can try at home with your DICOM file:
$ sudo apt-get install libgdcm-tools
$ gdcmraw 1.2.826.0.1.3680043.214.533944214201189834015365840769154197151.dcm /tmp/pixel_data_stream.jpg
$ file /tmp/pixel_data_stream.jpg
/tmp/pixel_data_stream.jpg: JPEG image data
$ gdcmimg /tmp/pixel_data_stream.jpg /tmp/dicom.dcm
$ gdcminfo /tmp/dicom.dcm
[...]
TransferSyntax is 1.2.840.10008.1.2.4.70 [JPEG Lossless, Non-Hierarchical, First-Order Prediction (Process 14 [Selection Value 1]): Default Transfer Syntax for Lossless JPEG Image Compression]
NumberOfDimensions: 2
Dimensions: (256,256,1)
SamplesPerPixel :3
BitsAllocated :16
BitsStored :16
HighBit :15
PixelRepresentation:0
ScalarType found :UINT16
PhotometricInterpretation: RGB
PlanarConfiguration: 0
TransferSyntax: 1.2.840.10008.1.2.4.70
[...]
The goal of the experiment was to verify the actual color space as found in the JPEG bitstream. By removing the DICOM envelope (the gdcmraw step), we make sure that the gdcmimg step has no access to the DICOM Photometric Interpretation value and must deduce the right value directly from the JPEG stream itself.
In other words, if gdcmimg reconstructs a DICOM file with a Photometric Interpretation different from the one found in the original file, then you have found your issue.
If you have access to the DCMTK toolkit, in particular dcmdjpeg, pay attention to the following advanced process option:
$ man dcmdjpeg
[...]
processing options
color space conversion:
+cp --conv-photometric
convert if YCbCr photometric interpretation (default)
# If the compressed image uses YBR_FULL or YBR_FULL_422 photometric
# interpretation, convert to RGB during decompression.
+cl --conv-lossy
convert YCbCr to RGB if lossy JPEG
# If the compressed image is encoded in lossy JPEG, assume YCbCr
# color model and convert to RGB.
+cg --conv-guess
convert to RGB if YCbCr is guessed by library
# If the underlying JPEG library "guesses" the color space of the
# compressed image to be YCbCr, convert to RGB.
+cgl --conv-guess-lossy
convert to RGB if lossy JPEG and YCbCr is
guessed by the underlying JPEG library
# If the compressed image is encoded in lossy JPEG and the underlying
# JPEG library "guesses" the color space to be YCbCr, convert to RGB.
+ca --conv-always
always convert YCbCr to RGB
# If the compressed image is a color image, assume YCbCr color model
# and convert to RGB.
+cn --conv-never
never convert color space
# Never convert color space during decompression.
Once your compressed DICOM file is decompressed by dcmdjpeg, the inconsistency between the DICOM envelope and the JPEG bitstream is removed (only the Photometric Interpretation of the DICOM header indicates the color space).
If you really want to keep your DICOM file with the original compressed bit stream, you may be able to fix the Photometric Interpretation value with something like this:
$ gdcmanon --dumb --replace 0028,0004=YBR_FULL 1.2.826.0.1.3680043.214.533944214201189834015365840769154197151.dcm /tmp/corrected.dcm
Or (if the compression is lossy):
$ gdcmanon --dumb --replace 0028,0004=YBR_FULL_422 1.2.826.0.1.3680043.214.533944214201189834015365840769154197151.dcm /tmp/corrected422.dcm
When the DICOM transfer syntax is 1.2.840.10008.1.2.4.70 (JPEG lossless), the photometric interpretation shouldn't be RGB but one of the YBR variants (YBR_FULL, YBR_PARTIAL, or the chroma-subsampled versions).
Despite the DICOM standard allowing the RGB color space with JPEG lossy images, I've never seen a JPEG lossy image with the RGB color space that rendered correctly as RGB (but it rendered just fine as soon as the color space was replaced by YBR).
There are several DICOM datasets out there that have this mistake and the only way to show them correctly is to replace RGB with YBR_FULL or the subsampled YBR_FULL_422.
I have a file that contains a 240x320 image, but in raw byte format. I opened it in a hex editor and saw something like an array of 16 columns and 4800 rows.
I'm completely new to this, which is why I'm facing trouble. I tried a Python script, but it gave an error on line 17, in data = columnvector[0][i]:
IndexError: list index out of range.
I also tried some Java code, but that errored as well. I wanted to try some C# code, but none of the examples I found explains how to feed my file to the code. This is the Python script:
import csv
import sys
import binascii
csv.field_size_limit(500 * 1024 * 1024)
columnvector = []
with open('T1.csv', 'r') as csvfile:
    csvreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
    for row in csvreader:
        columnvector.append(row)
headers = ['42','4D','36','84','03','00','00','00','00','00','36','00','00','00','28','00','00','00',
'40','01','00','00','F0','00','00','00','01','00','18','00','00','00','00','00','00','84','03','00','C5','00',
'00','00','C5','00','00','00','00','00','00','00','00','00','00','00']
hexArray = []
for i in range(0, 76800):
    data = columnvector[0][i]
    hexArray.extend([data, data, data])
with open('T1.txt', 'wb') as f:
    f.write(binascii.unhexlify(''.join(headers)))
    f.write(binascii.unhexlify(''.join(hexArray)))
I want to convert the file to an image using any method; I honestly don't care which, as long as it gets the job done.
Here are some of the files:
https://github.com/Mu-A/OV7670-files/tree/Help
You can make the binary data into images without writing any Python, just use ImageMagick in the Terminal. It is included in most Linux distros and is available for macOS and Windows.
If your image is 320x240 it should be:
320 * 240 bytes long if single channel (greyscale), or
320 * 240 * 3 if 3-channel RGB.
As your images are 76800 bytes, I am assuming they are greyscale.
So, in Terminal, to make that raw data into a JPEG, use:
magick -depth 8 -size 320x240 gray:T1 result.jpg
or, if you are using version 6 of ImageMagick, use:
convert -depth 8 -size 320x240 gray:T1 result.jpg
If you want a PNG with automatic contrast-stretch, use:
magick -depth 8 -size 320x240 gray:T1 -auto-level result.png
Unfortunately none of your images come out to anything sensible. Here is T1, for example:
The histograms do look somewhat sensible though:
I think you have something fundamentally wrong, so I would try reverting to first principles to debug it. I would shine a torch into the lens, or point the camera at a window, and save a picture called bright.dat; then cover the lens with a black card and take another image called dark.dat. Then I would plot a histogram of the data and see if the bright one sits at the rightmost end and the dark one at the leftmost end. Make a histogram like this:
magick -depth 8 -size 320x240 Gray:bright.dat histogram:brightHist.png
and:
magick -depth 8 -size 320x240 Gray:dark.dat histogram:darkHist.png
for i in range(0,76800):
is a hardcoded value, and because columnvector[0][i] does not have that many values, you get that IndexError: list index out of range.
Consider why you need to set your range from 0-76800 or if the value can be dynamically sourced from len() of something.
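That advice can be sketched in a few lines (the rows list below is a hypothetical stand-in for the parsed CSV contents): iterate over the values that are actually present instead of assuming there are exactly 76800 of them.

```python
# Hypothetical stand-in for the parsed CSV: one row of hex byte strings.
rows = [['1f', '2a', '3b']]

hexArray = []
for value in rows[0]:  # bounded by the data itself, not a hardcoded 76800
    hexArray.extend([value, value, value])
print(len(hexArray))  # always 3 * len(rows[0]), never an IndexError
```

If the file really must contain 76800 values, check len(rows[0]) first and fail with a clear message rather than an IndexError deep in the loop.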
Another simple way to make an image from a binary file is to convert it to a NetPBM image.
As your file is 320x240 and 8-bit binary greyscale, you just need to make a header with that information in it and append your binary file:
printf "P5\n320 240\n255\n" > image.pgm
cat T1 >> image.pgm
You can now open image.pgm with feh, Photoshop, GIMP or many other image viewers.
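The same header-plus-bytes trick can be done in Python with only the standard library (the ramp below is a hypothetical stand-in for the contents of T1):

```python
# Sketch: wrap 320x240 raw greyscale bytes in a binary PGM (P5) header.
width, height = 320, 240
raw = bytes(range(256)) * 300  # 76800 placeholder bytes, same size as T1
header = b"P5\n%d %d\n255\n" % (width, height)

with open("image.pgm", "wb") as f:
    f.write(header + raw)
```

In practice you would read raw from the T1 file instead of generating it, and check that len(raw) == width * height before writing.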
You can download a multi-channel, 16-bit png file from here (shown below). I have tried multiple Python packages for reading this multi-channel, 16-bit-per-channel image, but none work, and if they do they somehow transform the images (scaling etc.). I tried imageio, PIL.Image, scipy.ndimage.imread and a couple more. It seems that they can all read single-channel 16-bit pngs properly but convert the multi-channel images into 8-bit-per-channel. For instance, this is a GitHub issue that indicates imageio cannot read multi-channel 16-bit images. Another issue (here) for Pillow seems to say the same thing.
So I wonder, does anyone know how I can properly read multi-channel, 16-bit png files in Python without using the OpenCV package? Feel free to offer solutions from other packages that I didn't mention here.
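(One quick sanity check, with only the standard library: before blaming a reader for downconverting, confirm what the file itself claims. The PNG IHDR chunk sits at a fixed offset, so its bit depth and colour type can be read with a few lines; this is a sketch that assumes a well-formed file.)

```python
import struct

def png_ihdr(data):
    # The 8-byte PNG signature is always followed by the IHDR chunk:
    # 4-byte length, b'IHDR', then width, height, bit depth, colour type, ...
    if data[:8] != b"\x89PNG\r\n\x1a\n" or data[12:16] != b"IHDR":
        raise ValueError("not a PNG")
    width, height, depth, ctype = struct.unpack(">IIBB", data[16:26])
    return width, height, depth, ctype
```

A 16-bit RGB file should report depth 16 and colour type 2; if it does, any 8-bit result downstream is the reader's doing.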
Option 1 - Split into channels with ImageMagick
You could use ImageMagick (it is installed on most Linux distros and is available for macOS and Windows) at the command line.
For example, this will separate your 16-bit 3-channel PNG into its constituent channels that you can then process individually in Pillow:
magick input.png -separate channel-%d.png
Now there are 3 separate channels:
-rw-r--r-- 1 mark staff 2276 1 Apr 16:47 channel-2.png
-rw-r--r-- 1 mark staff 3389 1 Apr 16:47 channel-1.png
-rw-r--r-- 1 mark staff 2277 1 Apr 16:47 channel-0.png
and they are each 16-bit, single channel images that Pillow can open:
magick identify channel-*
Sample Output
channel-0.png PNG 600x600 600x600+0+0 16-bit Grayscale Gray 2277B 0.000u 0:00.000
channel-1.png PNG 600x600 600x600+0+0 16-bit Grayscale Gray 3389B 0.000u 0:00.000
channel-2.png PNG 600x600 600x600+0+0 16-bit Grayscale Gray 2276B 0.000u 0:00.000
If you are using ImageMagick v6, replace magick with convert and replace magick identify with plain identify.
Option 2 - Split into channels with NetPBM
As an alternative to ImageMagick, you could use the much lighter weight NetPBM tools to do the same thing:
pngtopam < rainbow.png | pamchannel - 0 -tupletype GRAYSCALE > channel-0.pam
pngtopam < rainbow.png | pamchannel - 1 -tupletype GRAYSCALE > channel-1.pam
pngtopam < rainbow.png | pamchannel - 2 -tupletype GRAYSCALE > channel-2.pam
Pillow can then open the PAM files.
Option 3 - Use PyVips
As an alternative, you could use the extremely fast, memory-efficient pyvips to process your images. Here is an example from the documentation that:
crops 100 pixels off each side
shrinks an image by 10% with bilinear interpolation
sharpens with a convolution and re-saves the image.
Here is the code:
#!/usr/local/bin/python3
import sys
import pyvips
im = pyvips.Image.new_from_file(sys.argv[1], access='sequential')
im = im.crop(100, 100, im.width - 200, im.height - 200)
im = im.reduce(1.0 / 0.9, 1.0 / 0.9, kernel='linear')
mask = pyvips.Image.new_from_array([[-1, -1, -1],
[-1, 16, -1],
[-1, -1, -1]], scale=8)
im = im.conv(mask, precision='integer')
im.write_to_file("result.png")
The result is 16-bit like the input image:
identify result.png
result.png PNG 360x360 360x360+0+0 16-bit sRGB 2900B 0.000u 0:00.000
As you can see it is still 16-bit, and trimming 100px off each side results in 600px becoming 400px and then the 10% reduction makes that into 360px.
Option 4 - Convert to TIFF and use PyLibTiff
A fourth option, if the number of files is an issue, might be to convert your images to TIFF with ImageMagick
convert input.png output.tif
and they retain their 16-bit resolution, and you process them with PyLibTiff as shown here.
Option 5 - Multi-image TIFF processed as ImageSequence
A fifth option could be to split your PNG files into their constituent channels and store them as a multi-image TIFF, i.e. with red as the first image in the sequence, green as the second and blue as the third. This means there is no increase in the number of files, and you can also store more than 3 channels per file - you mentioned 5 channels somewhere in your comments:
convert input.png -separate multiimage.tif
Check there are now 3 images, each 16-bit, but all in the same, single file:
identify multiimage.tif
multiimage.tif[0] TIFF 600x600 600x600+0+0 16-bit Grayscale Gray 10870B 0.000u 0:00.000
multiimage.tif[1] TIFF 600x600 600x600+0+0 16-bit Grayscale Gray 10870B 0.000u 0:00.000
multiimage.tif[2] TIFF 600x600 600x600+0+0 16-bit Grayscale Gray 10870B 0.000u 0:00.000
Then process them as an image sequence:
from PIL import Image, ImageSequence
im = Image.open("multiimage.tif")
index = 1
for frame in ImageSequence.Iterator(im):
    print(index)
    index = index + 1
I had the same problem and I found out that imageio can do the job:
img = imageio.imread('path/to/img', format='PNG-FI')
With this option you can read and write multi-channel 16-bit png images (by default imageio uses PNG-PIL as format for reading png files). This works for png images, but changing the format can probably help when dealing with other image types (here a full list of available imageio formats).
To use this format you may need to install the FreeImage plugin as shown in the documentation.
I have done some experiments with JPEG. The doc said "100 completely disables the JPEG quantization stage."
However, I still got some pixel modification during saving. Here is my code:
from PIL import Image

red = [20, 30, 40, 50, 60, 70]
img = Image.new("RGB", [1, len(red)], (255, 255, 255))
pix = img.load()
for x in range(len(red)):
    pix[0, x] = (red[x], 255, 255)
img.save('test.jpg', quality=100)
img = Image.open('test.jpg')
pix = img.load()
for x in range(len(red)):
    print(pix[0, x][0], end=' ')
I got unexpected output: 22 25 42 45 62 65
What should I do to preserve the pixel values? Please note that I also tried this with PHP using imagejpeg, and it gives me the correct values when quality=100.
I can use png to preserve the values, but I want to know the reason behind this and whether there is any option to avoid it.
JPEG consists of many different steps, many of which introduce some loss. By using a sample image containing only red, you've probably run across the worst offender - downsampling or chroma subsampling. Half of the color information is thrown away because the eye is more sensitive to brightness changes than color changes.
Some JPEG encoders can be configured to turn off subsampling, including PIL and Pillow, by setting subsampling=0. In any case it won't give you a completely lossless file, since there are still other steps that introduce loss.
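The subsampling effect can be simulated in pure Python (a rough sketch using the JPEG full-range YCbCr formulas and ignoring the DCT and quantization stages): average the chroma of two adjacent pixels, as 4:2:2 subsampling does, and the red values of a 20/30 pair drift toward each other, much like the 22 25 / 42 45 / 62 65 pairs in the output above.

```python
def to_ycbcr(r, g, b):
    # JPEG full-range RGB -> YCbCr conversion.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def red_back(y, cr):
    # Inverse transform, red channel only.
    return y + 1.402 * (cr - 128)

y1, _, cr1 = to_ycbcr(20, 255, 255)
y2, _, cr2 = to_ycbcr(30, 255, 255)
cr_avg = (cr1 + cr2) / 2  # 4:2:2: one chroma sample per pixel pair

r1, r2 = red_back(y1, cr_avg), red_back(y2, cr_avg)
print(round(r1), round(r2))  # the 20/30 pair drifts toward each other
```

Luma is kept per pixel, so the two values stay distinct, but sharing one chroma sample pulls both reds toward the pair's average, exactly the pattern in the question's output.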
JPEG will always carry risk of lossyness, see Is Jpeg lossless when quality is set to 100?.
Your best bet is to use another format, especially if your experiments are for science :) Even if you're forced to start with JPEG (which seems unlikely) you should immediately convert to a lossless format for any kind of analysis and modification.
If you really want to try lossless JPEG work with Python, you can try jpegtran, "the lossless jpeg image transformation software from the Independent JPEG Group", but as @Mark notes, this won't get you very far.
By the way, quantization is used in lossy and lossless compression alike, so my guess is that
...100 completely disables the JPEG quantization stage.[1]
simply means that it's not compressed at all.
Believe I've figured out how to keep the current color subsampling and other quality details:
from PIL import Image, JpegImagePlugin as JIP
img = Image.open(filename)
img.save(
filename + '2.jpg', # copy
format='JPEG',
exif=img.info['exif'], # keep EXIF info
optimize=True,
qtables=img.quantization, # keep quality
subsampling=JIP.get_sampling(img), # keep color res
)
Per https://www.exiv2.org/tags.html I've found that the YCbCrSubSampling tag is not kept in EXIF in JPEG files:
In JPEG compressed data a JPEG marker is used
instead of this tag.
This must be why there is another function, in a seemingly out-of-the-way place, to grab it.
(Believe I found it here: https://newbedev.com/determining-jpg-quality-in-python-pil)