Essentially my problem is just finding an easy way to create an image file from an array.
My problem is unparsing CUPS raster files into images. The CUPS RGB raster file header is 1800 bytes. If I input the width and height I can read the raster array contained in the file correctly into Photoshop in Mac order, with interleaved 16 bit data 00RRGGBB. I have written a utility which extracts the width and height from the header.
I'd like to write another command-line utility that takes the width, height and file name as inputs, strips the first 1800 bytes from the raster file, and writes the array contained in the rest as a TIFF, BMP, or whichever image format is easiest to write. Any well-known image format will do.
The program should be in C or Python and run under Mac and Linux.
For Python, PIL is the tool for this task. Use the putdata() method on image objects to put the pixels from a list into an image.
You can try GDAL, which supports many image formats. You can use the RasterIO(...) method for reading image data.
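If neither library is to hand, the "strip the header and wrap the pixels" step can also be sketched with the standard library alone by writing a binary PPM (P6), which is about the simplest well-known format to produce. This is only a sketch under assumptions: the function name raster_to_ppm is made up for illustration, and it assumes the data after the 1800-byte header is plain interleaved 8-bit RGB (3 bytes per pixel). If the file really carries a padding byte or 16-bit samples, the pixel bytes would need repacking first.

```python
HEADER_SIZE = 1800  # header length stated in the question


def raster_to_ppm(raster_path, ppm_path, width, height, header_size=HEADER_SIZE):
    """Skip the header and wrap the remaining RGB bytes in a binary PPM (P6)."""
    # Assumption: 3 bytes (R, G, B) per pixel, rows stored top to bottom.
    expected = width * height * 3
    with open(raster_path, "rb") as src:
        src.seek(header_size)
        pixels = src.read(expected)
    if len(pixels) != expected:
        raise ValueError(
            "expected %d pixel bytes, found %d" % (expected, len(pixels))
        )
    with open(ppm_path, "wb") as dst:
        # PPM header: magic number, width and height, maximum sample value.
        dst.write(b"P6\n%d %d\n255\n" % (width, height))
        dst.write(pixels)
```

A call such as raster_to_ppm("page.ras", "page.ppm", width, height) (file names hypothetical) would then produce a file that most image tools can open or convert to TIFF.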
I am trying to read image.jpg (RGB) into an array in Python without any additional module, but it doesn't work:
pic = open('image.jpg')
array=[]
with open(p, 'rb') as inf:
    jpgdata = inf.read()
values=jpgdata.split()
array=array.append(values[:][:])
print (array)
Can anyone help me read a 3-band (RGB) image in Python without using an external module?
A JPEG image is not just a series of pixels, unlike some other formats like BMP.
In order to get the pixel data from a JPEG image you need to decompress it, which involves reading its header data, then rebuilding the data from 8x8px blocks which contain information regarding the brightness and color (YCbCr).
You need to:
Build the Huffman tree and decode the blocks
Invert the discrete cosine transform with the given parameters
Convert the YCbCr values back into RGB
Place each block into its corresponding location in the image
Building a simple decoder from scratch is certainly possible, but it's not going to be done in a few lines.
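To give a feel for just one of those steps, here is a minimal sketch of the final colour conversion, using the full-range YCbCr to RGB equations from the JFIF convention. The function name is made up for illustration, and everything before this step (Huffman decoding, dequantisation, the inverse DCT, chroma upsampling) is assumed to have been done already.

```python
def ycbcr_to_rgb(y, cb, cr):
    """Convert one full-range (JFIF) YCbCr sample to an 8-bit RGB tuple."""
    # Cb and Cr are stored with an offset of 128.
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    # Round and clamp, since the arithmetic can fall slightly outside 0..255.
    return tuple(max(0, min(255, int(round(v)))) for v in (r, g, b))
```

Neutral chroma (Cb = Cr = 128) leaves a grey value unchanged, which is a handy sanity check when wiring this into a decoder.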
The main problem:
I have a map step where I render a large number of sectors of an image in parallel:
1 2
3 4
worker a -> 1
worker b -> 2
...
merge 1,2,3,4 to make final image
If it can fit in memory
With images that are relatively small and can fit in RAM, one can simply use PIL's functionality:
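from PIL import Image

def merge_images(image_files, x, y):
    # x and y are the number of sectors across and down the grid
    images = [Image.open(f) for f in image_files]
    width, height = images[0].size
    new_im = Image.new('RGB', (width * x, height * y))
    for n, im in enumerate(images):
        # row-major layout: column is n % x, row is n // x
        new_im.paste(im, ((n % x) * width, (n // x) * height))
    return new_im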
Unfortunately, I am going to have many, many large sectors. I want to merge the pictures into a single final image of about 40,000 x 60,000 pixels, which I estimate to be around 20 GB (or maybe even more).
So obviously, we can't approach this problem in RAM. I know there are alternatives like memmap'ing arrays and writing to slices, which I will try. However, I am looking for as-out-of-the-box-as-possible solutions.
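For reference, the memory-mapping idea can be sketched with nothing but the standard library's mmap module. This is only an illustration under assumptions: the canvas is a raw, uncompressed, row-major RGB file (not a viewable image format), each sector arrives as a bytes object, and the function names are made up. numpy.memmap would express the same thing with array slices.

```python
import mmap


def create_canvas(path, width, height, channels=3):
    """Create a zero-filled raw file big enough for the whole image."""
    with open(path, "wb") as f:
        f.truncate(width * height * channels)


def paste_sector(path, canvas_width, sector, sector_width, sector_height,
                 left, top, channels=3):
    """Copy one sector (raw row-major bytes) into the canvas, row by row."""
    row_len = sector_width * channels
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as canvas:
        for row in range(sector_height):
            # Byte offset of this sector row inside the full-width canvas.
            dst = ((top + row) * canvas_width + left) * channels
            src = row * row_len
            canvas[dst:dst + row_len] = sector[src:src + row_len]
```

Only the pages being written are resident at any time, so the canvas can be far larger than RAM, but the result still has to be converted to a real image format afterwards.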
Does anyone know of any easier alternatives? Even though all the approaches I've tried so far are in Python, the solution doesn't need to be in Python.
pyvips can do exactly what you want very quickly and efficiently. For example:
import sys
import pyvips
images = [pyvips.Image.new_from_file(filename, access="sequential")
          for filename in sys.argv[2:]]
final = pyvips.Image.arrayjoin(images, across=10)
final.write_to_file(sys.argv[1])
The access="sequential" option tells pyvips that you want to stream the image. It will only load pixels on demand as it generates output, so you can merge enormous images using only a little memory. The arrayjoin operator joins an array of images into a grid whose width in tiles is set by the across argument. It has quite a few layout options: you can specify borders, overlaps, background, centring behaviour and so on.
I can run it like this:
$ for i in {1..100}; do cp ~/pics/k2.jpg $i.jpg; done
$ time ../arrayjoin.py x.tif *.jpg
real 0m2.498s
user 0m3.579s
sys 0m1.054s
$ vipsheader x.tif
x.tif: 14500x20480 uchar, 3 bands, srgb, tiffload
So it joined 100 JPG images to make a 14,500 x 20,480 pixel mosaic in about 2.5s on this laptop and, from watching top, needed about 300 MB of memory. I've used it to join over 30,000 images into a single file, and it would go higher. I've made images of over 300,000 by 300,000 pixels.
The pyvips equivalent of PIL's paste is insert. You could use that too, though it won't work so well for very large numbers of images.
There's also a command-line interface, so you could just enter:
vips arrayjoin "$(echo *.jpg)" x.tif --across 10
To join up a large set of JPG images.
I would suggest using the TIFF file format. Most TIFF files are striped (one or more scan lines are stored as a block on file), but it is possible to write tiled TIFF files (where the image is divided into tiles, and each is stored as an independent block on file).
LibTIFF is the canonical way of reading and writing TIFF files. It has an easy way of creating a new TIFF file and adding tiles one at a time. Thus, your program can create the TIFF file, obtain one sector, write it as (one or more) tiles to the TIFF file, obtain the next sector, etc. You would have to choose your tile size to evenly divide one sector.
There is a Python binding to LibTIFF called, what else, PyLibTIFF. It should allow you to follow the above model from within Python. That same repository has a pure Python module to read and write TIFF files, but I don't know whether it can write TIFF files in tiles or in chunks. There are many other Python modules for reading and writing TIFF files, but most will write one matrix as a TIFF file rather than allow you to write a file one tile at a time.
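Choosing a tile size that evenly divides a sector is simple arithmetic. The sketch below lists the candidates for one dimension. The helper name is made up, and it relies on the TIFF 6.0 requirement that tile width and tile length be multiples of 16.

```python
def tile_sizes_for_sector(sector_size, step=16):
    """List tile sizes that are multiples of `step` and divide sector_size exactly."""
    return [size for size in range(step, sector_size + 1, step)
            if sector_size % size == 0]
```

For a 512-pixel sector this yields 16, 32, 64, 128, 256 and 512. An empty list means the sector size itself is not a multiple of 16, in which case the sectors would need padding or a different size.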
My overall goal is to crop several regions from an input mirax (.mrxs) slide image to JPEG output files.
Here is what one of these images looks like:
Note that the darker grey area is part of the image, and the regions I ultimately wish to extract in JPEG format are the 3 black square regions.
Now, for the specifics:
I'm able to extract the color channels from the mirax image into 3 separate TIFF files using vips on the command line:
vips extract_band INPUT.mrxs OUTPUT.tiff[tile,compression=jpeg] C --n 1
Where C corresponds to the channel number (0-2), and each output file is about 250 MB in size.
The next job is to somehow recognize and extract the regions of interest from the images, so I turned to several python imaging libraries, and this is where I encountered difficulties.
When I try to load any of the TIFFs with OpenCV using:
i = cv2.imread('/home/user/input_img.tiff',cv2.IMREAD_ANYDEPTH)
I get the error: (-211) The total matrix size does not fit to "size_t" type in function setSize
I managed to get a little more traction with Pillow, by doing:
from PIL import Image
tiff = Image.open('/home/user/input_img.tiff')
print(len(tiff.tile))
print(tiff.tile[0])
print(tiff.info)
which outputs:
636633
('jpeg', (0, 0, 128, 128), 8, ('L', ''))
{'compression': 'jpeg', 'dpi': (25.4, 25.4)}
However, beyond loading the image, I can't seem to perform any useful operations. For example, tiff.tostring() results in a MemoryError (I tried this in an attempt to convert the PIL object to a numpy array). I'm not sure this operation is even valid given the existence of tiles.
From my limited understanding, these TIFFs store the image data in 'tiles' (of which the above image contains 636633) in a JPEG-compressed format.
It's not clear to me, however, how one would extract these tiles for use as regular JPEG images, or even whether the sequence of steps I outlined above is a useful way of accomplishing the overall goal of extracting the ROIs from the mirax image.
If I'm on the right track, then some guidance would be appreciated, or, if there's another way to accomplish my goal using vips/openslide without python I would be interested in hearing ideas. Additionally, more information about how I could deal with or understand the TIFF files I described would also be helpful.
The ideal situations would include:
1) Some kind of autocropping feature in vips/openslide which can generate JPEGs from either the TIFFs or the original mirax image, along the lines of what the following command does, but without generating tens of thousands of images:
vips dzsave CMU-1.mrxs[autocrop] pyramid
2) Being able to extract tiles from the TIFFs and store the data corresponding to the image region as a numpy array in order to detect the 3 ROIs using OpenCV or another method.
I would use the vips Python binding. It's very like PIL, but it can handle these huge images. Try something like:
import sys
from gi.repository import Vips
# left, top, width and height are the pixel coordinates of the region you want
slide = Vips.Image.new_from_file(sys.argv[1])
tile = slide.extract_area(left, top, width, height)
tile.write_to_file(sys.argv[2])
You can also extract areas on the command-line, of course:
$ vips extract_area INPUT.mrxs OUTPUT.tiff left top width height
Though that will be a little slower than a loop in Python. You can use crop as a synonym for extract_area.
openslide attaches a lot of metadata to the image describing the layout and position of the various subimages. Try:
$ vipsheader -a myslide.mrxs
And have a look through the output. You might be able to calculate the position of your subimages from that. I would also ask on the openslide mailing list; they are very expert and very helpful.
One more thing you could try: get a low-res overview, corner-detect on that, then extract the tiles from the high-res image. To get a low-res version of your slide, try:
$ vips copy myslide.mrxs[level=7] overview.tif
Level 7 is downsampled by 2 ** 7, so 128x.
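Mapping a region found on the overview back to full-resolution coordinates is then a matter of multiplying by that factor. A minimal sketch, assuming each level halves the resolution as described above; the function name is made up for illustration.

```python
def overview_box_to_full_res(left, top, width, height, level):
    """Scale a box found on a level-N overview up to level-0 pixel coordinates."""
    # Assumption: each level is downsampled by a factor of 2 from the previous one.
    scale = 2 ** level
    return left * scale, top * scale, width * scale, height * scale
```

The four returned values are in the same order as the left, top, width, height arguments that extract_area takes, so a box detected at level 7 can be scaled and passed straight on.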
I have svg files which I would like to compare based on their dimensions.
I read that PIL is the best imaging tool in Python. Does PIL handle SVG files? I can't seem to find this anywhere.
When googling I saw people interpreting SVG files as text, which seems counterintuitive.
If not PIL, what is the best way to get the x and y dimensions of an .svg file?
Thanks
PIL handles many image types, but not (yet?) SVG. Partly, this is because SVG is a set of instructions to produce an image, not a container for raw image data.
Fortunately, SVG can be read as XML, using the tool of your choice; for example, xml.etree.ElementTree in the Python standard library.
Unfortunately, by its nature, SVG doesn't have a single native size. Instead, it has two size concepts: the view box, and the height and width attributes.
If your svg file has width and height attributes, you can safely use those as the x and y dimensions, respectively. Otherwise, if it has a viewBox attribute, it is meant to scale to any size you need it to; however, you can use its third and fourth numbers as width and height, if you need to.
Worse, an SVG file could lack both. In that case, one could potentially compute a height and width based on the elements in the file, but that's trickier than anyone really wants to do, given the full capabilities of the format.
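The lookup order described above can be sketched with xml.etree.ElementTree as follows. The function name is made up, and the sketch makes some simplifying assumptions: unit suffixes such as px or mm are dropped rather than converted, percentage sizes are treated as missing, and None is returned when neither width/height nor viewBox gives an answer.

```python
import re
import xml.etree.ElementTree as ET

_NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)")


def _leading_number(text):
    """Return the leading number of '120px' and the like, or None."""
    if text is None or text.strip().endswith("%"):
        return None
    match = _NUMBER.match(text.strip())
    return float(match.group()) if match else None


def svg_size(svg_text):
    """Return (width, height) from the root element, or None if unavailable."""
    root = ET.fromstring(svg_text)
    width = _leading_number(root.get("width"))
    height = _leading_number(root.get("height"))
    if width is not None and height is not None:
        return width, height
    # Fall back to the third and fourth numbers of the viewBox.
    view_box = root.get("viewBox")
    if view_box:
        parts = re.split(r"[\s,]+", view_box.strip())
        if len(parts) == 4:
            return float(parts[2]), float(parts[3])
    return None
```

Reading a file is then just svg_size(open(path).read()), and comparing two files comes down to comparing the returned tuples.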
How can I replace a colour across multiple images with another in python? I have a folder with 400 sprite animations. I would like to change the block coloured shadow (111,79,51) with one which has alpha transparencies. I could easily do the batch converting using:
img = glob.glob(filepath + r'\*.bmp')
However, I don't know how to change the pixel colours. If it makes any difference, the images are all 96x96 and I don't care how long the process takes. I am using Python 3.2.2, so I can't really use PIL (I think).
BMP is a Windows file format, so you will need PIL or something like it, or you can roll your own reader/writer. The basic modules won't help as far as I'm aware. You can read PPM and GIF using Tk (PhotoImage()), which is part of the standard distribution, and use get() and put() on that image to change pixel values. See references online, because it's not straightforward - the pixels come from get() as 3-tuple integers, but need to go back to put() as space-separated hex text!
Are your images in indexed mode (8 bits per pixel with a palette), or "truecolor" 32bpp images? If they are in indexed mode, it would not be hard to simply mark the palette entry for that color as transparent across all files.
Otherwise, you will really have to process all pixel data. It could also be done by writing a Python script for GIMP, but that would require Python 2 nonetheless.
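Processing all the pixel data boils down to a loop like the one below, whichever library ends up loading and saving the files. This is a sketch under assumptions: pixels is any sequence of RGB or RGBA tuples, the function name is made up, and the half-transparent black replacement is only an illustrative choice. The shadow colour is the one from the question.

```python
SHADOW = (111, 79, 51)  # shadow colour given in the question


def replace_colour(pixels, old_rgb=SHADOW, new_rgba=(0, 0, 0, 128)):
    """Return RGBA pixels with every occurrence of old_rgb swapped for new_rgba."""
    result = []
    for pixel in pixels:
        rgb = tuple(pixel[:3])
        if rgb == tuple(old_rgb):
            result.append(tuple(new_rgba))
        else:
            # Keep an existing alpha value, otherwise make the pixel opaque.
            alpha = pixel[3] if len(pixel) > 3 else 255
            result.append(rgb + (alpha,))
    return result
```

With PIL (or a compatible fork) available, the input could come from getdata() on an image converted to RGBA and the result could go back in with putdata(). The output would need a format that stores alpha, such as PNG.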