I am currently working on a media project. We've shot long clips that are mainly dark, if not black. I have decomposed these clips into their frames (>500k single frames) and put them in some folders. Now, my goal is to find and select those frames that are not black or mainly dark: around a thousand out of the total.
This seems like a job that a simple Python script can handle without too much effort. I know that scikit-image is commonly used to work with images, but I don't know how to come up with a script that does the job neatly. I have some experience with scientific programming, but image manipulation is a bit out of my field.
For example, this image should be reported as black and thus ignored, while this other one, although in low light, should be kept as good.
Ideally, I would have a script that uses one or more criteria to determine whether an image is totally dark or not and, in the latter case, puts it into another folder for human (me) inspection.
Any help is extremely appreciated!
You can get the mean of each image very simply without writing any code using ImageMagick which is available for Windows, Linux and macOS.
Like this:
magick identify -format '%[fx:mean*255] %f\r\n' black.jpg
1.01936 black.jpg
and:
magick identify -format '%[fx:mean*255] %f\r\n' nonblack.jpg
1.72921 nonblack.jpg
To improve performance, I would use GNU Parallel on macOS or Linux, but in Windows, I would open a new command prompt for each directory and run several scripts in parallel, or start one script processing all the files ending in 0 or 1, a second one processing files ending in 2 or 3, a third one processing files ending in 4,5 or 6 and a final one processing files ending in 7,8 or 9.
If I was doing it in Python I would use a multiprocessing pool to speed things up, by the way.
OpenCV is enough to solve this problem.
Use np.mean(image, axis=2) to average the colour channels at each pixel; then you can easily check for the black ones.
As pointed out in the replies, taking a 'mean' of the image helped. After reading in the image, I compute np.mean(img, axis = 2).mean() so that I have the mean of the three colour channels. If this mean is low (<2) then the image is discarded, otherwise the file is copied to another folder.
The code is not really time efficient as it takes ~3 hours for 200k files, but does the trick!
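For what it's worth, the mean test and the multiprocessing pool suggested above could be combined roughly like this. It is only a sketch under assumptions: the frames are .jpg files, Pillow and NumPy are installed, the threshold of 2 is the one quoted above, and the names is_dark, check_file and copy_bright_frames are made up for illustration.

```python
import shutil
from multiprocessing import Pool
from pathlib import Path

import numpy as np

DARK_THRESHOLD = 2.0  # assumed cut-off on a 0-255 scale (the "<2" mentioned above)


def is_dark(pixels, threshold=DARK_THRESHOLD):
    """Return True if the mean over all pixels and channels is below threshold."""
    return float(np.mean(pixels)) < threshold


def check_file(path):
    """Load one frame and return (path, dark?). Pillow is imported lazily."""
    from PIL import Image  # assumption: Pillow is available

    with Image.open(path) as im:
        return path, is_dark(np.asarray(im))


def copy_bright_frames(src_dir, dst_dir, workers=4):
    """Copy every frame that is not dark from src_dir (recursively) to dst_dir."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    files = sorted(Path(src_dir).rglob("*.jpg"))  # assumption: frames are .jpg
    with Pool(workers) as pool:
        # imap_unordered hands out files in chunks so all workers stay busy
        for path, dark in pool.imap_unordered(check_file, files, chunksize=256):
            if not dark:
                shutil.copy2(path, dst / path.name)
```

To use it, call something like copy_bright_frames("frames", "keep") from under an `if __name__ == "__main__":` guard (the folder names are placeholders). Note that files with the same name in different source folders would overwrite each other in the destination.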
You'll probably want to use PIL (the Python Imaging Library).
I did a quick search for code that calculates the average of an image and found this snippet:
Image Average Color
from PIL import Image

def get_average_color(origin, n, pixels):
    """ Returns a 3-tuple containing the RGB value of the average color of the
    given square bounded area of length = n whose origin (top left corner)
    is (x, y) in the given image"""
    x, y = origin  # Python 3 no longer accepts tuple parameters
    r, g, b = 0, 0, 0
    count = 0
    # range(x, x + n) covers exactly n pixels per side, as the docstring says
    for s in range(x, x + n):
        for t in range(y, y + n):
            pixlr, pixlg, pixlb = pixels[s, t]
            r += pixlr
            g += pixlg
            b += pixlb
            count += 1
    return (r // count, g // count, b // count)

# convert('RGB') guarantees three values per pixel; load() gives pixel access
image = Image.open('test.png').convert('RGB').load()
r, g, b = get_average_color((24, 290), 50, image)
print(r, g, b)
Maybe you could just iterate through all of the images in your folder and log (or copy) the ones that are above a certain value.
There's probably a more elegant way to do this using PIL but maybe this will get you started.
Hope it helps!
UPDATE: I tried increasing size in the chess.svg.board and it somehow cleared all the rendering issues at size = 900 1800
I tried using the svglib and reportlab to make .png files from .svg, and here is how the code looks:
import sys
import chess.svg
import chess
from svglib.svglib import svg2rlg
from reportlab.graphics import renderPM
board = chess.Board("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR")
drawing = chess.svg.board(board, size=350)
f = open('file.svg', 'w')
f.write(drawing)
drawing = svg2rlg("file.svg")
renderPM.drawToFile(drawing, "file.png", fmt="png")
If you try to open file.png, a lot of parts of the image are missing, which I guess are rendering issues. How can you fix this?
Sidenote: I am also getting a lot of 'x_order_2: colinear!' messages when running this on a Discord bot, but I am not sure if this affects anything yet.
THIS!! I am having the same error with the same libraries... I didn't find a solution but just a workaround which probably won't help too much in your case, where the shapes generating the bands are not very sparse vertically.
I'll try playing with the file dimensions too, but so far this is what I got. Note that my svg consists of black shapes on a white background (hence the 255 - x in the following code)
Since the appearance of the bands is extremely random, and processing the same file several times in a row produces different results, I decided to take advantage of randomness: what I do is I export the same svg a few times into different pngs, import them all into a list and then only take those pixels that are white in all the exported images, something like:
import os
from functools import reduce
import imageio

images_files = [my_convert_function(svgfile=file, index=i) for i in range(3)]
images = [255 - imageio.imread(x) for x in images_files]
result = reduce(lambda a, b: a & b, images)
imageio.imwrite(<your filename here>, result)
for x in images_files:
    os.remove(x)
where my_convert_function contains your same svg2rlg and renderPM.drawToFile, and returns the name of the png file being written. The index 'i' is to save several copies of the same png with different names.
It's some very crude code but I hope it can help other people with the same issue
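To make the combining step easier to see, here is a tiny self-contained illustration on made-up arrays instead of real renders. It assumes NumPy is available; keep_common_pixels and the sample values are hypothetical and only mirror the bitwise AND used above.

```python
import numpy as np


def keep_common_pixels(renders):
    """Bitwise-AND a list of equally sized uint8 images, pixel by pixel."""
    stack = np.stack(renders)
    return np.bitwise_and.reduce(stack, axis=0)


# Three hypothetical 1x4 renders, already inverted (shape = 255, background = 0).
# The first two pixels are set in every render; the last two are set only once.
renders = [
    np.array([[255, 255, 255, 0]], dtype=np.uint8),
    np.array([[255, 255, 0, 0]], dtype=np.uint8),
    np.array([[255, 255, 0, 255]], dtype=np.uint8),
]
combined = keep_common_pixels(renders)
print(combined.tolist())  # [[255, 255, 0, 0]]
```

A pixel survives only if it is set in every render, so anything that shows up in just one or two of the exports is dropped.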
The format parameter has to be in uppercase
renderPM.drawToFile(drawing, "file.png", fmt="PNG")
I'm trying to make a plugin for GIMP that opens two images as separate layers and transforms one of them (more on that below). I'm using GIMP 2.10.12.
I've been struggling to find proper, complete documentation for GIMP's Python interface and am mostly just working from what code snippets I've been able to find. This is what I have so far:
#!/usr/bin/env python2
import os
from gimpfu import *

def load_pair(img_f):
    mask_f = img_f.replace(IMG_DIR, PRED_DIR)
    result_f = os.path.splitext(img_f.replace(IMG_DIR, SAVE_DIR))[0]
    result_dir = os.path.dirname(result_f)
    if not os.path.isdir(result_dir):
        os.makedirs(result_dir)
    img = gimp.Image(100, 100)
    pdb.gimp_display_new(img)
    for f, name, pos in ((img_f, "Image", 0), (mask_f, "Mask", 1)):
        layer = pdb.gimp_file_load_layer(img, f)
        pdb.gimp_layer_set_name(layer, name)
        pdb.gimp_image_insert_layer(img, layer, None, pos)

register(
    "python_fu_open_image_pair",
    ...,
    "<Toolbox>/Image/Open Image Pair",
    "",
    [(PF_FILE, "img_f", "Image:", None)],
    [],
    load_pair
)

main()
This kind of does what I want but with a couple of problems.
Question 1
Currently I'm using gimp.Image(100, 100) to open a new image. This means I have to then Fit Canvas to Layers and adjust the zoom and position every time I load a new image pair.
Is there a way to find an image's size from pdb before opening it or do I have to use another library (like PIL) for this? I'm trying to keep my plugin's dependencies to a minimum.
The two images are guaranteed to have the same size.
Since File->Open automatically adjusts the canvas to the image size, I would hope there'd be a nice way to achieve this.
Question 2
I would like to automatically create and set the current working file to result_f + '.xcf' (see above code) - such that File -> Save would automatically save to this file. Is this possible in pdb?
Question 3
Most importantly, I currently have the Mask images saved as black-and-white images. Upon loading a mask as a new layer, I'd like to transform the black colour to transparent and white colour to green (0,255,0). Additionally, since they are saved as .jpg images, the white and black aren't necessarily exactly 255 and 0 intensities but can be off by a bit.
How do I do this automatically in my plugin?
The good way would be to load the first image normally, and the rest as additional layers. Otherwise you can reset the canvas size (pdb.gimp_image_resize(...)) once you have loaded all the layers, and then create the Display.
You can give a name and a default file to the image by setting image.name and image.filename.
To convert the white to green, use pdb.plug_in_colors_channel_mixer(...) and set all the gains to 0., except green in green. To make the black transparent, use pdb.plug_in_colortoalpha(...).
PS: For color2alpha:
import gimpcolor
color=gimpcolor.RGB(0,255,0) # green, integer args (0->255)
# or
color=gimpcolor.RGB(0.,1.,0) # green, floating point args (0.->1.)
pdb.plug_in_colortoalpha(image, layer, color)
The Python doc is a direct copy of the Scheme one. In Python, the RUN-INTERACTIVE parameter is not positional, so it doesn't appear in most calls. If you need it, pass it as a keyword parameter.
I am trying to trim the parts of an image where a complete row contains nothing except white.
I tried using matplotlib:
I converted the image into a matrix, checked whether (r,g,b) = (0,0,0) or (1,1,1), and removed an entire row from the image if every (r,g,b) in that row was of that kind.
The matrix looks like [ [ [r,g,b], [r,g,b], ... ], ..., [ [r,g,b], [r,g,b], ... ] ]
I achieved what I needed, but I am running this on around 500 images and it takes about 30 minutes. Can I do it in a better way?
and the required image should be like
Edit-1 :
I tried the trim method from the wand package:
from wand.image import Image as wand_img
from wand.color import Color

with wand_img(filename=path) as i:
    # i.trim(color=Color('white'))
    # i.trim(color=Color('white'))
    i.trim()
    i.trim()
    i.save(filename='output.png')
but it does not work for the following type of images
You could use ImageMagick which is installed on most Linux distros and is available for macOS and Windows.
To trim one image, start a Terminal (or Command Prompt on Windows) and run:
magick input.png -fuzz 20% -trim result.png
That will give you this - though I added a black border so you can make out the extent of it:
If you have lots to do, you can do them in parallel with GNU Parallel like this:
parallel -X magick mogrify -trim ::: *png
I made 1,000 copies of your image and did the whole lot in 4 seconds on a MacBook Pro.
If you don't have GNU Parallel, you can do 1,000 images in 12 seconds like this:
magick mogrify -trim *png
If you want to do it with Python, you could try something like this:
#!/usr/bin/env python3
from PIL import Image, ImageChops
# Load image and convert to greyscale
im = Image.open('image.png').convert('L')
# Invert image and find bounding box
bbox = ImageChops.invert(im).getbbox()
# Debug
print(*bbox)
# Crop and save
result = im.crop(bbox)
result.save('result.png')
It gives the same output as the ImageMagick version. I would suggest you use multiprocessing to do lots in parallel for best performance.
The sequential version takes 65 seconds for 1,000 images and the multi-processing version takes 14 seconds for 1,000 images.
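If the goal really is the one described in the question (dropping every row that is entirely white, not just trimming the border), a vectorised NumPy version avoids the slow per-pixel loops. This is only a sketch assuming 8-bit images already loaded as arrays; drop_white_rows and the tolerance of 5 are made up for illustration.

```python
import numpy as np


def drop_white_rows(img, tol=5):
    """Remove every row whose pixels are all within tol of pure white (255)."""
    white = img >= 255 - tol          # per-value test, shape like img
    if img.ndim == 3:
        white = white.all(axis=2)     # a pixel is white if all channels are
    keep = ~white.all(axis=1)         # keep rows with at least one non-white pixel
    return img[keep]


# Hypothetical 4x3 RGB image: rows 0 and 2 are white, rows 1 and 3 contain ink.
img = np.full((4, 3, 3), 255, dtype=np.uint8)
img[1, 1] = (0, 0, 0)
img[3, 0] = (10, 20, 30)
print(drop_white_rows(img).shape)  # (2, 3, 3)
```

The tolerance plays the same role as -fuzz above: near-white pixels (for example JPEG noise) still count as white.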
Using two trims in Imagemagick 6.9.10.25 Q16 Mac OSX Sierra works fine for me. Your image has a black bar on the right hand side. The first trim will remove that. The second trim will remove the remaining excess white. You may need to add some fuzz (tolerance) amount for the trim. But I did not need it.
Input:
convert img.png -trim +write tmp1.png -trim result.png
Result of first trim (tmp1.png):
Final Result after second trim:
ADDITION:
Looking at the docs for Python Wand:
trim(*args, **kwargs)
Remove solid border from image. Uses top left pixel as a guide by default, or you can also specify the color to remove.
Parameters:
color (Color) – the border color to remove. if it’s omitted top left pixel is used by default
fuzz (numbers.Integral) – Defines how much tolerance is acceptable to consider two colors as the same.
You will need to specify color=black for the first trim, since this version of trim uses the top left corner for trimming. Command line Imagemagick looks at all corners. If that fails, then add some fuzz value.
I am trying to get the Python 2.7 PIL library to work with JPEG images that are only available as a stream coming from an HDD image and are not complete.
I have set the option:
ImageFile.LOAD_TRUNCATED_IMAGES = True
And I load the stream as far as it is available (or better said: as far as I am 100% sure that this data is still an image, not some other file type). I have tested different things and, as far as I can tell (for JPEGs), PIL only accepts it as a valid JPEG image if it finds the 0xFFDA (Start of Scan) marker. This is a short example of how I load the data:
from PIL import Image, ImageFile
from StringIO import StringIO

ImageFile.LOAD_TRUNCATED_IMAGES = True

with open("/path/to/image.raw", 'rb') as fp:
    fp.seek("""jump to position in image where JPEG starts""")
    data = fp.read("""number of bytes I know that those belong to that jpeg""")

img = Image.open(StringIO(data))  # This would throw exception if the data does
                                  # not contain the 0xffda marker
pixel = img.load()  # Would throw exception if LOAD_TRUNCATED_IMAGES = false
width, height = img.size  # img.size is (width, height)
for i in range(width):
    for j in range(height):
        print pixel[i, j]
On the very last line I expected (or hoped) to see at least the read pixel data to be displayed. But for every pixel it returns (0,0,0).
The Question: Is what I am trying here not possible with PIL?
Some weeks ago I tried the same with an image file I truncated myself, simply by cutting data from it with an editor. It worked for the pixel data that was available. As soon as it reached a pixel that I had cut off, the program threw an exception (I will try this again later today to make sure that I am not remembering wrong).
If somebody is wondering why I am doing this: I need to make sure that the image/picture inside that hdd image is in consecutive blocks/clusters and is not fragmented. To make sure of this I wanted to use pixel matching.
EDIT:
I have tried it again and this is what I have seen.
I opened a truncated image in GIMP and it showed me a few pixel lines in the upper part, but PIL was not able to at least give me the RGB values of those pixels. It always returns (0,0,0).
I made the image slightly bigger such that the lower 4/5 of the image was not visible, but that was enough for PIL to show me the RGB values that were available. Everything else was (0,0,0).
I am still not 100% sure whether PIL can show me the RGB values, even if only a little pixel data is available.
I would try it with an uncompressed format like TGA. Because JPEG is a compressed format, extracting pixels from an incomplete image may not be meaningful. JPEG does not store pixel values directly: it stores entropy-coded, quantised DCT coefficients for 8x8 blocks, and the decoder reconstructs the pixel values from them. With a truncated stream, at best the blocks that were decoded before the data ran out carry real values.
I have the same problem with Pillow==9.2.0.
After downgrading to Pillow==8.3.2, it works.
I don't really know about streaming, but I think that you simply cannot access rgb value the way you do.
Try:
rgb_im = img.convert('RGB')
r, g, b = rgb_im.getpixel((i, j))
I have a small problem using the video creation capability of OpenCV.
For the same images, I get a weird output depending on the output size I want.
Here is an example of the results I can get.
http://www.youtube.com/watch?v=1wm8VjyfdyA&feature=youtu.be
I tried with several different sets of images, and on different computers.
It seems to run fine on Windows, and I have problems with the OpenCV that ships in Ubuntu packages (currently 2.3.1-7).
As the problem is not reproducible on my Windows machine, I guess it was either fixed in 2.4 or is specific to Linux.
Here is a (Python) test code that highlights the problem:
import os
import cv

in_dir = "../data/inputs/sample-test"
out = "output.avi"

# loading images, create Guys and store it into guys
frameSize = (652, 498)
#frameSize = (453, 325)

fourcc = cv.CV_FOURCC('F', 'M', 'P', '4')
my_video = cv.CreateVideoWriter(out,
                                fourcc,
                                15,
                                frameSize,
                                1)

for root, _, files in os.walk(in_dir):
    for a_file in files:
        guy_source = os.path.join(in_dir, a_file)
        print guy_source
        image = cv.LoadImage(guy_source)

        small_im = cv.CreateImage(frameSize,
                                  image.depth,
                                  image.nChannels)
        cv.Resize(image, small_im, cv.CV_INTER_LINEAR)
        cv.WriteFrame(my_video, small_im)

print "Finished !"
My concern is that the video is only fine for some output sizes ((652, 498) is OK, for example).
The behaviour is the same whatever codec I use.
If not a fix, I'd like some more information about the reason for this bug.
As I want to ship for Ubuntu, I'd better use their packaging system and keep 2.3 for some time.
So I would like to know how I can sensibly work around the problem by choosing suitable sizes.
Any information is welcome
Thx !
This is a common problem in video coding. As you can see, the image is shifted a small amount to the left on each row.
As you may know, the image is saved as one long row of bytes: BGRBGRBGR....
It is also defined by its width and height, and by step - the distance, in bytes, between two consecutive rows. A naive supposition is that the step is 3(channels)*width. But in addition, for memory alignment reasons, the image rows are padded with some extra bytes, in order to make the step value a multiple of 4 (usually) or 16. The reason is that hardware codec acceleration works with aligned data: 32-bit architectures read 32 bits at once, and for SIMD processing, aligned data is loaded faster.
So the image will be represented as
BGRBGR00
BGRBGR00
BGRBGR00
Now, if a codec does not know of this padding, it will read the width of the image as 2, and will interpret the data as follows:
BGRBGR
00BGRB
GR00BG // note the padding: the offset grows with every row
To make sure you do not experience this issue, you should select image width in such a way that the step value (channels*width) is a multiple of four. All of the standard resolutions have this property, and this is one of the reasons they were selected so:
640x480
1024x768
etc
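To check a candidate width before encoding, the arithmetic above can be sketched in a few lines of plain Python. The helper names (row_step, needs_padding, misread_rows) are made up for illustration, and 4-byte alignment with 3 channels is assumed, as in the explanation above.

```python
def row_step(width, channels=3, align=4):
    """Bytes per stored row when each row is padded up to a multiple of align."""
    raw = width * channels
    return (raw + align - 1) // align * align


def needs_padding(width, channels=3, align=4):
    """True if channels*width is not already a multiple of align."""
    return (width * channels) % align != 0


def misread_rows(width, height, channels=3, align=4):
    """Slice padded rows as a reader that ignores the padding would."""
    raw = width * channels
    pad = row_step(width, channels, align) - raw
    stored = ("BGR" * width + "0" * pad) * height  # rows as laid out in memory
    return [stored[i * raw:(i + 1) * raw] for i in range(height)]


for w in (652, 453):
    print(w, row_step(w), needs_padding(w))
# 652 1956 False
# 453 1360 True
print(misread_rows(2, 3))  # ['BGRBGR', '00BGRB', 'GR00BG']
```

With the two sizes from the question, 652 needs no padding (3*652 = 1956 is a multiple of 4) while 453 does (3*453 = 1359), which matches the observation that only the first size produces a clean video.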