I have an MP4 video of 720x1280, and I want it in several smaller sizes:
66%, 50% and 33% of the original.
For each of these sizes I use:
clip = mp.VideoFileClip(file)
clip_resized1 = clip.resize(height=int(clip.h * float(0.66666)))
clip_resized1.write_videofile(name + '-2x' + ext)
I do this for each of the sizes, but only some of them work. The 0.66 and 0.33 scales fail; 0.5 works fine.
The files are created for every size, but they are corrupt and cannot be opened (except 0.5, as I said, which works OK).
Any clue on this? Is there a better way to resize video in Python?
I believe the issue is that most video players cannot play an MP4 if one of the clip's dimensions is an odd number. For instance, 720x1280 works on all players, but 721x1280 will only play on some players such as VLC.
So make sure that clip.h and clip.w are both even before writing to a video file. There are several ways to do that: either specify the new dimensions of the clip yourself, like clip.resize((844, 476)), or resize the clip to 66% and add a 1px black margin at the top, like clip.resize(0.66).margin(top=1)
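One way to guarantee even dimensions is to round each scaled side to the nearest even number before resizing. This is a minimal sketch in plain Python; the helper names even and even_size are made up for illustration, and rounding each side independently can change the aspect ratio by a pixel.

```python
def even(n):
    """Round n to the nearest even integer (never below 2)."""
    return max(2, int(round(n / 2.0)) * 2)


def even_size(width, height, scale):
    """Scaled (width, height) with both values forced to be even."""
    return even(width * scale), even(height * scale)


# The three scales from the question, applied to a 720x1280 clip
for scale in (2 / 3, 0.5, 1 / 3):
    print(scale, even_size(720, 1280, scale))
```

The resulting tuple could then be passed to clip.resize(...) in the same way as the explicit (844, 476) example.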
I am currently working on a media project. We've shot very long clips, mainly dark if not black. I have decomposed these clips into their frames (>500k single frames) and put them in some folders. Now, my goal is to find and select the frames that are not black or mainly dark: around a thousand out of the total.
This seems like a job that a simple Python script can handle without too much effort. I know that scikit-image is commonly used to work with images, but I don't know how to come up with a script that does the job neatly. I have some experience with scientific programming, but image manipulation is a bit out of my field.
For example, this image should be reported as black and thus ignored, while this other one, although in low light, should be kept as good.
Ideally, I would have a script that uses one or more criteria to determine whether an image is totally dark or not, and in the latter case puts it into another folder for human (me) inspection.
Any help is extremely appreciated!
You can get the mean of each image very simply without writing any code using ImageMagick which is available for Windows, Linux and macOS.
Like this:
magick identify -format '%[fx:mean*255] %f\r\n' black.jpg
1.01936 black.jpg
and:
magick identify -format '%[fx:mean*255] %f\r\n' nonblack.jpg
1.72921 nonblack.jpg
To improve performance, I would use GNU Parallel on macOS or Linux, but in Windows, I would open a new command prompt for each directory and run several scripts in parallel, or start one script processing all the files ending in 0 or 1, a second one processing files ending in 2 or 3, a third one processing files ending in 4,5 or 6 and a final one processing files ending in 7,8 or 9.
If I was doing it in Python I would use a multiprocessing pool to speed things up, by the way.
OpenCV is enough to solve this problem.
Use np.mean(image, axis=2) to average the colour channels at each pixel; then you can easily check for the black ones.
As pointed out in the replies, taking a 'mean' of the image helped. After reading in the image, I compute np.mean(img, axis = 2).mean() so that I have the mean of the three colour channels. If this mean is low (<2) then the image is discarded, otherwise the file is copied to another folder.
The code is not really time efficient as it takes ~3 hours for 200k files, but does the trick!
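The mean test and the multiprocessing pool suggested earlier could be combined roughly as follows. This is only a sketch under assumptions: it uses Pillow's ImageStat instead of NumPy to get the per-channel means, and the function names, the *.jpg pattern and the worker count are hypothetical. The threshold of 2 is the one mentioned above.

```python
import shutil
from multiprocessing import Pool
from pathlib import Path

THRESHOLD = 2.0  # mean level (0-255) below which a frame counts as black


def is_dark(mean, threshold=THRESHOLD):
    """True if the mean level is below the threshold."""
    return mean < threshold


def mean_level(path):
    """Return (path, mean over all pixels and channels) for one image."""
    # Pillow is imported inside the worker so the helpers above can be
    # used without it being installed.
    from PIL import Image, ImageStat
    with Image.open(path) as img:
        band_means = ImageStat.Stat(img.convert("RGB")).mean
    return path, sum(band_means) / len(band_means)


def copy_bright_frames(src_dir, dst_dir, workers=4):
    """Copy every frame that is not dark from src_dir to dst_dir."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    paths = sorted(Path(src_dir).glob("*.jpg"))  # assumed file pattern
    with Pool(workers) as pool:
        # Large chunks keep the inter-process overhead low for many small files
        for path, mean in pool.imap_unordered(mean_level, paths, chunksize=256):
            if not is_dark(mean):
                shutil.copy2(path, dst / path.name)
```

On platforms that spawn worker processes (Windows, macOS), the call to copy_bright_frames has to sit under an if __name__ == '__main__': guard.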
You'll probably want to use PIL (the Python Imaging Library).
I did a quick search for code that calculates the average of an image and found this snippet:
Image Average Color
from PIL import Image

def get_average_color(origin, n, image):
    """ Returns a 3-tuple containing the RGB value of the average color of the
    given square bounded area of length = n whose origin (top left corner)
    is (x, y) in the given image"""
    x, y = origin
    r, g, b = 0, 0, 0
    count = 0
    # range(x, x + n) visits exactly n pixels per side, as the docstring states
    for s in range(x, x + n):
        for t in range(y, y + n):
            pixlr, pixlg, pixlb = image[s, t]
            r += pixlr
            g += pixlg
            b += pixlb
            count += 1
    return ((r / count), (g / count), (b / count))

# convert('RGB') guarantees three values per pixel, even for RGBA or greyscale files
image = Image.open('test.png').convert('RGB').load()
r, g, b = get_average_color((24, 290), 50, image)
print(r, g, b)
Maybe you could just iterate through all of the images in your folder and log (or copy) the ones that are above a certain value.
There's probably a more elegant way to do this using PIL but maybe this will get you started.
Hope it helps!
I am new to PsychoPy, having previously worked with Pygame for several months (I switched to enable stimuli to be presented on multiple screens).
I am trying to figure out how to use PsychoPy to display an animation created using a sequence of images. I previously achieved this in Pygame by saving the entire sequence of images in a single large png file (a spritesheet) and then flipping only a fraction of that image (eg. 480 x 480 pixels) per frame, while moving onto the next equally sized section of the image in the next frame. This is roughly what my code looked like in Pygame. I would be really keen to hear if there is an equivalent way of generating animations in PsychoPy by selecting only parts of an image to be displayed with each frame. So far, googling this has not provided any answers!
import time
import pygame

pygame.init()
gameDisplay = pygame.display.set_mode((800, 480))
# Raw string, so the backslashes in the Windows path are not read as escapes
sequence = pygame.image.load(r'C:\Users\...\image_sequence.png')
# This image contains 10 images in a row which I cycle through to get an animation
image_width = 480
image_height = 480
start = time.time()
frame_count = 0
refresh = 0
while time.time() <= start + 15:
    # Draw only the 480x480 section of the sheet that belongs to the current frame
    gameDisplay.blit(sequence, (160, 0), (frame_count*image_width, 0, image_width, image_height))
    if time.time() >= start + (refresh*0.25):  # Flip a new image, say, every 250 msec
        pygame.display.update()
        frame_count += 1
        refresh += 1
        if frame_count == 10:
            frame_count = 0
You could use a square aperture to restrict what's visible and then move the image. So something like this (untested, but could give you some ideas):
from psychopy import visual
win = visual.Window(units='pix') # easiest to use pixels as unit
aperture = visual.Aperture(win, shape='rect', size=(480, 480))
image = visual.ImageStim(win, r'C:\Users\...\image_sequence.png')
# Move through x positions
for x in range(10):
    # Shift the sheet so that frame x sits in the centre of the window;
    # the centre of a 10-frame sheet lies between frames 4 and 5 (still untested)
    image.pos = [(10/2.0 - 0.5 - x)*480, 0]
    image.draw()
    win.flip()
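The horizontal offset needed for each frame can be worked out from the sheet geometry alone. Here is a minimal sketch, assuming that the image position refers to the centre of the sprite sheet, that x increases to the right, and that all frames have the same width. Both function names are hypothetical.

```python
def sheet_x_offset(frame, n_frames, frame_width):
    """x position of the sheet's centre that puts `frame` at x = 0."""
    return ((n_frames - 1) / 2.0 - frame) * frame_width


def frame_centre(frame, n_frames, frame_width, sheet_x):
    """x coordinate of a frame's centre for a sheet centred at sheet_x."""
    return sheet_x + (frame - (n_frames - 1) / 2.0) * frame_width


# Ten 480-pixel frames, as in the question
for k in range(10):
    print(k, sheet_x_offset(k, 10, 480))
```

For the first frame the sheet has to move right by 4.5 frame widths, and for the last frame it has to move left by the same amount.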
If you have the original images, I think that it would be simpler to just display the original images in sequence.
import glob
from psychopy import visual
# Raw string for the Windows path; sorted() so the frames play in file-name order
image_names = sorted(glob.glob(r'C:\Users\...\*.png'))
# Create psychopy objects
win = visual.Window()
image_stims = [visual.ImageStim(win, image) for image in image_names]
# Display images one by one
for image in image_stims:
    image.draw()
    win.flip()
    # add more flips here if you want a lower frame rate
Perhaps it is even fast enough to load them during runtime without dropping frames, which would simplify the code and load memory less:
# Imports, glob, and win here
# Create an ImageStim and update the image each frame
stim = visual.ImageStim(win)
for name in image_names:
    stim.image = name
    stim.draw()
    win.flip()
Actually, given a spritesheet you might be able to do something funky and more efficient using the GratingStim. This loads an image as a texture and then allows you to set the spatial frequency (sf) and phase of that texture. If 1.0/sf (in both dimensions) is less than the width of the stimulus (in both dimensions), only a fraction of the texture will be shown, and the phase determines which fraction that will be. It isn't designed for this purpose (it's usually used to create more than one cycle of texture, not less than one), but I think it will work.
I am currently working on a project to capture and process photos on a Raspberry Pi.
The photos are 6000x4000, about 2 MB, from a Nikon D5200 camera.
Everything is working fine; I have made a proof of concept in Java and want to port it to Python or C, depending on which language is faster on the Raspberry Pi.
Now the problem is that the images need to be cropped and resized, and this takes a very long time on the Raspberry Pi. In Java the whole process of reading the image, cropping and writing the new image takes about 2 minutes.
I have also tried ImageMagick, but from the command line this takes even longer, up to 3 minutes.
With a small Python script I made, this is reduced to 20 seconds, but that is still a bit too long for my project.
I am currently installing OpenCV to check whether it is faster; this process takes around 4 hours, so I thought I could ask a question here in the meantime.
Does anybody have any good ideas or libraries to speed up the process of cropping and resizing the images?
The following is the Python code I used:
from PIL import Image

def crop_image(input_image, output_image, start_x, start_y, width, height):
    """Pass input name image, output name image, x coordinate to start cropping, y coordinate to start cropping, width to crop, height to crop """
    input_img = Image.open(input_image)
    box = (start_x, start_y, start_x + width, start_y + height)
    output_img = input_img.crop(box)
    output_img.save(output_image + ".jpg")

def main():
    crop_image("test.jpg", "output", 1000, 0, 4000, 4000)

if __name__ == '__main__':
    main()
First approach (without sprites)
import pyglet
#from pyglet.gl import *
image = pyglet.resource.image('test.jpg')
texture = image.get_texture()
## -- In case you plan on rendering the image, use the following gl set:
#gl.glTexParameteri(gl.GL_TEXTURE_2D, gl.GL_TEXTURE_MAG_FILTER, gl.GL_NEAREST)
texture.width = 1024
texture.height = 768
region = texture.get_region(256, 192, 771, 576) # get_region returns the cropped region, so keep the result
region.save('wham.png') # <- To save as JPG again, install PIL
Second attempt (with sprites, unfinished)
import pyglet, time
start = time.time() #DEBUG
texture = pyglet.image.load('test.jpg')
print('Loaded image in',time.time()-start,'sec') #DEBUG
sprite = pyglet.sprite.Sprite(texture)
print('Converted to sprite in',time.time()-start,'sec') #DEBUG
print(sprite.width) #DEBUG
# Gives: 6000
sprite.scale = 0.5
print('Rescaled image in',time.time()-start,'sec') #DEBUG
print(sprite.width) #DEBUG
# Gives: 3000
Both solutions take around 3-5 seconds on an extremely slow PC with a poor mechanical disk running Windows XP, with more applications running than I can count, including active virus scans. Note that I can't remember how to save a sprite to disk; you need to access the AbstractImage data container within the sprite to get it out.
You will be heavily limited by your disk/memory-card I/O.
My image was 16 MB at 6000x4000 pixels, and I was surprised that it loaded in as little as 3 seconds.
Have you tried jpegtran? It provides lossless cropping of JPEG files. It should be in the libjpeg-progs package. I suspect that decoding the image to crop it, then re-encoding it, is too much for the SD card to take.
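Since the question's script is in Python, jpegtran could be driven from there with subprocess. This is a rough sketch, assuming jpegtran is on the PATH and supports the -crop WxH+X+Y switch. The two function names are hypothetical. Note that jpegtran moves the top-left corner of the crop to the nearest block boundary, so the result may be a few pixels larger than requested, and it only crops; any resizing would still need a separate step.

```python
import subprocess


def jpegtran_crop_cmd(src, dst, x, y, width, height):
    """Build the jpegtran argument list for a lossless crop."""
    geometry = "{}x{}+{}+{}".format(width, height, x, y)
    # -copy none drops metadata such as EXIF; use -copy all to keep it
    return ["jpegtran", "-crop", geometry, "-copy", "none", "-outfile", dst, src]


def crop_lossless(src, dst, x, y, width, height):
    """Run jpegtran; raises CalledProcessError if the command fails."""
    subprocess.run(jpegtran_crop_cmd(src, dst, x, y, width, height), check=True)
```

With the values from the question this would be called as crop_lossless("test.jpg", "output.jpg", 1000, 0, 4000, 4000).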
I am working on a stream generator for my video mapping set, but I am not able to get the image steady.
I open a v4l2loopback device with python-v4l2 and generate a video stream through it based on PNG images, so I can generate live videos in my VJ set and still video-map them and apply effects.
Test case:
1) load v4l2loopback module
2) run python:
import fcntl, numpy
from v4l2 import *
from PIL import Image
height = 600
width = 634
device = open('/dev/video4', 'wr')
print(device)
capability = v4l2_capability()
print(fcntl.ioctl(device, VIDIOC_QUERYCAP, capability))
print("v4l2 driver: " + capability.driver)
format = v4l2_format()
format.type = V4L2_BUF_TYPE_VIDEO_OUTPUT
format.fmt.pix.pixelformat = V4L2_PIX_FMT_RGB32
format.fmt.pix.width = width
format.fmt.pix.height = height
format.fmt.pix.field = V4L2_FIELD_NONE
format.fmt.pix.bytesperline = format.fmt.pix.width * 4
format.fmt.pix.sizeimage = format.fmt.pix.width * format.fmt.pix.height * 4
format.fmt.pix.colorspace = V4L2_COLORSPACE_SRGB
print(fcntl.ioctl(device, VIDIOC_S_FMT, format))
img = Image.open('img/0.png')
img = img.convert('RGBA')
while True:
    device.write(numpy.array(img))
3) run Cheese or other v4l2 stream viewer.
The result is a properly coloured and sized image, but it jumps from left to right every frame, each time a little further to the left, so you get a sliding and jumpy video.
What am I doing wrong?
Best regards,
Harriebo
PS: If you would like to see the results, check: link. So far the LiVES, Pure Data, Gem video mapping setup is working great with the v4l2 streams.
So I got it sort of working, but I am not sure it's the right way. What I need to do for a stable video stream:
1) Don't use custom resolutions; they get messy.
2) Send every frame twice. I think this has to do with interlacing (top/bottom fields).
3) For 640x480, shift all pixels 260 places to the left in the array, otherwise the image is not straight. This is not needed for 1024x768; I am not sure why.
4) Play it at a slightly lower frame rate than the program can generate.
After all that it is 99% stable; every 10 seconds or so there is one buggy frame. I think this is because the frame rate the program generates is not 100% stable.
Suggestions on why, or on how I can do this better, are still welcome.
For updates see: https://github.com/umlaeute/v4l2loopback/issues/32
I have a small problem using the video creation capability of OpenCV.
For the same images, I get a weird output depending on the output size I want.
Here is an example of the results I can get.
http://www.youtube.com/watch?v=1wm8VjyfdyA&feature=youtu.be
I tried with several different sets of images, and on different computers.
It seems to run fine on Windows, but I have problems with the OpenCV that ships in the Ubuntu packages (currently 2.3.1-7).
As the problem is not reproducible on my Windows machine, I guess it was either fixed in 2.4 or is specific to Linux.
Here is some (Python) test code that highlights the problem:
import os
import cv
in_dir = "../data/inputs/sample-test"
out = "output.avi"
# loading images, create Guys and store it into guys
frameSize = (652, 498)
#frameSize = (453, 325)
fourcc = cv.CV_FOURCC('F', 'M', 'P', '4')
my_video = cv.CreateVideoWriter(out,
                                fourcc,
                                15,
                                frameSize,
                                1)
for root, _, files in os.walk(in_dir):
    for a_file in files:
        guy_source = os.path.join(in_dir, a_file)
        print guy_source
        image = cv.LoadImage(guy_source)
        small_im = cv.CreateImage(frameSize,
                                  image.depth,
                                  image.nChannels)
        cv.Resize(image, small_im, cv.CV_INTER_LINEAR)
        cv.WriteFrame(my_video, small_im)
print "Finished !"
My concern is that the video is only fine for some output sizes ((652, 498) is OK, for example).
The behaviour is the same whatever codec I use.
If not a fix, I'd like some more information about the reason for this bug.
As I want to ship for Ubuntu, I'd better use their packaging system and stay on 2.3 for some time.
So I would like to know how I can work around the problem sensibly, by choosing suitable sizes.
Any information is welcome
Thx !
This is a common problem in video coding. As you can see, the image is shifted a small amount to the left on each row.
As you may know, the image is stored as a long row of chars: BGRBGRBGR....
It is also defined by its width and height, and by its step: the distance, in bytes, between two consecutive rows. A naive assumption is that the step is 3(channels)*width. But in addition, for memory alignment reasons, the image rows are padded with some extra bytes, in order to make the step value a multiple of 4 (usually) or 16. The reason is that hardware codec acceleration works with aligned data: 32-bit architectures read 32 bits at once, and for SIMD processing, aligned data is loaded faster.
So the image will be represented as
BGRBGR00
BGRBGR00
Now, if a codec does not know of this padding, it will read the width of the image as 2, and will interpret the data as follows:
BGRBGR
00BGRB
0000BG // note the extra padding
To make sure you do not experience this issue, you should select image width in such a way that the step value (channels*width) is a multiple of four. All of the standard resolutions have this property, and this is one of the reasons they were selected so:
640x480
1024x768
etc
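The rule above is easy to check for any candidate width. This is a small sketch in plain Python, assuming 3 channels of one byte each and 4-byte row alignment as described. The function names are made up for illustration.

```python
def row_step(width, channels=3, align=4):
    """Bytes per row once each row is padded up to a multiple of `align`."""
    raw = width * channels
    return (raw + align - 1) // align * align


def has_padding(width, channels=3, align=4):
    """True if rows of this width carry padding bytes."""
    return row_step(width, channels, align) != width * channels


def nearest_safe_width(width, channels=3, align=4):
    """Largest width <= `width` whose rows need no padding."""
    while has_padding(width, channels, align):
        width -= 1
    return width


# The two widths from the question plus two standard ones
for w in (652, 453, 640, 1024):
    print(w, row_step(w), has_padding(w))
```

A width of 652 needs no padding (652 * 3 = 1956 is a multiple of 4), which is consistent with that size being reported as fine, whereas 453 gives 1359 bytes per row and is padded to 1360.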