As far as I understand, ffmpeg-python is the main Python package for driving ffmpeg directly.
Now I want to take a video and save its frames as separate files at some fps.
There are plenty of command-line ways to do it, e.g. ffmpeg -i video.mp4 -vf fps=1 img/output%06d.png, described here.
But I want to do it in Python. There are also solutions [1] [2] that use Python's subprocess to call the ffmpeg CLI, but that feels dirty to me.
Is there any way to do it using ffmpeg-python?
The following works for me:

import ffmpeg

(
    ffmpeg
    .input(url)
    .filter('fps', fps='1/60')
    .output('thumbs/test-%d.jpg', start_number=0)
    .overwrite_output()
    .run(quiet=True)
)
I'd suggest you try the imageio module and use the following code as a starting point:

import imageio

reader = imageio.get_reader('imageio:cockatoo.mp4')
for frame_number, im in enumerate(reader):
    # im is a numpy array
    if frame_number % 10 == 0:
        imageio.imwrite(f'frame_{frame_number}.jpg', im)
You can also use OpenCV for that.
Reference code:

import cv2

video_capture = cv2.VideoCapture("your_video_path")
video_capture.set(cv2.CAP_PROP_FPS, <your_desired_fps_here>)
saved_frame_name = 0

while video_capture.isOpened():
    frame_is_read, frame = video_capture.read()
    if frame_is_read:
        cv2.imwrite(f"frame{saved_frame_name}.jpg", frame)
        saved_frame_name += 1
    else:
        print("Could not read the frame.")
        break

video_capture.release()
@norus's solution is actually good, but for me it was missing the ss and r parameters in the input. I used a local file instead of a URL.
This is my solution:

ffmpeg.input(<path/to/file>, ss=0, r=1)\
    .filter('fps', fps='1/60')\
    .output('thumbs/test-%d.jpg', start_number=0)\
    .overwrite_output()\
    .run(quiet=True)

ss is the start time in seconds; in the code above it starts at 0.
r is the rate. Because the fps filter is set to 1/60, an r of 1 returns 1 frame per second, an r of 2 a frame every 2 seconds, 0.5 a frame every half second, and so on.
I have 20 images in a folder.
I want to load the first two images and process them, then load the next two images and process them, and so on.
I want to know how to achieve this in Python with OpenCV.
Sequence to follow: load images 1, 2 > process (I will do this bit), then load images 2, 3 > process, then 3, 4 > process, then 4, 5 > process... and so on.
I don't really know if you want to process them 2 by 2 or 2 at the same time, so here is both!
Process 2 by 2 sequentially:

import os
import cv2

folder = '<image_folder>'
files = sorted(os.listdir(folder))
for i in range(0, len(files) - 1, 2):
    # os.listdir returns bare file names, so join them back onto the folder path
    image1 = cv2.imread(os.path.join(folder, files[i]))
    image2 = cv2.imread(os.path.join(folder, files[i + 1]))
    process(image1)
    process(image2)
Process 2 at the same time:
A useful tool is the map function in Python's multiprocessing library. It's actually very simple to use. Example:

from multiprocessing import Pool

p = Pool(2)
for i in range(0, len(files), 2):
    p.map(process, [cv2.imread(files[i]),
                    cv2.imread(files[i + 1])])

The list holds your elements, and you're trying to apply the function process to each of those elements in parallel. p.map will do that for you, no problem!
Good luck!
import glob2
import cv2

images = glob2.glob('imageFolder/*.jpg')
# Pair each image with the next one; note the final pair wraps around
# to the first image.
images = list(zip(images, images[1:] + images[:1]))
for item in images:
    img1 = cv2.imread(item[0])
    img2 = cv2.imread(item[1])
    # process here
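As a side note, zip(images, images[1:] + images[:1]) also pairs the last image back with the first; if strictly consecutive pairs are wanted, dropping the wrap-around tail works. A small sketch with hypothetical file names:

```python
# Hypothetical file names standing in for the glob result
files = ['1.jpg', '2.jpg', '3.jpg', '4.jpg']

# Zip the list against its own one-step offset to get consecutive pairs,
# with no wrap-around pair at the end
pairs = list(zip(files, files[1:]))
print(pairs)
# [('1.jpg', '2.jpg'), ('2.jpg', '3.jpg'), ('3.jpg', '4.jpg')]
```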
On the client side, I am sending a blob audio (wav) file. On the server side, I am trying to convert the blob file to an audio wav file. I did the following:
blob = request.FILES['file']
name = "TEST.wav"
audio = wave.open(name, 'wb')
audio.setnchannels(1)
audio.writeframes(blob.read())
I thought that converting the blob would be similar to converting a blob image to a jpeg file, but was very incorrect in that assumption. That didn't work; I get an error - "Error: sample width not specified." I then used setsampwidth() and tossed in an arbitrary number between 1 and 4 (after looking at the wave.py source file...I don't know why the bytes have to be between 1 and 4). After that another error is thrown - "Error: sampling rate not specified." How do I specify the sampling rate?
What does the setnchannels(), setsampwidth() methods do? Is there an "easy" way I generate the wav file from the blob?
I had never done this before, but in my test the script below worked well for me (although the audio output isn't the same as the original file).
>>> nchannels = 2
>>> sampwidth = 2
>>> framerate = 8000
>>> nframes = 100
>>>
>>> import wave
>>>
>>> name = 'output.wav'
>>> audio = wave.open(name, 'wb')
>>> audio.setnchannels(nchannels)
>>> audio.setsampwidth(sampwidth)
>>> audio.setframerate(framerate)
>>> audio.setnframes(nframes)
>>>
>>> blob = open("original.wav", "rb").read()  # such as `blob.read()`; 'rb' is required on Python 3
>>> audio.writeframes(blob)
>>>
I found this method at https://stackoverflow.com/a/3637480/6396981
Finally, by changing the values of nchannels and sampwidth to 1, I got audio the same as the original file:

nchannels = 1
sampwidth = 1
framerate = 8000
nframes = 1

Tested under Python 2. On Python 3, opening the original file in text mode raises UnicodeDecodeError: 'utf-8' codec can't decode byte 0x95 in position 4: invalid start byte, which is why the file has to be opened with 'rb'.
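If the incoming blob is already a complete WAV file (header included), there is no need to guess the parameters at all: the wave module can read them straight from the blob through an in-memory buffer and copy them to the output. A sketch (the function name is mine):

```python
import io
import wave

def save_blob_as_wav(blob_bytes, out_path):
    # wave can read from any file-like object, so wrap the raw blob
    # bytes in a BytesIO and let it parse the WAV header itself.
    src = wave.open(io.BytesIO(blob_bytes), 'rb')
    dst = wave.open(out_path, 'wb')
    # Copy nchannels, sampwidth, framerate, nframes, etc. verbatim.
    dst.setparams(src.getparams())
    dst.writeframes(src.readframes(src.getnframes()))
    src.close()
    dst.close()
```

With this, saving the upload is just save_blob_as_wav(request.FILES['file'].read(), "TEST.wav").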
I encountered the same problem: my output was lower-pitched than the original. I managed to reverse-engineer the original audio to get nframes, the sample rate, and sampwidth using getnframes(), getframerate(), and getsampwidth() respectively. In the end I tweaked the sample frequency/frame rate to bring back the right tone.
The tweaking became perfect at a certain frequency offset from the original; mine worked fine with an offset of one sixteenth of the original sample rate.
i.e.
OffsetFrequency = OriginalFrequency/16
Frequency = OriginalFrequency + OffsetFrequency
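Expressed as code, the offset rule above works out as follows (assuming an original rate of 8000 Hz, as in the earlier answer):

```python
original_rate = 8000                    # assumed sample rate of the source file
offset = original_rate / 16             # one sixteenth of the original rate
adjusted_rate = original_rate + offset  # rate to pass to setframerate()
print(adjusted_rate)                    # 8500.0
```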
I have a series of images in *.tif format that I want to use to create a video. I am using OpenCV 3.1.0 in Python 2.7. Below is a snippet of my code:

import glob
import cv2
import numpy as np

nIMAGES = 10
files = glob.glob(DIR + '\\' + tpname + '\\*.tif')
image_stack = np.empty((500, 220, nIMAGES))

mov = DIR + '\\' + tpname + '\\' + tpname + '_mov.avi'
MOV = cv2.VideoWriter(filename=mov, fourcc=cv2.VideoWriter_fourcc('F', 'M', 'P', '4'), fps=2, frameSize=(220, 500))  # frame size is 220 x 500

for i in np.arange(0, nIMAGES):
    print 'Working on: ' + files[i][-14:-4]
    image = cv2.imread(files[i], 0)
    crop_image = image[50:550, 252:472]  # crop y:h, x:w
    # now let's create the movie:
    crop_image = cv2.applyColorMap(crop_image, cv2.COLORMAP_JET)
    MOV.write(crop_image)

MOV.release()
When I run this code, I create an AVI file that is 0 Kb (it hasn't saved anything to it).
I believe I am missing something like frame = cv2.VideoCapture.read(crop_image) in which case I would convert the write line to MOV.write(frame). However, I get an AttributeError in that the VideoCapture.read is not an attribute.
I am using this OpenCV webpage as my guide: http://docs.opencv.org/2.4/modules/highgui/doc/reading_and_writing_images_and_video.html
I had to make two changes to get this to work:
Following the advice on this question: OpenCV 2.4 VideoCapture not working on Windows, I had to copy the ffmpeg DLL files to my Python directory (C:\Anaconda). I also had to rename them to include the version of my OpenCV (3.1.0), e.g. opencv_ffmpeg310_64.dll.
The other change I needed to make was to switch my codec to MJPG.
Much appreciation goes to @Micka for helping me very quickly.
I have code using pytesseract that works perfectly, except when the image I try to recognize is a single digit from 0 to 9: if the image contains only one digit, it doesn't give any result.
This is a sample of the images I'm working with:
https://drive.google.com/folderview?id=0B68PDhV5SW8BdFdWYVRwODBVZk0&usp=sharing
And this is the code I'm using:

import pytesseract
from PIL import Image

varnum = pytesseract.image_to_string(Image.open('images/table/img.jpg'))
varnum = float(varnum)
print varnum

Thanks!!!!
With this code I'm able to read all the numbers:

import time
import pytesseract
from PIL import Image

start_time = time.clock()
y = pytesseract.image_to_string(Image.open('images/table/1.jpg'), config='-psm 10')
x = pytesseract.image_to_string(Image.open('images/table/1.jpg'), config='-psm 10')
print y
print x
y = pytesseract.image_to_string(Image.open('images/table/68.5.jpg'), config='-psm 10')
x = pytesseract.image_to_string(Image.open('images/table/68.5.jpg'), config='-psm 10')
print y
print x
print time.clock() - start_time, "seconds"
result
>>>
1
1
68.5
68.5
0.485644155358 seconds
>>>
You would need to set the Page Segmentation mode to be able to read single character/digits.
From the tesseract-ocr manual (which is what pytesseract internally uses), you can set the page segmentation mode using -
-psm N
Set Tesseract to only run a subset of layout analysis and assume a
certain form of image. The options for N are:
10 = Treat the image as a single character.
So you should set the -psm option to 10. Example:

varnum = pytesseract.image_to_string(Image.open('images/table/img.jpg'), config='-psm 10')
Is there a way to read in a BMP file in Python that does not involve using PIL? PIL doesn't work with Python 3, which is the version I have. I tried to use the Image object from graphics.py, Image(anchorPoint, filename), but that only seems to work with GIF files.
In Python it can simply be read as:

import os
from scipy import misc

path = 'your_file_path'
image = misc.imread(os.path.join(path, 'image.bmp'), flatten=0)
# flatten=0 if the image is required as it is
# flatten=1 to flatten the color layers into a single gray-scale layer
I realize that this is an old question, but I found it when solving this problem myself and I figured that this might help someone else in the future.
It's actually pretty easy to read a BMP file as binary data, depending on how broad the support and how many corner cases you need to handle, of course.
Below is a simple parser that ONLY works for 1920x1080 24-bit BMPs (like ones saved from MS Paint). It should be easy to extend, though. It spits out the pixel values as a Python list like (255, 0, 0, 255, 0, 0, ...) for a red image, as an example.
If you need more robust support, there's information on how to properly read the header in answers to this question: How to read bmp file header in python?. Using that information you should be able to extend the simple parser below with any features you need.
There's also more information on the BMP file format over at Wikipedia, https://en.wikipedia.org/wiki/BMP_file_format, if you need it.
def read_rows(path):
    image_file = open(path, "rb")
    # Blindly skip the BMP header.
    image_file.seek(54)

    # We need to read pixels in as rows to later swap the order
    # since BMP stores pixels starting at the bottom left.
    rows = []
    row = []
    pixel_index = 0

    while True:
        if pixel_index == 1920:
            pixel_index = 0
            rows.insert(0, row)
            if len(row) != 1920 * 3:
                raise Exception("Row length is not 1920*3 but " + str(len(row)) + " / 3.0 = " + str(len(row) / 3.0))
            row = []
        pixel_index += 1

        r_string = image_file.read(1)
        g_string = image_file.read(1)
        b_string = image_file.read(1)

        if len(r_string) == 0:
            # This is expected to happen when we've read everything.
            if len(rows) != 1080:
                print("Warning!!! Read to the end of the file at the correct sub-pixel (red) but we've not read 1080 rows!")
            break

        if len(g_string) == 0:
            print("Warning!!! Got 0 length string for green. Breaking.")
            break

        if len(b_string) == 0:
            print("Warning!!! Got 0 length string for blue. Breaking.")
            break

        r = ord(r_string)
        g = ord(g_string)
        b = ord(b_string)

        row.append(b)
        row.append(g)
        row.append(r)

    image_file.close()
    return rows
def repack_sub_pixels(rows):
    print("Repacking pixels...")
    sub_pixels = []
    for row in rows:
        for sub_pixel in row:
            sub_pixels.append(sub_pixel)

    diff = len(sub_pixels) - 1920 * 1080 * 3
    print("Packed", len(sub_pixels), "sub-pixels.")
    if diff != 0:
        print("Error! Number of sub-pixels packed does not match 1920*1080: (" + str(len(sub_pixels)) + " - 1920 * 1080 * 3 = " + str(diff) + ").")

    return sub_pixels

rows = read_rows("my image.bmp")
# This list is raw sub-pixel values. A red image is for example (255, 0, 0, 255, 0, 0, ...).
sub_pixels = repack_sub_pixels(rows)
Use Pillow for this. After you've installed it, simply import it:

from PIL import Image

Then you can load the BMP file:

img = Image.open(r'path_to_file\file.bmp')

(Note the raw string: without the r prefix, backslashes like \f in the path are treated as escape sequences.) If you need the image as a numpy array, use np.array:

img = np.array(Image.open(r'path_to_file\file.bmp'))

The array already has the image's natural shape: for an RGB image it is (height, width, 3), e.g. (512, 512, 3), so no reshape() is needed.
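A quick check of the resulting shape, using a tiny in-memory image as a stand-in for a real BMP file:

```python
import numpy as np
from PIL import Image

# Build a small solid-red RGB image (width 4, height 3) in memory
img = Image.new('RGB', (4, 3), color=(255, 0, 0))

# Converting a PIL image to a numpy array yields (height, width, channels)
arr = np.array(img)
print(arr.shape)  # (3, 4, 3): rows (height), columns (width), channels
```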
I had to work on a project where I needed to read a BMP file using Python; it was quite interesting. The best way is actually to review the BMP file format (https://en.wikipedia.org/wiki/BMP_file_format) and then read it as a binary file to extract the data.
You will need to use the struct Python library to perform the extraction.
You can use this tutorial to see how it proceeds: https://youtu.be/0Kwqdkhgbfw
Use the excellent matplotlib library
import matplotlib.pyplot as plt
im = plt.imread('image.bmp')
It depends what you are trying to achieve and on which platform?
Anyway, using a C library to load BMP may work, e.g. http://code.google.com/p/libbmp/ or http://freeimage.sourceforge.net/, and C libraries can easily be called from Python, e.g. using ctypes or by wrapping the library as a Python module.
Or you can compile this version of PIL: https://github.com/sloonz/pil-py3k
If you're doing this in Windows, this site, should allow you to get PIL (and many other popular packages) up and running with most versions of Python: Unofficial Windows Binaries for Python Extension Packages
The common port of PIL to Python 3.x is called "Pillow".
I would also suggest the pygame library for simple tasks. It is a library full of features for creating games, and reading from some common image formats is among them. It works with Python 3.x as well.