OpenCV read video files with multiple streams/tracks - python

I have a video file that contains multiple streams as shown below using VLC media player:
[Screenshot: VLC media information dialog listing the file's multiple streams]
When I try to read it using Python + OpenCV using the following code:
import cv2

vidObj = cv2.VideoCapture("video.avi")
ret, frame = vidObj.read()
I can only read the first track of the video. How can I read all the video tracks at the same time?

As far as I can tell, OpenCV does not let you choose a video stream, so this is not possible directly. However, you can do it rather easily with the ffmpeg command-line utilities:
import numpy as np
import json
import subprocess

def videoInfo(filename):
    proc = subprocess.run([
        *"ffprobe -v quiet -print_format json -show_format -show_streams".split(),
        filename
    ], capture_output=True)
    proc.check_returncode()
    return json.loads(proc.stdout)
def readVideo(filename):
    cmd = ["ffmpeg", "-i", filename]
    streams = 0
    for stream in videoInfo(filename)["streams"]:
        index = stream["index"]
        if stream["codec_type"] == "video":
            width = stream["width"]
            height = stream["height"]
            cmd += "-map", f"0:{index}"
            streams += 1
    cmd += "-f", "rawvideo", "-pix_fmt", "rgb24", "-"
    shape = np.array([streams, height, width, 3])
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        while True:
            data = proc.stdout.read(shape.prod())  # One byte per each element
            if not data:
                return
            yield np.frombuffer(data, dtype=np.uint8).reshape(shape)
Note that the code reads all the video streams and assumes that each has the same resolution. It lacks proper error handling, but it got the job done in my scientific project.
For example, reading a stereoscopic stream:
import matplotlib.pyplot as plt

for left, right in readVideo("testvideo.mkv"):
    plt.imshow(left)
    plt.show()
    plt.imshow(right)
    plt.show()

Related

How to convert mp3 to wav and calculate PSNR in Python

I want to convert my mp3 to a wav file so I can feed it into the PSNR calculation, but when I try to convert it, I can't. Can someone check my code? I was using this code to convert, and the PSNR code from here:
import numpy
import scipy.io.wavfile as wavfile
import os.path
from scipy import stats
from subprocess import call

if (not os.path.isfile(file)):
    wavConvertCommand = \
        [file,
         "-i", fileMp3, "-acodec", "pcm_u8", "-ar", "22050", fileWav]
    call(wavConvertCommand)

def snr(file):
    if (os.path.isfile(file)):
        data = wavfile.read(file)[1]
        singleChannel = data
        try:
            singleChannel = numpy.sum(data, axis=1)
        except:
            # was mono after all
            pass
        norm = singleChannel / (max(numpy.amax(singleChannel), -1 * numpy.amin(singleChannel)))
        return stats.signaltonoise(norm)
My file location:
F:\KULIAH\SEMESTER8\SKRIPSI\MusicLockApp\media\mp3\ytmp3free.cc_bebe-rexha-meant-to-be-lyrics-ft-florida-georgia-line-youtubemp3free.org_1_2uxNfcC.mp3
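For what it's worth, the likely bug is the first element of wavConvertCommand: it holds the output path (file) where the ffmpeg executable itself presumably belongs. A minimal sketch of the corrected call, assuming ffmpeg is on PATH and reusing the question's fileMp3/fileWav names:

import subprocess

# the command list must start with the program to run
wavConvertCommand = ["ffmpeg", "-i", fileMp3, "-acodec", "pcm_u8", "-ar", "22050", fileWav]
subprocess.call(wavConvertCommand)

Note also that scipy.stats.signaltonoise was deprecated and later removed from SciPy, so the snr function above only runs on old SciPy versions.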

Is there a way to check if script is running from subprocess?

Let's say I have a Python script which reads all the images in a folder and resizes them. The script works all by itself; it takes two arguments - the input folder and an output folder.
To give visual feedback on the progress, I'm using a progress bar which is printed out to the console/terminal.
resize.py:
import argparse
import fnmatch
import os

import PIL
from PIL import Image
from progress.bar import Bar

parser = argparse.ArgumentParser(description='Photo resizer.')
parser.add_argument('input_folder', nargs='?', default='', help="Input folder")
parser.add_argument('export_folder', nargs='?', default='', help="Output folder")
args = parser.parse_args()

if args.input_folder:
    input_folder = args.input_folder
if args.export_folder:
    export_resized_folder = args.export_folder

NEW_SIZE = 2000

inputFiles = []
for root, dirnames, filenames in os.walk(input_folder):
    for filename in fnmatch.filter(filenames, '*.jpg'):
        inputFiles.append(os.path.join(root, filename))

bar = Bar("Processing photos", max=len(inputFiles), check_tty=False)

for photo in inputFiles:
    filename = os.path.basename(photo)
    im = Image.open(photo)
    im_width, im_height = im.size
    if im_width > im_height:
        new_width = NEW_SIZE
        new_height = int(NEW_SIZE * im_height / im_width)
    else:
        new_height = NEW_SIZE
        new_width = int(NEW_SIZE * im_width / im_height)
    new_size = (new_width, new_height)
    im_resized = im.resize(new_size, resample=PIL.Image.Resampling.LANCZOS)
    im_resized.save(os.path.join(export_resized_folder, filename), quality=70)
    bar.next()

bar.finish()
Now I have another script (main_gui.py) which does some batch processing, and one of its jobs is to resize the images. This script provides a simple GUI. When it comes to resizing the images, I use subprocess.Popen to execute the script and pass in the input and output folders as args.
So in main_gui.py I start the subprocess:
script_path = "resize.py"
process = subprocess.Popen(["python", script_path, INPUT_FOLDER, OUTPUT_FOLDER], universal_newlines=True, stdout=subprocess.PIPE)
Now I'd like to see the progress in the GUI as well. I don't know if I'm doing it correctly (most likely not; this is just the first thing that came to my mind)...
So in resize.py along with the progressbar I print out information about my progress and then read it in the main_gui.py and based on that information I update a tkinter progressbar.
In resize.py:
bar = Bar("Processing photos", max=len(inputFiles), check_tty=False)
print("**TOTAL** " + str(len(inputFiles)))
...
progressCounter = 1
for photo in inputFiles:
    ...
    bar.next()
    print("**PROGRESS** " + str(progressCounter))
    progressCounter += 1
    ...
I read these values in main_gui.py
process = subprocess.Popen(["python", script_path], universal_newlines=True, stdout=subprocess.PIPE)
while process.poll() is None:
    data = process.stdout.readline().strip()
    print(data)
    if "**TOTAL** " in data:
        total = int(data.replace("**TOTAL** ", ""))
        progressbarWidget['maximum'] = total
    if "**PROGRESS** " in data and self.GUI:
        progressCounter = int(data.replace("**PROGRESS** ", ""))
        progressbarWidget['value'] = progressCounter
        progressbarWidget.update_idletasks()
And at this point I'd like resize.py to check whether it is run by itself or by the subprocess, so I don't have the unnecessary print statements.
I tried passing in an env value as Charles suggested in the comments, but couldn't get it done.
Trying to detect your parent process is an unnecessary amount of magic for this use case. Making it explicit with an optional argument will let others writing their own GUIs (potentially in non-Python languages) get the machine-readable status output without needing to try to fool the detection.
parser = argparse.ArgumentParser(description='Photo resizer.')
parser.add_argument('--progress', choices=('none', 'human', 'machine-readable'), default='none',
                    help="Should a progress bar be written to stderr in a human-readable form, "
                         "to stdout in a machine-readable form, or not at all?")
parser.add_argument('input_folder', nargs='?', default='', help="Input folder")
parser.add_argument('export_folder', nargs='?', default='', help="Output folder")
args = parser.parse_args()
...and then later...
if args.progress == 'machine-readable':
    pass  # TODO: Write your progress messages for the programmatic consumer to stdout here
elif args.progress == 'human':
    pass  # TODO: Write your progress bar for a human reader to stderr here
while on the GUI side, adding --progress=machine-readable to the argument list (per the definitions above, the machine-readable form is the one written to the stdout pipe the GUI reads):

process = subprocess.Popen([sys.executable, script_path, '--progress=machine-readable'],
                           universal_newlines=True, stdout=subprocess.PIPE)
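A minimal sketch of how those TODOs might be filled in on the resize.py side, reusing the **TOTAL**/**PROGRESS** markers from the question (the marker format is the question's own; the rest is an assumption, and flush=True matters so each line reaches the pipe promptly):

machine_readable = args.progress == 'machine-readable'
human = args.progress == 'human'

if machine_readable:
    print("**TOTAL** %d" % len(inputFiles), flush=True)  # stdout, for the GUI

# progress.bar.Bar writes to stderr by default, matching the 'human' contract above
bar = Bar("Processing photos", max=len(inputFiles), check_tty=False) if human else None

for progressCounter, photo in enumerate(inputFiles, start=1):
    # ... resize the photo exactly as in the original loop ...
    if human:
        bar.next()
    elif machine_readable:
        print("**PROGRESS** %d" % progressCounter, flush=True)

if human:
    bar.finish()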

Is it possible to use FFmpeg to cut "random" sections from a folder of videos and concat them into 1 video?

I realize this sounds like an easy question, and one that has been answered before. However, I cannot seem to find a script which can read a folder of videos with varying lengths, copy a random segment from each video, and concat them into a single video.
An example:
I have a folder with 150 videos labeled Fashion-Setlist-01.mp4, Fashion-Setlist-02.mp4, etc.
Each is over an hour long. I would like to pull a random 10-second section from each video and then join them together into a single video. This may seem easy with only a few videos, but the plan is to read from potentially hundreds of videos. It should be possible to pull multiple sections from each video as well. I suppose we could run the script twice for more segments if the video needed to be longer.
moviepy is the most appropriate tool for this (it uses ffmpeg as a backend). Concatenating videos is trivial in moviepy:
import moviepy.editor
import os
import random
import fnmatch

directory = '/directory/to/videos/'
xdim = 854
ydim = 480
ext = "*mp4"
length = 10

outputs = []

# compile list of videos
inputs = [os.path.join(directory, f) for f in os.listdir(directory)
          if os.path.isfile(os.path.join(directory, f)) and fnmatch.fnmatch(f, ext)]

for i in inputs:
    # import to moviepy
    clip = moviepy.editor.VideoFileClip(i).resize((xdim, ydim))
    # select a random time point
    start = round(random.uniform(0, clip.duration - length), 2)
    # cut a subclip
    out_clip = clip.subclip(start, start + length)
    outputs.append(out_clip)

# combine clips from different videos
collage = moviepy.editor.concatenate_videoclips(outputs)
collage.write_videofile('out.mp4')
EDIT: Cutting video with ffmpeg needs to be done on a key-frame, so I extensively edited this code to first find the key-frames and then cut around them. It works for me.
So to do this in bash, assuming there exists some program randtime.py which outputs a random starting time in 'H:MM:SS' format, and some other program which finds the video keyframe at or after a given time, here's a quick hack version:
#! /usr/bin/env bash

CUTLIST=cut_list.txt
RANDTIME=~/Bin/randtime.py
KEYFRAMER=~/Bin/find_next_key_frame.py

count=0
echo "" > "$CUTLIST"

for file in *.mp4
do
    count=$(( $count + 1 ));
    outfile="cut_$count.mp4"
    start_time=`python "$RANDTIME"`
    # Find the next keyframe, at or after the random time
    start_keyframe_time=`$KEYFRAMER "$file" "$start_time"`
    if [ $? -eq 0 ]; then
        echo "Nearest keyframe to \"$start_time\" is \"$start_keyframe_time\""
        echo "ffmpeg -loglevel quiet -y -i \"$file\" -ss $start_keyframe_time -t 00:00:10 \"$outfile\""
        ffmpeg -loglevel quiet -y -i "$file" -ss $start_keyframe_time -t 00:00:10 "$outfile"
        if [ $? -ne 0 ]; then
            echo "ffmpeg returned an error on [$file], aborting"
            # exit 1
        fi
        echo "file '$outfile'" >> "$CUTLIST"
    else
        echo "ffprobe found no suitable key-frame near \"$start_time\""
    fi
done

echo "Concatenating ... "
cat "$CUTLIST"
ffmpeg -f concat -i cut_list.txt -c copy all_cuts.mp4

if [ -f "$CUTLIST" ]; then
    rm "$CUTLIST"
fi
And the random time, in python:
#! /usr/bin/env python3
import random
#TODO: ensure we're at least 8 seconds before 1 hour
hrs = 0 # random.randint(0,1)
mns = random.randint(0,59)
scs = random.randint(0,59)
print("%d:%02d:%02d" % (hrs,mns,scs))
And again in Python, find the keyframe exactly on, or just after, the given time.
#! /usr/bin/env python3
import sys
import subprocess
import os
import os.path

FFPROBE = '/usr/bin/ffprobe'

EXE = sys.argv[0].replace('\\', '/').split('/')[-1]

if not os.path.isfile(FFPROBE):
    sys.stderr.write("%s: The \"ffprobe\" part of FFMPEG seems to be missing\n" % (EXE))
    sys.exit(1)

if len(sys.argv) == 1 or (len(sys.argv) == 2 and sys.argv[1] in ('--help', '-h', '-?', '/?')):
    sys.stderr.write("%s: Give video filename and time as arguments\n" % (EXE))
    sys.stderr.write("%s: e.g.: video_file.mp4 0:25:14 \n" % (EXE))
    sys.stderr.write("%s: Outputs the next keyframe at or after the given time\n" % (EXE))
    sys.exit(1)

VIDEO_FILE = sys.argv[1]
FRAME_TIME = sys.argv[2].strip()

if not os.path.isfile(VIDEO_FILE):
    sys.stderr.write("%s: The video file \"%s\" seems to be missing\n" % (EXE, VIDEO_FILE))
    sys.exit(1)

### Launch FFMPEG's ffprobe to identify the frames
command = "ffprobe -show_frames -pretty \"%s\"" % VIDEO_FILE
frame_list = subprocess.getoutput(command)

### The list of frames is a huge bunch of lines like:
### [FRAME]
### media_type=video
### key_frame=0
### best_effort_timestamp=153088
### best_effort_timestamp_time=0:00:09.966667
### pkt_duration_time=0:00:00.033333
### height=360
### ...
### [/FRAME]

### Parse the stats about each frame, keeping only the video keyframes
key_frames = []
for frame in frame_list.split("[FRAME]"):
    # split the frame lines up into separate "x=y" pairs
    frame_dict = {}
    frame_vars = frame.replace('\r', '\n').replace("\n\n", '\n').split('\n')
    for frame_pair in frame_vars:
        if frame_pair.find('=') != -1:
            try:
                var, value = frame_pair.split('=', 1)
                frame_dict[var.strip()] = value.strip()
            except:
                sys.stderr.write("%s: Warning: Unable to parse [%s]\n" % (EXE, frame_pair))
    # Do we want to keep this frame?
    # We want video frames that are key frames
    if ("media_type" in frame_dict and "key_frame" in frame_dict and
            frame_dict["media_type"] == "video" and frame_dict["key_frame"] == "1"):
        key_frames.append(frame_dict)

### Throw away duplicates, and sort (why are there duplicates?)
key_frame_list = set()
for frame_dict in key_frames:
    #print(str(frame_dict))
    if "best_effort_timestamp_time" in frame_dict:
        key_frame_list.add(frame_dict["best_effort_timestamp_time"])
key_frame_list = list(key_frame_list)
key_frame_list.sort()

sys.stderr.write("Keyframes found: %u, from %s -> %s\n" % (len(key_frame_list), key_frame_list[0], key_frame_list[-1]))

### Find the same, or next-larger keyframe
found = False
for frame_time in key_frame_list:
    #sys.stderr.write("COMPARE %s > %s\n" % (frame_time, FRAME_TIME))
    if frame_time > FRAME_TIME:
        print(frame_time)
        found = True
        break  # THERE CAN BE ONLY ONE!

### Failed? Print something possibly useful
if found == False:
    sys.stderr.write("%s: Warning: No keyframe found\n" % (EXE))
    print("0:00:00")
    sys.exit(-1)
else:
    sys.exit(0)  # All's well
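As an aside, parsing the full -show_frames dump is slow on hour-long files; ffprobe can be told to decode only keyframes. A minimal sketch of that alternative, which returns seconds rather than H:MM:SS, so the comparison logic above would need adjusting (the -skip_frame nokey trick is ffprobe's own; the rest of this helper is an assumption):

import subprocess

def keyframe_times(video_file):
    # Ask ffprobe to decode only keyframes (-skip_frame nokey) and print each
    # one's pts_time, one value per line (csv output with the section prefix
    # suppressed). Assumes ffprobe is on PATH.
    out = subprocess.check_output([
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-skip_frame", "nokey",
        "-show_entries", "frame=pts_time",
        "-of", "csv=p=0",
        video_file,
    ], text=True)
    times = []
    for line in out.splitlines():
        line = line.strip().strip(',')  # some builds emit a trailing comma
        if line and line != 'N/A':
            times.append(float(line))
    return sorted(times)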

How to play streaming audio using pyglet?

The goal of this question is to figure out how to play streaming audio using pyglet. The first step is just making sure you're able to play mp3 files at all; that's the purpose of this first snippet:
import sys
import inspect
import requests
import pyglet
from pyglet.media import *

pyglet.lib.load_library('avbin')
pyglet.have_avbin = True

def url_to_filename(url):
    return url.split('/')[-1]

def download_file(url, filename=None):
    filename = filename or url_to_filename(url)
    with open(filename, "wb") as f:
        print("Downloading %s" % filename)
        response = requests.get(url, stream=True)
        total_length = response.headers.get('content-length')
        if total_length is None:
            f.write(response.content)
        else:
            dl = 0
            total_length = int(total_length)
            for data in response.iter_content(chunk_size=4096):
                dl += len(data)
                f.write(data)
                done = int(50 * dl / total_length)
                sys.stdout.write("\r[%s%s]" % ('=' * done, ' ' * (50 - done)))
                sys.stdout.flush()

url = "https://freemusicarchive.org/file/music/ccCommunity/DASK/Abiogenesis/DASK_-_08_-_Protocell.mp3"
filename = "mcve.mp3"
download_file(url, filename)
music = pyglet.media.load(filename)
music.play()
pyglet.app.run()
If you've installed the libraries (pip install pyglet requests) and also installed AVBin, at this point you should be able to listen to the mp3 once it's been downloaded.
Once we've reached this point, I'd like to figure out how to play and buffer the file in a way similar to most existing web video/audio players, using pyglet + requests. This means playing the file without waiting until it has been downloaded completely.
After reading the pyglet media docs, you can see these available classes:

media
    sources
        base
            AudioData
            AudioFormat
            Source
            SourceGroup
            SourceInfo
            StaticSource
            StreamingSource
            VideoFormat
    player
        Player
        PlayerGroup
I've seen there are other similar SO questions, but they haven't been solved properly and their content doesn't provide many relevant details:
Play streaming audio using pyglet
How can I play audio stream without saving it into the file with pyglet?
That's why I've created a new question. How do you play streaming audio using pyglet? Could you provide a little example using the above MCVE as a base?
Assuming you don't want to import a new package to do this for you - this can be done with a bit of effort.
First, let's head over to the Pyglet source code and have a look at media.load in media/__init__.py.
"""Load a Source from a file.
All decoders that are registered for the filename extension are tried.
If none succeed, the exception from the first decoder is raised.
You can also specifically pass a decoder to use.
:Parameters:
`filename` : str
Used to guess the media format, and to load the file if `file` is
unspecified.
`file` : file-like object or None
Source of media data in any supported format.
`streaming` : bool
If `False`, a :class:`StaticSource` will be returned; otherwise
(default) a :class:`~pyglet.media.StreamingSource` is created.
`decoder` : MediaDecoder or None
A specific decoder you wish to use, rather than relying on
automatic detection. If specified, no other decoders are tried.
:rtype: StreamingSource or Source
"""
if decoder:
return decoder.decode(file, filename, streaming)
else:
first_exception = None
for decoder in get_decoders(filename):
try:
loaded_source = decoder.decode(file, filename, streaming)
return loaded_source
except MediaDecodeException as e:
if not first_exception or first_exception.exception_priority < e.exception_priority:
first_exception = e
# TODO: Review this:
# The FFmpeg codec attempts to decode anything, so this codepath won't be reached.
if not first_exception:
raise MediaDecodeException('No decoders are available for this media format.')
raise first_exception
add_default_media_codecs()
The critical line here is loaded_source = decoder.decode(...). Essentially, to load audio Pyglet takes a file and hauls it over to a media decoder (e.g. FFMPEG), which then returns a list of 'frames' or packets that Pyglet can play with a built-in Player class. If the audio format is compressed (e.g. mp3 or aac), Pyglet will use an external library (currently only AVBin is supported) to convert it to raw, decompressed audio. You probably already know some of this.
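For reference, that whole pipeline hides behind a few lines of user-facing API; a minimal sketch of the ordinary file-based flow that the rest of this answer tries to re-plumb (the filename is a placeholder):

import pyglet

# load() picks a decoder by file extension and returns a StreamingSource;
# a Player pulls decoded packets from it and plays them.
source = pyglet.media.load('some_audio.mp3', streaming=True)
player = pyglet.media.Player()
player.queue(source)
player.play()
pyglet.app.run()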
So if we want to see how we can stuff a stream of bytes into Pyglet's audio engine rather than a file, we'll need to take a look at one of the decoders. For this example, let's use FFMPEG as it's the easiest to access.
In media/codecs/ffmpeg.py:
class FFmpegDecoder(object):
    def get_file_extensions(self):
        return ['.mp3', '.ogg']

    def decode(self, file, filename, streaming):
        if streaming:
            return FFmpegSource(filename, file)
        else:
            return StaticSource(FFmpegSource(filename, file))
The 'object' it inherits from is MediaDecoder, found in media/codecs/__init__.py. Back in the load function in media/__init__.py, you'll see that Pyglet chooses a MediaDecoder based on file extension, then calls its decode function with the file as a parameter to get the audio in the form of a packet stream. That packet stream is a Source object; each decoder has its own flavor, in the form of StaticSource or StreamingSource. The former is used to store audio in memory, the latter to play it immediately. FFmpeg's decoder only supports StreamingSource.
We can see that FFMPEG's is FFmpegSource, also located in media/codecs/ffmpeg.py. We find this Goliath of a class:
class FFmpegSource(StreamingSource):
    # Max increase/decrease of original sample size
    SAMPLE_CORRECTION_PERCENT_MAX = 10

    def __init__(self, filename, file=None):
        if file is not None:
            raise NotImplementedError('Loading from file stream is not supported')

        self._file = ffmpeg_open_filename(asbytes_filename(filename))
        if not self._file:
            raise FFmpegException('Could not open "{0}"'.format(filename))

        self._video_stream = None
        self._video_stream_index = None
        self._audio_stream = None
        self._audio_stream_index = None
        self._audio_format = None

        self.img_convert_ctx = POINTER(SwsContext)()
        self.audio_convert_ctx = POINTER(SwrContext)()

        file_info = ffmpeg_file_info(self._file)

        self.info = SourceInfo()
        self.info.title = file_info.title
        self.info.author = file_info.author
        self.info.copyright = file_info.copyright
        self.info.comment = file_info.comment
        self.info.album = file_info.album
        self.info.year = file_info.year
        self.info.track = file_info.track
        self.info.genre = file_info.genre

        # Pick the first video and audio streams found, ignore others.
        for i in range(file_info.n_streams):
            info = ffmpeg_stream_info(self._file, i)

            if isinstance(info, StreamVideoInfo) and self._video_stream is None:
                stream = ffmpeg_open_stream(self._file, i)

                self.video_format = VideoFormat(
                    width=info.width,
                    height=info.height)
                if info.sample_aspect_num != 0:
                    self.video_format.sample_aspect = (
                        float(info.sample_aspect_num) /
                        info.sample_aspect_den)
                self.video_format.frame_rate = (
                    float(info.frame_rate_num) /
                    info.frame_rate_den)
                self._video_stream = stream
                self._video_stream_index = i

            elif (isinstance(info, StreamAudioInfo) and
                  info.sample_bits in (8, 16) and
                  self._audio_stream is None):

                stream = ffmpeg_open_stream(self._file, i)

                self.audio_format = AudioFormat(
                    channels=min(2, info.channels),
                    sample_size=info.sample_bits,
                    sample_rate=info.sample_rate)
                self._audio_stream = stream
                self._audio_stream_index = i

                channel_input = avutil.av_get_default_channel_layout(info.channels)
                channels_out = min(2, info.channels)
                channel_output = avutil.av_get_default_channel_layout(channels_out)

                sample_rate = stream.codec_context.contents.sample_rate
                sample_format = stream.codec_context.contents.sample_fmt
                if sample_format in (AV_SAMPLE_FMT_U8, AV_SAMPLE_FMT_U8P):
                    self.tgt_format = AV_SAMPLE_FMT_U8
                elif sample_format in (AV_SAMPLE_FMT_S16, AV_SAMPLE_FMT_S16P):
                    self.tgt_format = AV_SAMPLE_FMT_S16
                elif sample_format in (AV_SAMPLE_FMT_S32, AV_SAMPLE_FMT_S32P):
                    self.tgt_format = AV_SAMPLE_FMT_S32
                elif sample_format in (AV_SAMPLE_FMT_FLT, AV_SAMPLE_FMT_FLTP):
                    self.tgt_format = AV_SAMPLE_FMT_S16
                else:
                    raise FFmpegException('Audio format not supported.')

                self.audio_convert_ctx = swresample.swr_alloc_set_opts(None,
                                                                       channel_output,
                                                                       self.tgt_format, sample_rate,
                                                                       channel_input, sample_format,
                                                                       sample_rate,
                                                                       0, None)
                if (not self.audio_convert_ctx or
                        swresample.swr_init(self.audio_convert_ctx) < 0):
                    swresample.swr_free(self.audio_convert_ctx)
                    raise FFmpegException('Cannot create sample rate converter.')

        self._packet = ffmpeg_init_packet()
        self._events = []  # They don't seem to be used!

        self.audioq = deque()
        # Make queue big enough to accomodate 1.2 sec?
        self._max_len_audioq = 50  # Need to figure out a correct amount

        if self.audio_format:
            # Buffer 1 sec worth of audio
            self._audio_buffer = \
                (c_uint8 * ffmpeg_get_audio_buffer_size(self.audio_format))()

        self.videoq = deque()
        self._max_len_videoq = 50  # Need to figure out a correct amount

        self.start_time = self._get_start_time()
        self._duration = timestamp_from_ffmpeg(file_info.duration)
        self._duration -= self.start_time

        # Flag to determine if the _fillq method was already scheduled
        self._fillq_scheduled = False
        self._fillq()
        # Don't understand why, but some files show that seeking without
        # reading the first few packets results in a seeking where we lose
        # many packets at the beginning.
        # We only seek back to 0 for media which have a start_time > 0
        if self.start_time > 0:
            self.seek(0.0)
---
[A few hundred lines more...]
---
    def get_next_video_timestamp(self):
        if not self.video_format:
            return
        if self.videoq:
            while True:
                # We skip video packets which are not video frames
                # This happens in mkv files for the first few frames.
                video_packet = self.videoq[0]
                if video_packet.image == 0:
                    self._decode_video_packet(video_packet)
                if video_packet.image is not None:
                    break
                self._get_video_packet()
            ts = video_packet.timestamp
        else:
            ts = None
        if _debug:
            print('Next video timestamp is', ts)
        return ts

    def get_next_video_frame(self, skip_empty_frame=True):
        if not self.video_format:
            return
        while True:
            # We skip video packets which are not video frames
            # This happens in mkv files for the first few frames.
            video_packet = self._get_video_packet()
            if video_packet.image == 0:
                self._decode_video_packet(video_packet)
            if video_packet.image is not None or not skip_empty_frame:
                break
        if _debug:
            print('Returning', video_packet)
        return video_packet.image

    def _get_start_time(self):
        def streams():
            format_context = self._file.context
            for idx in (self._video_stream_index, self._audio_stream_index):
                if idx is None:
                    continue
                stream = format_context.contents.streams[idx].contents
                yield stream

        def start_times(streams):
            yield 0
            for stream in streams:
                start = stream.start_time
                if start == AV_NOPTS_VALUE:
                    yield 0
                start_time = avutil.av_rescale_q(start,
                                                 stream.time_base,
                                                 AV_TIME_BASE_Q)
                start_time = timestamp_from_ffmpeg(start_time)
                yield start_time

        return max(start_times(streams()))

    @property
    def audio_format(self):
        return self._audio_format

    @audio_format.setter
    def audio_format(self, value):
        self._audio_format = value
        if value is None:
            self.audioq.clear()
The line you'll be interested in here is self._file = ffmpeg_open_filename(asbytes_filename(filename)). This brings us here, once again in media/codecs/ffmpeg.py:
def ffmpeg_open_filename(filename):
    """Open the media file.

    :rtype: FFmpegFile
    :return: The structure containing all the information for the media.
    """
    file = FFmpegFile()  # TODO: delete this structure and use directly AVFormatContext
    result = avformat.avformat_open_input(byref(file.context),
                                          filename,
                                          None,
                                          None)
    if result != 0:
        raise FFmpegException('Error opening file ' + filename.decode("utf8"))

    result = avformat.avformat_find_stream_info(file.context, None)
    if result < 0:
        raise FFmpegException('Could not find stream info')

    return file
and this is where things get messy: it calls a ctypes function (avformat_open_input) that, when given a file, grabs its details and fills out all the information it needs for our FFmpegSource class. With some work, you should be able to get avformat_open_input to take a bytes object rather than a path to a file which it will open to get the same information. I'd love to do this and include a working example, but I don't have the time right now. You'd then need to make a new ffmpeg_open_filename function utilizing the new avformat_open_input function, and then a new FFmpegSource class utilizing the new ffmpeg_open_filename function. All you need now is a new FFmpegDecoder class utilizing the new FFmpegSource class.
You could then implement this by adding it to your pyglet package directly. After that, you'd want to add support for a bytes object argument in the load() function (located in media/__init__.py) and override the decoder to your new one. And there, you would now be able to stream audio without saving it.
Or, you could simply use a package that already supports it. Python-vlc does. You could use the example here to play whatever audio you'd like from a link. If you aren't doing this just for a challenge, I would strongly recommend you use another package. Otherwise: good luck.
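For completeness, the python-vlc route is a few lines (a minimal sketch, assuming pip install python-vlc plus an installed VLC; the URL is a placeholder):

import time
import vlc

# MediaPlayer accepts a URL directly and buffers/streams it for you.
player = vlc.MediaPlayer("https://example.com/some_stream.mp3")
player.play()

time.sleep(60)  # keep the script alive while audio plays; a real app would run its own loop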

Pytesser inaccurate

Simple question. When I run this image through pytesser, I get $+s. How can I fix that?
EDIT
So... my code generates images similar to the one linked above, just with different numbers, and is supposed to solve the simple math problem, which is obviously impossible if all I can get out of the picture is $+s.
Here's the code I'm currently using:
import time

from PIL import ImageGrab
from pytesser import *

time.sleep(2)
i = 0
operator = "+"
while i < 100:
    time.sleep(.1)
    img = ImageGrab.grab((349, 197, 349 + 452, 197 + 180))
    equation = image_to_string(img)
Then I'm going to go on to parse equation... as soon as I get pytesser working.
Try my little function. I'm running tesseract from the svn repo, so my results might be more accurate.
I'm on Linux, so on Windows, I'd imagine that you'll have to replace tesseract with tesseract.exe to make it work.
import tempfile, subprocess

def ocr(image):
    tempFile = tempfile.NamedTemporaryFile(delete=False)
    process = subprocess.Popen(['tesseract', image, tempFile.name], stdout=subprocess.PIPE, stdin=subprocess.PIPE, stderr=subprocess.STDOUT)
    process.communicate()
    handle = open(tempFile.name + '.txt', 'r').read()
    return handle
And a sample Python session:
>>> import tempfile, subprocess
>>> def ocr(image):
... tempFile = tempfile.NamedTemporaryFile(delete = False)
... process = subprocess.Popen(['tesseract', image, tempFile.name], stdout = subprocess.PIPE, stdin = subprocess.PIPE, stderr = subprocess.STDOUT)
... process.communicate()
... handle = open(tempFile.name + '.txt', 'r').read()
... return handle
...
>>> print ocr('326_fail.jpg')
0+1
If you're on Linux, gocr is more accurate. You can use it through

os.system("/usr/bin/gocr %s" % sample_image)

and read what it writes to stdout to turn the output into whatever you want (e.g. capturing gocr's result in a specific variable).
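Since os.system only returns the exit status, here's a sketch of capturing gocr's output with subprocess instead (assuming gocr is installed at that path and the image is in a format it accepts):

import subprocess

def gocr(image_path):
    # gocr prints the recognized text to stdout; -i names the input file
    result = subprocess.run(["/usr/bin/gocr", "-i", image_path],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()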
