Merge audio (m4s) segments into one - python

I recently started learning Laravel, and I'm currently watching an online course. Online courses are fine, but I like to have local copies, so I'm trying to download/merge segmented audio from the Laracasts: Laravel 8 From Scratch series.
I've written some scripts (in Python) that do the following:
Download the master.json
Read master.json and download audio segments
Merge the segments into a single file (the file is not playable yet)
Process the audio file via ffmpeg (now it's playable, but has issues)
I think there's a problem with step 3 and/or step 4.
In step/script 3, I create a new file and append the contents of the segments to it in binary mode.
Then (step/script 4), I run an ffmpeg command from Python: ffmpeg -i merged-file.mp4 -c copy processed-file.mp4
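Roughly, scripts 3 and 4 boil down to the following sketch (the segment filenames here are hypothetical placeholders for the ones listed in master.json):

import subprocess

# script 3: concatenate the downloaded segments in binary mode
segment_files = ["audio-seg-0.m4s", "audio-seg-1.m4s", "audio-seg-2.m4s"]  # placeholder names
with open("merged-file.mp4", "wb") as merged:
    for name in segment_files:
        with open(name, "rb") as segment:
            merged.write(segment.read())

# script 4: remux the merged file with ffmpeg (stream copy, no re-encode)
subprocess.run(["ffmpeg", "-i", "merged-file.mp4", "-c", "copy", "processed-file.mp4"], check=True)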
However, the final file doesn't work/play as expected. There's a delay in the beginning, and some parts seem to be cut off/skipped.
There are three possibilities:
Segment files are problematic (not likely?)
I'm doing the merging wrong
I'm doing the ffmpeg processing wrong
Can someone guide me here?
The issues/colored parts in the ffmpeg output are:
...
[mov,mp4,m4a,3gp,3g2,mj2 @ 000001cfbc0de780] could not find corresponding track id 2
[mov,mp4,m4a,3gp,3g2,mj2 @ 000001cfbc0de780] could not find corresponding trex (id 2)
...
[aac @ 000001cfbc0f0380] Number of bands (31) exceeds limit (6).
...
[mp4 @ 000001cfbc20ecc0] track 0: codec frame size is not set
...
[mp4 @ 000001cfbc20ecc0] Non-monotonous DTS in output stream 0:0; previous: 318318, current: 286286; changing to 318319. This may result in incorrect timestamps in the output file.
...
Everything required for a test case is located on GitHub (akinuri/dump/m4s-segments/). Screenshot of the contents:
Note: there are two types/formats of audio in the master.json: mp42 and dash. dash works as expected, and seems to be used in only a limited number of videos/courses. On the other hand, mp42 appears more often. So I need a way to make mp42 work.

Related

MoviePy - getting progress bar values

I am running a Python script which converts a video file to an audio clip using moviepy.
from moviepy.editor import VideoFileClip

def convert(mp3_file, mp4_file):
    videoclip = VideoFileClip(mp4_file)
    audioclip = videoclip.audio
    audioclip.write_audiofile(mp3_file)
    audioclip.close()
    videoclip.close()
I found out that MoviePy uses a library called Proglog to print a command-line progress bar.
How do I get these process completion percentage values?
WARNING: Bruteforce ahead!
It is possible; however, MoviePy has no official method to do this.
We can get the progress from inside the source code of the module;
we need to edit the module code and follow the steps below. [This is for Ubuntu Linux.]
Step 1 -
go to ffmpeg_writer.py [usually in /home/anipr/.local/lib/python3.10/site-packages/moviepy/video/io/ffmpeg_writer.py]
Step 2 -
go to line 221 and add a line that calculates the percentage based on duration - get_the_persentage_of_progress = (t/clip.duration)*100
Step 3 -
do anything with the variable get_the_persentage_of_progress
However, getting the variable back into our original code is a bit tricky.
Here are a few ideas -
write it to a file.txt
insert it into a database
[let me know in the comments if you have a better idea]
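If editing the installed module feels too invasive, there is a cleaner route: a sketch, assuming MoviePy ≥ 1.0, whose write functions accept a Proglog logger via the logger parameter. You subclass ProgressBarLogger and read the values from its callback:

from proglog import ProgressBarLogger
from moviepy.editor import VideoFileClip

class PercentLogger(ProgressBarLogger):
    def bars_callback(self, bar, attr, value, old_value=None):
        # 'index' ticks up as chunks are written; 'total' is the chunk count
        if attr == "index":
            total = self.bars[bar]["total"]
            if total:
                print(f"{bar}: {value / total * 100:.1f}%")

videoclip = VideoFileClip("video.mp4")
videoclip.audio.write_audiofile("audio.mp3", logger=PercentLogger())

This keeps the percentage inside your own code, with no file or database needed to pass it back.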

Get 'processed' terminal output in Python

I have some bytes containing escape sequences that move the cursor around, which causes the terminal to overwrite some text in the output. I want to "process" the bytes so that I only get the final output that appears when I print them. For example:
b = b'Using default tag: latest\r\nlatest: Pulling from library/hello-world\r\n\r\n\x1b[1A\x1b[2K\r0e03bdcc26d7: Pulling fs layer \r\x1b[1B\x1b[1A\x1b[2K\r0e03bdcc26d7: Downloading 423B/2.529kB\r\x1b[1B\x1b[1A\x1b[2K\r0e03bdcc26d7: Download complete \r\x1b[1B\x1b[1A\x1b[2K\r0e03bdcc26d7: Extracting 2.529kB/2.529kB\r\x1b[1B\x1b[1A\x1b[2K\r0e03bdcc26d7: Extracting 2.529kB/2.529kB\r\x1b[1B\x1b[1A\x1b[2K\r0e03bdcc26d7: Pull complete \r\x1b[1BDigest: sha256:31b9c7d48790f0d8c50ab433d9c3b7e17666d6993084c002c2ff1ca09b96391d\r\nStatus: Downloaded newer image for hello-world:latest\r\ndocker.io/library/hello-world:latest\r\n'
s = b.decode("utf-8")
print(s)
The output of the above script is:
Using default tag: latest
latest: Pulling from library/hello-world
Digest: sha256:31b9c7d48790f0d8c50ab433d9c3b7e17666d6993084c002c2ff1ca09b96391d
Status: Downloaded newer image for hello-world:latest
docker.io/library/hello-world:latest
I want to get the above output as a string inside python, as the string you see above, instead of with escape sequences and overwritten text. You can see the difference when you compare the lengths:
b = b'Using default tag: latest\r\nlatest: Pulling from library/hello-world\r\n\r\n\x1b[1A\x1b[2K\r0e03bdcc26d7: Pulling fs layer \r\x1b[1B\x1b[1A\x1b[2K\r0e03bdcc26d7: Downloading 423B/2.529kB\r\x1b[1B\x1b[1A\x1b[2K\r0e03bdcc26d7: Download complete \r\x1b[1B\x1b[1A\x1b[2K\r0e03bdcc26d7: Extracting 2.529kB/2.529kB\r\x1b[1B\x1b[1A\x1b[2K\r0e03bdcc26d7: Extracting 2.529kB/2.529kB\r\x1b[1B\x1b[1A\x1b[2K\r0e03bdcc26d7: Pull complete \r\x1b[1BDigest: sha256:31b9c7d48790f0d8c50ab433d9c3b7e17666d6993084c002c2ff1ca09b96391d\r\nStatus: Downloaded newer image for hello-world:latest\r\ndocker.io/library/hello-world:latest\r\n'
s = b.decode("utf-8")
desired_output = """Using default tag: latest
latest: Pulling from library/hello-world
Digest: sha256:31b9c7d48790f0d8c50ab433d9c3b7e17666d6993084c002c2ff1ca09b96391d
Status: Downloaded newer image for hello-world:latest
docker.io/library/hello-world:latest"""
print(len(s), len(desired_output))
Output:
544 238
One is more than twice as long as the other, even though print(s) and print(desired_output) result in the same text appearing in the terminal.
Initially I used pyte, a terminal-emulator library, where I could feed the bytes into an emulated terminal and "screengrab" it for raw text afterwards. Unfortunately this method was really slow (up to 7 s for 1 million bytes).
There must be a faster way, if only because my actual terminal is able to process that many bytes in much less time.
I've been searching for a solution for a while now, so I would also really appreciate any leads or partial answers.
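For reference, the pyte approach mentioned above looks roughly like this (a sketch; the 120x50 screen size is an arbitrary assumption and has to be large enough to hold the output, otherwise pyte's HistoryScreen is needed):

import pyte

# feed the raw output into an emulated terminal
screen = pyte.Screen(120, 50)  # columns x rows
stream = pyte.Stream(screen)
stream.feed(b.decode("utf-8"))

# "screengrab": join the rendered rows and drop the blank padding
processed = "\n".join(row.rstrip() for row in screen.display).rstrip()
print(processed)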

pdflatex hang after large number of figures

I have a script that generates a number of figures and puts them in the appendix of a report, e.g.
Appendix
********
.. figure:: images/generated/image_1.png
.. figure:: images/generated/image_2.png
.. figure:: images/generated/image_3.png
... etc
It looks like after a large number (~50) of images, my pdflatex command hangs, pointing to one of the graphics in my .tex file around here:
...
\begin{figure}[htbp]
\centering
\noindent\sphinxincludegraphics{{image_49}.png}
\end{figure}
\begin{figure}[htbp]
\centering
\noindent\sphinxincludegraphics{{image_50}.png} <--- here
\end{figure}
\begin{figure}[htbp]
\centering
\noindent\sphinxincludegraphics{{image_51}.png}
\end{figure}
...
When pdflatex fails, I can't really figure out what to make of the console output. I get a number of these lines, which seem to be good news:
<image_48.png, id=451, 411.939pt x 327.3831pt>
File: image_48.png Graphic file (type png)
<use image_48.png>
Package pdftex.def Info: image_48.png used on input line 1251.
(pdftex.def) Requested size: 411.93797pt x 327.3823pt.
<image_49.png, id=452, 411.939pt x 327.3831pt>
File: image_49.png Graphic file (type png)
<use image_49.png>
Package pdftex.def Info: image_49.png used on input line 1257.
(pdftex.def) Requested size: 411.93797pt x 327.3823pt.
Then, after the last successful image (~50), it starts outputting:
! Output loop---100 consecutive dead cycles.
\end@float ...loatpenalty <-\@Mii \penalty -\@Miv
\@tempdima \prevdepth \vbo...
l.1258 \end{figure}
I've concluded that your \output is awry; it never does a
\shipout, so I'm shipping \box255 out myself. Next time
increase \maxdeadcycles if you want me to be more patient!
[9
! Undefined control sequence.
\reserved@a ->\@nil
l.1258 \end{figure}
The control sequence at the end of the top line
of your error message was never \def'ed. If you have
misspelled it (e.g., `\hobx'), type `I' and the correct
spelling (e.g., `I\hbox'). Otherwise just continue,
and I'll forget about whatever was undefined.
If all I do is reduce the number of figures, it will run and produce a pdf without issue. Is there a hard limit to the number of images a section can have? Is there somewhere else I can look in the build log to narrow down why this is happening?
This seemed to be a combination of a couple of things.
The first symptom was essentially an error caused by too many unprocessed floats. The fix for this was to add the following to the babel element of latex_elements:
\usepackage[maxfloats=256]{morefloats}
The second symptom was the complaint about Output loop---100 consecutive dead cycles., so the fix was simply to increase the number of cycles:
\maxdeadcycles=1000
After these two adjustments, the pdflatex command finishes successfully, even with a large number of figures.
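For a Sphinx project, both fixes can be wired into conf.py; a sketch (I'm assuming \maxdeadcycles goes into the preamble element, while morefloats is appended to the babel element as described above):

# conf.py (sketch)
latex_elements = {
    # keep loading babel, then raise the float limit (the first fix)
    "babel": "\\usepackage{babel}\n\\usepackage[maxfloats=256]{morefloats}",
    # raise the dead-cycle limit (the second fix)
    "preamble": "\\maxdeadcycles=1000",
}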
I had this problem and the above suggestions did not work. I was, however, able to get it to run just fine by inserting subsections, which may or may not be compatible with your objectives. The script generates code as follows, which is then input into another code snippet to preview the generated images.
(I'm generating SVG plots from C++, converting them to PNG, and previewing essentially raw data for selection into later plots that go into an actual document, not just a collection of images.)
\subsection{svghappy2.tyrosine.png}
\begin{figure}[htbp]
\testplot{svghappy2_tyrosine.png}
\caption{svghappy2.tyrosine.png}
\end{figure}
\subsection{svghappy2.valine.png}
\begin{figure}[htbp]
\testplot{svghappy2_valine.png}
\caption{svghappy2.valine.png}
\end{figure}
The problem arises from the compiler having a hard time placing all the images, so splitting them up helps. As @mike-marchywka noted, sections may do the trick, but so would other things, such as \pagebreak or \FloatBarrier from the placeins package.

OpenCV + python -- grab frames from a video file

I can't seem to capture frames from a file using OpenCV -- I've compiled from source on Ubuntu with all the necessary prereqs according to: http://opencv.willowgarage.com/wiki/InstallGuide%20%3A%20Debian
#!/usr/bin/env python
import cv
import sys

files = sys.argv[1:]
for f in files:
    capture = cv.CaptureFromFile(f)
    print capture
    print cv.GetCaptureProperty(capture, cv.CV_CAP_PROP_FRAME_WIDTH)
    print cv.GetCaptureProperty(capture, cv.CV_CAP_PROP_FRAME_HEIGHT)
    for i in xrange(10000):
        frame = cv.QueryFrame(capture)
        if frame:
            print frame
Output:
ubuntu@local:~/opencv$ ./test.py bbb.avi
<Capture 0xa37b130>
0.0
0.0
The frames are always None...
I've transcoded a video file to i420 format using:
mencoder $1 -nosound -ovc raw -vf format=i420 -o $2
Any ideas?
You don't have the gstreamer-ffmpeg, gstreamer-python, or gstreamer-python-devel packages installed. I installed all three of them, and the exact same problem was resolved.
I'm using OpenCV 2.2.0, compiled on Ubuntu from source. I can confirm that the source code you provided works as expected. So the problem is somewhere else.
I couldn't reproduce your problem using mencoder (installing it is a bit of a problem on my machine), so I used ffmpeg to wrap a raw video in the AVI container:
ffmpeg -s cif -i ~/local/sample-video/foreman.yuv -vcodec copy foreman.avi
(foreman.yuv is a standard CIF image sequence you can find on the net if you look around).
Running the AVI from ffmpeg through your source gives this:
misha@misha-desktop:~/Desktop/stackoverflow$ python ocv_video.py foreman.avi
<Capture 0xa71120>
352.0
288.0
<iplimage(nChannels=3 width=352 height=288 widthStep=1056 )>
<iplimage(nChannels=3 width=352 height=288 widthStep=1056 )>
...
So things work as expected. What you should check:
Do you get any errors on standard output/standard error? OpenCV uses ffmpeg libraries to read video files, so be on the lookout for informative messages. Here's what happens if you try to play a RAW video file without a container (sounds similar to your problem):
error:
misha@misha-desktop:~/Desktop/stackoverflow$ python ocv_video.py foreman.yuv
[IMGUTILS @ 0x7fff37c8d040] Picture size 0x0 is invalid
[IMGUTILS @ 0x7fff37c8cf20] Picture size 0x0 is invalid
[rawvideo @ 0x19e65c0] Could not find codec parameters (Video: rawvideo, yuv420p)
[rawvideo @ 0x19e65c0] Estimating duration from bitrate, this may be inaccurate
GStreamer Plugin: Embedded video playback halted; module decodebin20 reported: Your GStreamer installation is missing a plug-in.
<Capture 0x19e3130>
0.0
0.0
Make sure your AVI file actually contains the information required to play back the video. At a minimum, this should be the frame dimensions. RAW video typically doesn't contain any information besides the actual pixel data, so knowing the frame dimensions and FPS is required. You can guess the FPS wrong and still get a viewable video, but if you get the dimensions wrong, the video will be unviewable.
Make sure the AVI file you're trying to open is actually playable. Try ffplay file.avi -- if that fails, then the problem is likely to be with the file. Try using ffmpeg to transcode instead of mencoder.
Make sure you can play other videos, using the same method as above. If you can't, then it's likely that your ffmpeg install is broken.
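For what it's worth, the same frame-grab loop under the newer cv2 bindings looks like this (a sketch; cv2.VideoCapture is the modern replacement for cv.CaptureFromFile and does its own container/codec probing):

import sys
import cv2

for path in sys.argv[1:]:
    capture = cv2.VideoCapture(path)
    print(capture.get(cv2.CAP_PROP_FRAME_WIDTH))
    print(capture.get(cv2.CAP_PROP_FRAME_HEIGHT))
    while True:
        ok, frame = capture.read()  # ok is False at end of stream or on error
        if not ok:
            break
        print(frame.shape)
    capture.release()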

How to unit test a Python function that draws PDF graphics?

I'm writing a CAD application that outputs PDF files using the Cairo graphics library. A lot of the unit testing does not require actually generating the PDF files, such as computing the expected bounding boxes of the objects. However, I want to make sure that the generated PDF files "look" correct after I change the code. Is there an automated way to do this? How can I automate as much as possible? Do I need to visually inspect each generated PDF? How can I solve this problem without pulling my hair out?
(See also update below!)
I'm doing the same thing using a shell script on Linux that wraps
ImageMagick's compare command
the pdftk utility
Ghostscript (optionally)
(It would be rather easy to port this to a .bat Batch file for DOS/Windows.)
I have a few reference PDFs created by my application which are "known good". Newly generated PDFs after code changes are compared to these reference PDFs. The comparison is done pixel by pixel and is saved as a new PDF. In this PDF, all unchanged pixels are painted in white, while all differing pixels are painted in red.
Here are the building blocks:
pdftk
Use this command to split multipage PDF files into multiple singlepage PDFs:
pdftk reference.pdf burst output somewhere/reference_page_%03d.pdf
pdftk comparison.pdf burst output somewhere/comparison_page_%03d.pdf
compare
Use this command to create a "diff" PDF page for each of the pages:
compare \
-verbose \
-debug coder -log "%u %m:%l %e" \
somewhere/reference_page_001.pdf \
somewhere/comparison_page_001.pdf \
-compose src \
somewhereelse/reference_diff_page_001.pdf
Ghostscript
Because of automatically inserted metadata (such as the current date and time), PDF output does not lend itself to MD5-hash-based file comparisons.
If you want to automatically discover all cases which consist of purely white pages, you could also convert to a metadata-free bitmap format using the bmp256 output device. You can do that for the original PDFs (reference and comparison), or for the diff-PDF pages:
gs \
-o reference_diff_page_001.bmp \
-r72 \
-g595x842 \
-sDEVICE=bmp256 \
reference_diff_page_001.pdf
md5sum reference_diff_page_001.bmp
If the MD5sum is what you expect for an all-white page of 595x842 PostScript points, then your unit test passed.
Update:
I don't know why I didn't previously think of generating a histogram output from the ImageMagick compare...
The following is a pipeline chaining two different commands:
the first one is the same compare as above, which generates the 'white pixels are equal, red pixels are differences' format, only it outputs the ImageMagick-internal miff format. It doesn't write to a file, but to stdout.
the second one uses convert to read stdin, generate a histogram, and output the result in text form. There will be two lines:
one indicating the number of white pixels
the other one indicating the number of red pixels.
Here it goes:
compare \
reference.pdf \
current.pdf \
-compose src \
miff:- \
| \
convert \
- \
-define histogram:unique-colors=true \
-format %c \
histogram:info:-
Sample output:
56934: (61937, 0, 7710,52428) #F1F100001E1ECCCC srgba(241,0,30,0.8)
444056: (65535,65535,65535,52428) #FFFFFFFFFFFFCCCC srgba(255,255,255,0.8)
(Sample output was generated by using these reference.pdf and current.pdf files.)
I think this type of output is really well suited for automatic unit testing. If you evaluate the two numbers, you can easily compute the "red pixel" percentage and you could even decide to return PASSED or FAILED based on a certain threshold (if you don't necessarily need "zero red" for some reason).
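A small Python wrapper can evaluate that histogram automatically; a sketch (the 0.1% threshold and the regex over the srgba lines are my own assumptions):

import re
import subprocess

# run the compare | convert pipeline from above and capture the histogram
compare = subprocess.Popen(
    ["compare", "reference.pdf", "current.pdf", "-compose", "src", "miff:-"],
    stdout=subprocess.PIPE,
)
histogram = subprocess.run(
    ["convert", "-", "-define", "histogram:unique-colors=true",
     "-format", "%c", "histogram:info:-"],
    stdin=compare.stdout, capture_output=True, text=True,
).stdout
compare.wait()

red = white = 0
for line in histogram.splitlines():
    m = re.match(r"\s*(\d+):.*srgba?\((\d+),(\d+),(\d+)", line)
    if m:
        count, r, g, b = (int(x) for x in m.groups())
        if (r, g, b) == (255, 255, 255):
            white += count   # unchanged pixels
        else:
            red += count     # differing pixels

percent_red = 100.0 * red / (red + white) if red + white else 0.0
print("FAILED" if percent_red > 0.1 else "PASSED", f"({percent_red:.3f}% red)")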
You could capture the PDF as a bitmap (or at least a losslessly-compressed) image, and then compare the image generated by each test with a reference image of what it's supposed to look like. Any differences would be flagged as an error for the test.
The first idea that pops into my head is to use a diff utility. These are generally used to compare texts of documents, but they might also compare the layout of the PDF. Using one, you can compare the expected output with the output supplied.
The first result Google gives me is this. Although it is commercial, there might be other free/open-source alternatives.
I would try this using xpresser (https://wiki.ubuntu.com/Xpresser). You can try to match images to similar images, not exact copies -- which is the problem in these cases.
I don't know if xpresser is being actively developed, or if it can be used with standalone image files (I think so) -- anyway, it takes its ideas from the Sikuli project (which is Java with a Jython front end, while xpresser is Python).
I wrote a tool in Python to validate PDFs for my employer's documentation. It has the capability to compare individual pages to master images. I used a library I found called swftools to export the page to PNG, then used the Python Imaging Library to compare it with the master.
The relevant code looks something like this (this won't run as there are some dependencies on other parts of the script, but you should get the idea):
# exporting
gfxpdf = gfx.open("pdf", self.pdfpath)
if os.path.isfile(pngPath):
    os.remove(pngPath)
page = gfxpdf.getPage(pagenum)
img = gfx.ImageList()
img.startpage(page.width, page.height)
page.render(img)
img.endpage()
img.save(pngPath)
return os.path.isfile(pngPath)

# comparing
outPng = os.path.join(outpath, pngname)
masterPng = os.path.join(outpath, "_master", pngname)
if os.path.isfile(masterPng):
    output = Image.open(outPng).convert("RGB")  # discard alpha channel, if any
    master = Image.open(masterPng).convert("RGB")
    # any nonzero channel extremum means at least one differing pixel
    mismatch = any(x[1] for x in ImageChops.difference(output, master).getextrema())
