I am writing a little Python script that converts MP3 files to AAC. I am doing so by using avconv, and I am wondering if I am doing it right.
Currently my command looks like this:
avconv -i input.mp3 -ab 80k output.aac
This brings me to my first question: I am using -ab 80k because it works with my test files. On some files I can go higher and use 100k, but I'd prefer to always use the highest setting. Is there a way to specify that?
The other question: I am using it in a Python script. Currently I call it as a subprocess, but I'd prefer not to, as this forces me to write a file to disk and then load it again when everything is done. Is there a way to do it in memory only? I am returning the file afterwards using web.py and don't need or want it on my disk, so it would be great not to have to use temporary files at all.
Thanks for any tips and tricks :)
I don't have the -ab option, but if it is equivalent to -ar (which specifies the sample rate), I should point out that your ears won't be able to tell the difference between 80k and anything higher.
On the subject of temporary files, have you considered using /tmp or a specific tmpfs file system created for the purpose?
Edit:
In response to the comment about tempfiles: yes, you still use them, but you create them in /tmp or in a tmpfs file system that you have created for the job. It should get cleared on reboot, but I would expect you to delete the file once you have passed it on anyway.
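As a minimal sketch of that approach (assuming avconv is on the PATH; the input/output names are placeholders):

import os
import subprocess
import tempfile

# Create the output file under /tmp, which is often backed by tmpfs (RAM)
with tempfile.NamedTemporaryFile(suffix='.aac', dir='/tmp', delete=False) as tmp:
    out_path = tmp.name

# -y lets avconv overwrite the (empty) temp file we just created
subprocess.check_call(['avconv', '-y', '-i', 'input.mp3', '-ab', '80k', out_path])

with open(out_path, 'rb') as f:
    data = f.read()  # the bytes to hand back to web.py
os.remove(out_path)  # delete as soon as it has been passed on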
As for the other point, about lossless AAC, I may come back to you later.
Edit 2:
As I suspected, the AAC format is described as the logical successor to MP3, and I strongly suspect (you or someone else may know differently) that because MP3 is what is termed lossy compression, i.e. bits (no pun intended) are missing, your desire to convert losslessly is doomed, in so much as the source is already lossy.
Of course, being no expert in the matter, I stand to be corrected.
Edit 3:
Your comment about too many frames leads me to believe that you are conflating the two avconv options -ar and -b.
The -b option specifies the output bit rate for both video and audio. The way you are using it, I suspect it is attempting to apply the same bit rate to audio and video, but there is a limit on the audio stream.
You would have to use -b:v to tell avconv to set the video bit rate and leave the audio rate alone.
I suggest that you lose the -ab option and use -ar instead, as that is audio-only.
I have a bunch of videos from which I want to extract specific sections (either as videos or as frames). I get the sections from a .json file, where the start and end frames are stored under labels like 'cat in video' or 'dog in video'. I have an existing Python method using OpenCV (the method mentioned here), but I found a one-liner using ffmpeg which is a lot faster and more efficient than my Python script, except that I have to manually fill in the start and end frames in this command:
ffmpeg -i in.mp4 -vf select='between(n\,x\,y)' -vsync 0 frames%d.png
I read a few questions about working with .json files in a shell script or passing arguments to a batch script, which looks quite complicated and might break my system. Since I'm not familiar with .json files in a shell/batch script, I'm not sure how to start. Can anyone point me in the right direction on how to make a batch script that can read variables from a .json file and feed them into my ffmpeg command?
Since you're already familiar with Python, I suggest you use it to parse the JSON files; you can then use the ffmpeg-python library, which is an ffmpeg binding for Python. It also has a crop function, which I assume is what you need.
An alternative would be to use os.system('ffmpeg <arguments>') calls from a Python script, which lets you run external tools from the script.
Python natively supports JSON with its built-in json module.
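To make the pieces concrete, here is a rough sketch that parses the JSON and feeds the frame numbers into the ffmpeg command via subprocess (rather than os.system); the JSON layout here is a guess, since the actual structure isn't shown:

import json
import subprocess

# Assumed layout: {"cat in video": {"start": 120, "end": 360}, ...}
with open('labels.json') as f:
    labels = json.load(f)

for label, frames in labels.items():
    x, y = frames['start'], frames['end']
    subprocess.check_call([
        'ffmpeg', '-i', 'in.mp4',
        '-vf', "select='between(n,{},{})'".format(x, y),
        '-vsync', '0',
        '{}_frame%d.png'.format(label.replace(' ', '_')),
    ])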
As for doing this in Python, here is an alternative approach that you can try with my ffmpegio-core package:
import ffmpegio
ffmpegio.transcode('in.mp4', 'frames%d.png', vf=f"select='between(n,{x},{y})'", vsync=0)
If the videos are constant frame rate, it could be faster to specify the start and end timestamps as input options:
fs = ffmpegio.probe.video_streams_basic('in.mp4')[0]['frame_rate']
ffmpegio.transcode('in.mp4', 'frames%d.png', ss_in=x/fs, to_in=y/fs, vsync=0)
If you don't know the frame rate, you are calling ffprobe and ffmpeg for each file, so there is a tradeoff. But if your input video is long, it could be worthwhile.
But if speed is your primary goal, calling FFmpeg directly is always the fastest.
ffmpegio GitHub repo
Yes, I know that this has been asked many, many times, but the libraries in every single answer just end up needing ffmpeg.
The problem with that is that the file size increases dramatically when I include ffmpeg in my project, and I just don't want that.
I want to keep my project as lightweight as possible without adding 200 megabytes of data just for video-to-audio conversion, which is only a very small part of the project.
So, is there any way to:
1) not use ffmpeg,
2) use another lightweight converter with a Python wrapper, or
3) use just the parts of ffmpeg where the WebM-to-MP3 conversion actually takes place?
Compile your own ffmpeg using this configuration to decode Vorbis/Opus audio in WebM and encode MP3 via libmp3lame:
./configure --disable-everything --disable-network --disable-autodetect --enable-small --enable-protocol=file,pipe --enable-demuxer=matroska --enable-muxer=mp3 --enable-decoder=vorbis,opus --enable-encoder=libmp3lame --enable-libmp3lame --enable-filter=aresample
The resulting ffmpeg binary is under 2 MB.
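If you then call the trimmed binary from Python, it behaves like any other ffmpeg build; a minimal sketch (paths and filenames are placeholders):

import subprocess

# ./ffmpeg here is the stripped-down binary produced by the configuration above
subprocess.check_call(['./ffmpeg', '-i', 'in.webm', '-codec:a', 'libmp3lame', 'out.mp3'])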
I am looking for a way to automate tasks in external programs with Python.
I have large audio files in AAC format. I need to convert them to MP3 and then amplify them (avoiding distortion).
I wrote a program with the pydub library that works great with small files, but my files are too large (longer than 2 hours, or over 200 MB) and I run out of memory (because that library stores the full file in RAM, I think). I can't split the file into chunks, because I could not merge them again for the same reason, and I need the file in one piece.
So I would like to write a program that opens another program to convert the file to MP3 (MediaHuman Audio Converter) and then amplifies the converted file with yet another program (WavePad Audio Editor), but I don't know if this is possible.
At present I'm doing this manually, but it involves a lot of waiting and fewer than 10 clicks (spread throughout the process), which is tedious.
Below is the program I wrote. I transcribed it to remove some functions that are not relevant to this process, and I translated the comments, variables and other things into English, so it may have some errors, but the original program works well:
from pydub import AudioSegment

# convert to mp3 at 128 kbit/s
sound = AudioSegment.from_file("input-file.aac")
sound.export("output-file.mp3", format="mp3", bitrate="128k")

# sound.max_dBFS shows how far below the limit the highest sample is (in dB)
sound = AudioSegment.from_file("output-file.mp3", format="mp3")
max_gain_without_distortion = -1 * sound.max_dBFS

# increase volume by "max_gain_without_distortion" dB
louder_song = sound + max_gain_without_distortion

# save louder song
louder_song.export("output.mp3", format="mp3")
PC specifications:
OS: Windows 10 Pro, 64-bit
RAM: 4 GB
CPU: dual-core, 3 GHz
Python version: 3.7.1
pydub version: v0.23.1-0-g46782a9
ffmpeg/avlib version: "Build: ffmpeg-20190219-ff03418-win32-static"
As agreed in the comments, the solution I am going to propose is a command-line tool: FFmpeg. Here's the command you need:
ffmpeg -i input-file.aac -b:a 128k -filter:a loudnorm output.mp3
This uses the loudnorm filter (note that the audio bitrate is set with -b:a; -b:v would set the video bitrate). You can also apply gain directly, as explained in the docs, but one should expect inferior results. Normalization can be done in a number of ways; I suggest reading this post.
By combining it with e.g. find . -name '*.aac' -type f you can easily find and convert all files in a directory tree.
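If you'd rather do the directory walk from Python, the equivalent loop is short; a sketch, reusing the ffmpeg command above:

import os
import subprocess

for root, _dirs, files in os.walk('.'):
    for name in files:
        if name.lower().endswith('.aac'):
            src = os.path.join(root, name)
            dst = os.path.splitext(src)[0] + '.mp3'
            subprocess.check_call([
                'ffmpeg', '-i', src,
                '-b:a', '128k', '-filter:a', 'loudnorm',
                dst,
            ])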
If you're bent on using Python, you can check out the Python bindings. Basics:
import ffmpeg
ffmpeg.input('stereo.aac').output('mono.mp3').run()
Initially I was going to propose sox: Sound eXchange, the Swiss Army knife of audio manipulation. It's not Python, though it has Python bindings: pysox. However, it turned out that it does not support the AAC format (it still supports dozens of other formats). I thought it would be interesting to mention it anyway, as one could convert to a more popular format with ffmpeg first and pipe the result to sox. The latter has many more options for modifying an audio stream.
Convert WAV to MP3 at a 128 kbit/s bitrate (sox sets the MP3 bitrate with the -C option; -r would set the sample rate instead):
sox input-file.wav -C 128 output-file.mp3
The OP asks to "increase volume by max_gain_without_distortion dB", and for this we can use either gain or norm, as explained in the docs:
sox input-file.wav -C 128 output-file.mp3 gain -n -3
Per the docs, the -n option normalises the audio to 0 dB FSD; it is often used in conjunction with a negative gain-dB, to the effect that the audio is normalised to a given level below 0 dB.
sox --norm input-file.wav -C 128 output-file.mp3
I have a little program (Python 2.7) that runs on an old machine and basically keeps taking pictures (for timelapses) by running an external binary, then converts them to an efficient format to save disk space.
I want to minimize the disk operations, because the machine is already pretty old and I want it to last some more time.
At the moment the program writes the data from the camera to disk, converts it and removes the original data. It does that for every image: 1) it writes a large file to disk, 2) reads it back to convert it, 3) and then deletes it... a bunch of disk operations that aren't necessary and could be done in RAM, because the original file doesn't have to be stored and is only used as the basis for creating another one.
I was sure a ramdisk was the solution; then I googled how to do it, and Google returned a bunch of links discouraging the use of ramdisks. The reasons are many: they are not useful on modern systems (I'm running a pretty new Linux kernel); they should only be used to hold decrypted data that shouldn't hit the disk; some tests show that a ramdisk can actually be slower than a hard disk; the operating system has its own cache...
So I'm confused...
In this situation, should I use a ramdisk?
Thank you.
PS: If you want more info: I have a proprietary high-res camera and a proprietary binary that I run to capture a single image. I can specify where it will write the file, which is a huge TIFF; the Python program then runs ImageMagick's convert to turn it into a JPEG and compresses it into a tar.bz2, so the quality is almost the same but the file size is about 1/50 of the TIFF.
My experience with ramdisks is congruent with what you've mentioned here. I lost performance when I moved to them, because there was less memory available for the kernel to do its caching intelligently, and that messed things up.
However, from your question I understand that you want to optimise for the number of disk operations rather than for speed, in which case a RAM disk might make sense. As with most problems of this kind, monitoring is the right way to decide.
Another thing that struck me was that if your original image is not that big, you might want to buy a cheap USB stick and do the I/O on that rather than on your main drive. Is that not an option?
Ah, proprietary binaries that only give certain options. Yay. The simplest solution would be adding a solid-state drive. You would still be saving to disk, but disk I/O would be much faster for both reading and writing.
A better solution would be outputting the TIFF to stdout, perhaps in a different format, and piping it into your Python program. It would never hit the hard drive at all, but it would be more work. Of course, if the binary doesn't allow you to do this, then it's moot.
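For illustration, a sketch of that piping idea in Python, assuming (hypothetically) that the capture binary accepts "-" to mean stdout; ImageMagick's convert can read TIFF from stdin with an explicit format prefix:

import subprocess

# Hypothetical: 'camera-capture -o -' stands in for your proprietary binary
# writing the TIFF to stdout; check whether it actually supports this.
capture = subprocess.Popen(['camera-capture', '-o', '-'],
                           stdout=subprocess.PIPE)
# convert reads the TIFF from stdin and writes a JPEG, so the large TIFF
# never touches the disk
convert = subprocess.Popen(['convert', 'tiff:-', 'jpeg:output.jpg'],
                           stdin=capture.stdout)
capture.stdout.close()  # so capture gets SIGPIPE if convert exits early
convert.wait()
capture.wait()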
If on Debian (and possibly its derivatives), use "/run/shm" directory.
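For example, you can point Python's tempfile module at it so the intermediate TIFF lives in RAM (a sketch; the directory must exist on your system):

import tempfile

# Files created under /run/shm are held in RAM, not written to disk
tmp = tempfile.NamedTemporaryFile(suffix='.tiff', dir='/run/shm', delete=False)
print(tmp.name)  # pass this path to the capture binary as its output file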
I want to make an editor that does the following:
1) Takes an mp3 audio file
2) Takes a picture (a jpg file)
3) Outputs a simple video format e.g. .mov which consists of the jpg file with the mp3 file in the background
4) Does NOTHING else
I want to use this as a project to learn the basics of all this, but I do not want to code basic things by hand. Where do I start, and what key steps should I take to accomplish this?
I am decent with PHP and Java and do not mind learning Python for this. I actually would ideally want to write this in Python to gain experience.
Thanks!
If you want to code such a solution yourself, forget Python: compile ffmpeg and use its libraries directly from your code, after you have carefully read them (or maybe use pyffmpeg, which still requires you to know the ffmpeg internals).
However, I'm pretty sure that what you want could be done with the ffmpeg executable alone from the command line; that way, though, your Python code ends up as a wrapper around subprocess.Popen (it's quite a popular solution, actually).
I think it's a matter of what level of understanding you're aiming at: either you're OK with reading the ffmpeg docs and trusting that it's going to work (then use Python), or you need to dive deep into the ffmpeg sources to gain a real understanding of what's going on (which I don't have, by the way), in which case pythonic bindings will just stand in your way.
I have needed ffmpeg (from Django) a few times already and never had to do anything more than assemble a list of ffmpeg command-line args. On the other hand, I would very much like to actually understand what the hell I'm doing, but no one has seemed interested in paying me to grok the ffmpeg sources. :-(
I'm pretty sure you could do this all from the mencoder command line (use the -speed option, I think; you might need to give it a duplicate of your jpg for every few seconds of video you want, as it can only slow things down by a factor of 100 at most).
If you opt for the ffmpeg CLI solution, or need a process to try to replicate with the libraries directly, the relevant CLI command is fairly straightforward:
ffmpeg -loop 1 -i input.jpg -i input.mp3 -c:v libx264 -tune stillimage -shortest output.mov
The -loop 1 input option keeps the still frame repeating, and -shortest ends the video when the audio runs out; without them, ffmpeg would produce a clip only one frame long.
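Wrapped in Python, it is little more than assembling the argument list, as mentioned above; a sketch (the function name is mine):

import subprocess

def make_slideshow(image, audio, out):
    # -loop 1 repeats the single frame; -shortest stops when the audio ends
    subprocess.check_call([
        'ffmpeg', '-loop', '1', '-i', image, '-i', audio,
        '-c:v', 'libx264', '-tune', 'stillimage', '-shortest', out,
    ])

make_slideshow('input.jpg', 'input.mp3', 'output.mov')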