Audio waveform visualisation in Python/Django


I've looked around Stack Overflow for an answer to this, but nowhere seems to give the correct answer or direction...
My project will allow a user to upload a WAV, which ultimately will be converted to a low quality MP3 using FFmpeg on the server and it'll all be stored and served on Amazon S3. The next obstacle is working out how to extract a reliable waveform visualisation from this uploaded sound. I'm using Python and Django on Linux Ubuntu 10 on a VPS for this project...
At the very least, I need some sort of direction... I'm at a loss as to where to start looking for such a tool.

This one (uses audiolab, PIL and numpy) is decent: http://www.freesound.org/blog/?p=10

To make a graph or plot of the waveform, the usual Python approach is to get the waveform into a numpy array, and then use matplotlib to make the plot.
The easiest way to read the data into a numpy array is to use scipy.io.wavfile.read, though if you prefer not to use scipy (it's a big package), it's not difficult to read and convert the data using Python's wave module.
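For the scipy-free route, a minimal sketch using only the standard library might look like the following. The function names are made up for illustration, and it assumes 16-bit PCM input; it reduces the samples to one (min, max) pair per horizontal pixel, which is the usual shape of data behind a waveform image.

```python
import array
import sys
import wave


def read_wav_samples(path):
    """Return (sample_rate, samples) for a 16-bit PCM WAV file.

    Only the first channel is kept, to keep the sketch short.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return rate, samples[::channels]  # interleaved data: keep channel 0


def waveform_peaks(samples, buckets):
    """Reduce samples to at most `buckets` (min, max) pairs.

    Samples left over after the last full bucket are ignored.
    """
    size = max(1, len(samples) // buckets)
    peaks = []
    for start in range(0, size * buckets, size):
        chunk = samples[start:start + size]
        if chunk:
            peaks.append((min(chunk), max(chunk)))
    return peaks
```

Each pair can then be drawn as one vertical line, with matplotlib, PIL or whatever drawing tool is available on the server.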

Not trying to answer my own question here, but this is a suggestion that may help others who come across this question...
After lots of searching around, I found this solution... It seems well done, but does anyone else know anything about it?
Seems to do the lot!
http://code.google.com/p/timeside/

Related

Manipulate an application window frame using Python

TLDR: Is there a Python library that allows me to get an application window frame as an image and write it back to said application?
So the whole story is that I want to write an application using Python that does something similar to Lossless Scaling and Magpie. I want to grab an application window (a videogame window, for example), get the current frame as an image, then use some Machine Learning/Deep Learning algorithm (like FSR or DLSS) to upscale said image, then rewrite the current frame from the application with said upscaled image.
So far, I have been playing around with some upscaling algorithms like the one from Real-ESRGAN, but now my main problem is how to upscale the video game images in real-time. The only thing I found that does something related to what I need to do is PyAutoGUI. But this package only allows you to take screenshots of an application but not rewrite the graphics of said application.
I hope I have clarified my problem; feel free to comment if you still have any questions.
Thank you for reading this post, and have a good day.

Doing this with Python is going to be very difficult. A lot of the performance involved in this sort of thing is in avoiding as many memory copies as possible, and Python's idiom for string and bytes processing unfortunately makes quite a few additional copies in the course of any idiomatic program. I say this as a die-hard Python fan who is constantly trying to cram Python in everywhere it doesn't belong: you'd be better off doing this in Rust.
Update: After receiving some feedback from some folks with more direct experience in this sort of thing, I may have overstated the difficulty here. Many ML tools in Python provide zero-copy access, you can easily access and manipulate memory-mapped data from numpy and there is even a CUDA protocol for doing this to data in GPU memory, so while it's not exactly easy, as long as your operations are implemented as numpy operations and not as pure-python pixel-by-pixel logic, it shouldn't be much harder than other python machine learning applications which require access to native APIs for accessing their source data.
However, there's no way to access framebuffer data directly from python, so step 1 is going to be writing your own bindings over the relevant DirectX APIs. Since Magpie is open source, you can see which APIs it's using, for example, in its various C++ "Frame Source" backends. For example, this looks relevant: https://github.com/Blinue/Magpie/blob/42cfcba1222b07e4cec282eaff639aead229f123/Runtime/GraphicsCaptureFrameSource.cpp#L87
You can then look those APIs up on MSDN; that one, for example, is here: https://learn.microsoft.com/en-us/uwp/api/windows.graphics.capture.direct3d11captureframepool.createfreethreaded?view=winrt-22621
CFFI is a good choice for writing native wrappers: https://cffi.readthedocs.io/en/latest/
Gluing these together appropriately is left as an exercise for the reader :).
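To illustrate the point about copies in plain Python: slicing a bytes object copies the data, while slicing a memoryview does not. A small sketch, where the bytearray is only a stand-in for a captured frame and the 4-bytes-per-pixel layout is an assumption:

```python
def row_views(frame, width, height, bytes_per_pixel=4):
    """Yield one memoryview per pixel row without copying pixel data."""
    view = memoryview(frame)
    stride = width * bytes_per_pixel
    if len(view) != stride * height:
        raise ValueError("buffer size does not match the frame dimensions")
    for y in range(height):
        yield view[y * stride:(y + 1) * stride]


# Stand-in for a captured frame: 4x2 pixels, 4 bytes per pixel (assumed).
frame = bytearray(4 * 2 * 4)
rows = list(row_views(frame, width=4, height=2))
rows[1][0] = 255  # writes through to `frame`; the row is a view, not a copy
```

numpy arrays built over the same buffer behave similarly, which is what makes the numpy route workable.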

Python read microphone without PyAudio

I'm trying to collect data from a Geiger counter using the microphone and process it using Python. However, I'm using a university computer so I'm not allowed to install the PyAudio module which seems necessary for this job (Python read microphone). Are there any equivalent functions in numpy, matplotlib or scipy?

Here's an outline of an approach that I think might work:
The hardest part of this is getting data from the microphone, and you'll need a tool that's built for this. Since you're on Windows, you could look for a prebuilt tool to do this. You could try to run something as a subprocess, but probably better is to use ctypes and windll.kernel32 to call a Windows recording API. Googling "windll.kernel32 recording" produces some reasonable hits, like this.
If you do go the subprocess route, you'll probably end up calling something that first writes the output to a .wav file. If that's the case, you could then read the file using either the Python wave module, or scipy.io.wavfile.read. (Note that wave files can be more complex than these modules can read, so when you set the recording parameters, don't go crazy.)
Finally, getting the data into the computer by recording audio from the device is quite problematic, because external noise will have to be filtered out. It would be much better to find a way to get the data into the computer without the intervening audio.

I know the question got answered and accepted, but I'd like to offer 2 other options:
A Python virtualenv would work around the "not allowed to install anything on the computer" restriction, which I guess is imposed by local IT rather than by department policy.
Use ffmpeg in a wrapper. Drop the statically compiled executable in a known and acceptable location, then use subprocess to start it with the appropriate command-line switches so that it writes the captured audio to stdout (read as a file-like object on Python's side).
Both options are free as in free beer, and both make cross-platform support fairly straightforward.
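A rough sketch of the ffmpeg wrapper idea, using only subprocess. The function names are invented for this example, and the input format and device name are assumptions that depend on the machine (for instance "dshow" with "audio=Microphone" on Windows, or "alsa" with "default" on Linux):

```python
import subprocess


def build_capture_command(input_format, device, sample_rate=44100):
    """Build an ffmpeg command that writes raw 16-bit mono PCM to stdout.

    `input_format` and `device` are platform-specific assumptions,
    e.g. ("dshow", "audio=Microphone") or ("alsa", "default").
    """
    return [
        "ffmpeg",                 # assumes the executable is on PATH
        "-loglevel", "error",
        "-f", input_format,
        "-i", device,
        "-ac", "1",               # mono
        "-ar", str(sample_rate),  # samples per second
        "-f", "s16le",            # raw signed 16-bit little-endian output
        "-",                      # write to stdout instead of a file
    ]


def read_chunks(command, chunk_bytes=4096):
    """Run `command` and yield its stdout in chunks of up to `chunk_bytes`."""
    with subprocess.Popen(command, stdout=subprocess.PIPE) as proc:
        try:
            while True:
                chunk = proc.stdout.read(chunk_bytes)
                if not chunk:
                    break
                yield chunk
        finally:
            proc.terminate()  # stop the recorder if the caller stops early
```

Each chunk is a bytes object of raw samples, so something like `for chunk in read_chunks(build_capture_command("alsa", "default")): ...` would hand the audio to the processing code as it arrives.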

is there a simple portable way to read a gif image using python (no PIL)?

I am using Python inside another application (CINEMA 4D) to create a nice connection to our issue tracker (Jira) inside the application. The rationale behind this is to make it really easy for our plugin users to report and track bugs, and to attach things like machine specs, screenshots or scene files (including textures) automatically.
So far it has been a really smooth ride and the integration is coming along great. I started grabbing the icons for issue priorities, projects, issue types, etc. from Jira as well so they can be displayed for better overview. To read the image files I am using CINEMA 4D functionality that is available inside its Python binding.
The problem now is, that most icons from Jira come in GIF format and the CINEMA 4D SDK doesn't read GIF files directly (actually it does read them, but only through a back door so users can load them, but I can't use that functionality through Python or the SDK). So I need another way to read the GIF files.
There are a few questions on stackoverflow that go into this direction, but they all seem to recommend PIL. This doesn't feel like the right solution for a few reasons:
While that looks nice, it's not part of the standard distribution and seems to be really only maintained for Windows (even though there are builds for Mac OS X).
It also seems to install itself into the current system installation of Python, but CINEMA 4D comes with its own, so I'd have to rip it apart and distribute it with my plugin.
And then it is quite large, while I really only want a compact script to have a compact solution (preferably out of the box, but that doesn't seem to be an option)
I was wondering if there is a simpler or at least more compact way. Since GIF seems to be a relatively simple file format, I am wondering if there may even be a simple parser as a python function/class.
I found a link where somebody disassembles a GIF file's embedded frames, but doesn't actually grab the image contents: Python, how i can get gif frames
I'm fine with putting in some time on my own, and I would've already been coding away if the file format was something uncompressed, but I am a little reluctant since the compression seems to raise the bar slightly.
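For what it's worth, the uncompressed parts of the format are easy to read with the struct module alone. The sketch below (function name invented here) parses the header, the logical screen descriptor and the global colour table; it does not decode the LZW-compressed pixel data, which is the part that raises the bar.

```python
import struct


def parse_gif_header(data):
    """Parse the GIF header and global colour table from `data` (bytes).

    Returns (width, height, palette), where palette is a list of
    (r, g, b) tuples, or an empty list if there is no global table.
    """
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")

    # Logical screen descriptor: width, height, packed flags,
    # background colour index, pixel aspect ratio (little-endian).
    width, height, packed, _background, _aspect = struct.unpack_from(
        "<HHBBB", data, 6
    )

    palette = []
    if packed & 0x80:  # bit 7 set: a global colour table follows
        entries = 2 ** ((packed & 0x07) + 1)
        table = data[13:13 + 3 * entries]
        palette = [tuple(table[i:i + 3]) for i in range(0, len(table), 3)]
    return width, height, palette
```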

Memory usage in django-imagekit is unacceptable -- ideas on fixes?

Django-imagekit, which I'm using to process user uploaded images on a social media website, uses an unacceptably high level of memory. I'm looking for ideas on how to get around this problem.
We are using django-imagekit to resize each user-uploaded image into three predefined sizes, and we save all four copies (3 processed plus 1 original) to our Amazon S3 bucket.
This operation is quickly causing us to go over our memory limit on our Heroku dynos. On the django-imagekit github page, I've seen a few suggestions for hacking the library to use less memory.
I see three options:
Try to hack django-imagekit, and deal with the ensuing update problems from using a modified third party library
Use a different imaging processing library
Do something different entirely -- resize the images in the browser perhaps? Or use a third party service? Or...?
I'm looking for advice on which of these routes to take. In particular, if you are familiar with django-imagekit, or if you know of / are using a different image processing library in a Django app, I'd love to hear your thoughts.
Thanks a lot!
Clay

Try resizing an image with PIL from the console and see whether the memory usage is OK. Image resizing is a simple task; I don't believe you should need side applications. Besides, split your task into 3 tasks (3 images?).

How can I represent avi video as set of matrices using Python?

I have video files written in avi format and I would like to analyze these videos using Python. For that I would like to represent every frame of the video as a 2D matrix.
How can I do that? A Google search suggests PyMedia as the way to go. Is it really the best choice, or are there other approaches that I should consider?
If PyMedia is a good choice, could anybody please give me a link where I can get exe files to install the module on Windows from binaries?
By the way, is it a good idea, in general, to use Python for these purposes? I like Python very much because of its simplicity and I prefer to use it, but if it is really not suitable for analysis of video, I am ready to use something else.
ADDED
Some people claim that PyMedia is "dead". Is it true?

Yeah, the latest news on the PyMedia web site is dated 01 Feb 2006. That's a pretty bad sign.
The most active and up-to-date open project for manipulating video is ffmpeg. Apparently there is a recently updated python wrapper for it: http://code.google.com/p/pyffmpeg/
In general Python is much too slow for doing any sort of pixel analysis of video. Therefore there will be practically zero libraries of any reasonable level of quality and support for helping at the pixel level of granularity. There are well supported libraries for working at an image level of granularity though. PIL seems to be a popular choice: http://www.pythonware.com/products/pil/
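If installing a wrapper is not an option, one alternative is to let the ffmpeg command-line tool do the decoding and read raw frames from its output, e.g. from `ffmpeg -i input.avi -f rawvideo -pix_fmt gray -` run through subprocess. The sketch below only shows the Python side: it splits such a stream into 2D matrices. The function name is invented here, the frame width and height must be known in advance, and an in-memory buffer stands in for the ffmpeg pipe.

```python
import io


def read_gray_frames(stream, width, height):
    """Yield frames from a raw 8-bit grayscale stream as lists of rows."""
    frame_bytes = width * height
    while True:
        data = stream.read(frame_bytes)
        if len(data) < frame_bytes:
            break  # end of stream (a trailing partial frame is dropped)
        yield [list(data[y * width:(y + 1) * width]) for y in range(height)]


# Stand-in for ffmpeg's stdout: two 3x2 frames of raw gray pixels.
fake_video = io.BytesIO(bytes(range(12)))
frames = list(read_gray_frames(fake_video, width=3, height=2))
# frames[0] == [[0, 1, 2], [3, 4, 5]]
```

With numpy available, each frame's bytes could instead be wrapped with numpy.frombuffer and reshaped to (height, width), which avoids building Python lists pixel by pixel.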
