Record RTSP stream to file (.wav)

Record RTSP stream to file (.wav) - python

I'm trying to save X seconds from an audio stream to a file. I have an RTSP server, and I wrote a simple Python script to record several seconds from this server to a .wav file.
import time
import vlc
def main():
    ########################### MAIN INIT ###########################
    instance = vlc.Instance("-vvv", "--no-video", "--clock-jitter=0", "--sout-audio", "--sout",
                            "#transcode{acodec=s16l,channels=2}:std{access=file,mux=wav,dst=test.wav}")
    # Create a MediaPlayer with the default instance
    player = instance.media_player_new()
    # Load the media file
    media = instance.media_new("rtsp://XXX.XX.XXX.XX:YYYY/")
    # Add the media to the player
    player.set_media(media)
    # Play for 10 seconds then exit
    player.play()
    time.sleep(10)
if __name__ == '__main__':
    main()
When I run the script it creates the file "test.wav", but it is a plain text file rather than the WAV file I expect.
The log shows the following:
[00000000022aec08] core input error: ES_OUT_RESET_PCR called
[00007f6704040518] core decoder error: cannot continue streaming due to errors
I would really appreciate any help.
Thanks so much.

As you probably know, WAV files are structured into fields, each representing different information; see an example at this link (https://github.com/kushalpandya/WavStagno).
It sounds like your output is not formatted correctly. There are tools available to inspect a WAV file, which would be a good place to start, or if you are able to share a link to the file here then people can take a look.
If what you are trying to do is listen to the stream and save it at the same time, then you likely want to use the duplicate functionality. There is a good example here (albeit video based): https://stackoverflow.com/a/16758988/334402
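As a starting point for inspecting the output file, here is a minimal sketch using only the Python standard library. The function names looks_like_wav and describe_wav are made up for illustration; the check relies on the fact that a WAV file begins with the bytes "RIFF", a 4-byte size, then "WAVE".

```python
import wave


def looks_like_wav(path):
    """Return True if the first 12 bytes carry the RIFF/WAVE signature."""
    with open(path, "rb") as f:
        header = f.read(12)
    return len(header) == 12 and header[0:4] == b"RIFF" and header[8:12] == b"WAVE"


def describe_wav(path):
    """Read the basic format fields with the standard-library wave module."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "frame_rate": wav.getframerate(),
            "frames": wav.getnframes(),
        }
```

Calling looks_like_wav("test.wav") on the file from the question should tell you quickly whether VLC wrote audio data or something else; describe_wav only makes sense once the first check passes.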

Related

Creating a simple IBM Assistant using their TTS and STT. I get a Bytes and Strings error. I am using VLC to play audio. How can I fix this?

This is the code. Its purpose is to use VLC for IBM's Text to Speech to speak within the Python IDE. It's my first step for the assistant. This question is different from a regular strings and bytes error because it involves IBM Cloud instead of a simple program error.
import vlc
from ibm_watson import TextToSpeechV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
authenticator = IAMAuthenticator("API Key Here")
text_to_speech = TextToSpeechV1(
    authenticator=authenticator
)
text_to_speech.set_service_url(
    'https://api.us-south.text-to-speech.watson.cloud.ibm.com/instances/113cd664-f07b-44fe-a11d-a46cc50caf84')
# define VLC instance
instance = vlc.Instance('--input-repeat=-1', '--fullscreen')
# Define VLC player
player = instance.media_player_new()
# Define VLC media
media = instance.media_new(
    text_to_speech.synthesize(
        'Hello world',
        voice='en-US_AllisonVoice',
        accept='audio/wav').get_result().content)
# Set player media
player.set_media(media)
# Play the media
player.play()
I get this error...
Traceback (most recent call last):
  File "C:/Users/PycharmProjects/IBM Test/iBM tEST.py", line 24, in <module>
    accept='audio/wav').get_result().content)
  File "C:\Users\PycharmProjects\IBM Test\venv\lib\site-packages\vlc.py", line 1947, in media_new
    if ':' in mrl and mrl.index(':') > 1:
TypeError: a bytes-like object is required, not 'str'
I have tried this...
text_to_speech.synthesize('Hello world'.encode(), ...)
I get this error back...
b'Hello world' is not JSON serializable
If anyone recognizes this issue, please let me know what I could be doing wrong. I am trying to play a simple text line in my Python IDE. I am coding in PyCharm.
I know that this block of code works because it is directly from IBM's API documentation. I have used this for myself to test...
from ibm_watson import TextToSpeechV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
authenticator = IAMAuthenticator('{apikey}')
text_to_speech = TextToSpeechV1(
    authenticator=authenticator
)
text_to_speech.set_service_url('{url}')
with open('hello_world.wav', 'wb') as audio_file:
    audio_file.write(
        text_to_speech.synthesize(
            'Hello world',
            voice='en-US_AllisonVoice',
            accept='audio/wav'
        ).get_result().content)
This code saves the speech synthesized from the input text to a WAV file called hello_world.wav. I am basically trying to integrate that into a system that plays the speech directly in the IDE. If anyone knows of any alternative methods other than VLC, please let me know.

If you pay close attention to the error message, you will see that the error is actually being thrown by the vlc code, which implies that the output from the TTS service is not what vlc is expecting.
You need to break up your code and first verify what output you are getting from TTS. If it is audio, then you can work out how the vlc code expects to receive it. I suspect vlc does not expect it in the format that the TTS is outputting.
Updated answer
The output from TTS is a data stream of audio content; in Python this will be a bytes object. It looks as though VLC is looking for a string. That makes no sense if VLC is looking for audio data. If, however, it is looking for a string, then that string could be a file location. So I think you need to write the audio to a file and give the file path to VLC.
IMHO, based on the question you are asking and the code you have cobbled together, your coding skills are not up to the challenge, and you may be better off spending a couple of weeks going through some Python coding tutorials. You may find the investment in training time pays off, sparing you the struggle with what are quite fundamental coding issues here.
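The suggestion above (write the bytes to a file, then hand VLC the path) could be sketched roughly as follows. The helper names write_audio_to_tempfile and play_audio_bytes are hypothetical, the VLC calls are the same ones used in the question, and the vlc import is done lazily so that the file-writing part can be tried without python-vlc installed.

```python
import tempfile


def write_audio_to_tempfile(audio_bytes, suffix=".wav"):
    """Write raw audio bytes to a temporary file and return its path as a str."""
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(audio_bytes)
        return tmp.name


def play_audio_bytes(audio_bytes):
    """Assumed usage: save the bytes, then give VLC a file path instead of bytes."""
    import vlc  # python-vlc, imported here so the helper above has no VLC dependency

    path = write_audio_to_tempfile(audio_bytes)
    instance = vlc.Instance()
    player = instance.media_player_new()
    player.set_media(instance.media_new(path))
    player.play()
    return player  # keep a reference so playback is not cut short
```

With the code from the question, the bytes would come from text_to_speech.synthesize(...).get_result().content. The temporary file is not deleted automatically, so remove it once playback has finished.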

Flask Audio File to Wave Object Python

I want to convert an audio file received from a Flask API (of type class 'werkzeug.datastructures.FileStorage') to a Wave (https://pypi.org/project/Wave/) object. Usually, you do this by supplying a path on your computer:
import wave
wav = wave.open("test.wav", "r")
But this doesn't work as I do not want to save the audio file to my computer. This is how I get the audio file in my flask script:
audio = request.files["audio"]
Please let me know what I can do! Thanks.

You can try the following modification of your code:
audio = request.files['audio_file']
The request.files is a dictionary. The dictionary key that will allow you to retrieve the audio file is 'audio_file' instead of 'audio'.

You can use the save() function:
audio = request.files["audio"]
path = './videos/sample.wav'
audio.save(path)
Check the documentation for further details:
https://werkzeug.palletsprojects.com/en/2.0.x/datastructures/#werkzeug.datastructures.FileStorage.save
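Since the question asks to avoid saving the upload to disk, another option worth trying is to pass a file-like object straight to the standard-library wave module, which accepts either a filename or an open binary stream. The sketch below assumes the upload is a valid WAV file and that the FileStorage object's stream attribute is readable and seekable; make_silent_wav_bytes is a hypothetical helper that only stands in for a real upload.

```python
import io
import wave


def open_wave_from_stream(stream):
    """Open a binary file-like object as a wave.Wave_read without touching disk."""
    return wave.open(stream, "rb")


def make_silent_wav_bytes(n_frames=160, rate=16000):
    """Hypothetical helper: build a tiny mono 16-bit WAV in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


# In a Flask view this would be (assumption):
#     wav = open_wave_from_stream(request.files["audio"].stream)
wav = open_wave_from_stream(io.BytesIO(make_silent_wav_bytes()))
print(wav.getnchannels(), wav.getframerate(), wav.getnframes())
```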

Playing audio in python at given timestamp

I am trying to find a way in python to play a section of an audio file given a start and end time.
For example, say I have an audio file that is 1 min in duration. I want to play the section from 0:30 to 0:45 seconds.
I do not want to process or splice the file, only playback of the given section.
Any suggestions would be greatly appreciated!
Update:
I found a great solution using pydub:
https://github.com/jiaaro/pydub
from pydub import AudioSegment
from pydub.playback import play
audiofile = "path/to/audio.wav"  # path to audiofile
start_ms = 30000  # start of clip in milliseconds (0:30)
end_ms = 45000  # end of clip in milliseconds (0:45)
sound = AudioSegment.from_file(audiofile, format="wav")
splice = sound[start_ms:end_ms]
play(splice)

Step one is to get Python to play the entire audio file; several libraries are available for this. See whether the library you pick has a time-specific API call. Failing that, you can always roll up your sleeves and implement it yourself after reading the audio file into a buffer, or stream the file and stop streaming at the end of the chosen section.
Another alternative is to leverage command-line tools like ffmpeg, which is the Swiss Army knife of audio processing. ffmpeg has command-line parameters for a specific start and stop time; also look at its sibling ffplay.
Similar to ffplay/ffmpeg is another command-line audio tool called sox.
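To make the ffplay suggestion concrete, here is a small sketch that builds the command line from Python. It assumes ffplay is installed and on the PATH; the function names are made up for illustration. The -ss option sets the start position and -t the duration, while -nodisp and -autoexit suppress the display window and quit when the segment ends.

```python
import subprocess


def build_ffplay_command(path, start_s, end_s):
    """Build an ffplay command that plays only [start_s, end_s) of the file."""
    if end_s <= start_s:
        raise ValueError("end_s must be greater than start_s")
    return [
        "ffplay", "-nodisp", "-autoexit",
        "-ss", str(start_s),         # seek to the start position (seconds)
        "-t", str(end_s - start_s),  # play for this many seconds
        path,
    ]


def play_segment(path, start_s, end_s):
    """Run ffplay and block until the segment has finished playing."""
    subprocess.run(build_ffplay_command(path, start_s, end_s), check=True)
```

For the example in the question, play_segment("audio.wav", 30, 45) would play from 0:30 to 0:45 without modifying the file.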

Use PyMedia and Player. Look at the functions SeekTo() and SeekEndTime(). I think you will be able to find a right solution after playing around with these functions.

I always have trouble installing external libraries and if you are running your code on a server and you don't have sudo privileges then it becomes even more cumbersome. Don't even get me started on ffmpeg installation.
So, here's an alternative solution with scipy and native IPython that avoids the hassle of installing some other library.
from scipy.io import wavfile  # to read and write audio files
import IPython.display  # to play them in a Jupyter notebook without the hassle of some other library
def PlayAudioSegment(filepath, start, end, channel='none'):
    # get sample rate and audio data
    sample_rate, audio_data = wavfile.read(filepath)  # where filepath = 'directory/audio.wav'
    # get length in minutes of audio file
    print('duration: ', audio_data.shape[0] / sample_rate / 60, 'min')
    ## splice the audio with preferred start and end times (assumes a stereo file)
    spliced_audio = audio_data[start * sample_rate : end * sample_rate, :]
    ## choose left or right channel if preferred (0 or 1 for left and right, respectively; or leave as a string to keep as stereo)
    spliced_audio = spliced_audio[:, channel] if type(channel) == int else spliced_audio
    ## playback natively with IPython; shape needs to be (nChannel,nSamples)
    return IPython.display.Audio(spliced_audio.T, rate=sample_rate)
Use like this:
filepath = 'directory_with_file/audio.wav'
start = 30 # in seconds
end = 45 # in seconds
channel = 0 # left channel
PlayAudioSegment(filepath,start,end,channel)
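In the same spirit of avoiding extra installs, the segment can also be read with nothing but the standard-library wave module, which seeks to the start frame and reads only the frames that are needed. This is a sketch for uncompressed PCM WAV files; read_wav_segment is a hypothetical name, and the returned raw bytes still have to be handed to whatever playback mechanism you use.

```python
import wave


def read_wav_segment(path, start_s, end_s):
    """Return (raw_frames, frame_rate, channels, sample_width) for [start_s, end_s)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        total = wav.getnframes()
        # Convert seconds to frame indices, clamped to the length of the file.
        start_frame = min(int(start_s * rate), total)
        end_frame = min(int(end_s * rate), total)
        wav.setpos(start_frame)
        frames = wav.readframes(max(end_frame - start_frame, 0))
        return frames, rate, wav.getnchannels(), wav.getsampwidth()
```

Only the requested section is read into memory, and the original file is left untouched.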

Show Video with audio in VLC

I have a Python program that shows photos and videos from a folder. The files in this folder keep changing, so a static playlist is not possible. I can create a playlist with a media_list_player.
I create an instance with
instance = vlc.Instance('--input-repeat=-1 --aout=alsa --alsa-audio-device=plughw:0,0')
mediaList = instance.media_list_new()
mediaList.add_media(instance.media_new(tmpPath + "/start.jpg"))
list_player = instance.media_list_player_new()
list_player.set_media_list(mediaList)
list_player.audio_set_volume(100)
list_player.play()
And I add new files with
mediaList.add_media(instance.media_new(showFile))
list_player.set_media_list(mediaList)
This works, but when there is a video I don't get any sound. "audio_set_volume()" doesn't work. I tried the normal media_player, but there I can't make a playlist. So I hope someone here can help me add sound to VLC.

Moviepy unable to read duration of file

I have been using Moviepy to combine several shorter video files into hour-long files. Some small files are "broken": they contain video but were not completed correctly (i.e. they play with VLC, but there is no duration and you cannot skip around in the video).
I noticed this issue when I try to create a clip using VideoFileClip(file) function. The error that comes up is:
MoviePy error: failed to read the duration of file
Is there a way to still read the "good" frames from this video file and then add them to the longer video?
UPDATE
To clarify, my issue specifically is with the following function call:
clip = mp.VideoFileClip("/home/test/"+file)
Stepping through the code it seems to be an issue when checking the duration of the file in ffmpeg_reader.py where it looks for the duration parameter in the video file. However, since the file never finished recording properly this information is missing. I'm not very familiar with the way video files are structured so I am unsure of how to proceed from here.

You're correct. This issue arises commonly when the video duration info is missing from the file.
Here's a thread on the issue: GitHub moviepy issue 116
One user proposed the solution of using MP4Box to convert the video using this guide: RASPIVID tutorial
The final solution that worked for me involved specifying the path to ImageMagick's binary file as WDBell mentioned in this post.
I had the path correctly set in my environment variables, but it wasn't until I specifically defined it in config_defaults.py that it started working.

I solved it in a simpler way: with the help of VLC I converted the file to the format MPEG4 xxx TV/device, and you can then use the new file with Python without any problem.
xxx = 720p or
xxx = 1080p
It all depends on which output format you choose.
I already answered this question in the GitHub issue: https://github.com/Zulko/moviepy/issues/116

This issue appears when the VideoFileClip(file) function from moviepy looks for the duration parameter in the video file and it is missing. To avoid this (for those corrupted files), you should make sure that the total frame count is not zero before calling the function: clip = mp.VideoFileClip("/home/test/"+file)
I handled it in a simpler way using cv2.
The idea:
1. Find out the total number of frames.
2. If the frame count is zero, call the cv2 writer and generate a temporary copy of the video clip.
3. Mix the audio from the original video with the copy.
4. Replace the original video and delete the copy.
5. Then call the function clip = mp.VideoFileClip("/home/test/"+file)
Clarification: Since OpenCV VideoWriter does not encode audio, the new copy will not contain audio, so it would be necessary to extract the audio from the original video and then mix it with the copy, before replacing it with the original video.
You must import cv2
import cv2
And then add something like this in your code before the evaluation:
cap = cv2.VideoCapture("/home/test/"+file)
frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
fps = int(cap.get(cv2.CAP_PROP_FPS))
print(f'Checking Video {file} Frames {frames} fps: {fps}')
For a broken file this will most likely report 0 frames, but it should at least report the frame rate (fps).
Now we can set the evaluation to avoid the error and handle it making a temp video:
if frames == 0:
    print(f'No frames data in video {file}, trying to convert this video..')
    writer = cv2.VideoWriter("/home/test/fixVideo.avi", cv2.VideoWriter_fourcc(*'DIVX'), int(cap.get(cv2.CAP_PROP_FPS)),
                             (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))))
    while True:
        ret, frame = cap.read()
        if ret is True:
            writer.write(frame)
        else:
            cap.release()
            print("Stopping video writer")
            writer.release()
            writer = None
            break
Mix the audio from the original video with the copy. I have created a function for this:
from moviepy.editor import VideoFileClip, CompositeAudioClip  # moviepy 1.x import path
def mix_audio_to_video(pathVideoInput, pathVideoNonAudio, pathVideoOutput):
    # take the audio track from the original video
    videoclip = VideoFileClip(pathVideoInput)
    audioclip = videoclip.audio
    new_audioclip = CompositeAudioClip([audioclip])
    # attach it to the re-encoded copy, which has no audio
    videoclipNew = VideoFileClip(pathVideoNonAudio)
    videoclipNew.audio = new_audioclip
    videoclipNew.write_videofile(pathVideoOutput)
mix_audio_to_video("/home/test/"+file, "/home/test/fixVideo.avi", "/home/test/fixVideo.mp4")
Replace the original video and delete the copies:
import os
os.replace("/home/test/fixVideo.mp4", "/home/test/"+file)
os.remove("/home/test/fixVideo.avi")  # delete the temporary copy without audio

I had the same problem and found a solution.
I don't know why, but if we enter the path as a raw string, path = r'<path>', instead of ("F:\\path"), we get no error.

Just open
C:\Users\gladi\AppData\Local\Programs\Python\Python311\Lib\site-packages\moviepy\video\io\ffmpeg_reader.py
delete the code in it, and replace it with the version
I provide on GitHub - https://github.com/dudegladiator/Edited-ffmpeg-for-moviepy

clip1=VideoFileClip('path')
c=clip1.duration
print(c)
