I'm trying to play a gTTS voice using only pygame.mixer.music.load(). I don't want to save the audio to a file, so I write it to a BytesIO stream instead. gTTS returns MP3 audio, which I know pygame supports only partially, so I tried converting the MP3 to WAV with the pydub module, but I couldn't find a way to do that without saving to a file. How can I fix this?
from io import BytesIO

from gtts import gTTS
from pygame import mixer

mixer.init()

def play(buffer):
    buffer.seek(0)  # Rewind so pygame reads from the start
    mixer.music.load(buffer)  # Load the MP3
    print("Sound loaded. Time to play!")
    mixer.music.play()  # Play it

def generate_voice(text, accent):
    mp3_fp = BytesIO()
    tts = gTTS(text, lang=accent)
    tts.write_to_fp(mp3_fp)
    return mp3_fp

text = "Hi there"
accent = "en"  # not defined in the original snippet; assumed here
buffer = generate_voice(text, accent)
play(buffer)
The error returned by pygame.mixer.music.load(): pygame.error: ModPlug_Load failed
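I fixed this issue by using pydub to convert the audio to WAV in memory:
from io import BytesIO

from gtts import gTTS
from pydub import AudioSegment
from pygame import mixer

def play(buffer):
    mixer.init()
    mixer.music.load(buffer)  # Load the WAV data
    print("Sound loaded. Time to play!")
    mixer.music.play()  # Play it

def generate_voice(text, lang):
    fp = BytesIO()
    wav_fp = BytesIO()
    tts = gTTS(text=text, lang=lang)
    tts.write_to_fp(fp)
    fp.seek(0)  # Rewind before pydub reads the MP3 data
    sound = AudioSegment.from_file(fp, format="mp3")
    sound.export(wav_fp, format="wav")  # Export into the separate WAV buffer
    wav_fp.seek(0)  # Rewind so pygame reads from the start
    return wav_fp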
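The detail that is easy to miss in the snippet above is that a BytesIO buffer keeps a current position, so it has to be rewound with seek(0) before another library reads from it. Here is a minimal sketch of that idea using only the standard library wave module, with no gTTS, pydub, or pygame involved. The function names make_wav_buffer and describe_wav and the tone parameters are illustrative assumptions, not part of the original answer.

```python
import io
import math
import struct
import wave


def make_wav_buffer(duration_s=0.1, freq_hz=440.0, rate=8000):
    """Write a short mono 16-bit sine tone into an in-memory WAV buffer."""
    buffer = io.BytesIO()
    n_frames = round(duration_s * rate)
    frames = b"".join(
        struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(n_frames)
    )
    # wave.open() does not close a file object that it did not open itself,
    # so the BytesIO buffer stays usable after the with-block.
    with wave.open(buffer, "wb") as wf:
        wf.setnchannels(1)      # mono
        wf.setsampwidth(2)      # 16-bit samples
        wf.setframerate(rate)
        wf.writeframes(frames)
    buffer.seek(0)  # Rewind so the next reader starts at the RIFF header
    return buffer


def describe_wav(buffer):
    """Read back (channels, sample width, frame rate, frame count) from a WAV buffer."""
    with wave.open(buffer, "rb") as wf:
        return wf.getnchannels(), wf.getsampwidth(), wf.getframerate(), wf.getnframes()


if __name__ == "__main__":
    print(describe_wav(make_wav_buffer()))
```

The returned buffer is positioned at the start, so it can be handed to any consumer that accepts a file-like object.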
Related
I was trying to create a voice assistant in Python, but my code prints some strange text when I run it. My code is below:
import re
import sys

from neuralintents import GenericAssistant
import speech_recognition
import pyttsx3 as tts

recognizer = speech_recognition.Recognizer()
speaker = tts.init()
speaker.setProperty('rate', 150)

def create_note():
    global recognizer
    speaker.say("What do you want to say?")
    speaker.runAndWait()
    done = False
    while not done:
        try:
            with speech_recognition.Microphone() as mic:
                recognizer.adjust_for_ambient_noise(mic, duration=0.2)
                audio = recognizer.listen(mic)
                note = recognizer.recognize_google(audio)
                note = note.lower()
                speaker.say("Choose a file name")
                speaker.runAndWait()  # the call parentheses were missing
                recognizer.adjust_for_ambient_noise(mic, duration=0.2)
                audio = recognizer.listen(mic)  # was a bare listen(mic)
                filename = recognizer.recognize_google(audio)
                filename = filename.lower()
            with open(filename, 'w') as f:
                f.write(note)
            done = True
            speaker.say(f"I saved the note {filename}")  # call say(), do not assign to it
            speaker.runAndWait()
        except speech_recognition.UnknownValueError:
            recognizer = speech_recognition.Recognizer()
            speaker.say("I didn't get that, please say it again")
            speaker.runAndWait()

mappings = {'greeting': create_note}
assistant = GenericAssistant('intents.json', intent_methods=mappings)
assistant.train_model()
I expect it to take input from the microphone, look it up in a JSON file, and say the matching response from that file.
Here are the contents of the JSON file:
JSON FILE
I am writing a simple Python program that reads a text file, uses IBM Watson Text to Speech to convert it to audio, and then plays the audio directly using a module such as playsound.
Most tutorials only show how to save the result to a file, not how to pass it to a module that plays the audio.
from ibm_watson import TextToSpeechV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator('{apikey}')
text_to_speech = TextToSpeechV1(
    authenticator=authenticator
)
text_to_speech.set_service_url('{url}')

with open('hello_world.wav', 'wb') as audio_file:
    audio_file.write(
        text_to_speech.synthesize(
            'Hello world',
            voice='en-US_AllisonVoice',
            accept='audio/wav'
        ).get_result().content)
That's not what I want. I want to be able to play the audio without saving it. How can I do that?
If you are open to external libraries, you can install the VLC bindings for Python with pip install python-vlc
Then use a media player object to play the audio directly from the content, as below.
import vlc
from ibm_watson import TextToSpeechV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator('{apikey}')
text_to_speech = TextToSpeechV1(
    authenticator=authenticator
)
text_to_speech.set_service_url('{url}')

# Define the VLC instance
instance = vlc.Instance('--input-repeat=-1', '--fullscreen')
# Define the VLC player
player = instance.media_player_new()
# Define the VLC media
# Note: media_new() is documented as taking a path or URL (MRL), so passing
# raw bytes as done here may not work with every python-vlc version.
media = instance.media_new(
    text_to_speech.synthesize(
        'Hello world',
        voice='en-US_AllisonVoice',
        accept='audio/wav').get_result().content)
# Set the player media
player.set_media(media)
# Play the media
player.play()
An advantage of VLC is that it can play most media types directly from a URL (not just MP3), and it also offers player controls such as:
>>> player.pause() #pause playback
>>> player.play() #resume playback
>>> player.stop() #stop playback
from gtts import gTTS
import winsound
Text = "Hello world"
Audio = gTTS(text=Text,lang="en",slow=False)
Audio.save("T.wav")
winsound.PlaySound("T.wav", winsound.SND_FILENAME)
When I run the code there are no errors, but a default Windows alert sound is played instead of the T.wav file.
Here is my Python code:
import pyttsx3
engine = pyttsx3.init(driverName='sapi5')
infile = "tanjil.txt"
f = open(infile, 'r')
theText = f.read()
f.close()
engine.say(theText)
engine.runAndWait()
I couldn't save the speech to an audio file.
As of July 14 2019, I'm able to save to file with the pyttsx3 library (without using another library or internet connection).
It doesn't appear to be documented, but looking at the source code in github for the Engine class in "engine.py" (https://github.com/nateshmbhat/pyttsx3/blob/master/pyttsx3/engine.py), I was able to find a "save_to_file" function:
def save_to_file(self, text, filename, name=None):
    '''
    Adds an utterance to speak to the event queue.

    @param text: Text to speak
    @type text: unicode
    @param filename: the name of file to save.
    @param name: Name to associate with this utterance. Included in
        notifications about this utterance.
    @type name: str
    '''
    self.proxy.save_to_file(text, filename, name)
I am able to use this like:
engine.save_to_file('the text I want to save as audio', path_to_save)
I'm not sure of the format. It seems to be some raw audio format (my guess is something like AIFF), but I can play it in an audio player.
If you install pydub:
https://pypi.org/project/pydub/
then you can easily convert this to mp3, e.g.:
from pydub import AudioSegment
AudioSegment.from_file(path_to_save).export('converted.mp3', format="mp3")
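Since the answer above is unsure which format save_to_file produces, one way to check is to try opening the saved file with the standard library wave module, which only accepts RIFF/WAVE data. This is a sketch and not part of the original answer. The helper name describe_audio_file is hypothetical, and path_to_save refers to the variable used in the snippet above.

```python
import wave


def describe_audio_file(path):
    """Return basic WAV parameters as a dict, or None if the file is not a WAV file."""
    try:
        with wave.open(path, "rb") as wf:
            return {
                "channels": wf.getnchannels(),
                "sample_width_bytes": wf.getsampwidth(),
                "frame_rate_hz": wf.getframerate(),
                "frames": wf.getnframes(),
            }
    except (wave.Error, EOFError):
        # wave.Error: the header is not a supported RIFF/WAVE header.
        # EOFError: the file is too short to contain a header at all.
        return None
```

Calling describe_audio_file(path_to_save) returns a dict for a WAV file. A result of None means the file is in some other container, or is a WAV variant the wave module cannot read, so a converter such as pydub is still needed.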
I've tried @Brian's solution but it didn't work for me.
I searched around a bit and couldn't figure out how to save the speech to MP3 with pyttsx3, but I found another solution that doesn't use pyttsx3.
It can take a .txt file and directly output a .wav file:
def txt_zu_wav(eingabe, ausgabe, text_aus_datei=True, geschwindigkeit=2, Stimmenname="Zira"):
    from comtypes.client import CreateObject
    engine = CreateObject("SAPI.SpVoice")
    engine.rate = geschwindigkeit  # from -10 to 10
    for stimme in engine.GetVoices():
        if stimme.GetDescription().find(Stimmenname) >= 0:
            engine.Voice = stimme
            break
    else:
        print("Error: voice not found -> using the default")
    if text_aus_datei:
        datei = open(eingabe, 'r')
        text = datei.read()
        datei.close()
    else:
        text = eingabe
    stream = CreateObject("SAPI.SpFileStream")
    from comtypes.gen import SpeechLib
    stream.Open(ausgabe, SpeechLib.SSFMCreateForWrite)
    engine.AudioOutputStream = stream
    engine.speak(text)
    stream.Close()

txt_zu_wav("test.txt", "test_1.wav")
txt_zu_wav("It also works with a string instead of a file path", "test_2.wav", False)
This was tested with Python 3.7.4 on Windows 10.
import pyttsx3

engine = pyttsx3.init("sapi5")
voice = engine.getProperty("voices")[0]
engine.setProperty('voice', voice.id)  # setProperty expects the voice id, not the Voice object
text = 'Your Text'
engine.save_to_file(text, 'name.mp3')
engine.runAndWait()  # don't forget to use this line
Try the following code snippet to convert text to audio and save it as an mp3 file.
import pyttsx3
from pydub import AudioSegment
engine = pyttsx3.init('sapi5')
engine.save_to_file('This is a test phrase.', 'test.mp3') # raw audio file
engine.runAndWait()
AudioSegment.from_file('test.mp3').export('test.mp3', format="mp3") # audio file in mp3 format
NB: The pyttsx3 save_to_file() method creates a raw audio file that other applications may not be able to use, even if a media player can play it. pydub is a useful package for converting that audio into a specific format.
import pyaudio
import wave

chunk = 1024
wf = wave.open('yes.mp3', 'rb')
p = pyaudio.PyAudio()
stream = p.open(
    format=p.get_format_from_width(wf.getsampwidth()),
    channels=wf.getnchannels(),
    rate=wf.getframerate(),
    output=True)
data = wf.readframes(chunk)
while data:  # readframes() returns an empty value at end of file
    stream.write(data)
    data = wf.readframes(chunk)
stream.close()
p.terminate()
No matter which of several methods I try, I keep getting the following error in the terminal:
raise Error, 'file does not start with RIFF id'
I would use pyglet, but media and its other submodules aren't detected even though I can import pyglet.
Any help?
You're using wave to open a file that is not a WAV file; it is an MP3 file. The wave module can only open WAV files, so you would need to convert the MP3 to WAV first. Alternatively, here's how you can use pyglet to play an MP3 file:
import pyglet
music = pyglet.resource.media('music.mp3')
music.play()
pyglet.app.run()
It would be much simpler than the method you're trying. What errors are you getting with pyglet?
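The "file does not start with RIFF id" error discussed above comes from the wave module checking the first bytes of the file. Here is a rough sketch of the same kind of check, which can help tell a real WAV file from an MP3 with a misleading name. The helpers sniff_audio_format and sniff_audio_file are hypothetical names, and the MP3 detection is a simplified heuristic rather than a full parser.

```python
def sniff_audio_format(header: bytes) -> str:
    """Guess an audio container from the first bytes of a file."""
    # WAV layout: b"RIFF", a 4-byte chunk size, then b"WAVE".
    if len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    # An MP3 file that carries an ID3v2 tag starts with b"ID3".
    if header[:3] == b"ID3":
        return "mp3"
    # A bare MPEG audio frame starts with 11 set sync bits. This is only a
    # heuristic: other MPEG audio streams can begin the same way.
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"


def sniff_audio_file(path: str) -> str:
    """Read the first 12 bytes of a file and guess its format."""
    with open(path, "rb") as f:
        return sniff_audio_format(f.read(12))
```

If sniff_audio_file('yes.mp3') reports "mp3", the file has to be converted or played with an MP3-capable library, because wave.open() will reject it whatever its extension.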