How to save a file as MP3 from Amazon Polly using Python

I am using Amazon Polly for TTS, but I can't figure out how to save the converted speech as an .mp3 file on my computer.
I have tried gTTS, but I need Amazon Polly for my task.
import boto3
client = boto3.client('polly')
response = client.synthesize_speech(
    Text="Hello my name is Shubham", OutputFormat="mp3", VoiceId='Aditi')
Now, what should I do to play this converted speech or save it to my PC as an .mp3 file?

This code sample is taken straight from the documentation: https://docs.aws.amazon.com/polly/latest/dg/SynthesizeSpeechSamplePython.html
import boto3
polly_client = boto3.Session(
aws_access_key_id=,
aws_secret_access_key=,
region_name='us-west-2').client('polly')
response = polly_client.synthesize_speech(VoiceId='Joanna',
OutputFormat='mp3',
Text = 'This is a sample text to be synthesized.')
file = open('speech.mp3', 'wb')
file.write(response['AudioStream'].read())
file.close()
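A minimal sketch of the same save step, assuming the AudioStream value behaves like a file-like object with read() and close(), as the snippet above uses it. The helper name save_audio_stream and the chunk size are illustrative and not part of boto3; an in-memory io.BytesIO stands in for the Polly response so the block runs without AWS credentials.

```python
import io
import os
import tempfile
from contextlib import closing


def save_audio_stream(stream, path, chunk_size=4096):
    """Copy a file-like audio stream to `path` in chunks; return bytes written."""
    written = 0
    # closing() closes the stream even if writing fails part-way
    with closing(stream), open(path, "wb") as out:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    return written


# Stand-in for response['AudioStream'] (assumption: any object with read()/close())
fake_stream = io.BytesIO(b"fake mp3 bytes")
target = os.path.join(tempfile.mkdtemp(), "speech.mp3")
print(save_audio_stream(fake_stream, target), "bytes written to", target)
```

With a real response, this would presumably be called as save_audio_stream(response['AudioStream'], 'speech.mp3').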

While not directly related to the original question, I responded to one of the comments asking how to get the audio stream without saving the audio to a file.
You might also check out the documentation for this example:
https://docs.aws.amazon.com/polly/latest/dg/example-Python-server-code.html
This shows getting the response back from Polly:
response = polly.synthesize_speech(Text=text, VoiceId=voiceId, OutputFormat=outputFormat)
data_stream=response.get("AudioStream")
The first line makes the request to Polly and stores the response in the response object, while the second line gets the audio stream from the response object.
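If the goal is to keep the audio in memory instead of on disk, one possible sketch is to read the stream into an io.BytesIO buffer. The function name audio_stream_to_buffer is hypothetical, and the dictionary below only imitates the shape of the response used above (a mapping with an "AudioStream" entry that has read() and close()).

```python
import io


def audio_stream_to_buffer(response):
    """Read the whole AudioStream into a seekable in-memory buffer."""
    stream = response.get("AudioStream")
    if stream is None:
        raise ValueError("response has no AudioStream")
    try:
        return io.BytesIO(stream.read())
    finally:
        stream.close()


# Stand-in for a Polly response (assumption: only the AudioStream key matters here)
fake_response = {"AudioStream": io.BytesIO(b"\x00\x01\x02")}
buffer = audio_stream_to_buffer(fake_response)
print(len(buffer.getvalue()), "bytes held in memory")
```

The returned buffer can be handed to any code that expects a binary file object.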

Related

Importing mime .eml file to gmail API using the import function

I am a Python developer and somewhat new to using Google's Gmail API to import .eml files into a Gmail account.
I've done all of the groundwork: my OAuth credentials are working, etc.
However, I am stuck at the point where I load the data file. I need help loading the message data into a variable.
How do I create the message_data variable, in the appropriate format, from my sample email file (stored on disk in RFC 822 format)?
Assuming I have a file on disk at /path/to/file/sample.eml, how do I load it into message_data in the proper format for the Gmail API import call?
...
# how do I properly load message_data from the rfc822 disk file?
media = MediaIoBaseUpload(message_data, mimetype='message/rfc822')
message_response = service.users().messages().import_(
userId='me',
fields='id',
neverMarkSpam=True,
processForCalendar=False,
internalDateSource='dateHeader',
media_body=media).execute(num_retries=2)
...
You want to import an eml file using Gmail API.
You have already been able to get and put values for Gmail API.
You want to achieve this using google-api-python-client.
service in your script can be used for uploading the eml file.
If my understanding is correct, how about this answer? Please think of this as just one of several possible answers.
Modification point:
In this case, the "Users.messages: insert" method is used.
Modified script:
Before you run the script, please set the filename with the path of the eml file.
eml_file = "###" # Please set the filename with the path of the eml file.
user_id = "me"
f = open(eml_file, "r", encoding="utf-8")
eml = f.read()
f.close()
message_data = io.BytesIO(eml.encode('utf-8'))
media = MediaIoBaseUpload(message_data, mimetype='message/rfc822', resumable=True)
metadata = {'labelIds': ['INBOX']}
res = service.users().messages().insert(userId=user_id, body=metadata, media_body=media).execute()
print(res)
The above script also requires the following imports.
import io
from googleapiclient.http import MediaIoBaseUpload
Note:
In the modified script above, {'labelIds': ['INBOX']} is used as the metadata, so the imported eml file appears in the Gmail INBOX. If you want it to go somewhere else, change the label IDs.
Reference:
Users.messages: insert
If I misunderstood your question and this was not the result you want, I apologize.
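As a possible variation on the loading step, the file can be read in binary mode so the message bytes are passed along exactly as stored on disk, without a decode and re-encode round trip. This is only a sketch: load_eml is a hypothetical helper, the sample message is made up, and the block stops at building message_data (it does not call the Gmail API).

```python
import io
import os
import tempfile


def load_eml(path):
    """Return the raw RFC 822 bytes of `path` wrapped in a seekable BytesIO."""
    # Binary mode: no text decoding and no newline translation
    with open(path, "rb") as f:
        return io.BytesIO(f.read())


# Write a tiny made-up message to a temporary file for the demo
sample = b"From: a@example.com\r\nSubject: hi\r\n\r\nbody\r\n"
path = os.path.join(tempfile.mkdtemp(), "sample.eml")
with open(path, "wb") as f:
    f.write(sample)

message_data = load_eml(path)
print(message_data.getvalue() == sample)
```

The resulting message_data plays the same role as the io.BytesIO object passed to MediaIoBaseUpload in the script above.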

Converting Audio Blob to text in Python using Speech recognition

Apologies for the English....
I am building a chatbot application where voice is recorded on the client side through HTML5's MediaRecorder API and sent as FormData to a Python Falcon web service.
On the Python side I need to convert this audio blob directly to text.
Currently I write the audio blob to a wav file and then read from that file. However, this takes a long time because file I/O is involved. I need to somehow consume the audio blob directly as the input source for speech recognition.
This is what I have tried:
def on_post(self, req, resp):
    # Write the uploaded audio blob to disk, then read it back for recognition
    with open("backend.wav", 'wb') as f:
        f.write(req.get_param('audio_data').file.read())
    mic = sr.AudioFile('backend.wav')
    with mic as source:
        print("Speak !!")
        audio = r.record(source)
    #audio = req
    results = r.recognize_google(audio_data=audio, language="en-US", show_all=True)
    return results
I am not an experienced Python developer, so please pardon me if this is a stupid question. Any help is highly appreciated.
I can't test it, but it could work.
It seems that AudioFile can accept a file object, so this code uses io.BytesIO to create a file object in memory and store the data in it. This way it doesn't have to use the disk.
import io
def on_post(self, req, resp):
    f = req.get_param('audio_data').file
    file_obj = io.BytesIO()   # create file-object
    file_obj.write(f.read())  # write in file-object
    file_obj.seek(0)          # move to beginning so it will read from beginning
    mic = sr.AudioFile(file_obj)  # use file-object
    with mic as source:
        audio = r.record(source)
    result = r.recognize_google(audio_data=audio, language="en-US", show_all=True)
    return result
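The in-memory file object idea can be tried without a web service or a recognizer. The sketch below uses only the standard-library wave module as a stand-in for AudioFile: it builds a short silent WAV entirely in memory and reads it back through io.BytesIO. The helper names make_wav_bytes and describe_wav are made up for this illustration.

```python
import io
import wave


def make_wav_bytes(num_frames=1600, sample_rate=16000):
    """Build a mono, 16-bit, silent WAV file in memory and return its bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * num_frames)
    return buf.getvalue()


def describe_wav(raw_bytes):
    """Read WAV data from memory; no file on disk is involved."""
    file_obj = io.BytesIO(raw_bytes)  # already positioned at the beginning
    with wave.open(file_obj, "rb") as w:
        return w.getnchannels(), w.getsampwidth(), w.getframerate(), w.getnframes()


print(describe_wav(make_wav_bytes()))
```

Passing the bytes straight to io.BytesIO(...) also avoids the separate write() and seek(0) steps.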

Google speech to text API result is empty

I am using the Cloud Speech-to-Text API to convert an audio file to text. I am running it with Python; the code is below.
import io
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="D:\\Sentiment_Analysis\\My Project 59503-717155d6fb4a.json"
# Imports the Google Cloud client library
from google.cloud import speech
from google.cloud.speech import enums
from google.cloud.speech import types
# Instantiates a client
client = speech.SpeechClient()
# The name of the audio file to transcribe
file_name = os.path.join(os.path.dirname('D:\CallADoc_VoiceImplementation\audioclip154173607416598.amr'),'CallADoc_VoiceImplementation','audioclip154173607416598.amr')
# Loads the audio into memory
with io.open(file_name, 'rb') as audio_file: content = audio_file.read()
audio = types.RecognitionAudio(content=content)
config = types.RecognitionConfig(encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,sample_rate_hertz=16000,language_code='en-IN')
# Detects speech in the audio file
response = client.recognize(config, audio)
for result in response.results: print('Transcript: {}'.format(result.alternatives[0].transcript))
When I run it on the sample/tested audio file named "audio.raw", the audio is converted and the result looks like this:
runfile('C:/Users/sandesh.p/CallADoc/GoogleSpeechtoText.py', wdir='C:/Users/sandesh.p/CallADoc')
Transcript: how old is the Brooklyn Bridge
But when I record my own audio and try to convert it with the same code, it gives an empty result, as below:
runfile('C:/Users/sandesh.p/CallADoc/GoogleSpeechtoText.py', wdir='C:/Users/sandesh.p/CallADoc')
I have been trying to fix this for the past 2 days; please help me resolve it.
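Try following the troubleshooting steps to make sure your audio has the appropriate settings. Note that the encoding declared in RecognitionConfig has to match the actual file: your code declares LINEAR16, while the recording is an .amr file.
For instance, an audio file with the following settings should give better results:
Encoding: FLAC
Channels: 1
Bit depth: 16-bit
Sample rate (sample_rate_hertz): 16000 Hz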

Google speech-to-text Python example code doesn't work

The following is my code (I made some slight changes to the original example code):
import io
import os
# Imports the Google Cloud client library
from google.cloud import speech
from google.cloud.speech import enums
from google.cloud.speech import types
# Instantiates a client
client = speech.SpeechClient()
# The name of the audio file to transcribe
file_name = os.path.join(
os.path.dirname(__file__),
'C:\\Users\\louie\\Desktop',
'TOEFL2.mp3')
# Loads the audio into memory
with io.open(file_name, 'rb') as audio_file:
    content = audio_file.read()
    audio = types.RecognitionAudio(content=content)
config = types.RecognitionConfig(
encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
sample_rate_hertz=16000,
language_code='en-US')
# Detects speech in the audio file
response = client.recognize(config, audio)
for result in response.results:
    print('Transcript: {}'.format(result.alternatives[0].transcript))
    text_file = open("C:\\Users\\louie\\Desktop\\Output.txt", "w")
    text_file.write('Transcript: {}'.format(result.alternatives[0].transcript))
    text_file.close()
I can only run this code directly from my Windows command prompt, since otherwise the system does not know GOOGLE_APPLICATION_CREDENTIALS. However, when I run the code, nothing happens. I followed all the steps, and I can see the request traffic change on my console, but I cannot see any transcript. Could someone help me out?
You are trying to decode TOEFL2.mp3, a file encoded as MP3, while specifying LINEAR16 audio encoding with
encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16
You have to convert the MP3 to WAV first; see the documentation on AudioEncoding.
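A mismatch like this can often be spotted before the request is sent by looking at the first bytes of the file. The sketch below is a heuristic, not a full format parser: guess_audio_container is a made-up name, and the MP3 frame-sync check in particular can match other formats that start with the same bit pattern.

```python
def guess_audio_container(header: bytes) -> str:
    """Guess the container from leading bytes (pass at least the first 12)."""
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[:4] == b"fLaC":
        return "flac"
    if header[:5] == b"#!AMR":
        return "amr"
    # MP3: either an ID3v2 tag or a frame sync (11 set bits) at the start
    if header[:3] == b"ID3":
        return "mp3"
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"


print(guess_audio_container(b"ID3\x03\x00\x00\x00\x00\x00\x00\x00\x00"))
print(guess_audio_container(b"RIFF\x24\x00\x00\x00WAVEfmt "))
```

If the guess is anything other than "wav" (or raw PCM, which has no header at all), declaring LINEAR16 is unlikely to produce a transcript.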

How to convert text into audio file and play in browser via python/django?

How to convert text into audio file which can be played in browser via python/django views?
How can I do text-to-speech conversion in python? I want to convert a string to a .wav file, that will be played in a browser via python/django views.
For example:
text = "how are you?"
convert text to audio file (text.wav)
open text.wav file & play in browser via django view.
As Tichodroma says, you should always see if someone has already asked your question before asking it again. Google search for python text to speech returns http://code.google.com/p/pyspeech/ and How to make Python speak, among others.
I tried it the following way and it works for me. Thanks.
import os
import subprocess
from django.http import HttpResponse

def speak(request):  # view name is illustrative
    # Write text to file
    text_file_path = '/user/share/project/test.txt'
    audio_file_path = '/user/share/project/test.wav'
    with open(text_file_path, "w") as text_file:
        text_file.write('How are you?')
    # Convert the text file to a WAV file with flite
    # (subprocess.getoutput replaces the Python 2-only commands.getoutput)
    conv = 'flite -f "%s" -o "%s"' % (text_file_path, audio_file_path)
    subprocess.getoutput(conv)
    if os.path.isfile(audio_file_path):
        # Return the audio bytes with a WAV content type
        response = HttpResponse()
        with open(audio_file_path, 'rb') as f:
            response['Content-Type'] = 'audio/x-wav'
            response.write(f.read())
        return response
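One possible refinement of the conversion step is to build the flite command as an argument list instead of a formatted string, so paths containing spaces or quotes need no manual quoting. This sketch only constructs and prints the command; build_flite_command is a hypothetical helper, the -f and -o flags are taken from the command above, and the path with a space is made up.

```python
import shlex


def build_flite_command(text_file_path, audio_file_path):
    """Return the flite invocation as an argument list (no shell involved)."""
    return ["flite", "-f", text_file_path, "-o", audio_file_path]


cmd = build_flite_command("/user/share/project/my text.txt",
                          "/user/share/project/test.wav")
# shlex.join shows how the list would look as a safely quoted shell command
print(shlex.join(cmd))
```

A list like this can be passed to subprocess.run(cmd, check=True), which raises an error if flite exits with a non-zero status.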
