How to use the unofficial Google Trends API (pyGTrends.py) - python

I'm starting to learn Python in order to write a program that crawls web data. While googling I found the Google Trends API, pyGTrends.py, but I can't get it to work.
I found other people with the same problem, but no solution that I can understand.
Please help me.
I used the API exactly as described on the API owner's website: Programmatic Google Trends Api
from pyGTrends import pyGTrends
connector = pyGTrends('googleID','password')
connector.download_report(('banana', 'bread', 'bakery'),date='2008-4',geo='AT',scale=1)
print connector.csv()
The error message is below:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\Lib\site-packages\pyGTrends.py", line 115, in csv
KeyError: 'main'

You need to import it like this:
from pytrends.pyGTrends import pyGTrends

Here's an example of how to use it. Let me know if you need further assistance:
from pytrends.pyGTrends import pyGTrends
import time
from random import randint
from IPython.display import display
from pprint import pprint
import urllib
import sys
google_username = "GMAIL_USERNAME"
google_password = "PASSWORD"
path = "."
terms = [
    "Image Processing",
    "Signal Processing",
    "Computer Vision",
    "Machine Learning",
    "Information Retrieval",
    "Data Mining"
]
# connect to Google Trends API
connector = pyGTrends(google_username, google_password)
for label in terms:
    print(label)
    sys.stdout.flush()
    #kw_string = '"{0}"'.format(keyword, base_keyword)
    connector.request_report(label, geo="US", date="01/2014 96m")
    # wait a random amount of time between requests to avoid bot detection
    time.sleep(randint(5, 10))
    # download file
    connector.save_csv(path, label)
for term in terms:
    data = connector.get_suggestions(term)
    pprint(data)
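The random pause between requests in the loop above can be factored into a small helper. This is a minimal sketch using only the standard library; `polite_map`, `fetch`, and the delay bounds are hypothetical names chosen for illustration and are not part of pytrends.

```python
import random
import time


def polite_map(fetch, items, min_delay=5.0, max_delay=10.0, sleep=time.sleep):
    """Call fetch(item) for each item, pausing a random time between calls.

    `fetch` is any callable supplied by the caller (hypothetical here);
    `sleep` is injectable so the pauses can be skipped in tests.
    """
    results = {}
    for index, item in enumerate(items):
        if index > 0:
            # Pause only between requests, not before the first one.
            sleep(random.uniform(min_delay, max_delay))
        results[item] = fetch(item)
    return results
```

With the connector from the answer, a call might look like `polite_map(connector.get_suggestions, terms)`; the delay range is a guess and may need tuning.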

Related

Spotify API Works Sometimes, Stalls Other Times(?)

I'm trying to follow along in a simple tutorial that uses the spotipy api. The code is below:
from spotipy.oauth2 import SpotifyClientCredentials
import spotipy
import sys
from pprint import pprint
import yaml
from spotipy.oauth2 import SpotifyOAuth
from data_functions import offset_api_limit, get_artists_df, get_tracks_df, get_track_audio_df,\
    get_all_playlist_tracks_df, find_popular_album, get_recommendations_artists, get_recommendations_tracks, get_most_popular_artist, get_recommendations_genre
import pandas as pd
urn = 'spotify:artist:3jOstUTkEu2JkjvRdBA5Gu'
scope = "user-library-read user-follow-read user-top-read playlist-read-private"
with open("spotify/spotify_info.yaml", 'r') as stream:
    spotify_details = yaml.safe_load(stream)
sp = spotipy.Spotify(auth_manager=SpotifyOAuth(
    client_id=spotify_details['client_id'],
    client_secret=spotify_details['client_secret'],
    redirect_uri=spotify_details['redirect_uri'],
    scope=scope,
))
results = sp.search(q='album:' + 'From 2 to 3', type='album')
print(results)
print("\n\n")
items = results['albums']['items']
if len(items) > 0:
    print(items)
    album = items[0]
    print(album)
    print(album['name'], album['id'])
    alb = sp.album(album['id']) #issues here
    pprint(alb)
The code works and pulls information from Spotify up to the marked line. I have no idea what's different about this function, but when execution reaches that line it stalls indefinitely. Has anyone encountered this issue?
I've tried changing my authentication configuration and calling various other functions, such as albums and related_artists, which also don't work.

Azure API Not Working (sorry for the title, I have no idea what's wrong)

As I said, sorry for the title. I have never worked with the Azure API and have no idea what is wrong with the code, as I just copied it from the documentation and put in my own information.
Here is the code:
from azure.cognitiveservices.speech import AudioDataStream, SpeechConfig, SpeechSynthesizer, SpeechSynthesisOutputFormat
from azure.cognitiveservices.speech.audio import AudioOutputConfig
speech_config = SpeechConfig(subscription="ImagineHereAreNumbers", region="westeurope")
speech_config.speech_synthesis_language = "en-US"
speech_config.speech_synthesis_voice_name = "ChristopherNeural"
audio_config = AudioOutputConfig(filename=r'C:\Users\TheD4\OneDrive\Desktop\SpeechFolder\Azure.wav')
synthesizer = SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
synthesizer.speak_text_async("A simple test to write to a file.")
When I run this I get no error, and a .wav file does appear in my desired folder, but the file has 0 bytes and looks corrupted.
Here is why I have no idea what's wrong: if I remove this
speech_config.speech_synthesis_language = "en-US"
speech_config.speech_synthesis_voice_name = "ChristopherNeural"
so that it becomes this
from azure.cognitiveservices.speech import AudioDataStream, SpeechConfig, SpeechSynthesizer, SpeechSynthesisOutputFormat
from azure.cognitiveservices.speech.audio import AudioOutputConfig
speech_config = SpeechConfig(subscription="ImagineHereAreNumbers", region="westeurope")
audio_config = AudioOutputConfig(filename=r'C:\Users\TheD4\OneDrive\Desktop\SpeechFolder\Azure.wav')
synthesizer = SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
synthesizer.speak_text_async("A simple test to write to a file.")
it suddenly works, but with what I assume to be the default voice.
So here is my question: how do I choose the voice that I want? (By the way, the one I want is "en-US-JennyNeural" with style="customerservice", or something along those lines.)
Thank you in advance!
ChristopherNeural is not a valid voice name. The actual name of the voice is en-US-ChristopherNeural.
speech_config.speech_synthesis_voice_name = "en-US-ChristopherNeural"
This is well-documented on the Language support page of the Speech services documentation.
For more fine-grained control over voice characteristics, you'll need to use SSML as outlined in text-to-speech-basics.py.
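As a rough illustration of the SSML route, the sketch below builds an SSML document that selects a voice and, optionally, a speaking style. The helper name `build_ssml` is hypothetical; the `mstts:express-as` element and its namespace follow my reading of the Azure SSML documentation and should be checked against it, as should whether a given voice supports a given style.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice, style=None, lang="en-US"):
    """Return an SSML string selecting `voice` and, optionally, a speaking style."""
    body = escape(text)
    if style is not None:
        # express-as is an Azure-specific extension
        # (assumption: the chosen voice supports this style).
        body = "<mstts:express-as style={}>{}</mstts:express-as>".format(
            quoteattr(style), body)
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang={lang}>'
        "<voice name={voice}>{body}</voice></speak>"
    ).format(lang=quoteattr(lang), voice=quoteattr(voice), body=body)


ssml = build_ssml("A simple test to write to a file.",
                  "en-US-JennyNeural", style="customerservice")
print(ssml)
# With the synthesizer from the question, the string would be sent with
# something like: synthesizer.speak_ssml_async(ssml).get()
```

Calling `.get()` on the returned future waits for synthesis to finish before the script exits, which may also matter for the plain `speak_text_async` call in the question.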

IBM Watson Text to Speech API Python

I'm trying to adjust the pitch of IBM Watson but I can't seem to find any documentation on this whatsoever.
If you visit this link then you can see that there is an option to adjust the pitch/speed.
The code I have is very simply this:
from ibm_watson import TextToSpeechV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
authenticator = IAMAuthenticator('api_key')
text_to_speech = TextToSpeechV1(
    authenticator=authenticator
)
text_to_speech.set_service_url('service_url')
sample = "insert what you want to say here"
with open('test.wav', 'wb') as audio_file:
    audio_file.write(
        text_to_speech.synthesize(
            sample,
            voice='en-GB_JamesV3Voice',
            accept='audio/wav'
        ).get_result().content)
I have literally no idea which parameters to adjust in order to make the voice lower. Thank you so much!
What you are looking for is the prosody element. Neural voices (V3) support only its pitch and rate attributes.
Using your example:
sample = 'Here is a <prosody pitch="150Hz"> modified pitch </prosody> example.'
sample = 'Here is a <prosody rate="x-slow"> modified rate </prosody> example.'
And here is a link to the docs about the prosody element:
https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-elements#prosody_element
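To avoid hand-writing the markup, the prosody wrapper can be generated by a small helper. This is a sketch under assumptions: `with_prosody` is a hypothetical name, and the keyword value `x-low` comes from the SSML specification, so whether a particular Watson voice honors it should be checked against the docs linked above.

```python
from xml.sax.saxutils import escape, quoteattr


def with_prosody(text, pitch=None, rate=None):
    """Wrap `text` in an SSML <prosody> element with the given attributes."""
    attrs = ""
    if pitch is not None:
        attrs += " pitch=" + quoteattr(pitch)
    if rate is not None:
        attrs += " rate=" + quoteattr(rate)
    if not attrs:
        # Nothing to adjust: return the (escaped) text unchanged.
        return escape(text)
    return "<prosody{}>{}</prosody>".format(attrs, escape(text))


sample = with_prosody("insert what you want to say here", pitch="x-low")
print(sample)
# prints: <prosody pitch="x-low">insert what you want to say here</prosody>
```

The resulting `sample` string is then passed to `text_to_speech.synthesize(...)` exactly as in the question.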

how to read mp3 data from google cloud using python

I am trying to read mp3/wav data from Google Cloud and implement audio diarization. The issue is that I am not able to read the result that the Google API returns in the variable response.
Below is my Python code:
speech_file = r'gs://pp003231/a4a.wav'
config = speech.types.RecognitionConfig(
    encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16,
    language_code='en-US',
    enable_speaker_diarization=True,
    diarization_speaker_count=2)
audio = speech.types.RecognitionAudio(uri=speech_file)
response = client.long_running_recognize(config, audio)
print response
result = response.results[-1]
print result
The output displayed on the console is:
Traceback (most recent call last):
  File "a1.py", line 131, in <module>
    print response.results
AttributeError: 'Operation' object has no attribute 'results'
Can you please share your expert advice about what I am doing wrong?
Thanks for your help.
It's probably too late for the author of this thread, but I'm posting the solution for future readers, as I had a similar issue.
Change
result = response.results[-1]
to
result = response.result().results[-1]
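and it will work fine. long_running_recognize returns an Operation object; calling result() on it waits for the job to finish and returns the actual response.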
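The difference between an operation handle and its result can be seen in a small analogy. This sketch uses only the standard library's `concurrent.futures`, not the Google client: a `Future` plays the role of the Operation, and the hypothetical `FakeResponse` stands in for the recognition response.

```python
from collections import namedtuple
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the real recognition response.
FakeResponse = namedtuple("FakeResponse", ["results"])


def slow_recognize():
    """Pretend to transcribe audio and return a response object."""
    return FakeResponse(results=["first segment", "last segment"])


with ThreadPoolExecutor(max_workers=1) as pool:
    operation = pool.submit(slow_recognize)  # a handle, like the Operation

    try:
        operation.results  # same mistake as in the question
    except AttributeError as exc:
        print("handle has no results:", exc)

    response = operation.result()  # blocks until the work is done
    last = response.results[-1]
    print(last)
```

In both cases the handle itself carries no `results` attribute; only the object returned by `result()` does.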
Do you have access to the wav file in your bucket? Also, is this the entire code? It seems that sample_rate_hertz and the imports are missing. Here is the code copied from the Google docs samples, edited down to just the diarization function.
#!/usr/bin/env python
"""Google Cloud Speech API sample that demonstrates enhanced models
and recognition metadata.

Example usage:
    python diarization.py
"""
import argparse
import io


def transcribe_file_with_diarization():
    """Transcribe the given audio file synchronously with diarization."""
    # [START speech_transcribe_diarization_beta]
    from google.cloud import speech_v1p1beta1 as speech
    client = speech.SpeechClient()
    # Replace the placeholders with your own bucket and file names.
    audio = speech.types.RecognitionAudio(uri="gs://<YOUR_BUCKET>/<YOUR_WAV_FILE>")
    config = speech.types.RecognitionConfig(
        encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=8000,
        language_code='en-US',
        enable_speaker_diarization=True,
        diarization_speaker_count=2)
    print('Waiting for operation to complete...')
    response = client.recognize(config, audio)
    # The transcript within each result is separate and sequential per result.
    # However, the words list within an alternative includes all the words
    # from all the results thus far. Thus, to get all the words with speaker
    # tags, you only have to take the words list from the last result:
    result = response.results[-1]
    words_info = result.alternatives[0].words
    # Printing out the output:
    for word_info in words_info:
        print("word: '{}', speaker_tag: {}".format(word_info.word,
                                                   word_info.speaker_tag))
    # [END speech_transcribe_diarization_beta]


if __name__ == '__main__':
    transcribe_file_with_diarization()
To run the code just name it diarization.py and use the command:
python diarization.py
Also, you have to install the latest google-cloud-speech library:
pip install --upgrade google-cloud-speech
You also need the credentials of your service account in a JSON file; you can check more info here

Python script for "Google search by image"

I have checked the Google Search APIs, and it seems that they have not released any API for searching "Images". So I was wondering whether there is a Python script or library with which I can automate the "search by image" feature.
This was annoying enough to figure out that I thought I'd throw a comment on the first python-related stackoverflow result for "script google image search". The most annoying part of all this is setting up your proper application and custom search engine (CSE) in Google's web UI, but once you have your api key and CSE, define them in your environment and do something like:
#!/usr/bin/env python
# save top 10 google image search results to current directory
# https://developers.google.com/custom-search/json-api/v1/using_rest
import requests
import os
import sys
import re
import shutil
url = 'https://www.googleapis.com/customsearch/v1?key={}&cx={}&searchType=image&q={}'
apiKey = os.environ['GOOGLE_IMAGE_APIKEY']
cx = os.environ['GOOGLE_CSE_ID']
q = sys.argv[1]
i = 1
for result in requests.get(url.format(apiKey, cx, q)).json()['items']:
    link = result['link']
    image = requests.get(link, stream=True)
    if image.status_code == 200:
        m = re.search(r'[^\.]+$', link)
        filename = './{}-{}.{}'.format(q, i, m.group())
        with open(filename, 'wb') as f:
            image.raw.decode_content = True
            shutil.copyfileobj(image.raw, f)
        i += 1
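One possible refinement of the script above is to URL-encode the query parameters and to take the file extension from the path part of the link only, since a link with a query string would otherwise yield an odd extension. The sketch below uses only the standard library; `build_search_url`, `extension_of`, and the `jpg` fallback are hypothetical choices, while the parameter names `key`, `cx`, `searchType`, and `q` are the ones used in the script.

```python
import os
from urllib.parse import urlencode, urlparse

BASE_URL = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, cx, query):
    """Return the image-search request URL with all parameters URL-encoded."""
    params = {"key": api_key, "cx": cx, "searchType": "image", "q": query}
    return BASE_URL + "?" + urlencode(params)


def extension_of(link, default="jpg"):
    """Guess a file extension from the path of `link`, ignoring any query string."""
    ext = os.path.splitext(urlparse(link).path)[1].lstrip(".")
    return ext or default


print(build_search_url("KEY", "CX", "cats & dogs"))
print(extension_of("https://example.com/a/b.png?size=large"))
```

The returned URL can be passed straight to `requests.get(...)` in place of `url.format(apiKey, cx, q)`.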
There is no API available, but you can parse the page and imitate a browser. I don't know how much data you need to parse, though, and Google may limit or block access.
You can imitate a browser simply by using urllib and setting the correct headers. If you think parsing complex web pages from Python may be difficult, you can use a headless browser such as PhantomJS directly; inside a browser it is trivial to get the correct elements using JavaScript and the DOM.
Note: before trying any of this, check Google's TOS.
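Setting headers with urllib might look like the following. This is a minimal sketch that only builds the request and does not send it; the URL, the header values, and the helper name `browser_like_request` are placeholders chosen for illustration.

```python
from urllib.request import Request

# Example header value only; real browsers send longer strings.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64)"


def browser_like_request(url, user_agent=USER_AGENT):
    """Build (but do not send) a request carrying browser-style headers."""
    return Request(url, headers={
        "User-Agent": user_agent,
        "Accept-Language": "en-US,en;q=0.8",
    })


req = browser_like_request("https://example.com/search?q=test")
# urllib normalizes header names, so the lookup key is "User-agent".
print(req.get_header("User-agent"))
# Sending it would be: urllib.request.urlopen(req).read()
```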
You can try this:
https://developers.google.com/image-search/v1/jsondevguide#json_snippets_python
It's deprecated, but seems to work.
