Spotify API Works Sometimes, Stalls Other Times(?) - python

I'm trying to follow along with a simple tutorial that uses the spotipy API. The code is below:
import sys
from pprint import pprint

import pandas as pd
import spotipy
import yaml
from spotipy.oauth2 import SpotifyOAuth
from data_functions import offset_api_limit, get_artists_df, get_tracks_df, get_track_audio_df, \
    get_all_playlist_tracks_df, find_popular_album, get_recommendations_artists, \
    get_recommendations_tracks, get_most_popular_artist, get_recommendations_genre

urn = 'spotify:artist:3jOstUTkEu2JkjvRdBA5Gu'
scope = "user-library-read user-follow-read user-top-read playlist-read-private"

with open("spotify/spotify_info.yaml", 'r') as stream:
    spotify_details = yaml.safe_load(stream)

sp = spotipy.Spotify(auth_manager=SpotifyOAuth(
    client_id=spotify_details['client_id'],
    client_secret=spotify_details['client_secret'],
    redirect_uri=spotify_details['redirect_uri'],
    scope=scope,
))

results = sp.search(q='album:' + 'From 2 to 3', type='album')
print(results)
print("\n\n")
items = results['albums']['items']
if len(items) > 0:
    print(items)
    album = items[0]
    print(album)
    print(album['name'], album['id'])
    alb = sp.album(album['id'])  # issues here
    pprint(alb)
The code works and pulls the information from Spotify until the marked line. I have no idea what's different about this function, but when execution hits that line it stalls indefinitely. Has anyone encountered this issue?
I've tried changing my authentication configuration, and other functions such as albums and related_artists stall in the same way.
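One way to narrow this down (a sketch, not a confirmed fix: requests_timeout and retries are documented spotipy.Spotify arguments, but the cause of the stall is an assumption) is to give the client an explicit network timeout so a hanging call raises instead of blocking forever:
# A minimal sketch: the same OAuth setup as above, but with an explicit
# timeout and no retries, so sp.album() raises requests.exceptions.Timeout
# instead of stalling indefinitely.
sp = spotipy.Spotify(
    auth_manager=SpotifyOAuth(
        client_id=spotify_details['client_id'],
        client_secret=spotify_details['client_secret'],
        redirect_uri=spotify_details['redirect_uri'],
        scope=scope,
    ),
    requests_timeout=10,  # seconds per HTTP request
    retries=0,            # fail immediately rather than retrying silently
)
alb = sp.album(album['id'])  # now fails fast if the endpoint hangs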

Related

AttributeError: module 'config' has no attribute 'TWITTER_ACCESS_TOKEN_SECRET'. Did you mean: 'TWITTER_ACCESS_TOKEN'?

import streamlit as st
import pandas as pd
import numpy as np
import requests
import tweepy
import config
import psycopg2
import psycopg2.extras
import plotly.graph_objects as go

auth = tweepy.OAuthHandler(config.TWITTER_CONSUMER_KEY,
                           config.TWITTER_CONSUMER_SECRET)
auth.set_access_token(config.TWITTER_ACCESS_TOKEN,
                      config.TWITTER_ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)
There is a problem with my code: it does not run with the Twitter keys. The config module has no such attributes.
The config module you are importing does not define these values. TWITTER_CONSUMER_KEY, TWITTER_CONSUMER_SECRET, and the two access-token values are not constants that ship with some library; they are credentials you must supply yourself. Since your code reads them as attributes of the config module, there is a piece missing from your application: a config.py file alongside it that defines them at module level, like this:
# config.py
TWITTER_CONSUMER_KEY = 'ENTER YOUR TWITTER CONSUMER KEY'
TWITTER_CONSUMER_SECRET = 'ENTER YOUR TWITTER CONSUMER SECRET'
TWITTER_ACCESS_TOKEN = 'ENTER YOUR TWITTER ACCESS TOKEN'
TWITTER_ACCESS_TOKEN_SECRET = 'ENTER YOUR TWITTER ACCESS TOKEN SECRET'
Take a look at this article for more help. Good luck!
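As a quick check once config.py defines all four values (a sketch; verify_credentials is a standard tweepy call that raises if the keys are rejected):
import tweepy
import config

auth = tweepy.OAuthHandler(config.TWITTER_CONSUMER_KEY, config.TWITTER_CONSUMER_SECRET)
auth.set_access_token(config.TWITTER_ACCESS_TOKEN, config.TWITTER_ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)
print(api.verify_credentials().screen_name)  # prints the account name if auth works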

Common Crawl WARC request returns 403

I am trying to fetch some WARC files from the Common Crawl archives, but my requests do not get through to the server. A minimal Python example to replicate the error is provided below. I tried adding a User-Agent to the request header, but it did not help. Any ideas on how to proceed?
import io
import requests  # requests >= 2.23.0
from warcio.archiveiterator import ArchiveIterator  # warcio >= 1.7.3

def debug():
    common_crawl_data = {
        "filename": "crawl-data/CC-MAIN-2016-07/segments/1454702018134.95/warc/CC-MAIN-20160205195338-00121-ip-10-236-182-209.ec2.internal.warc.gz",
        "offset": 244189209,
        "length": 989,
    }
    offset, length = int(common_crawl_data['offset']), int(common_crawl_data['length'])
    offset_end = offset + length - 1
    prefix = 'https://commoncrawl.s3.amazonaws.com/'
    resp = requests.get(prefix + common_crawl_data['filename'],
                        headers={'Range': 'bytes={}-{}'.format(offset, offset_end)})
    raw_data = io.BytesIO(resp.content)
    uri = None
    page = None
    for record in ArchiveIterator(raw_data, arc2warc=True):
        uri = record.rec_headers.get_header('WARC-Target-URI')
        R = record.content_stream().read()
        try:
            page = R.strip().decode('utf-8')
        except UnicodeDecodeError:
            page = R.strip().decode('latin1')
        print(uri, page)
    return uri, page

debug()
Please see this Common Crawl blog post on the recent change that causes 403 responses for some unauthenticated requests.
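In short (a sketch based on that announcement): anonymous access to the commoncrawl.s3.amazonaws.com bucket now requires authenticated AWS requests, while unauthenticated HTTPS access is still served from https://data.commoncrawl.org/. Switching the prefix is usually enough:
# A minimal sketch: the same ranged request as in the question, but sent
# to Common Crawl's public HTTPS endpoint instead of the raw S3 URL.
prefix = 'https://data.commoncrawl.org/'
resp = requests.get(prefix + common_crawl_data['filename'],
                    headers={'Range': 'bytes={}-{}'.format(offset, offset_end)})
print(resp.status_code)  # expect 206 Partial Content rather than 403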

Google Cloud Platform (GCP): delete snapshots older than 10 days using Python

I want to delete snapshots that are more than 10 days old in GCP using Python. I tried the program below, which uses a filter expression, but unfortunately I ran into errors.
from googleapiclient import discovery
from oauth2client.service_account import ServiceAccountCredentials

def get_disks(project, zone):
    credentials = ServiceAccountCredentials.from_json_keyfile_name(
        r"D:\Users\ganeshb\Desktop\Json\auth.json",
        scopes='https://www.googleapis.com/auth/compute')
    service = discovery.build('compute', 'v1', credentials=credentials)
    request = service.snapshots().list(project=project,
                                       filter="creationTimestamp<'2021-05-31'")
    response = request.execute()
    print(response)

output = get_disks("xxxxxxxx", "europe-west1-b")
Your problem is a known Google Cloud bug.
Please read these issue trackers: 132365111 and 132676194
Solution:
Remove the filter statement and process the returned results:
from datetime import datetime
from dateutil import parser

request = service.snapshots().list(project=project)
response = request.execute()

# Watch for timezone issues here!
filter_date = '2021-05-31'
d1 = parser.parse(filter_date)

for item in response['items']:
    d2 = datetime.fromisoformat(item['creationTimestamp'])
    if d2.timestamp() < d1.timestamp():
        # Process the result here. This is a print statement stub.
        print("{} {}".format(item['name'], item['creationTimestamp']))

Partial response from DocumentConversionV1()

I am trying to use the DocumentConversionV1 function of the watson_developer_cloud API in Python; however, the response in my case comes back only as <Response [200]>.
from watson_developer_cloud import DocumentConversionV1

document_conversion = DocumentConversionV1(
    username="873512ac-dcf7-4365-a01d-7dec438d5720",
    password="bvhXbdaHtYgw",
    version='2016-02-10',
    url="https://gateway.watsonplatform.net/document-conversion/api",
)

config = {
    'conversion_target': 'NORMALIZED_TEXT',
}

i = "v.docx"
with open(i, 'rb') as doc:
    res = document_conversion.convert_document(document=doc, config=config)
    print(res)
First and foremost, delete your service credentials and recreate them through Bluemix. (Posting them on a public forum is usually a bad idea.) ;o)
Now, to actually answer the question... You want to get the content of the response. Right now, you're printing the response itself. Try
print(res.content)
See https://github.com/watson-developer-cloud/python-sdk/blob/master/examples/document_conversion_v1.py#L16
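If you want the converted text itself rather than raw bytes, a small follow-up (assuming convert_document returns a requests.Response object, as the linked example suggests):
print(res.status_code)  # the HTTP status, e.g. 200
print(res.text)         # the response body decoded as a string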

deadline = None after using urlfetch.set_default_fetch_deadline(n)

I'm working on a web application with Python and Google App Engine.
I tried to set the default URLFetch deadline globally as suggested in a previous thread:
https://stackoverflow.com/a/14698687/2653179
urlfetch.set_default_fetch_deadline(45)
However, it doesn't work: when I print its value in one of the handlers, urlfetch.get_default_fetch_deadline() is None.
Here is main.py:
from google.appengine.api import users
import webapp2
import jinja2
import random
import string
import hashlib
import CQutils
import time
import os
import httpRequests
import logging
from google.appengine.api import urlfetch
urlfetch.set_default_fetch_deadline(45)
...
class Del(webapp2.RequestHandler):
    def get(self):
        id = self.request.get('id')
        ext = self.request.get('ext')
        user_id = httpRequests.advance(id, ext)
        d2 = urlfetch.get_default_fetch_deadline()
        logging.debug("value of deadline = %s", d2)
This prints in the log console:
DEBUG 2013-09-05 07:38:21,654 main.py:427] value of deadline = None
The function which is being called in httpRequests.py:
import urllib

from google.appengine.api import urlfetch

def advance(id, ext=None):
    url = "http://localhost:8080/api/" + id + "/advance"
    if ext is None:
        ext = ""
    params = urllib.urlencode({'ext': ext})
    result = urlfetch.fetch(url=url,
                            payload=params,
                            method=urlfetch.POST,
                            headers={'Content-Type': 'application/x-www-form-urlencoded'})
    if result.status_code == 200:
        return result.content
I know this is an old question, but I recently ran into this issue.
The setting is stored in a thread-local, meaning that if your application is set to thread-safe and a request is handled in a different thread than the one the default deadline was set in, the value is lost. For me, the solution was to set the deadline before every request as part of the middleware chain.
This is not documented, and it took looking through the source to figure it out.
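A minimal sketch of that middleware approach (the wrapper name and the 45-second value are illustrative; it assumes a standard WSGI app such as webapp2's):
from google.appengine.api import urlfetch

def with_default_deadline(app, deadline=45):
    # Re-apply the thread-local default at the start of every request,
    # in whichever thread ends up handling it.
    def middleware(environ, start_response):
        urlfetch.set_default_fetch_deadline(deadline)
        return app(environ, start_response)
    return middleware

app = with_default_deadline(webapp2.WSGIApplication(routes))  # routes: your route list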
