Spotipy Accessing Track Data No Longer Works - python

I have been working on an AI project using Spotipy and the Spotify Web API. I have been collecting a list of preview_urls to run some analysis on, and I had successfully fetched many of them, but I ran into issues recently. Whenever I call .track(track_id), execution gets stuck on that line and never continues past it. I thought it could be an issue with the API, but other commands work fine; only track gives me trouble. I cannot figure out the problem because there are no errors: the call just hangs on that line and never finishes.
Refreshing the client secret no longer helps. This is the code I have so far:
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

cid = '121e03d3acd1440188ae4c0f58b844d4'
secret = '431a5e56bcd544c3aefce8166a9c3703'

client_credentials_manager = SpotifyClientCredentials(client_id=cid, client_secret=secret)
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)
number = 2
output_file = open('data\\25k_data_preview\\track_url_preview_' + str(number) + '.txt', 'a')

for l in open('data\\25k_data\\track_url_' + str(number) + '.txt'):
    line = l.replace('\n', '')
    print(line)
    try:
        track = sp.track(line)
        try:
            testing = track['preview_url']
            if testing != None:
                output_file.write(line + " " + testing + "\n")
        except:
            x = 0
    except:
        x = 0

output_file.close()
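To rule out a silent retry loop, here is a minimal diagnostic sketch (not from the original post). It assumes a recent Spotipy version that accepts the requests_timeout and retries constructor arguments; with retries disabled and a short timeout, a hanging .track() call should raise an exception instead of blocking forever:

import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

cid = 'YOUR_CLIENT_ID'        # placeholder credentials
secret = 'YOUR_CLIENT_SECRET'

# Assumption: requests_timeout and retries are supported by the installed
# Spotipy version. retries=0 disables Spotipy's internal retry loop, and the
# timeout makes a stalled HTTP request raise instead of blocking indefinitely.
sp = spotipy.Spotify(
    client_credentials_manager=SpotifyClientCredentials(client_id=cid, client_secret=secret),
    requests_timeout=10,
    retries=0,
)

try:
    track = sp.track('11dFghVXANMlKmJXsNCbNl')  # any valid track id
    print(track['preview_url'])
except Exception as exc:
    print('track() failed:', exc)  # e.g. a timeout, or an HTTP 429 pointing at rate limiting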

Related

Make tweepy search for the newest mentions instead of the oldest

Today I wrote a Twitter bot that replies to anybody who mentions it with a random image from a folder.
The problem is that I'm a newbie in Python and I don't know how to make it fully functional. When I started running it, the bot began replying to all the mentions from other users (I'm using an old account a friend gave me), and that's not quite what I want; it works, but not as I intended.
The bot replies to all mentions from the very beginning and won't stop until every one of them has been answered (the bot is turned off now, I don't want to annoy anybody).
How can I make it reply only to the latest mentions instead of the oldest ones?
Here's the code:
import tweepy
import logging
from config import create_api
import time
import os
import random
from datetime import datetime

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger()

api = create_api()
imagePath = random.choice(os.listdir("images/"))

while True:
    for tweet in tweepy.Cursor(api.mentions_timeline).items():
        try:
            imagePath = random.choice(os.listdir("images/"))
            tweetId = tweet.user.id
            username = tweet.user.screen_name
            api.update_with_media('images/' + imagePath, "#" + username + " ", in_reply_to_status_id=tweet.id)
            print('Replying to ' + username + 'with ' + imagePath)
        except tweepy.TweepError as e:
            print(e.reason)
        except StopIteration:
            break
    time.sleep(12)
Thanks in advance.
I don't have the ability to test this code currently but this should work.
Instead of iterating over every tweet, it turns the iterator that tweepy.Cursor returns into a list and then just gets the last item in that list.
api = create_api()
imagePath = random.choice(os.listdir("images/"))

while True:
    tweet_iterator = tweepy.Cursor(api.mentions_timeline).items()
    latest_tweet = list(tweet_iterator)[-1]
    try:
        imagePath = random.choice(os.listdir("images/"))
        tweetId = latest_tweet.user.id
        username = latest_tweet.user.screen_name
        api.update_with_media('images/' + imagePath, "#" + username + " ", in_reply_to_status_id=latest_tweet.id)
        print('Replying to ' + username + 'with ' + imagePath)
    except tweepy.TweepError as e:
        print(e.reason)
    except StopIteration:
        break
    time.sleep(12)
You will also want to keep track of which user you last replied to, so you don't keep spamming the same person over and over.
This isn't the most efficient way of doing it, but it should be easy enough to understand:
latest_user_id = None

while True:
    # Rest of the code
    try:
        if latest_user_id == latest_tweet.user.id:
            pass  # don't do anything, we already replied to this user
        else:
            latest_user_id = latest_tweet.user.id
            # the rest of your code
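A different approach, not part of the original answer but common with Tweepy, is to remember the ID of the newest mention you have already handled and pass it as since_id, so the API only returns mentions newer than that. A rough sketch, assuming the same create_api() helper from the question:

import os
import random
import time

import tweepy
from config import create_api  # same helper as in the question

api = create_api()
last_seen_id = None  # could also be persisted to a file so restarts don't re-reply

while True:
    # since_id asks Twitter for mentions newer than the one we handled last
    cursor_kwargs = {'since_id': last_seen_id} if last_seen_id else {}
    for tweet in tweepy.Cursor(api.mentions_timeline, **cursor_kwargs).items():
        try:
            image_path = random.choice(os.listdir('images/'))
            # using '@' so the reply actually mentions the user
            api.update_with_media('images/' + image_path,
                                  '@' + tweet.user.screen_name + ' ',
                                  in_reply_to_status_id=tweet.id)
            last_seen_id = max(last_seen_id or 0, tweet.id)
        except tweepy.TweepError as e:
            print(e.reason)
    time.sleep(12)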

I have a functional problem but I cannot see where it is

I know this is not a good kind of question to ask, but I am being honest: this is my problem now. I do not know what to do anymore, so I have to ask (and I do not know where else I could ask this). I cannot debug my code across a day change, so I do not know where the problem is.
My code takes a picture and sends it to Twitter once every 24 hours. It works fine on the first day, but after that it stops sending photos, and I cannot see any problem in the code. Please take a look and tell me if you see one.
from twython import Twython
from picamera import PiCamera
from time import sleep
import datetime
import os

sleep(500)

camera = PiCamera()
camera.rotation = 180

datetimeNow = datetime.datetime.now()
oldDate = 0
newDate = 0
photoAlreadyTaken = 0

CONSUMER_KEY = 'sad...'
CONSUMER_SECRET = 'asd...'
ACCESS_TOKEN_KEY = 'fdsf...'
ACCESS_TOKEN_SECRET = 'asd..'

twitter = Twython(CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN_KEY,
                  ACCESS_TOKEN_SECRET)

while True:
    try:
        newDate = datetimeNow.day
    except:
        print("error")
    if newDate != oldDate:
        if datetimeNow.hour == 14 and photoAlreadyTaken != 1:
            photoAlreadyTaken = 1
            try:
                camera.start_preview()
                sleep(5)
                camera.capture('/home/pi/strawberry.jpg')
                camera.stop_preview()
            except:
                photoAlreadyTaken = 0
            sleep(5)
            try:
                with open('/home/pi/strawberry.jpg', 'rb') as photo:
                    twitter.update_status_with_media(status=str(datetimeNow.day) + "-" + str(datetimeNow.month) + "-" + str(datetimeNow.year), media=photo)
            except:
                photoAlreadyTaken = 0
            oldDate = datetimeNow.day
    else:  # When the first photo is sent this is executed, but I cannot debug how long
        photoAlreadyTaken = 0
    sleep(500)
After the first cycle of the loop, both newDate and oldDate are stuck forever at the day value of datetime.datetime.now() as it was captured when the script started.
Your reasoning about the value of datetimeNow is a bit off: it is set once at start-up and you never update it. Consider replacing datetimeNow (a static value) with datetime.datetime.now() (a function call that returns the current date and time), or at least update datetimeNow somewhere inside the loop.
Good luck!
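A minimal sketch of that fix, keeping the rest of the script exactly as in the question and simply re-reading the clock on every pass of the loop (the variable name now is just illustrative):

while True:
    now = datetime.datetime.now()  # re-read the clock every iteration
    newDate = now.day
    if newDate != oldDate:
        if now.hour == 14 and photoAlreadyTaken != 1:
            photoAlreadyTaken = 1
            # ... take the photo and tweet it exactly as before ...
            oldDate = newDate
    else:
        photoAlreadyTaken = 0
    sleep(500)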

Why is this small Python script so slow? Maybe I am using the wrong lib? Suggestions?

I am learning Python at my high school. We are still learning.
We wrote this small script that loads a website using cookies and downloads files of about 400 KB each. But for some reason it is very slow, even though our internet connection is very fast. It should be able to download 20-30 files at once, but it downloads just one file at a time and still waits a few seconds before downloading the next. Why is that? Please check the script and give suggestions. No matter where we run it, the script downloads at most 7-8 files per minute, which does not seem right.
Check it out:
import urllib.request, string, random


def id_generator(size=6, chars=string.ascii_uppercase + string.digits):
    return ''.join(random.choice(chars) for _ in range(size))


ab = 0
faturanum = 20184009433300

while (ab != 1000000):
    try:
        ab = ab + 1
        opener = urllib.request.build_opener()
        a = id_generator() + ".pdf"
        faturanum = faturanum - 1
        fatura = str(faturanum)
        faturanome = fatura + ".pdf"
        opener.addheaders = [('Cookie', 'ASP.NET_SessionId=gpkufgrfzsk5abc0bc2v2v3e')]
        f = opener.open("https://urltest.com/Fatura/Pdf?nrFatura=" + fatura)
        file = open(faturanome, 'wb')
        file.write(f.read())
        file.close()
        print("file downloaded:" + str(ab) + " downloaded!")
    except:
        pass
Why is this so slow? The remote server is very fast too. Is there some way to get better results, maybe by putting several files in a queue? Like I said, we are still learning. We just want the script to make several requests at once, so it fetches several files at a time instead of one by one.
So this is what I wrote after hours... but it doesn't work.
Well, here we go.
What am I doing wrong? I am doing my best, but I have never used threading before. Hey @KlausD, check it and let me know what I am doing wrong. It is a website that requires cookies, and it needs to load the URL and turn it into a PDF.
Here is my attempt:
import os
import threading
import urllib.request, string, random
from queue import Queue


def id_generator(size=6, chars=string.ascii_uppercase + string.digits):
    return ''.join(random.choice(chars) for _ in range(size))


class Downloader(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        while True:
            url = self.queue.get()
            self.download_file(url)
            self.queue.task_done()

    def download_file(self, url):
        faturanum = 20184009433300
        ab = ab + 1
        handle = urllib.request.urlopen(url)
        a = id_generator() + ".pdf"
        faturanum = faturanum - 1
        fatura = str(faturanum)
        faturanome = fatura + ".pdf"
        handle.addheaders = [('Cookie', 'ASP.NET_SessionId=v5x4k1m44saix1f5hpybf2qu')]
        fname = os.path.basename(url)
        with open(fname, "wb") as f:
            while True:
                chunk = handle.read(1024)
                if not chunk: break
                f.write(chunk)


def main(urls):
    queue = Queue()
    for i in range(5):
        t = Downloader(queue)
        t.setDaemon(True)
        t.start()
    for url in urls:
        queue.put(url)
    queue.join()


if __name__ == "__main__":
    urls = ["https://sitetest.com/Fatura/Pdf?nrFatura="+fatura"]
    main(urls)
What is wrong? This is my best attempt... believe me, I have not been sleeping trying to make it work.
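For comparison, here is a minimal sketch (not from the original post) of concurrent downloads using concurrent.futures from the standard library. The base URL, cookie value, and invoice-number range are placeholders taken from the question's code and would need to be adapted:

import urllib.request
from concurrent.futures import ThreadPoolExecutor, as_completed

BASE_URL = "https://urltest.com/Fatura/Pdf?nrFatura="                # placeholder from the question
COOKIE = ('Cookie', 'ASP.NET_SessionId=gpkufgrfzsk5abc0bc2v2v3e')    # placeholder session cookie


def download_invoice(fatura):
    """Download one invoice PDF and save it as <fatura>.pdf."""
    opener = urllib.request.build_opener()
    opener.addheaders = [COOKIE]
    data = opener.open(BASE_URL + fatura).read()
    with open(fatura + ".pdf", "wb") as f:
        f.write(data)
    return fatura


# First 100 invoice numbers, purely as an example range
faturas = [str(20184009433300 - i) for i in range(100)]

# Up to 20 downloads run at the same time; results are reported as they finish.
with ThreadPoolExecutor(max_workers=20) as pool:
    jobs = {pool.submit(download_invoice, fatura): fatura for fatura in faturas}
    for job in as_completed(jobs):
        try:
            print("downloaded", job.result())
        except Exception as exc:
            print("failed", jobs[job], exc)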

Twitter API connection aborted with Twython

I'm trying to download Twitter followers for a list of accounts. My function (which uses Twython) works well for short account lists but raises an error for longer ones. It is not a rate-limit problem, since my function sleeps until the next time window if the rate limit is hit.
The error is this:
twythonerror: ('Connection aborted.', error(10054, ''))
Others seem to have the same problem, and the proposed solution is to make the function sleep between different REST API calls, so I implemented the following code:
del twapi
sleep(nap[afternoon])
afternoon = afternoon + 1
twapi = Twython(app_key=app_key, app_secret=app_secret,
                oauth_token=oauth_token, oauth_token_secret=oauth_token_secret)
nap is a list of intervals in seconds and afternoon is an index.
Despite this suggestion I still have exactly the same problem; the sleep does not seem to resolve it.
Can anyone help me?
Here is the whole function:
def download_follower(serie_lst):
    """Creates account named txt files containing followers ids. Uses for loop on accounts names list."""
    nap = [1, 2, 4, 8, 16, 32, 64, 128]
    afternoon = 0
    for exemplar in serie_lst:
        #username from serie_lst entries
        account_name = exemplar
        twapi = Twython(app_key=app_key, app_secret=app_secret,
                        oauth_token=oauth_token, oauth_token_secret=oauth_token_secret)
        try:
            #initializations
            del twapi
            if afternoon >= 7:
                afternoon = 0
            sleep(nap[afternoon])
            afternoon = afternoon + 1
            twapi = Twython(app_key=app_key, app_secret=app_secret,
                            oauth_token=oauth_token, oauth_token_secret=oauth_token_secret)
            next_cursor = -1
            result = {}
            result["screen_name"] = ""
            result["followers"] = []
            iteration = 0
            file_name = ""
            #user info
            user = twapi.lookup_user(screen_name=account_name)
            #store user name
            result['screen_name'] = account_name
            #loop until all cursored results are stored
            while (next_cursor != 0):
                sleep(random.randrange(start=1, stop=15, step=1))
                call_result = twapi.get_followers_ids(screen_name=account_name, cursor=next_cursor)
                #loop over each entry of followers id and append each entry to results_follower
                for i in call_result["ids"]:
                    result["followers"].append(i)
                next_cursor = call_result["next_cursor"]  #new next_cursor
                iteration = iteration + 1
                if (iteration > 13):  #skip sleep if all cursored pages are processed
                    error_msg = localtime()
                    error_msg = "".join([str(error_msg.tm_mon), "/", str(error_msg.tm_mday), "/", str(error_msg.tm_year), " at ", str(error_msg.tm_hour), ":", str(error_msg.tm_min)])
                    error_msg = "".join(["Twitter API Request Rate Limit hit on ", error_msg, ", wait..."])
                    print(error_msg)
                    del error_msg
                    sleep(901)  #15min + 1sec
                    iteration = 0
            #output file
            file_name = "".join([account_name, ".txt"])
            #print output
            out_file = open(file_name, "w")  #open file "account_name.txt"
            #out_file.write(str(result["followers"]))  #standard format
            for i in result["followers"]:  #R friendly table format
                out_file.write(str(i))
                out_file.write("\n")
            out_file.close()
        except twython.TwythonRateLimitError:
            #wait
            error_msg = localtime()
            error_msg = "".join([str(error_msg.tm_mon), "/", str(error_msg.tm_mday), "/", str(error_msg.tm_year), " at ", str(error_msg.tm_hour), ":", str(error_msg.tm_min)])
            error_msg = "".join(["Twitter API Request Rate Limit hit on ", error_msg, ", wait..."])
            print(error_msg)
            del error_msg
            del twapi
            sleep(901)  #15min + 1sec
            #initializations
            if afternoon >= 7:
                afternoon = 0
            sleep(nap[afternoon])
            afternoon = afternoon + 1
            twapi = Twython(app_key=app_key, app_secret=app_secret,
                            oauth_token=oauth_token, oauth_token_secret=oauth_token_secret)
            next_cursor = -1
            result = {}
            result["screen_name"] = ""
            result["followers"] = []
            iteration = 0
            file_name = ""
            #user info
            user = twapi.lookup_user(screen_name=account_name)
            #store user name
            result['screen_name'] = account_name
            #loop until all cursored results are stored
            while (next_cursor != 0):
                sleep(random.randrange(start=1, stop=15, step=1))
                call_result = twapi.get_followers_ids(screen_name=account_name, cursor=next_cursor)
                #loop over each entry of followers id and append each entry to results_follower
                for i in call_result["ids"]:
                    result["followers"].append(i)
                next_cursor = call_result["next_cursor"]  #new next_cursor
                iteration = iteration + 1
                if (iteration > 13):  #skip sleep if all cursored pages are processed
                    error_msg = localtime()
                    error_msg = "".join([str(error_msg.tm_mon), "/", str(error_msg.tm_mday), "/", str(error_msg.tm_year), " at ", str(error_msg.tm_hour), ":", str(error_msg.tm_min)])
                    error_msg = "".join(["Twitter API Request Rate Limit hit on ", error_msg, ", wait..."])
                    print(error_msg)
                    del error_msg
                    sleep(901)  #15min + 1sec
                    iteration = 0
            #output file
            file_name = "".join([account_name, ".txt"])
            #print output
            out_file = open(file_name, "w")  #open file "account_name.txt"
            #out_file.write(str(result["followers"]))  #standard format
            for i in result["followers"]:  #R friendly table format
                out_file.write(str(i))
                out_file.write("\n")
            out_file.close()
As discussed in the comments, there are a few issues with your code at present. You shouldn't need to delete your connection for it to function properly, and I think the issue comes because you initialise for a second time without having any catches for hitting your rate limit. Here is an example using Tweepy of how you can get the information you require:
import tweepy
from datetime import datetime


def download_followers(user, api):
    all_followers = []
    try:
        for page in tweepy.Cursor(api.followers_ids, screen_name=user).pages():
            all_followers.extend(map(str, page))
        return all_followers
    except tweepy.TweepError:
        print('Could not access user {}. Skipping...'.format(user))


# Include your keys below:
consumer_key = 'YOUR_KEY'
consumer_secret = 'YOUR_KEY'
access_token = 'YOUR_KEY'
access_token_secret = 'YOUR_KEY'

# Set up tweepy API, with handling of rate limits
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
main_api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)

# List of usernames to get followers for
lookup_users = ['asongtoruin', 'mbiella']

for username in lookup_users:
    user_followers = download_followers(username, main_api)
    if user_followers:
        with open(username + '.txt', 'w') as outfile:
            outfile.write('\n'.join(user_followers))
        print('Finished outputting: {} at {}'.format(username, datetime.now().strftime('%Y/%m/%d %H:%M:%S')))
Tweepy is clever enough to know when it has hit its rate limit when we use wait_on_rate_limit=True, and it works out how long it needs to sleep before it can start again. By using wait_on_rate_limit_notify=True, we get it to print out how long it will wait until it can fetch the next page of followers (through this ID-based method, it seems as though there are 5000 IDs per page).
We additionally catch a TweepError exception - this can occur if the username provided relates to a protected account for which our authenticated user does not have permission to view. In this case, we simply skip the user to allow other information to be downloaded, but print out a warning that the user could not be accessed.
Running this saves a text file of follower ids for any user it can access. For me this prints the following:
Rate limit reached. Sleeping for: 593
Finished outputting: asongtoruin at 2017/02/22 11:43:12
Could not access user mbiella. Skipping...
With the follower IDs of asongtoruin (aka me) saved as asongtoruin.txt
There is one possible issue, in that our pages of followers start from the newest first. This could (though I don't understand the API well enough to say with certainty) result in issues with our output dataset if new users are added between our calls, as we may both miss these users and end up with duplicates in our dataset. If duplicates become an issue, you could change return all_followers to return set(all_followers)

How to request multiple URLs at one time using urllib in Python

I'm writing a program for downloading images from the internet, and I would like to speed it up by making multiple requests at once.
So I wrote a code you can see here at GitHub.
I can request for webpage only like this:
def myrequest(url):
    worked = False
    req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    while not worked:
        try:
            webpage_read = urlopen(req).read()
            worked = True
        except:
            print("failed to connect to \n{}".format(url))
    return(webpage_read)


url = "http://www.mangahere.co/manga/mysterious_girlfriend_x"
webpage_read = myrequest(url).decode("utf-8")
The while loop is there because I definitely want to download every single picture, so I keep trying until it works (nothing can go wrong except urllib.error.HTTPError: HTTP Error 504: Gateway Time-out).
My question is: how do I run this multiple times at once?
My idea is to have a "commander" that runs 5 (or 85) Python scripts, gives each one a URL, and collects the webpage from each once it finishes, but this is definitely a silly solution :)
EDIT:
I used _thread, but it doesn't seem to speed up the program. That should have been the solution, or am I doing it wrong? That is my new question.
You can use the link to get to my code on GitHub.
def thrue_thread_download_pics(path, url, ep, name):
    lock.acquire()
    global goal
    goal += 1
    lock.release()
    webpage_read = myrequest("{}/{}.html".format(url, ep))
    url_to_pic = webpage_read.decode("utf-8").split('" onerror="')[0].split('<img src="')[-1]
    pic = myrequest(url_to_pic)
    myfile = open("{}/pics/{}.jpg".format(path, name), "wb")
    myfile.write(pic)
    myfile.close()
    global finished
    finished += 1
and I'm using it here:
for url_ep in urls_eps:
    url, maxep = url_ep.split()
    maxep = int(maxep)
    chap = url.split("/")[-1][2:]
    if "." in chap:
        chap = chap.replace(".", "")
    else:
        chap = "{}0".format(chap)
    for ep in range(1, maxep + 1):
        ted = time.time()
        name = "{}{}".format(chap, "{}{}".format((2 - len(str(ep))) * "0", ep))
        if name in downloaded:
            continue
        _thread.start_new_thread(thrue_thread_download_pics, (path, url, ep, name))

checker = -1
while finished != goal:
    if finished != checker:
        checker = finished
        print("{} of {} downloaded".format(finished, goal))
    time.sleep(0.1)
Requests Futures is built on top of the very popular requests library and uses non-blocking IO:
from requests_futures.sessions import FuturesSession
session = FuturesSession()
# These requests will run at the same time
future_one = session.get('http://httpbin.org/get')
future_two = session.get('http://httpbin.org/get?foo=bar')
# Get the first result
response_one = future_one.result()
print(response_one.status_code)
print(response_one.text)
# Get the second result
response_two = future_two.result()
print(response_two.status_code)
print(response_two.text)
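The same session can also fire off a whole batch of requests. Here is a small sketch (the URLs are placeholders, not from the original answer) that collects the futures in a dict and handles each response as it completes:

from concurrent.futures import as_completed
from requests_futures.sessions import FuturesSession

session = FuturesSession(max_workers=10)  # up to 10 requests in flight at once

# Placeholder URLs; in the question these would be the pages/images to fetch
urls = ['http://httpbin.org/get?page={}'.format(i) for i in range(20)]

futures = {session.get(url): url for url in urls}
for future in as_completed(futures):
    response = future.result()
    print(futures[future], response.status_code, len(response.content))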
