I use Python and the Spotipy library.
Is there any way to list all songs from a Spotify playlist?
The playlist_tracks() method is limited to returning only 100 songs per call.
The same question applies to other methods, e.g. current_user_saved_tracks() with its limit of 20.
Why does the Spotify API have these limits?
Thanks
To get all songs in a playlist, use the first request to determine the total track count, which is part of the Spotify response.
Dividing that total by the maximum number of tracks per request (100) and rounding up gives the number of API calls needed to fetch everything.
With this in mind, you simply loop from 0 to the call count and request the next 100 tracks each time.
Inside the loop you set the offset parameter of the request to x * 100.
You also append the requested tracks to a temporary list so that you have the entire playlist at the end.
I can only assume that the general limit of 100 tracks per request is a technical necessity on Spotify's side.
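Here is a minimal sketch of that loop with Spotipy, assuming client-credentials auth and a public playlist (playlist_id is a placeholder):

import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials())

def get_all_playlist_tracks(playlist_id):
    # The first request tells us the total number of tracks in the playlist.
    results = sp.playlist_tracks(playlist_id, limit=100, offset=0)
    tracks = list(results['items'])
    total = results['total']
    # Request the next 100 tracks until everything has been collected.
    for offset in range(100, total, 100):
        page = sp.playlist_tracks(playlist_id, limit=100, offset=offset)
        tracks.extend(page['items'])
    return tracks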
Using Tweepy, I made the following request:
tmpTweets = tweepy.Cursor(api.search_full_archive, label=LABEL_NAME, query='to:{}'.format(tweet_user), fromDate=tweet_date_formatted, toDate=tweet_date_end_formatted, maxResults=500).items()
The goal was to get all tweets to a user within a 72 hour period, so I can manually filter the "in_reply_*" features to find replies to a specific tweet.
To do this, I paid for the 250-request tier of the Premium API, since I had about 50 tweets I wanted to process this way.
Within about 5 minutes I ran out of my query limit, with the API claiming I had made 253 queries. At that point I had only processed about 3 tweets.
So my guess is that a single "call" in my code isn't one request; rather, every 500 tweets returned counts as a request, so that single line of code can trigger many requests. Can anyone confirm?
Secondly, I wonder how this makes sense. The Full Archive Premium API offers
250 Requests
1.25M Tweets
However, if a single request can return only 500 tweets, then the max tweets you can ever obtain is (250*500=125,000) Tweets. Thus, I fundamentally believe I am structuring my requests wrong to obtain proper usage of my Premium account.
So my question is...
tmpTweets = tweepy.Cursor(api.search_full_archive, label=LABEL_NAME, query='to:{}'.format(tweet_user), fromDate=tweet_date_formatted, toDate=tweet_date_end_formatted, maxResults=500).items()
How do I alter this line of code to maximize the number of tweets I get, so that I am not blowing through my request limit (253/250) well before my tweet limit (123K/1.25M)?
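For what it's worth, the only restructuring I can think of (purely a sketch based on Tweepy's documented Cursor.pages(), reusing the variable names above) is to cap how many pages, and therefore requests, the cursor pulls:

# Fetch at most 3 pages of up to 500 tweets each, i.e. at most 3 requests.
tmpTweets = []
for page in tweepy.Cursor(
        api.search_full_archive,
        label=LABEL_NAME,
        query='to:{}'.format(tweet_user),
        fromDate=tweet_date_formatted,
        toDate=tweet_date_end_formatted,
        maxResults=500).pages(3):
    tmpTweets.extend(page)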
Thanks!
I’m using Twitter’s real-time filtered stream API (for the first time!) in Python, and I’m basically trying to recreate my real-time timeline in-terminal: I want to get a response every time someone I follow tweets, retweets, etc. I’m doing this in two steps:
GETting an id list of people I follow (my “friends”) from api.twitter.com/1.1/friends/ids.json?screen_name=my_username
Iterating over the returned list of IDs and creating a "from:" follow rule for each:
filter_rules = []
for f_id in friends_list_response['ids']:
    rule_val = "from:" + str(f_id)
    filter_rules.append({"value": rule_val, "tag": str(f_id)})
I have this working for up to 10 test friends—they tweet, I get terminal printout in real-time. However I very quickly hit a cap, since I follow 500+ accounts, and am thus creating 500+ distinct from: rules for the filtered stream.
(HTTP 422): {"detail":"Rule creation request exceeds account's current rule limits. Please remove or update existing rules to proceed.","title":"RulesCapExceeded","type":"https://api.twitter.com/2/problems/rule-cap"}
I don't really understand what to do about hitting my cap (that "type" URL doesn't offer guidance): whether I need higher permissions for API v2, or whether I need to do something with PowerTrack. How can I increase my rule limit, ideally to track 1000+ accounts? I believe I have the basic developer authorization. Or is there a workaround to a filter rule for tweets from my friends? I wish there were an endpoint that filtered tweets by everyone followed by a user.
Again, I'm new to this API, so I appreciate any insight!
Never mind: Tweepy's home_timeline function is exactly what I needed, almost a direct answer to "I wish there was an endpoint that filtered tweets by everyone followed by a user."
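For anyone else landing here, a minimal sketch of that approach (the credentials are placeholders; home_timeline returns recent tweets and retweets from the accounts you follow):

import tweepy

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)

# Page through the home timeline, printing each tweet as it is fetched.
for status in tweepy.Cursor(api.home_timeline, count=200).items(200):
    print(status.user.screen_name + ': ' + status.text)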
In Twitter API v2, you can use the filtered stream to follow a list of users. You build the filtered stream from rules. The rule limits are as follows:
Your rules will be limited depending on which product track you are using.
If you are using the Standard product track at the Basic level of access, you are able to add 25 concurrent rules to your stream, and each rule can be 512 characters long.
If you are using the Academic Research product track, you are able to add 1000 concurrent rules to your stream, and each rule can be 1024 characters long.
So, you can add 25 concurrent rules with 512 characters each. If the only filter will be user account names or IDs, you need to calculate how many users fit within the character limit.
The character length of a Twitter account ID ranges from about 8 (older accounts) to 18 (the newest accounts). Note that you also need to add ' from:' for each user, which means 6 more characters per account.
If we assume an average ID length of 13 and add the 6 extra characters:
512 characters per rule / 19 average characters per user ID = ~26 IDs per rule
If we have 25 rules:
25 rules * 26 IDs per rule = ~650 IDs in total
Note that this is approximate and you can use the account name instead of the ID.
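As an illustration of that packing, here is a rough sketch (an assumed helper, not part of any official SDK) that groups "from:" clauses with OR into as few rules as possible while staying under the 512-character limit; the exact per-user overhead differs slightly from the estimate above:

MAX_RULE_LEN = 512  # Standard product track, Basic access

def build_rules(user_ids):
    # Pack "from:<id>" clauses, joined by OR, into 512-character rules.
    rules, current = [], ""
    for uid in user_ids:
        clause = "from:" + str(uid)
        candidate = clause if not current else current + " OR " + clause
        if len(candidate) <= MAX_RULE_LEN:
            current = candidate
        else:
            rules.append({"value": current})
            current = clause
    if current:
        rules.append({"value": current})
    return rules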
I want to create a database of YouTube videos with like and dislike counts, clustered by genre. So I need a dataset of each and every video on YouTube. So far, the Data API supports queries fired only for a single URL. Furthermore, I am not sure the Data API supports going through each and every video, which seems unfeasible. Is there any way I can get the task done? Should I try to crawl, even though I am not sure whether that is legal?
Also, I am using a web-based architecture for it.
Thanks for any help.
YouTube imposes a soft limit of about 500 search results. There is currently no way to get more than that through the API.
Full details: https://code.google.com/p/gdata-issues/issues/detail?id=4282
Relevant Excerpt:
"We can't provide more than ~500 search results for any arbitrary YouTube query via the API without the quality of the search results severely degrading (duplicates, etc.).
The v1/v2 GData API was updated back in November to limit the number of search results returned to 500. If you specify a start-index of 500 or more, you won't get back any results.
This was supposed to have also gone into effect for the v3 API (which uses a different method of paging through results) but it apparently was not pushed out, so it is still possible to retrieve up to 1000 search results in v3—the last 500 of which are usually of bad quality.
The change to limit v3 to 500 search results will be pushed out sometime in the near future. There will no longer be nextPageTokens returned once you hit 500 results.
I understand that the totalResults that are returned is much higher than 500 in all of these cases, but that is not the same thing as saying that we can effectively return all X million possible results. It's meant as an estimate of the total size of the set of videos that match a query and normally isn't very useful."
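For reference, a minimal paging sketch with the google-api-python-client (API_KEY and the query string are placeholders) that follows nextPageToken until the ~500-result cutoff:

from googleapiclient.discovery import build

youtube = build("youtube", "v3", developerKey=API_KEY)

videos = []
page_token = None
while len(videos) < 500:  # the API stops handing out pages at roughly 500 results
    response = youtube.search().list(
        part="id",
        q="some query",
        type="video",
        maxResults=50,
        pageToken=page_token,
    ).execute()
    videos.extend(item["id"]["videoId"] for item in response["items"])
    page_token = response.get("nextPageToken")
    if not page_token:
        break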
I'm currently trying to retrieve the followers of some big account with a lot of followers.
I'm using Tweepy and this piece of code (with cursor):
follower_cursors = tweepy.Cursor(api.followers, id=id_var, count=5000)
for friend in follower_cursors.items():
    pass  # process each follower here
OK, if I don't specify count, it seems that by default only 20 results are shown per page, but since the Twitter API documentation says it can provide 5000 followers, I tried to set it to the maximum.
However, this doesn't seem to be taken into account, and each page contains a maximum of 200 entries, which is a real problem because you hit the rate limit much more easily.
What am I doing wrong? Is there a way to make Tweepy request pages of 5000 IDs, to minimize requests and override this default maximum of 200?
Thanks!
You could use cursor for pages instead of items, and then process the items per page:
for page in Cursor(api.user_timeline).pages():
    # page is a list of statuses
    process_page(page)
    # or iterate over the items in `page`
I don't see a limit in the tweepy Cursor for results returned, so it should return as many as it gets.
Previous answer:
The max per-page result is enforced by the Twitter API, not by tweepy. You're supposed to paginate over the list of 200-per-call results, which Cursor is already doing for you. If there were 5000 followers, then with the max 200 results per query, you're using only 25 calls. You'd still have 4975 calls left to do other things.
To exceed the 5000-per-hour rate limit, you'd need to be doing at least 83 calls per minute or 1.4 calls per second.
Note that 'read limits' are per-application but 'write limits' are per user. So you could split your task between two or more apps* if they are read intensive.
Consider using the Streaming API instead, if it's more appropriate for your needs.
*: Though I'm sure Twitter has controls in place to prevent abuse.
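Aside (not something the answer above covers): if numeric IDs alone are enough, the followers/ids endpoint does return up to 5,000 IDs per call. A rough sketch with Tweepy 3.x (the method was renamed get_follower_ids in Tweepy 4.x):

ids = []
# Each page here is a list of up to 5,000 follower IDs, i.e. one API call each.
for page in tweepy.Cursor(api.followers_ids, id=id_var, count=5000).pages():
    ids.extend(page)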
I am trying to retrieve comments and likes for specific posts through Facebook's Open Graph API. While I do get some information back, it does not always match the comment/like counts shown on the post. I guess this can be attributed to the access permissions of the token I'm using. However, I have noticed that results vary depending on the request limit I use, and sometimes I also get duplicate entries between requests.
For example, post 10376464573_150423345118848 has about 14000 likes as of this writing, but I can only retrieve a maximum of around 5000. With the default limit of 25 I can get up to 3021 likes. A value of 100 gives 4501, while limits of 1000, 2000, 3000 and 5000 all return the same number of likes, 4959 (the absolute values don't make too much sense of course, they are just there for comparison).
I have noticed similar results on a smaller scale for comments.
I'm using a simple python script to fetch pages. It goes through the data following the pagination links provided by Facebook, writing each page retrieved to a separate file. Once an empty reply is encountered it stops.
With small limits (e.g. the default of 25), I notice that the number of results returned decreases monotonically as I go through the pagination links, which seems really odd.
Any thoughts on what could be causing this behavior and how to work around it?
If you are looking for a list of the names of each and every like / comment on a particular post I think you will run up against the API limit (even with pagination).
If you are merely looking for an aggregate number of likes, comments, shares, or link clicks, you'll want to simply use the summary=true param provided in the posts endpoint. Kind of like this:
import requests

def get_comment_summary(postid, apikey):
    try:
        endpoint = 'https://graph.facebook.com/v2.5/' + postid + '/comments?summary=true&access_token=' + apikey
        response = requests.get(endpoint)
        fb_data = response.json()
        return fb_data
    except requests.RequestException:  # handle network errors as appropriate
        return None
You can also retrieve all of the posts of any particular page and their summary data points:
{page_id}/posts?fields=message,likes.limit(1).summary(true)
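For illustration, a minimal request for that pattern might look like this (page_id and apikey are placeholders; with summary(true) each post carries a likes.summary.total_count):

import requests

endpoint = ('https://graph.facebook.com/v2.5/' + page_id +
            '/posts?fields=message,likes.limit(1).summary(true)'
            '&access_token=' + apikey)
posts = requests.get(endpoint).json()
for post in posts.get('data', []):
    total_likes = post.get('likes', {}).get('summary', {}).get('total_count', 0)
    print(post.get('id'), total_likes)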
You can retrieve the comments, like count, and other information for a particular post using the URL/API call below:
'https://graph.facebook.com/{0}/comments?access_token={1}&limit={2}&fields=from,message,message_tags,created_time,id,attachment,like_count,comment_count,parent&order=chronological&filter=stream'.format(post_id, access_token, limit)
Since the order is specified as chronological, you also need to pass the after parameter in the same URL; its value can be found in the paging.cursors.after section of the first response.
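To make that concrete, here is a short paging sketch (post_id, access_token, and limit are assumed to already be defined) that follows the after cursor until no further pages are returned:

import requests

def fetch_all_comments(post_id, access_token, limit=100):
    url_template = ('https://graph.facebook.com/{0}/comments?access_token={1}&limit={2}'
                    '&fields=from,message,created_time,id,like_count,comment_count'
                    '&order=chronological&filter=stream')
    comments = []
    after = None
    while True:
        url = url_template.format(post_id, access_token, limit)
        if after:
            url += '&after=' + after
        data = requests.get(url).json()
        comments.extend(data.get('data', []))
        after = data.get('paging', {}).get('cursors', {}).get('after')
        if not after or not data.get('data'):
            break
    return comments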