ZenPy search calls with max results - python

I am using ZenPy to search for a few tickets in ZenDesk:
open_tickets = zenpy_client.search(type='ticket', status=['new', 'open'], subject=subject, group=group_name, created_between=[two_weeks_ago_date, current_date])
The problem is that when this search call returns too many results (more than 1,000, which is the new query limit for the Zendesk API), I get the following exception:
<Invalid search: Requested response size was greater than Search Response Limits>
I've looked through the ZenPy documentation but couldn't find any parameter to limit the search call to 10 pages (in this case, 1,000 records, since we get 100 tickets per request).
I ended up wrapping the call in a try/except, but I'm sure that's not the best solution:
from zenpy.lib.exception import APIException

try:
    open_tickets = zenpy_client.search(type='ticket', status=['new', 'open'], subject=subject, group=group_name, created_between=[two_weeks_ago_date, current_date])
except APIException as ex:
    ...
What is the best solution to limit this search?
I also know that I could narrow the date range further, but we create a lot of tickets on one specific day of the week, so there's no way to filter more; I just need to fetch results up to the limit.
Reference:
https://developer.zendesk.com/rest_api/docs/support/search
https://develop.zendesk.com/hc/en-us/articles/360022563994--BREAKING-New-Search-API-Result-Limits
Thanks!

The generator returned by search supports Python slices, so the following code will pull at most 1,000 results and avoid exceeding the new limit:
ticket_generator = zenpy_client.search(type="ticket")
print(ticket_generator[:1000])
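For completeness, here is a minimal sketch of the same idea applied to the original call; the Zenpy credentials are placeholders, and subject, group_name, two_weeks_ago_date and current_date are assumed to be the variables from the question:
from zenpy import Zenpy

# Placeholder credentials; replace with your own
zenpy_client = Zenpy(subdomain="yoursubdomain", email="you@example.com", token="yourtoken")

ticket_generator = zenpy_client.search(
    type="ticket",
    status=["new", "open"],
    subject=subject,
    group=group_name,
    created_between=[two_weeks_ago_date, current_date],
)

# Slicing the generator stops pagination at 1,000 results,
# so the call should never exceed the new search response limit
open_tickets = ticket_generator[:1000]
for ticket in open_tickets:
    print(ticket.id, ticket.subject)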

Related

rate limit tweepy paginator search_all_tweets

I'm not sure why I am getting rate limited so quickly using:
mentions = []
for tweet in tweepy.Paginator(client.search_all_tweets,
                              query="to:######## lang:nl -is:retweet",
                              start_time="2022-01-01T00:00:00Z",
                              end_time="2022-05-31T00:00:00Z",
                              max_results=500).flatten(limit=10000):
    mention = tweet.text
    mentions.append(mention)
I suppose I could put time.sleep(1) after these lines, but then it would mean I could only process one Tweet every second, whereas with a regular client.search_all_tweets I would get 500 Tweets per request.
Is there anything I'm missing here? How can I process more than one Tweet a second using tweepy.Paginator?
BTW: I have academic access and know the rate limit documentation.
See the FAQ section about this in Tweepy's documentation:
Why am I getting rate-limited so quickly when using Client.search_all_tweets() with Paginator?
The GET /2/tweets/search/all Twitter API endpoint that Client.search_all_tweets() uses has an additional 1 request per second rate limit that is not handled by Paginator.
You can time.sleep() 1 second while iterating through responses to handle this rate limit.
See also the relevant Tweepy issues #1688 and #1871.
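As a rough sketch of what the FAQ suggests: iterating over the Paginator itself (one response per request) instead of using flatten() lets you sleep once per request rather than once per Tweet, so each call can still return up to 500 Tweets. The query and time range below are copied from the question, and client is assumed to be an authenticated tweepy.Client:
import time
import tweepy

mentions = []
for response in tweepy.Paginator(client.search_all_tweets,
                                 query="to:######## lang:nl -is:retweet",
                                 start_time="2022-01-01T00:00:00Z",
                                 end_time="2022-05-31T00:00:00Z",
                                 max_results=500):
    for tweet in response.data or []:
        mentions.append(tweet.text)
    # Wait out the extra 1 request/second limit on GET /2/tweets/search/all
    time.sleep(1)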

Twitter pagination per page limit in downloading user profile Tweets

Here is the code I am using from this link. I have updated the original code as I need the full .json object. But I am having a problem with pagination as I am not getting the full 3200 Tweets.
api = tweepy.API(auth, parser=tweepy.parsers.JSONParser(), wait_on_rate_limit=True)
jsonFile = open(path + filname + '.json', "a+", encoding='utf-8')
page = 1
max_pages = 3200
result_limit = 2
last_tweet_id = False
while page <= max_pages:
    if last_tweet_id:
        tweet = api.user_timeline(screen_name=user,
                                  count=result_limit,
                                  max_id=last_tweet_id - 1,
                                  tweet_mode='extended',
                                  include_retweets=True)
    else:
        tweet = api.user_timeline(screen_name=user,
                                  count=result_limit,
                                  tweet_mode='extended',
                                  include_retweets=True)
    json_str = json.dumps(tweet, ensure_ascii=False, indent=4)
As per the author, "result_limit and max_pages are multiplied together to get the number of tweets called."
Shouldn't I then get 6400 Tweets by that definition? The problem is that I am getting 2 Tweets 3200 times. I also updated the values to
max_pages=3200
result_limit=5000
Think of that as an upper bound, so I should get at least 3200 Tweets. But in this case I got 200 Tweets repeated many times (before I terminated the code).
I just want 3200 Tweets per user profile, nothing fancy. Consider that I have a list of 100 users, so I want to do this efficiently. Currently it seems like I am just sending lots of requests and wasting time and resources.
Even if I update the code with a smaller value of max_pages, I am still not sure what that value should be. How am I supposed to know how many Tweets one page covers?
Note: the existing answer is not useful for me as it has an error at .item(), so please don't mark this as a duplicate.
You don't change last_tweet_id after setting it to False, so only the code in the else block is executing. None of the parameters in that method call change while looping, so you're making the same request and receiving the same response back over and over again.
Also, neither page nor max_pages changes within your loop, so this will loop infinitely.
I would recommend looking into using tweepy.Cursor instead, as it handles pagination for you.
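For illustration, a minimal sketch of that approach; it assumes the api object from the question (but with the default parser, so each status is a model object with a _json attribute), the user, path and filname variables from the question, and include_rts as the v1.1 parameter for including retweets:
import json
import tweepy

tweets = []
# Cursor handles the max_id paging internally; .items(3200) stops at the timeline cap
for status in tweepy.Cursor(api.user_timeline,
                            screen_name=user,
                            count=200,  # 200 is the max per request for user_timeline
                            tweet_mode='extended',
                            include_rts=True).items(3200):
    tweets.append(status._json)

with open(path + filname + '.json', "a+", encoding='utf-8') as jsonFile:
    json.dump(tweets, jsonFile, ensure_ascii=False, indent=4)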

Python Reddit PRAW get top week. How to change limit?

I have been familiarising myself with PRAW for Reddit. I am trying to get the top x posts for the week; however, I am having trouble changing the limit for the "top" method.
The documentation doesn't seem to mention how to do it, unless I am missing something. I can change the time period fine by just passing in the string "week", but the limit has me flummoxed. The image shows that there is a param for limit and it is set to 100.
r = self.getReddit()
sub = r.subreddit('CryptoCurrency')
results = sub.top("week")
for r in results:
    print(r.title)
DOCS: subreddit.top()
IMAGE: Inspect listing generator params
From the docs you've linked:
Additional keyword arguments are passed in the initialization of ListingGenerator.
So we follow that link and see the limit parameter for ListingGenerator:
limit – The number of content entries to fetch. If limit is None, then fetch as many entries as possible. Most of reddit’s listings contain a maximum of 1000 items, and are returned 100 at a time. This class will automatically issue all necessary requests (default: 100).
So using the following should do it for you:
results = sub.top("week", limit=500)
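And, as the quoted docs note, passing limit=None fetches as many entries as the listing exposes (most listings cap out around 1000):
# Fetch as many top posts of the week as the listing allows (usually ~1000)
for submission in sub.top("week", limit=None):
    print(submission.title)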

List subscribers limit at 1000

I am trying to fetch a subscription list according to the Subscriptions: list documentation. I want to get all my subscribers so I am using mySubscribers=True in the parameter list in a loop after my first request.
while "nextPageToken" in my_dict:
next_page_token = my_dict["nextPageToken"]
my_dict = subscriptions_list_by_channel_id(client,
part='snippet,contentDetails',
mySubscribers=True,
maxResults=50,
pageToken=next_page_token
)
for item in my_dict["items"]:
file.write("{}\n".format(item["snippet"]["channelId"]))
The problem is that at page 20 my loop breaks, i.e. I don't receive a nextPageToken key in the response, capping my data at 1000 total subscribers fetched. But I have more than 1000 subs. The documentation states that myRecentSubscribers has a limit of 1000 but that mySubscribers does not.
I can't really find much help with this anywhere. Can anyone shed some light on my situation?
I chose to list channels instead of listing subscriptions, passing the same mySubscribers argument. The documentation says it's deprecated and returns the channels in a strange order with duplicates, but it does not have a limit.
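For reference, a rough sketch of that approach with google-api-python-client; the youtube client object and the final deduplication step are assumptions, not part of the original answer:
# youtube is assumed to be an authenticated client from
# googleapiclient.discovery.build("youtube", "v3", ...)
subscriber_ids = []
request = youtube.channels().list(part="snippet", mySubscribers=True, maxResults=50)
while request is not None:
    response = request.execute()
    for item in response.get("items", []):
        subscriber_ids.append(item["id"])
    # list_next() returns None once there is no nextPageToken left
    request = youtube.channels().list_next(request, response)

# The listing can contain duplicates, so deduplicate while keeping order
unique_subscriber_ids = list(dict.fromkeys(subscriber_ids))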

How do I return more than 100 Twitter search results with Twython?

Twitter only returns 100 tweets per "page" when returning search results on the API. They provide the max_id and since_id in the returned search_metadata that can be used as parameters to get earlier/later tweets.
Twython 3.1.2 documentation suggests that this pattern is the "old way" to search:
results = twitter.search(q="xbox",count=423,max_id=421482533256044543)
for tweet in results['statuses']:
... do something
and that this is the "new way":
results = twitter.cursor(t.search, q='xbox', count=375)
for tweet in results:
    ...  # do something
When I do the latter, it appears to endlessly iterate over the same search results. I'm trying to push them to a CSV file, but it pushes a ton of duplicates.
What is the proper way to search for a large number of tweets, with Twython, and iterate through the set of unique results?
Edit: Another issue here is that when I try to iterate with the generator (for tweet in results:), it loops repeatedly, without stopping. Ah -- this is a bug... https://github.com/ryanmcgrath/twython/issues/300
I had the same problem, but it seems that you should just loop through a user's timeline in batches using the max_id parameter. The batches should be 100 as per Terence's answer (but actually, for user_timeline 200 is the max count), and just set the max_id to the last id in the previous set of returned tweets minus one (because max_id is inclusive). Here's the code:
'''
Get all tweets from a given user.
Batch size of 200 is the max for user_timeline.
'''
from twython import Twython, TwythonError

tweets = []
# Requires authentication as of Twitter API v1.1
twitter = Twython(...)  # put your Twitter keys here

try:
    user_timeline = twitter.get_user_timeline(screen_name='eugenebann', count=200)
except TwythonError as e:
    print(e)

print(len(user_timeline))
for tweet in user_timeline:
    # Add whatever you want from the tweet, here we just add the text
    tweets.append(tweet['text'])

# Count could be less than 200, see:
# https://dev.twitter.com/discussions/7513
while len(user_timeline) != 0:
    try:
        user_timeline = twitter.get_user_timeline(screen_name='eugenebann',
                                                  count=200,
                                                  max_id=user_timeline[-1]['id'] - 1)
    except TwythonError as e:
        print(e)
    print(len(user_timeline))
    for tweet in user_timeline:
        # Add whatever you want from the tweet, here we just add the text
        tweets.append(tweet['text'])

# Number of tweets the user has made
print(len(tweets))
As per the official Twitter API documentation:
count (optional): The number of tweets to return per page, up to a maximum of 100.
You need to make repeated calls to the Python method. However, there is no guarantee that these will be the next N tweets, and if tweets are arriving quickly some might be missed.
If you want all the tweets in a time frame you can use the streaming api: https://dev.twitter.com/docs/streaming-apis and combine this with the oauth2 module.
How can I consume tweets from Twitter's streaming api and store them in mongodb
python-twitter streaming api support/example
Disclaimer: I have not actually tried this.
As a solution to the problem of only getting 100 tweets per search query with Twython, here is a link showing how it can be done using the "old way":
Twython search API with next_results
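For illustration, a minimal sketch of that max_id-based paging pattern; the query and count are placeholders, and twitter is assumed to be an authenticated Twython client as above:
all_tweets = []
results = twitter.search(q="xbox", count=100)
while results['statuses']:
    all_tweets.extend(results['statuses'])
    # Page backwards: request tweets strictly older than the oldest one seen so far
    oldest_id = results['statuses'][-1]['id']
    results = twitter.search(q="xbox", count=100, max_id=oldest_id - 1)

# Total number of tweets collected for the query
print(len(all_tweets))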
