Tweets returned by the Twitter API are truncated - python

I am using the python-twitter tool and have a question about the results it returns. The text I get back seems to be a truncated version of the original tweets.
import twitter
api = twitter.Api(consumer_key='jyd2tcu**OHiIrfg',
                  consumer_secret='****t80qZeM4JYvV5V8UpB0fTtebPSsb0LUjI9kYSZbLTRn',
                  access_token_key='1***74372608-dfi5bz22RTKep7GF04lk6FnPSYBgnD',
                  access_token_secret='5gt0YIw***gwPca5RXiwMksg7GM4ACQtl4')
results = api.GetSearch(
    raw_query="q=immigration%20&result_type=recent")
The text I got back is:
Text='RT @ddale8: Fox is now showing Trump\'s comments at Cabinet. He begins the clip by saying he\'s "heard numbers as high as $275 billion" for h…')
It ends with "…". Is this how the Twitter API works, or is there a way I can get the whole tweet instead?
Thank you.

Try passing tweet_mode="extended" to the twitter.Api constructor.
I believe that because the original tweet is longer than 140 characters, the interface has to be told to expect the full text; it does not return it by default.
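One more thing to watch for: as far as I know, a retweet can still come back cut off in extended mode, with the complete text sitting on the original tweet. Here is a minimal sketch of a helper that picks the longest text available. The helper name and the stand-in objects are my own, not part of python-twitter; it only assumes the status object exposes text, full_text and retweeted_status attributes.

```python
from types import SimpleNamespace


def untruncated_text(status):
    """Return the longest available text for a status-like object."""
    # For a retweet, prefer the original tweet, which carries the complete text.
    original = getattr(status, "retweeted_status", None)
    if original is not None:
        status = original
    # full_text is filled in extended mode; fall back to text otherwise.
    return getattr(status, "full_text", None) or getattr(status, "text", "")


# Stand-in objects instead of a live API call (hypothetical data).
plain = SimpleNamespace(text="short tweet", full_text=None, retweeted_status=None)
retweet = SimpleNamespace(
    text="RT @someone: a long tweet that gets cut off…",
    full_text=None,
    retweeted_status=SimpleNamespace(
        full_text="a long tweet that gets cut off here", text=None
    ),
)

print(untruncated_text(plain))
print(untruncated_text(retweet))
```

Note that for a retweet this returns the original tweet's text, so the "RT @user:" prefix is dropped.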

Related

Problem with getting tweet_fields from Twitter API 2.0 using Tweepy

I have a problem similar to the one in this question (Problem with getting user.fields from Twitter API 2.0),
but I am using Tweepy. When I make the request with tweet_fields, the response only gives me the default values. In another function, where I use user_fields, it works perfectly.
I followed this guide, specifically number 17 (https://dev.to/twitterdev/a-comprehensive-guide-for-using-the-twitter-api-v2-using-tweepy-in-python-15d9)
My function looks like this:
def get_user_tweets():
    client = get_client()
    tweets = client.get_users_tweets(id=get_user_id(), max_results=5)
    ids = []
    for tweet in tweets.data:
        ids.append(str(tweet.id))
    tweets_info = client.get_tweets(ids=ids, tweet_fields=["public_metrics"])
    print(tweets_info)
This is my response (with the latest tweets from elonmusk). There is no error code or anything else:
Response(data=[<Tweet id=1471419792770973699 text=#WholeMarsBlog I came to the US with no money & graduated with over $100k in debt, despite scholarships & working 2 jobs while at school>, <Tweet id=1471399837753135108 text=#TeslaOwnersEBay #PPathole #ScottAdamsSays #johniadarola #SenWarren It’s complicated, but hopefully out next quarter, along with Witcher. Lot of internal debate as to whether we should be putting effort towards generalized gaming emulation vs making individual games work well.>, <Tweet id=1471393851843792896 text=#PPathole #ScottAdamsSays #johniadarola #SenWarren Yeah!>, <Tweet id=1471338213549744130 text=link>, <Tweet id=1471325148435394566 text=#24_7TeslaNews #Tesla ❤️>], includes={}, errors=[], meta={})
I found this link: https://giters.com/tweepy/tweepy/issues/1670. According to it,
Response is a namedtuple. Here, within its data field, is a single Tweet object.
The string representation of a Tweet object will only ever include its ID and text. This was an intentional design choice, to reduce the excess of information that could be displayed when printing all the data as the string representation, as with models.Status. The ID and text are the only default / guaranteed fields, so the string representation remains consistent and unique, while still being concise. This design is used throughout the API v2 models.
To access the data of the Tweet object, you can use attributes or keys (like a dictionary) to access each field.
If you want all the data as a dictionary, you can use the data attribute/key.
In that case, to access public metrics, you could maybe try doing this instead:
tweets_info = client.get_tweets(ids=ids, tweet_fields=["public_metrics"])
for tweet in tweets_info.data:
    print(tweet["id"])
    print(tweet["public_metrics"])
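If you want the metrics in one structure rather than printed one by one, something along these lines might work. It is only a sketch: metrics_by_id is a name I made up, and plain dictionaries stand in for the Tweet objects, relying on the key access described in the quote above.

```python
def metrics_by_id(tweets):
    """Map tweet ID -> public_metrics for objects that support key access."""
    result = {}
    for tweet in tweets:
        # public_metrics only carries data if it was requested via tweet_fields.
        result[tweet["id"]] = tweet["public_metrics"]
    return result


# Hypothetical sample data standing in for tweets_info.data.
sample = [
    {"id": 1, "text": "hello", "public_metrics": {"like_count": 3, "retweet_count": 1}},
    {"id": 2, "text": "world", "public_metrics": {"like_count": 0, "retweet_count": 5}},
]

metrics = metrics_by_id(sample)
print(metrics[2]["retweet_count"])
```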

Tweepy: extended mode with api.search

I've written a simple script to get the most trending 300 tweets containing a specific hashtag.
for self._tweet in tweepy.Cursor(self._api.search,q=self._screen_name,count=300, lang="en").items(300):
    self._csvWriter.writerow([self._tweet.created_at, self._tweet.text.encode('utf-8')])
It works well and saves the result to CSV, but the tweets are truncated.
I modified the code like this, adding the tweet_mode="extended" parameter:
for self._tweet in tweepy.Cursor(self._api.search,q=self._screen_name,count=300, lang="en", tweet_mode="extended").items(300):
    self._csvWriter.writerow([self._tweet.created_at, self._tweet.text.encode('utf-8')])
But I got this exception:
AttributeError: 'Status' object has no attribute 'text'
My question is: how can I save a complete tweet using a Cursor? (complete = not truncated)
Thanks in advance (and sorry, I'm a Tweepy newbie trying to learn as much as possible)
You're really close. Do this instead:
for self._tweet in tweepy.Cursor(self._api.search,q=self._screen_name,count=300, lang="en", tweet_mode="extended").items(300):
    self._csvWriter.writerow([self._tweet.created_at, self._tweet.full_text.encode('utf-8')])
Notice that I used full_text in self._tweet.full_text.encode('utf-8'), rather than just text. When you use tweet_mode='extended', the Status object has no text attribute (hence the AttributeError), and the complete tweet appears in full_text instead.
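A side note in case the script runs on Python 3: there, .encode('utf-8') produces bytes, and the csv module writes those out as b'...'. It is usually simpler to keep the field as a string and let the file handle the encoding. A rough sketch, where write_tweets and the stand-in objects are my own names and an in-memory buffer replaces the real file:

```python
import csv
import io
from types import SimpleNamespace


def write_tweets(rows, out):
    """Write (created_at, full text) rows to an open text-mode file."""
    writer = csv.writer(out)
    for tweet in rows:
        # The field stays a str; the file object takes care of the encoding.
        writer.writerow([tweet.created_at, tweet.full_text])


# Hypothetical stand-ins for Status objects returned in extended mode.
tweets = [SimpleNamespace(created_at="2017-01-01 12:00:00", full_text="hello, world")]

buffer = io.StringIO()
write_tweets(tweets, buffer)
print(buffer.getvalue())
```

With a real file you would open it as open(path, "w", newline="", encoding="utf-8") and pass that in place of the buffer.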

Tweepy API Live Stream (Two-separate words back to back) - Python 3.5

I'm searching for two separate words used back to back in a tweet, but the results include tweets that merely contain both words somewhere (e.g. "apple watch" comes back with something like "@JohnDoe - I watch tv and eat an apple").
The code I'm currently using is as follows:
live_stream.filter(track = ("apple watch"))
I've also tried:
live_stream.filter(track = ("\"apple watch\""))
Neither has worked for the task at hand. Thanks!
The Twitter API doesn't support exact matching in this manner, unfortunately.
https://dev.twitter.com/streaming/overview/request-parameters
Exact matching of phrases (equivalent to quoted phrases in most search engines) is not supported.
The way to do it is as suggested by rbierman - retrieve the tweets and drop the ones that don't have the exact phrase, e.g., something along the lines of:
search_track = "apple watch"
# retrieved_tweets is assumed to be a list of tweet texts (strings)
retained_tweets = [tweet for tweet in retrieved_tweets if search_track in tweet]
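A plain substring test is case-sensitive and would also match inside longer words (e.g. "pineapple watching"). If that matters, a regular expression with word boundaries is one option. This is just a sketch; contains_phrase and the sample texts are made up for illustration.

```python
import re


def contains_phrase(text, phrase):
    """True if the words of `phrase` appear back to back in `text`."""
    # Allow any whitespace between the words and require word boundaries.
    words = [re.escape(word) for word in phrase.split()]
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


# Hypothetical tweet texts collected from the stream.
retrieved_texts = [
    "Just bought an Apple Watch!",
    "I watch tv and eat an apple",
    "pineapple watching",
]

retained = [t for t in retrieved_texts if contains_phrase(t, "apple watch")]
print(retained)
```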

limit TwitterSearch API, python

How would I limit the number of results in a twitter search?
This is what I thought would work...tweetSearchUser is user input in another line of code.
tso = TwitterSearchOrder() # create a TwitterSearchOrder object
tso.set_keywords(["tweetSearchUser"]) # let's define all words we would like to have a look for
tso.setcount(30)
tso.set_include_entities(False) # and don't give us all those entity information
I was looking at this reference:
https://twittersearch.readthedocs.org/en/latest/advanced_usage_ts.html
I also tried this. It seems like it should work, but I can't figure out the format for entering the date:
tso.set_until('2016-02-25')
You should use set_count (note the underscore) rather than setcount, as specified in the documentation.
The default value for count is 200, because it is the maximum of tweets returned by the Twitter API.
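If the goal is to stop after a fixed number of tweets in total, and the library keeps fetching further pages while you iterate (an assumption on my part, check the documentation for your version), you can also cap the iteration yourself. A minimal sketch using itertools.islice, with a made-up generator standing in for the search results:

```python
from itertools import islice


def first_n(results, n):
    """Return at most n items from any iterable of results."""
    return list(islice(results, n))


def fake_search():
    # Hypothetical stand-in for an iterable that keeps yielding tweets.
    number = 0
    while True:
        number += 1
        yield {"text": f"tweet {number}"}


limited = first_n(fake_search(), 30)
print(len(limited))
```

islice stops pulling from the iterable once the limit is reached, so no further items are requested after the 30th.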

BioPython Pubmed Eutils url?

I'm trying to run some queries against Pubmed's Eutils service. If I run them on the website I get a certain number of records returned, in this case 13126 (link to pubmed).
A while ago I bodged together a python script to build a query to do much the same thing, and the resultant url returns the same number of hits (link to Eutils result).
Of course, not having any formal programming background, it was all a bit kludgy, so I'm trying to do the same thing using Biopython. I think the following code should do the same thing, but it returns a greater number of hits, 23303.
from Bio import Entrez
Entrez.email = "A.N.Other@example.com"
handle = Entrez.esearch(db="pubmed", term="stem+cell[All Fields]", datetype="pdat", mindate="2012", maxdate="2012")
record = Entrez.read(handle)
print(record["Count"])
I'm fairly sure it's just down to some subtlety in how the url is being generated, but I can't work out how to see what url is being generated by Biopython. Can anyone give me some pointers?
Thanks!
EDIT:
It's something to do with how the url is being generated, as I can get back the original number of hits by modifying the code to include double quotes around the search term, thus:
handle = Entrez.esearch(db='pubmed', term='"stem+cell"[ALL]', datetype='pdat', mindate='2012', maxdate='2012')
I'm still interested in knowing what URL is being generated by Biopython, as it'll help me work out how I have to structure the search term when I want to do more complicated searches.
handle = Entrez.esearch(db="pubmed", term="stem+cell[All Fields]",datetype="pdat", mindate="2012", maxdate="2012")
print(handle.url)
You've solved this already (Entrez likes explicit double quoting round combined search terms), but currently the URL generated is not exposed via the API. The simplest trick would be to edit the Bio/Entrez/__init__.py file to add a print statement inside the _open function.
Update: Recent versions of Biopython now save the URL as an attribute of the returned handle, i.e. in this example try doing print(handle.url)
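To get a feel for how a search term ends up in a query string without touching the network, you can encode the parameters yourself with the standard library. This is only an approximation and not Biopython's own code; build_esearch_url is a name I made up, and the parameters are the ones from the question.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(**params):
    """Return an ESearch URL with the parameters percent-encoded."""
    return ESEARCH_URL + "?" + urlencode(params)


url = build_esearch_url(
    db="pubmed",
    term="stem+cell[All Fields]",
    datetype="pdat",
    mindate="2012",
    maxdate="2012",
)
print(url)
```

Here a literal + in the term is encoded as %2B while a space becomes +, which may be one reason a hand-built URL that uses + as a word separator behaves differently from the same string passed as term.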
