Twitter Trending List Printing First Character Instead of First Entry - python

I'm having an issue where I can get the Twitter API to provide me with the top 10 list of trending topics in a given area, but I can only get the entirety to print, or the first character to print, but not the first entry in the list.
The following code is what I tried to just print the first entry in the list (entry 0) but I get the first character for each list entry instead (character 0).
from twitter import *
access_token = "myaccesstoken"
access_token_secret = "myaccesstokensecret"
consumer_key = "consumerkey"
consumer_secret = "consumersecret"
t = Twitter(auth=OAuth(access_token, access_token_secret, consumer_key, consumer_secret))
results = t.trends.place(_id = 2442047)
#I used the Los Angeles WOEID
for location in results:
    for trend in location["trends"]:
        trendlist = trend["name"]
        print trendlist[0]
If I just use a simple list like this, I can get Python to just print the first entry:
trendlist = ['one', 'two', 'three']
print trendlist[0]
Can anyone provide a pointer on why this behavior is different and how to just get one entry to print from the Trending list?
Thank you!

The trends api returns something like this:
"trends": [
    {
        "events": null,
        "name": "#GanaPuntosSi",
        "promoted_content": null,
        "query": "%23GanaPuntosSi",
        "url": "http://twitter.com/search/?q=%23GanaPuntosSi"
    }...]
With your second for loop you iterate through each of the above trend "objects".
trendlist = trend["name"]
doesn't get you a list, but the trend name.
print trendlist[0]
prints out the first letter of the name.
Just print trend["name"] and you are done.
Here's a little repl.it for you https://repl.it/BLww/1. You are printing all 10 because you are looping through them all. If you want to print just the first one, you can do this:
for location in results:
    print location["trends"][0]['name']
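The difference between indexing the list of trends and indexing a single trend name can be seen without calling the API at all. The sketch below uses made-up sample data shaped like the response shown above (the second trend name is invented for illustration), and uses print() calls so that it runs on Python 3:

```python
# Sample data shaped like the trends response above; the values are made up.
results = [
    {
        "trends": [
            {"name": "#GanaPuntosSi", "query": "%23GanaPuntosSi"},
            {"name": "#SecondTrend", "query": "%23SecondTrend"},
        ]
    }
]

trends = results[0]["trends"]

first_name = trends[0]["name"]      # first entry in the list of trends
first_char = trends[0]["name"][0]   # first character of that entry's name
all_names = [trend["name"] for trend in trends]  # an actual list of names

print(first_name)
print(first_char)
print(all_names[0])
```

Indexing with [0] only gives "the first entry" when the object being indexed is a list; on a string it gives the first character, which is what the original loop was doing.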

Related

Tweet Strings via Tweepy

I'm using tweepy to automatically tweet a list of URLs. However, if my list is too long (it can vary from tweet to tweet), the tweet is rejected. Is there any way that tweepy can create a thread of tweets when the content is too long? My tweepy code looks like this:
import tweepy
def get_api(cfg):
    auth = tweepy.OAuthHandler(cfg['consumer_key'],
                               cfg['consumer_secret'])
    auth.set_access_token(cfg['access_token'],
                          cfg['access_token_secret'])
    return tweepy.API(auth)
def main():
    # Fill in the values noted in previous step here
    cfg = {
        "consumer_key" : "VALUE",
        "consumer_secret" : "VALUE",
        "access_token" : "VALUE",
        "access_token_secret" : "VALUE"
    }
    api = get_api(cfg)
    tweet = "Hello, world!"
    status = api.update_status(status=tweet)
    # Yes, tweet is called 'status' rather confusing
if __name__ == "__main__":
    main()
Your code isn't relevant to the problem you're trying to solve. Not only does main() not take any arguments (the tweet text, for example), but you also don't show how you are currently approaching the matter. Consider the following code:
import random
TWEET_MAX_LENGTH = 280
# Sample Tweet Seed
tweet = """I'm using tweepy to automatically tweet a list of URLs. However if my list is too long (it can vary from tweet to tweet) I am not allowed."""
# Creates list of tweets of random length
tweets = []
for _ in range(10):
    tweets.append(tweet * (random.randint(1, 10)))
# Print total initial tweet count and list of lengths for each tweet.
print("Initial Tweet Count:", len(tweets), [len(x) for x in tweets])
# Create a list for formatted tweet texts
to_tweet = []
for tweet in tweets:
    while len(tweet) > TWEET_MAX_LENGTH:
        # Take only first 280 chars
        cut = tweet[:TWEET_MAX_LENGTH]
        # Save as separate tweet to do later
        to_tweet.append(cut)
        # replace the existing 'tweet' variable with remaining chars
        tweet = tweet[TWEET_MAX_LENGTH:]
    # Gets last tweet or those < 280
    to_tweet.append(tweet)
# Print total final tweet count and list of lengths for each tweet
print("Formatted Tweet Count:", len(to_tweet), [len(x) for x in to_tweet])
It's separated out as much as possible for ease-of-interpretation. The gist is that one could start with a list of text to be used as tweets. The variable TWEET_MAX_LENGTH defines where each tweet would be split to allow for multi-tweets.
The to_tweet list would contain each tweet, in the order of your initial list, expanded into multiple tweets of <= TWEET_MAX_LENGTH length strings.
You could feed that list into your actual tweepy function that posts. This approach is fairly crude: it cuts at a fixed character offset, so it can split a URL in half, and it doesn't do anything to maintain the sequence of split tweets. Depending on how you're implementing your final tweet functions, that might be an issue, but it is also a matter for a separate question.
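Since the question is about a list of URLs, one alternative to cutting at a fixed offset is to pack whole URLs into chunks that each fit within the limit. The following is a minimal sketch under that assumption; pack_urls and the example.com URLs are hypothetical names used only for illustration, and it does not model how Twitter itself counts link lengths:

```python
TWEET_MAX_LENGTH = 280

def pack_urls(urls, max_length=TWEET_MAX_LENGTH):
    """Group URLs into space-separated chunks no longer than max_length."""
    chunks = []
    current = ""
    for url in urls:
        # Try to add the next URL to the chunk being built
        candidate = url if not current else current + " " + url
        if len(candidate) <= max_length:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single URL longer than max_length still gets its own chunk
            current = url
    if current:
        chunks.append(current)
    return chunks

# Hypothetical sample input
urls = ["https://example.com/page/%d" % i for i in range(30)]
chunks = pack_urls(urls)
print(len(chunks), [len(c) for c in chunks])
```

Each chunk could then be posted in order. To make the chunks appear as a thread, each post after the first would need to be sent as a reply to the previous one (tweepy's update_status accepts an in_reply_to_status_id argument for this), but that part is left out here because it requires live credentials.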

How do I place multiple searched tweets into string

I have a program that searches tweets based on the hashtag I give it, and I can set how many tweets to search and display, but I can't figure out how to place the searched tweets into a string. This is the code I have so far:
while True:
    for status in tweepy.Cursor(api.search, q=hashtag).items(2):
        tweet = [status.text]
    print tweet
When this is run, it outputs only 1 tweet even though it is set to search for 2.
Your code looks like there's nothing to break out of the while loop. One method that comes to mind is to set a variable to an empty list and then with each tweet, append that to the list.
foo = []
for status in tweepy.Cursor(api.search, q=hashtag).items(2):
    tweet = status.text
    foo.append(tweet)
print foo
Of course, this will print a list. If you want a string instead, use the string join() method. Adjust the last line of code to look like this:
bar = ' '.join(foo)
print bar

Grouping of documents having the same phone number

My database consists of a collection of a large number of hotels (approx. 121,000).
This is what a document in my collection looks like:
{
    "_id" : ObjectId("57bd5108f4733211b61217fa"),
    "autoid" : 1,
    "parentid" : "P01982.01982.110601173548.N2C5",
    "companyname" : "Sheldan Holiday Home",
    "latitude" : 34.169552,
    "longitude" : 77.579315,
    "state" : "JAMMU AND KASHMIR",
    "city" : "LEH Ladakh",
    "pincode" : 194101,
    "phone_search" : "9419179870|253013",
    "address" : "Sheldan Holiday Home|Changspa|Leh Ladakh-194101|LEH Ladakh|JAMMU AND KASHMIR",
    "email" : "",
    "website" : "",
    "national_catidlineage_search" : "/10255012/|/10255031/|/10255037/|/10238369/|/10238380/|/10238373/",
    "area" : "Leh Ladakh",
    "data_city" : "Leh Ladakh"
}
Each document can have 1 or more phone numbers separated by "|" delimiter.
I have to group together documents having same phone number.
By real time, I mean when a user opens up a particular hotel to see its details on the web interface, I should be able to display all the hotels linked to it grouped by common phone numbers.
While grouping, if one hotel links to another and that hotels links to another, then all 3 should be grouped together.
Example : Hotel A has phone numbers 1|2, B has phone numbers 3|4 and C
has phone numbers 2|3, then A, B and C should be grouped together.
from pymongo import MongoClient
from pprint import pprint #Pretty print
import re #for regex
#import unicodedata
client = MongoClient()
cLen = 0
cLenAll = 0
flag = 0
countA = 0
countB = 0
list = []
allHotels = []
conContact = []
conId = []
hotelTotal = []
splitListAll = []
contactChk = []
#We'll be passing the value later as parameter via a function call
#hId = 37443;
regx = re.compile("^Vivanta", re.IGNORECASE)
#Connection
db = client.hotel
collection = db.hotelData
#Finding hotels wrt search input
for post in collection.find({"companyname":regx}):
    list.append(post)
#Copying all hotels in a list
for post1 in collection.find():
    allHotels.append(post1)
hotelIndex = 11 #Index of hotel selected from search result
conIndex = hotelIndex
x = list[hotelIndex]["companyname"] #Name of selected hotel
y = list[hotelIndex]["phone_search"] #Phone numbers of selected hotel
try:
    splitList = y.split("|") #Splitting of phone numbers and storing in a list 'splitList'
except:
    splitList = y
print "Contact details of",x,":"
#Printing all contacts...
for contact in splitList:
    print contact
    conContact.extend(contact)
    cLen = cLen+1
print "No. of contacts in",x,"=",cLen
for i in allHotels:
    yAll = allHotels[countA]["phone_search"]
    try:
        splitListAll.append(yAll.split("|"))
        countA = countA+1
    except:
        splitListAll.append(yAll)
        countA = countA + 1
# print splitListAll
#count = 0
#This block has errors
#Add code to stop when no new links occur and optimize the outer for loop
#for j in allHotels:
for contactAll in splitListAll:
    if contactAll in conContact:
        conContact.extend(contactAll)
        # contactChk = contactAll
        # if (set(conContact) & set(contactChk)):
        # conContact = contactChk
        # contactChk[:] = [] #drop contactChk list
        conId = allHotels[countB]["autoid"]
    countB = countB+1
print "Printing the list of connected hotels..."
for final in collection.find({"autoid":conId}):
    print final
This is the code I wrote in Python. It performs a linear search in a for loop. It currently has some errors, but it should work once they are fixed.
I need an optimized version of this, as linear search has poor time complexity.
I am pretty new to this, so any other suggestions to improve the code are welcome.
Thanks.
The easiest answer to any Python in-memory search question is "use a dict". Looking up a key in a dict takes O(1) time on average, whereas searching a list takes O(N).
Also remember that you can put a Python object into as many dicts (or lists), and as many times into one dict or list, as it takes. They are not copied. It's just a reference.
So the essentials will look like
hotelsbyphone = {}  # maps each phone number to the list of hotels that use it
for hotel in hotels:
    phones = hotel["phone_search"].split("|")
    for phone in phones:
        hotelsbyphone.setdefault(phone,[]).append(hotel)
At the end of this loop, hotelsbyphone["123456"] will be a list of hotel objects which had "123456" as one of their phone_search strings. The key coding feature is the .setdefault(key, []) method which initializes an empty list if the key is not already in the dict, so that you can then append to it.
Once you have built this index, this will be fast
try:
    hotels = hotelsbyphone[x]
    # and process a list of one or more hotels
except KeyError:
    # no hotels exist with that number
    hotels = []
As an alternative to try ... except, you can test if x in hotelsbyphone: before the lookup.
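The index above finds hotels that share one particular number, but the question also asks for transitive links (A shares a number with C, C shares one with B, so all three belong together). One common way to handle that is a union-find (disjoint-set) structure over hotels and phone numbers. This is a sketch of that technique, not part of the answer above; group_by_shared_phone is a hypothetical helper name, and the sample data mirrors the A/B/C example from the question with letters standing in for the autoid values:

```python
def group_by_shared_phone(hotels):
    """Return lists of hotel ids linked, directly or transitively, by a phone number."""
    parent = {}

    def find(x):
        # Register unseen nodes as their own root, then walk up to the root
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for hotel in hotels:
        hotel_node = ("hotel", hotel["autoid"])
        find(hotel_node)  # register hotels that have no phone numbers too
        for phone in hotel["phone_search"].split("|"):
            if phone:
                union(hotel_node, ("phone", phone))

    # Collect the hotel ids under each root
    groups = {}
    for node in list(parent):
        if node[0] == "hotel":
            groups.setdefault(find(node), []).append(node[1])
    return [sorted(ids) for ids in groups.values()]

# Sample data following the example in the question, plus an unlinked hotel D
hotels = [
    {"autoid": "A", "phone_search": "1|2"},
    {"autoid": "B", "phone_search": "3|4"},
    {"autoid": "C", "phone_search": "2|3"},
    {"autoid": "D", "phone_search": "9"},
]
groups = group_by_shared_phone(hotels)
print(sorted(groups))
```

Because the grouping only changes when hotel data changes, the groups could be computed once and stored (for example as a group id on each document), so that opening a hotel page becomes a single lookup rather than a search.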

Google search from Python program

I'm trying to take an input file, read each line, search google with that line and print all the search results from the query ONLY IF the result is from a specific website. A simple example to illustrate my point, if I search dog I only want results printed from wikipedia, whether that be one result or ten results from wikipedia. My problem is I've been getting really weird results. Below is my Python code which contains a specific URL I want results from.
My program
inputFile = open("small.txt", 'r') # Makes File object
outputFile = open("results1.txt", "w")
dictionary = {} # Our "hash table"
compare = "www.someurl.com/" # urls will compare against this string
from googlesearch import GoogleSearch
for line in inputFile.read().splitlines():
    lineToRead = line
    dictionary[lineToRead] = [] #initialized to empty list
    gs = GoogleSearch(lineToRead)
    for url in gs.top_urls():
        print url # check to make sure this is printing URLs
        compare2 = url
        if compare in compare2: #compare the two URLs, if they match
            dictionary[lineToRead].append(url) #write out query string to dictionary key & append EACH url that matches
inputFile.close()
for i in dictionary:
    print i # this print is a test that shows what the query was in google (dictionary key)
    outputFile.write(i+"\n")
    for j in dictionary[i]:
        print j # this print is a test that shows the results from the query which should look like correct URL: "www.medicaldepartmentstore.com/..."(dictionary value(s))
        outputFile.write(j+"\n") #write results for the query string to the output file.
My output file is incorrect, the way it's supposed to be formatted is
query string
http://www.
http://www.
http://www.
query string
http://www.
query string
http://www.medical...
http://www.medical...
Can you limit the scope of the results to the specific site (e.g. wikipedia) at the time of the query? For example, using:
gs = GoogleSearch("site:wikipedia.org %s" % query) #as shown in https://pypi.python.org/pypi/googlesearch/0.7.0
This would instruct Google to return only the results from that domain, so you won't need to filter them after seeing the results.
I think @Cahit has the right idea. The only reason you would be getting lines with just the query string is that the domain you were looking for wasn't in top_urls(). You can verify this by checking whether the list stored in the dictionary for a given key is empty:
for i in dictionary:
    outputFile.write("%s: " % str(i))
    if len(dictionary[i]) == 0:
        outputFile.write("No results in top_urls\n")
    else:
        outputFile.write("%s\n" % ", ".join(dictionary[i]))
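If the filtering is still done after the search, comparing the URL's host name is stricter than the substring test in the question, which also matches when the site name merely appears somewhere in the path or query string. A minimal sketch, assuming Python 3 (in Python 2 the same function lives in the urlparse module); is_from_site and the sample URLs are hypothetical:

```python
from urllib.parse import urlparse

def is_from_site(url, site):
    """True if the URL's host is `site` itself or a subdomain of it."""
    host = urlparse(url).netloc.lower()
    return host == site or host.endswith("." + site)

# Made-up sample results
urls = [
    "http://www.someurl.com/product/1",
    "https://example.org/?ref=www.someurl.com/",
    "http://someurl.com/about",
]
matches = [u for u in urls if is_from_site(u, "someurl.com")]
print(matches)
```

The second URL contains "www.someurl.com/" as text, so a plain substring check would accept it, but its host is example.org and the host comparison rejects it.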

pymongo no output for query

EDIT:
I have somewhat distilled the question.
mongo_documents = mongo_collection.find({"medicalObjectId": "269"})
print "\n\n"
for this_document in mongo_documents:
    print this_document
    print "-------------------------"
pqr = 269
mongo_documents2 = mongo_collection.find({"medicalObjectId": pqr})
print "\n\n"
for this_document2 in mongo_documents2:
    print this_document2
My problem is that the first code chunk, where I use the literal value in the query, works. But the second chunk, where I use the variable, gives no output.
I am a beginner at python and pymongo, so please bear with me.
I have a list as;
row = [1, 2, ...., 100]
I want to query a mongodb collection for each entry in my list.
The collection has the format:
collection = {'pk', 'attribute1', 'attribute2', 'attribute3'}
I want to call the mongodb connection and iterate through each entry in my list with row[i]=pk and return the other attributes as the output.
ie. mongo_documents = mongo_collection.find({'pk' : row[0]})
mongo_documents = mongo_collection.find({'pk' : row[1]})
and so on.
The code that I have is:
for row in result_set:
    print row[0]
    mongo_documents = mongo_collection.find({'medicalObjectId' : row[0]})
    print mongo_documents
    for this_document in mongo_documents:
        print "----------------------------------"
        print this_document
However, I get no output. Where am I going wrong?
If I print mongo_documents, I get
<pymongo.cursor.Cursor object at 0xe43150>
You could use MongoDB's $in operator to fetch all the rows at once and iterate through them. Since each entry of result_set is a row whose first element is the id, collect those ids first:
ids = [row[0] for row in result_set]
mongo_documents = mongo_collection.find({ 'medicalObjectId' : { '$in' : ids } })
for doc in mongo_documents:
    print doc
I have not tested it; comment below if it doesn't work.
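EDIT
The first query matches because the value is passed as the string "269", which suggests that medicalObjectId is stored as a string. A query with the number 269 will not match a string value, so convert the number with str() before querying:
mongo_documents2 = mongo_collection.find({"medicalObjectId": str(pqr)})
print "\n\n"
for this_document2 in mongo_documents2:
    print this_document2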
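The type mismatch can be illustrated without a database. The sketch below is only an in-memory stand-in for an equality query, not pymongo itself; the documents list and the find helper are made up for illustration, and it uses print() calls so that it runs on Python 3:

```python
# Made-up documents: one id stored as a string, one stored as a number
documents = [
    {"medicalObjectId": "269", "name": "stored as a string"},
    {"medicalObjectId": 270, "name": "stored as a number"},
]

def find(docs, field, value):
    """Tiny stand-in for an equality query: keep documents whose field equals value."""
    return [d for d in docs if d.get(field) == value]

pqr = 269
print(find(documents, "medicalObjectId", pqr))       # the number does not equal "269"
print(find(documents, "medicalObjectId", str(pqr)))  # the string does
```

The same reasoning applies to the loop over result_set: if row[0] comes back as a number while the collection stores strings, each value would need str() applied before it is used in the query.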
