I'm using the following code:
def recentchanges(bot=False,rclimit=20):
    """
    #description: Gets the last 20 pages edited on the recent changes and who the user who edited it
    """
    recent_changes_data = {
        'action':'query',
        'list':'recentchanges',
        'rcprop':'user|title',
        'rclimit':rclimit,
        'format':'json'
    }
    if bot is False:
        recent_changes_data['rcshow'] = '!bot'
    else:
        pass
    data = urllib.urlencode(recent_changes_data)
    response = opener.open('http://runescape.wikia.com/api.php',data)
    content = json.load(response)
    pages = tuple(content['query']['recentchanges'])
    for title in pages:
        return title['title']
When I do recentchanges() I only get one result. If I print it though, all the pages are printed.
Am I just misunderstanding or is this something relating to python?
Also, opener is:
cj = CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
Once a return statement is reached in a function, that function's execution ends, so a second return never gets executed. To return all the values, you need to pack them in a list or tuple:
...
returnList = [title['title'] for title in pages]
return returnList
This uses a list comprehension to make a list of all the objects you want the function to return, and then returns it.
Then you can unpack individual results from the returned list:
answerList = recentchanges()
for element in answerList:
    print element
The problem you are having is that a function ends at the first return line it sees.
So, in the lines
for title in pages:
    return title['title']
It returns only the first value: pages[0]['title'].
One way around this is to use a list-comprehension i.e.
return [ title['title'] for title in pages ]
Another option is to make recentchanges a generator and use yield.
for title in pages:
    yield title['title']
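To make the difference concrete, here is a minimal sketch that runs without the wiki API. The pages list is made-up sample data standing in for content['query']['recentchanges'], and the three function names are hypothetical, chosen only for this illustration.

```python
# Sample data standing in for content['query']['recentchanges'] (made up for illustration).
pages = [
    {'user': 'A', 'title': 'Abyssal whip'},
    {'user': 'B', 'title': 'Dragon dagger'},
    {'user': 'C', 'title': 'Rune platebody'},
]

def first_title_only(pages):
    # return inside the loop: the function exits during the first iteration
    for page in pages:
        return page['title']

def titles_as_list(pages):
    # build the whole list, then return once
    return [page['title'] for page in pages]

def titles_as_generator(pages):
    # yield hands back one title at a time and resumes where it left off
    for page in pages:
        yield page['title']

print(first_title_only(pages))
print(titles_as_list(pages))
print(list(titles_as_generator(pages)))
```

The list version is simplest when the caller needs every title at once; the generator version avoids building the list when the caller only iterates.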
return ends the function. So the loop only executes once, because you're returning in the loop. Think about it: how would the caller get subsequent values once the first value has been returned? Would they have to call the function again? But that would start it over again. Should Python wait until the loop is complete to return all the values at once? But where would they go and how would Python know to do this?
You could turn the function into a generator here by yielding each title rather than returning it. You could also just return a generator expression:
return (page['title'] for page in pages)
Either way, the caller can then convert it to a list if desired, or iterate over it directly:
titles = list(recentchanges())
# or
for title in recentchanges():
    print title
Alternatively, you can just return the list of titles:
return [page['title'] for page in pages]
Since you use return, your function ends after returning the first value.
There are two alternatives:
you can append the titles to a list and return that, or
you can use yield instead of return to turn your function into a generator.
The latter is probably more pythonic, because you could then use it like this:
for title in recentchanges():
    # do something with the title
    pass
When I use the print function, it returns None.
When I replace it with return, some of the data is deleted.
def email_list(domains):
    for domain in domains:
        for user in domains[domain]:
            return("{}#{}".format(user,domain))
print(email_list({"gmail.com": ["clark.kent", "diana.prince", "peter.parker"], "yahoo.com": ["barbara.gordon", "jean.grey"], "hotmail.com": ["bruce.wayne"]}))
You are probably looking for this -
def email_list(domains):
    lst = []
    for domain in domains:
        for user in domains[domain]:
            lst.append("{}#{}".format(user,domain))
    return lst
print(email_list({"gmail.com": ["clark.kent", "diana.prince", "peter.parker"], "yahoo.com": ["barbara.gordon", "jean.grey"], "hotmail.com": ["bruce.wayne"]}))
OUTPUT :
['clark.kent#gmail.com', 'diana.prince#gmail.com', 'peter.parker#gmail.com', 'barbara.gordon#yahoo.com', 'jean.grey#yahoo.com', 'bruce.wayne#hotmail.com']
In your code, the function ends during the first iteration. What you want is to create a list first, fill it inside the loops, and return it only after the looping is done. (The version that used print returned None because a function without a return statement returns None.)
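The same accumulation can also be sketched as a nested list comprehension. This is only an alternative spelling of the loop, not something the asker's code requires, and the shortened input dictionary is sample data.

```python
def email_list(domains):
    # One pass over every (domain, users) pair; nothing is returned until the list is complete.
    return ["{}#{}".format(user, domain)
            for domain, users in domains.items()
            for user in users]

addresses = email_list({"gmail.com": ["clark.kent"], "yahoo.com": ["jean.grey"]})
print(addresses)
```

The order of the two for clauses mirrors the nesting of the original loops: the outer loop comes first.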
Hey guys I need a bit of guidance with this problem ( .py noobie)
So I have a list of websites that have different status codes:
url_list=["http://www.ehow.com/foo-barhow_2323550_clean-coffee-maker-vinegar.html",
"http://www.google.com",
"http://livestrong.com/register/confirmation/",
"http://www.facebook.com",
"http://www.youtube.com"]
What I'm trying to return is a dictionary that has each status code as a key and the associated websites as values. Something like this:
result= {"200": ["http://www.google.com",
"http://www.facebook.com",
"http://www.youtube.com"],
"301": ["http://livestrong.com/register/confirmation/"],
"404": ["http://www.ehow.com/foo-barhow_2323550_clean-coffee-maker-vinegar.html"]}
What I have till now:
Function that gets the status code:
def code_number(url):
    try:
        u = urllib2.urlopen(url)
        code = u.code
    except urllib2.HTTPError, e:
        code = e.code
    return code
And a function that should return the dictionary but is not working; this is the part where I got stuck. Basically I don't know how to make it store more than one URL under the same status code.
result={}
def get_code(list_of_urls):
    for n in list_of_urls:
        code = code_number(n)
        if n in result:
            result[code] = n
        else:
            result[code] = n
    return result
Any ideas please?! Thank you
collections.defaultdict makes this a breeze:
import collections
def get_code(list_of_urls):
    result = collections.defaultdict(list)
    for n in list_of_urls:
        code = code_number(n)
        result[code].append(n)
    return result
Not sure why you had result as a global, since it's returned as the function's result anyway (avoid globals except when really indispensable... locals are not only a structurally better approach, but also faster to access).
Anyway, the collections.defaultdict instance result will automatically call the list argument, and thus make an empty list, to initialize any entry result[code] that wasn't yet there at the time of indexing; so you can just append to the entry without needing to check whether it was previously there or not. That is the super-convenient idea!
If for some reason you want a plain dict as a result (though I can't think of any sound reason for needing that), just return dict(result) to convert the defaultdict into a plain dict.
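Here is a small sketch of that approach that runs offline. FAKE_CODES and this version of code_number are hypothetical stand-ins (a fixed lookup instead of real HTTP requests), and the last URL is invented to show the fallback.

```python
import collections

# Hypothetical stand-in for code_number(): a fixed lookup instead of real HTTP requests.
FAKE_CODES = {
    "http://www.google.com": 200,
    "http://www.facebook.com": 200,
    "http://livestrong.com/register/confirmation/": 301,
}

def code_number(url):
    return FAKE_CODES.get(url, 404)

def get_code(list_of_urls):
    result = collections.defaultdict(list)
    for url in list_of_urls:
        # A missing key is created as [] on first access, so append always works.
        result[code_number(url)].append(url)
    return dict(result)  # converted to a plain dict only so it prints compactly

codes = get_code(["http://www.google.com",
                  "http://livestrong.com/register/confirmation/",
                  "http://www.facebook.com",
                  "http://example.invalid/missing"])
print(codes)
```

Note that the keys here are integers; if string keys such as "200" are wanted, as in the question's sample output, index with str(code_number(url)) instead.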
You could initialize every key of the dict with a list, to which you will append any websites that return the same status code. Example:
result={}
def get_code(list_of_urls):
    for n in list_of_urls:
        code = code_number(n)
        if code in result:
            result[code].append(n)
        else:
            result[code] = [n]
    return result
I also think that the condition should be if code in result, since your keys are the return codes.
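If the if/else feels verbose, dict.setdefault does the same check in one call. This is a different technique from the one shown in this answer; group_by_code and its (code, url) pairs are made up for the example.

```python
def group_by_code(pairs):
    # pairs: iterable of (status_code, url) tuples, made up for this example
    result = {}
    for code, url in pairs:
        # setdefault returns the existing list for code, or stores and returns a new []
        result.setdefault(code, []).append(url)
    return result

grouped = group_by_code([(200, "a"), (404, "b"), (200, "c")])
print(grouped)
```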
In web2py I have been trying to break down this list comprehension so I can do what I like with the categories it creates. Any ideas as to what this breaks down to?
def menu_rec(items):
    return [(x.title,None,URL('shop', 'category',args=pretty_url(x.id, x.slug)),menu_rec(x.children)) for x in items or []]
In addition the following is what uses it:
response.menu = [(SPAN('Catalog', _class='highlighted'), False, '',
menu_rec(db(db.category).select().as_trees()) )]
So far I've come up with:
def menu_rec(items):
    for x in items:
        return (x.title,None,URL('shop', 'category',args=pretty_url(x.id, x.slug)),menu_rec(x.children))
I've got other variations of this but, every variation only gives me back 1(one) category, when compared to the original that gives me all the categories.
Can anyone see where I'm messing this up at? Any and all help is appreciated, thank you.
A list comprehension builds a list by appending:
def menu_rec(items):
    result = []
    for x in items or []:
        url = URL('shop', 'category', args=pretty_url(x.id, x.slug))
        menu = menu_rec(x.children)  # recursive call
        result.append((x.title, None, url, menu))
    return result
I've added two local variables to break up the long line somewhat, and to show how it recursively calls itself.
Your version returned directly out of the for loop, during the first iteration, and never built up a list.
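To try the expanded loop outside web2py, here is a sketch with stand-ins. Category and fake_url are hypothetical: they only imitate the attributes the loop touches (id, slug, title, children) and the URL(...) call, and are not web2py APIs.

```python
import collections

# Hypothetical stand-in for the category rows; the real web2py objects have more fields.
Category = collections.namedtuple('Category', 'id slug title children')

def fake_url(cat):
    # Stand-in for URL('shop', 'category', args=pretty_url(cat.id, cat.slug))
    return '/shop/category/%s-%s' % (cat.id, cat.slug)

def menu_rec(items):
    result = []
    for x in items or []:          # "or []" turns None into an empty list
        result.append((x.title, None, fake_url(x), menu_rec(x.children)))
    return result                  # returned once, after the loop has finished

tree = [
    Category(1, 'tools', 'Tools', [Category(2, 'saws', 'Saws', None)]),
    Category(3, 'paint', 'Paint', None),
]
menu = menu_rec(tree)
print(menu)
```

Because each category is now built inside an ordinary loop, you can modify or filter an entry before appending it.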
You don't want to do return. Instead append to a list and then return the list:
def menu_rec(items):
    result = []
    for x in items or []:
        result.append((x.title,None,URL('shop', 'category',args=pretty_url(x.id, x.slug)),menu_rec(x.children)))
    return result
If you return inside the loop, the function returns after only the first iteration. Instead, keep appending to a list and return that list at the end. This ensures the result list is returned only once all the values have been added, rather than returning a single value.
I am trying to parse a JSON page in Python, it is contained in a variable reddit_front.
I am trying to get the sum of all "ups" in this page. I do have the right answer which is the following:
def total_ups():
    j=json.loads(reddit_front)
    return sum(c["data"]["ups"] for c in j["data"]["children"])
However why does the following loop give me only 1 item and not iterate over?
def total_ups():
    j=json.loads(reddit_front)
    for c in j["data"]["children"]:
        i = c["data"]["ups"]
        a = +i
        return a
PS: ok, all good points and thx for the negative reputations points, it's fair call I wasn't precise in my question.
return stops the loop (and the function). Try appending each value to a list instead; then you can join the values, sum them, or do whatever else you need with the data.
Example:
def total_ups():
    a = list()
    j=json.loads(reddit_front)
    for c in j["data"]["children"]:
        i = c["data"]["ups"]
        a.append(i)
    return a
print(total_ups()) # prints the list
print(", ".join(str(u) for u in total_ups())) # prints a string separated by commas
Maybe...
def total_ups():
    children = json.loads(reddit_front)["data"]["children"]
    return sum(c["data"]["ups"] for c in children)
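For completeness, the explicit loop from the question can be made to work too. It has two separate problems: the return sits inside the loop, and a = +i assigns (unary plus) rather than adding, where a += i was presumably intended. The sketch below fixes both; the reddit_front string is a made-up miniature of the real JSON.

```python
import json

# Made-up miniature of the reddit front-page JSON; the real page has many more fields.
reddit_front = '{"data": {"children": [{"data": {"ups": 10}}, {"data": {"ups": 5}}, {"data": {"ups": 7}}]}}'

def total_ups():
    total = 0                      # start the running total before the loop
    for child in json.loads(reddit_front)["data"]["children"]:
        total += child["data"]["ups"]   # += adds; "a = +i" would just overwrite a with i
    return total                   # dedented: runs once, after the loop

print(total_ups())
```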
This may seem like the world's simplest Python question... But I'm going to give explaining it a go.
Basically I have to loop through pages of JSON results from a query.
The standard result is this:
{'result': [{result 1}, {result 2}], 'next_page': '2'}
I need the loop to keep going, appending the list under the result key to a variable that can later be accessed so the results in it can be counted. However, it should loop only while next_page exists, because once there are no more pages the next_page key is dropped from the dict.
Currently I have this:
next_page = True
while next_page == True:
    try:
        next_page_result = get_results['next_page'] # this gets the next page
        next_url = urllib2.urlopen("http://search.twitter.com/search.json" + next_page_result)# this opens the next page
        json_loop = simplejson.load(next_url) # this puts the results into json
        new_result = result.append(json_loop['results']) # this grabs the result and "should" put it into the list
    except KeyError:
        next_page = False
        result_count = len(new_result)
Alternate (cleaner) approach, making one big list:
results = []
res = { "next_page": "magic_token_to_get_first_page" }
while "next_page" in res:
    fp = urllib2.urlopen("http://search.twitter.com/search.json" + res["next_page"])
    res = simplejson.load(fp)
    fp.close()
    results.extend(res["results"])
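The loop above can be exercised without network access. In this sketch FAKE_PAGES is invented data and fetch_page is a hypothetical stand-in for the urlopen plus simplejson.load pair; only the loop structure is taken from the answer.

```python
# Made-up pages keyed by query string; the last one has no 'next_page' key.
FAKE_PAGES = {
    "?page=1": {"results": [{"id": 1}, {"id": 2}], "next_page": "?page=2"},
    "?page=2": {"results": [{"id": 3}], "next_page": "?page=3"},
    "?page=3": {"results": [{"id": 4}, {"id": 5}]},
}

def fetch_page(query):
    # Hypothetical stand-in for urlopen(...) followed by simplejson.load(...)
    return FAKE_PAGES[query]

def collect_results(first_query):
    results = []
    res = {"next_page": first_query}
    while "next_page" in res:           # stops once the key is dropped
        res = fetch_page(res["next_page"])
        results.extend(res["results"])  # extend keeps one flat list of results
    return results

all_results = collect_results("?page=1")
print(len(all_results))
```

Testing for the key with in avoids the try/except KeyError bookkeeping and the separate next_page flag.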
new_result = result.append(json_loop['results'])
The list is appended as a side-effect of the method call.
append() actually returns None, so new_result is now a reference to None.
You want to use
result.append(json_loop['results']) # this grabs the result and "should" put it into the list
new_result = result
if you insist on doing it that way. As Bastien said, result.append(whatever) == None
AFAICS, you don't need the variable new_result at all.
result_count = len(result)
will give you the answer you need.
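A quick illustration of both points, using throwaway lists. It also shows a detail worth checking in the original code: append adds each page as one nested element, so len counts pages, while extend adds the individual results.

```python
result = []
returned = result.append([1, 2])   # append mutates result and returns None
print(returned is None)

result.append([3])                 # append adds the page as ONE nested element
print(len(result))                 # counts pages, not individual results

flat = []
flat.extend([1, 2])                # extend adds each item of the page
flat.extend([3])
print(len(flat))                   # counts individual results
```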
You cannot append to a dict, but you can append to the list inside your dict, like this:
result['result'].append(json_loop['results'])
If you want to check whether the next_page value in your result dict is empty and delete the key from the dict, do it like this:
if not result['next_page']:
    del result['next_page']
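One caveat: indexing result['next_page'] raises KeyError when the key is already gone, which is exactly the situation on the last page. A sketch of checks that do not raise, using a made-up page dict:

```python
page = {"results": [1, 2]}             # made-up last page: 'next_page' was dropped

has_next = "next_page" in page         # membership test, never raises
next_query = page.get("next_page")     # None when the key is missing
page.pop("next_page", None)            # deletes the key if present, otherwise does nothing
print(has_next)
print(next_query)
```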