I am trying to write a GitHub webhook application.
At one point I need to access a URL to get a JSON object that looks like the one in the picture below, and then I want to convert the "comment" JSON into a dictionary, so I can easily look up keys and values.
But the return value from "json.loads(requests.get(comment_url).text)" is a list, not a dictionary, so I get a runtime error when I try to use it like a dictionary.
I googled a lot on this topic and also tried a lot of suggestions recommended on Stack Overflow, but haven't found anything yet.
Just wondering, does anyone know how to turn the return value from a list into a dictionary? Also, I thought json.loads should return a dictionary instead of a list; have I done anything wrong?
comment_url = "https://api.github.com/repos/octocat/Hello-World/pulls/comments/1"
pull_request_comment = json.loads(requests.get(comment_url).text)

polite_words = ("please", "appreciate", "would be great")
for word in polite_words:  # loop through each of the polite words
    if word[0] in pull_request_comment["body"] or word[1] == pull_request_comment["body"] or word[2] == pull_request_comment["body"]:
        print("You are polite!!!")
        return 'ok'
    else:
        print("You are impolite!!")
        return 'skipped'
Error
if word[0] in pull_request_comment["body"] or word[1]==pull_request_comment["body"] or word[2]==pull_request_comment["body"]:
TypeError: list indices must be integers or slices, not str
Sample content from the URL:
So, a couple of things here.
The GitHub API will return JSON. Judging from the possible responses documented here, it will not provide a list but a dict (after conversion from JSON to a Python object). I think flakes was correct in his comment.
As for your main issue, I think it has to do with the fact that you use word[0] and word[1]. You are slicing a simple string there: because of the for loop, each word variable already contains one of the elements from polite_words, so there's no need to do any slicing at all.
Take a look at the logic below; it will trigger on any comment that contains one of the strings in the polite_words variable.
Also note that if you need the JSON from a request made with the requests library, you can just use response.json(). See the assertion in the logic below. Do note that I'm doing the simplest possible assertion to determine that the JSON values are the same.
import json
import requests
comment_url = "https://api.github.com/repos/octocat/Hello-World/pulls/comments/1"
resp = requests.get(comment_url)
text = resp.text
js = resp.json()
print(type(text), text)
print(type(js), js)
pull_request_comment = json.loads(text)
print(type(pull_request_comment), pull_request_comment)
# assert if the values from js and pull_request_comment are the same
assert json.dumps(js, sort_keys=True) == json.dumps(pull_request_comment, sort_keys=True)
polite_words = ("please", "appreciate", "would be great")
# safety check to see if we actually got a body
if "body" in pull_request_comment:
    polite = False
    for word in polite_words:  # loop through each of the polite words
        if word in pull_request_comment["body"]:
            polite = True
            break  # no need for further processing
    if polite:
        print("You are polite!!!")
    else:
        print("You are impolite!!!")
else:
    print(f"no body found in response:\n{text}")
EDIT
So, after turning the processing of a single comment into a function, and adding a simple check on whether you got a list or a dict (after conversion) from the API, you can call the function accordingly.
See the adjusted code below...
import json
import requests

polite_words = ("please", "appreciate", "would be great")

def process_comment(comment_dict: dict):
    # safety check to see if we actually got a body
    if "body" in comment_dict:
        polite = False
        for word in polite_words:  # loop through each of the polite words
            if word in comment_dict["body"]:
                polite = True
                break  # no need for further processing
        if polite:
            return "You are polite!!!"
        else:
            return "You are impolite!!!"
    else:
        return f"no body found in comment:\n{comment_dict}"

comment_url = "https://api.github.com/repos/octocat/Hello-World/pulls/comments/1"
resp_json = requests.get(comment_url).json()

results = []
if isinstance(resp_json, dict):
    results.append(process_comment(resp_json))
elif isinstance(resp_json, list):
    results.extend([process_comment(elem) for elem in resp_json])
else:
    print(f"unexpected return from API:\n{resp_json}")
print(results)
To actually answer your question about how JSON is converted: it depends. Both a list and a dict are valid top-level JSON values. All the library does is build Python objects from JSON representations. As to what actually counts as valid JSON, RFC 8259 has the full spec.
In that regard, this works:
huh = json.loads('"value"')
print(huh)
output
value
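To make that concrete, here is a minimal sketch showing that json.loads can hand back any of several Python types, depending entirely on the top-level JSON value in the input string:

```python
import json

# The top-level JSON value determines the Python type you get back.
print(type(json.loads('{"a": 1}')))   # <class 'dict'>
print(type(json.loads('[1, 2, 3]')))  # <class 'list'>
print(type(json.loads('"value"')))    # <class 'str'>
print(type(json.loads('42')))         # <class 'int'>
```

So there is nothing wrong with json.loads returning a list; the API simply sent a JSON array.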
This code gets the values from a json.loads call. It gives me a list of dictionaries that are organized by dates. The code works; my understanding is that I am taking the first value in the list of dictionaries, i.e. the first dictionary. But shouldn't self.get_jsonparsed_data(self.ticker_text.get().upper())[0] work as well? In my case it doesn't, and I was hoping someone could explain why it does not work.
def get_jsonparsed_data(self, ticker):
    # quote
    url = f"https://financialmodelingprep.com/api/v3/quote/{ticker}?"
    response = urlopen(url)
    data = response.read().decode("utf-8")
    return json.loads(data)

def search_info(self):
    #self.info.delete(0, END)
    recent_filing = []
    for header in self.get_jsonparsed_data(self.ticker_text.get().upper())[:1]:
        recent_filing.append(header)
    ticker = self.ticker_text.get()
    # output dictionary values with proper format
    try:
        recent_filing_dict = recent_filing[0]
This works.
I get the first dictionary, which is what I want, but when I use self.get_jsonparsed_data(self.ticker_text.get().upper())[0] instead of self.get_jsonparsed_data(self.ticker_text.get().upper())[:1], it gives me an error which pretty much says there aren't any values appended to recent_filing_dict. I was just hoping someone could explain why.
A for loop iterates over an iterable. self.get_jsonparsed_data(self.ticker_text.get().upper())[0] returns a single item (the first dict) rather than the list, while self.get_jsonparsed_data(self.ticker_text.get().upper())[:1] returns an iterable (a single-item list), which the for loop then iterates over. Note that looping over the dict itself iterates over its keys, so you end up appending key strings instead of the dict.
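A minimal sketch, with hypothetical data standing in for the API response, makes the difference between slicing and indexing concrete:

```python
data = [{"symbol": "AAPL", "price": 150.0}]  # hypothetical API response

# Slicing keeps the outer list, so the loop yields the dict itself.
sliced = []
for header in data[:1]:
    sliced.append(header)
print(sliced)  # [{'symbol': 'AAPL', 'price': 150.0}]

# Indexing returns the dict, so the loop iterates over its KEYS.
indexed = []
for header in data[0]:
    indexed.append(header)
print(indexed)  # ['symbol', 'price']
```

In the second case, recent_filing[0] would be the string "symbol", not a dictionary, which is why the later dictionary lookups fail.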
I am asking an ElasticSearch database to provide me with a list of indices and their creation dates using Python 2.7 and the Requests package. The idea is to quickly calculate which indices have exceeded the retention policy and need to be put to sleep.
The request works perfectly and the results are exactly what I want. However, when I run the code below and try to convert the JSON result to a dict, the type of theDict is correct, but it reports a size of 1 when there should be at least a couple dozen entries. What am I doing wrong? I have a feeling it's something really dumb, but I just can't snag it! :)
import json
import requests
esEndPoint = "https://localhost:9200"
retrieveString = "/_cat/indices?h=index,creation.date.string&format=json&s=creation.date"
# Gets the current indices and their creation dates
def retrieveIndicesAndDates():
    try:
        theResult = requests.get(esEndPoint + retrieveString)
        print(theResult.content)
    except Exception as e:
        print("Unable to retrieve list of indices with creation dates.")
        print("Error: " + str(e))
        exit(3)
    return theResult.content

def main():
    theDict = dict(json.loads(retrieveIndicesAndDates()))
    print(type(theDict))  # Reports correct type
    print(len(theDict))   # Always outputs "1" ??
    for index, creationdate in theDict.items():
        print("Index: ", index, ", Creation date: ", theDict[index])
    return
The json the call returns:
[{"index":".kibana","creation.date.string":"2017-09-14T15:01:38.611Z"},{"index":"logstash-2018.07.23","creation.date.string":"2018-07-23T00:00:01.024Z"},{"index":"cwl-2018.07.23","creation.date.string":"2018-07-23T00:00:03.877Z"},{"index":"k8s-testing-internet-2018.07.23","creation.date.string":"2018-07-23T14:19:10.024Z"},{"index":"logstash-2018.07.24","creation.date.string":"2018-07-24T00:00:01.023Z"},{"index":"k8s-testing-internet-2018.07.24","creation.date.string":"2018-07-24T00:00:01.275Z"},{"index":"cwl-2018.07.24","creation.date.string":"2018-07-24T00:00:02.157Z"},{"index":"k8s-testing-internet-2018.07.25","creation.date.string":"2018-07-25T00:00:01.022Z"},{"index":"logstash-2018.07.25","creation.date.string":"2018-07-25T00:00:01.186Z"},{"index":"cwl-2018.07.25","creation.date.string":"2018-07-25T00:00:04.012Z"},{"index":"logstash-2018.07.26","creation.date.string":"2018-07-26T00:00:01.026Z"},{"index":"k8s-testing-internet-2018.07.26","creation.date.string":"2018-07-26T00:00:01.185Z"},{"index":"cwl-2018.07.26","creation.date.string":"2018-07-26T00:00:02.587Z"},{"index":"k8s-testing-internet-2018.07.27","creation.date.string":"2018-07-27T00:00:01.027Z"},{"index":"logstash-2018.07.27","creation.date.string":"2018-07-27T00:00:01.144Z"},{"index":"cwl-2018.07.27","creation.date.string":"2018-07-27T00:00:04.485Z"},{"index":"ctl-2018.07.27","creation.date.string":"2018-07-27T09:02:09.854Z"},{"index":"cfl-2018.07.27","creation.date.string":"2018-07-27T11:12:44.681Z"},{"index":"elb-2018.07.27","creation.date.string":"2018-07-27T11:13:51.340Z"},{"index":"cfl-2018.07.24","creation.date.string":"2018-07-27T11:45:23.697Z"},{"index":"cfl-2018.07.23","creation.date.string":"2018-07-27T11:45:24.646Z"},{"index":"cfl-2018.07.25","creation.date.string":"2018-07-27T11:45:25.700Z"},{"index":"cfl-2018.07.26","creation.date.string":"2018-07-27T11:45:26.341Z"},{"index":"elb-2018.07.24","creation.date.string":"2018-07-27T11:45:27.440Z"},{"index":"elb-2018.07.25","creation.date.string"
:"2018-07-27T11:45:29.572Z"},{"index":"elb-2018.07.26","creation.date.string":"2018-07-27T11:45:36.170Z"},{"index":"logstash-2018.07.28","creation.date.string":"2018-07-28T00:00:01.023Z"},{"index":"k8s-testing-internet-2018.07.28","creation.date.string":"2018-07-28T00:00:01.316Z"},{"index":"cwl-2018.07.28","creation.date.string":"2018-07-28T00:00:03.945Z"},{"index":"elb-2018.07.28","creation.date.string":"2018-07-28T00:00:53.992Z"},{"index":"ctl-2018.07.28","creation.date.string":"2018-07-28T00:07:19.543Z"},{"index":"k8s-testing-internet-2018.07.29","creation.date.string":"2018-07-29T00:00:01.026Z"},{"index":"logstash-2018.07.29","creation.date.string":"2018-07-29T00:00:01.378Z"},{"index":"cwl-2018.07.29","creation.date.string":"2018-07-29T00:00:04.100Z"},{"index":"elb-2018.07.29","creation.date.string":"2018-07-29T00:00:59.241Z"},{"index":"ctl-2018.07.29","creation.date.string":"2018-07-29T00:06:44.199Z"},{"index":"logstash-2018.07.30","creation.date.string":"2018-07-30T00:00:01.024Z"},{"index":"k8s-testing-internet-2018.07.30","creation.date.string":"2018-07-30T00:00:01.179Z"},{"index":"cwl-2018.07.30","creation.date.string":"2018-07-30T00:00:04.417Z"},{"index":"elb-2018.07.30","creation.date.string":"2018-07-30T00:01:01.442Z"},{"index":"ctl-2018.07.30","creation.date.string":"2018-07-30T00:08:28.936Z"},{"index":"cfl-2018.07.30","creation.date.string":"2018-07-30T06:52:16.739Z"}]
Your error is trying to convert a list of dicts to a dict:
theDict = dict(json.loads(retrieveIndicesAndDates()))
# ^^^^^ ^
That would only work for a dict of lists. It would be redundant, though.
Just use the reply directly. Each entry is a dict with the appropriate keys:
data = json.loads(retrieveIndicesAndDates())
for entry in data:
    print("Index: ", entry["index"], ", Creation date: ", entry["creation.date.string"])
So what happens when you do convert that list to a dict? Why is there just one entry?
The dict understands three initialisation methods: keywords, mappings and iterables. A list fits the last one.
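For reference, a quick sketch of those three initialisation forms, all producing the same dict:

```python
# Keyword arguments, a mapping, and an iterable of key-value pairs
# are the three ways the dict constructor accepts data.
print(dict(a=1, b=2))                  # from keywords
print(dict({"a": 1, "b": 2}))          # from a mapping
print(dict([("a", 1), ("b", 2)]))      # from an iterable of pairs
```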
Initialisation from an iterable goes through it and expects key-value iterables as elements. If one were to do it manually, it would look like this:
def sequence2dict(sequence):
    map = {}
    for element in sequence:
        key, value = element
        map[key] = value
    return map
Notice how each element is unpacked via iteration? In the reply each element is a dict with two entries. Iteration on that yields the two keys but ignores the values.
key, value = {"index":".kibana","creation.date.string":"2017-09-14T15:01:38.611Z"}
print(key, '=>', value) # prints "index => creation.date.string"
To the dict constructor, every element in the reply has the same key-value pair: "index" and "creation.date.string". Since keys in a dict are unique, all elements collapse to the same entry: {"index": "creation.date.string"}.
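If you do want an index-to-date dict, build it explicitly with a comprehension that pulls the key and value out of each element, instead of letting the dict constructor unpack them. A sketch using two entries from the reply above:

```python
import json

# Two entries taken from the API reply shown earlier.
reply = json.loads(
    '[{"index": ".kibana", "creation.date.string": "2017-09-14T15:01:38.611Z"},'
    ' {"index": "logstash-2018.07.23", "creation.date.string": "2018-07-23T00:00:01.024Z"}]'
)

# One dict entry per index, keyed on the index name.
dates_by_index = {entry["index"]: entry["creation.date.string"] for entry in reply}
print(len(dates_by_index))  # 2, one entry per index
```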
Hey guys, I need a bit of guidance with this problem (Python noobie).
So I have a list of websites that have different status codes:
url_list=["http://www.ehow.com/foo-barhow_2323550_clean-coffee-maker-vinegar.html",
"http://www.google.com",
"http://livestrong.com/register/confirmation/",
"http://www.facebook.com",
"http://www.youtube.com"]
What I'm trying to return is a dictionary with the website's status code as key and the associated websites as values. Something like this:
result= {"200": ["http://www.google.com",
"http://www.facebook.com",
"http://www.youtube.com"],
"301": ["http://livestrong.com/register/confirmation/"],
"404": ["http://www.ehow.com/foo-barhow_2323550_clean-coffee-maker-vinegar.html"]}
What I have till now:
Function that gets the status code:
def code_number(url):
    try:
        u = urllib2.urlopen(url)
        code = u.code
    except urllib2.HTTPError, e:
        code = e.code
    return code
And a function that should return the dictionary, but it is not working; this is the part where I got stuck. Basically, I don't know how to insert more than one URL under the same status code.
result = {}
def get_code(list_of_urls):
    for n in list_of_urls:
        code = code_number(n)
        if n in result:
            result[code] = n
        else:
            result[code] = n
    return result
Any ideas please?! Thank you
collections.defaultdict makes this a breeze:
import collections

def get_code(list_of_urls):
    result = collections.defaultdict(list)
    for n in list_of_urls:
        code = code_number(n)
        result[code].append(n)
    return result
Not sure why you had result as a global, since it's returned as the function's result anyway (avoid globals except when really indispensable; locals are not only a structurally better approach, but also faster to access).
Anyway, the collections.defaultdict instance result will automatically call its list argument, and thus make an empty list, to initialize any entry result[code] that wasn't yet there at the time of indexing; so you can just append to the entry without needing to check whether it was previously there. That is the super-convenient idea!
If for some reason you want a plain dict as a result (though I can't think of a sound reason for needing that), just return dict(result) to convert the defaultdict into a plain dict.
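A quick self-contained sketch of that behavior, using made-up (url, code) pairs instead of live HTTP calls:

```python
import collections

# Hypothetical (url, code) pairs standing in for code_number() results.
fake_results = [
    ("http://a.example", 200),
    ("http://b.example", 200),
    ("http://c.example", 404),
]

result = collections.defaultdict(list)
for url, code in fake_results:
    result[code].append(url)  # a missing key gets a fresh [] automatically

print(dict(result))  # {200: ['http://a.example', 'http://b.example'], 404: ['http://c.example']}
```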
You could initialize every key of the dict with a list, to which you will append any websites that return the same status code. Example:
result = {}
def get_code(list_of_urls):
    for n in list_of_urls:
        code = code_number(n)
        if code in result:
            result[code].append(n)
        else:
            result[code] = [n]
    return result
I also think that the condition should be if code in result, since your keys are the return codes.
I am trying to iterate over a JSON object, using simplejson.
def main(arg1):
    response = urllib2.urlopen("http://search.twitter.com/search.json?q=" + arg1)  # + "&rpp=100&page=15"
    twitsearch = simplejson.load(response)
    twitsearch = twitsearch['results']
    twitsearch = twitsearch['text']
    print twitsearch
I am passing a list of values to search for in Twitter, like "I'm", "Think", etc.
The problem is that there are multiple text fields, one each for every Tweet. I want to iterate over the entire JSON object, pulling out the "text" field.
How would I do this? I'm reading the documentation and can't see exactly where it talks about this.
EDIT: It appears to be stored as a list of JSON objects.
Trying to do this:
for x in twitsearch:
    x['text']
How would I store x['text'] in a list? Append?
Note that
twitsearch['results']
is a Python list. You can iterate over that list, storing the text component of each of those objects in your own list. A list comprehension would be a good thing to use here.
text_list = [x['text'] for x in twitsearch['results']]
Easy. Figured it out.
tweets = []
for x in twitsearch:
    tweets.append(x['text'])