Accessing an element in a slightly complicated dictionary

import requests
api = "https://api.rootnet.in/covid19-in/stats/latest"
response = requests.get(api)
local_case_tracker = response.json()
print(local_case_tracker.items())
print(local_case_tracker['data'])
print(local_case_tracker['data']['regional'])
So I'm trying to build a Covid tracker (global and local, where local means my own country) using Python. If you run the code, you get a really big, branched dictionary. I want to access any one of the states in that dictionary (let's say Goa), but I can't, so I tried to break the problem down like this:
print(local_case_tracker['data'])
print(local_case_tracker['data']['regional'])
I'm able to fetch some results, but when I try
print(local_case_tracker['data']['regional']['loc : Goa'])
I get:
TypeError: list indices must be integers or slices, not str
It's a very basic question, but I've been scratching my head over this for the last 30 minutes.

local_case_tracker['data']['regional'] is a Python list (of dictionaries), so it must be accessed by integer index, or by looping over it and searching for the entry you want. Try this to get the data for Goa:
for item in local_case_tracker['data']['regional']:
    if item['loc'] == 'Goa':
        print(item)
You could save the item instead of printing it; it is a dictionary, so you can access its values using the various keys. Also try this to see all the locations:
for item in local_case_tracker['data']['regional']:
    print(item['loc'])
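If you need to look up several states, one option is to index the list by its 'loc' value once. This is a minimal sketch, assuming each entry has a unique 'loc' key as in the loops above. The sample entries and the 'totalConfirmed' field are made up for illustration and are not real API output:

```python
# Hypothetical sample standing in for local_case_tracker['data']['regional'].
regional = [
    {"loc": "Goa", "totalConfirmed": 7},
    {"loc": "Kerala", "totalConfirmed": 12},
]

# Option 1: first matching entry, or None if the state is absent.
goa = next((item for item in regional if item["loc"] == "Goa"), None)

# Option 2: build a dict keyed by location for repeated lookups.
by_loc = {item["loc"]: item for item in regional}

print(goa)
print(by_loc["Kerala"]["totalConfirmed"])
```

The dict version costs one pass up front and makes every later lookup a plain key access.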

Related

Using nested for loops to iterate through JSON file of tweets in Python

So I am new to Python, but I know what I am trying to accomplish. Basically, I have the output of tweets from twitter in a JSON file loaded into Python. What I need to do is iterate through the tweets to access the "text" key, that has the text of each tweet, because that's what I'm going to use to do topic modeling. So, I have discovered that "text" is triple nested in this data structure, and it's been really difficult to find the correct way to write the for loop code in order to iterate through the dataset and pull the "text" from every tweet.
Here is a look at what the JSON structure is like: https://pastebin.com/fUH5MTMx
So, I have figured out that the "text" key that I want is within [hits][hits][_source]. What I can't figure out is the appropriate for loop to iterate through _source and pull those texts. Here is my code so far (again, I'm a complete beginner, so sorry if my code is way off):
for hits in tweets["hits"]["hits"]:
    for _source in hits:
        for text in _source:
            for item in text:
                print(item)
I also tried this:
for item in tweets['hits']["hits"]["_source"]:
    print(item['text'])
But I keep getting either syntax errors for the first one or "TypeError: list indices must be integers or slices, not str" for the second one. I understand that I need to specify somehow that I am accessing a list, and that I'm missing something to show that it's a list and that I'm not looking for integers as output from the iterations... (I am using the json module in Python for this, on a Mac with Python 3 in Spyder.)
Any insight would be greatly appreciated! This multiple nesting is confusing me a lot.

tweets['hits']["hits"] is not a dictionary with a "_source" key
but a list of one or more items, each of which has a "_source" key.
That means the entries are accessed like this:
tweets['hits']["hits"][0]["_source"]
tweets['hits']["hits"][1]["_source"]
tweets['hits']["hits"][2]["_source"]
So this should work:
for item in tweets['hits']["hits"]:
    print(item["_source"]['text'])

Not sure if you realize it, but the top level of this JSON is loaded as a Python dictionary, not a list. Anyway, let's get into this nest.
tweets['hits'] will give you another dict.
tweets['hits']['hits'] will give you a list (notice the square brackets).
This apparently is a list of dictionaries, and judging by the sample, each of them holds one tweet under a "_source" key, so:
tweets['hits']['hits'][0] will give you the dict for the first tweet. Then, finally:
tweets['hits']['hits'][0]['_source']['text'] should give you the text of that tweet.

The value of the second "hits" is a list.
Try:
for hit in tweets["hits"]["hits"]:
    print(hit["_source"]["text"])
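To collect the texts into a list for topic modeling rather than printing them, a list comprehension over the inner list should work. This is a sketch with a made-up miniature of the structure described in the question (the real file has many more fields):

```python
# Hypothetical miniature of the JSON structure: hits -> hits -> _source -> text.
tweets = {
    "hits": {
        "hits": [
            {"_source": {"text": "first tweet"}},
            {"_source": {"text": "second tweet"}},
            {"_source": {}},  # an entry without text, to show the guard below
        ]
    }
}

# Keep only entries that actually carry a "text" key.
texts = [
    hit["_source"]["text"]
    for hit in tweets["hits"]["hits"]
    if "text" in hit.get("_source", {})
]
print(texts)
```

The guard is optional; if every hit is known to have a text, the condition can be dropped.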

Python list.remove items present in second list

I've searched around and most of the errors I see are when people are trying to iterate over a list and modify it at the same time. In my case, I am trying to take one list, and remove items from that list that are present in a second list.
import pymysql
schemaOnly = ["table1", "table2", "table6", "table9"]
db = pymysql.connect(my connection stuff)
tables = db.cursor()
tables.execute("SHOW TABLES")
tablesTuple = tables.fetchall()
tablesList = []
# I do this because there is no way to remove items from a tuple
# which is what I get back from tables.fetchall
for item in tablesTuple:
    tablesList.append(item)
for schemaTable in schemaOnly:
    tablesList.remove(schemaTable)
When I put various print statements in the code, everything looks right and as if it is going to work. But when it gets to the actual tablesList.remove(schemaTable) I get the dreaded ValueError: list.remove(x): x not in list.
If there is a better way to do this I am open to ideas. It just seemed logical to me to iterate through the list and remove items.
Thanks in advance!
** Edit **
Everyone in the comments and the first answer is correct. This fails because converting the tuple returned by fetchall to a list produces a list of one-element tuples rather than a list of strings, so nothing matches when trying to remove items in the next loop. The solution was to take the first item from each tuple and put those into a list, like so: tablesList = [x[0] for x in tablesTuple]. Once I did this, the second loop worked and the table names were correctly removed.
Thanks for pointing me in the right direction!

I assume that fetchall returns tuples, one for each database row matched.
Now the problem is that the elements in tablesList are tuples, whereas schemaOnly contains strings. Python does not consider these to be equal.
Thus when you attempt to call remove on tablesList with a string from schemaOnly, Python cannot find any such value.
You need to inspect the values in tablesList and find a way to convert them to strings. I suspect it would be by simply taking the first element out of each tuple, but I do not have a MySQL database at hand so I cannot test that.
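The mismatch is easy to reproduce without a database. This is a sketch, assuming fetchall returns one single-element tuple per table name, as the question's edit suggests. The rows below are made up:

```python
# Made-up stand-in for the result of tables.fetchall().
tablesTuple = (("table1",), ("table2",), ("table3",))

tablesList = list(tablesTuple)
# A one-element tuple never compares equal to a bare string.
print(("table1",) == "table1")
# The string is therefore not "in" the list, so remove("table1") would raise ValueError.
print("table1" in tablesList)

# Taking the first column gives plain strings that can be removed.
names = [row[0] for row in tablesTuple]
names.remove("table1")
print(names)
```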

Regarding your question of whether there is a better way to do this: yes.
Instead of adding items to the list and then removing them, you can append only the items that you want. Since each row returned by fetchall is a tuple, compare (and keep) its first element. For example:
for item in tablesTuple:
    if item[0] not in schemaOnly:
        tablesList.append(item[0])
Also, schemaOnly can be written as a set, which improves the average cost of each membership test from O(n) to O(1):
schemaOnly = {"table1", "table2", "table6", "table9"}
This only makes a measurable difference with big lists, but in my experience it's useful semantically.
And finally, you can write the whole thing in one list comprehension:
tablesList = [item[0] for item in tablesTuple if item[0] not in schemaOnly]
And if you don't need to keep repetitions or the original order (or if there aren't any repetitions in the first place), you can also do this:
tablesSet = {item[0] for item in tablesTuple} - schemaOnly
Like the comprehension with a set, this runs in time linear in the number of table names, which is the best big-O complexity of these variations.
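One practical difference between the last two variants is ordering: the comprehension keeps the order of the rows, while a set does not guarantee any order. This is a sketch with made-up rows, assuming fetchall yields single-element tuples:

```python
schemaOnly = {"table1", "table2", "table6", "table9"}
# Made-up rows in the shape fetchall is assumed to return.
tablesTuple = (("table1",), ("table5",), ("table3",), ("table9",), ("table4",))

# Order-preserving filter: membership tests against a set are O(1) on average.
kept_in_order = [row[0] for row in tablesTuple if row[0] not in schemaOnly]

# Set difference: same members, but iteration order is unspecified.
kept_as_set = {row[0] for row in tablesTuple} - schemaOnly

print(kept_in_order)
print(sorted(kept_as_set))  # sorted only to get a stable display
```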

A more efficient way of finding value in dictionary and its position

I have a dictionary which contains (roughly) 6 elements, each of which looks like the following:
What I want to do is find a particular domain (which I pass to a method) and, if it exists, store the keyword and its position within an object. I have tried the following:
def parseGoogleResponse(response, website):
    i = 0
    for item in response['items']:
        if item['formattedUrl'] == website:
            print(i)
            break
        i += 1
This approach seems a bit tedious, and i also stays at i = 10; I'm pretty sure there is a more efficient way. I also have to take into account that if the website is not found the first time, the code then queries the API for up to 5 pages, each containing 6 search results, so I somehow have to calculate the position if it is on a different page.
Any ideas?

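A dictionary has no positional index: before Python 3.7 dictionaries were not ordered at all, and even now they only preserve insertion order, so you cannot ask for something's position the way you can with a list. In your code, though, response['items'] is iterated as a list of dictionaries, so you can check each item.
You can rather easily check whether the website appears in the results with something like:
if any(item['formattedUrl'] == website for item in response['items']):
    pass  # If you enter this section, you know it's in the results
else:
    pass  # If you end up here, it isn't in the results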
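For the position itself, enumerate avoids the manual counter, and a page offset can be added to get an overall rank. This is a sketch under stated assumptions: fetch_page is a hypothetical callable standing in for the API query, the fake pages exist only for the demo, and the 6 results per page and 5-page limit are taken from the question:

```python
RESULTS_PER_PAGE = 6
MAX_PAGES = 5

def find_position(fetch_page, website):
    """Return the 1-based overall rank of website, or None if not found.

    fetch_page(page_index) is a hypothetical function that returns the
    parsed response dict for the given 0-based page.
    """
    for page in range(MAX_PAGES):
        response = fetch_page(page)
        for i, item in enumerate(response.get("items", [])):
            if item["formattedUrl"] == website:
                return page * RESULTS_PER_PAGE + i + 1
    return None

# Fake pages for demonstration only.
def fake_fetch(page):
    start = page * RESULTS_PER_PAGE
    return {"items": [{"formattedUrl": "site%d.example" % n}
                      for n in range(start, start + RESULTS_PER_PAGE)]}

print(find_position(fake_fetch, "site7.example"))
```

Returning as soon as a match is found also means later pages are never requested once the site has been located.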

Properly looping over a dictionary / using dictionaries as databases

This looks like a CS 101 style homework question, but it actually isn't. I am trying to learn more Python, so I took up this personal project to write a small app that keeps my grade book for me.
I have a class semester which holds a dictionary of section objects.
A section is a class that I am teaching in whichever semester object I am manipulating (I didn't want to call them classes for obvious reasons). I originally had sections as a list, not a dictionary, and when I wanted to add a roster of students to that semester I could do this:
for sec in working_semester.sections:
    sec.addRosterFromFile(filename)
Now I have changed the sections member of semester to a dictionary so I can look up a specific one to work with, but I am having trouble when I want to loop over all of them. For example, when I first set up a new semester I want to add all the sections, then loop over them and add students to each. If I use the same code to loop over the dictionary it gives me the key, but I was hoping to get the value.
I have also tried to iterate over the dictionary like this, which I got from an older Stack Overflow question:
for sec in iter(sorted(working_semester.sections.iteritems())):
    sec.addRosterFromFile(filename)
But iter(sorted(...)) yields (key, value) tuples, not the values themselves, so the line inside the loop gives me an error that a tuple does not have a function called addRosterFromFile.
Currently I have this fix in place, where I loop through the keys and then use each key to access the value:
for key in working_semester.sections:
    working_semester.sections[key].addRosterFromFile(filename)
There has to be a way to loop over dictionary values, or is this not desirable? My understanding of dictionaries is that they are like lists, but rather than grabbing an element by its position you grab it by a specific key, which makes it easier to get the one you want no matter what order they are in. Am I missing how dictionaries should be used?

Using iteritems is a good approach; you just need to unpack the key and value:
for key, value in iter(sorted(working_semester.sections.iteritems())):
    value.addRosterFromFile(filename)
If you really only need the value, you could use the aptly named itervalues:
for sec in sorted(working_semester.sections.itervalues()):
    sec.addRosterFromFile(filename)
(It's not clear from your example whether you really need sorted there. If you don't need to iterate over the sections in sorted order, just leave sorted out.)
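In Python 3, iteritems and itervalues no longer exist; items() and values() play the same role. This is a sketch with a stand-in Section class (hypothetical, only there to make the example runnable):

```python
class Section:
    """Hypothetical stand-in for the question's section class."""
    def __init__(self, name):
        self.name = name
        self.roster = []

    def addRosterFromFile(self, filename):
        # The real method would read the file; here we just record the name.
        self.roster.append(filename)

sections = {"B": Section("B"), "A": Section("A")}

# Values only, in insertion order.
for sec in sections.values():
    sec.addRosterFromFile("roster.txt")

# Key and value together, sorted by key.
for key, sec in sorted(sections.items()):
    print(key, sec.roster)
```

Sorting the (key, value) pairs works here because the keys are unique, so the comparison never has to fall back to comparing the Section objects themselves.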

Referring to objects inside a list without using references or indices

I'm using Python for my shopping cart class, which has a list of items. When a customer wants to edit an item, I need to pass the JavaScript front end some way to refer to the item so that it can call AJAX methods to manipulate it.
Basically, I need a simple way to point to a particular item that isn't its index, and isn't a reference to the object itself.
I can't use an index, because another item in the list might be added or removed while the identifier is "held" by the front end. If I were to pass the index forward, if an item got deleted from the list then that index wouldn't point to the right object.
One solution seems to be to use UUIDs, but that seems particularly heavyweight for a very small list. What's the simplest/best way to do this?

Instead of using a list, why not use a dictionary and use small integers as the keys? Adding and removing items from the dictionary will not change the indices into the dictionary. You will want to keep one value in the dictionary that lets you know what the next assigned index will be.
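The dictionary-plus-counter idea could look roughly like this. It is a sketch only: the Cart class and its method names are hypothetical, and itertools.count stands in for the "next assigned index" value mentioned above:

```python
import itertools

class Cart:
    """Hypothetical cart that hands out small, stable integer ids."""
    def __init__(self):
        self._items = {}
        self._next_id = itertools.count(1)  # ids are never reused

    def add(self, item):
        item_id = next(self._next_id)
        self._items[item_id] = item
        return item_id

    def remove(self, item_id):
        # pop with a default so a stale id from the front end is harmless
        return self._items.pop(item_id, None)

    def get(self, item_id):
        return self._items.get(item_id)

cart = Cart()
apple = cart.add("apple")
pear = cart.add("pear")
cart.remove(apple)
print(cart.get(pear))
```

Because ids are never reused, an id held by the front end either still refers to the same item or refers to nothing.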

A UUID seems perfect for this. Why don't you want to do that?
Do the items have any sort of product_id? Can the shopping cart have more than one of the same product_id, or does it store a quantity? What I'm getting at is: if the product_ids in the cart are unique, you can just use that.
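For comparison, the UUID route is only a few lines with the standard library. This is a sketch, assuming items are stored in a plain dict; add_item and cart_items are illustrative names, not from the question:

```python
import uuid

cart_items = {}

def add_item(item):
    """Store item under a fresh random id and return that id."""
    item_id = uuid.uuid4().hex  # 32 hex characters, easy to pass through JSON
    cart_items[item_id] = item
    return item_id

first = add_item({"product": "mug", "qty": 1})
second = add_item({"product": "pen", "qty": 3})
del cart_items[first]
print(cart_items[second])
```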
