manipulating the contents of dictionaries inside dictionaries

I am receiving data from a radar on different contacts. Each contact has a lat, lon, direction, range and time stamp, and each hit on a contact is ID'd 1, 2, 3, etc. For one contact this suggests a dictionary that grows over time, so my dictionary for one contact will look something like this:
{1:[data # t1], 2:[data # t2], 3:[data # t3]}
As time goes on the dictionary will fill up until ... But there will not be only one contact. There will be several, maybe many. This suggests a dictionary of dictionaries:
{'SSHornblower': {1:[data], 2:[data], 3:[data]},
'Lustania': {1:[], 2:[], 3:[]},
'Queen Mary': {1:[], 2:[], 3:[], 4:[]}}
It is not possible to know beforehand how many contacts my radar will find, maybe 3, maybe 300. I cannot come up with names ahead of time for all the possible contacts and all the possible dictionaries. So I came up with the idea that once I nested a dictionary inside the larger dictionary, I could clear it and start over with the new contact. But when I do a clear after I nest one inside another, it clears everything inside the larger dictionary! Is there a way to get around this?

For filling up nested dictionaries a defaultdict can be very useful.
Let's assume you have a function radar() that returns three values:
contact_name
contact_id
contact_data
Then the following would do the job:
from collections import defaultdict
store = defaultdict(dict)
while True:
    contact_name, contact_id, contact_data = radar()
    store[contact_name][contact_id] = contact_data
So even if a new contact_name arrives that is not yet present in the store, defaultdict makes sure that an empty nested dict is created the moment you access the store with the new key. Therefore store[new_contact_name][new_contact_id] = new_contact_data will work.
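As a rough sketch of the above: the hits list below is made-up stand-in data for radar(), assumed to yield (contact_name, contact_id, contact_data) tuples. The second half illustrates why clear() seemed to wipe the outer dictionary. Nesting a dict stores a reference to the same object, not a copy, so clearing it through the old name empties the stored one too.

```python
from collections import defaultdict

# Hypothetical stand-in for radar(): (contact_name, contact_id, contact_data)
hits = [
    ("SSHornblower", 1, [51.5, -0.1, 270, 12.0, "t1"]),
    ("Queen Mary", 1, [50.9, -1.4, 90, 30.5, "t1"]),
    ("SSHornblower", 2, [51.6, -0.2, 268, 11.4, "t2"]),
]

store = defaultdict(dict)
for contact_name, contact_id, contact_data in hits:
    # A contact_name seen for the first time gets a fresh empty dict.
    store[contact_name][contact_id] = contact_data

# Why clear() appeared to empty the outer dictionary:
outer = {}
inner = {1: ["data"]}
outer["SSHornblower"] = inner   # stores a reference, not a copy
inner.clear()                   # empties the one shared dict
shared_is_empty = outer["SSHornblower"] == {}

# Rebinding the name to a new dict leaves the stored one alone.
inner = {1: ["data"]}
outer["Lustania"] = inner
inner = {}                      # new object; outer["Lustania"] is untouched
```

With defaultdict there is no temporary dictionary to reuse, so the aliasing problem does not come up in the first place.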

Related

Store and Scrape Over Time

I'm brand new here and brand new to Python and programming in general. I wrote a simple script today that I'm pretty proud of as a new beginner. I used BS4 and Requests to scrape some data from a website. I put all of the data in dictionaries inside a list. The same key/value pairs exist for every list item. For simplicity, I'm left with something like this:
[{'country': 'us', 'state':'new york', 'people':50},{'country':'us', 'state':'california','people':30}]
Like I said, pretty simple, but then I can turn it into a Pandas dataframe and everything is organized with a few hundred different dictionaries inside the list. My next step is to run this scrape every hour for 5 hours, and the only thing that changes is the value of the 'people' key. All of a sudden I'm not sure a list of lists of dictionaries (did I say that right?!) is a great idea. Plus, I really only need to get the updated values of 'people' from the webpage. Is this something I can realistically do with built-in Python lists and dictionaries? I don't know much about databases, but I'm thinking that maybe SQLite might be good to use. I really only know about it in concept but haven't worked with it. Thoughts?
Ideally, after several scrapes, I would have easy access to the data to say, see 'people' in 'new york' over time. Or find at what time 'california' had the highest number of people. And then I could plot the data in 1000 different ways! I'd love any guidance or direction here. Thanks a bunch!

You could create a Python class, like this:
class StateStats:
    def __init__(self, country, state, people):
        self.country = country
        self.state = state
        self.people = people

    def update(self, new_people):
        # Do whatever your update script is here,
        # then store the new value when it changes
        self.people = new_people
And then create instances of it like this:
# For each country you have scraped, make a new instance of this class
# This assumes that the list you gathered is stored in a variable named my_list
state_stats_list = []
for dictionary in my_list:
    state_stats_list.append(
        StateStats(
            dictionary['country'],
            dictionary['state'],
            dictionary['people']
        )
    )
# Or, instead, you can just create the class instances
# when you scrape the webpage, instead of creating a
# list and then creating another list of classes from that list
You could also use a database like SQLite, but I think this will be fine for your purpose. Hope this helps!
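If the goal is to see 'people' for a state over time, one option with only built-in types is to keep a history per state. This is just a sketch: the timestamps and the numbers after the first scrape are invented for illustration, and history, new_york and peak_time are names I made up.

```python
from collections import defaultdict

# Hypothetical results of three hourly scrapes: (timestamp, list of dicts).
scrapes = [
    ("09:00", [{"country": "us", "state": "new york", "people": 50},
               {"country": "us", "state": "california", "people": 30}]),
    ("10:00", [{"country": "us", "state": "new york", "people": 55},
               {"country": "us", "state": "california", "people": 42}]),
    ("11:00", [{"country": "us", "state": "new york", "people": 48},
               {"country": "us", "state": "california", "people": 37}]),
]

# (country, state) -> list of (timestamp, people), in scrape order
history = defaultdict(list)
for timestamp, rows in scrapes:
    for row in rows:
        key = (row["country"], row["state"])
        history[key].append((timestamp, row["people"]))

# 'people' in 'new york' over time
new_york = history[("us", "new york")]

# Time at which 'california' had the highest number of people
peak_time, peak_people = max(history[("us", "california")],
                             key=lambda pair: pair[1])
```

The same (state, timestamp, people) records could later be flattened into rows for a dataframe or a database table if the data outgrows plain dictionaries.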

Parsing JSON in Python (Reverse dictionary search)

I'm using Python and "requests" to practice using APIs. I've had success with basic requests and parsing, but I'm having difficulty with list comprehensions for a more complex project.
I requested from a server and got a dictionary. From there, I used:
participant_search = (match1_request['participantIdentities'])
This pulled out the value of the participantIdentities key and gave me the following data:
[{'player':
{'summonerName': 'Crescent Bladex',
'matchHistoryUri': '/v1/stats/player_history/NA1/226413119',
'summonerId': 63523774,
'profileIcon': 870},
'participantId': 1},
My goal here is to combine the summonerId and participantId to one list. Which is easy normally, but the order of ParticipantIdentities is randomized. So the player I want information on will sometimes be 1st on the list, and other times third.
So I can't use the var = list[0] like how I would normally do.
I have access to summonerId, so I'm thinking I can search the list for the summonerId, then somehow collect all the information around it. For instance, if I knew 63523774 then I could find the key for it. From here, is it possible to find the parent list of the key?
Any guidance would be appreciated.
Edit (Clarification):
Here's the data I'm working with: http://pastebin.com/spHk8VP0
Line 1691 is where the nested dictionary 'participantIdentities' is. From here, there are 10 dictionaries. Each of these 10 dictionaries includes two keys, "player" and "participantId".
My goal is to search these 10 dictionaries for the one dictionary that has the summonerId. The summonerId is something I already know before I make this request to the server.
So I'm looking for some sort of "search" method, that goes beyond "true/false". A search method that, if a value is found within an object, the entire dictionary (key:value) is given.

Not sure if I properly understood you, but would this work?
for i in range(len(match1_request['participantIdentities'])):
    if match1_request['participantIdentities'][i]['player']['summonerId'] == 63523774:
        # do whatever you want with it.
        break
i becomes the index you're searching for.

ds = match1_request['participantIdentities']
result_ = [d for d in ds if d["player"]["summonerId"] == 12345]
result = result_[0] if result_ else {}
See if it works for you.
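A variation on the same idea, as a sketch: next() with a generator stops at the first match and hands back the whole entry, which seems to be the "search that returns the dictionary" being asked for. The sample list is made up to match the shape shown in the question (the "Other Player" entry is invented), and find_participant is a hypothetical helper name.

```python
# Hypothetical sample shaped like the 'participantIdentities' list.
participant_identities = [
    {"player": {"summonerName": "Other Player", "summonerId": 11111111},
     "participantId": 1},
    {"player": {"summonerName": "Crescent Bladex", "summonerId": 63523774},
     "participantId": 2},
]

def find_participant(identities, summoner_id):
    """Return the whole entry whose player has this summonerId, or None."""
    return next(
        (entry for entry in identities
         if entry["player"]["summonerId"] == summoner_id),
        None,
    )

match = find_participant(participant_identities, 63523774)
# Combine summonerId and participantId into one list, as in the question.
pair = [match["player"]["summonerId"], match["participantId"]]
```

Because the position in the list is never used, it does not matter whether the player is first or third.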

You can use a dict comprehension to build a dict which uses summonerIds as keys:
players_list = response['participantIdentities']
{p['player']['summonerId']: p['participantId'] for p in players_list}

I think what you are asking for is: "How do I get the stats for a given summoner?"
You'll need a mapping of participantId to summonerId.
For example, would it be helpful to know this?
summoner[1] = 63523774
summoner[2] = 44610089
...
If so, then:
# This is probably what you are asking for:
summoner = {ident['participantId']: ident['player']['summonerId']
            for ident in match1_request['participantIdentities']}
# Then you can do this:
summoner_stats = {summoner[p['participantId']]: p['stats']
                  for p in match1_request['participants']}
# And to look up a particular summoner's stats:
print(summoner_stats[44610089])
(ref: raw data you pasted)

Python--adding list into dict (beginner)

I'm very new to programming (taking my first class in it now), so bear with me for format issues and misunderstandings, or missing easy fixes.
I have a dict with tweet data: 'user' as keys and then 'text' as their values. My goal here is to find the tweets where they are replying to another user, signified by starting with the # symbol, and then make a new dict that contains the author's user and the users of everyone he replied to. That's the fairly simple if statement I have below. I was also able to use the split function to isolate the username of the person they are replying to (the function takes all the text between the # symbol and the next space after it).
st='#'
en=' '
task1dict={}
for t in a,b,c,d,e,f,g,h,i,j,k,l,m,n:
    if t['text'][0]=='#':
        user=t['user']
        repliedto=t['text'].split(st)[-1].split(en)[0]
        task1dict[user]=[repliedto]
Username1 replied to username2. Username2 replied to both username3 and username5.
I am trying to create a dict (called tweets1) that reads something like:
'user':'repliedto'
username1:[username2]
username2:[username3, username5]
etc.
Is there a better way to isolate the usernames, and then put them into a new dict? Here's a 2 entry sample of the tweet data:
{"user":"datageek88","text":"#sundevil1992 good question! #joeclarknet Is this on the exam?"},
{"user":"joeclarkphd","text":"Exam questions will be answered in due time #sundevil1992"}
I am now able to add them to a dict, but it would only save one 'repliedto' for each 'user', so instead of showing username2 have replied to both 3 and 5, it just shows the latest one, 5:
{'username1': ['username2'],
'username2': ['username5']}
Again, if I'm making a serious no-no anywhere in here, I apologize, and please show me what I'm doing wrong!

Modify the last line to
task1dict.setdefault(user, [])
task1dict[user].append(repliedto)
You were overwriting the user's replied-to list each time you assigned to it. The setdefault method gives the key an empty list if it doesn't already exist. Then you just append to the list.
EDIT: same code using a set for uniqueness.
task1dict.setdefault(user, set())
task1dict[user].add(repliedto)
With a set you add an element to the set, whereas with a list you append to the list.
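Here is a small end-to-end sketch of the setdefault approach. The tweets are invented to match the username1/username2 example in the question, and the parsing is simplified: it takes the first word after a leading '#', which is not quite the same as the split(st)[-1] logic in the original code.

```python
# Hypothetical tweets; a reply is a tweet whose text starts with '#'.
tweets = [
    {"user": "username1", "text": "#username2 good question!"},
    {"user": "username2", "text": "#username3 I think so"},
    {"user": "username2", "text": "#username5 see above"},
    {"user": "username4", "text": "not a reply"},
]

replies = {}
for tweet in tweets:
    text = tweet["text"]
    if text.startswith("#"):
        replied_to = text[1:].split(" ")[0]
        # setdefault returns the existing list, or inserts and returns a new one,
        # so the append always lands on the list stored in the dict.
        replies.setdefault(tweet["user"], []).append(replied_to)
```

Since setdefault returns the stored value, the two lines from the answer can be folded into one call as done here.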

I might do it like this. Use the following regular expression to identify all usernames.
r"#([^\s]*)"
It means look for the # symbol, and then capture all following characters that aren't whitespace. A defaultdict is simply a dictionary that returns a default value if the key isn't found. In this case, I specify an empty set as the default in the event that we are adding a new key.
import re
from collections import defaultdict
tweets = [{"user":"datageek88","text":"#sundevil1992 good question! #joeclarknet Is this on the exam?"},
          {"user":"joeclarkphd","text":"Exam questions will be answered in due time #sundevil1992"}]
from_to = defaultdict(set)
for tweet in tweets:
    if "#" in tweet['text']:
        user = tweet['user']
        for replied_to in re.findall(r"#([^\s]*)", tweet['text']):
            from_to[user].add(replied_to)
print(from_to)
Output
defaultdict(<type 'set'>, {'joeclarkphd': set(['sundevil1992']),
'datageek88': set(['sundevil1992', 'joeclarknet'])})

python nested list and dicts, trouble accessing and setting

I am originally a C guy but recently I started doing some stuff in Python.
The thing that gives me trouble is the more advanced data structures in Python.
I could do it all with multiple lists like in C, but that would just be boring, right?
Anyway, here I have a data structure which is basically a list of dicts where the value field of each dict is another list holding a dict of 2 key-value pairs:
clients = [
{'client1':[{'test':'testvalue','status':'statusvalue'}]},
{'client2':[{'test':'testvalue','status':'statusvalue'}]},
{'client3':[{'test':'testvalue','status':'statusvalue'}]}
]
Now I want to be able to access the testvalue and statusvalue fields and modify or read them, based on the position in the list.
In pseudocode it would be something like:
for i in range(0,clients):
    getvalue(clients[i].'test')
    setvalue(clients[i].'test')
    getvalue(clients[i].'status')
    setvalue(clients[i].'status')
In the end I want to use this data structure to render an HTML page with jinja2.

For a start, in Python you should (almost) never iterate over range(len(something)). You iterate over something.
Secondly, your data structure is wrong. There's no point having a list of dicts, each dict containing a single key/value pair and each value consisting of a list with a single item. You should just have a dict of dicts: you can still iterate over it.
clients = {
    'client1':{'test':'testvalue','status':'statusvalue'},
    'client2':{'test':'testvalue','status':'statusvalue'},
    'client3':{'test':'testvalue','status':'statusvalue'},
}
for key, value in clients.items():
    print(value['test'])
    value['test'] = 'newvalue'

I have noticed that you put a dictionary inside a list as the value for each client.
I think you may wish to re-configure your data structure as such:
clients = [
    {'client1':{'test':'testvalue','status':'statusvalue'}},
    {'client2':{'test':'testvalue','status':'statusvalue'}},
    {'client3':{'test':'testvalue','status':'statusvalue'}}
]
Therefore, you can begin iterating as such:
for client in clients:
    for k, v in client.items(): # this unpacks client into 'client' (k) and {'val'...} (v)
        print(v['test']) # this gets the value of test.
        v['test'] = 'some_new_value' # this sets the value of test.
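If the original structure has to stay as it is, reading and writing by list position is still possible. It just takes one extra step per level of nesting. A sketch, where fields_at is a helper name I made up:

```python
# The structure from the question: a list of single-key dicts, each value
# being a one-element list that holds the actual fields.
clients = [
    {"client1": [{"test": "testvalue", "status": "statusvalue"}]},
    {"client2": [{"test": "testvalue", "status": "statusvalue"}]},
    {"client3": [{"test": "testvalue", "status": "statusvalue"}]},
]

def fields_at(position):
    """Return (client_name, fields_dict) for the entry at this list position."""
    # Each entry has exactly one key, so take its only (key, value) pair.
    name, wrapped = next(iter(clients[position].items()))
    # The value is a one-element list; the fields dict is its first item.
    return name, wrapped[0]

name, fields = fields_at(1)
old_status = fields["status"]   # read a value
fields["test"] = "newvalue"     # write; this mutates the dict inside clients
```

The extra unwrapping in fields_at is exactly the overhead the answers above suggest removing by flattening the structure.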

singular or plural identifier for a dictionary?

When naming a container, what's a better coding style:
source = {}
#...
source[record] = some_file
or
sources = {}
#...
sources[record] = some_file
The plural reads more naturally at creation; the singular at assignment.
And it is not an idle question; I did catch myself getting confused in some old code when I wasn't sure if a variable was a container or a single value.
UPDATE
It seems there's a general agreement that when the dictionary is used as a mapping, it's better to use a more detailed name (e.g., recordToSourceFilename); and if I absolutely want to use a short name, then make it plural (e.g., sources).

I think that there are two very specific use cases with dictionaries that should be identified separately. However, before addressing them, it should be noted that the variable names for dictionaries should almost always be singular, while lists should almost always be plural.
Dictionaries as object-like entities: There are times when you have a dictionary that represents some kind of object-like data structure. In these instances, the dictionary almost always refers to a single object-like data structure, and should therefore be singular. For example:
# assume that users is a list of users parsed from some JSON source
# assume that each user is a dictionary, containing information about that user
for user in users:
    print(user['name'])
Dictionaries as mapping entities: Other times, your dictionary might be behaving more like a typical hash-map. In such a case, it is best to use a more direct name, though still singular. For example:
# assume that idToUser is a dictionary mapping IDs to user objects
user = idToUser['0001a']
print(user.name)
Lists: Finally, you have lists, which are an entirely separate idea. These should almost always be plural, because they are simply a collection of other entities. For example:
users = [userA, userB, userC] # makes sense
for user in users:
    print(user.name) # especially later, in iteration
I'm sure that there are some obscure or otherwise unlikely situations that might call for some exceptions to be made here, but I feel that this is a pretty strong guideline to follow when naming dictionaries and lists, not just in Python but in all languages.

It should be plural because then the program behaves just like you read it aloud. Let me show you why it should not be singular (totally contrived example):
c = Customer(name = "Tony")
c.persist()
[...]
#
# 500 LOC later, you retrieve the customer list as a mapping from
# customer ID to Customer instance.
#
# Singular
customer = fetchCustomerList()
nameOfFirstCustomer = customer[0].name
for c in customer: # obviously it's totally confusing once you iterate
    ...
# Plural
customers = fetchCustomerList()
nameOfFirstCustomer = customers[0].name
for customer in customers: # yeah, that makes sense!!
    ...
Furthermore, sometimes it's a good idea to have even more explicit names from which you can infer the mapping (for dictionaries) and probably the type. I usually add a simple comment when I introduce a dictionary variable. An example:
# Customer ID => Customer
idToCustomer = {}
[...]
idToCustomer[1] = Customer(name = "Tony")

I prefer plurals for containers. There's just a certain understandable logic in using:
entries = []
for entry in entries:
    #Code...
