How to access a dict's values and delete them? - python

I have a Python dict, stuffs, whose keys map to lists of values:
{'car':['bmw','porsche','benz'], 'fruits':['banana','apple']}
I would like to delete the first value from car (bmw) and the first value from fruits (banana).
How can I access and delete them, please? I have tried .pop(index), but it doesn't work...

You can create a new dictionary where you skip the first element using [1:]
stuffs = {'car':['bmw','porsche','benz'], 'fruits':['banana','apple']}
stuffs_new = {k:v[1:] for k,v in stuffs.items()}
# {'car': ['porsche', 'benz'], 'fruits': ['apple']}
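As a quick check of what the slicing approach does and does not touch, here is a minimal sketch. The 'tools' key with an empty list is a hypothetical addition, included only to show that v[1:] is safe on empty lists (whereas pop(0) on an empty list raises IndexError).

```python
stuffs = {'car': ['bmw', 'porsche', 'benz'],
          'fruits': ['banana', 'apple'],
          'tools': []}  # hypothetical extra key with an empty list

# Build a new dict; each value is a new list without its first element
stuffs_new = {k: v[1:] for k, v in stuffs.items()}

print(stuffs_new)     # {'car': ['porsche', 'benz'], 'fruits': ['apple'], 'tools': []}
print(stuffs['car'])  # ['bmw', 'porsche', 'benz'] (the original is unchanged)
```

If the original dict has to be modified in place instead, the loop-based answers are the better fit.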

An easy way of doing this is to use a for loop to iterate over each key in your dictionary and pop the first element of its list:
dictionary = {'car':['bmw','porsche','benz'], 'fruits':['banana','apple']}
for key in dictionary:
    dictionary[key].pop(0)
Or, as a list comprehension:
dictionary = {'car':['bmw','porsche','benz'], 'fruits':['banana','apple']}
[dictionary[i].pop(0) for i in dictionary]
These pieces of code look up the dictionary at each of its keys ('car' and 'fruits') and then call pop on the list stored under that key.
Edit:
Don't use a list comprehension if you don't intend to store the list. When iterating over a large number of values, you could run into memory errors from storing a whole load of useless values, as in this case:
[print(i) for i in range(9823498)]
This stores 9823498 None values, whereas a for loop achieves the same thing without storing anything.

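You were almost there.
Use either:
del dict[key]
Or
dict.pop(key, default)
Both remove the whole entry for key. The second also returns the removed value, or default if the key is missing. To drop only the first element of a list value, call pop on the list itself, e.g. stuffs['car'].pop(0).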
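To make the difference concrete, here is a minimal sketch using the stuffs dict from the question. The 'boats' key is hypothetical and only there to show what happens when the key is missing.

```python
stuffs = {'car': ['bmw', 'porsche', 'benz'], 'fruits': ['banana', 'apple']}

# list.pop(0) on the value removes and returns the first list element
first_car = stuffs['car'].pop(0)
first_fruit = stuffs['fruits'].pop(0)
print(first_car, first_fruit)  # bmw banana
print(stuffs)                  # {'car': ['porsche', 'benz'], 'fruits': ['apple']}

# dict.pop(key, default) removes the whole entry and returns its value
removed = stuffs.pop('car', None)    # ['porsche', 'benz']
missing = stuffs.pop('boats', None)  # hypothetical absent key: returns the default

# del removes the whole entry too, but gives nothing back
del stuffs['fruits']
print(stuffs)                  # {}
```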

Related

Updating dictionary within loops

I have a list of dictionaries in which keys are "group_names" and values are gene_lists.
I want to update each dictionary with a new list of genes by looping through a species_list.
Here is my pseudocode:
groups=["group1", "group2"]
species_list=["spA", "spB"]
def get_genes(group, sp):
    return gene_list
for sp in species_list:
    for group in groups:
        gene_list[group] = get_genes(group, sp)
        gene_list.update(get_genes(group, sp))
The problem with this code is that the genes already stored for a group are replaced/overwritten by the new ones, instead of the new ones being added to the dictionary. My question is where I should put the following line, although I'm not sure this is the only problem.
gene_list.update(get_genes(group,sp))
The data I have looks like this dataframe:
data = {"group1": ["geneA1", "geneA2"],
        "group2": ["geneB1", "geneB2"]}
pd.DataFrame.from_dict(data).T
The data I want to create should look like this:
data = {"group1": ["geneA1", "geneA2", "geneX"],
        "group2": ["geneB1", "geneB2", "geneX"]}
pd.DataFrame.from_dict(data).T
So in this case, "geneX" stands for the new genes returned by the get_genes function for each species, which are then added to the existing dictionary.
Any help would be much appreciated!!
You need to append to the list in the dictionary entry, not assign it.
Use setdefault() to provide a default empty list if the dictionary key doesn't exist yet.
for sp in species_list:
    for group in groups:
        gene_list.setdefault(group, []).extend(get_genes(group, sp))
From what I understand, you want to append a new gene to the list under each key. To do that:
new_gene = "gene_x"
data = {"group1": ["geneA1", "geneA2"], "group2": ["geneB1", "geneB2"]}
for value in data.values():
    value.append(new_gene)
print(data)
You can also use defaultdict where you can append directly (read the docs for that).
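A minimal sketch of the defaultdict variant follows. The body of get_genes here is a hypothetical stand-in (the question does not show the real one); it just returns a made-up gene name so the loop can run.

```python
from collections import defaultdict

groups = ["group1", "group2"]
species_list = ["spA", "spB"]

def get_genes(group, sp):
    # Hypothetical stand-in for the real lookup: one made-up gene per call
    return [f"gene_{group}_{sp}"]

# Missing keys start out as an empty list, so extend() works right away
gene_list = defaultdict(list)
for sp in species_list:
    for group in groups:
        gene_list[group].extend(get_genes(group, sp))

print(dict(gene_list))
# {'group1': ['gene_group1_spA', 'gene_group1_spB'],
#  'group2': ['gene_group2_spA', 'gene_group2_spB']}
```

Compared with setdefault(), this moves the "start with an empty list" decision to the place where the dictionary is created.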

Why does the default dictionary in my code keep expanding?

I have a default dictionary and I run it through a couple of loops to look for certain strings in the dictionary. The loops don't append anything to the dictionary, yet during the loop new items keep getting added, and the final dictionary ends up bigger than it was before the loop.
I've been trying to pinpoint the error forever but now it's late and I have no idea what's causing this!
from collections import defaultdict
dummydict = defaultdict(list)
dummydict['Alex'].append('Naomi and I love hotcakes')
dummydict['Benjamin'].append('Hayley and I hate hotcakes')
part = ['Alex', 'Benjamin', 'Hayley', 'Naomi']
emp = []
for var in dummydict:
    if 'I' in dummydict[var]:
        emp.append(var)
for car in part:
    for key in range(len(dummydict)):
        print('new len', len(dummydict))
        print(key, dummydict)
        if car in dummydict[key]:
            emp.append(car)
print(emp)
print('why are there new values in the dictionary?!', len(dummydict), dummydict)
I expect the dictionary to remain unchanged.
if car in dummydict[key]:
Here key is an integer, while your dict initially contains only string keys, so this line creates a new entry in dummydict for each key.
Accessing missing keys as in dummydict[key] will add those keys to the defaultdict. Note that key is an int, not the value at that position, as for key in range(len(dummydict)) iterates indexes, not the dict or its keys.
See the docs:
When each key is encountered for the first time, it is not already in the mapping; so an entry is automatically created using the default_factory function which returns an empty list.
For example, this code will show a dummydict with a value in it, because simply accessing dummydict[key] will add the key to the dict if that key is not already there.
from collections import defaultdict
dummydict = defaultdict(list)
dummydict[1]
print(dummydict)
outputs:
defaultdict(<class 'list'>, {1: []})
Your issue is that in your loop you evaluate dummydict[key] with integer keys that are not in the dict yet, which adds them. (dummydict[var] is harmless, because var comes from the dict's own keys.)
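One way to search without growing the dict is to iterate over the stored values rather than indexing by position. This is only a sketch, and it assumes the intent was to find which names from part are mentioned inside the stored sentences; the question does not state that explicitly.

```python
from collections import defaultdict

dummydict = defaultdict(list)
dummydict['Alex'].append('Naomi and I love hotcakes')
dummydict['Benjamin'].append('Hayley and I hate hotcakes')
part = ['Alex', 'Benjamin', 'Hayley', 'Naomi']

size_before = len(dummydict)

# Iterate over the values directly, so no missing key is ever looked up
emp = []
for name in part:
    for sentences in dummydict.values():
        if any(name in sentence for sentence in sentences):
            emp.append(name)

# `in` and .get() also read without triggering default_factory
has_hayley = 'Hayley' in dummydict      # False
hayley_value = dummydict.get('Hayley')  # None

print(emp)                            # ['Hayley', 'Naomi']
print(len(dummydict) == size_before)  # True
```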

Pulling up "dict" value of nested JSON by one level

I'm looking at converting some Chef run_lists to tags, and would like to automate the process.
So far what I've done is created a variable that runs:
# write to file instead of directly to variable for archival purposes
os.system("knife search '*:*' -a expanded_run_list -F json > /tmp/hostname_runlist.json")
data = json.load(open('/tmp/hostname_runlist.json'))
From there, I have a dict within a dict with list values similar to this:
{u'abc.com': {u'expanded_run_list': None}}
{u'foo.com': {u'expanded_run_list': u'base::default'}}
{u'123.com': {u'expanded_run_list': [u'utils::default', u'base::default']}}
...
I would like to convert that to a simpler dictionary by removing the 'expanded_run_list' portion, as it's not required at this point, so that in the end it looks like this:
abc.com:None
foo.com:'base::default'
123.com:['utils::default', 'base::default']
I would like to keep the values as a list, or a single value depending on what is returned. When I run a 'for statement' to iterate, I can pull the hostnames from i.keys, but would need to remove the expanded_run_list key from i.values, as well as pair the key values up appropriately.
From there, I should have an easier time to iterate through the new dictionary when running an os.system Chef command to create the new tags. It's been a few years since I've written in python, so am a bit rusty. Any descriptive help would be much appreciated.
Assuming your list of dict objects is:
my_list = [
    {u'abc.com': {u'expanded_run_list': None}},
    {u'foo.com': {u'expanded_run_list': u'base::default'}},
    {u'123.com': {u'expanded_run_list': [u'utils::default', u'base::default']}}
]
Then, in order to achieve your desired result, you may use a combination of list comprehension and dict comprehension as:
For getting the list of nested dictionary
[{k: v.get('expanded_run_list') for k, v in l.items()} for l in my_list]
which will return you the list of dict objects in your desired form as:
[
    {u'abc.com': None},
    {u'foo.com': u'base::default'},
    {u'123.com': [u'utils::default', u'base::default']}
]
The above solution assumes that you only want the value of the key 'expanded_run_list' from each of your nested dictionaries. If that key doesn't exist, dict.get returns None, which is then stored as the value in the resulting dict.
For pulling up your nested dictionary to form single dictionary
{k: v.get('expanded_run_list') for l in my_list for k, v in l.items()}
which will return:
{
    'foo.com': 'base::default',
    '123.com': ['utils::default', 'base::default'],
    'abc.com': None
}
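Since the flattened values can be None, a single string, or a list, iterating over them later needs a small normalisation step. The sketch below assumes Python 3 and uses a hypothetical helper, as_list, that is not part of the original answer.

```python
my_list = [
    {'abc.com': {'expanded_run_list': None}},
    {'foo.com': {'expanded_run_list': 'base::default'}},
    {'123.com': {'expanded_run_list': ['utils::default', 'base::default']}},
]

# Pull the nested value up one level, as in the answer above
flat = {host: attrs.get('expanded_run_list')
        for entry in my_list
        for host, attrs in entry.items()}

def as_list(value):
    # Hypothetical helper: turn None / single string / list into a list
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)

# Stored values keep their original shape; normalise only while iterating
for host, run_list in flat.items():
    for recipe in as_list(run_list):
        print(host, recipe)
```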

clearing a dictionary but keeping the keys

Is it possible to clear all the entries within a dictionary but keep all the keys?
For example if I had:
my_dic = {
    "colour": [],
    "number": []
}
I put some stuff in them:
my_dic["colour"]='Red'
my_dic["number"]='2'
I can clear these by:
my_dic["colour"] = []
my_dic["number"] = []
But this is long-winded if I want to clear a large dictionary quickly. Is there a quicker way, perhaps using for? I want to keep the keys ["colour"], ["number"] without having to recreate them, and just clear all the entries within them.
You can simply clear all lists in a loop:
for value in my_dic.values():
    del value[:]
Note the value[:] slice deletion; we are removing all indices in the list, not the value reference itself.
Note that if you are using Python 2 you probably want to use my_dic.itervalues() instead of my_dic.values() to avoid creating a new list object for the loop.
Demo:
>>> my_dic = {'colour': ['foo', 'bar'], 'number': [42, 81]}
>>> for value in my_dic.values():
...     del value[:]
...
>>> my_dic
{'colour': [], 'number': []}
You could also replace all values with new empty lists:
my_dic.update((key, []) for key in my_dic)
or replace the whole dictionary entirely:
my_dic = {key: [] for key in my_dic}
Take into account these two approaches will not update other references to either the lists (first approach) or the whole dictionary (second approach).
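The difference in how other references are affected can be seen in a short sketch. The names colours and alias are illustrative only and do not come from the question.

```python
colours = ['red', 'blue']
my_dic = {'colour': colours, 'number': [1, 2]}
alias = my_dic  # second reference to the same dictionary

# In-place clearing: the list object is kept, so `colours` sees the change
for value in my_dic.values():
    del value[:]
in_place_seen = (colours == [])   # True

# Replacing the values: the dict gets new lists, the old list is left alone
colours.append('green')
my_dic.update((key, []) for key in my_dic)
replaced_seen = (colours == [])   # False, colours is still ['green']

# Rebinding the name: `alias` still points at the old dictionary object
my_dic['colour'].append('red')
my_dic = {key: [] for key in my_dic}
print(alias)   # {'colour': ['red'], 'number': []}
print(my_dic)  # {'colour': [], 'number': []}
```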
You don't need to delete the keys from the dictionary:
for key in my_dict:
    my_dict[key] = []
One liner:
my_dict = dict.fromkeys(my_dict, None)
You can also replace the None type with other values that are immutable. A mutable type such as a list will cause all of the values in your new dictionary to be the same list.
For mutable types you would have to populate the dictionary with distinct instances of that type as others have shown.
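The shared-list pitfall is easy to reproduce. A minimal sketch, with a hypothetical keys list standing in for the dictionary's keys:

```python
keys = ['colour', 'number']

# dict.fromkeys evaluates [] once, so every key maps to the SAME list
shared = dict.fromkeys(keys, [])
shared['colour'].append('red')
print(shared)    # {'colour': ['red'], 'number': ['red']}

# A dict comprehension creates a fresh list per key
distinct = {key: [] for key in keys}
distinct['colour'].append('red')
print(distinct)  # {'colour': ['red'], 'number': []}
```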

Checking items in a list of dictionaries in python

I have a list of dictionaries:
a = [{"ID":1, "VALUE":2},{"ID":2, "VALUE":2},{"ID":3, "VALUE":4},...]
"ID" is a unique identifier for each dictionary. Considering the list is huge, what is the fastest way of checking if a dictionary with a certain "ID" is in the list, and if not append to it? And then update its "VALUE" ("VALUE" will be updated if the dict is already in list, otherwise a certain value will be written)
You'd not use a list. Use a dictionary instead, mapping ids to nested dictionaries:
a = {
    1: {'VALUE': 2, 'foo': 'bar'},
    42: {'VALUE': 45, 'spam': 'eggs'},
}
Note that you don't need to include the ID key in the nested dictionary; doing so would be redundant.
Now you can simply look up if a key exists:
if someid in a:
    a[someid]['VALUE'] = newvalue
I did make the assumption that your ID keys are not necessarily sequential numbers. I also made the assumption you need to store other information besides VALUE; otherwise just a flat dictionary mapping ID to VALUE values would suffice.
A dictionary lets you look up values by key in O(1) time (constant time independent of the size of the dictionary). Lists let you look up elements in constant time too, but only if you know the index.
If you don't and have to scan through the list, you have a O(N) operation, where N is the number of elements. You need to look at each and every dictionary in your list to see if it matches ID, and if ID is not present, that means you have to search from start to finish. A dictionary will still tell you in O(1) time that the key is not there.
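As a rough sketch of the "check, add if missing, then update" flow on top of a dictionary: the helper name upsert is hypothetical, and unlike the layout suggested above, this version keeps the original item dicts (including their ID key) as the values so the existing list can be indexed without copying.

```python
a = [{"ID": 1, "VALUE": 2}, {"ID": 2, "VALUE": 2}, {"ID": 3, "VALUE": 4}]

# One O(N) pass to build the index; later lookups are O(1) on average
by_id = {item["ID"]: item for item in a}

def upsert(some_id, new_value):
    # Hypothetical helper: update VALUE if the ID exists, otherwise add an entry
    by_id.setdefault(some_id, {"ID": some_id})["VALUE"] = new_value

upsert(2, 10)  # existing ID: value updated
upsert(7, 99)  # new ID: entry created in by_id

print(by_id[2])  # {'ID': 2, 'VALUE': 10}
print(by_id[7])  # {'ID': 7, 'VALUE': 99}
```

Note that by_id shares the item dicts with the list a, so updating an existing ID is visible in a as well, while newly created entries live only in by_id.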
If you can, convert to a dictionary as the other answers suggest, but in case you have reason* to not change the data structure storing your items, here's what you can do:
items = [{"ID":1, "VALUE":2}, {"ID":2, "VALUE":2}, {"ID":3, "VALUE":4}]
def set_value_by_id(id, value):
    # Try to find the item, if it exists
    for item in items:
        if item["ID"] == id:
            break
    # Make and append the item if it doesn't exist
    else:  # Here, `else` means "if the loop terminated not via break"
        item = {"ID": id}
        items.append(item)
    # In either case, set the value
    item["VALUE"] = value
* Some valid reasons I can think of include preserving the order of items and allowing duplicate items with the same id. For ways to make dictionaries work with those requirements, you might want to take a look at OrderedDict and this answer about duplicate keys.
Convert your list into a dict; checking for values is then much more efficient.
d = dict((item['ID'], item['VALUE']) for item in a)
for new_key, new_value in new_items:
    if new_key not in d:
        d[new_key] = new_value
If you also need to update the value when the key is found:
d = dict((item['ID'], item['VALUE']) for item in a)
for new_key, new_value in new_items:
    d.setdefault(new_key, 0)
    d[new_key] = new_value
Answering the question you asked: without changing the data structure, there's no really faster way than looping over the list and doing a dictionary lookup on every element, but you can push the loop down to the Python runtime instead of using Python's for loop.
I haven't tried if it ends up faster though.
a = [{"ID":1, "VALUE":2},{"ID":2, "VALUE":2},{"ID":3, "VALUE":4}]
id = 2
# In Python 3, filter returns an iterator, so wrap it in list()
tmp = list(filter(lambda d: d['ID']==id, a))
# tmp is now either an empty list, or a list of one item.
if not tmp:
    tmp = {"ID":id, "VALUE":"default"}
    a.append(tmp)
else:
    tmp = tmp[0]
# tmp is bound to the found/new dictionary
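An alternative that stops at the first match instead of scanning the whole list is next() over a generator expression. This is a sketch only; find_or_append and its parameter names are hypothetical, and it is still O(N) in the worst case.

```python
a = [{"ID": 1, "VALUE": 2}, {"ID": 2, "VALUE": 2}, {"ID": 3, "VALUE": 4}]

def find_or_append(items, wanted_id, default_value="default"):
    # Hypothetical helper: next() returns the first match, or None if there is none
    found = next((d for d in items if d["ID"] == wanted_id), None)
    if found is None:
        found = {"ID": wanted_id, "VALUE": default_value}
        items.append(found)
    return found

existing = find_or_append(a, 2)  # the dict already in the list
created = find_or_append(a, 5)   # a new dict, appended to the list

print(existing)  # {'ID': 2, 'VALUE': 2}
print(created)   # {'ID': 5, 'VALUE': 'default'}
```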
