This question already has an answer here:
deleting entries in a dictionary based on a condition
(1 answer)
Closed 8 years ago.
I'm trying to drop items from a dictionary if the value of the key is below a certain threshold. For a simple example to what I mean:
my_dict = {'blue': 1, 'red': 2, 'yellow': 3, 'green': 4}
for color in my_dict:
    threshold_value = 3
    if my_dict[color] < threshold_value:
        del my_dict[color]
print(my_dict)
Now, I get a RuntimeError: dictionary changed size during iteration error. No big surprises there. The reason I'm posting this question is:
Find out if there's an elegant solution that doesn't require creating a new dictionary (that holds only the keys with values >= threshold).
Try to understand Python's rationale here. The way I read it to myself is: "go to the first key. Is the value of that key < x? If yes, del this key:value item and continue on to the next key in the dictionary; if no, continue to the next key without doing anything". In other words, what happened historically to previous keys shouldn't affect where I go next. I'm looking forward to the next items, regardless of the past.
I know it's a funny question (some might say stupid, I'll give you that), but what's Python's "way of thinking" about this loop? Why doesn't it work? How would Python read it out loud to itself? Just trying to get a better understanding of the language...
Because Python dictionaries are implemented as hash tables, you shouldn't rely on the iteration order staying stable while the dictionary is being modified. Key order may change unpredictably (but only after insertion or removal of a key). Thus, it's impossible to predict the next key. Python raises the RuntimeError to be safe, and to prevent people from running into unexpected results.
Python 2's dict.items method returns a copy of the key-value pairs, so you can safely iterate over it and delete the values you don't need by key, as @wim suggested in the comments. Example:
for k, v in my_dict.items():
    if v < threshold_value:
        del my_dict[k]
However, Python 3's dict.items returns a view object that reflects all changes made to the dictionary. This is the reason the solution above only works in Python 2. You may convert my_dict.items() to list (tuple etc.) to make it Python 3-compatible.
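A minimal sketch of that Python 3-compatible variant, assuming the question's my_dict and a threshold of 3. Wrapping items() in list(...) takes a snapshot of the pairs, so deleting from the dictionary does not disturb the loop:

```python
my_dict = {'blue': 1, 'red': 2, 'yellow': 3, 'green': 4}
threshold_value = 3

# list(...) copies the key-value pairs up front, so the loop iterates
# over the snapshot while the dictionary itself is modified.
for k, v in list(my_dict.items()):
    if v < threshold_value:
        del my_dict[k]

print(my_dict)  # {'yellow': 3, 'green': 4}
```

The cost is one temporary list of all items, which is usually negligible next to building a whole new dictionary.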
Another way to approach the problem is to select the keys you want to delete first and then delete them:
keys = [k for k, v in my_dict.items() if v < threshold_value]
for x in keys:
    del my_dict[x]
This works in both Python 2 and Python 3.
Dictionaries are unordered. After a key is deleted, nobody can say what the next key is. So Python in general does not allow adding or removing keys from a dictionary while it is being iterated over.
Just create a new one:
my_dict = {"blue":1,"red":2,"yellow":3,"green":4}
threshold_value = 3
# .items() works in both Python 2 and 3 (.iteritems() exists only in Python 2)
new_dict = {k:v for k,v in my_dict.items() if v >= threshold_value}
I guess that modifying a collection while iterating over it is hard to implement properly. Consider the following example:
>>> list = [1, 2, 3, 4, 5, 6]
>>> for ii in range(len(list)):
        print(list[ii])
        if list[ii] == 3:
            del list[ii]
1
2
3
5
6
Notice that in this example 4 was omitted altogether (the loop as written would also end with an IndexError, because the list is now shorter than the original range). It is very similar with dictionaries: deleting or adding entries might invalidate the internal structures that define the order of iteration (for example, you deleted enough entries that the hash table was resized).
To solve your case, just create a new dictionary and copy the items there.
Is there a more pythonic way of obtaining a sorted list of dictionary keys with one key moved to the head? So far I have this:
# create a unique list of keys headed by 'event' and followed by a sorted list.
# dfs is a dict of dataframes.
for k in (dict.fromkeys(['event']+sorted(dfs))):
    display(k,dfs[k]) # ideally this should be (k,v)
I suppose you would be able to do
for k, v in list(dfs.items()) + [('event', None)]:
.items() casts a dictionary to a list of tuples (or technically a dict_items, which is why I have to cast it to list explicitly to append), to which you can append a second list. Iterating through a list of tuples allows for automatic unpacking (so you can do k,v in list instead of tup in list)
What we really want is an iterable, but that's not possible with sorted, because it must see all the keys before it knows what the first item should be.
Using dict.fromkeys to create a blank dictionary by insertion order was pretty clever, but it relies on dict preserving insertion order, which was only an implementation detail in CPython 3.6 and is guaranteed by the language from Python 3.7 on. (Before that, dict was fundamentally unordered.) I admit, it took me a while to figure out that line.
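To see what the dict.fromkeys line does, here is a small sketch, assuming Python 3.7 or later (where dicts preserve insertion order) and a hypothetical dfs that holds strings in place of dataframes:

```python
# Hypothetical stand-in for the dict of dataframes in the question.
dfs = {'zeta': 'df_z', 'event': 'df_e', 'alpha': 'df_a'}

# 'event' appears twice in the input list; dict.fromkeys keeps only the
# first occurrence, so 'event' stays at the head of the key order.
ordered_keys = list(dict.fromkeys(['event'] + sorted(dfs)))
print(ordered_keys)  # ['event', 'alpha', 'zeta']
```

The values of the temporary dictionary are all None; only its key order is used.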
Since the code you posted is just working with the keys, I suggest you focus on that. Taking up a few more lines for readability is a good thing, especially if we can hide it in a testable function:
def display_by_keys(dfs, priority_items=None):
    if not priority_items:
        priority_items = ['event']
    # lists (not sets) keep the priority keys in the order they were given
    featured = [k for k in priority_items if k in dfs]
    others = [k for k in dfs if k not in featured]
    for key in featured + sorted(others):
        display(key, dfs[key])
The potential downside is you must sort the keys every time. If you do this much more often than the data store changes, on a large data set, that's a potential concern.
Of course you wouldn't be displaying a really large result, but if it becomes a problem, then you'll want to store them in a collections.OrderedDict (https://stackoverflow.com/a/13062357/1766544) or find a sorteddict module.
from collections import OrderedDict
# sort once
ordered_dfs = OrderedDict.fromkeys(sorted(dfs.keys()))
ordered_dfs.move_to_end('event', last=False)
ordered_dfs.update(dfs)
# display as often as you need
for k, v in ordered_dfs.items():
    print(k, v)
If you display different fields first in different views, that's not a problem. Just sort all the fields normally, and use a function like the one above, without the sort.
My problem is understanding why these certain lines of code do what they do. Basically why it works logically. I am using PyCharm python 3 I think.
house_Number = {
    "Luca": 1, "David": 2, "Alex": 3, "Kaden": 4, "Kian": 5
}
for item in house_Number:
    print(house_Number[item]) # Why does this print the values tied with the key?
    print(item) # Why does this print the key?
This is my first question so sorry I don't know how to format the code to make it look nice. My question is why when you use the for loop to print the dictionary key or value the syntax to print the key is to print every item? And what does it even mean to print(house_Number[item]).
They both work to print key or value but I really want to know a logical answer as to why it works this way. Thanks :D
I'm not working on any projects, just starting to learn from Codecademy.
In Python, iteration over a dictionary (for item in dict) is defined as iteration over that dictionary's keys. This is simply how the language was designed -- other languages and collection classes do it differently, iterating, for example, over key-value tuples, templated Pair<X,Y> objects, or what have you.
house_Number[item] accesses the value in house_Number referenced by the key item. [...] is the syntax for indexing in Python (and most other languages); an_array[2] gives the third element of an_array and house_Number[item] gives the value corresponding to the key item in the dictionary house_Number.
Just a side note: Python naming conventions would dictate house_number, not house_Number. Capital letters are generally only used in CamelCasedClassNames and CONSTANTS.
In Python, values inside a dictionary object are accessed using dictionary_name['KEY'].
In your case you are iterating over the keys of the dictionary.
Hope this helps
for item in dic:
    print(item) # key
    print(dic[item]) # value
Dictionaries are basically containers of items (keys) that are stored by hashing. These keys map to the values (dic[key]).
As with a set, traversing a dictionary with a for loop yields its keys. A dictionary is essentially a set of keys with a value associated with each one, so it makes sense for iteration to produce the keys, as it does for sets. (A set iterates in arbitrary order because its elements are hashed; a dict iterates in insertion order as of Python 3.7.)
Read more about dictionaries here https://docs.python.org/3/tutorial/datastructures.html#dictionaries and hopefully that will answer your question. Specifically, look at the .items() method of the dictionary object.
When you type for item in house_Number, you don’t specify whether item is the key or the value of house_Number. Python then takes it to mean the key of house_Number.
So when you call print(house_Number[item]), you’re printing the value, because you’re taking the key and looking up its value. In other words, you’re taking each key once and finding its value: 1, 2, 3, 4, 5.
The print(item) just prints the item, which is the key: "Luca", "David", "Alex", "Kaden", "Kian".
Because print(house_Number[item]) and print(item) alternate, you get the values and keys alternating, each on a new line.
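The three ways of walking the dictionary discussed above can be put side by side. This is a small sketch using the question's data (renamed house_number to follow the naming convention mentioned earlier):

```python
house_number = {"Luca": 1, "David": 2, "Alex": 3, "Kaden": 4, "Kian": 5}

# Iterating over the dict itself yields its keys.
keys = [item for item in house_number]

# Indexing with a key returns the value stored under that key.
values = [house_number[item] for item in house_number]

# .items() yields (key, value) pairs, so both can be unpacked at once.
for name, number in house_number.items():
    print(name, number)
```

Using .items() avoids the extra lookup house_number[item] when both the key and the value are needed.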
This question already has answers here:
Extract a subset of key-value pairs from dictionary?
(14 answers)
Closed 6 years ago.
I've a dictionary my_dict and a list of tokens my_tok as shown:
my_dict = {'tutor': 3,
           'useful': 1,
           'weather': 1,
           'workshop': 3,
           'thankful': 1,
           'puppy': 1}
my_tok = ['workshop',
          'puppy']
Is it possible to retain in my_dict, only the values present in my_tok rather than popping the rest?
i.e., I need to retain only workshop and puppy.
Thanks in advance!
Just overwrite it like so:
my_dict = {k:v for k, v in my_dict.items() if k in my_tok}
This is a dictionary comprehension that recreates my_dict using only the keys that are present as entries in the my_tok list.
As said in the comments, if the number of elements in the my_tok list is small compared to the number of dictionary keys, this solution is not the most efficient one. In that case it would be much better to iterate through the my_tok list instead, as follows:
my_dict = {k: my_dict.get(k) for k in my_tok}
which is more or less what the other answers propose. The only difference is the use of the .get dictionary method, which allows us not to care whether the key is present in the dictionary or not. If it isn't, it is assigned the default value (None unless a second positional argument is given; .get does not accept default as a keyword argument).
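The difference between filtering by membership and using .get only shows up when a token is not a key of the dictionary. A short sketch, with a made-up token 'missing' added for illustration:

```python
my_dict = {'tutor': 3, 'useful': 1, 'workshop': 3, 'puppy': 1}
my_tok = ['workshop', 'puppy', 'missing']  # 'missing' is a made-up token

# Filtering by membership skips tokens that are not keys of my_dict.
filtered = {k: my_dict[k] for k in my_tok if k in my_dict}
print(filtered)  # {'workshop': 3, 'puppy': 1}

# .get() keeps every token and fills absent ones with None.
with_defaults = {k: my_dict.get(k) for k in my_tok}
print(with_defaults)  # {'workshop': 3, 'puppy': 1, 'missing': None}
```

Which behavior is wanted depends on whether absent tokens should be dropped or kept as placeholders.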
Iterate over the keys in my_tok and look each one up in the original dictionary (this assumes every token is a key of my_dict; otherwise it raises KeyError):
my_dict = {i:my_dict[i] for i in my_tok}
Create a new copy
You can simply overwrite the original dictionary:
new_dic = {key: my_dict[key] for key in my_tok if key in my_dict}
Mind, however, that this constructs a new dictionary (perhaps you immediately write it back to my_dict), and that has implications: other references to the original dictionary will not reflect the change.
Since the number of tokens (my_tok) is limited, it is probably better to iterate over these tokens and do a contains-check on the dictionary (instead of looping over the tuples in the original dictionary).
Update the original dictionary
If you want the changes to be reflected in your original dictionary, you can, in a second step, .clear() the original dictionary and .update() it accordingly:
new_dic = {key: my_dict[key] for key in my_tok if key in my_dict}
my_dict.clear()
my_dict.update(new_dic)
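The point about other references can be checked directly. In this sketch, alias and rebound are illustrative names that are not part of the question:

```python
my_dict = {'tutor': 3, 'workshop': 3, 'puppy': 1}
my_tok = ['workshop', 'puppy']
alias = my_dict  # a second reference to the same dictionary object

# Building a new dictionary leaves the object behind alias untouched.
rebound = {key: my_dict[key] for key in my_tok if key in my_dict}
print(alias is rebound)  # False
print(len(alias))        # 3

# clear() + update() mutate the existing object, so alias sees the change.
my_dict.clear()
my_dict.update(rebound)
print(alias)  # {'workshop': 3, 'puppy': 1}
```

This matters when the dictionary has been passed to other functions or stored on other objects that should observe the filtered contents.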
This question already has answers here:
How do I delete items from a dictionary while iterating over it?
(11 answers)
Closed 6 years ago.
I have two dictionaries with keys as string and values as integer:
ground = {"url1":1,"url2":2,....}
prediction = {"url1":5,"url2":3,....}
The thing I want to do is to delete key in ground if it does not exist in prediction.
I wrote the easiest thing that came to my head:
for key in ground:
    if key not in prediction:
        del ground[key]
and also tried this:
for key in ground:
    if not key in prediction.keys():
        del ground[key]
Neither worked. How can I achieve the goal?
You could use a dictionary comprehension to create a new dictionary:
ground = {k: ground[k] for k in ground if k in prediction}
Or iterate over the keys of the ground dictionary using ground.keys() in Python 2, or list(ground.keys()) in Python 3 (also works in Python 2). This returns a new list which is not affected when keys are removed from the ground dictionary:
for k in list(ground.keys()):
    if k not in prediction:
        del ground[k]
Your second attempt works fine in Python 2. In Python 3, keys() is a view rather than a list, so would give the same error as the first: dictionary changed size during iteration. To fix this, convert it to a list first:
for key in list(ground.keys()):
    if key not in prediction:
        del ground[key]
You loop over the dict and delete a key inside the same loop. That's why it may not work. Try:
pred_set = set(prediction.keys())
g_set = set(ground.keys())
for key in g_set - pred_set:
    del ground[key]
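In Python 3 the explicit set() conversions are not strictly needed, because keys views already support set operations. A sketch with made-up sample data (the question's dictionaries are elided):

```python
# Made-up sample data standing in for the question's dictionaries.
ground = {"url1": 1, "url2": 2, "url3": 3}
prediction = {"url1": 5, "url2": 3}

# The difference of two keys views is a new set, independent of ground,
# so deleting from ground inside the loop is safe.
for key in ground.keys() - prediction.keys():
    del ground[key]

print(ground)  # {'url1': 1, 'url2': 2}
```

Only the keys that actually need deleting are visited, which helps when most keys are present in both dictionaries.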
Given something like this:
yesterday = {'facebook_adgroups': [4634L], 'google_third_party_tags': [1790L]}
I want to convert the values that happen to be in a list/long format, into just numbers so that I can run some calculations later. I keep running into a few different errors, or the changes are just not 'taking'. Here was my last attempt:
for k, v in yesterday.iteritems():
    i = 0
    print v[0]
    yesterday.values()[i] = v[0]
    i += 1
I don't think longs have anything to do with it: IIUC, you simply want to turn the values, which are currently lists of one element, into the elements themselves. You can use a dict comprehension for this:
>>> yesterday = {'facebook_adgroups': [4634L], 'google_third_party_tags': [1790L]}
>>> yesterday = {k: v[0] for k,v in yesterday.iteritems()}
>>> yesterday
{'facebook_adgroups': 4634L, 'google_third_party_tags': 1790L}
This line of your code:
yesterday.values()[i] = v[0]
won't "take" because yesterday.values() constructs a new list, separate from the dictionary, containing the dictionary values. [i] selects the i-th element of that new temporary list, and then you set that to v[0]. The original yesterday dictionary is unaffected. If you'd mutated the element, though, e.g. by appending to the list, that change would have shown up in the dictionary, because the elements in the .values() list are the very same objects that are in the dictionary.
(Note also that because dictionaries aren't ordered, referring to elements by index like this is going to get you into trouble, even if it seems to work sometimes.)
DSM's answer shows what you can do to fix your problem (use a dict comprehension), but I thought I'd explain a bit more about why your current code doesn't work.
The key line is:
yesterday.values()[i] = v[0]
This doesn't do anything useful. In Python 2, dict.values() returns a new list with a copy of the dictionary's current values. Mutating that list does not do anything to change the dictionary. (In Python 3, dict.values() returns a "view" object which is directly tied to the dictionary, but it is neither indexable nor mutable, so trying to assign to an index within it will raise an exception.)
An alternative would be to change the value directly in the dictionary:
yesterday[k] = v[0] # or use int(v[0]) if you still need to get rid of the longs
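For reference, a sketch of the same in-place update in Python 3, where there is no separate long type and no L suffix, so the sample values are written as plain integers:

```python
yesterday = {'facebook_adgroups': [4634], 'google_third_party_tags': [1790]}

# Assigning to an existing key does not change the dictionary's size,
# so it is safe to do while iterating over items().
for k, v in yesterday.items():
    yesterday[k] = int(v[0])

print(yesterday)  # {'facebook_adgroups': 4634, 'google_third_party_tags': 1790}
```

Only adding or removing keys during iteration triggers the "changed size" error; replacing values for existing keys does not.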
I believe the following tweak does what you are asking:
When you just assign the value to the corresponding key, it preserves the long characteristic:
yesterday = {'facebook_adgroups': [4634L], 'google_third_party_tags': [1790L]}
for k, v in yesterday.iteritems():
    i = 0
    print v[0]
    yesterday[k] = v[0]
    i += 1
print yesterday
Results in
4634
1790
{'facebook_adgroups': 4634L, 'google_third_party_tags': 1790L}
Note the values still have the L. But when I change the assignment:
yesterday[k] = int(v[0])
The output becomes
{'facebook_adgroups': 4634, 'google_third_party_tags': 1790}
I will freely admit this is "not very Pythonic" - but it does appear to do the job.
Upon re-reading your question, it appears I answered the question as stated in the title - but now that I look more closely it's not clear that this is what you were struggling with.
I would strongly caution you that if you started out with key/value pairs, trying to go to a numerical index is fraught with danger. Things may not be in the order you thought, items may be missing, etc. Unless you specifically assign in a known order, with defaults for missing values, this will go wrong one day - and you will be searching for hours to find the bug. Better to do something like
yNum[0] = yesterday['facebook_adgroups']
yNum[1] = yesterday['google_third_party_tags']
etc. Perhaps with int() around it if you want to force the type.
I can't really see the point behind your question, but as per my understanding of your problem:
yesterday = {'facebook_adgroups': [4634L], 'google_third_party_tags': [1790L]}
for k,v in yesterday.items():
    print '%s : %s' % (k, map(int,v)[0])
for k,v in yesterday.items():
    print '%s : %s' % (k, int(v[0]))
Both the above ways will provide you with the same output
output:
facebook_adgroups : 4634
google_third_party_tags : 1790
facebook_adgroups : 4634
google_third_party_tags : 1790