I have a dictionary whose values are lists of dictionaries, something like this:
x = {'a':[{'p':1, 'q':2}, {'p':4, 'q':5}], 'b':[{'p':6, 'q':1}, {'p':10, 'q':12}]}
The length of the lists (values) is the same for all keys of dict x.
I want to get the length of any one value (a list) without going through the obvious method: get the keys, then use len(x[keys[0]]) to get the length.
My code for this as of now:
val = None
for key in x.keys():
    val = x[key]
    # break after the first iteration, as the length of the lists is the same for any key
    break
try:
    what_i_Want = len(val)
except TypeError:
    print "val wasn't set"
I am not happy with this; I believe it can be made more 'pythonic'.
This is the most efficient way, since it doesn't create any intermediate lists.
print len(x[next(iter(x))]) # 2
Note: For this method to work, the dictionary must have at least one key in it; on an empty dictionary, next raises StopIteration.
What about this:
val = x[x.keys()[0]]
or alternatively:
val = x.values()[0]
and then your answer is
len(val)
Some of the other solutions (posted by thefourtheye and gnibbler) are better because they are not creating an intermediate list. I added this response merely as an easy to remember and obvious option, not a solution for time-efficient usage.
Works ok in Python2 or Python3
>>> x = {'a':[{'p':1, 'q':2}, {'p':4, 'q':5}], 'b':[{'p':6, 'q':1}, {'p':10, 'q':12}]}
>>> next(len(i) for i in x.values())
2
This is better for Python2 as it avoids making a list of the values. Works well in Python3 too
>>> next(len(x[k]) for k in x)
2
Using next and iter:
>>> x = {'a':[{'p':1, 'q':2}, {'p':4, 'q':5}], 'b':[{'p':6, 'q':1}, {'p':10, 'q':12}]}
>>> val = next(iter(x.values()), None) # Use `itervalues` in Python 2.x
>>> val
[{'q': 2, 'p': 1}, {'q': 5, 'p': 4}]
>>> len(val)
2
>>> x = {}
>>> val = next(iter(x.values()), None) # `None`: default value
>>> val is None
True
>>> x = {'a':[{'p':1, 'q':2}, {'p':4, 'q':5}], 'b':[{'p':6, 'q':1}, {'p':10, 'q':12}]}
>>> len(x.values()[0])
2
Here, x.values() gives you a list of all values (in Python 2; in Python 3 it returns a view that cannot be indexed), and you can then take the length of any one of them.
I know how to write something simple and slow with a loop, but I need it to run very fast at large scale.
input:
lst = [[1, 1, 2], ["txt1", "txt2", "txt3"]]
desired output:
d = {1 : ["txt1", "txt2"], 2 : "txt3"}
Is there something built into Python that makes dict() extend the value of an existing key instead of replacing it?
dict(list(zip(lst[0], lst[1])))
One option is to use dict.setdefault:
out = {}
for k, v in zip(*lst):
    out.setdefault(k, []).append(v)
Output:
{1: ['txt1', 'txt2'], 2: ['txt3']}
If you want the element itself for singleton lists, one way is adding a condition that checks for it while you build an output dictionary:
out = {}
for k, v in zip(*lst):
    if k in out:
        if isinstance(out[k], list):
            out[k].append(v)
        else:
            out[k] = [out[k], v]
    else:
        out[k] = v
or if lst[0] is sorted (like it is in your sample), you could use itertools.groupby:
from itertools import groupby
out = {}
pos = 0
for k, v in groupby(lst[0]):
    length = len([*v])
    if length > 1:
        out[k] = lst[1][pos:pos+length]
    else:
        out[k] = lst[1][pos]
    pos += length
Output:
{1: ['txt1', 'txt2'], 2: 'txt3'}
But as @timgeb notes, it's probably not something you want, because afterwards you'll have to check the data type each time you access this dictionary (whether the value is a list or not), which is an unnecessary problem that you could avoid by having all values as lists.
If you're dealing with large datasets it may be useful to add a pandas solution.
>>> import pandas as pd
>>> lst = [[1, 1, 2], ["txt1", "txt2", "txt3"]]
>>> s = pd.Series(lst[1], index=lst[0])
>>> s
1 txt1
1 txt2
2 txt3
>>> s.groupby(level=0).apply(list).to_dict()
{1: ['txt1', 'txt2'], 2: ['txt3']}
Note that this also produces lists for single elements (e.g. ['txt3']) which I highly recommend. Having both lists and strings as possible values will result in bugs because both of those types are iterable. You'd need to remember to check the type each time you process a dict-value.
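To make the iterability problem above concrete, here is a minimal sketch. The `count_items` helper is hypothetical and only stands in for any code that iterates over the dictionary's values.

```python
mixed = {1: ['txt1', 'txt2'], 2: 'txt3'}      # singleton stored as a bare string
uniform = {1: ['txt1', 'txt2'], 2: ['txt3']}  # every value is a list

def count_items(d):
    # Hypothetical consumer: counts the elements stored under each key.
    return {k: len(list(v)) for k, v in d.items()}

print(count_items(mixed))    # {1: 2, 2: 4} -> 'txt3' was iterated as 4 characters
print(count_items(uniform))  # {1: 2, 2: 1}
```

No exception is raised in the mixed case, which is what makes this kind of bug easy to miss.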
You can use a defaultdict to group the strings by their corresponding key, then make a second pass through the list to extract the strings from singleton lists. Regardless of what you do, you'll need to access every element in both lists at least once, so some iteration structure is necessary (and even if you don't explicitly use iteration, whatever you use will almost definitely use iteration under the hood):
from collections import defaultdict
lst = [[1, 1, 2], ["txt1", "txt2", "txt3"]]
result = defaultdict(list)
for key, value in zip(lst[0], lst[1]):
    result[key].append(value)
for key in result:
    if len(result[key]) == 1:
        result[key] = result[key][0]
print(dict(result)) # Prints {1: ['txt1', 'txt2'], 2: 'txt3'}
I have this current code
lst = [1,2,3,4]
c = dict((el,0) for el in lst)
for key in lst:
    c[key] += increase_val(key)
Is there a more pythonic way to do it? Like using map? This code works, but I would like a one-liner or perhaps a better way of writing it.
In my opinion, that is a very clean, readable way of updating the dictionary in the way you wanted.
However, if you are looking for a one-liner, here's one:
new_dict = {x: y + increase_val(x) for x, y in old_dict.items()}
What's different is that this creates a new dictionary instead of updating the original one. If you want to mutate the dictionary in place, I think the plain old for loop is the most readable alternative.
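The difference between building a new dictionary and mutating in place can be sketched as follows. The body of `increase_val` is a hypothetical stand-in, since the question does not show the real function.

```python
def increase_val(key):
    # Hypothetical stand-in; the real function is not shown in the question.
    return key * 10

old_dict = {1: 0, 2: 0, 3: 0, 4: 0}

# Comprehension: builds a new dict and leaves old_dict untouched.
new_dict = {k: v + increase_val(k) for k, v in old_dict.items()}
print(old_dict)  # {1: 0, 2: 0, 3: 0, 4: 0}

# In-place loop: every other reference to the same dict sees the update.
alias = old_dict
for k in old_dict:
    old_dict[k] += increase_val(k)

print(new_dict)  # {1: 10, 2: 20, 3: 30, 4: 40}
print(alias)     # {1: 10, 2: 20, 3: 30, 4: 40}
```

Which one is preferable depends on whether other parts of the program hold a reference to the original dictionary.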
In your case there is no need for the c = dict((el,0) for el in lst) statement, because it only creates a dictionary where the value of each key is 0.
In the following for loop you add the increment to 0 (e.g. 0 + 100 = 100), so there is no need for the addition either.
You can write code like:
lst = [1,2,3,4]
c = {}
for key in lst:
    c[key] = increase_val(key)
collections.Counter()
Use collections.Counter() to avoid the extra pass over the list that creates the dictionary; a Counter returns 0 for any missing key, which is the default value you need here.
This requires the collections module: import collections
Demo:
>>> lst = [1,2,3,4]
>>> data = collections.Counter()
>>> for key in lst:
...     data[key] += increase_val(key)
collections.defaultdict()
We can also use collections.defaultdict. Just use data = collections.defaultdict(int) in the code above; the default value is then zero.
If we want the default to be some other constant, such as 100, we can pass a lambda that returns it.
Demo:
>>> data = {}
>>> data["any"]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'any'
This raises a KeyError because there is no key 'any' in the dictionary.
>>> data1 = collections.defaultdict(lambda:0, data)
>>> data1["any"]
0
>>> data1 = collections.defaultdict(lambda:100, data)
>>> data1["any"]
100
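One detail worth knowing about defaultdict is that the default is only produced, and stored, when a missing key is accessed by indexing. The following sketch illustrates this; the key name "any" is taken from the demo above.

```python
import collections

data = collections.defaultdict(lambda: 100)

print(data.get("any"))  # None: get() does not call the default factory
print("any" in data)    # False: nothing has been stored yet
print(data["any"])      # 100: indexing calls the factory and stores the result
print(dict(data))       # {'any': 100}
```

So merely reading a missing key with `data[key]` grows the dictionary, which can matter when keys are later counted or iterated.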
Is there a smart pythonic way to check if there is an item (key,value) in a dict?
a={'a':1,'b':2,'c':3}
b={'a':1}
c={'a':2}
b in a:
--> True
c in a:
--> False
Use the short circuiting property of and. In this way if the left hand is false, then you will not get a KeyError while checking for the value.
>>> a={'a':1,'b':2,'c':3}
>>> key,value = 'c',3 # Key and value present
>>> key in a and value == a[key]
True
>>> key,value = 'b',3 # value absent
>>> key in a and value == a[key]
False
>>> key,value = 'z',3 # Key absent
>>> key in a and value == a[key]
False
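The same check can be wrapped in a small helper. This is only a sketch; the name `has_item` is made up for illustration.

```python
def has_item(d, key, value):
    """Return True if d maps key to value; never raises KeyError."""
    # `and` short-circuits, so d[key] is evaluated only when key is present.
    return key in d and d[key] == value

a = {'a': 1, 'b': 2, 'c': 3}
print(has_item(a, 'c', 3))  # True
print(has_item(a, 'b', 3))  # False
print(has_item(a, 'z', 3))  # False
```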
You can check a tuple of the key, value against the dictionary's .items().
test = {'a': 1, 'b': 2}
print(('a', 1) in test.items())
>>> True
You've tagged this 2.7, as opposed to 2.x, so you can check whether the tuple is in the dict's viewitems:
(key, value) in d.viewitems()
Under the hood, this basically does key in d and d[key] == value.
In Python 3, viewitems is just items, but don't use items in Python 2! That'll build a list and do a linear search, taking O(n) time and space to do what should be a quick O(1) check.
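As a quick illustration of the view-based membership test, assuming Python 3 (where items() returns a view rather than a list):

```python
d = {'a': 1, 'b': 2}

print(('a', 1) in d.items())  # True
print(('a', 2) in d.items())  # False: key present, value differs
print(('z', 1) in d.items())  # False: key absent, no KeyError

# The value is compared with ==, not hashed, so unhashable values work too.
e = {'k': [1, 2]}
print(('k', [1, 2]) in e.items())  # True
```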
>>> a = {'a': 1, 'b': 2, 'c': 3}
>>> b = {'a': 1}
>>> c = {'a': 2}
First, here is a way that works for both Python 2 and Python 3:
>>> all(k in a and a[k] == b[k] for k in b)
True
>>> all(k in a and a[k] == c[k] for k in c)
False
In Python3 you can also use
>>> b.items() <= a.items()
True
>>> c.items() <= a.items()
False
For Python2, the equivalent is
>>> b.viewitems() <= a.viewitems()
True
>>> c.viewitems() <= a.viewitems()
False
Converting my comment into an answer:
Use the dict.get method, which is already built in (and, I assume, the most pythonic option). The variable is named person here to avoid shadowing the built-in dict:
>>> person = {'Name': 'Anakin', 'Age': 27}
>>> person.get('Age')
27
>>> person.get('Gender', 'None')
'None'
>>>
As per the docs -
get(key, default) -
Return the value for key if key is in the dictionary, else default.
If default is not given, it defaults to None, so that this method
never raises a KeyError.
Using get:
# this doesn't work if `None` is a possible value
# but you can use a different sentinel value in that case
a.get('a') == 1
Using try/except:
# more verbose than using `get`, but more foolproof also
a = {'a':1,'b':2,'c':3}
try:
has_item = a['a'] == 1
except KeyError:
has_item = False
print(has_item)
Other answers suggesting items in Python3 and viewitems in Python 2.7 are easier to read and more idiomatic, but the suggestions in this answer will work in both Python versions without any compatibility code and will still run in constant time. Pick your poison.
a.get('a') == 1
=> True
a.get('a') == 2
=> False
if None is valid item:
{'x': None}.get('x', object()) is None
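The sentinel idea can be sketched as a reusable helper. The names `_MISSING` and `has_item_via_get` are hypothetical, chosen only for this example.

```python
_MISSING = object()  # unique sentinel; no stored value can be identical to it

def has_item_via_get(d, key, value):
    """Return True if d maps key to value, even when value is None."""
    found = d.get(key, _MISSING)
    return found is not _MISSING and found == value

print(has_item_via_get({'x': None}, 'x', None))  # True
print(has_item_via_get({}, 'x', None))           # False
print({}.get('x') == None)                       # True: the false positive the sentinel avoids
```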
Using .get is usually the best way to check if a key-value pair exists.
if my_dict.get('some_key'):
    # Do something
There is one caveat: if the key exists but its value is falsy, the test fails, which may not be what you want. Keep in mind this is rarely the case. The inverse is a more frequent problem: using in to test for the presence of a key. I have run into this frequently when reading CSV files.
Example
# csv looks something like this:
a,b
1,1
1,
# now the code
import csv
with open('path/to/file', 'r') as fh:
    reader = csv.DictReader(fh)  # iterating the reader yields one dict per row
    for row_d in reader:
        if 'b' in row_d:
            # On the second iteration of this loop, b maps to the empty string but
            # still passes this condition. Most of the time you won't want
            # this; using .get would be better for most things here.
            pass
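The difference between the two tests can be reproduced without a file on disk. This sketch uses io.StringIO as an in-memory stand-in for the CSV shown above.

```python
import csv
import io

# In-memory stand-in for the two-row CSV file above.
fh = io.StringIO("a,b\n1,1\n1,\n")
rows = list(csv.DictReader(fh))

print(rows[1]['b'] == '')                    # True: the empty field is an empty string
print(['b' in row for row in rows])          # [True, True]: the key is always present
print([bool(row.get('b')) for row in rows])  # [True, False]: the empty field is falsy
```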
For python 3.x
use if key in dict
See the sample code
#!/usr/bin/python
a={'a':1,'b':2,'c':3}
b={'a':1}
c={'a':2}
mylist = [a, b, c]
for obj in mylist:
    if 'b' in obj:
        print(obj['b'])
Output: 2
I have the following dictionary, where keys are integers and values are floats:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
This dictionary has keys 1, 2, 3 and 4.
Now, I remove the key '2':
foo = {1:0.001,3:1.093,4:5.246}
I only have the keys 1, 3 and 4 left. But I want these keys to be called 1, 2 and 3.
The function 'enumerate' allows me to get the list [1,2,3]:
some_list = []
for k,v in foo.items():
    some_list.append(k)
num_list = list(enumerate(some_list, start=1))
Next, I try to populate the dictionary with these new keys and the old values:
new_foo = {}
for i in num_list:
    for value in foo.itervalues():
        new_foo[i[0]] = value
However, new_foo now contains the following values:
{1: 5.246, 2: 5.246, 3: 5.246}
So every value was replaced by the last value of 'foo'. I think the problem comes from the design of my for loop, but I don't know how to solve this. Any tips?
Using the list-comprehension-like style:
bar = dict( (k,v) for k,v in enumerate(foo.values(), start=1) )
But, as mentioned in the comments, the ordering is going to be arbitrary, since dicts are unordered (before Python 3.7, where insertion order became guaranteed). To preserve the original key order the following can be used:
bar = dict( ( i,foo[k] ) for i, k in enumerate(sorted(foo), start=1) )
Here sorted(foo) returns the sorted list of foo's keys, and i is the new 1-based number assigned to each of them, which becomes the key in the new dict.
Like others have said, it would be best to use a list instead of dict. However, in case you prefer to stick with a dict, you can do
foo = {j+1:foo[k] for j,k in enumerate(sorted(foo))}
I agree with the other responses that a list implements the behavior you describe, and so is probably more appropriate, but I will suggest an answer anyway.
The problem with your code is the way you are using the data structures. Simply enumerate the items left in the dictionary:
new_foo = {}
for key, (old_key, value) in enumerate( sorted( foo.items() ) ):
    key = key+1 # adjust for 1-based
    new_foo[key] = value
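To illustrate why the nested loops in the question end up with the last value everywhere, here is a minimal sketch, assuming Python 3.7+ (so that foo.values() follows insertion order). The new keys (1, 2, 3) are hard-coded for brevity.

```python
foo = {1: 0.001, 3: 1.093, 4: 5.246}

# Nested loops: for each new key the inner loop assigns every value in turn,
# so only the last assignment survives.
nested = {}
for new_key in (1, 2, 3):
    for value in foo.values():
        nested[new_key] = value

# Pairing instead of nesting: zip walks both sequences in step.
paired = dict(zip((1, 2, 3), (foo[k] for k in sorted(foo))))

print(nested)  # {1: 5.246, 2: 5.246, 3: 5.246}
print(paired)  # {1: 0.001, 2: 1.093, 3: 5.246}
```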
A dictionary is the wrong structure here. Use a list; lists map contiguous integers to values, after all.
Either adjust your code to start at 0 rather than 1, or include a padding value at index 0:
foo = [None, 0.001, 2.097, 1.093, 5.246]
Deleting the 2 'key' is then as simple as:
del foo[2]
giving you automatic renumbering of the rest of your 'keys'.
This looks suspiciously like Something You Should Not Do, but I'll assume for a moment that you're simplifying the process for an MCVE rather than actually trying to name your dict keys 1, 2, 3, 4, 5, ....
d = {1:0.001, 2:2.097, 3:1.093, 4:5.246}
del d[2]
# d == {1:0.001, 3:1.093, 4:5.246}
new_d = {idx:val for idx,val in zip(range(1,len(d)+1),
(v for _,v in sorted(d.items())))}
# new_d == {1: 0.001, 2: 1.093, 3: 5.246}
You can convert dict to list, remove specific element, then convert list to dict. Sorry, it is not a one liner.
In [1]: foo = {1:0.001,2:2.097,3:1.093,4:5.246}
In [2]: l=foo.values() #[0.001, 2.097, 1.093, 5.246]
In [3]: l.pop(1) #returns 2.097, not the list
In [4]: dict(enumerate(l,1))
Out[4]: {1: 0.001, 2: 1.093, 3: 5.246}
Try:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
foo.pop(2)
new_foo = {i: value for i, (_, value) in enumerate(sorted(foo.items()), start=1)}
print new_foo
However, I'd advise you to use a normal list instead, which is designed exactly for fast lookup of gapless, numeric keys:
foo = [0.001, 2.097, 1.093, 5.245]
foo.pop(1) # list indices start at 0
print foo
One liner that filters a sequence, then re-enumerates and constructs a dict.
In [1]: foo = {1:0.001, 2:2.097, 3:1.093, 4:5.246}
In [2]: selected=1
In [3]: { k:v for k,v in enumerate((foo[i] for i in foo if i != selected), 1) }
Out[3]: {1: 2.097, 2: 1.093, 3: 5.246}
I have a more compact method.
I think it's more readable and easier to understand. See below:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
del foo[2]
foo.update({k:foo[4] for k in foo.iterkeys()})
print foo
This prints:
{1: 5.246, 3: 5.246, 4: 5.246}
I am currently using the following function to compare dictionary values and display all the values that don't match. Is there a faster or better way to do it?
match = True
for keys in dict1:
    if dict1[keys] != dict2[keys]:
        match = False
        print keys
        print dict1[keys],
        print '->' ,
        print dict2[keys]
Edit: Both the dicts contain the same keys.
If the true intent of the question is the comparison between dicts (rather than printing differences), the answer is
dict1 == dict2
This has been mentioned before, but I felt it was slightly drowning in other bits of information. It might appear superficial, but the value comparison of dicts has actually powerful semantics. It covers
number of keys (if they don't match, the dicts are not equal)
names of keys (if they don't match, they're not equal)
value of each key (they have to be '==', too)
The last point again appears trivial, but is actually interesting, as it means that all of this applies recursively to nested dicts as well. E.g.
m1 = {'f':True}
m2 = {'f':True}
m3 = {'a':1, 2:2, 3:m1}
m4 = {'a':1, 2:2, 3:m2}
m3 == m4 # True
Similar semantics exist for the comparison of lists. All of this makes it a no-brainer to compare, for example, deep JSON structures with nothing more than a simple "==".
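A small sketch of what this recursive comparison does and does not ignore, using made-up JSON-like data:

```python
a = {'name': 'x', 'tags': ['p', 'q'], 'meta': {'n': 1, 'ok': True}}
b = {'meta': {'ok': True, 'n': 1}, 'tags': ['p', 'q'], 'name': 'x'}
c = {'name': 'x', 'tags': ['q', 'p'], 'meta': {'n': 1, 'ok': True}}

print(a == b)  # True: key order does not matter, at any nesting depth
print(a == c)  # False: list order does matter
```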
If the dicts have identical sets of keys and you need all those prints for any value difference, there isn't much you can do; maybe something like:
diffkeys = [k for k in dict1 if dict1[k] != dict2[k]]
for k in diffkeys:
    print k, ':', dict1[k], '->', dict2[k]
pretty much equivalent to what you have, but you might get nicer presentation for example by sorting diffkeys before you loop on it.
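The sorted variant mentioned above might look like this in Python 3. The sample dictionaries are invented for illustration, and the keys are assumed to be mutually comparable so that sorted() works.

```python
dict1 = {'b': 1, 'a': 2, 'c': 3}
dict2 = {'b': 9, 'a': 7, 'c': 3}

# Collect the differing keys, then sort them for stable, readable output.
diffkeys = sorted(k for k in dict1 if dict1[k] != dict2[k])
for k in diffkeys:
    print(k, ':', dict1[k], '->', dict2[k])
# a : 2 -> 7
# b : 1 -> 9
```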
You can use sets for this too
>>> a = {'x': 1, 'y': 2}
>>> b = {'y': 2, 'x': 1}
>>> set(a.iteritems())-set(b.iteritems())
set([])
>>> a['y']=3
>>> set(a.iteritems())-set(b.iteritems())
set([('y', 3)])
>>> set(b.iteritems())-set(a.iteritems())
set([('y', 2)])
>>> set(b.iteritems())^set(a.iteritems())
set([('y', 3), ('y', 2)])
Uhm, you are describing dict1 == dict2 (check whether both dicts are equal)
But what your code does is all( dict1[k]==dict2[k] for k in dict1 ) ( check if all entries in dict1 are equal to those in dict2 )
Not sure if this helps but in my app I had to check if a dictionary has changed.
Doing this will not work since basically it's still the same object:
val={'A':1,'B':2}
old_val=val
val['A']=10
if old_val != val:
    print('changed')
Using copy/deepcopy works:
import copy
val={'A':1,'B':2}
old_val=copy.deepcopy(val)
val['A']=10
if old_val != val:
    print('changed')
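For nested dictionaries the choice between a shallow copy and deepcopy matters. A minimal sketch with a made-up nested value:

```python
import copy

val = {'A': 1, 'B': {'inner': 2}}
shallow = val.copy()        # new outer dict, but the inner dict is shared
deep = copy.deepcopy(val)   # inner dict is copied as well

val['B']['inner'] = 99      # mutate a nested object

print(shallow != val)  # False: the change is visible through the shallow copy
print(deep != val)     # True: the deep copy kept the old inner value
```

A shallow copy is enough when all values are immutable (numbers, strings, tuples of those); otherwise deepcopy is the safer snapshot.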
If your values are hashable (e.g. strings), then in Python 3 you can simply compare the items views of the two dicts.
https://docs.python.org/3/library/stdtypes.html#dict-views
set_with_unique_key_value_pairs = dict1.items() ^ dict2.items()
set_with_matching_key_value_pairs = dict1.items() & dict2.items()
Any set operations are available to you.
Since you might not care about keys in this case, you can also compare just the values (again, provided they are hashable). A values view is not set-like, so convert it to a set first.
set_with_matching_values = set(dict1.values()) & set(dict2.values())
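A short worked example of these set operations, assuming Python 3 and hashable values; the sample data is invented.

```python
dict1 = {'x': 1, 'y': 2, 'z': 3}
dict2 = {'x': 1, 'y': 5}

differing = dict1.items() ^ dict2.items()  # pairs found in exactly one dict
matching = dict1.items() & dict2.items()   # pairs found in both dicts

print(sorted(differing))  # [('y', 2), ('y', 5), ('z', 3)]
print(sorted(matching))   # [('x', 1)]

# Values have to be wrapped in set() before using set operators.
print(set(dict1.values()) & set(dict2.values()))  # {1}
```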
>>> a = {'x': 1, 'y': 2}
>>> b = {'y': 2, 'x': 1}
>>> print a == b
True
>>> c = {'z': 1}
>>> print a == c
False
>>>
If you're just comparing for equality, you can just do this:
if dict1 != dict2:
    match = False
Otherwise, the only major problem I see is that you're going to get a KeyError if there is a key in dict1 that is not in dict2, so you may want to do something like this:
for key in dict1:
    if key not in dict2 or dict1[key] != dict2[key]:
        match = False
You could compress this into a comprehension to just get the list of keys that don't match too:
mismatch_keys = [key for key in dict1 if key not in dict2 or dict1[key] != dict2[key]]
match = not mismatch_keys  # If the list is not empty, they don't match
for key in mismatch_keys:
    print key
    print '%s -> %s' % (dict1[key], dict2.get(key))  # .get avoids a KeyError for keys missing from dict2
The only other optimization I can think of might be to use "len(dict)" to figure out which dict has fewer entries and loop through that one first to have the shortest loop possible.
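If keys may be missing on either side, one possible approach is to walk the union of both key sets. This is a sketch in Python 3; `mismatches` and `MISSING` are hypothetical names, not part of the original code.

```python
MISSING = object()  # marks a key that is absent from one of the dicts

def mismatches(d1, d2):
    """Map each differing key to a (value in d1, value in d2) pair."""
    out = {}
    for key in d1.keys() | d2.keys():  # union of both key sets
        v1, v2 = d1.get(key, MISSING), d2.get(key, MISSING)
        if v1 is MISSING or v2 is MISSING or v1 != v2:
            out[key] = (v1, v2)
    return out

diff = mismatches({'a': 1, 'b': 2}, {'a': 1, 'b': 3, 'c': 4})
print(sorted(diff))             # ['b', 'c']
print(diff['b'])                # (2, 3)
print(diff['c'][0] is MISSING)  # True: 'c' exists only in the second dict
```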
If your dictionaries are deeply nested and contain different types of collections, you could convert them to JSON strings and compare those.
import json
match = (json.dumps(dict1, sort_keys=True) == json.dumps(dict2, sort_keys=True))
Caveats: sort_keys=True is needed so that the output does not depend on the order in which keys were inserted, and this solution will not work if your dictionaries contain values that are not JSON serializable, such as binary strings.
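The effect of key order on the serialized form can be checked with a small sketch, assuming Python 3.7+ (where dicts keep insertion order); the sample data is invented.

```python
import json

d1 = {'a': 1, 'b': {'x': [1, 2]}}
d2 = {'b': {'x': [1, 2]}, 'a': 1}  # same content, different insertion order

print(d1 == d2)                          # True
print(json.dumps(d1) == json.dumps(d2))  # False: the strings follow insertion order
print(json.dumps(d1, sort_keys=True) == json.dumps(d2, sort_keys=True))  # True
```

Since plain == already compares nested dicts correctly, the JSON route is mainly useful when a canonical string is needed anyway, for example for hashing or logging.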