I'm pretty new to all of this, so this might be a newbie question, but I'd like to find the length of a dictionary's values and don't know how to do it.
So for example,
d = {'key':['hello', 'brave', 'morning', 'sunset', 'metaphysics']}
Is there a way to find the len (the number of items) of the dictionary value?
Thanks
Sure. In this case, you'd just do:
length_key = len(d['key']) # length of the list stored at `'key'` ...
It's hard to say why you actually want this, but perhaps it would be useful to create another dict that maps each key to the length of its value:
length_dict = {key: len(value) for key, value in d.items()}
length_key = length_dict['key'] # length of the list stored at `'key'` ...
Let's do some experimentation to see how we can get and interpret the lengths of different values in a dict.
Create our test dict (see list and dict comprehensions):
>>> my_dict = {x:[i for i in range(x)] for x in range(4)}
>>> my_dict
{0: [], 1: [0], 2: [0, 1], 3: [0, 1, 2]}
Get the length of the value of a specific key:
>>> my_dict[3]
[0, 1, 2]
>>> len(my_dict[3])
3
Get a dict of the lengths of the values of each key:
>>> key_to_value_lengths = {k:len(v) for k, v in my_dict.items()}
>>> key_to_value_lengths
{0: 0, 1: 1, 2: 2, 3: 3}
>>> key_to_value_lengths[2]
2
Get the sum of the lengths of all values in the dict:
>>> [len(x) for x in my_dict.values()]
[0, 1, 2, 3]
>>> sum([len(x) for x in my_dict.values()])
6
To find all of the lengths of the values in a dictionary you can do this:
lengths = [len(v) for v in d.values()]
A common use case I have is a dictionary of numpy arrays or lists where I know they're all the same length, and I just need to know one of them (e.g. I'm plotting timeseries data and each timeseries has the same number of timesteps). I often use this:
length = len(next(iter(d.values())))
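If the "all values have the same length" assumption might not hold, a quick check on the set of lengths catches it before you rely on a single value. This is only a minimal sketch: the helper name common_length and the sample data are made up for illustration.

```python
def common_length(d):
    """Return the length shared by every value in d, or raise if they differ."""
    lengths = {len(v) for v in d.values()}  # set of distinct lengths
    if len(lengths) != 1:
        # Also raised for an empty dict, where there is no length to report.
        raise ValueError(f"expected one common length, found {sorted(lengths)}")
    return lengths.pop()


# Made-up sample data: two series with three timesteps each.
series = {'temperature': [20.1, 20.4, 19.8], 'pressure': [1.01, 1.02, 1.00]}
print(common_length(series))  # 3
```

The check costs one pass over the values, so for very large dicts the next(iter(...)) shortcut is still the cheaper option when you trust the data.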
Let the dictionary be:
d = {'key': ['value1', 'value2']}
If you know the key:
print(len(d['key']))
Otherwise:
val = [len(i) for i in d.values()]
print(val[0])
# Prints the length of the first key's value, which is also the length of every value if all keys hold the same number of items.
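d = {1: 'a', 2: 'b'}
count = 0  # avoid naming this `sum`, which would shadow the built-in
for i in range(len(d)):
    count = count + 1
print(count)  # number of entries in the dict
Output: 2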
Related
I know how to write something simple and slow with a loop, but I need it to run very fast at large scale.
input:
lst = [[1, 1, 2], ["txt1", "txt2", "txt3"]]
desired output:
d = {1 : ["txt1", "txt2"], 2 : "txt3"}
Is there something built into Python that makes dict() extend a key's value instead of replacing it?
dict(list(zip(lst[0], lst[1])))
One option is to use dict.setdefault:
out = {}
for k, v in zip(*lst):
    out.setdefault(k, []).append(v)
Output:
{1: ['txt1', 'txt2'], 2: ['txt3']}
If you want the element itself for singleton lists, one way is adding a condition that checks for it while you build an output dictionary:
out = {}
for k,v in zip(*lst):
    if k in out:
        if isinstance(out[k], list):
            out[k].append(v)
        else:
            out[k] = [out[k], v]
    else:
        out[k] = v
or if lst[0] is sorted (like it is in your sample), you could use itertools.groupby:
from itertools import groupby
out = {}
pos = 0
for k, v in groupby(lst[0]):
    length = len([*v])
    if length > 1:
        out[k] = lst[1][pos:pos+length]
    else:
        out[k] = lst[1][pos]
    pos += length
Output:
{1: ['txt1', 'txt2'], 2: 'txt3'}
But as @timgeb notes, it's probably not something you want, because afterwards you'll have to check the data type each time you access this dictionary (whether the value is a list or not), which is an unnecessary problem that you could avoid by having all values as lists.
If you're dealing with large datasets it may be useful to add a pandas solution.
>>> import pandas as pd
>>> lst = [[1, 1, 2], ["txt1", "txt2", "txt3"]]
>>> s = pd.Series(lst[1], index=lst[0])
>>> s
1    txt1
1    txt2
2    txt3
dtype: object
>>> s.groupby(level=0).apply(list).to_dict()
{1: ['txt1', 'txt2'], 2: ['txt3']}
Note that this also produces lists for single elements (e.g. ['txt3']) which I highly recommend. Having both lists and strings as possible values will result in bugs because both of those types are iterable. You'd need to remember to check the type each time you process a dict-value.
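To make the risk concrete, here is a small sketch of what happens when code that expects a list meets a bare string. The variable names mixed and uniform are made up for illustration.

```python
mixed = {1: ['txt1', 'txt2'], 2: 'txt3'}      # bare string for the singleton
uniform = {1: ['txt1', 'txt2'], 2: ['txt3']}  # always lists

# Iterating the bare string walks its characters instead of its items.
items_mixed = [item for item in mixed[2]]
items_uniform = [item for item in uniform[2]]

print(items_mixed)    # ['t', 'x', 't', '3']
print(items_uniform)  # ['txt3']
```

No exception is raised in the mixed case, which is what makes this kind of bug easy to miss.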
You can use a defaultdict to group the strings by their corresponding key, then make a second pass through the list to extract the strings from singleton lists. Regardless of what you do, you'll need to access every element in both lists at least once, so some iteration structure is necessary (and even if you don't explicitly use iteration, whatever you use will almost definitely use iteration under the hood):
from collections import defaultdict
lst = [[1, 1, 2], ["txt1", "txt2", "txt3"]]
result = defaultdict(list)
for key, value in zip(lst[0], lst[1]):
    result[key].append(value)
for key in result:
    if len(result[key]) == 1:
        result[key] = result[key][0]
print(dict(result)) # Prints {1: ['txt1', 'txt2'], 2: 'txt3'}
I've looked all over the internet for how to find all the keys in a dictionary that share the same value when that value is not known in advance. The closest thing that came up was this, but there the values are known.
Say I had a dictionary like this and these values are totally random, not hardcoded by me.
{'AGAA': 2, 'ATAA': 5,'AJAA':2}
How can I identify all the keys with the same value? What would be the most efficient way of doing this? The expected result is:
['AGAA','AJAA']
The way I would do it is "invert" the dictionary. By this I mean to group the keys for each common value. So if you start with:
{'AGAA': 2, 'ATAA': 5, 'AJAA': 2}
You would want to group it such that the keys are now values and values are now keys:
{2: ['AGAA', 'AJAA'], 5: ['ATAA']}
After grouping the values, you can use max to determine the largest grouping.
Example:
from collections import defaultdict
data = {'AGAA': 2, 'ATAA': 5, 'AJAA': 2}
grouped = defaultdict(list)
for key in data:
    grouped[data[key]].append(key)
max_group = max(grouped.values(), key=len)
print(max_group)
Outputs:
['AGAA', 'AJAA']
You could also find the max key and print it that way:
max_key = max(grouped, key=lambda k: len(grouped[k]))
print(grouped[max_key])
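Note that max returns only one group. If several different values might each be shared by more than one key, you could keep every group with at least two keys instead. A minimal sketch along the same lines (the name shared is made up for illustration):

```python
from collections import defaultdict

data = {'AGAA': 2, 'ATAA': 5, 'AJAA': 2}

grouped = defaultdict(list)
for key, value in data.items():
    grouped[value].append(key)

# Keep only the values that more than one key maps to.
shared = {value: keys for value, keys in grouped.items() if len(keys) > 1}
print(shared)  # {2: ['AGAA', 'AJAA']}
```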
You can try this:
from collections import Counter
d = {'AGAA': 2, 'ATAA': 5,'AJAA':2}
l = Counter(d.values())
l = [x for x,y in l.items() if y > 1]
out = [x for x,y in d.items() if y in l]
# Out[21]: ['AGAA', 'AJAA']
My program's output is a Python dictionary, and I want a list of keys from dictn:
s = "cool_ice_wifi"
r = ["water_is_cool", "cold_ice_drink", "cool_wifi_speed"]
good_list = s.split("_")
dictn = {}
for i in range(len(r)):
    split_review = r[i].split("_")
    counter = 0
    for good_word in good_list:
        if good_word in split_review:
            counter = counter + 1
    d1 = {i: counter}
    dictn.update(d1)
print(dictn)
The conditions for the resulting list of keys:
Keys with the same value keep their original relative order in the list.
Keys with the highest values come first, followed by keys with lower values.
dictn = {0: 1, 1: 1, 2: 2}
Expected output = [2, 0, 1]
You can use a list comp:
[key for key in sorted(dictn, key=dictn.get, reverse=True)]
You can use the built-in sorted function, as described here, to order the dictionary's keys in any way you choose.
Check out the documentation, but in the simplest case you can pass the dictionary's .get method as the key function, while for more complex orderings you'd define a key function yourself.
Dictionaries preserve insertion order as of Python 3.7, so another way to do things is to sort at the moment of dictionary creation, or you could use an OrderedDict.
Here's an example of the first option in action, which I think is the easiest:
>>> a = {}
>>> a[0] = 1
>>> a[1] = 1
>>> a[2] = 2
>>> print(a)
{0: 1, 1: 1, 2: 2}
>>>
>>> [(k) for k in sorted(a, key=a.get, reverse=True)]
[2, 0, 1]
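The tie-breaking requirement (equal values keep their original order) is met because sorted is stable, and it stays stable with reverse=True. A small sketch, using the question's data extended with one extra made-up entry so that there are two ties:

```python
dictn = {0: 1, 1: 1, 2: 2, 3: 2}  # key 3 is an added, made-up entry

# Highest values first; keys with equal values stay in insertion order.
by_value_desc = sorted(dictn, key=dictn.get, reverse=True)
print(by_value_desc)  # [2, 3, 0, 1]
```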
I have a list like this:
['[camera_positive,3]', '[lens_positive,1]', '[camera_positive,2]', '[lens_positive,1]', '[lens_positive,1]', '[camera_positive,1]']
How can I sum all the values at index [1] that share the same string at index [0]?
Example:
camera_positive = 3 + 2 + 1 = 6
lens_positive = 1 + 1 + 1 = 3
You could use set in order to extract the unique keys and then use list comprehension to compute the sum for each key:
data = [['camera_positive', 3],
['lens_positive', 1],
['camera_positive', 2],
['lens_positive', 1],
['lens_positive', 1],
['camera_positive', 1]]
keys = set(key for key, value in data)
for key1 in keys:
    total = sum(value for key2, value in data if key1 == key2)
    print("key='{}', sum={}".format(key1, total))
This gives:
key='camera_positive', sum=6
key='lens_positive', sum=3
I'm assuming that you have a list of lists, not a list of strings as shown in the question. Otherwise you'll have to do some parsing. That said, I would solve this problem by creating a dictionary, then iterating over the pairs and adding each value to the dictionary as you go.
The defaultdict allows this program to work without raising a KeyError, as it assumes 0 if the key does not exist yet. You can read up on defaultdict here: https://docs.python.org/3.3/library/collections.html#collections.defaultdict
Let me know if that helps!
>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> d
defaultdict(<class 'int'>, {})
>>> lst=[['a',1], ['b', 2], ['a',4]]
>>> for k, v in lst:
... d[k] += v
...
>>> d
defaultdict(<class 'int'>, {'a': 5, 'b': 2})
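If the input really is a list of strings like the one in the question, the parsing step could look roughly like this. It is only a sketch and assumes every entry has the exact form '[name,number]' with a single comma; the names raw and totals are made up for illustration.

```python
from collections import defaultdict

raw = ['[camera_positive,3]', '[lens_positive,1]', '[camera_positive,2]',
       '[lens_positive,1]', '[lens_positive,1]', '[camera_positive,1]']

totals = defaultdict(int)
for entry in raw:
    # Drop the surrounding brackets, then split into name and number.
    name, number = entry.strip('[]').split(',')
    totals[name] += int(number)

print(dict(totals))  # {'camera_positive': 6, 'lens_positive': 3}
```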
You could group the entries by their first element using itertools.groupby, with lambda x: x[0] or operator.itemgetter(0) as the key.
This may be a bit less code than what Nick Brady showed. However, groupby only merges adjacent items, so you would need to sort the list by the same key first, which may make it slower than his approach.
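A minimal sketch of the groupby approach, assuming the data is a list of [name, number] pairs as in the other answers (the name totals is made up for illustration):

```python
from itertools import groupby
from operator import itemgetter

data = [['camera_positive', 3], ['lens_positive', 1], ['camera_positive', 2],
        ['lens_positive', 1], ['lens_positive', 1], ['camera_positive', 1]]

# Sort by the grouping key first so that equal keys end up adjacent.
first = itemgetter(0)
totals = {key: sum(value for _, value in group)
          for key, group in groupby(sorted(data, key=first), key=first)}

print(totals)  # {'camera_positive': 6, 'lens_positive': 3}
```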
I have the following dictionary, where keys are integers and values are floats:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
This dictionary has keys 1, 2, 3 and 4.
Now, I remove the key '2':
foo = {1:0.001,3:1.093,4:5.246}
I only have the keys 1, 3 and 4 left. But I want these keys to be called 1, 2 and 3.
The function 'enumerate' allows me to get the list [1,2,3]:
some_list = []
for k,v in foo.items():
    some_list.append(k)
num_list = list(enumerate(some_list, start=1))
Next, I try to populate the dictionary with these new keys and the old values:
new_foo = {}
for i in num_list:
    for value in foo.values():
        new_foo[i[0]] = value
However, new_foo now contains the following values:
{1: 5.246, 2: 5.246, 3: 5.246}
So every value was replaced by the last value of 'foo'. I think the problem comes from the design of my for loop, but I don't know how to solve this. Any tips?
Using the list-comprehension-like style:
bar = dict( (k,v) for k,v in enumerate(foo.values(), start=1) )
But, as mentioned in the comments, before Python 3.7 the ordering is arbitrary, since dicts in those versions are unordered (from 3.7 on they keep insertion order). To renumber in key order regardless of version, the following can be used:
bar = dict( ( i,foo[k] ) for i, k in enumerate(sorted(foo), start=1) )
Here, sorted(foo) returns the sorted list of foo's keys, and i is the new 1-based index that becomes the key in the new dict.
Like others have said, it would be best to use a list instead of dict. However, in case you prefer to stick with a dict, you can do
foo = {j+1:foo[k] for j,k in enumerate(sorted(foo))}
I agree with the other responses that a list implements the behavior you describe and so is probably more appropriate, but I will suggest an answer anyway.
The problem with your code is the way you are using the data structures. Simply enumerate the items left in the dictionary:
new_foo = {}
for key, (old_key, value) in enumerate(sorted(foo.items())):
    key = key + 1  # adjust for 1-based
    new_foo[key] = value
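As for why the nested loop in the question went wrong: the inner loop runs over every value for each new key, so each key ends up holding whichever value comes last. Pairing new keys and old keys one-to-one avoids that. A sketch of the same idea using zip, with variable names borrowed from the question:

```python
foo = {1: 0.001, 3: 1.093, 4: 5.246}

new_foo = {}
# zip pairs each new key with exactly one old key, in sorted-key order.
for new_key, old_key in zip(range(1, len(foo) + 1), sorted(foo)):
    new_foo[new_key] = foo[old_key]

print(new_foo)  # {1: 0.001, 2: 1.093, 3: 5.246}
```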
A dictionary is the wrong structure here. Use a list; lists map contiguous integers to values, after all.
Either adjust your code to start at 0 rather than 1, or include a padding value at index 0:
foo = [None, 0.001, 2.097, 1.093, 5.246]
Deleting the 2 'key' is then as simple as:
del foo[2]
giving you automatic renumbering of the rest of your 'keys'.
This looks suspiciously like Something You Should Not Do, but I'll assume for a moment that you're simplifying the process for an MCVE rather than actually trying to name your dict keys 1, 2, 3, 4, 5, ....
d = {1:0.001, 2:2.097, 3:1.093, 4:5.246}
del d[2]
# d == {1:0.001, 3:1.093, 4:5.246}
new_d = {idx:val for idx,val in zip(range(1,len(d)+1),
                                    (v for _,v in sorted(d.items())))}
# new_d == {1: 0.001, 2: 1.093, 3: 5.246}
You can convert the dict's values to a list, remove the specific element, then convert the list back to a dict. Sorry, it is not a one-liner.
In [1]: foo = {1:0.001,2:2.097,3:1.093,4:5.246}
In [2]: l = list(foo.values()) # [0.001, 2.097, 1.093, 5.246]
In [3]: l.pop(1) # returns 2.097, not the list
In [4]: dict(enumerate(l,1))
Out[4]: {1: 0.001, 2: 1.093, 3: 5.246}
Try:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
foo.pop(2)
new_foo = {i: value for i, (_, value) in enumerate(sorted(foo.items()), start=1)}
print(new_foo)
However, I'd advise you to use a normal list instead, which is designed exactly for fast lookup of gapless, numeric keys:
foo = [0.001, 2.097, 1.093, 5.246]
foo.pop(1) # list indices start at 0
print(foo)
One liner that filters a sequence, then re-enumerates and constructs a dict.
In [1]: foo = {1:0.001, 2:2.097, 3:1.093, 4:5.246}
In [2]: selected=1
In [3]: { k:v for k,v in enumerate((foo[i] for i in foo if i != selected), 1) }
Out[3]: {1: 2.097, 2: 1.093, 3: 5.246}
I have a more compact method.
I think it's more readable and easier to understand. See below:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
del foo[2]
foo.update({k: foo[4] for k in foo})
print(foo)
Note that this overwrites every remaining value with foo[4] and keeps the keys 1, 3 and 4, so the output is:
{1: 5.246, 3: 5.246, 4: 5.246}