I am fairly new to python, but I haven't been able to find a solution to my problem anywhere.
I want to count the occurrences of a string inside a list of tuples.
Here is the list of tuples:
list1 = [
('12392', 'some string', 'some other string'),
('12392', 'some new string', 'some other string'),
('7862', None, 'some other string')
]
I've tried this but it just prints 0
for entry in list1:
    print list1.count(entry[0])
As the same ID occurs twice in the list, this should return:
2
1
I also tried to increment a counter for each occurrence of the same ID but couldn't quite grasp how to write it.
*EDIT:
Using Eumiro's awesome answer. I just realized that I didn't explain the whole problem.
I actually need the total amount of entries which has a value more than 1. But if I try doing:
for name, value in list1:
    if value > 1:
        print value
I get this error:
ValueError: Too many values to unpack
Maybe collections.Counter could solve your problem:
from collections import Counter
Counter(elem[0] for elem in list1)
returns
Counter({'12392': 2, '7862': 1})
It is fast since it iterates over your list just once. You iterate over entries and then try to get a count of these entries within your list. That cannot be done with .count, but might be done as follows:
for entry in list1:
    print(sum(1 for elem in list1 if elem[0] == entry[0]))
But seriously, have a look at collections.Counter.
EDIT: I actually need the total amount of entries which has a value more than 1.
You can still use the Counter:
c = Counter(elem[0] for elem in list1)
sum(v for k, v in c.items() if v > 1)
returns 2, i.e. the sum of counts that are higher than 1.
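The two steps above can be put together as a minimal sketch. It is written for Python 3, and the names `counts` and `total` are illustrative, not part of the original answer:

```python
from collections import Counter

list1 = [
    ('12392', 'some string', 'some other string'),
    ('12392', 'some new string', 'some other string'),
    ('7862', None, 'some other string'),
]

# Count how often each ID (the first element of every tuple) occurs, in one pass.
counts = Counter(elem[0] for elem in list1)

# Add up only those counts that are greater than 1.
total = sum(v for v in counts.values() if v > 1)

print(counts)
print(total)
```

Iterating over `counts.values()` (or `counts.items()`) also avoids the unpacking error from the question, because each item is a single count (or a key/count pair) rather than a three-element tuple.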
list1.count(entry[0]) will not work because it looks at each of the three tuples in list1, e.g. ('12392', 'some string', 'some other string'), and checks whether the whole tuple is equal to '12392', which is never the case.
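A small sketch of that difference, assuming the same list1 as in the question (the variable names are only for illustration):

```python
list1 = [
    ('12392', 'some string', 'some other string'),
    ('12392', 'some new string', 'some other string'),
    ('7862', None, 'some other string'),
]

# list.count compares each whole tuple with the argument,
# so a bare ID string never equals any element of the list.
whole_tuple_matches = list1.count('12392')

# Comparing only the first element of each tuple gives the intended count.
first_element_matches = sum(1 for entry in list1 if entry[0] == '12392')

print(whole_tuple_matches, first_element_matches)
```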
@eumiro's answer shows you how to do it with Counter (which is the best way!), but here is a poor man's version to illustrate how Counter works, using a dictionary and the dict.get(k[, d]) method, which attempts to get the value for a key (k) and returns the default value (d) if the key doesn't exist:
>>> list1 = [
('12392', 'some string', 'some other string'),
('12392', 'some new string', 'some other string'),
('7862', None, 'some other string')
]
>>> d = {}
>>> for x, y, z in list1:
...     d[x] = d.get(x, 0) + 1
...
>>> d
{'12392': 2, '7862': 1}
I needed some extra functionality that Counter didn't have. I have a list of tuples in which the first element is the key and the second element is the amount to add. @jamylak's solution was a great adaptation for this!
>>> list1 = [(0,5), (3,2), (2,1), (0,2), (3,4)]
>>> d = {}
>>> for x, y in list1:
...     d[x] = d.get(x, 0) + y
...
>>> d
{0: 7, 2: 1, 3: 6}
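As a possible alternative to the plain dictionary, a Counter can accumulate amounts too, since a missing key reads as 0. This is only a sketch; `pairs` and `totals` are illustrative names:

```python
from collections import Counter

pairs = [(0, 5), (3, 2), (2, 1), (0, 2), (3, 4)]

totals = Counter()
for key, amount in pairs:
    # A key that has not been seen yet starts from 0, so no .get() is needed.
    totals[key] += amount

print(dict(totals))
```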
I've been working on a solution for an assignment where we need a function which accepts a list of tuple objects and returns a dictionary containing the frequency of all the strings that appear in the list.
So I've been trying to use Counter from collections to count the frequency of a key that occurs inside a tuple list:
tuple_list = [('a',5), ('a',5), ('b',6), ('b',4), ('b',3), ('b',7)]
I can't get the Counter to only check for 'a' or 'b' or just the strings in the list.
from collections import Counter
def get_frequency(tuple_list):
    C = Counter(new_list)
    print (C('a'), C('b'))
tuple_list = [('a',5), ('a',5), ('b',6), ('b',4), ('b',3), ('b',7)]
freq_dict = get_frequency(tuple_list)
for key in sorted(freq_dict.keys()):
    print("{}: {}".format(key, freq_dict[key]))
The output that I was expecting should be a: 2 b: 4 but I kept on getting a: 0 b: 0
Since the second (numeric) element in each tuple appears to be irrelevant, you need to pass in a sequence of the letters you're trying to count. Try a list comprehension:
>>> tuple_list = [('a',5), ('a',5), ('b',6), ('b',4), ('b',3), ('b',7)]
>>>
>>> items = [item[0] for item in tuple_list]
>>> items
['a', 'a', 'b', 'b', 'b', 'b']
>>> from collections import Counter
>>> c = Counter(items)
>>> print(c)
Counter({'b': 4, 'a': 2})
If you don't want to use Counter, you can take the length of the filtered lists instead, like this:
unique_values = list(set([x[0] for x in tuple_list]))
a_dict = {}
for v in unique_values:
    a_dict[v] = len([x[1] for x in tuple_list if x[0] == v])
print(a_dict)
which gives you:
{'b': 4, 'a': 2}
Since you only want to count the first element (the string) in each tuple, you should only use the counter object on that first element, as you can see in the get_frequency function below:
from collections import Counter
def get_frequency(tuple_list):
    cnt = Counter()
    for tuple_elem in tuple_list:
        cnt[tuple_elem[0]] += 1
    return cnt
tuple_list = [('a',5), ('a',5), ('b',6)]
freq_dict = get_frequency(tuple_list)
for key, value in freq_dict.items():
    print(f'{key}: {value}')
Also, if you expect to receive a value from a function, the function needs to return it with a return statement.
Hope that helps out!
Another solution is to use zip and next to extract the first item of each tuple into a new tuple and feed it into Counter (this relies on Python 3, where zip returns an iterator):
from collections import Counter
result = Counter(next(zip(*tuple_list)))
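To make the zip step easier to follow, here is a sketch for Python 3 that names the intermediate tuple (`letters` is an illustrative name):

```python
from collections import Counter

tuple_list = [('a', 5), ('a', 5), ('b', 6), ('b', 4), ('b', 3), ('b', 7)]

# zip(*tuple_list) transposes the pairs: it yields all the letters first,
# then all the numbers. next() takes just the first of those two tuples.
letters = next(zip(*tuple_list))

result = Counter(letters)
print(letters)
print(result)
```

Note that `next` raises StopIteration if `tuple_list` is empty, so this form assumes at least one tuple.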
I have the following dictionary, where keys are integers and values are floats:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
This dictionary has keys 1, 2, 3 and 4.
Now, I remove the key '2':
foo = {1:0.001,3:1.093,4:5.246}
I only have the keys 1, 3 and 4 left. But I want these keys to be called 1, 2 and 3.
The function 'enumerate' allows me to get the list [1,2,3]:
some_list = []
for k,v in foo.items():
    some_list.append(k)
num_list = list(enumerate(some_list, start=1))
Next, I try to populate the dictionary with these new keys and the old values:
new_foo = {}
for i in num_list:
    for value in foo.itervalues():
        new_foo[i[0]] = value
However, new_foo now contains the following values:
{1: 5.246, 2: 5.246, 3: 5.246}
So every value was replaced by the last value of 'foo'. I think the problem comes from the design of my for loop, but I don't know how to solve this. Any tips?
Using the list-comprehension-like style:
bar = dict( (k,v) for k,v in enumerate(foo.values(), start=1) )
But, as mentioned in the comments, the ordering is going to be arbitrary, since the dict structure in Python is unordered (dicts only preserve insertion order from Python 3.7 onward). To preserve the original key order, the following can be used:
bar = dict( ( i,foo[k] ) for i, k in enumerate(sorted(foo), start=1) )
Here, sorted(foo) returns the sorted list of the keys of foo, and i is the new key assigned to each of them in the new dict.
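A runnable sketch of the sorted-keys renumbering, applied to the dictionary from the question after key 2 has been removed (`new_key` and `old_key` are illustrative names):

```python
foo = {1: 0.001, 3: 1.093, 4: 5.246}

# Walk the old keys in ascending order and hand out new keys 1, 2, 3, ...
bar = {new_key: foo[old_key]
       for new_key, old_key in enumerate(sorted(foo), start=1)}

print(bar)
```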
Like others have said, it would be best to use a list instead of dict. However, in case you prefer to stick with a dict, you can do
foo = {j+1:foo[k] for j,k in enumerate(sorted(foo))}
I agree with the other responses that a list implements the behavior you describe, and so it is probably more appropriate, but I will suggest an answer anyway.
The problem with your code is the way you are using the data structures. Simply enumerate the items left in the dictionary:
new_foo = {}
for key, (old_key, value) in enumerate( sorted( foo.items() ) ):
    key = key+1 # adjust for 1-based
    new_foo[key] = value
A dictionary is the wrong structure here. Use a list; lists map contiguous integers to values, after all.
Either adjust your code to start at 0 rather than 1, or include a padding value at index 0:
foo = [None, 0.001, 2.097, 1.093, 5.246]
Deleting the 2 'key' is then as simple as:
del foo[2]
giving you automatic renumbering of the rest of your 'keys'.
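A short sketch of the padded-list variant, showing the renumbering after the deletion:

```python
# Index 0 holds a padding value so that the real data starts at index 1.
foo = [None, 0.001, 2.097, 1.093, 5.246]

# Removing position 2 shifts every later value down by one position.
del foo[2]

for position, value in enumerate(foo[1:], start=1):
    print(position, value)
```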
This looks suspiciously like Something You Should Not Do, but I'll assume for a moment that you're simplifying the process for an MCVE rather than actually trying to name your dict keys 1, 2, 3, 4, 5, ....
d = {1:0.001, 2:2.097, 3:1.093, 4:5.246}
del d[2]
# d == {1:0.001, 3:1.093, 4:5.246}
new_d = {idx:val for idx,val in zip(range(1,len(d)+1),
(v for _,v in sorted(d.items())))}
# new_d == {1: 0.001, 2: 1.093, 3: 5.246}
You can convert dict to list, remove specific element, then convert list to dict. Sorry, it is not a one liner.
In [1]: foo = {1:0.001,2:2.097,3:1.093,4:5.246}
In [2]: l=foo.values() #[0.001, 2.097, 1.093, 5.246]
In [3]: l.pop(1) #returns 2.097, not the list
In [4]: dict(enumerate(l,1))
Out[4]: {1: 0.001, 2: 1.093, 3: 5.246}
Try:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
foo.pop(2)
new_foo = {i: value for i, (_, value) in enumerate(sorted(foo.items()), start=1)}
print new_foo
However, I'd advise you to use a normal list instead, which is designed exactly for fast lookup of gapless, numeric keys:
foo = [0.001, 2.097, 1.093, 5.246]
foo.pop(1) # list indices start at 0
print foo
One liner that filters a sequence, then re-enumerates and constructs a dict.
In [1]: foo = {1:0.001, 2:2.097, 3:1.093, 4:5.246}
In [2]: selected=1
In [3]: { k:v for k,v in enumerate((foo[i] for i in foo if i != selected), 1) }
Out[3]: {1: 2.097, 2: 1.093, 3: 5.246}
I have a more compact method.
I think it's more readable and easier to understand:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
del foo[2]
foo = dict(enumerate((foo[k] for k in sorted(foo)), 1))
print foo
This gives the renumbered dictionary:
{1: 0.001, 2: 1.093, 3: 5.246}
I am trying to create a list of lists based on hashes. That is, I want a list of lists of items that hash the same. Is this possible in a single-line comprehension?
Here is the simple code that works without comprehensions:
from collections import defaultdict
def list_of_lists(items):
    items_by_hash = defaultdict(list)
    for item in items:
        items_by_hash[hash(item)].append(item)
    return items_by_hash.values()
For example, let's say we have this simple hash function:
def hash(string):
    import __builtin__
    return __builtin__.hash(string) % 10
Then,
>>> l = ['sam', 'nick', 'nathan', 'mike']
>>> [hash(x) for x in l]
[4, 3, 2, 2]
>>>
>>> list_of_lists(l)
[['nathan', 'mike'], ['nick'], ['sam']]
Is there any way I could do this in a comprehension? I need to be able to reference the dictionary I'm building mid-comprehension, in order to append the next item to the list-value.
This is the best I've got, but it doesn't work:
>>> { hash(word) : [word] for word in l }.values()
[['mike'], ['nick'], ['sam']]
It obviously creates a new list every time which is not what I want. I want something like
{ hash(word) : __this__[hash(word)] + [word] for word in l }.values()
or
>>> dict([ (hash(word), word) for word in l ])
{2: 'mike', 3: 'nick', 4: 'sam'}
but this causes the same problem.
import itertools, operator
[[y[1] for y in x[1]] for x in itertools.groupby(
    sorted((hash(y), y) for y in items), operator.itemgetter(0))]
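The groupby expression can be unpacked into steps, as in the sketch below. It assumes a hypothetical `bucket` function based on word length in place of the question's hash, because string hashes are randomized per process in Python 3 and would not give reproducible groups:

```python
from itertools import groupby
from operator import itemgetter


def bucket(word):
    # Hypothetical stand-in for the question's hash function.
    return len(word) % 10


items = ['sam', 'nick', 'nathan', 'mike']

# Pair every item with its bucket and sort, so that equal buckets are adjacent;
# groupby only merges consecutive entries that share a key.
pairs = sorted((bucket(word), word) for word in items)

# Collect the words of each run of equal buckets into one list.
groups = [[word for _, word in run] for _, run in groupby(pairs, key=itemgetter(0))]

print(groups)
```

Because the pairs are sorted, the words inside each group come out in sorted order rather than in their original order.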
I have a dictionary whose values are lists of dictionaries, something like this:
x = {'a':[{'p':1, 'q':2}, {'p':4, 'q':5}], 'b':[{'p':6, 'q':1}, {'p':10, 'q':12}]}
The length of the lists (values) is the same for all keys of dict x.
I want to get the length of any one value i.e. a list without having to go through the obvious method -> get the keys, use len(x[keys[0]]) to get the length.
my code for this as of now:
val = None
for key in x.keys():
    val = x[key]
    break
    #break after the first iteration as the length of the lists is the same for any key
try:
    what_i_Want = len(val)
except TypeError:
    print "val wasn't set"
I am not happy with this; I believe it can be made more 'pythonic'.
This is the most efficient way, since we don't create any intermediate lists.
print len(x[next(iter(x))]) # 2
Note: For this method to work, the dictionary should have at least one key in it.
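The requirement in the note can be seen in a small sketch for Python 3 (`length` and `raised` are illustrative names):

```python
x = {'a': [{'p': 1, 'q': 2}, {'p': 4, 'q': 5}],
     'b': [{'p': 6, 'q': 1}, {'p': 10, 'q': 12}]}

# iter(x) walks the keys lazily; next() takes the first one without building a list.
length = len(x[next(iter(x))])
print(length)

# With an empty dictionary there is no first key, so next() raises StopIteration.
try:
    next(iter({}))
    raised = False
except StopIteration:
    raised = True
print(raised)
```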
What about this:
val = x[x.keys()[0]]
or alternatively:
val = x.values()[0]
and then your answer is
len(val)
Some of the other solutions (posted by thefourtheye and gnibbler) are better because they are not creating an intermediate list. I added this response merely as an easy to remember and obvious option, not a solution for time-efficient usage.
Works ok in Python2 or Python3
>>> x = {'a':[{'p':1, 'q':2}, {'p':4, 'q':5}], 'b':[{'p':6, 'q':1}, {'p':10, 'q':12}]}
>>> next(len(i) for i in x.values())
2
This is better for Python2 as it avoids making a list of the values. Works well in Python3 too
>>> next(len(x[k]) for k in x)
2
Using next and iter:
>>> x = {'a':[{'p':1, 'q':2}, {'p':4, 'q':5}], 'b':[{'p':6, 'q':1}, {'p':10, 'q':12}]}
>>> val = next(iter(x.values()), None) # Use `itervalues` in Python 2.x
>>> val
[{'q': 2, 'p': 1}, {'q': 5, 'p': 4}]
>>> len(val)
2
>>> x = {}
>>> val = next(iter(x.values()), None) # `None`: default value
>>> val is None
True
>>> x = {'a':[{'p':1, 'q':2}, {'p':4, 'q':5}], 'b':[{'p':6, 'q':1}, {'p':10, 'q':12}]}
>>> len(x.values()[0])
2
Here, x.values() gives you a list of all the values (in Python 2), and you can then take the length of any one of them.
So I have a script like this
for key, Value in mydictionary.iteritems():
    if 'Mammal' in Value[1]:
        #because the value is a list of 2 items and I want to get at the second one
        Value[1] = Value[1].strip('Mammal')
This code effectively removes Mammal from the beginning of the second item in the Value list. Now I want to make this look more Pythonic with a list comprehension, so I came up with this, but it is obviously wrong. Any help?
Value[1] = [Value[1].strip('Mammal') for Key, Value in mydictionary.iteritems() if 'Mammal' in Value[1] ]
Also, on the same lines, a list comprehension to list all the keys in this dictionary. I am having a hard time coming up with that one.
I came up with:
for key, Value in mydictionary.iteritems():
    Templist.append(key)
As a list comprehension I am thinking of the following, but it doesn't work :(
alist = [key for Key, Value in mydictionary.iteritems()]
mydictionary = {
1: [4, "ABC Mammal"],
2: [8, "Mammal 123"],
3: [15, "Bird (Not a Mammal)"]
}
mydictionary = {key: ([value[0], value[1].strip('Mammal')] if 'Mammal' in value[1] else value) for key, value in mydictionary.iteritems()}
print mydictionary
Output:
{1: [4, 'ABC '], 2: [8, ' 123'], 3: [15, 'Bird (Not a Mammal)']}
I wouldn't call this objectively "nicer looking", though, so the iterative method may be preferable.
A list comprehension creates a new list.
If you were able to use strip(), then Value[1] is a string, not a list.
You can do the second part with just the dictionary method keys(); both of your attempts are redundant.
alist = mydictionary.keys()
mydict = {'a':['Mammal','BC','CD'],
'b':['AB','XY','YZ'],
'c':['Mammal','GG','FD'],}
print [x for x,y in mydict.items() if y[0]=='Mammal']
You should not use a list comprehension solely to create side effects. You can, but it is considered bad practice, and you should stick with the for loop.
Anyway, since you are working with a dictionary, you may be looking for a dict comprehension:
mydictionary = {'foo': ['unknown', 'Mammal is great'],
'bar': ['something', 'dont touch me']}
mydictionary = {k: [a, b.replace('Mammal', '', 1)] for k, [a, b] in mydictionary.iteritems() if b.startswith('Mammal')}
Also note that if you use a dict comprehension, you create a new dictionary instead of replacing the values in your old one.
...
Value[1] = Value[1].strip('Mammal')
...
This code effectively removes Mammal from the beginning of the second item in the Value list.
No, it does not. It strips every leading and trailing M, a, m and l character from that item. Better to use the replace method to replace the first occurrence of Mammal with an empty string.
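The difference between the two methods shows up clearly on a string that ends in the same letters. In this sketch, the sample text 'Mammal llama' is a made-up example, not data from the question:

```python
text = 'Mammal llama'

# strip() treats its argument as a set of characters and removes any of
# M, a, m, l from both ends until it meets a character outside that set.
stripped = text.strip('Mammal')

# replace() with a count of 1 removes only the first occurrence of the whole word.
replaced = text.replace('Mammal', '', 1)

print(repr(stripped))
print(repr(replaced))
```

Here strip() eats "llama" as well, leaving only the space, while replace() keeps it.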
alist = [key for Key, Value in mydictionary.iteritems()]
You have a typo here. It should read:
alist = [key for key, value in mydictionary.iteritems()]
or just
alist = mydictionary.keys()