I am iterating through a database column
I need to create a dictionary that updates certain values of certain keys if a criterion is met.
For example
The first iteration is: 'apples'
The dictionary should be {'apples': 1}
The second iteration is: 'peers'
The dictionary should be {'apples': 1, 'peers': 1}
The third iteration is: 'apples'
The dictionary should be {'apples': 2, 'peers': 1}
I apologise for the basic explanation. It's the best way (I think) to communicate what I want, because I don't know how to code this.
I need this to be in a dictionary because this operation is deep into a nested for loop structure
THE GOAL:
Is to get the iteration that appears most
DESIRED OUTCOME:
mostListed = 'apples'
I am new to Python; if I am missing something obvious, I am very open to learning.
Using Counter() from collections:
>>> from collections import Counter
>>> l = ["apples", "pears", "apples"]
>>> Counter(l)
Counter({'apples': 2, 'pears': 1})
Making it work for your case goes for example like:
from collections import Counter
list_ = []
for item in ["first", "second", "third"]:
    input_value = input(f"{item} iteration: ")
    list_.append(input_value)
count = Counter(list_)
print(count) # output: Counter({'apples': 2, 'pears': 1})
print(count.most_common(1)) # output: [('apples', 2)]
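Since the goal is a single name rather than a Counter, the result of most_common() can be unpacked. A minimal sketch, assuming the values arrive one at a time inside a loop (the hard-coded list stands in for the database column, and the names counts, most_listed and occurrences are illustrative):

```python
from collections import Counter

values = ["apples", "peers", "apples"]  # stand-in for the database column

counts = Counter()
for value in values:
    counts[value] += 1  # missing keys start at 0, so no membership check is needed

# most_common(1) returns a one-element list of (key, count) tuples
most_listed, occurrences = counts.most_common(1)[0]
print(most_listed)   # apples
print(occurrences)   # 2
```

If counts can be empty, most_common(1) returns an empty list and the [0] index would raise IndexError, so that case may need a guard.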
Without defaultdict
You can use the following code:
d = {}
for iteration in ['first', 'second', 'third']:
    value = input(f'The {iteration} iteration is:')
    if value in d:
        d[value] += 1
    else:
        d[value] = 1
print(d)
Output:
The first iteration is:apples
The second iteration is:peers
The third iteration is:apples
{'apples': 2, 'peers': 1}
Using defaultdict
You can create a defaultdict whose default value is 0 as follows:
from collections import defaultdict
d = defaultdict(int)  # int() returns 0, so missing keys start at 0
for iteration in ['first', 'second', 'third']:
    value = input(f'The {iteration} iteration is:')
    d[value] += 1
print(dict(d))
Output
The first iteration is:apples
The second iteration is:peers
The third iteration is:apples
{'apples': 2, 'peers': 1}
Adding this to the already numerous answers for its clarity
from collections import Counter
values = ['apples', 'peers', 'apples']
Counter(values).most_common(1)
>>> [('apples', 2)]
Here is an example:
my_list = ['apples', 'apples', 'peers', 'apples', 'peers']
new_dict = {}
for i in my_list:
    if i in new_dict:
        new_dict[i] += 1
    else:
        new_dict[i] = 1
print(new_dict)
You can update the value of a key in a dictionary (called counts here, to avoid shadowing the built-in dict) by doing
if 'apples' in counts:
    counts['apples'] += 1
else:
    counts['apples'] = 1
and you can find the key with the maximum value by something like this:
most_listed = max(counts, key=counts.get)
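Note that max() returns only one key when several share the highest count (the first one encountered in iteration order). If ties matter, something like the following sketch could collect all of them; the sample counts are made up for illustration:

```python
counts = {"apples": 2, "peers": 2, "plums": 1}  # hypothetical sample data

highest = max(counts.values())
# keep every key whose count equals the highest count
most_listed_all = [key for key, value in counts.items() if value == highest]
print(most_listed_all)  # ['apples', 'peers']
```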
Related
For example:
people_list = [{"name":"joe", "age":20}, {"name":"tom", "age":35}, {"name":"joe", "age":46}]
How do I check if any key-value pair appears more than once in this list of dictionaries?
I am aware of the counter function.
from collections import Counter
for i in range(len(people_list):
Counter(people_list[i]["name"])
for key,value in Counter:
if Counter(key:value) > 1:
...
Just not sure how to make it look and count for a specific key value pair in this list of dictionaries and check if it appears more than once.
You want to take all the key/value pairs in each dictionary and add them to a Counter instance whose key is (key, value) and whose value will end up being the number of times this pair exists:
from itertools import chain
from collections import Counter
l = [
    {"name":"joe", "age":20},
    {"name":"tom", "age":35},
    {"name":"joe", "age":46}
]
c = Counter(chain.from_iterable(d.items() for d in l))
# Now build the new list of tuples (key-name, key-value, count-of-occurrences)
# but include only those key/value pairs that occur more than once:
l = [(k[0], k[1], v) for k, v in c.items() if v > 1]
print(l)
Prints:
[('name', 'joe', 2)]
If you want to create a new list of dictionaries where the 'name' key values are all the distinct name values in the original list of dictionaries and whose value for the 'age' key is the sum of the ages for that name, then:
from collections import defaultdict
l = [
    {"name":"joe", "age":20},
    {"name":"tom", "age":35},
    {"name":"joe", "age":46}
]
d = defaultdict(int)
for the_dict in l:
    d[the_dict["name"]] += the_dict["age"]
new_l = [{"name": k, "age": v} for k, v in d.items()]
print(new_l)
Prints:
[{'name': 'joe', 'age': 66}, {'name': 'tom', 'age': 35}]
You can use collections.Counter and update like the below:
from collections import Counter
people_list = [{"name":"joe", "age":20}, {"name":"tom", "age":35}, {"name":"joe", "age":46}]
cnt = Counter()
for dct in people_list:
    cnt.update(dct.items())
print(cnt)
# Get the items > 1
for k,v in cnt.items():
    if v>1:
        print(k)
Output:
Counter({('name', 'joe'): 2, ('age', 20): 1, ('name', 'tom'): 1, ('age', 35): 1, ('age', 46): 1})
('name', 'joe')
If you want to get as output "joe", because it appears more than once for the key "name", then:
from collections import Counter
people_list = [{"name":"joe", "age":20}, {"name":"tom", "age":35}, {"name":"joe", "age":46}]
counter = Counter([person["name"] for person in people_list])
print([name for name, count in counter.items() if count > 1]) # ['joe']
You can use a Counter to count the number of occurrences of each value and check if any key-value pair appears more than once.
from collections import Counter
people_list = [{"name":"joe", "age":20}, {"name":"tom", "age":35}, {"name":"joe", "age":46}]
# Count the number of occurrences of each value
counts = Counter(d["name"] for d in people_list)
# Check if any key-value pair appears more than once
for key, value in counts.items():
    if value > 1:
        print(f"{key} appears more than once")
Gives:
joe appears more than once
I would convert each pair into a string representation and then put it on a list. After that you can use the Counter logic that is already present in this question. Granted that this is not the most effective way to treat the data, but it is still a simple solution.
from functools import reduce
from collections import Counter
people_list = [{"name":"joe", "age":20}, {"name":"tom", "age":35}, {"name":"joe", "age":46}]
str_lists = [[f'{k}_{v}' for k,v in value.items()] for value in people_list]
# [['name_joe', 'age_20'], ['name_tom', 'age_35'], ['name_joe', 'age_46']]
normalized_list = list(reduce(lambda a,b: a+b,str_lists))
# ['name_joe', 'age_20', 'name_tom', 'age_35', 'name_joe', 'age_46']
dict_ = dict(Counter(normalized_list))
# {'name_joe': 2, 'age_20': 1, 'name_tom': 1, 'age_35': 1, 'age_46': 1}
print([k for k,v in dict_.items() if v>1])
# ['name_joe']
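One caveat with joining key and value into a single string is that different pairs can collapse into the same string when the separator also appears in the data. A small sketch of that, using made-up records (not from the question) and comparing against tuple keys:

```python
from collections import Counter

# hypothetical records chosen so that the "_" separator is ambiguous
records = [{"a_b": "c"}, {"a": "b_c"}]

as_strings = Counter(f"{k}_{v}" for d in records for k, v in d.items())
as_tuples = Counter((k, v) for d in records for k, v in d.items())

print(as_strings)  # Counter({'a_b_c': 2})
print(as_tuples)   # each (key, value) tuple is counted once
```

Counting (key, value) tuples directly avoids the ambiguity, as long as the values are hashable.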
I've been working on a solution for an assignment where we write a function which accepts a list of tuple objects and returns a dictionary containing the frequency of all the strings that appear in the list.
So I've been trying to use Counter from collections to count the frequency of a key that is occurring inside a tuple list
tuple_list = [('a',5), ('a',5), ('b',6), ('b',4), ('b',3), ('b',7)]
I can't get the Counter to only check for 'a' or 'b' or just the strings in the list.
from collections import Counter
def get_frequency(tuple_list):
    C = Counter(new_list)
    print (C('a'), C('b'))
tuple_list = [('a',5), ('a',5), ('b',6), ('b',4), ('b',3), ('b',7)]
freq_dict = get_frequency(tuple_list)
for key in sorted(freq_dict.keys()):
    print("{}: {}".format(key, freq_dict[key]))
The output that I was expecting should be a: 2 b: 4 but I kept on getting a: 0 b: 0
Since the second (numeric) element in each tuple appears to be irrelevant, you need to pass in a sequence of the letters you're trying to count. Try a list comprehension:
>>> tuple_list = [('a',5), ('a',5), ('b',6), ('b',4), ('b',3), ('b',7)]
>>>
>>> items = [item[0] for item in tuple_list]
>>> items
['a', 'a', 'b', 'b', 'b', 'b']
>>> from collections import Counter
>>> c = Counter(items)
>>> print(c)
Counter({'b': 4, 'a': 2})
If you don't want to use Counter, you can just take the length of the matching items like this:
unique_values = list(set([x[0] for x in tuple_list]))
a_dict = {}
for v in unique_values:
    a_dict[v] = len([x[1] for x in tuple_list if x[0] == v])
print(a_dict)
which gives you:
{'b': 4, 'a': 2}
Since you only want to count the first element (the string) in each tuple, you should only use the counter object on that first element as you can see in the get_frequency function below:
from collections import Counter
def get_frequency(tuple_list):
    cnt = Counter()
    for tuple_elem in tuple_list:
        cnt[tuple_elem[0]] += 1  # count only the string part of each tuple
    return cnt
tuple_list = [('a',5), ('a',5), ('b',6)]
freq_dict = get_frequency(tuple_list)
for key, value in freq_dict.items():
    print(f'{key}: {value}')
Also, if you want to receive a value from a function, you need to return it using a return statement.
Hope that helps out!
Another solution is to use zip and next to extract the first item of each tuple into a new tuple and feed it into Counter.
from collections import Counter
result = Counter(next(zip(*tuple_list)))
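To see why this works: zip(*...) transposes the list of pairs into one tuple per position, and next() takes the first of those. A small sketch reusing the tuple_list from the question; the names columns and empty_result are illustrative, and the default passed to next() is an addition that avoids StopIteration on an empty list:

```python
from collections import Counter

tuple_list = [('a', 5), ('a', 5), ('b', 6), ('b', 4), ('b', 3), ('b', 7)]

# zip(*tuple_list) regroups the pairs into one tuple per position
columns = list(zip(*tuple_list))
print(columns[0])  # ('a', 'a', 'b', 'b', 'b', 'b')
print(columns[1])  # (5, 5, 6, 4, 3, 7)

result = Counter(columns[0])
print(result)  # Counter({'b': 4, 'a': 2})

# With an empty input, zip() yields nothing; a default keeps next() from raising
empty_result = Counter(next(zip(*[]), ()))
print(empty_result)  # Counter()
```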
So I have a dict, which contains keys corresponding to a list, which contains str. I want to collect all the same values in said list and sum them together. Perhaps my explanation was confusing so I'll provide an example:
function_name({'key1':['apple', 'orange'], 'key2':['orange', 'pear']})
>>> {'apple':1, 'orange':2, 'pear':1}
How would I create this function? I was thinking of somehow making a for loop like this:
count = 0
for fruit in dict_name:
if food == 'apple'
count = count + fruit
I am still unsure about how to format this especially how to count the values and collect them, thanks in advance for any suggestions!
You can un-nest the dict's values and apply a Counter.
>>> from collections import Counter
>>>
>>> d = {'key1':['apple', 'orange'], 'key2':['orange', 'pear']}
>>> Counter(v for sub in d.values() for v in sub)
Counter({'apple': 1, 'orange': 2, 'pear': 1})
If you don't like the nested generator comprehension, the un-nesting can be done with itertools.chain.from_iterable.
>>> from itertools import chain
>>> Counter(chain.from_iterable(d.values()))
Counter({'apple': 1, 'orange': 2, 'pear': 1})
Without imports and with traditional loops, it would look like this:
>>> result = {}
>>> for sub in d.values():
...     for v in sub:
...         result[v] = result.get(v, 0) + 1
...
>>> result
{'apple': 1, 'orange': 2, 'pear': 1}
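Since the question asks for a function, the Counter approach could be wrapped up along these lines. This is only a sketch: the name count_values is a placeholder for the question's function_name, and converting to a plain dict is an assumption about the desired return type.

```python
from collections import Counter
from itertools import chain


def count_values(mapping):
    """Count how often each item appears across all the lists in mapping."""
    # chain.from_iterable flattens the lists of values into one stream of items
    return dict(Counter(chain.from_iterable(mapping.values())))


print(count_values({'key1': ['apple', 'orange'], 'key2': ['orange', 'pear']}))
# {'apple': 1, 'orange': 2, 'pear': 1}
```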
Something like this should do the trick:
>>> from collections import Counter
>>> counts = Counter([item for sublist in your_dict.values() for item in sublist])
If you don't want to import any libraries you can do as follows:
function_name = {'key1':['apple', 'orange'], 'key2':['orange', 'pear']}
foobar = {}
for key, value in function_name.items():
    for element in value:
        if element in foobar:
            foobar[element] += 1
        else:
            foobar[element] = 1
print(foobar)
You check if the value is already in the created dict 'foobar'. If it is, you increase its count by one. If it's not, you add the value as a key and set its count to one. :)
I have a dictionary containing lists of values, and a list:
dict1={'first':['hi','nice'], 'second':['night','moon']}
list1= [ 'nice','moon','hi']
I want to compare the values in the dictionary with list1 and count, for each key, how many of its values appear in the list.
The output should look like this:
first 2
second 1
here is my code:
count = 0
for list_item in list1:
for dict_v in dict1.values():
if list_item.split() == dict_v:
count+= 1
print(dict.keys,count)
any help? Thanks in advance
I would make a set out of list1 for the O(1) lookup time and access to the intersection method. Then employ a dict comprehension.
>>> dict1={'first':['hi','nice'], 'second':['night','moon']}
>>> list1= [ 'nice','moon','hi']
>>>
>>> set1 = set(list1)
>>> {k:len(set1.intersection(v)) for k, v in dict1.items()}
{'first': 2, 'second': 1}
intersection accepts any iterable argument, so creating sets from the values of dict1 is not necessary.
You can use the following dict comprehension:
{k: sum(1 for i in l if i in list1) for k, l in dict1.items()}
Given your sample input, this returns:
{'first': 2, 'second': 1}
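The set-intersection and sum approaches agree on the sample input, but they can differ when a dictionary value list contains repeated words, because a set counts each distinct word once. A sketch with a modified dict1 (the duplicated 'hi' is an assumption added for illustration):

```python
dict1 = {'first': ['hi', 'hi', 'nice'], 'second': ['night', 'moon']}  # 'hi' repeated on purpose
list1 = ['nice', 'moon', 'hi']
set1 = set(list1)

# counts distinct matching words per key
by_intersection = {k: len(set1.intersection(v)) for k, v in dict1.items()}
# counts every matching occurrence per key
by_sum = {k: sum(1 for i in v if i in set1) for k, v in dict1.items()}

print(by_intersection)  # {'first': 2, 'second': 1}
print(by_sum)           # {'first': 3, 'second': 1}
```

Which of the two is right depends on whether repeated words should count once or every time.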
You can get the intersection of your list and the values of dict1 using sets:
for key in dict1.keys():
    count = len(set(dict1[key]) & set(list1))
    print("{0}: {1}".format(key,count))
While brevity can be great, I thought it would be good to also provide an example that is as close to the OP's original code as possible:
# notice conversion to set for O(1) lookup
# instead of O(n) lookup where n is the size of the list of desired items
dict1={'first':['hi','nice'], 'second':['night','moon']}
set1= set([ 'nice','moon','hi'])
for key, values in dict1.items():
    counter = 0
    for val in values:
        if val in set1:
            counter += 1
    print(key, counter)
Using collections.Counter
from collections import Counter
c = Counter(k for k in dict1 for i in list1 if i in dict1[k])
# Counter({'first': 2, 'second': 1})
The simplest and most basic approach would be:
dict1={'first':['hi','nice'], 'second':['night','moon']}
list1= [ 'nice','moon','hi']
listkeys=list(dict1.keys())
listvalues=list(dict1.values())
for i in range(0,len(listvalues)):
    ctr=0
    for j in range(0,len(listvalues[i])):
        for k in range(0,len(list1)):
            if list1[k]==listvalues[i][j]:
                ctr+=1
    print(listkeys[i],ctr)
Hope it helps.
I have a list that looks like this,
lista = ['hello','2','go','5','sit','4','line','3','sit','2', 'go','9','play','0']
In this list, each number after the word represents the value of the word. I want to represent this list in a dictionary such that the value of each repeated word gets added. I want the dictionary to be like this:
dict = {'hello':'2', 'go':'14', 'sit':'6','line':'3','play':'0'}
In the list 'go' occurs twice with two different values so we add the number that occur just after the word, similarly for other words.
This is my approach, it does not seem to work.
import csv
with open('teest.txt', 'rb') as input:
count = {}
my_file = input.read()
listt = my_file.split()
i = i + 2
for i in range(len(listt)-1):
if listt[i] in count:
count[listt[i]] = count[listt[i]] + listt[i+1]
else:
count[listt[i]] = listt[i+1]
Counting occurrences of unique keys is usually possible with defaultdict.
import collections as ct
lista = ['hello','2','go','5','sit','4','line','3','sit','2', 'go','9','play','0']
dd = ct.defaultdict(int)
iterable = iter(lista)
for word in iterable:
    dd[word] += int(next(iterable))
dd
# defaultdict(int, {'go': 14, 'hello': 2, 'line': 3, 'play': 0, 'sit': 6})
Here we initialize the defaultdict with int, so missing keys start at 0. We make a list iterator, which lets us call next() on it. Since the word and value occur in consecutive pairs in the list, we iterate and immediately call next() to extract these values in sync. We add each value to the entry for its word in the defaultdict, which keeps a running total.
Convert the integers to strings if this is required:
{k: str(v) for k, v in dd.items()}
# {'go': '14', 'hello': '2', 'line': '3', 'play': '0', 'sit': '6'}
An alternate tool may be the Counter (see @DexJ's answer), which is related to this type of defaultdict. In fact, Counter() can substitute defaultdict(int) here and return the same result.
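For completeness, the Counter substitution mentioned above could look like this. It is a sketch that keeps the same iterator pattern; the name totals is illustrative:

```python
from collections import Counter

lista = ['hello','2','go','5','sit','4','line','3','sit','2', 'go','9','play','0']

totals = Counter()
iterable = iter(lista)
for word in iterable:
    # next() pulls the number that follows the current word
    totals[word] += int(next(iterable))

print(dict(totals))
# {'hello': 2, 'go': 14, 'sit': 6, 'line': 3, 'play': 0}
```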
You can "stride" the array 2 items at a time using a range(). The optional 3rd argument in a range lets you define a "skip".
range(start, stop[, step])
Using this, we can create a range of indexes that skip ahead 2 at a time, for the entire length of your list. We can then ask the list what "name" is at that index lista[i] and what "value" is after it lista[i + 1].
new_dict = {}
for i in range(0, len(lista), 2):
    name = lista[i]
    value = lista[i + 1]
    # the name already exists
    # convert their values to numbers, add them, then convert back to a string
    if name in new_dict:
        new_dict[name] = str( int(new_dict[name]) + int(value) )
    # the name doesn't exist
    # simply append it with the value
    else:
        new_dict[name] = value
As explained by @Soviut, you can use the range() function with a step of 2 to land directly on each word. Since the values in your list are stored as strings, I have converted them to integers.
lista = ['hello','2','go','5','sit','4','line','3','sit','2', 'go','9','play','0']
data = {}
for i in range(0, len(lista), 2): # increase searching with step of 2 from 0 i.e. 0,2,4,...
    if lista[i] in data.keys(): # this condition checks whether your element exist in dictionary key or not
        data[lista[i]] = int(data[lista[i]]) + int(lista[i+1])
    else:
        data[lista[i]] = int(lista[i+1])
print(data)
Output
{'hello': 2, 'go': 14, 'sit': 6, 'line': 3, 'play': 0}
lista = ['hello','2','go','5','sit','4','line','3','sit','2', 'go','9','play','0']
dictionary = {}
for keyword, value in zip(*[iter(lista)]*2): # iterate two at a time
    if keyword in dictionary: # if the key is present, add to the existing sum
        dictionary[keyword] = dictionary[keyword] + int(value)
    else: # if not present, set the value for the first time
        dictionary[keyword] = int(value)
print(dictionary)
Output:
{'hello': 2, 'go': 14, 'sit': 6, 'line': 3, 'play': 0}
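The zip(*[iter(lista)]*2) idiom works because the same iterator object is handed to zip twice, so each output tuple consumes two consecutive items. A sketch on a shortened list (the names it and pairs are illustrative, and the odd length is deliberate):

```python
lista = ['hello', '2', 'go', '5', 'sit']  # odd length on purpose

it = iter(lista)
pairs = list(zip(it, it))  # both arguments are the same iterator
print(pairs)  # [('hello', '2'), ('go', '5')]
```

Note that the trailing unpaired 'sit' is silently dropped; itertools.zip_longest would keep it and pad the missing value with None.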
Another solution using iter(), itertools.zip_longest() and itertools.groupby() functions:
import itertools
lista = ['hello','2','go','5','sit','4','line','3','sit','2', 'go','9','play','0']
it = iter(lista)
d = {k: sum(int(_[1]) for _ in g)
     for k,g in itertools.groupby(sorted(itertools.zip_longest(it, it)), key=lambda x: x[0])}
print(d)
The output:
{'line': 3, 'sit': 6, 'hello': 2, 'play': 0, 'go': 14}
You can use range(start, end, step) to get the end index of each pair, slice the list into pairs, and then use Counter() from collections to sum the values of duplicate keys.
Here yourdict will be {'go': 14, 'line': 3, 'sit': 6, 'play': 0, 'hello': 2}
from collections import Counter
counter_obj = Counter()
lista = ['hello','2','go','5','sit','4','line','3','sit','2', 'go','9','play','0']
items, start = [], 0
for end in range(2,len(lista)+2,2):
    items.append(lista[start:end])  # slice out one [word, value] pair
    start = end
for item in items:
    counter_obj[item[0]] += int(item[1])
yourdict = dict(counter_obj)
print(yourdict)