Independent objects needed when zipping into dict - python

I have a list of items:
my_list = ['first', 'second', 'third']
I need to convert these items into a dictionary (so I can access each element by its name) and associate multiple counters with each element: counter1, counter2, counter3.
So I do the following:
counter_dict = {'counter1': 0, 'counter2': 0, 'counter3': 0}
my_dict = dict(zip(my_list, [counter_dict]*len(my_list)))
This way I obtain a nested dictionary
{
'first': {
'counter1': 0,
'counter2': 0,
'counter3': 0
},
'second': {
'counter1': 0,
'counter2': 0,
'counter3': 0
},
'third': {
'counter1': 0,
'counter2': 0,
'counter3': 0
}
}
The problem here is that each counter dictionary is not independent, so when I update one counter, all of them are updated.
The following line updates not only counter2 of second but also counter2 of first and third:
my_dict['second']['counter2'] += 1
{
'first': {
'counter1': 0,
'counter2': 1,
'counter3': 0
},
'second': {
'counter1': 0,
'counter2': 1,
'counter3': 0
},
'third': {
'counter1': 0,
'counter2': 1,
'counter3': 0
}
}
I am aware that by doing [counter_dict]*len(my_list) I am pointing all dictionaries to a single one.
The question is: how can I achieve what I need by creating independent dictionaries?
In the end I need to:
Access each element of the list as a key
For each element have multiple counters that I can update independently
Thanks a lot

You can use copy.deepcopy to copy a dict together with its contents.
import copy
Instead of [counter_dict]*len(my_list)
use:
[copy.deepcopy(counter_dict) for _ in my_list]
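To see why the copies matter, here is a minimal sketch using the names from the question. It compares object identity in the two constructions; the variable names shared and independent are only illustrative.

```python
import copy

my_list = ['first', 'second', 'third']
counter_dict = {'counter1': 0, 'counter2': 0, 'counter3': 0}

# List multiplication repeats a reference to the same dict object.
shared = dict(zip(my_list, [counter_dict] * len(my_list)))
print(shared['first'] is shared['second'])  # True

# The comprehension calls deepcopy once per key, so every value is a separate object.
independent = dict(zip(my_list, [copy.deepcopy(counter_dict) for _ in my_list]))
independent['second']['counter2'] += 1
print(independent['first']['counter2'], independent['second']['counter2'])  # 0 1
```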

Try this one:
my_dict = {k: counter_dict.copy() for k in my_list}
Instead of zipping values, it's enough to iterate over the keys and copy your counters into the values using a dict comprehension.
Note that this works only when counter_dict is one level deep. If it contains nested dictionaries, use deepcopy from the copy module:
from copy import deepcopy
my_dict = {k: deepcopy(counter_dict) for k in my_list}
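The difference between the two only shows up once the template holds a mutable value. A small sketch, assuming a hypothetical template with a nested list (the 'history' key is not from the question):

```python
from copy import deepcopy

# Hypothetical template containing a nested mutable value.
template = {'counter1': 0, 'history': []}

# dict.copy() is shallow: each value is a new dict, but the inner list is shared.
shallow = {k: template.copy() for k in ['first', 'second']}
shallow['first']['history'].append('hit')
print(shallow['second']['history'])  # ['hit']

# deepcopy also duplicates the nested list.
template = {'counter1': 0, 'history': []}
deep = {k: deepcopy(template) for k in ['first', 'second']}
deep['first']['history'].append('hit')
print(deep['second']['history'])  # []
```

For a flat dictionary of integers, as in the question, the shallow copy is sufficient.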
Another approach is to use defaultdict from collections, so you don't even need to create a dictionary with predefined counters:
from collections import defaultdict
my_dict = {k: defaultdict(int) for k in my_list}
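A short usage sketch of this variant. One thing to be aware of is that a counter only exists after it has been touched, and even reading a missing counter creates it:

```python
from collections import defaultdict

my_list = ['first', 'second', 'third']
my_dict = {k: defaultdict(int) for k in my_list}

my_dict['second']['counter2'] += 1    # the counter is created on first use
print(my_dict['second']['counter2'])  # 1
print(dict(my_dict['first']))         # {} because nothing was touched yet
print(my_dict['first']['counter2'])   # 0, and the key now exists in 'first'
```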

Use a list comprehension, not the * operator, and create a new dictionary for each element of the list.
my_dict = dict(zip(my_list, [dict(counter_dict) for _ in my_list]))

You can go and make a function that will create a counter_dict whenever you want to assign one:
def create_counter_dict(length):
    return {'counter{}'.format(n+1) : 0 for n in range(length)}
Then when you want to assign a counter_dict to your my_dict:
my_dict = dict(zip(my_list, [create_counter_dict(3) for _ in range(len(my_list))]))
This way you don't have to make a copy of your dictionary, and you can alter this factory function to your needs.

Related

Create new dictionaries based on names of keys of another dictionary

I have a dictionary "A":
A = {
"Industry1": 1,
"Industry2": 1,
"Industry3": 1,
"Customer1": 1,
"Customer2": 1,
"LocalShop1": 1,
"LocalShop2": 1,
}
I want to group by key names and create new dictionaries for each "category", the names should be generated automatically.
Expected Output:
Industry = {
"Industry1": 1,
"Industry2": 1,
"Industry3": 1,
}
Customer = {
"Customer1": 1,
"Customer2": 1,
}
LocalShop = {
"LocalShop1": 1,
"LocalShop2": 1,
}
Can you guys give me a hint to achieve this output, please?
Assuming your keys are in (KEYNAME)(NUM), you can do the following:
import re
from collections import defaultdict
from pprint import pprint
A = {
"Industry1": 1,
"Industry2": 1,
"Industry3": 1,
"Customer1": 1,
"Customer2": 1,
"LocalShop1": 1,
"LocalShop2": 1,
}
key_pattern = re.compile(r"[a-zA-Z]+")
result = defaultdict(dict)
for k, v in A.items():
    key = key_pattern.search(k).group()
    result[key][k] = v
pprint(dict(result))
output:
{'Customer': {'Customer1': 1, 'Customer2': 1},
'Industry': {'Industry1': 1, 'Industry2': 1, 'Industry3': 1},
'LocalShop': {'LocalShop1': 1, 'LocalShop2': 1}}
I created a dictionary of dictionaries instead of having individual variables for each dictionary. It's easier to manage and it doesn't pollute the global namespace.
Basically, you iterate through the key-value pairs and use the r"[a-zA-Z]+" pattern to grab the part of each key without the number. That part is used as the key in the outer dictionary.
For the purposes of this answer I've assumed that every key in the dictionary A has either the word "Industry", "Customer", or "Shop" in it. This allows us to detect what category each entry needs to be in by checking if each key contains a certain substring (i.e. "Industry"). If this assumption doesn't hold for your specific circumstances, you'll have to find a different way to write the if/elif statements in the solutions below that fits your situation better.
Here's one way to do it. You make a new dictionary for each category and check if "Industry", "Customer", or "Shop" is in each key.
industries = {}
customers = {}
shops = {}
for key, value in A.items():
    if "Industry" in key:
        industries[key] = value
    elif "Customer" in key:
        customers[key] = value
    elif "Shop" in key:
        shops[key] = value
Another, cleaner version would be where you have a nested dictionary that stores all of your categories, and each category would have its own dictionary inside the main one. This would help in the future if you needed to add more categories. You'd only have to add them in one place (in the dictionary definition) and the code would automatically adjust.
categories = {
"Industry": {},
"Customer": {},
"Shop": {},
}
for key, value in A.items():
    for category_name, category_dict in categories.items():
        if category_name in key:
            category_dict[key] = value
If you can't detect the category from the string of an entry, then you may have to store that categorical information in the key or the value of each entry in A, so that you can detect the category when trying to filter everything.
You can use itertools.groupby with a key function that extracts the word without the number. I wouldn't recommend making separate variables from the groups; that doesn't scale if there are more than 3 categories. Just put them in a new dictionary or in a list.
import itertools
import re
A = {'Industry1': 1,
'Industry2': 1,
'Industry3': 1,
'Customer1': 1,
'Customer2': 1,
'LocalShop1': 1,
'LocalShop2': 1}
grouped = [dict(val) for k, val in itertools.groupby(A.items(), lambda x: re.match(r'(.+)\d{1,}', x[0]).group(1))]
Output grouped:
[
{'Industry1': 1, 'Industry2': 1, 'Industry3': 1},
{'Customer1': 1, 'Customer2': 1},
{'LocalShop1': 1, 'LocalShop2': 1}
]
If you are sure that there are exactly 3 elements in that list and you really want them as variables, you can do it with tuple unpacking:
Industry, Customer, LocalShop = [dict(val) for k, val in itertools.groupby(A.items(), lambda x: re.match(r'(.+)\d{1,}', x[0]).group(1))]
I think I would save the results in a new dictionary with the grouped key as the new key and the grouped dictionary as the value:
grouped_dict = {k: dict(val) for k, val in itertools.groupby(A.items(), lambda x: re.match(r'(.+)\d{1,}', x[0]).group(1))}
Output grouped_dict:
{'Industry': {'Industry1': 1, 'Industry2': 1, 'Industry3': 1},
'Customer': {'Customer1': 1, 'Customer2': 1},
'LocalShop': {'LocalShop1': 1, 'LocalShop2': 1}}
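itertools.groupby only merges adjacent items with equal keys, so the snippets above depend on A already being ordered by category. Here is a hedged sketch for unordered input that sorts by the same prefix first. The helper name prefix and the shuffled sample data are illustrative assumptions, and keys are assumed to end in digits.

```python
import itertools
import re

def prefix(item):
    # item is a (key, value) pair; return the key without its trailing digits.
    return re.match(r'(.+?)\d+$', item[0]).group(1)

# Same kind of data as above, but with the categories interleaved.
A = {'Industry1': 1, 'Customer1': 1, 'Industry2': 1, 'LocalShop1': 1, 'Customer2': 1}

# Sorting by prefix makes equal prefixes adjacent; sorted() is stable,
# so items keep their original relative order within each group.
pairs = sorted(A.items(), key=prefix)
grouped_dict = {k: dict(g) for k, g in itertools.groupby(pairs, key=prefix)}
print(grouped_dict)
# {'Customer': {'Customer1': 1, 'Customer2': 1},
#  'Industry': {'Industry1': 1, 'Industry2': 1},
#  'LocalShop': {'LocalShop1': 1}}
```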

How to find indexes of first unique value from a list of dictionary using a key

Assuming a list of dictionaries
l=[
{"id":1, "score":80, "remarks":"B" },
{"id":2, "score":80, "remarks":"A" },
{"id":1, "score":80, "remarks":"C" },
{"id":3, "score":80, "remarks":"B" },
{"id":1, "score":80, "remarks":"F" },
]
I would like to find the index of the first occurrence of each unique value for a given key. So given the list above, I am expecting this result:
using_id = [0,1,3]
using_score = [0]
using_remarks = [0,1,2,4]
What makes it hard for me is that the list contains dictionaries; if it contained numbers I could just use this line:
indexes = [l.index(x) for x in sorted(set(l))]
Using set() on a list of dictionaries throws TypeError: unhashable type: 'dict'.
The constraints are: it must only use modules that ship with Python 3.10; the code should scale, as the length of the list will reach into the hundreds; as few lines as possible is a bonus too :)
Of course there is the brute-force method, but can this be made more efficient, or use fewer lines of code?
unique_items = []
unique_index = []
for index, item in enumerate(l, start=0):
    if item["remarks"] not in unique_items:
        unique_items.append(item["remarks"])
        unique_index.append(index)
print(unique_items)
print(unique_index)
You could restructure your data into a dict of lists; then you can use code similar to what you would use for a list of values:
dd = { k : [d[k] for d in l] for k in l[0] }
indexes = { k : sorted(dd[k].index(x) for x in set(dd[k])) for k in dd }
Output:
{'id': [0, 1, 3], 'score': [0], 'remarks': [0, 1, 2, 4]}
Since dict keys are inherently unique, you can use a dict to keep track of the first index of each unique value by storing the values as keys of a dict and setting the index as the default value of a key with dict.setdefault:
for key in 'id', 'score', 'remarks':
    unique = {}
    for i, d in enumerate(l):
        unique.setdefault(d[key], i)
    print(key, list(unique.values()))
This outputs:
id [0, 1, 3]
score [0]
remarks [0, 1, 2, 4]
Demo: https://replit.com/#blhsing/PalatableAnxiousMetric
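The setdefault idea can also be wrapped in a small reusable function. This is only a sketch; the function name first_unique_indexes is made up for illustration. It makes a single pass over the list and relies on dicts preserving insertion order (Python 3.7+).

```python
def first_unique_indexes(records, key):
    """Return the index of the first record for each distinct records[i][key]."""
    first_seen = {}
    for index, record in enumerate(records):
        # setdefault stores the index only if this value has not been seen yet.
        first_seen.setdefault(record[key], index)
    return list(first_seen.values())

l = [
    {"id": 1, "score": 80, "remarks": "B"},
    {"id": 2, "score": 80, "remarks": "A"},
    {"id": 1, "score": 80, "remarks": "C"},
    {"id": 3, "score": 80, "remarks": "B"},
    {"id": 1, "score": 80, "remarks": "F"},
]

print(first_unique_indexes(l, "id"))       # [0, 1, 3]
print(first_unique_indexes(l, "score"))    # [0]
print(first_unique_indexes(l, "remarks"))  # [0, 1, 2, 4]
```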
With functools.reduce:
l=[
{"id":1, "score":80, "remarks":"B" },
{"id":2, "score":80, "remarks":"A" },
{"id":1, "score":80, "remarks":"C" },
{"id":3, "score":80, "remarks":"B" },
{"id":1, "score":80, "remarks":"F" },
]
from functools import reduce
result = {}
reduce(lambda x, y: result.update({y[1]['remarks']:y[0]}) \
    if y[1]['remarks'] not in result else None, \
    enumerate(l), result)
result
# {'B': 0, 'A': 1, 'C': 2, 'F': 4}
unique_items = list(result.keys())
unique_items
# ['B', 'A', 'C', 'F']
unique_index = list(result.values())
unique_index
# [0, 1, 2, 4]
Explanation: at each step the lambda function adds an entry to the dictionary result that maps the remarks value to its index in l, but only at the first occurrence of a given value for remarks.
The dictionary structure for the result makes sense since you're extracting unique values and they can therefore be seen as keys.

How to find and append dictionary values which are present in a list

I have a list which has unique sorted values
arr = ['Adam', 'Ben', 'Chris', 'Dean', 'Flower']
I have a dictionary which has values as such
dict = {
'abc': {'Dean': 1, 'Adam':0, 'Chris':1},
'def': {'Flower':0, 'Ben':1, 'Dean':0}
}
For each value in arr, every inner dict should contain that value as a key; if a key isn't present in an inner dict, it should be added with the value -1.
Result
dict = {
'abc': {'Adam':0, 'Ben':-1, 'Chris':1, 'Dean': 1, 'Flower':-1},
'def': {'Adam':-1, 'Ben':1, 'Chris':-1, 'Dean': 0, 'Flower':0}
}
How can I achieve this using list and dict comprehensions in Python?
dd = {
    key: {k: value.get(k, -1) for k in arr}
    for key, value in dd.items()
}
{k: value.get(k, -1) for k in arr} will make sure that your keys are in the same order as you defined in the arr list.
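Applied to the data from the question, a runnable sketch looks like this. The outer dictionary is named dd here, as in the snippet above, rather than dict.

```python
arr = ['Adam', 'Ben', 'Chris', 'Dean', 'Flower']
dd = {
    'abc': {'Dean': 1, 'Adam': 0, 'Chris': 1},
    'def': {'Flower': 0, 'Ben': 1, 'Dean': 0},
}

# Rebuild each inner dict by walking arr; get() supplies -1 for missing names.
dd = {
    key: {k: value.get(k, -1) for k in arr}
    for key, value in dd.items()
}

print(dd['abc'])  # {'Adam': 0, 'Ben': -1, 'Chris': 1, 'Dean': 1, 'Flower': -1}
print(dd['def'])  # {'Adam': -1, 'Ben': 1, 'Chris': -1, 'Dean': 0, 'Flower': 0}
```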
A side note on the order of keys in dictionary.
Dictionaries preserve insertion order. Note that updating a key does
not affect the order. Keys added after deletion are inserted at the
end.
Changed in version 3.7: Dictionary order is guaranteed to be insertion
order. This behavior was an implementation detail of CPython from 3.6.
Please do not name a variable dict; rename it to dct or something similar, since dict is the name of a Python built-in type and the assignment shadows it.
As for your question: just iterate through your dct and add the missing keys using setdefault:
arr = ['Adam', 'Ben', 'Chris', 'Dean', 'Flower']
dct = {
'abc': {'Dean': 1, 'Adam':0, 'Chris':1},
'def': {'Flower':0, 'Ben':1, 'Dean':0}
}
def add_dict_keys(dct, arr):
    for key in arr:
        dct.setdefault(key, -1)
    return dct
for k, v in dct.items():
    add_dict_keys(v, arr)
print(dct) # has updated values

Creating Dictionaries from Lists inside of Dictionaries

I'm quite new to Python and I have been stumped by a seemingly simple task.
In part of my program, I would like to create Secondary Dictionaries from the items inside lists that are themselves the values of a Primary Dictionary.
I would also like to default those values to 0
For the sake of simplicity, the Primary Dictionary looks something like this:
primaryDict = {'list_a':['apple', 'orange'], 'list_b':['car', 'bus']}
What I would like my result to be is something like:
{'list_a':[{'apple':0}, {'orange':0}], 'list_b':[{'car':0}, {'bus':0}]}
I understand the process should be to iterate through each list in the primaryDict, then iterate through the items in the list and then assign them as Dictionaries.
I've tried many variations of "for" loops all looking similar to:
for listKey in primaryDict:
    for word in listKey:
        {word:0 for word in listKey}
I've also tried some methods of combining Dictionary and List comprehension,
but when I try to index and print the Dictionaries with, for example:
print(primaryDict['list_a']['apple'])
I get the "TypeError: list indices must be integers or slices, not str", which I interpret to mean that my 'apple' is not actually a Dictionary, but still a string in a list. I tested that by replacing 'apple' with 0 and it just returns 'apple', which confirms it.
I would like help with regards to:
-Whether or not the values in my list are assigned as Dictionaries with value '0'
or
-Whether the mistake is in my indexing (in the loop or the print function), and what I am mistaken with
or
-Everything I've done won't get me the desired outcome and I should attempt a different approach
Thanks
Here is a dict comprehension that works:
{k: [{v: 0} for v in vs] for k, vs in primaryDict.items()}
There are two problems with your current code. First, you are trying to iterate over listKey, which is a string. This produces a sequence of characters.
Second, you should use something like
[{word: 0} for word in words]
in place of
{word:0 for word in listKey}
You are close. The main issue is the way you iterate your dictionary, and the fact you do not append or assign your sub-dictionaries to any variable.
This is one solution using only for loops and list.append.
d = {}
for k, v in primaryDict.items():
    d[k] = []
    for w in v:
        d[k].append({w: 0})
{'list_a': [{'apple': 0}, {'orange': 0}],
'list_b': [{'car': 0}, {'bus': 0}]}
A more Pythonic solution is to use a dict comprehension with a nested list comprehension.
d = {k: [{w: 0} for w in v] for k, v in primaryDict.items()}
If you are using your dictionary for counting, which seems to be the implication, an even more Pythonic solution is to use collections.Counter:
from collections import Counter
d = {k: Counter(dict.fromkeys(v, 0)) for k, v in primaryDict.items()}
{'list_a': Counter({'apple': 0, 'orange': 0}),
'list_b': Counter({'bus': 0, 'car': 0})}
There are specific benefits attached to collections.Counter relative to normal dictionaries.
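A brief sketch of a few of those benefits, using only documented Counter behavior: missing keys read as 0 instead of raising KeyError, update() counts items from an iterable, and most_common() ranks the entries.

```python
from collections import Counter

primaryDict = {'list_a': ['apple', 'orange'], 'list_b': ['car', 'bus']}
d = {k: Counter(dict.fromkeys(v, 0)) for k, v in primaryDict.items()}

d['list_a']['apple'] += 2
d['list_a'].update(['orange', 'apple'])  # adds 1 for every item in the iterable

print(d['list_a']['apple'])        # 3
print(d['list_a']['pear'])         # 0, a missing key does not raise KeyError
print(d['list_a'].most_common(1))  # [('apple', 3)]
```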
You can get the data structure that you desire via:
primaryDict = {'list_a':['apple', 'orange'], 'list_b':['car', 'bus']}
for k, v in primaryDict.items():
    primaryDict[k] = [{e: 0} for e in v]
# primaryDict
{'list_b': [{'car': 0}, {'bus': 0}], 'list_a': [{'apple': 0}, {'orange': 0}]}
But the correct nested access would be:
print(primaryDict['list_a'][0]['apple']) # note the 0
If you actually want primaryDict['list_a']['apple'] to work, do instead
for k, v in primaryDict.items():
    primaryDict[k] = {e: 0 for e in v}
# primaryDict
{'list_b': {'car': 0, 'bus': 0}, 'list_a': {'orange': 0, 'apple': 0}}
primaryDict = {'list_a':['apple', 'orange'], 'list_b':['car', 'bus']}
for listKey in primaryDict:
    primaryDict[listKey] = [{word:0} for word in primaryDict[listKey]]
print(primaryDict)
Output:
{'list_a':[{'apple':0}, {'orange':0}], 'list_b':[{'car':0}, {'bus':0}]}
Hope this helps!
@qqc1037, I checked and updated your code to make it work. I have pointed out the problems with your code in the comments. Finally, I have also added one more example using a list comprehension, map() and a lambda function.
import json
primaryDict = {'list_a':['apple', 'orange'], 'list_b':['car', 'bus']}
secondaryDict = {}
for listKey in primaryDict:
    new_list = [] # You did not define any temporary list
    for word in primaryDict[listKey]: # You forgot to use the key that refers to the list
        new_list.append({word: 0}) # Here you forgot to append to the list
    secondaryDict.update({listKey: new_list}) # Finally, you forgot to update the secondary dictionary
# Pretty printing dictionary
print(json.dumps(secondaryDict, indent=4))
"""
{
"list_a": [
{
"apple": 0
},
{
"orange": 0
}
],
"list_b": [
{
"car": 0
},
{
"bus": 0
}
]
}
"""
Another example: Using list comprehension, map(), lambda function
# Using Python 3.5.2
import json
primaryDict = {'list_a':['apple', 'orange'], 'list_b':['car', 'bus']}
secondaryDict = dict(map(lambda key: (key, [{item:0} for item in primaryDict[key]]), list(primaryDict) ))
# Pretty printing secondary dictionary
print(json.dumps(secondaryDict, indent=4))
"""
{
"list_a": [
{
"apple": 0
},
{
"orange": 0
}
],
"list_b": [
{
"car": 0
},
{
"bus": 0
}
]
}
"""

How to sum a list of dicts

What is the most Pythonic way to take a list of dicts and sum up all the values for matching keys from every row in the list?
I did this but I suspect a comprehension is more Pythonic:
from collections import defaultdict
demandresult = defaultdict(int) # new blank dict to store results
for d in demandlist:
    for k,v in d.items():  # items() works on Python 2 and 3; iteritems() is Python 2 only
        demandresult[k] = demandresult[k] + v
In Python - sum values in dictionary the question involved the same key all the time, but in my case, the key in each row might be a new key never encountered before.
I think that your method is quite pythonic. Comprehensions are nice but they shouldn't really be overdone, and they can lead to really messy one-liners, like the one below :).
If you insist on a dict comp:
demand_list = [{u'2018-04-29': 1, u'2018-04-30': 1, u'2018-05-01': 1},
{u'2018-04-21': 1},
{u'2018-04-18': 1, u'2018-04-19': 1, u'2018-04-17' : 1}]
d = {key:sum(i[key] for i in demand_list if key in i)
     for key in set(a for l in demand_list for a in l.keys())}
print(d)
>>>{'2018-04-21': 1, '2018-04-17': 1, '2018-04-29': 1, '2018-04-30': 1, '2018-04-19': 1, '2018-04-18': 1, '2018-05-01': 1}
Here is another one-liner (ab-)using collections.ChainMap to get the combined keys:
>>> from collections import ChainMap
>>> {k: sum(d.get(k, 0) for d in demand_list) for k in ChainMap(*demand_list)}
{'2018-04-17': 1, '2018-04-21': 1, '2018-05-01': 1, '2018-04-30': 1, '2018-04-19': 1, '2018-04-29': 1, '2018-04-18': 1}
This is easily the slowest of the methods proposed here.
The only thing that seemed unclear in your code was the double for loop. It may be clearer to collapse the demandlist into a flat iterable; then the loop presents the logic as simply as possible. Consider:
demandlist = [{
u'2018-04-29': 1,
u'2018-04-30': 1,
u'2018-05-01': 1
}, {
u'2018-04-21': 1
}, {
u'2018-04-18': 1,
u'2018-04-19': 1,
u'2018-04-17': 1
}]
import itertools as it
from collections import defaultdict
demandresult = defaultdict(int)
for k, v in it.chain.from_iterable(map(lambda d: d.items(), demandlist)):
    demandresult[k] = demandresult[k] + v
(With this, print(demandresult) prints defaultdict(<class 'int'>, {'2018-04-29': 1, '2018-04-30': 1, '2018-05-01': 1, '2018-04-21': 1, '2018-04-18': 1, '2018-04-19': 1, '2018-04-17': 1}).)
Imagining myself reading this for the first time (or a few months later), I can see myself thinking, "Ok, I'm collapsing demandlist into a key-val iterable, I don't particularly care how, and then summing values of matching keys."
It's unfortunate that I need that map there to ensure the final iterable has key-val pairs: it.chain.from_iterable(demandlist) yields only keys, so I need to call items on each dict.
Note that unlike many of the answers proposed, this implementation (like yours!) minimizes the number of scans over the data to just one, which is a performance win (and I try to pick up as many easy performance wins as I can).
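Another single-pass option, not taken from the answers above, is collections.Counter. When given a mapping, Counter.update adds the values to the existing counts instead of replacing them. A minimal sketch with made-up sample data in which some keys repeat:

```python
from collections import Counter

# Hypothetical sample data; unlike the data above, some dates repeat across rows.
demandlist = [
    {'2018-04-29': 1, '2018-04-30': 1},
    {'2018-04-29': 2, '2018-04-21': 1},
    {'2018-04-21': 4},
]

demandresult = Counter()
for d in demandlist:
    demandresult.update(d)  # adds each value to the running total for its key

print(dict(demandresult))
# {'2018-04-29': 3, '2018-04-30': 1, '2018-04-21': 5}
```

update() is used here rather than summing Counter objects with +, because Counter addition drops entries whose total is zero or negative.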
I suppose you want to return a list of summed values of each dictionary.
list_of_dict = [
{'a':1, 'b':2, 'c':3},
{'d':4, 'e':5, 'f':6}
]
sum_of_each_row = [sum(v for v in d.values()) for d in list_of_dict] # [6,15]
If you want the total sum, simply wrap sum_of_each_row in sum().
EDIT:
The main problem is that you don't have a default value for each of the keys, so you can make use of the method dict.setdefault() to set the default value when there's a new key.
list_of_dict = [
{'a':1, 'b':1},
{'b':1, 'c':1},
{'a':2}
]
d = {}
for row in list_of_dict:
    for k, v in row.items():
        # setdefault returns the running total for k, starting at 0 for a new key
        d[k] = d.setdefault(k, 0) + v
# d == {'a': 3, 'b': 2, 'c': 1}
