I'm trying to take a dictionary with repeated keys, merge the values of each key, and keep only one copy of any duplicated value.
data = {"test1":["data1", "data2"],
"test1":["data3", "data4", "data2"],
"test2":["1data", "2data"],
"test2":["3data", "4data", "2data"]
}
desired_result = {"test1":["data1", "data2", "data3", "data4"],
"test2":["1data", "2data", "3data", "4data"]
}
Any ideas how to get this result?
First you need to create a list of dicts (because a dictionary can't hold the same key twice). Then iterate over them, extend one list per key with the values, and finally use set to remove the duplicates, as shown below:
data = [{"test1":["data1", "data2"]},{"test1":["data3", "data4", "data2"]},{"test2":["1data", "2data"]},{"test2":["3data", "4data", "2data"]}]
from collections import defaultdict
rslt_out = defaultdict(list)
for dct in data:
    for k, v in dct.items():
        rslt_out[k].extend(v)
for k, v in rslt_out.items():
    rslt_out[k] = list(set(v))
print(rslt_out)
output:
defaultdict(list,
{'test1': ['data3', 'data4', 'data2', 'data1'],
'test2': ['2data', '3data', '1data', '4data']})
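Note that set does not preserve order, so the lists above come out in arbitrary order, unlike desired_result in the question. If the order matters, one option is to deduplicate with dict.fromkeys, which keeps the first occurrence of each value (dicts preserve insertion order in Python 3.7+). A minimal sketch, reusing the list-of-dicts input from the answer; the names merged and result are mine:

```python
from collections import defaultdict

data = [
    {"test1": ["data1", "data2"]},
    {"test1": ["data3", "data4", "data2"]},
    {"test2": ["1data", "2data"]},
    {"test2": ["3data", "4data", "2data"]},
]

# Collect every value under its key, duplicates included
merged = defaultdict(list)
for dct in data:
    for k, v in dct.items():
        merged[k].extend(v)

# dict.fromkeys keeps the first occurrence of each value, in order
result = {k: list(dict.fromkeys(v)) for k, v in merged.items()}
print(result)
```

This only works when the values are hashable, which is also true of the set approach.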
I don't understand how the k: v syntax works. I've read that it pairs the items: k is the key and v is the value. If I want an additional field called 'cusip' in addition to 'lastPrice', how would I add that? Thanks
response_dict = response.json()
new_dict = {k: v['lastPrice'] for k, v in response_dict.items()}
df = pd.DataFrame.from_dict(new_dict, orient='index', columns=['lastPrice'])
You just need to build the appropriate tuple.
new_dict = {k: (v['lastPrice'], v['cusip']) for k, v in response_dict.items()}
In a dictionary comprehension {key_expr: value_expr for ...}, both key_expr and value_expr are allowed to be arbitrary expressions.
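To make that concrete, here is a small sketch without pandas. The response_dict below is a made-up stand-in for response.json() (the symbols and values are placeholders, and the real payload may look different); it only shows how each key ends up paired with a tuple:

```python
# Hypothetical stand-in for response.json(); the real payload may differ
response_dict = {
    "AAA": {"lastPrice": 10.5, "cusip": "CUSIP-A", "volume": 1000},
    "BBB": {"lastPrice": 20.0, "cusip": "CUSIP-B", "volume": 2000},
}

# k is each top-level key, v is the nested dict stored under it
new_dict = {k: (v["lastPrice"], v["cusip"]) for k, v in response_dict.items()}

# Each value is a 2-tuple, so it can be unpacked again when iterating
for symbol, (last_price, cusip) in new_dict.items():
    print(symbol, last_price, cusip)
```

With two values per key, the DataFrame.from_dict call from the question would presumably need two column names as well, e.g. columns=['lastPrice', 'cusip'].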
Your question is vague, but I'll try to simulate it; this may be useful or at least close to a solution for your issue.
Instead of the response, I defined a new dict with initial values.
import pandas as pd
response_dict = {"price":[10,5,9],"Products":["shoes","clothes","hat"], "lastprices":[5,6,7]}
new_dict = {k: v for k, v in response_dict.items()}
df = pd.DataFrame.from_dict(new_dict)
df
If you want to add a new key with new values, just modify the dictionary, for instance:
new_dict["cusip"]=[1,2,3]
df = pd.DataFrame.from_dict(new_dict)
df
I have a csv file that looks something like this:
apple 12 yes
apple 15 no
apple 19 yes
and I want to use the fruit as a key and turn rest of the row into a list of lists that's a value, so it looks like:
{'apple': [[12, 'yes'],[15, 'no'],[19, 'yes']]}
The sample of my code below turns each row into its own dictionary, when what I want is to combine the data.
import csv
fp = open('fruits.csv', 'r')
reader = csv.reader(fp)
next(reader,None)
D = {}
for row in reader:
    D = {row[0]:[row[1],row[2]]}
    print(D)
My output looks like:
{'apple': [12,'yes']}
{'apple': [15,'no']}
{'apple': [19,'yes']}
Your problem is you reset D in every iteration. Don't do that.
Note that the output may look somewhat related to what you want, but this isn't actually the case. If you inspect the variable D after this code is finished running, you'll see that it contains only the last value that you set it to:
{'apple': [19,'yes']}
Instead, add keys to the dictionary whenever you encounter a new fruit. The value at this key will be an empty list. Then append the data you want to this empty list.
import csv
fp = open('fruits.csv', 'r')
reader = csv.reader(fp)
next(reader, None)  # skip the first row
D = {}
for row in reader:
    if row[0] not in D:  # if the key doesn't already exist in D, add an empty list
        D[row[0]] = []
    D[row[0]].append(row[1:])  # append the rest of this row to the list in the dictionary
print(D)  # print the dictionary AFTER you finish creating it
Alternatively, define D as a collections.defaultdict(list) and you can skip the entire if block.
Note that in a single dictionary, one key can only have one value. There can not be multiple values assigned to the same key. In this case, each fruit name (key) has a single list value assigned to it. This list contains more lists inside it, but that is immaterial to the dictionary.
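One more detail: csv.reader yields every field as a string, so to get 12 rather than '12' as in the desired output you have to convert it yourself. A minimal sketch of the defaultdict variant with that conversion; it reads from an in-memory io.StringIO instead of fruits.csv, and the header row and comma delimiter are assumptions on my part:

```python
import csv
import io
from collections import defaultdict

# Stand-in for open('fruits.csv'); assumes a header row and comma-separated columns
fp = io.StringIO("fruit,count,flag\napple,12,yes\napple,15,no\napple,19,yes\n")

reader = csv.reader(fp)
next(reader, None)  # skip the header row

D = defaultdict(list)  # missing keys start out as an empty list
for fruit, count, flag in reader:
    D[fruit].append([int(count), flag])  # csv yields strings, so convert the number

print(dict(D))
```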
You can use a mix of sorting and groupby:
from itertools import groupby
from operator import itemgetter
_input = """apple 12 yes
apple 15 no
apple 19 yes
"""
entries = [l.split() for l in _input.splitlines()]
result = {key: [values[1:] for values in grp]
          for key, grp in groupby(sorted(entries, key=itemgetter(0)), key=itemgetter(0))}
groupby only groups consecutive items, so the entries are sorted first to make sure each key ends up in a single group. Both sorted and groupby use the first element of each line as the key.
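To see why the sort matters, here is a short sketch with a made-up input where the same fruit is not on adjacent lines. Without sorting, groupby produces two separate 'apple' groups, and in a dict comprehension the later group silently overwrites the earlier one:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input: 'apple' rows are separated by a 'pear' row
entries = [["apple", "12"], ["pear", "3"], ["apple", "15"]]
first = itemgetter(0)

# Keys seen by groupby without and with sorting
unsorted_keys = [key for key, _ in groupby(entries, key=first)]
sorted_keys = [key for key, _ in groupby(sorted(entries, key=first), key=first)]

# Without sorting, the second 'apple' group replaces the first one
lossy = {key: [v[1:] for v in grp] for key, grp in groupby(entries, key=first)}
complete = {key: [v[1:] for v in grp]
            for key, grp in groupby(sorted(entries, key=first), key=first)}

print(unsorted_keys, sorted_keys)
print(lossy)
print(complete)
```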
Part of the issue you are running into is that rather than "adding" data to D[key] via append, you are just replacing it. In the end you get only the last result per key.
You might look at collections.defaultdict(list) as a strategy to initialize D, or use setdefault(). In this case I'll use setdefault() as it is straightforward, but don't discount defaultdict() in more complicated scenarios.
data = [
["apple", 12, "yes"],
["apple", 15, "no"],
["apple", 19, "yes"]
]
result = {}
for item in data:
    result.setdefault(item[0], []).append(item[1:])
print(result)
This should give you:
{
'apple': [
[12, 'yes'],
[15, 'no'],
[19, 'yes']
]
}
If you are keen on looking at defaultdict(), a solution based on it might look like:
import collections
data = [
["apple", 12, "yes"],
["apple", 15, "no"],
["apple", 19, "yes"]
]
result = collections.defaultdict(list)
for item in data:
    result[item[0]].append(item[1:])
print(dict(result))
I have a list of nested dictionaries, and I want to pull specific values out of it and put them into a new dictionary. The list looks like this:
vid = [{'a':{'display':'axe', 'desc':'red'}, 'b':{'confidence':'good'}},
{'a':{'display':'book', 'desc':'blue'}, 'b':{'confidence':'poor'}},
{'a':{'display':'apple', 'desc':'green'}, 'b':{'confidence':'good'}}
]
I saw previous questions similar to this, but I still can't get the values such as 'axe' and 'red'. I would like the new dict to have a 'Description', 'Confidence' and other columns with the values from the nested dict.
I have tried this for loop:
new_dict = {}
for x in range(len(vid)):
    for y in vid[x]['a']:
        desc = y['desc']
        new_dict['Description'] = desc
I got many errors but mostly this error:
TypeError: string indices must be integers
Can someone please help solve how to get the values from the nested dictionary?
You don't need to iterate through the keys in the dictionary (the inner for-loop), just access the value you want.
vid = [{'a':{'display':'axe', 'desc':'red'}, 'b':{'confidence':'good'} },
{'a':{'display':'book', 'desc':'blue'}, 'b':{'confidence':'poor'}},
{'a':{'display':'apple', 'desc':'green'}, 'b':{'confidence':'good'}}
]
new_dict = {}
list_of_dicts = []
for x in range(len(vid)):
    desc = vid[x]['a']['desc']
    list_of_dicts.append({'desc': desc})
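For reference, the TypeError in the question comes from the inner loop: iterating over a dict yields its keys, which are strings here, so y['desc'] tries to index a string with a string. A small sketch of what happens, using one inner dict from vid (the names inner, keys and raised are just for illustration):

```python
inner = {'display': 'axe', 'desc': 'red'}

# Iterating over a dict yields its keys, not its values
keys = [y for y in inner]
print(keys)

try:
    keys[0]['desc']  # same as 'display'['desc']: indexing a str with a str
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)

# Indexing the dict itself with the key gives the value directly
print(inner['desc'])
```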
I have found a temporary solution for this. I decided to use the pandas dataframe instead.
import pandas as pd
df = pd.DataFrame(columns = ['Desc'])
for x in range(len(vid)):
    desc = vid[x]['a']['desc']
    df.loc[len(df)] = [desc]  # append one row at the end of the DataFrame
Since you want to write this to CSV later, pandas will help you a lot with this problem. Using pandas, you can collect the desc values like this:
import pandas as pd
new_dict = {'description': []}
df = pd.DataFrame(vid)
for index, row in df.iterrows():
    # append instead of assigning, so earlier rows are not overwritten
    new_dict['description'].append(row['a']['desc'])
a b
0 {'display': 'axe', 'desc': 'red'} {'confidence': 'good'}
1 {'display': 'book', 'desc': 'blue'} {'confidence': 'poor'}
2 {'display': 'apple', 'desc': 'green'} {'confidence': 'good'}
This is how the DataFrame looks: a and b are its columns, and each of your nested dicts is a row.
Try using this list comprehension:
d = [{'Description': i['a']['desc'], 'Confidence': i['b']['confidence']} for i in vid]
print(d)
I am working with a list that contains many dictionaries. I am trying to combine those dictionaries based on the value of one of their keys. For illustration, see the example below.
my_dict =[{'COLUMN_NAME': 'TABLE_1_COL_1', 'TABLE_NAME': 'TABLE_1'},
{'COLUMN_NAME': 'TABLE_1_COL_2', 'TABLE_NAME': 'TABLE_1'},
{'COLUMN_NAME': 'TABLE_1_COL_3', 'TABLE_NAME': 'TABLE_1'},
{'COLUMN_NAME': 'TABLE_2_COL_1', 'TABLE_NAME': 'TABLE_2'},
{'COLUMN_NAME': 'TABLE_2_COL_2', 'TABLE_NAME': 'TABLE_2'}]
Whenever the value of a key matches the value of the same key in another dict, the values of the other keys need to be combined.
Below is the sample output I expect from the above list of dicts.
new_lst = [{'TABLE_NAME': 'TABLE_1','COLUMN_NAME':['TABLE_1_COL_1','TABLE_1_COL_2','TABLE_1_COL_3']}, {'TABLE_NAME': 'TABLE_2','COLUMN_NAME': ['TABLE_2_COL_1','TABLE_2_COL_2']}]
How can I achieve this in the most efficient way?
You can use defaultdict to get similar output.
from collections import defaultdict
grouped = {}  # maps each TABLE_NAME to its merged defaultdict
for some_dict in my_dict:
    merged = grouped.setdefault(some_dict['TABLE_NAME'], defaultdict(list))
    for key, value in some_dict.items():
        if value not in merged[key]:  # keep each value only once
            merged[key].append(value)
new_lst = list(grouped.values())
new_lst will be of the form:
[{'TABLE_NAME': ['TABLE_1'],'COLUMN_NAME':['TABLE_1_COL_1','TABLE_1_COL_2','TABLE_1_COL_3']}, {'TABLE_NAME': ['TABLE_2'],'COLUMN_NAME': ['TABLE_2_COL_1','TABLE_2_COL_2']}]
Which is slightly different from what you wanted (even the singular elements are in arrays). I would recommend you leave it in this format if given the choice.
To get exactly what you wanted, add this after the above code:
for some_dict in new_lst:
    for key, value in some_dict.items():
        if len(value) == 1:
            some_dict[key] = value[0]
Now, new_lst is exactly like you expected:
[{'TABLE_NAME': 'TABLE_1','COLUMN_NAME':['TABLE_1_COL_1','TABLE_1_COL_2','TABLE_1_COL_3']}, {'TABLE_NAME': 'TABLE_2','COLUMN_NAME': ['TABLE_2_COL_1','TABLE_2_COL_2']}]
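One reason to prefer keeping every value as a list: the len(value) == 1 collapse applies to COLUMN_NAME too, so a table with a single column would end up with a plain string where the other tables have a list. A short sketch of that effect; TABLE_3 is a hypothetical single-column table that is not in the question's data:

```python
new_lst = [
    {'TABLE_NAME': ['TABLE_1'], 'COLUMN_NAME': ['TABLE_1_COL_1', 'TABLE_1_COL_2']},
    # Hypothetical table with only one column
    {'TABLE_NAME': ['TABLE_3'], 'COLUMN_NAME': ['TABLE_3_COL_1']},
]

# Same collapse step as above: unwrap every single-element list
for some_dict in new_lst:
    for key, value in some_dict.items():
        if len(value) == 1:
            some_dict[key] = value[0]

# COLUMN_NAME is now a list for one table and a str for the other
column_types = [type(d['COLUMN_NAME']).__name__ for d in new_lst]
print(column_types)
```

Any code that later loops over COLUMN_NAME would have to handle both cases.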
Something like this?
data = {}
for element in my_dict:
    table_name = element['TABLE_NAME']
    column_name = element['COLUMN_NAME']
    if table_name not in data:
        data[table_name] = []
    data[table_name].append(column_name)
new_lst = [{'TABLE_NAME': key, 'COLUMN_NAME': val} for key, val in data.items()]
I want to map some values(a list of lists) to some keys(a list) in a python dictionary.
I read "Map two lists into a dictionary in Python"
and figured I could do it this way:
import itertools
headers = ['name', 'surname', 'city']
values = [
['charles', 'rooth', 'kentucky'],
['william', 'jones', 'texas'],
['john', 'frith', 'los angeles']
]
data = []
for entries in values:
    data.append(dict(itertools.izip(headers, entries)))
But I was just wondering if there is a nicer way to go?
Thanks
PS: I'm on python 2.6.7
You could use a list comprehension:
data = [dict(itertools.izip(headers, entries)) for entries in values]
from functools import partial
from itertools import izip, imap
data = map(dict, imap(partial(izip, headers), values))
It's already really nice...
data = [dict(itertools.izip(headers, entries)) for entries in values]
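The answers above target Python 2, as stated in the question. On Python 3, itertools.izip no longer exists; the built-in zip is already lazy, so the same comprehension can be written without any import. A minimal sketch with the data from the question:

```python
headers = ['name', 'surname', 'city']
values = [
    ['charles', 'rooth', 'kentucky'],
    ['william', 'jones', 'texas'],
    ['john', 'frith', 'los angeles'],
]

# zip pairs each header with the entry at the same position
data = [dict(zip(headers, entries)) for entries in values]
print(data)
```

Note that zip stops at the shorter of the two sequences, so a row with fewer entries than headers simply produces a dict with fewer keys.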