formatting dictionary structure in python - python

I have an input list of the form:
d=[{'CLIENT': ['A','B','C']},{'ROW':['1','2','3']},{'KP':['ROM','MON','SUN']}]
I want the output to look like:
S=[{'CLIENT':'A','ROW':'1','KP':'ROM'},
{'CLIENT':'B','ROW':'2','KP':'MON'},
{'CLIENT':'C','ROW':'3','KP':'SUN'},]
How can I do this in Python?
The keys of the input dictionaries may change, so I don't want to hardcode them.

With a little cheating by letting pandas do the work:
Setup:
from collections import ChainMap
import pandas as pd
d = [{'CLIENT': ['A','B','C']},{'ROW':['1','2','3']},{'KP':['ROM','MON','SUN']}]
Solution:
result = pd.DataFrame(dict(ChainMap(*d))).to_dict(orient='records')
Result:
[{'KP': 'ROM', 'ROW': '1', 'CLIENT': 'A'},
{'KP': 'MON', 'ROW': '2', 'CLIENT': 'B'},
{'KP': 'SUN', 'ROW': '3', 'CLIENT': 'C'}]

Manually it would look like this:
S = [{} for _ in next(iter(d[0].values()))]  # one dict per row, not per input dict
for item in d:
    for k, v in item.items():  # each input dict has exactly one key
        for i, value in enumerate(v):
            S[i][k] = value
print(S)

Extract the key and the values from each dictionary, then zip() them together into a new dict():
data = [{'CLIENT': ['A', 'B', 'C']}, {'ROW': ['1', '2', '3']}, {'KP': ['ROM', 'MON', 'SUN']}]
new_keys = [list(d.keys())[0] for d in data]
new_values = zip(*[val for d in data for val in d.values()])
s = [dict(zip(new_keys, val)) for val in new_values]
print(s)
Output:
[{'CLIENT': 'A', 'ROW': '1', 'KP': 'ROM'},
{'CLIENT': 'B', 'ROW': '2', 'KP': 'MON'},
{'CLIENT': 'C', 'ROW': '3', 'KP': 'SUN'}]

Here is another way, using the builtin zip() function twice together with chain() from the itertools module.
The idea is to use zip() first to group the lists' items the way we want them, (('A', '1', 'ROM'), ('B', '2', 'MON'), ('C', '3', 'SUN')), and to collect the key of each dictionary, ('CLIENT', 'ROW', 'KP').
Then a list comprehension iterates over the grouped values and zips each tuple with the keys tuple to produce the dictionaries stored in the list s:
from itertools import chain
d = [{'CLIENT': ['A','B','C']},{'ROW':['1','2','3']},{'KP':['ROM','MON','SUN']}]
keys, *values = zip(*[chain(dict_.keys(), *dict_.values()) for dict_ in d])
s = [dict(zip(keys, tuple_)) for tuple_ in values]
The content of s will be:
[
{'CLIENT': 'A', 'ROW': '1', 'KP': 'ROM'},
{'CLIENT': 'B', 'ROW': '2', 'KP': 'MON'},
{'CLIENT': 'C', 'ROW': '3', 'KP': 'SUN'}
]
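
For reuse, the zip-based answers above can be wrapped in a small helper. This is only a minimal sketch, not code from any of the answers: the name columns_to_records is made up for illustration, and it assumes every input dictionary maps its key to a list. It raises an error when the lists differ in length instead of silently truncating the way zip() would.

```python
def columns_to_records(columns):
    """Turn [{'K1': [...]}, {'K2': [...]}] into [{'K1': ..., 'K2': ...}, ...].

    Hypothetical helper; assumes all value lists have the same length.
    """
    merged = {}
    for column in columns:
        merged.update(column)  # later dicts win if a key repeats

    lengths = {len(values) for values in merged.values()}
    if len(lengths) > 1:
        raise ValueError("value lists differ in length: %s" % sorted(lengths))

    keys = list(merged)
    # zip(*values) groups the i-th item of every list into one row
    return [dict(zip(keys, row)) for row in zip(*merged.values())]


d = [{'CLIENT': ['A', 'B', 'C']}, {'ROW': ['1', '2', '3']}, {'KP': ['ROM', 'MON', 'SUN']}]
records = columns_to_records(d)
print(records)
```

No key is hardcoded, so the same call works when the input keys change.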

Related

create a specific python dictionary from 2 lists

I have the following data:
fsm_header = ['VLAN_ID', 'NAME', 'STATUS']
fsm_results = [['1', 'default', 'active'],
['2', 'VLAN0002', 'active'],
['3', 'VLAN0003', 'active']]
I want to create a specific dictionary like this:
{'VLAN_ID':['1','2','3'],
'NAME':['default','VLAN0002','VLAN0003'],
'STATUS':['active','active','active']}
I'm having trouble finding the right combination, as the one I'm using:
[dict(zip(fsm_header, row)) for row in fsm_results]
gives me another type of useful output, but not the one I mentioned above.
I would prefer a solution that doesn't use the zip function, but one with zip is OK too.
You need to unpack and zip fsm_results too:
out = {k:list(v) for k,v in zip(fsm_header, zip(*fsm_results))}
Output:
{'VLAN_ID': ['1', '2', '3'],
'NAME': ['default', 'VLAN0002', 'VLAN0003'],
'STATUS': ['active', 'active', 'active']}
If you don't mind tuples as values, you could use:
out = dict(zip(fsm_header, zip(*fsm_results)))
Output:
{'VLAN_ID': ('1', '2', '3'),
'NAME': ('default', 'VLAN0002', 'VLAN0003'),
'STATUS': ('active', 'active', 'active')}
You could also write the same thing using dict.setdefault:
out = {}
for lst in fsm_results:
    for k, v in zip(fsm_header, lst):
        out.setdefault(k, []).append(v)
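
Since the question asked for something without zip, here is a sketch that uses only indexing. It is an illustration rather than part of the answer above, and it assumes every row in fsm_results has at least as many items as fsm_header (a shorter row would raise IndexError).

```python
fsm_header = ['VLAN_ID', 'NAME', 'STATUS']
fsm_results = [['1', 'default', 'active'],
               ['2', 'VLAN0002', 'active'],
               ['3', 'VLAN0003', 'active']]

# For column i, collect the i-th item of every row
out = {name: [row[i] for row in fsm_results]
       for i, name in enumerate(fsm_header)}
print(out)
```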

Getting "keys" from list of dicts [duplicate]

I have the following data:
[{'id': ['132605', '132750', '132772', '132773', '133065', '133150', '133185', '133188', '133271', '133298']},
{'number': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']},
{'id': ['1', '1', '1', '1', '1', '1', '1', '1', '1', '1']}]
What would be the best way to get a list of the keys (as if it was a dict and not an array)? Currently I'm doing:
>>> [list(i.keys())[0] for i in e.original_column_data]
['id', 'number', 'id']
But that feels a bit hackish.
What is hacky about it? It's a bit inelegant. You just need to do the following:
>>> keys = []
>>> data = [{'id': ['132605', '132750', '132772', '132773', '133065', '133150', '133185', '133188', '133271', '133298']},
... {'number': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']},
... {'id': ['1', '1', '1', '1', '1', '1', '1', '1', '1', '1']}]
>>> for d in data:
...     keys.extend(d)
...
>>> keys
['id', 'number', 'id']
Or if you prefer one-liners:
>>> [k for d in data for k in d]
['id', 'number', 'id']
First way:
Iterating over a dictionary gives you its keys, so a simple
>>> [key for key in d]
gives you a list of the keys of a dictionary d, and you can get what you want with
>>> [key for d in dict_list for key in d]
Second way (Python 2 only):
Use .keys(), as in your code, but without list(), since in Python 2 .keys() already returns a list. Here's what it looks like:
>>> [d.keys()[0] for d in dict_list]
In your data each dictionary has only one key, so both give the same result, but I prefer the first one since it gives all the keys of all the dictionaries.
This is simpler and does the same thing:
[k for d in e.original_column_data for k in d]
=> ['id', 'number', 'id']
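
If each dictionary is known to hold exactly one key, next(iter(d)) fetches it without building an intermediate list. A small sketch with a shortened version of the sample data; it assumes no dictionary in the list is empty.

```python
# Shortened sample in the same shape as the data in the question
data = [{'id': ['132605', '132750']},
        {'number': ['1', '2']},
        {'id': ['1', '1']}]

# Iterating a dict yields its keys, so next(iter(d)) is the first key.
# An empty dict would raise StopIteration here.
first_keys = [next(iter(d)) for d in data]
print(first_keys)

# If duplicates are unwanted, dict.fromkeys keeps the first occurrence of each key, in order
unique_keys = list(dict.fromkeys(first_keys))
print(unique_keys)
```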

How to use a dict comprehension to split a list?

I currently have a dict in the form:
data = {"var1":"600103", "var2":[{"a":"1","b":"2"}]}
I would like the output to be:
op = {"var1":"600103","var2[0]":{"a":"1","b":"2"}}
I am currently using loops to manually loop through. I'd like to know if there's a more pythonic way of doing this.
If this isn't what you're already doing, you can eliminate the need for a nested loop by using a dict comprehension for the values which are lists.
data = {"var1":"600103", "var2":[{"a":"1","b":"2"}, {"a":"22","b":"555"}]}
op = {}
for k in data:
    if not isinstance(data[k], list):
        op[k] = data[k]
    else:
        op.update({k + '[{}]'.format(i): data[k][i] for i in range(len(data[k]))})
And your output will look like this (the key order may differ depending on the Python version):
{'var1': '600103', 'var2[1]': {'a': '22', 'b': '555'}, 'var2[0]': {'a': '1', 'b': '2'}}
I do not know if it is very pythonic or not but I know for sure that it is difficult to read :S
Sorry, just playing... ;)
# Python 3 version: reduce lives in functools, and a lambda can no longer unpack a tuple argument
from functools import reduce
data = {"var1":"600103", "var2":[{"a":"1","b":"2"},{"a":"3","b":"4"},{"a":"5","b":"6"},{"a":"7","b":"8"}], "var3":"600103"}
reduce(
    lambda a, b: {**a, **b},  # merge the per-key dicts into one
    [
        dict(map(lambda pair: ('{0}[{1}]'.format(key, pair[0]), pair[1]), enumerate(value))) if type(value) is list else {key: value}
        for key, value
        in data.items()
    ]
)
output:
{'var1': '600103',
'var2[0]': {'a': '1', 'b': '2'},
'var2[1]': {'a': '3', 'b': '4'},
'var2[2]': {'a': '5', 'b': '6'},
'var2[3]': {'a': '7', 'b': '8'},
'var3': '600103'}
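
To answer the title more literally, the whole thing can be written as one dict comprehension if the per-key expansion is moved into a small generator. This is just a sketch for Python 3; the helper name expand is made up for illustration.

```python
data = {"var1": "600103",
        "var2": [{"a": "1", "b": "2"}, {"a": "3", "b": "4"}],
        "var3": "600103"}


def expand(key, value):
    """Yield (new_key, item) pairs: one per list element, or the original pair."""
    if isinstance(value, list):
        for idx, item in enumerate(value):
            yield '{0}[{1}]'.format(key, idx), item
    else:
        yield key, value


# Single dict comprehension over every expanded pair
op = {new_key: item
      for key, value in data.items()
      for new_key, item in expand(key, value)}
print(op)
```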

How to put keys of dictionary in first row of csv rather than first column?

keys = ['key1', 'key2', 'key3', 'key4']
list1 = ['a1', 'b3', 'c4', 'd2', 'h0', 'k1', 'p2', 'o3']
list2 = ['1', '2', '25', '23', '4', '5', '6', '210', '8', '02', '92', '320']
abc = dict(zip(keys[:4], [list1,list2]))
with open('myfilecsvs.csv', 'w') as f:
    for key, value in abc.items():
        f.write('{0},{1}\n'.format(key, value))
With this I get all the keys in the first column and their values next to them.
What I am trying to achieve is all the keys in the first row, i.e. each key in its own column of the first row, with its values below it. Something like a transpose.
I would be very grateful for your help with this.
You can use join and zip_longest to do this.
",".join(abc.keys()) returns the first row (the keys), like key1,key2. Then use zip_longest (izip_longest in Python 2.x) to aggregate the elements of the value lists into rows, join each row with , in the same way, and join all the rows with \n.
zip_longest
Make an iterator that aggregates elements from each of the iterables.
If the iterables are of uneven length, missing values are filled-in
with fillvalue.
from itertools import zip_longest
with open('myfilecsvs.csv', 'w') as f:
    f.write("\n".join([",".join(abc.keys()),*(",".join(i) for i in zip_longest(*abc.values(),fillvalue=''))]))
Output:
key1,key2
a1,1
b3,2
...
,02
,92
,320
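
The same transpose can also go through the csv module, which takes care of quoting if a value ever contains a comma. A minimal sketch, not part of the answer above: the function name write_columns is made up, and an io.StringIO buffer stands in for the real file so the example is self-contained.

```python
import csv
import io
from itertools import zip_longest

keys = ['key1', 'key2', 'key3', 'key4']
list1 = ['a1', 'b3', 'c4', 'd2', 'h0', 'k1', 'p2', 'o3']
list2 = ['1', '2', '25', '23', '4', '5', '6', '210', '8', '02', '92', '320']
abc = dict(zip(keys, [list1, list2]))  # only key1 and key2 get values


def write_columns(mapping, fileobj):
    """Write the dict keys as the header row and the value lists as columns."""
    writer = csv.writer(fileobj, lineterminator='\n')
    writer.writerow(mapping.keys())
    # zip_longest pads the shorter columns with empty strings
    writer.writerows(zip_longest(*mapping.values(), fillvalue=''))


buffer = io.StringIO()  # stands in for open('myfilecsvs.csv', 'w', newline='')
write_columns(abc, buffer)
print(buffer.getvalue())
```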

Flatten Entity-Attribute-Value (EAV) Schema in Python

I've got a csv file in something of an entity-attribute-value format (i.e., my event_id is non-unique and repeats k times for the k associated attributes):
event_id, attribute_id, value
1, 1, a
1, 2, b
1, 3, c
2, 1, a
2, 2, b
2, 3, c
2, 4, d
Are there any handy tricks to transform a variable number of attributes (i.e., rows) into columns? The key here is that the output ought to be an m x n table of structured data, where m = max(k); filling in missing attributes with NULL would be optimal:
event_id, 1, 2, 3, 4
1, a, b, c, null
2, a, b, c, d
My plan was to (1) convert the csv to a JSON object that looks like this:
data = [{'value': 'a', 'id': '1', 'event_id': '1', 'attribute_id': '1'},
{'value': 'b', 'id': '2', 'event_id': '1', 'attribute_id': '2'},
{'value': 'a', 'id': '3', 'event_id': '2', 'attribute_id': '1'},
{'value': 'b', 'id': '4', 'event_id': '2', 'attribute_id': '2'},
{'value': 'c', 'id': '5', 'event_id': '2', 'attribute_id': '3'},
{'value': 'd', 'id': '6', 'event_id': '2', 'attribute_id': '4'}]
(2) extract unique event ids:
events = set()
for item in data:
    events.add(item['event_id'])
(3) create a list of lists, where each inner list is a list of the attributes for the corresponding parent event.
attributes = [[k['value'] for k in j] for i, j in groupby(data, key=lambda x: x['event_id'])]
(4) create a dictionary that brings events and attributes together:
event_dict = dict(zip(events, attributes))
which looks like this:
{'1': ['a', 'b'], '2': ['a', 'b', 'c', 'd']}
I'm not sure how to get all inner lists to be the same length with NULL values populated where necessary. It seems like something that needs to be done in step (3). Also, creating n lists full of m NULL values had crossed my mind, then iterate through each list and populate the value using attribute_id as the list location; but that seems janky.
Your basic idea seems right, though I would implement it as follows:
import csv

events = {}  # we're going to keep track of the events we read in
with open('path/to/input', newline='') as infile:
    reader = csv.reader(infile, skipinitialspace=True)
    next(reader)  # skip the header row (assumes the file starts with one, as in the question)
    for event, _att, val in reader:
        events.setdefault(int(event), []).append(val)  # track all the values for this event
maxAtts = max(len(v) for v in events.values())  # the maximum number of attributes for any event
with open('path/to/output', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    writer.writerow(["event_id"] + list(range(1, maxAtts+1)))  # write out the header row
    for k in sorted(events):  # let's look at the events in sorted order
        # write out the event id, all the values for that event, and pad with "null" for any attributes without values
        writer.writerow([k] + events[k] + ['null']*(maxAtts-len(events[k])))
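
The code above places values by position, so it relies on each event's attributes being numbered 1..k without gaps. If an event could skip an attribute_id, a variant that keys each value by its attribute_id might look like the sketch below. It is an assumption-laden illustration: the sample rows from the question are read from an in-memory string instead of a file, and the result is built as a list of rows rather than written out.

```python
import csv
import io

# Sample rows from the question; io.StringIO stands in for the real input file
raw = """event_id, attribute_id, value
1, 1, a
1, 2, b
1, 3, c
2, 1, a
2, 2, b
2, 3, c
2, 4, d
"""

events = {}         # event_id -> {attribute_id: value}
attributes = set()  # every attribute_id seen in the input
reader = csv.DictReader(io.StringIO(raw), skipinitialspace=True)
for row in reader:
    event_id, attribute_id = int(row['event_id']), int(row['attribute_id'])
    events.setdefault(event_id, {})[attribute_id] = row['value']
    attributes.add(attribute_id)

columns = sorted(attributes)
table = [['event_id'] + columns]  # header row
for event_id in sorted(events):
    # .get() fills attributes this event lacks with 'null'
    table.append([event_id] + [events[event_id].get(a, 'null') for a in columns])

for line in table:
    print(line)
```

Each row of table can then be passed to csv.writer's writerow() in the same way as above.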
