Separating nested list and dictionary in separate columns

I created a function that gathers the following sample list:
full_list = ['Group1', [{'a':'1', 'b':'2'},{'c':'3', 'x':'1'}],
             'Group2', [{'d':'7', 'e':'18'}],
             'Group3', [{'m':'21'}, {'n':'44','p':'13'}]]
As you can see, some of the elements inside the lists are dictionaries of key-value pairs, and these dictionaries differ in size (number of key-value pairs).
Can anyone suggest what to use in Python to display this list in separate columns?
Group1               Group2                Group3
{'a':'1', 'b':'2'}   {'d':'7', 'e':'18'}   {'m':'21'}
{'c':'3', 'x':'1'}                         {'n':'44','p':'13'}
I am not after a full solution, but rather a pointer in the right direction for a novice like me.
I have briefly looked at itertools and pandas DataFrames.
Thanks in advance.

Here is one way:
First extract the columns and the data:
import pandas as pd
columns = full_list[::2]
#['Group1', 'Group2', 'Group3']
data = full_list[1::2]
#[[{'a': '1', 'b': '2'}, {'c': '3', 'x': '1'}],
# [{'d': '7', 'e': '18'}],
# [{'m': '21'}, {'n': '44', 'p': '13'}]]
Here [::2] takes every second item starting at index 0 (the group names), and [1::2] does the same starting at index 1 (the lists of dictionaries).
Then create a pd.DataFrame:
df = pd.DataFrame(data)
#                        0                       1
# 0   {'a': '1', 'b': '2'}    {'c': '3', 'x': '1'}
# 1  {'d': '7', 'e': '18'}                    None
# 2            {'m': '21'}  {'n': '44', 'p': '13'}
Oops, the rows and columns are transposed, so we need to flip them:
df = df.T
Then add the columns:
df.columns = columns
And there we have it:
                 Group1                 Group2                  Group3
0  {'a': '1', 'b': '2'}  {'d': '7', 'e': '18'}             {'m': '21'}
1  {'c': '3', 'x': '1'}                   None  {'n': '44', 'p': '13'}
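Since the question also mentions itertools, here is a minimal sketch of the same idea without pandas. It assumes the alternating name/list layout shown above; render_columns is a hypothetical helper written only for this illustration, not part of any library.

```python
from itertools import zip_longest

full_list = ['Group1', [{'a': '1', 'b': '2'}, {'c': '3', 'x': '1'}],
             'Group2', [{'d': '7', 'e': '18'}],
             'Group3', [{'m': '21'}, {'n': '44', 'p': '13'}]]

columns = full_list[::2]   # group names
data = full_list[1::2]     # one list of dicts per group

# zip_longest turns the per-group lists into rows and pads short groups with None
rows = list(zip_longest(*data))


def render_columns(columns, rows):
    """Hypothetical helper: format the header and rows as aligned text columns."""
    table = [list(columns)]
    for row in rows:
        table.append(["" if cell is None else str(cell) for cell in row])
    widths = [max(len(line[i]) for line in table) for i in range(len(columns))]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(line, widths)).rstrip()
        for line in table
    )


print(render_columns(columns, rows))
```

The padding step is the part that matters: groups with fewer dictionaries get an empty cell instead of shifting the other columns.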

Related

I want to convert file data into 3d dictionary using python

I want to build this kind of dictionary by reading a file:
table = {
0: {'A': '1', 'B': '2', 'C': '3'},
1: {'A': '4', 'B': '5', 'C': '6'},
2: {'A': '7', 'B': '8', 'C': '9'}
}
Or this would be enough:
table = {
{'A': '1', 'B': '2', 'C': '3'},
{'A': '4', 'B': '5', 'C': '6'},
{'A': '7', 'B': '8', 'C': '9'}
}
I have a file, let's call it file.txt, which contains data like:
A B C
1 2 3
4 5 6
7 8 9
Below is my attempt, but it does not produce the result I want.
It gives me the output {'A': '7', 'B': '8', 'C': '9'}
I know it obviously won't give me a 3D dict, but I don't know how to get there.
array=[]
with open("file.txt") as f:
    for line in f:
        array = line.split()
        break #it will give me array=['A','B','C']
v=[]
dic = {}
for i in range(0,len(array)):
    for line in open("file.txt"):
        x=0
        v = line.split()
        dic[ array[i] ] = v[i]
print(dic)

You can use Pandas
# Python env: pip install pandas
# Anaconda env: conda install pandas
import pandas as pd
df = pd.read_table('file.txt', sep=' ')
table = df.to_dict('index')
print(table)
# Output
{0: {'A': 1, 'B': 2, 'C': 3},
1: {'A': 4, 'B': 5, 'C': 6},
2: {'A': 7, 'B': 8, 'C': 9}}

If you want to use just built-in modules, you can use csv.DictReader:
import csv
with open("file.txt", "r", newline="") as f_in:
    reader = csv.DictReader(f_in, delimiter=" ")
    # if the file contains floats, use float(v) instead of int(v)
    # if you want the values to stay strings, you can do:
    # data = list(reader)
    data = [{k: int(v) for k, v in row.items()} for row in reader]
print(data)
Prints:
[{'A': 1, 'B': 2, 'C': 3}, {'A': 4, 'B': 5, 'C': 6}, {'A': 7, 'B': 8, 'C': 9}]

Try the following code:
table = {}
with open("file.txt") as f:
    headers = next(f).split() # get the headers from the first line
    for i, line in enumerate(f):
        row = {}
        for j, value in enumerate(line.split()):
            row[headers[j]] = value
        table[i] = row
print(table)
You should get output like this:
{
0: {'A': '1', 'B': '2', 'C': '3'},
1: {'A': '4', 'B': '5', 'C': '6'},
2: {'A': '7', 'B': '8', 'C': '9'}
}
If you only want the inner dictionaries and not the outer structure, you can use a list instead of a dictionary to store the rows:
table = []
with open("file.txt") as f:
    headers = next(f).split() # get the headers from the first line
    for line in f:
        row = {}
        for j, value in enumerate(line.split()):
            row[headers[j]] = value
        table.append(row)
print(table)
This will give you the following output:
[
{'A': '1', 'B': '2', 'C': '3'},
{'A': '4', 'B': '5', 'C': '6'},
{'A': '7', 'B': '8', 'C': '9'}
]

DictReader from the csv module will give you what you seem to need - i.e., a list of dictionaries.
import csv
with open('file.txt', newline='') as data:
    result = list(csv.DictReader(data, delimiter=' '))
print(result)
Output:
[{'A': '1', 'B': '2', 'C': '3'}, {'A': '4', 'B': '5', 'C': '6'}, {'A': '7', 'B': '8', 'C': '9'}]
Optionally:
If you have an aversion to module imports you could achieve the same objective as follows:
result = []
with open('file.txt') as data:
    columns = data.readline().strip().split()
    for line in map(str.strip, data):
        result.append(dict(zip(columns, line.split())))
print(result)
Output:
[{'A': '1', 'B': '2', 'C': '3'}, {'A': '4', 'B': '5', 'C': '6'}, {'A': '7', 'B': '8', 'C': '9'}]
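To get from this list of dictionaries to the row-numbered form asked for in the question, the rows can be passed through enumerate. The sketch below is self-contained: it assumes an in-memory io.StringIO stand-in for file.txt so it runs without a file on disk.

```python
import csv
import io

# Stand-in for file.txt (an assumption, so the sketch runs without a file on disk)
sample = io.StringIO("A B C\n1 2 3\n4 5 6\n7 8 9\n")

# One dictionary per data line, keyed by the header line
rows = list(csv.DictReader(sample, delimiter=" "))

# Number the rows to get the nested form {0: {...}, 1: {...}, 2: {...}}
table = dict(enumerate(rows))
print(table)
```

With a real file, replacing the StringIO object with an open('file.txt', newline='') handle should behave the same way, provided the columns are separated by single spaces.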

Inside a list of dictionaries, merge lists based on key

I have a list of dictionaries, each containing a nested list, and I want to merge the lists based on 'id':
res = [{'i': ['1'], 'id': '123'},
{'i': ['1'], 'id': '123'},
{'i': ['1','2','3','4','5','6'],'id': '123'},
{'i': ['1'], 'id': '234'},
{'i': ['1','2','3','4','5'],'id': '234'}]
Desired output:
[{'i': [1, 1, 1, 2, 3, 4, 5, 6], 'id': '123'},
{'i': [1, 1, 2, 3, 4, 5], 'id': '234'}]
I am trying to merge the nested lists based on the key "id", but I couldn't figure out the best way to do it:
import collections
d = collections.defaultdict(list)
for i in res:
    for k, v in i.items():
        d[k].extend(v)
The above code merges all the lists together, but I want to merge the lists per "id".

Something like this should do the trick:
from collections import defaultdict
merged = defaultdict(list)
for r in res:
    merged[r['id']].extend(r['i'])
output = [{'id': key, 'i': merged_list} for key, merged_list in merged.items()]
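A self-contained sketch of the same approach, run against the sample data from the question. The int() conversion is an addition here, on the assumption that the integers in the desired output are intentional.

```python
from collections import defaultdict

res = [{'i': ['1'], 'id': '123'},
       {'i': ['1'], 'id': '123'},
       {'i': ['1', '2', '3', '4', '5', '6'], 'id': '123'},
       {'i': ['1'], 'id': '234'},
       {'i': ['1', '2', '3', '4', '5'], 'id': '234'}]

merged = defaultdict(list)
for r in res:
    # Append this record's items to the list collected so far for its id,
    # converting the strings to int (assumption: the desired output wants ints)
    merged[r['id']].extend(int(x) for x in r['i'])

output = [{'i': values, 'id': key} for key, values in merged.items()]
print(output)
```

Because dictionaries preserve insertion order, the ids come out in the order they first appear in res.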

The following produces the desired output, using itertools.groupby:
from operator import itemgetter
from itertools import groupby
get_id = itemgetter('id')  # key function; groupby needs the input sorted by it
[
    {'id': k, 'i': [x for d in g for x in d['i']]}
    for k, g in groupby(sorted(res, key=get_id), key=get_id)
]

I'm not sure what the expected behavior should be when there are duplicates -- for example, should the lists be:
treated like a set() ?
appended, and there could be multiple items, such as [1,1,2,3...] ?
doesn't matter -- just take any
Here is one variation using a dict comprehension, which keeps only the last entry seen for each id:
list({item['id']: item for item in res}.values())
# [{'i': ['1', '2', '3', '4', '5', '6'], 'id': '123'}, {'i': ['1', '2', '3', '4', '5'], 'id': '234'}]
If you provide a bit more information in your question, I can update the answer accordingly.
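For the "treated like a set()" reading, one possible sketch merges the lists per id while dropping repeats and keeping first-seen order. Using a dict as an ordered set is a choice made for this illustration, not something the question specifies.

```python
from collections import defaultdict

res = [{'i': ['1'], 'id': '123'},
       {'i': ['1'], 'id': '123'},
       {'i': ['1', '2', '3', '4', '5', '6'], 'id': '123'},
       {'i': ['1'], 'id': '234'},
       {'i': ['1', '2', '3', '4', '5'], 'id': '234'}]

merged = defaultdict(dict)
for r in res:
    # dict keys ignore repeats and keep insertion order, so this acts as an ordered set
    merged[r['id']].update(dict.fromkeys(r['i']))

output = [{'i': list(seen), 'id': key} for key, seen in merged.items()]
print(output)
```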

How can I edit this dataframe to merge the lists of dictionaries in two columns?

I have a dataframe like this.
ID   Name  id2                        name2                      name3
101  A     [{'a': '1'}, {'b': '2'}]   [{'e': '4'}, {'f': '5'}]   [{'x': '4'}, {'y': '5'}]
103  B     [{'c': '3'},{'d': '6'}]    [{'g': '7'},{'h': '8'}]    [{'t': '4'}, {'o': '5'}]
and I want the output df to look like this.
ID   Name  id2                                                              name2
101  A     [{'a': '1','e': '4','x': '4'}, {'b': '2', 'f': '5','y': '5'}]    [{'e': '4'}, {'f': '5'}]
103  B     [{'c': '3', 'g': '7','t': '4'},{'d': '6', 'h': '8','o': '5'}]    [{'g': '7'}, {'h': '8'}]
The name3 column stays as it is in the output; I have just left it out of the sample above. The point is that even if more columns are added, their dictionaries should be merged into the id2 column.
Thanks :)

You can try using collections.ChainMap in a list comprehension:
From the docs...
A ChainMap groups multiple dicts or other mappings together to create a single, updateable view...
So first we zip the columns together, then use a nested zip to get the dicts from each column "side-by-side" in a single tuple. This tuple is unpacked into ChainMap, which joins the dicts into a single dict.
Example
import pandas as pd
from collections import ChainMap
# Setup
df = pd.DataFrame({'ID': [101, 103], 'Name': ['A', 'B'], 'id2': [[{'a': '1'}, {'b': '2'}], [{'c': '3'}, {'d': '6'}]], 'name2': [[{'e': '4'}, {'f': '5'}], [{'g': '7'}, {'h': '8'}]]})
df['id2'] = [[dict(ChainMap(*x)) for x in zip(i, n)]
             for i, n in zip(df['id2'], df['name2'])]
[out]
ID Name id2 name2
0 101 A [{'e': '4', 'a': '1'}, {'b': '2', 'f': '5'}] [{'e': '4'}, {'f': '5'}]
1 103 B [{'c': '3', 'g': '7'}, {'d': '6', 'h': '8'}] [{'g': '7'}, {'h': '8'}]
Update
A more scalable solution, if you have multiple columns to combine, is to use DataFrame.filter first to extract all the columns that need to be combined:
df = pd.DataFrame({'ID': [101, 103], 'Name': ['A', 'B'], 'id2': [[{'a': '1'}, {'b': '2'}], [{'c': '3'}, {'d': '6'}]], 'name2': [[{'e': '4'}, {'f': '5'}], [{'g': '7'}, {'h': '8'}]], 'name3': [[{'x': '4'}, {'y': '5'}], [{'t': '4'}, {'o': '5'}]]})
df['id2'] = [[dict(ChainMap(*y)) for y in zip(*x)]
             for x in zip(*df.filter(regex='id2|name').apply(tuple))]
[out]
ID Name id2 name2 name3
0 101 A [{'e': '4', 'x': '4', 'a': '1'}, {'b': '2', 'f': '5', 'y': '5'}] [{'e': '4'}, {'f': '5'}] [{'x': '4'}, {'y': '5'}]
1 103 B [{'c': '3', 't': '4', 'g': '7'}, {'o': '5', 'd': '6', 'h': '8'}] [{'g': '7'}, {'h': '8'}] [{'t': '4'}, {'o': '5'}]
This is essentially doing the same as above, only we filter to "id" or "name" columns, and combine them all.
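To see what the inner zip and ChainMap do for a single row, here is a small sketch without pandas. The three lists are the first row's cells from the sample dataframe; the clash example at the end uses made-up values to show which dict wins when keys collide.

```python
from collections import ChainMap

# One row's worth of cells from three columns (values from the sample dataframe)
id2 = [{'a': '1'}, {'b': '2'}]
name2 = [{'e': '4'}, {'f': '5'}]
name3 = [{'x': '4'}, {'y': '5'}]

# zip pairs up the dicts by position; ChainMap presents each group as one mapping
merged = [dict(ChainMap(*group)) for group in zip(id2, name2, name3)]
print(merged)

# On duplicate keys, the first mapping passed to ChainMap takes precedence
# (illustrative values, not from the question)
clash = dict(ChainMap({'k': 'from id2'}, {'k': 'from name2'}))
print(clash)
```

The precedence rule is worth keeping in mind if the columns can share keys: the leftmost column's value is the one that survives.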

Considering the name of your dataframe is df, try this:
# update() modifies the dicts stored in the id2 column in place
for i in range(df.shape[0]):
    df.id2[i][0].update(df.name2[i][0])
    df.id2[i][1].update(df.name2[i][1])

How to compare same keys values from two different dict

I am trying to compare the values stored under the same keys in two different dicts. If the value in the second dict is bigger than the value in the first, the output should contain only those keys and their values from the second dict.
Example:
first={'a': '1000', 'b': '2000', 'c': '3000'}
second={'a': '1000', 'b': '3000', 'c': '5000'}
The new dict output should be {'b': '3000', 'c': '5000'}
How can I do this comparison?

Using a dict comprehension, with the string values converted to int so that the comparison is numeric rather than lexicographic:
Ex:
first={'a': '1000', 'b': '2000', 'c': '3000'}
second={'a': '1000', 'b': '3000', 'c': '5000'}
print({k: v for k, v in second.items() if int(v) > int(first[k])})
Output:
{'b': '3000', 'c': '5000'}
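Because the values are strings, it is worth being explicit about how they are compared. The sketch below wraps the comparison in a hypothetical helper, larger_in_second, that converts to int and skips keys missing from the first dict; the extra 'd' entry is added purely for illustration.

```python
first = {'a': '1000', 'b': '2000', 'c': '3000'}
# 'd' is an extra key added for illustration; it has no counterpart in first
second = {'a': '1000', 'b': '3000', 'c': '5000', 'd': '900'}


def larger_in_second(first, second):
    """Hypothetical helper: entries of second whose numeric value exceeds first's."""
    return {k: v for k, v in second.items()
            if k in first and int(v) > int(first[k])}


print(larger_in_second(first, second))

# Strings compare character by character, so '900' sorts above '1000'
print('900' > '1000', 900 > 1000)
```

With equal-length numeric strings the two kinds of comparison happen to agree, which is why the difference is easy to miss on the sample data.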

Splitting a dictionary's multiple values from one key

I have the dictionary:
dict1 = {'A': {'val1': '5', 'val2': '1'},
         'B': {'val1': '10', 'val2': '10'},
         'C': {'val1': '15', 'val2': '100'}}
Here, each key has two values. I can obtain the keys by using:
letters = dict1.keys()
which returns:
['A', 'B', 'C']
I am used to working with arrays and being able to "slice" them. How can I split the values of this dictionary in a similar way?
val1_and_val2 = dict1.values()
returns:
[{'val1': '5', 'val2': '1'},
{'val1': '10', 'val2': '10'},
{'val1': '15', 'val2': '100'}]
How can I get:
number1 = [5, 10, 15]
number2 = [1, 10, 100]

If I understand you correctly, then:
number1 = [int(val["val1"]) for val in dict1.values()]
If you prefer, this accomplishes the same thing with map and a lambda (in Python 3, map returns an iterator, so wrap it in list):
number1 = list(map(lambda value: int(value["val1"]), dict1.values()))
Note that you need dict1[key]["val1"] to get an individual value; the int() call converts the stored strings to the integers shown in the question.
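To pull out both lists the question asks for, the same comprehension can be wrapped in a small function. This is a sketch: column is a hypothetical helper name, and it assumes every inner dictionary has both 'val1' and 'val2' with values that parse as integers.

```python
dict1 = {'A': {'val1': '5', 'val2': '1'},
         'B': {'val1': '10', 'val2': '10'},
         'C': {'val1': '15', 'val2': '100'}}


def column(d, name):
    """Hypothetical helper: collect the inner value `name` from every entry as an int."""
    return [int(inner[name]) for inner in d.values()]


number1 = column(dict1, 'val1')
number2 = column(dict1, 'val2')
print(number1, number2)
```

The lists follow the insertion order of dict1, so position 0 corresponds to 'A', position 1 to 'B', and so on.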
