Python Pandas from dictionary

I have a dictionary
x={'XYZ': [4, 5, 6], 'ABC': [1, 2, 3]}
I want a pd.DataFrame like this:
       'SomeColumnName'
'XYZ'  [4,5,6]
'ABC'  [1,2,3]
Whatever I do, pandas splits each list in x.values() into 3 separate columns. I could do a '~'.join before creating the DataFrame, but I am wondering whether there is an easier way.

Why don't you just input the data as:
x={'XYZ': [[4, 5, 6]], 'ABC': [[1, 2, 3]]}
Then you get:
In [7]: pd.DataFrame(x).transpose()
Out[7]:
             0
ABC  [1, 2, 3]
XYZ  [4, 5, 6]
You can recode your dictionary using:
for key in x.keys():
    x[key] = [x[key]]

OK, this is how I did it:
z = pd.DataFrame.from_records(list(x.items()), columns=['A', 'SomeColumnName'], index='A')
The problem was that I wasn't wrapping x.items() in list() before passing it as the data.
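For comparison, here is a minimal sketch of another route to the same one-column frame, assuming a reasonably recent pandas. It passes the lists as the values of a single named column and the keys as the index. The column name is taken from the question; the variable name df is only illustrative.

```python
import pandas as pd

x = {'XYZ': [4, 5, 6], 'ABC': [1, 2, 3]}

# A dict of columns where the single column is a list of lists:
# each inner list becomes one cell, so nothing gets split apart.
df = pd.DataFrame({'SomeColumnName': list(x.values())}, index=list(x))

print(df)
```

Because the keys and values are read from the same dictionary, they stay paired in the same order.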

Related

How can I sum up all values with the same index in a dictionary where each key has a nested list as a value?

I have a dictionary in which each key has a list of lists (a nested list) as its value. For example, imagine we have:
x = {1: [[1, 2], [3, 5]], 2: [[2, 1], [2, 6]], 3: [[1, 5], [5, 4]]}
My question is how I can access each element of the dictionary and concatenate those with the same index, for example the first list from every key:
[1,2] from the first key +
[2,1] from the second and
[1,5] from the third
How can I do this?

You can access each nested list while iterating through your dictionary, extend a new list with it, and then apply the sum function.
Code:
x={1: [[1,2],[3,5]] , 2:[[2,1],[2,6]], 3:[[1,5],[5,4]]}
ans=[]
for key in x:
    ans += x[key][0]
print(sum(ans))
Output:
12

Assuming you want a list of the first elements, you can do:
>>> x={1: [[1,2],[3,5]] , 2:[[2,1],[2,6]], 3:[[1,5],[5,4]]}
>>> y = [a[0] for a in x.values()]
>>> y
[[1, 2], [2, 1], [1, 5]]
If you want the second element, you can use a[1], etc.

The output you expect is not entirely clear (do you want to sum? concatenate?), but what seems clear is that you want to handle the values as matrices.
You can use numpy for that:
Summing the values:
import numpy as np
sum(map(np.array, x.values())).tolist()
Output:
[[4, 8], [10, 15]] # [[1+2+1, 2+1+5], [3+2+5, 5+6+4]]
Concatenating the matrices (horizontally):
import numpy as np
np.hstack(list(map(np.array, x.values()))).tolist()
Output:
[[1, 2, 2, 1, 1, 5], [3, 5, 2, 6, 5, 4]]

As explained in How to iterate through two lists in parallel?, zip does exactly that: iterates over a few iterables at the same time and generates tuples of matching-index items from all iterables.
In your case, the iterables are the values of the dict. So just unpack the values to zip:
x = {1: [[1, 2], [3, 5]], 2: [[2, 1], [2, 6]], 3: [[1, 5], [5, 4]]}
for y in zip(*x.values()):
    print(y)
Gives:
([1, 2], [2, 1], [1, 5])
([3, 5], [2, 6], [5, 4])
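Building on the zip idea, the grouped sub-lists can then be summed element-wise or concatenated without numpy. This is only a sketch under the assumption that every key holds the same number of sub-lists and that sub-lists at the same position have equal length; the names groups, summed and joined are illustrative.

```python
from itertools import chain

x = {1: [[1, 2], [3, 5]], 2: [[2, 1], [2, 6]], 3: [[1, 5], [5, 4]]}

# Group the sub-lists that share a position across all keys.
groups = list(zip(*x.values()))

# Element-wise sum within each group (a second zip lines up the numbers).
summed = [[sum(nums) for nums in zip(*group)] for group in groups]

# Concatenation of each group into one flat list.
joined = [list(chain.from_iterable(group)) for group in groups]

print(summed)  # [[4, 8], [10, 15]]
print(joined)  # [[1, 2, 2, 1, 1, 5], [3, 5, 2, 6, 5, 4]]
```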

How to insert a list at a specific index?

I have a list
a=[1,2,3]
and a list of lists
b=[[1,2],[3,4,5]]
and I want to insert a into b at index 1 so that b becomes
b=[[1,2],[1,2,3],[3,4,5]]
How do I do that? If I use insert, won't it fail because I can only insert an item, not a list?
EDIT: I realised insert can be used for lists as well. Thanks.

You can use list.insert, which takes the index as its first argument:
>>> a=[1,2,3]
>>> b=[[1,2],[3,4,5]]
>>> b.insert(1, a)
>>> b
[[1, 2], [1, 2, 3], [3, 4, 5]]

You can use list slicing:
b=[[1,2],[3,4,5]]
a = [1, 2, 3]
final_list = b[:1]+[a]+b[1:]
Output:
[[1, 2], [1, 2, 3], [3, 4, 5]]
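One difference between the two answers may matter in practice: the slicing expression builds a new list, while insert changes b in place. The sketch below illustrates that, and also shows slice assignment as an in-place variant of the slicing idea; the name new_b is illustrative.

```python
a = [1, 2, 3]
b = [[1, 2], [3, 4, 5]]

# b[:1] + [a] + b[1:] builds a brand-new list and leaves b untouched.
new_b = b[:1] + [a] + b[1:]
print(b)      # [[1, 2], [3, 4, 5]]
print(new_b)  # [[1, 2], [1, 2, 3], [3, 4, 5]]

# Slice assignment modifies b in place, like b.insert(1, a) does.
b[1:1] = [a]
print(b)      # [[1, 2], [1, 2, 3], [3, 4, 5]]
```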

Shuffle a dictionary of lists aggregating by rows

I have a defaultdict(list) that might look like this
d = {0: [2, 4, 5], 1: [5, 6, 1]}
I need to shuffle all the first elements of the lists together, and then move on to the second and third rows. So in this example I need to take [2, 5], [4, 6], [5, 1], shuffle each of them and then put them back. At the end my dictionary might look like this
d = {0: [5, 4, 1], 1: [2, 6, 5]}
Is there a Pythonic way of doing this that avoids loops?
What I have so far is a way to extract and aggregate all the first, second, etc., elements of the lists and shuffle them using this
[random.sample([tmp_list[tmp_index] for tmp_list in d.values()], 2) for tmp_index in range(3)]
which will create something like the following
[[2, 5], [4, 6], [5, 1]]
and then, in order to create my final shuffled-by-rows dictionary, I use simple for loops.

Get a transposed version of the dict values:
>>> data = [list(v) for v in zip(*d.values())]
>>> data
[[2, 5], [4, 6], [5, 1]]
Shuffle them in-place:
>>> import random
>>> for x in data:
...     random.shuffle(x)
...
>>> data
[[5, 2], [4, 6], [5, 1]]
Transpose the data again
>>> data = zip(*data)
Assign the new values to the dict:
>>> for x, k in zip(data, d):
...     d[k][:] = x # Could also be written as d[k] = list(x)
...
>>> d
{0: [5, 4, 5], 1: [2, 6, 1]}
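The transpose, shuffle, transpose steps can be collected into one helper. This is a sketch, not the answerer's code: the function name shuffle_rows and the rng parameter are hypothetical, and it assumes all lists are non-empty and of equal length. It returns a new dict rather than modifying d.

```python
import random


def shuffle_rows(d, rng=random):
    """Shuffle values across keys independently at each list position."""
    keys = list(d)
    # Transpose: one list per position, holding that position's value for every key.
    rows = [list(row) for row in zip(*d.values())]
    for row in rows:
        rng.shuffle(row)
    # Transpose back and pair each resulting column with its key.
    return {k: list(col) for k, col in zip(keys, zip(*rows))}


d = {0: [2, 4, 5], 1: [5, 6, 1]}
shuffled = shuffle_rows(d, random.Random(0))  # seeded generator for repeatable runs
print(shuffled)
```

Passing a seeded random.Random instance makes the result reproducible; leaving rng at its default uses the module-level generator.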

List of List, want to output variable, not value

I have a list of lists like so:
a = [1, 2, 3]
b = [2, 3, 4]
c = []
append blah blah blah
I am currently doing:
for x in c:
    print(x)
and it is outputting [1, 2, 3]. How would I get it to output 'a' instead?

There are a few ways to achieve what you want. The first suggestions require using a different data structure. The last suggestion is for demonstration purposes ONLY and should NEVER BE USED.
Option 1. Store your data in a dictionary:
my_data = {"a": [1, 2, 3], "b": [2, 3, 4]}
my_data["c"] = [my_data.get('a'), my_data.get('b')]
Then you would simply iterate over the key, value pairs.
>>> for name, value in my_data.items():
...     print(name, value)
...
a [1, 2, 3]
b [2, 3, 4]
c [[1, 2, 3], [2, 3, 4]]
On Python 3.7 and later a dictionary preserves insertion order. On older versions it has no useful ordering, so if you wanted it ordered you could use an OrderedDict, or another data structure like a list of tuples.
Or you could sort them before you iterate:
for name, value in sorted(my_data.items()):
    print(name, value)
You could also create the dictionary after the variables are assigned
>>> a = [1, 2, 3]
>>> b = [2, 3, 4]
>>> c = [a, b]
>>> my_data = {"a": a, "b": b, "c": c}
Option Terrible. The very hackish way to do this (and only for demonstration purposes) is to use locals():
>>> a = [1, 2, 3]
>>> b = [2, 3, 4]
>>> c = [a, b]
>>> for name, value in list(locals().items()):  # copy first: the loop variables add new names
...     if len(name) != 1:
...         continue
...     print(name, value)
...
a [1, 2, 3]
b [2, 3, 4]
c [[1, 2, 3], [2, 3, 4]]

You are printing the object that a refers to. Your list c is actually
c = [[1,2,3], [2,3,4]]
If you modify a in place before printing c, the new values in a will be shown, because c holds a reference to the same list object as a rather than a copy.
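A short sketch of that behaviour, using the lists from the question. The value 99 and the replacement list are arbitrary illustrations.

```python
a = [1, 2, 3]
b = [2, 3, 4]
c = [a, b]

# c[0] is the very same object as a, not a copy of it.
print(c[0] is a)  # True

# Mutating the list in place is visible through c.
a.append(99)
print(c)  # [[1, 2, 3, 99], [2, 3, 4]]

# Rebinding the name a to a new list does not touch c.
a = ['something', 'else']
print(c)  # [[1, 2, 3, 99], [2, 3, 4]]
```

The list c never stores the name 'a', only the object, which is why the name cannot be recovered from c.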

If you want to print the variable name, see "Can I print original variable's name in Python?" and "How can you print a variable name in python?"
However, the answers there say you should not do it.
You are telling it to print the complete contents of c, which contains the objects a and b, which are indeed
[1, 2, 3]
[2, 3, 4]
You are saying that you want to print the string 'a'.
To do that you would have to define
c = ['a', 'b']
which is completely different.

Dictionary where list is value as dataframe

This may be an incorrect way to use dataframes, but I have a dictionary where each value is a list of items, such as:
my_dict = {'a':[1,2,3], 'b':[3,4,5]}
I want to create a data frame where the indices are the keys and there is one column, where the value is the list. This is the output I'd like to see:
In [69]: my_df
Out[69]:
           0
a  [1, 2, 3]
b  [3, 4, 5]
This is the closest I've gotten, by changing each dictionary value to a list of lists and using a transpose. Is there a better way?
In [64]: my_dict = {'a':[[1,2,3]], 'b':[[3,4,5,6]]}
In [65]: my_df = pd.DataFrame(my_dict)
In [66]: print(my_df)
           a             b
0  [1, 2, 3]  [3, 4, 5, 6]
In [67]: my_df.T
Out[67]:
              0
a     [1, 2, 3]
b  [3, 4, 5, 6]
Thanks for the help!

import pandas as pd
my_dict = {'a':[1,2,3], 'b':[3,4,5]}
pd.DataFrame([[i] for i in my_dict.values()],index=my_dict)
Out[3]:
           0
a  [1, 2, 3]
b  [3, 4, 5]
But what you have is more of a Series than a DataFrame:
pd.Series(my_dict)
Out[4]:
a    [1, 2, 3]
b    [3, 4, 5]
and if you need to, you can convert it to a DataFrame:
pd.DataFrame(pd.Series(my_dict))
Out[5]:
           0
a  [1, 2, 3]
b  [3, 4, 5]
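Since the question wonders whether a column of lists is the right shape at all, here is a sketch of going from the Series form to a long format with one row per list element. It assumes pandas 0.25 or later, where DataFrame.explode is available; the column name 'elements' is an arbitrary choice, not from the original posts.

```python
import pandas as pd

my_dict = {'a': [1, 2, 3], 'b': [3, 4, 5]}

# One named column of lists, indexed by the dictionary keys.
df = pd.Series(my_dict, name='elements').to_frame()

# explode() turns each list element into its own row, repeating the index label.
long_form = df.explode('elements')

print(df)
print(long_form)
```

The long form tends to be easier to filter and aggregate, while the list column keeps each key on a single row.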
