I have a correlation dataframe, and I'm trying to turn it into different lists:
          A         B         C
A  1.000000  0.932159 -0.976221
B  0.932159  1.000000 -0.831509
C -0.976221 -0.831509  1.000000
The output I need is:
[A, B, 0.932159]
[A, C, -0.976221]
[B, A, 0.932159]
[B, C, -0.831509]
[C, A, -0.976221]
[C, B, -0.831509]
I have tried converting the dataframe into a list, but I don't get what I need.
Thanks
Stack the dataframe, reset the index, then exclude the rows which have identical 1st and 2nd column values, then create the list out of it:
out=df.stack().reset_index()
out=out[out.iloc[:,0].ne(out.iloc[:,1])].values.tolist()
OUTPUT
[['A', 'B', 0.932159],
['A', 'C', -0.976221],
['B', 'A', 0.932159],
['B', 'C', -0.831509],
['C', 'A', -0.976221],
['C', 'B', -0.831509]]
The simplest way I can think of for a tiny DataFrame like this is with a list comprehension:
corr
          A         B         C
A  1.000000  0.932159 -0.976221
B  0.932159  1.000000 -0.831509
C -0.976221 -0.831509  1.000000
[[row, col, corr[col][row]] for row in corr.index for col in corr if row != col]
[['A', 'B', 0.932159],
['A', 'C', -0.976221],
['B', 'A', 0.932159],
['B', 'C', -0.831509],
['C', 'A', -0.976221],
['C', 'B', -0.831509]]
The longer-form way may be easier to read for a general audience:
result = []
for row in corr.index:
    for col in corr:
        if row != col:
            result.append([row, col, corr[col][row]])
result
[['A', 'B', 0.932159],
['A', 'C', -0.976221],
['B', 'A', 0.932159],
['B', 'C', -0.831509],
['C', 'A', -0.976221],
['C', 'B', -0.831509]]
A simple list comprehension over itertools.permutations will be enough:
from itertools import permutations
elements = df.index
out = [[val[0], val[1], df.loc[val[0], val[1]]]
for val in permutations(elements, 2)]
Output
[['A', 'B', 0.932159],
['A', 'C', -0.976221],
['B', 'A', 0.932159],
['B', 'C', -0.831509],
['C', 'A', -0.976221],
['C', 'B', -0.831509]]
Related
I want to filter a list of lists for duplicates. I consider two lists to be a duplicate of each other when they contain the same elements but not necessarily in the same order. So for example
[['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
should become
[['A', 'B', 'C'], ['D', 'B', 'A']]
since ['C', 'B', 'A'] is a duplicate of ['A', 'B', 'C'].
It does not matter which one of the duplicates gets removed, as long as the final list of lists no longer contains any duplicates. All lists also need to keep the order of their elements, so using set() may not be an option.
I found these related questions:
Determine if 2 lists have the same elements, regardless of order?
How to efficiently compare two unordered lists (not sets)?
But they only talk about how to compare two lists, not how to efficiently remove duplicates. I'm using Python.
Using a dictionary comprehension:
>>> data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
>>> result = {tuple(sorted(i)): i for i in data}.values()
>>> result
dict_values([['C', 'B', 'A'], ['D', 'B', 'A']])
>>> list( result )
[['C', 'B', 'A'], ['D', 'B', 'A']]
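Note that the dict comprehension keeps the last list seen for each key, which is why ['C', 'B', 'A'] survives rather than ['A', 'B', 'C']. If the first occurrence should win instead, one possible sketch uses dict.setdefault. It assumes Python 3.7+, where dicts preserve insertion order, and the name unique is only illustrative:

```python
data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]

unique = {}  # illustrative name: sorted tuple -> first list seen with those elements
for item in data:
    # setdefault only stores the value when the key is not present yet
    unique.setdefault(tuple(sorted(item)), item)

result = list(unique.values())
print(result)  # [['A', 'B', 'C'], ['D', 'B', 'A']]
```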
You can use frozenset
>>> x = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
>>> [list(s) for s in set([frozenset(item) for item in x])]
[['A', 'B', 'D'], ['A', 'B', 'C']]
Or, with map:
>>> [list(s) for s in set(map(frozenset, x))]
[['A', 'B', 'D'], ['A', 'B', 'C']]
If you want to keep the order of elements:
data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
seen = set()
result = []
for obj in data:
    if frozenset(obj) not in seen:
        result.append(obj)
        seen.add(frozenset(obj))
Output:
[['A', 'B', 'C'], ['D', 'B', 'A']]
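One caveat: frozenset ignores how often an element occurs, so ['A', 'A', 'B'] and ['A', 'B', 'B'] would be treated as duplicates. If repeated elements matter (an assumption, the question does not say), a possible variation keys on element counts instead. The helper name multiset_key is hypothetical:

```python
from collections import Counter

def multiset_key(items):
    # Hashable summary of which elements occur and how many times each
    return frozenset(Counter(items).items())

data = [['A', 'A', 'B'], ['A', 'B', 'B'], ['B', 'A', 'A']]
seen = set()
result = []
for obj in data:
    key = multiset_key(obj)
    if key not in seen:
        seen.add(key)
        result.append(obj)

print(result)  # [['A', 'A', 'B'], ['A', 'B', 'B']]
```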
Do you want to keep the order of elements?
from itertools import groupby
data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
print([k for k, _ in groupby(data, key=sorted)])
Output:
[['A', 'B', 'C'], ['A', 'B', 'D']]
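Keep in mind that groupby only merges items that are next to each other, and that this approach returns the sorted form of each list rather than the original. The sketch below, using a reordered version of the sample data, shows the difference and one way around it by sorting the outer list first (which gives up the original order of the lists):

```python
from itertools import groupby

# Same lists as above, but the two duplicates are no longer adjacent
data = [['A', 'B', 'C'], ['D', 'B', 'A'], ['C', 'B', 'A']]

# groupby alone does not catch the non-adjacent duplicate
adjacent_only = [k for k, _ in groupby(data, key=sorted)]
print(adjacent_only)  # [['A', 'B', 'C'], ['A', 'B', 'D'], ['A', 'B', 'C']]

# Sorting by the same key first brings duplicates together
sorted_first = [k for k, _ in groupby(sorted(data, key=sorted), key=sorted)]
print(sorted_first)  # [['A', 'B', 'C'], ['A', 'B', 'D']]
```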
In Python, rather than removing items from the list you are iterating over, it is usually simpler to build a new list and append only the items you want to keep.
The simplest way is as follows:
data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
temp = []
seen = []  # sorted copies of the lists kept so far
for i in data:
    if sorted(i) not in seen:
        seen.append(sorted(i))
        temp.append(i)
print(temp)
cheers, athrv
There are a lot of similar questions (like this one), but I did not find anything that suited my needs.
My objective is to remove groups of adjacent duplicates from a list.
For instance, if my list is
['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'A', 'C', 'C']
my desired output is
['A', 'B', 'C', 'A', 'C']
i.e. every group of adjacent duplicates is removed, only one of their group remains.
My code so far involves a for loop with a condition:
def reduce_duplicates(l):
    assert len(l) > 0, "Passed list is empty."
    result = [l[0]]  # initialization
    for i in l:
        if i != result[-1]:
            result.append(i)
    return result
l = ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'A', 'C', 'C']
print(reduce_duplicates(l))
# ['A', 'B', 'C', 'A', 'C']
It produces the expected output, but I suspect there is a built-in, optimized and more elegant way to achieve the same result. Is there one?
Use groupby from itertools:
from itertools import groupby
lst = ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'A', 'C', 'C']
out = [k for k, _ in groupby(lst)]
print(out)
# Output
['A', 'B', 'C', 'A', 'C']
Update
You can also use zip_longest from itertools:
from itertools import zip_longest
lst = ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'A', 'C', 'C']
out = [l for l, r in zip_longest(lst, lst[1:]) if l != r]
print(out)
# Output
['A', 'B', 'C', 'A', 'C']
Or without any imports:
lst = ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'A', 'C', 'C']
out = [lst[0]] + [r for l, r in zip(lst, lst[1:]) if l != r]
print(out)
# Output
['A', 'B', 'C', 'A', 'C']
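The import-free version indexes lst[0], so it raises IndexError on an empty list. A small variation, sketched here with the illustrative function name dedupe_adjacent, uses the slice lst[:1] instead, which is simply empty when the list is empty:

```python
def dedupe_adjacent(lst):
    # lst[:1] is [] for an empty list, so no IndexError is raised
    return lst[:1] + [r for l, r in zip(lst, lst[1:]) if l != r]

print(dedupe_adjacent(['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'A', 'C', 'C']))
# ['A', 'B', 'C', 'A', 'C']
print(dedupe_adjacent([]))
# []
```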
The itertools documentation provides a recipe for exactly this, unique_justseen. Since it uses map, it may be a tiny bit faster than the regular list comprehension, and also supports a key-function.
import operator
from itertools import groupby

def unique_justseen(iterable, key=None):
    "List unique elements, preserving order. Remember only the element just seen."
    # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
    # unique_justseen('ABBCcAD', str.lower) --> A B C A D
    return map(next, map(operator.itemgetter(1), groupby(iterable, key)))
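As a usage sketch, the recipe returns a lazy map object, so it needs to be wrapped in list() to get a list back. The function is repeated here only so the snippet runs on its own:

```python
import operator
from itertools import groupby

def unique_justseen(iterable, key=None):
    "List unique elements, preserving order. Remember only the element just seen."
    return map(next, map(operator.itemgetter(1), groupby(iterable, key)))

lst = ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'A', 'C', 'C']
out = list(unique_justseen(lst))
print(out)  # ['A', 'B', 'C', 'A', 'C']

# With a key function, items that compare equal under the key are merged
out_ci = list(unique_justseen('ABBCcAD', str.lower))
print(out_ci)  # ['A', 'B', 'C', 'A', 'D']
```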
import string
seats = 4 # user can choose an even input, I put 4 for this example
rows = 4 # user can choose an even or odd input, I put 4 for this example
seats_in_row_list = [i for i in string.ascii_uppercase[:seats]]
main_seat_list = [seats_in_row_list for i in range(rows)]
The output is:
[['A', 'B', 'C', 'D'], ['A', 'B', 'C', 'D'], ['A', 'B', 'C', 'D'], ['A', 'B', 'C', 'D']]
But when I try to change 'A' to 'X' in the first list, all of the lists change:
[['X', 'B', 'C', 'D'], ['X', 'B', 'C', 'D'], ['X', 'B', 'C', 'D'], ['X', 'B', 'C', 'D']]
What I'm looking for is this output:
[['X', 'B', 'C', 'D'], ['A', 'B', 'C', 'D'], ['A', 'B', 'C', 'D'], ['A', 'B', 'C', 'D']]
Use the list's copy method so that each row is its own copy of the list rather than another reference to the same list:
main_seat_list = [seats_in_row_list.copy() for i in range(rows)]
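To see why this helps, the following sketch compares both constructions with the is operator. The names shared and independent are only illustrative:

```python
import string

seats = 4
rows = 4
seats_in_row_list = list(string.ascii_uppercase[:seats])

# Every row is the same list object, so a change shows up in all rows
shared = [seats_in_row_list for i in range(rows)]
print(shared[0] is shared[1])  # True

# Each row is a separate copy, so rows can be changed independently
independent = [seats_in_row_list.copy() for i in range(rows)]
print(independent[0] is independent[1])  # False

independent[0][0] = 'X'
print(independent)
# [['X', 'B', 'C', 'D'], ['A', 'B', 'C', 'D'], ['A', 'B', 'C', 'D'], ['A', 'B', 'C', 'D']]
```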
If you aren't using seats_in_row_list for anything other than the construction of main_seat_list, you should just inline the definition. Calling list here would be simpler than using a list comprehension.
import string
seats = 4
rows = 4
main_seat_list = [list(string.ascii_uppercase[:seats]) for i in range(rows)]
I'm trying to move the second value in a list to the third value in a list for each nested list. I tried the below, but it's not working as expected.
Code
List = [['a','b','c','d'],['a','b','c','d'],['a','b','c','d']]
print(List)
col_out = [List.pop(1) for col in List]
col_in = [List.insert(2,List) for col in col_out]
print(List)
Result
[['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd']]
[['a', 'b', 'c', 'd'], [...], [...]]
Desired Result
[['a', 'c', 'b', 'd'], ['a', 'c', 'b', 'd'], ['a', 'c', 'b', 'd']]
UPDATE
Based upon pynoob's comment, I came up with the following, but I'm still not there. Why is 'c' printing?
Code
List = [['a','b','c','d'],['a','b','c','d'],['a','b','c','d']]
col_out = [col.pop(1) for col in List for i in col]
print(col_out)
Result
['b', 'c', 'b', 'c', 'b', 'c']
[List.insert(2,List) for col in col_out]
               ^^^^ -- See below.
You are inserting an entire list as an element within the same list. Think recursion!
Also, please refrain from using state-changing expressions in list comprehension. A list comprehension should NOT modify any variables. It is bad manners!
In your case, you'd do:
lists = [['a','b','c','d'],['a','b','c','d'],['a','b','c','d']]
for lst in lists:
    lst[1], lst[2] = lst[2], lst[1]
print(lists)
Output:
[['a', 'c', 'b', 'd'], ['a', 'c', 'b', 'd'], ['a', 'c', 'b', 'd']]
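If you would rather keep the pop/insert idea from the question, a possible sketch is to call both methods on each inner list inside a plain loop, instead of on the outer list inside a comprehension:

```python
lists = [['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd']]

for lst in lists:
    # pop(1) removes and returns 'b'; insert(2, ...) puts it back at index 2
    lst.insert(2, lst.pop(1))

print(lists)
# [['a', 'c', 'b', 'd'], ['a', 'c', 'b', 'd'], ['a', 'c', 'b', 'd']]
```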
You can do it like this:
myList = [['a','b','c','d'],['a','b','c','d'],['a','b','c','d']]
myOrder = [0,2,1,3]
myList = [[sublist[i] for i in myOrder] for sublist in myList]
I am having a lot of trouble merging two simple nested lists while keeping the overall structure. For example:
# From this source:
x = [['a','b','c'],['d','e','f'],['g','h','i']]
y = [['A','B','C'],['D','E','F'],['G','H','I']]
# To this result:
z = [[['a','A'],['b','B'],['c','C']],
[['d','D'],['e','E'],['f','F']],
[['g','G'],['h','H'],['i','I']]]
Any idea?
Here is some (embarrassing) code I tried:
X = []
Y = []
for i in iter(x[:]):
    X.append('=')
    for v in iter(i[:]):
        X.append(v)
print(X); print()
for i in iter(y[:]):
    Y.append('=')
    for v in iter(i[:]):
        Y.append(v)
print(Y); print()
for i in zip(X, Y):
    print(i)
print([list(map(list,zip(*t))) for t in zip(x,y)])
[[['a', 'A'], ['b', 'B'], ['c', 'C']],
[['d', 'D'], ['e', 'E'], ['f', 'F']],
[['g', 'G'], ['h', 'H'], ['i', 'I']]]
The steps:
In [20]: list(zip(x,y)) # zip the lists together (list() is needed on Python 3 to see the pairs)
Out[20]:
[(['a', 'b', 'c'], ['A', 'B', 'C']),
(['d', 'e', 'f'], ['D', 'E', 'F']),
(['g', 'h', 'i'], ['G', 'H', 'I'])]
In [21]: t = (['a', 'b', 'c'], ['A', 'B', 'C']) # first "t"
In [22]: list(zip(*t)) # transpose, row to columns, columns to rows
Out[22]: [('a', 'A'), ('b', 'B'), ('c', 'C')]
In [23]: list(map(list,zip(*t))) # convert inner tuples to lists
Out[23]: [['a', 'A'], ['b', 'B'], ['c', 'C']]
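The same steps can also be written as a nested list comprehension, which some readers may find easier to follow than map. This is only an equivalent sketch; the loop names row_x, row_y and pair are illustrative:

```python
x = [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
y = [['A', 'B', 'C'], ['D', 'E', 'F'], ['G', 'H', 'I']]

# Outer zip pairs up the rows, inner zip pairs up the elements of each row
z = [[list(pair) for pair in zip(row_x, row_y)] for row_x, row_y in zip(x, y)]

print(z)
# [[['a', 'A'], ['b', 'B'], ['c', 'C']],
#  [['d', 'D'], ['e', 'E'], ['f', 'F']],
#  [['g', 'G'], ['h', 'H'], ['i', 'I']]]
```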