Changing 2-dimensional list to standard matrix form - python

org = [['A', 'a', 1],
['A', 'b', 2],
['A', 'c', 3],
['B', 'a', 4],
['B', 'b', 5],
['B', 'c', 6],
['C', 'a', 7],
['C', 'b', 8],
['C', 'c', 9]]
I want to convert 'org' to the standard matrix form shown below.
transform = [['\t','A', 'B', 'C'],
['a', 1, 4, 7],
['b', 2, 5, 8],
['c', 3, 6, 9]]
I made a small function that converts this.
The code I wrote is below:
import numpy as np
def matrix(li):
    column = ['\t']
    row = []
    result = []
    rest = []
    for i in li:
        if i[0] not in column:
            column.append(i[0])
        if i[1] not in row:
            row.append(i[1])
    result.append(column)
    for i in li:
        for r in row:
            if r == i[1]:
                rest.append([i[2]])
    rest = np.array(rest).reshape((len(row),len(column)-1)).tolist()
    for i in range(len(rest)):
        rest[i] = [row[i]]+rest[i]
    result += rest
    for i in result:
        print(i)
matrix(org)
The result was this:
>>>['\t', 'school', 'kids', 'really']
[72, 0.008962252017017516, 0.04770759762717251, 0.08993156334317577]
[224, 0.004180594204995023, 0.04450803342634945, 0.04195010047081213]
[385, 0.0021807662921382335, 0.023217182598008267, 0.06564858527712682]
I don't think this is efficient since I use so many for loops.
Is there a more efficient way to do this?

Since you are using 3rd party libraries, this is a task well suited for pandas.
There is some messy, but not inefficient, work to incorporate index and columns as per your requirement.
org = [['A', 'a', 1],
['A', 'b', 2],
['A', 'c', 3],
['B', 'a', 4],
['B', 'b', 5],
['B', 'c', 6],
['C', 'a', 7],
['C', 'b', 8],
['C', 'c', 9]]
import pandas as pd
df = pd.DataFrame(org)
pvt = df.pivot_table(index=0, columns=1, values=2)
cols = ['\t'] + pvt.columns.tolist()
res = pvt.values.T.tolist()
res.insert(0, pvt.index.tolist())
res = [[i]+j for i, j in zip(cols, res)]
print(res)
[['\t', 'A', 'B', 'C'],
['a', 1, 4, 7],
['b', 2, 5, 8],
['c', 3, 6, 9]]

Here's another "manual" way using only numpy:
org_arr = np.array(org)
key1 = np.unique(org_arr[:,0])
key2 = np.unique(org_arr[:,1])
values = org_arr[:,2].reshape((len(key1),len(key2))).transpose()
np.block([
["\t", key1 ],
[key2[:,None], values]
])
""" # alternatively, for numpy < 1.13.0
np.vstack((
np.hstack(("\t", key1)),
np.hstack((key2[:, None], values))
))
"""
For simplicity, it requires the input matrix to be strictly ordered (first col is major and ascending ...).
Output:
Out[58]:
array([['\t', 'A', 'B', 'C'],
['a', '1', '4', '7'],
['b', '2', '5', '8'],
['c', '3', '6', '9']],
dtype='<U1')
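If the input is not strictly ordered, or you want to keep the values as integers rather than strings, a dictionary lookup is one possible alternative. This is only a minimal sketch: the function name to_matrix and the corner and missing parameters are made up for illustration, and it assumes every record has exactly three items.

```python
def to_matrix(records, corner="\t", missing=None):
    """Pivot [col_key, row_key, value] records into a header row plus data rows."""
    # dict.fromkeys keeps first-seen order while dropping duplicates
    col_keys = list(dict.fromkeys(rec[0] for rec in records))
    row_keys = list(dict.fromkeys(rec[1] for rec in records))
    # Map (row_key, col_key) -> value so each cell is a single lookup
    cells = {(rec[1], rec[0]): rec[2] for rec in records}
    table = [[corner] + col_keys]
    for r in row_keys:
        # Pairs that never occur in the input are filled with `missing`
        table.append([r] + [cells.get((r, c), missing) for c in col_keys])
    return table

org = [['A', 'a', 1], ['A', 'b', 2], ['A', 'c', 3],
       ['B', 'a', 4], ['B', 'b', 5], ['B', 'c', 6],
       ['C', 'a', 7], ['C', 'b', 8], ['C', 'c', 9]]

for line in to_matrix(org):
    print(line)
```

Each record is visited a fixed number of times, and the membership tests are dictionary lookups instead of list scans.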

Related

Sequence numbers for lists with lists

I am preparing a list of lists. The initial list can contain any number of entries, but the sub-lists each contain 3 entries, e.g.:
colony = [['A', 'B', 'C'], [1, 'b', 'c'], [2, 'b', 'c'], [3, 'b', 'c'], [4, 'b', 'c'], [5, 'b', 'c']]
The first entry in each of the sub-lists is the sequence number of the entry and needs to be sequential,
ie. 1, 2, 3, 4, 5, 6,...
A, B and C are the column headings; the data is in the subsequent lists. My difficulty is that if the number of sub-lists I need to add is, say, 5, then the sequence number of every entry is 5. How can I change my code to insert the correct sequence number in each sub-list?
colony = [['A', 'B', 'C']]
add_colony = [0, 0, 0]
R = 5
i = 1
for i in range(R):
    add_colony[0] = i + 1
    add_colony[1] = 'b'
    add_colony[2] = 'c'
    colony.append(add_colony)
    i = i + 1
print()
print('colony = ', colony)
produces:
colony = [['A', 'B', 'C'], [5, 'b', 'c'], [5, 'b', 'c'], [5, 'b', 'c'], [5, 'b', 'c'], [5, 'b', 'c']]
not:
colony = [['A', 'B', 'C'], [1, 'b', 'c'], [2, 'b', 'c'], [3, 'b', 'c'], [4, 'b', 'c'], [5, 'b', 'c']]
I have tried all sorts of variations but end up with the incorrect output.
Thanks in advance
Bob
You are repeatedly mutating and appending the same list object, add_colony. All the lists in colony are references to this same object, so you have to create a new list on each loop iteration:
for i in range(R):
    add_colony = [0, 0, 0]  # without this line ...
    add_colony[0] = i + 1   # ... this mutation will affect all the references in colony
    add_colony[1] = 'b'
    add_colony[2] = 'c'
    colony.append(add_colony)
Or shorter:
for i in range(R):
    colony.append([i + 1, 'b', 'c'])
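To see the aliasing directly, here is a small sketch that compares object identity with the is operator. The names shared, aliased and fresh are just for this illustration.

```python
shared = [0, 0, 0]
aliased = []
for i in range(3):
    shared[0] = i + 1
    aliased.append(shared)            # appends a reference to the same list every time

fresh = []
for i in range(3):
    fresh.append([i + 1, 'b', 'c'])   # builds a new list object on every iteration

print(aliased)                   # [[3, 0, 0], [3, 0, 0], [3, 0, 0]]
print(aliased[0] is aliased[1])  # True: one object, referenced three times
print(fresh)                     # [[1, 'b', 'c'], [2, 'b', 'c'], [3, 'b', 'c']]
print(fresh[0] is fresh[1])      # False: separate objects
```

Appending a copy, for example colony.append(add_colony.copy()), would also break the sharing if you prefer to keep a template list around.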
Hello there fellow University of Melbourne Student!
As @schwobaseggl mentioned, you need to create a new list object on each iteration of your loop; otherwise you just keep appending the same object over and over. You could also give add_colony the default values ['b', 'c'] and insert() the new i value at the beginning of each list.
Here is an example:
colony = [['A', 'B', 'C']]
R = 5
for i in range(R):
    add_colony = ['b', 'c']
    add_colony.insert(0, i+1)
    colony.append(add_colony)
print('colony = ', colony)
Which outputs:
colony = [['A', 'B', 'C'], [1, 'b', 'c'], [2, 'b', 'c'], [3, 'b', 'c'], [4, 'b', 'c'], [5, 'b', 'c']]
You could also use a list comprehension:
colony = [['A', 'B', 'C']] + [[i + 1] + ['b', 'c'] for i in range(R)]
Good Luck!
A one-line list comprehension that mutates the original list as a side effect, without saving the comprehension's own result to a variable (kinda weird):
[colony.append([i, 'b', 'c']) for i in range(1, R + 1)]
print(colony) outputs
[['A', 'B', 'C'], [1, 'b', 'c'], [2, 'b', 'c'], [3, 'b', 'c'], [4, 'b', 'c'], [5, 'b', 'c']]

Create all combinations of two sets of lists in Python

I am trying to create all combinations of two sets of lists, as follows:
x = [[1,2,3],[4,5,6]]
y = [['a','b','c'],['d','e','f']]
combos = [[1,2,3,'a','b','c'],[1,2,3,'d','e','f'],[4,5,6,'a','b','c'],[4,5,6,'d','e','f']]
I think itertools may be of some help but not sure how. Thanks
You can use product and chain:
from itertools import product, chain
[list(chain(*i)) for i in product(x, y)]
#[[1, 2, 3, 'a', 'b', 'c'],
# [1, 2, 3, 'd', 'e', 'f'],
# [4, 5, 6, 'a', 'b', 'c'],
# [4, 5, 6, 'd', 'e', 'f']]
Or you can use a list comprehension:
[i + j for i in x for j in y]
#[[1, 2, 3, 'a', 'b', 'c'],
# [1, 2, 3, 'd', 'e', 'f'],
# [4, 5, 6, 'a', 'b', 'c'],
# [4, 5, 6, 'd', 'e', 'f']]
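If you later need more than two sets of lists, the product/chain idea generalizes. A rough sketch, where all_combos is a hypothetical helper name and z is made-up extra data:

```python
from itertools import chain, product

def all_combos(*groups):
    """Pick one sub-list from each group and concatenate the picks, for every possible pick."""
    return [list(chain.from_iterable(combo)) for combo in product(*groups)]

x = [[1, 2, 3], [4, 5, 6]]
y = [['a', 'b', 'c'], ['d', 'e', 'f']]
z = [[True], [False]]  # extra group, only for illustration

print(all_combos(x, y))
print(len(all_combos(x, y, z)))  # 2 * 2 * 2 = 8 combinations
```

chain.from_iterable does the same job as chain(*i) above without unpacking the tuple into arguments.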

python: combine lists of lists for SQLITE table

I need to combine 3 lists into one list so that I can insert it smoothly into sqlite table.
list1= [[a1,b1,c1],[a2,b2,c2]]
list2= [[d1,e1,f1],[d2,e2,f2]]
Output should look like:
combined_list = [[a1,b1,c1,d1,e1,f1],[a2,b2,c2,d2,e2,f2]]
I tried sum() and list1 + list2, but neither produced this output.
You can try this:
from operator import add
a=[[1, 2, 3], [4, 5, 6]]
b=[['a', 'b', 'c'], ['d', 'e', 'f']]
print(a + b)
print(list(map(add, a, b)))
Output:
[[1, 2, 3], [4, 5, 6], ['a', 'b', 'c'], ['d', 'e', 'f']]
[[1, 2, 3, 'a', 'b', 'c'], [4, 5, 6, 'd', 'e', 'f']]
Edit:
To add more than two arrays:
# lists is assumed to hold all the input lists, e.g. lists = [a, b, c]
u = [[] for _ in lists[0]]
for x in lists:
    u = list(map(add, u, x))
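Another option for any number of lists is to zip the tables together and flatten each group of rows. This is only a sketch; combine_rows and list3 are names invented for the example.

```python
from itertools import chain

def combine_rows(*tables):
    """Concatenate the i-th row of every table into one flat row."""
    # zip(*tables) yields (row_from_table1, row_from_table2, ...) for each position
    return [list(chain.from_iterable(rows)) for rows in zip(*tables)]

list1 = [[1, 2, 3], [4, 5, 6]]
list2 = [['a', 'b', 'c'], ['d', 'e', 'f']]
list3 = [[True], [False]]  # made-up third table

print(combine_rows(list1, list2))
# [[1, 2, 3, 'a', 'b', 'c'], [4, 5, 6, 'd', 'e', 'f']]
print(combine_rows(list1, list2, list3))
# [[1, 2, 3, 'a', 'b', 'c', True], [4, 5, 6, 'd', 'e', 'f', False]]
```

Note that zip stops at the shortest table, so tables with different row counts are silently truncated.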

extracting a range of elements from a csv / 2d array

I want to extract a range of elements from a specific column of a csv file.
I've simplified the problem to this:
data = [['a',1,'A',100],['b',2,'B',200],['c',3,'C',300],['d',4,'D',400]]
print(data[0:2][:],'\nROWS 0&1')
print(data[:][0:2],'\nCOLS 1&1')
I thought that meant
'show me all columns for just row 0 and 1'
'show me all the rows for just column 0 and 1'
But the output is always just showing me rows 0 and 1, never the columns,
[['a', 1, 'A', 100], ['b', 2, 'B', 200]]
ROWS 0&1
[['a', 1, 'A', 100], ['b', 2, 'B', 200]]
COLS 1&1
when I want to see this:
['a', 1, 'A', 100,'b', 2, 'B', 200] # ... i.e. ROWS 0 and 1
['a','b','c','d',1,2,3,4]
Is there a nice way to do this?
Your problem here is that data[:] is just a copy of data:
>>> data
[['a', 1, 'A', 100], ['b', 2, 'B', 200], ['c', 3, 'C', 300], ['d', 4, 'D', 400]]
>>> data[:]
[['a', 1, 'A', 100], ['b', 2, 'B', 200], ['c', 3, 'C', 300], ['d', 4, 'D', 400]]
... so both your attempts at slicing are giving you the same result as data[0:2].
You can get just columns 0 and 1 with a list comprehension:
>>> [x[0:2] for x in data]
[['a', 1], ['b', 2], ['c', 3], ['d', 4]]
... which can be rearranged to the order you want with zip():
>>> list(zip(*(x[0:2] for x in data)))
[('a', 'b', 'c', 'd'), (1, 2, 3, 4)]
To get a single list rather than a list of 2 tuples, use itertools.chain.from_iterable():
>>> from itertools import chain
>>> list(chain.from_iterable(zip(*(x[0:2] for x in data))))
['a', 'b', 'c', 'd', 1, 2, 3, 4]
... which can also be used to collapse data[0:2]:
>>> list(chain.from_iterable(data[0:2]))
['a', 1, 'A', 100, 'b', 2, 'B', 200]
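If you need both kinds of slicing often, the two steps can be wrapped in a small helper. A sketch only: sub_table is a hypothetical name, and it takes slice objects so it mirrors the [start:stop] notation.

```python
def sub_table(data, rows=slice(None), cols=slice(None)):
    """Return the selected rows, each cut down to the selected columns."""
    # data[rows] picks the rows; row[cols] then slices inside each picked row
    return [row[cols] for row in data[rows]]

data = [['a', 1, 'A', 100], ['b', 2, 'B', 200], ['c', 3, 'C', 300], ['d', 4, 'D', 400]]

print(sub_table(data, rows=slice(0, 2)))
# [['a', 1, 'A', 100], ['b', 2, 'B', 200]]
print(sub_table(data, cols=slice(0, 2)))
# [['a', 1], ['b', 2], ['c', 3], ['d', 4]]
print(sub_table(data, rows=slice(1, 3), cols=slice(2, 4)))
# [['B', 200], ['C', 300]]
```

The key point is the same as above: a column slice has to be applied to each row, not to the outer list.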

Returning indexes of intersected lists with Python

I have two lists:
1. ['a', 'b', 'c', 'd', 'e', 'c', 'd', 'f']
2. ['c', 'd']
and I'd like to get indexes of the intersection a, b:
3. [[2, 3], [5, 6]]
How would you do that with Python?
Also these inputs:
1. ['263', '9', '470', '370', '576', '770', '800', '203', '62', '370', '576', '370', '25', '770', '484', '61', '914', '301', '550', '770', '484', '1276', '108']
2. ['62', '370', '576']
should give:
3. [[8, 9, 10]]
One way would be:
>>> l1 = ['a', 'b', 'c', 'd', 'e', 'c', 'd', 'f']
>>> l2 = ['c', 'd']
>>> [list(range(i, i+len(l2))) for i in range(len(l1)-len(l2)+1) if l2 == l1[i:i+len(l2)]]
[[2, 3], [5, 6]]
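The same sliding-window comparison can be written as a function, which makes it easy to try on the second pair of inputs from the question. This is a sketch; the name find_sublist is made up.

```python
def find_sublist(haystack, needle):
    """Return the index runs where `needle` occurs as a contiguous slice of `haystack`."""
    n = len(needle)
    return [
        list(range(i, i + n))
        for i in range(len(haystack) - n + 1)
        if haystack[i:i + n] == needle   # compare one window at a time
    ]

l1 = ['a', 'b', 'c', 'd', 'e', 'c', 'd', 'f']
l2 = ['c', 'd']
print(find_sublist(l1, l2))  # [[2, 3], [5, 6]]

big = ['263', '9', '470', '370', '576', '770', '800', '203', '62', '370', '576',
       '370', '25', '770', '484', '61', '914', '301', '550', '770', '484', '1276', '108']
print(find_sublist(big, ['62', '370', '576']))  # [[8, 9, 10]]
```

Because whole windows are compared, elements of the shorter list that appear elsewhere on their own (such as the '370' at index 3) are not reported.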
For your given example this will work
>>> x = ['a', 'b', 'c', 'd', 'e', 'c', 'd', 'f']
>>> y = ['c', 'd']
>>> z = [[i for i, xi in enumerate(x) if xi == yi] for yi in y]
>>> z
[[2, 5], [3, 6]]
>>> list(zip(*z))
[(2, 3), (5, 6)]
It makes use of the enumerate function to get the indices of x along with the values and then transposes the result using zip(*z). You can convert from tuples to lists afterward.
Edit: transposed result.
Maybe a little too much code, but it works.
def indexes(lst, element):
    # Collect every position at which element occurs in lst
    c = 0
    output = []
    for e in lst:
        if e == element:
            output.append(c)
        c += 1
    return output
a = ['a', 'b', 'c', 'd', 'a']
b = ['a', 'd']
output = []
for el in b:
    output.append(indexes(a, el))
print(output)
