Kronecker Power of sparse matrices problem - python

I am trying to create a sparse matrix using the scipy package.
Why does the following code not work? I also tried it with loops.
import numpy
from scipy import sparse
from scipy.sparse import coo_matrix
def SparseMatrixPower(A,p):
    if p == 1:
        return(A)
    elif Mod(p,2):
        return(SparseMatrixPower(A,SparseMatrixProduct(A,A)), (p - 1) / 2)
    else:
        return(SparseMatrixPower(SparseMatrixProduct(A,A), p / 2))
def SparseMatrixProduct(A,B):
    return(sparse.kron(A,B)+sparse.kronsum(A,B))
A=sparse.coo_matrix([[0,1,2],[1,1,2],[2,2,3]])
B=sparse.coo_matrix([[0,1,2],[1,1,2],[2,2,3]])
SparseMatrixProduct(A,B)
SparseMatrixPower(A,3)
This is the error message I get:
ValueError Traceback (most recent call last)
in ()
19 B=sparse.coo_matrix([[Integer(0),Integer(1),Integer(2)],[Integer(1),Integer(1),Integer(2)],[Integer(2),Integer(2),Integer(3)]])
20 SparseMatrixProduct(A,B)
---> 21 SparseMatrixPower(A,Integer(11))
<ipython-input-7-9b98e47610f5> in SparseMatrixPower(A, p)
9 return(A)
10 elif Mod(p,Integer(2)):
---> 11 return(SparseMatrixPower(A,SparseMatrixProduct(A,A)), (p - Integer(1)) / Integer(2))
12 else:
13 return(SparseMatrixPower(SparseMatrixProduct(A,A), p / Integer(2)))
<ipython-input-7-9b98e47610f5> in SparseMatrixPower(A, p)
6
7 def SparseMatrixPower(A,p):
----> 8 if p == Integer(1):
9 return(A)
10 elif Mod(p,Integer(2)):
/home/calc/SageMath/local/lib/python3.7/site-packages/scipy/sparse/base.py in __bool__(self)
286 return self.nnz != 0
287 else:
--> 288 raise ValueError("The truth value of an array with more than one "
289 "element is ambiguous. Use a.any() or a.all().")
290 __nonzero__ = __bool__
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().

The call SparseMatrixPower(A, SparseMatrixProduct(A, A)) does not match what SparseMatrixPower expects: its second argument must be an integer, but a closing parenthesis is misplaced, so a sparse matrix is passed as p. The comparison p == 1 then yields a sparse matrix instead of a single boolean, which is what the error message points to. The traceback also suggests that the code was run from the SageMath IPython interpreter.
Here is a version of the code that fixes this and follows the PEP 8 guidelines more closely:
import numpy
from scipy import sparse
from scipy.sparse import coo_matrix
def SparseMatrixProduct(A, B):
    return sparse.kron(A, B) + sparse.kronsum(A, B)
def SparseMatrixPower(A, p):
    if p == 1:
        return A
    phalf = int(p/2) # p/2 truncated if not int
    rest = p - 2*phalf
    AA = SparseMatrixProduct(A, A)
    B = SparseMatrixPower(AA, phalf)
    if not rest:
        return B
    return SparseMatrixProduct(A, B)
A = sparse.coo_matrix([[0,1,2],[1,1,2],[2,2,3]])
B = sparse.coo_matrix([[0,1,2],[1,1,2],[2,2,3]])
AB = SparseMatrixProduct(A, B)
A3 = SparseMatrixPower(A,3)
The above works for me.
Printing the objects AB and A3 in full is not well supported by the format of this site, so I will use the IPython interpreter to show AB, and the shape and the first few entries of A3.
AB is:
sage: AB
<9x9 sparse matrix of type '<class 'numpy.int64'>'
with 81 stored elements (blocksize = 3x3) in Block Sparse Row format>
sage: AB.todense()
matrix([[ 0, 1, 2, 1, 1, 2, 2, 2, 4],
[ 1, 1, 2, 1, 2, 2, 2, 4, 4],
[ 2, 2, 3, 2, 2, 4, 4, 4, 8],
[ 1, 1, 2, 1, 2, 4, 2, 2, 4],
[ 1, 2, 2, 2, 3, 4, 2, 4, 4],
[ 2, 2, 4, 4, 4, 7, 4, 4, 8],
[ 2, 2, 4, 2, 2, 4, 3, 4, 8],
[ 2, 4, 4, 2, 4, 4, 4, 7, 8],
[ 4, 4, 8, 4, 4, 8, 8, 8, 15]])
The 3³ x 3³ matrix A3 is shown only in part (the first 15 rows and columns):
sage: A3.shape
(27, 27)
sage: A3.todense()[0:15, 0:15]
matrix([[ 0, 1, 2, 1, 0, 0, 2, 0, 0, 1, 1, 2, 2, 1, 2],
[ 1, 1, 2, 0, 1, 0, 0, 2, 0, 1, 2, 2, 1, 3, 2],
[ 2, 2, 3, 0, 0, 1, 0, 0, 2, 2, 2, 4, 2, 2, 5],
[ 1, 0, 0, 1, 1, 2, 2, 0, 0, 2, 1, 2, 3, 2, 4],
[ 0, 1, 0, 1, 2, 2, 0, 2, 0, 1, 3, 2, 2, 5, 4],
[ 0, 0, 1, 2, 2, 4, 0, 0, 2, 2, 2, 5, 4, 4, 9],
[ 2, 0, 0, 2, 0, 0, 3, 1, 2, 4, 2, 4, 4, 2, 4],
[ 0, 2, 0, 0, 2, 0, 1, 4, 2, 2, 6, 4, 2, 6, 4],
[ 0, 0, 2, 0, 0, 2, 2, 2, 6, 4, 4, 10, 4, 4, 10],
[ 1, 1, 2, 2, 1, 2, 4, 2, 4, 1, 2, 4, 3, 1, 2],
[ 1, 2, 2, 1, 3, 2, 2, 6, 4, 2, 3, 4, 1, 4, 2],
[ 2, 2, 4, 2, 2, 5, 4, 4, 10, 4, 4, 7, 2, 2, 6],
[ 2, 1, 2, 3, 2, 4, 4, 2, 4, 3, 1, 2, 4, 3, 6],
[ 1, 3, 2, 2, 5, 4, 2, 6, 4, 1, 4, 2, 3, 7, 6],
[ 2, 2, 5, 4, 4, 9, 4, 4, 10, 2, 2, 6, 6, 6, 13]])
sage:
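The recursion in the corrected SparseMatrixPower is ordinary square-and-multiply. As a rough sketch, it can be written independently of scipy, with the product passed in as a function. The names power_by_squaring and product are invented for this illustration. The halving shortcut is only valid if the product is associative, which is assumed here and not checked for the kron + kronsum product. Converting p with operator.index rejects a non-integer exponent (for example a matrix passed by mistake) with a TypeError, instead of failing later at p == 1.

```python
import operator


def power_by_squaring(a, p, product):
    """Combine p copies of a with product, halving p at every step."""
    p = operator.index(p)  # raises TypeError if p is not an integer (e.g. a matrix)
    if p < 1:
        raise ValueError("p must be a positive integer")
    if p == 1:
        return a
    half, rest = divmod(p, 2)
    squared_part = power_by_squaring(product(a, a), half, product)
    # For odd p, one extra factor of a is still missing.
    return product(a, squared_part) if rest else squared_part


# Illustration with plain integers, where product is ordinary multiplication.
print(power_by_squaring(2, 10, operator.mul))  # 2**10
print(power_by_squaring(3, 5, operator.mul))   # 3**5
```

With the functions from the answer, the call would look like power_by_squaring(A, 3, SparseMatrixProduct).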


Rosalind - Consensus and Profile - Issue with answer formatting

I am working on the Consensus and Profile problem on Rosalind, and I am so close to getting it done. My answer is correct, I have the right consensus string and the correct matrix, but I am having issues formatting my data for the answer. Rosalind expects the answer to look like:
ATGCAACT
A: 5 1 0 0 5 5 0 0
C: 0 0 1 4 2 0 6 1
G: 1 1 6 3 0 1 0 0
T: 1 5 0 0 0 1 1 6
My raw output looks like this:
{'A': [5, 3, 3, 3, 1, 4, 2, 1, 3, 5, 2, 2, 2, 3, 1, 3, 2, 2, 2, 4, 4, 4, 1, 2, 1, 3, 1, 2, 1, 2, 2, 3, 2, 1, 3, 5, 3, 4, 2, 2, 2, 3, 3, 2, 0, 0, 1, 2, 2, 4, 3, 5, 2, 4, 3, 1, 2, 2, 2, 3], 'C': [2, 1, 3, 2, 1, 2, 2, 1, 3, 2, 1, 2, 3, 2, 6, 3, 4, 1, 2, 0, 3, 2, 4, 2, 1, 3, 3, 3, 6, 2, 2, 1, 5, 5, 3, 0, 1, 1, 2, 3, 3, 5, 3, 2, 1, 2, 3, 5, 0, 2, 3, 2, 3, 2, 5, 3, 4, 3, 2, 4], 'G': [1, 3, 2, 4, 3, 2, 1, 3, 3, 0, 5, 3, 3, 2, 1, 2, 1, 5, 3, 2, 2, 2, 2, 4, 6, 3, 2, 3, 2, 3, 1, 3, 0, 2, 0, 3, 3, 3, 4, 2, 2, 2, 1, 3, 5, 2, 1, 0, 2, 1, 2, 1, 4, 2, 2, 3, 2, 0, 4, 2], 'T': [2, 3, 2, 1, 5, 2, 5, 5, 1, 3, 2, 3, 2, 3, 2, 2, 3, 2, 3, 4, 1, 2, 3, 2, 2, 1, 4, 2, 1, 3, 5, 3, 3, 2, 4, 2, 3, 2, 2, 3, 3, 0, 3, 3, 4, 6, 5, 3, 6, 3, 2, 2, 1, 2, 0, 3, 2, 5, 2, 1]}
And with some simple editing, I submit it as:
'A': [5, 3, 3, 3, 1, 4, 2, 1, 3, 5, 2, 2, 2, 3, 1, 3, 2, 2, 2, 4, 4, 4, 1, 2, 1, 3, 1, 2, 1, 2, 2, 3, 2, 1, 3, 5, 3, 4, 2, 2, 2, 3, 3, 2, 0, 0, 1, 2, 2, 4, 3, 5, 2, 4, 3, 1, 2, 2, 2, 3]
'C': [2, 1, 3, 2, 1, 2, 2, 1, 3, 2, 1, 2, 3, 2, 6, 3, 4, 1, 2, 0, 3, 2, 4, 2, 1, 3, 3, 3, 6, 2, 2, 1, 5, 5, 3, 0, 1, 1, 2, 3, 3, 5, 3, 2, 1, 2, 3, 5, 0, 2, 3, 2, 3, 2, 5, 3, 4, 3, 2, 4]
'G': [1, 3, 2, 4, 3, 2, 1, 3, 3, 0, 5, 3, 3, 2, 1, 2, 1, 5, 3, 2, 2, 2, 2, 4, 6, 3, 2, 3, 2, 3, 1, 3, 0, 2, 0, 3, 3, 3, 4, 2, 2, 2, 1, 3, 5, 2, 1, 0, 2, 1, 2, 1, 4, 2, 2, 3, 2, 0, 4, 2]
'T': [2, 3, 2, 1, 5, 2, 5, 5, 1, 3, 2, 3, 2, 3, 2, 2, 3, 2, 3, 4, 1, 2, 3, 2, 2, 1, 4, 2, 1, 3, 5, 3, 3, 2, 4, 2, 3, 2, 2, 3, 3, 0, 3, 3, 4, 6, 5, 3, 6, 3, 2, 2, 1, 2, 0, 3, 2, 5, 2, 1]
But the editing still doesn't matter because of the metric f**k ton of commas and brackets that would need to be manually deleted as well, especially considering the fact that you only have 5 minutes to submit your answer - I've tried and I have found it impossible to format my answer manually in the five-minute window.
I was wondering if anyone knew of some tips, tricks, or solutions that can help me get over this hurdle. I have seen some other solutions, but they essentially require me to take a different approach to logic, which pisses me off because I spent a lot of time thinking about this answer and also creating my own function that manages the FASTA file format from scratch.
Here is my source code:
data = open('/Users/danielpintard/Downloads/rosalind_cons (1).txt', 'r').read()
if '>' in data :
    data_array = data.split('>')
    for i in data_array:
        if i == '':
            data_array.remove(i)
    for i in data_array: data_array[data_array.index(i)] = i.split('\n', 2)
#create profile
prof_sequences = []
for i in data_array:
    data_array[data_array.index(i)] = i[1]
    prof_sequences.append(i[1])
n = len(prof_sequences[0])
profile_matrix = {
    'A': [0]*n,
    'C': [0]*n,
    'G': [0]*n,
    'T': [0]*n,
}
for dna in prof_sequences:
    for position, nucleotide in enumerate(dna):
        profile_matrix[nucleotide][position] += 1
result = []
#still having a hard time understanding this block of code
for position in range(n):
    max_count = 0
    max_nucleotide = None
    for nucleotide in profile_matrix:
        if profile_matrix[nucleotide][position] > max_count:
            max_count = profile_matrix[nucleotide][position]
            max_nucleotide = nucleotide
    result.append(max_nucleotide)
print(profile_matrix)
print(result)
And here is the data:
>Rosalind_7283
TATTCATTGATCATATGAAGCCTTGCGACCTGCCCGGTTCTGAAGTCAGCTAGCACATTA
GTGTCAAGGTATTAGTGTAGTTGCTGACTCGAACGTGTGTTAATATTCATGTAGGGGTCT
GGCGACCCAATAGGCGCGTGGTGTACCGAATTGTGCACACACACGTGTATTTCGAACGCA
AGATGCAGCCGAATCAGACCGTAGTAAACCGTTTGAGTGGCGTTTTGGCGTGAGAAGGCT
TAGGTGTTACAAGTGCAGCGCGGGTGCATTTTCTCCGGCTTGGAGCAATAGTCCCTATGC
ATCGGCCCCGTATATGAGGATCGCATTACGCAACATCGTAAGCCTTGCACATCTGGCAAA
TGCACGGCTCTCATTATAGTTGCCAAAAATCAGCCCTACCACACGTAATATTCAAGGCTG
TGCTTGTCCAACTAGTTGGCGAATGATCCTCCAAGATTGCGGCGGGGTATAATCCCGCAC
GTCCGAATACCAATGTTCGAGTGCGGCACTACCAAGATGCGAGTCGGCGTGATATCGAGG
TTCACATAGGGGACGTTTATGTCCTTTGGATGTCTCGCCAACTCCATTCTATCATTAGGT
TGGCGGTCAGCAGGATGGAGCATAGTCATGAGCTTGAGTACTGTGCGGCTCGGAAGAGGG
GCCGTATGGGTCTTCGGACAAACGTAGGTATTACAGGCCAAAAACGCTCAGAAAAAACGC
TATCTTAATGACCATTTGATAAACGTTCCCTTGCCGATTTAGAGTGACTTAGTGCAATGT
TGCGATTTCTACTACACTCAAGCTGTGTTAGGGATAATCCATAGCACAGGCCCGCTCGCC
CGTGCCCTGCCTTGCACGACAGGGCTAAGCGGCTCAAGAAGTTTCTACGCAACGTACCAC
GACCAGCTGGACCTACCGATAGACTTACCATATTCTAAGAATAAAACGGACCCTTATGTG
AGTGAGCGCAAGCAATATGGTTTGCCCGTTTGC
>Rosalind_6559
TGCGGCCTATGTGACGCTCAACCCGGGACGCACATGAGTACATCTTTCTTCAACGTCCGC
GAACACAGCCCAATTCGATTAATTGCCACGATTGTGTGGCACGCACTTACTGAAACCGGT
GGTGATAGGCATAGGTTTAGACCAGGGCTCGGGACAGCTGGTCTAGGTCGTGACTAATCA
ATGGTTTAAATGATGCACCCTTGTATCGGTATGCCTGTGTTTATCAGGAATGCCCATACA
TTTTGAGAAACGCTTATGGTTATTACACAAGCGAGGGAAGCGAGCTAGCGGCGTCCGAGA
ACTATAAAGGCAATCCTGACATGACGAGCGCAGAATCACCCCCTGAATCCCGGTTACGAT
ATGGGCCATTCGGGGAGCAAACGGCTGACTCTTCGGTAAAGTAATTTGCCAGGAACATTG
ATATATGCGCTGACCCTATTGATTATCCAACAAATACTACATTCAGCCCCAGGTCCCACG
TTAGGCGTAAGTTAAGAATTTATGTACGCAATCGCCAATATCCGCAAGACGTCCCCGCTG
ACAGTGAGGCTTAGTGGCGCCGATGGTATTCAGAAATGGAGCGCTCCTCTGTTGCACTCG
GCTCGTCAACCTTCTTGCTATTAACATATAAGAGTGAATGGGTGAGGTAGTAAACGTAAT
TGCCAAGATCGATAGAAGTGTTGGACGGAACATTGGAGCAAGGAACGCCGCTGAGCCAGG
GGACTAATGCCAGAGTGGAACCTGGTGGGAATTAAACACTTGTATACGTTGACAAGCTGA
GACATTCTAAAACACGTAATATAACATGCATCCACTAATGGATTCCCTTTCGCCTCTTGG
CTGGGATACATTTGCGCTTGGGAGCAGGAGATAGGGAGTCAAGGTGACATTGTGGGAATT
CACAAAGCTTCCTATCTAATGTTAGTACTTTAGCCACGGGTTAACCAGGACTGTTCTATA
GTTCCAACTCTCCATTATCCAAAACAAGCAGCA
>Rosalind_3098
GGCATAGGGACGCCGATGTAAGGAAATCCTCTAGTTTGAGCCCGGTGCTTACCGCAGTCC
TCGGCTTTCGTTCTGTTACAGACGTCCTAGGACTCAGTCGCCACCTACGCGGGGTGCATT
AGTCAGGTCCGAAGCCTCTATAGCGCTTTTTAGGAATGGAGCGTTTAAACGAGCCTGCGT
ATATTCGCTACCAAATCTCAGGGGCGGCTCAGATACAACGGGGTTCATCAGTTTGGATAT
CAGTGTTCGCTGGGTAAGTCTGACTCCGGCTCACAGATAGTTAGAAGGTCGCACATGATG
ATTACAACTTCTGCCGCTGACTTGGGAGTCTAGCGCTTGTCACAGACGCGCTAATGCGGC
ACATCTATTTCATAAAAGTACAAGCAATATGCCGCGAGGCCCCGTCGTATTTGATTCGAA
GGATTTAACTCATAACGCGGCCCTCAGCAGCTGCGGGCGTACGGAAGCCTCAACTTTGCG
ATTCTGTCGCACCTGCCTAGCTTAAGGAACCCCGATGCCGGTATCCACCGGACGTTTCGA
TTGCAAGATCTTGGCATGCCGCTACCTGTTGGAATTCAGTTATTAGTCCTACTCAGGAGG
GATAGCCGAACGGCACAAAGGCTTCGTTTGACAAGCACAGATGCATCTACTTAACTCGAT
AGCCTCAAAGAGTGTTTGCTCCGAGAGGGCCATCAGAGTAACTACCACGGCAAGAAGCGC
CTTCTTCCATGGCACACTCAAAAAGGTCATCTGAAGAGCCCATTTTTACCCACGGGATCC
CGCCACTAGACTCGTCACACTAAAACATAGAAGCAAGGCTGTAAGCGTACTCGGGTGTCC
CTAGCTACTTGACCCTGCGCTTTGATTTTCACCCAATCCAGCGCGTTAGCCAAACACCGG
CTCATGTGCGAGACACCTCTTGGACGGTACGAATACGCTTACTCCCACTCAGAACTGCTA
TCCGTGGGGTCCGTGGGGAGCCGGCGCAAAGAA
>Rosalind_2635
AACCTAAGCCACCTCGCGGTGTAACGCGCATCTGCAATCATCAGTTTCAGTCGGCGCAGC
GGAGCCCGGACAGCCTGTGCCGTACAAACCTGAAGCTGCTTACCTCGATTCATGCCAGGT
ATGAAGTATTCCGACGCTAATATCCTTTGGAATGGTTGCCAAGTCTCTACCAGCTACTCC
CATGACCGCATGACATATTCGACACGGTCTCTGAATGAGGTACGGTATTGCTTTCATTCT
AGTACGTTGCCCGACCTATGTACATCCGTCAACCACGGGGTGATCATACCTAAATTTGAA
TTAAAAAGTAGCGGAGCTACCGGACTGGTAGACTCCTCATCGCTCGGTTCAGTAGAAGGG
CTGGCCCTTTTCCTATCACTGTCCGTCCATTTCGTGTGTTTTAGGTGGTTTAGATATACC
TCTCATCGAAGAGTTGACCGTGTGATTAAATGAACGAACATTAAAGAGCGTGTGTTTAAA
TGCACGCAACACTAAAGGTGGAACATGGCGGTCGCCGTTATCGCATGGGTCTACTTGATC
GAAACTCAAGAGCATTGCAGACACAGGGACCCGTCAGGGTTTGTAAGCTGCGCGCTAATA
GTGCAACGTCCTAGGGTCGACTCCATGACGTAATGCAACTCTGGTTGACAATTCGTGAAG
TCGGAGTAAAGCTCCTGGCGCGCTGCACCCCCGGCTTCACCGTAGTTCCTACATTCTCGG
TCTAGTCGTGTGGGAATCACATCTGCTCCGAGGGTAAGGGGATTGGCATATAATGTGAGG
TAGCCGGCTAGGCGTATTAGCAACATCGTTGTCTATTGACTTGGAAGTTCTCTGTAGGAC
GTCGTCAGTCGGTAATCGCTGGTTTTAACTAAGGAGACACTGCTGGCACCGATGGCCGGG
GAGACCATTATGTATTCGGAGTGCCTCCGTTGTGGTGAATAACCAGGACTAATGAGGCCA
ACATAATACTAGACGTATACTATTTAGTGCGCT
>Rosalind_6087
ATTCGATGAATTTCCTCGATAGCGGCTCCGATTTAACACTACCTTGCCTTGACTCTCTAC
ACAGTAAGTACCCCCCGCAACTGGGGGACATTTTAGTGGCCCTTTGCGGAGTAGGGGTGT
TAGGTGTCGGCGTAAAGCGGATTCGATCAAACCCTGATCATCGGCTGAAATGGCCTCGAC
GGTGCTACTCTCAGTGACCTGCTGTTCCCGTAGCCTTTTAATACTCAATCCCTCGATCCG
CTATTCGACCAATCTCGAACTTGAATTCGGTGCGAATGAAACTCCAGTACGGTATGGCTT
GGACCGACGACGGAAGGAACTGCAACGTACCGACTTAATTTGGCTTCAATTCCTACCGAG
CATCATGCGGAAGCTACGCAATTGGATCTCAACAACCCCAAGAGACATTATAGTAGGACA
CACTTTATGGGATGCCGGGGACGGCATCTTCTGCAGGTTGGGAGGGCATCTTGCCTAGGT
GCCAACCTTCGGACGCTCAATGCTCTTACGGTCGGCAGGCTGTTCACGGAGGGCCTTATT
GGAAAAAGGTTATTTCACAAACGTTAAGTCCCTCAGATGACGTCTTGCGTCTCGCCAAGC
CTTTCTAGCTCCCGTCCAGGGCTTGAGCTTTCTTGACACGATAGCTTCCACGTTGACTCT
GAAAATCTCGAAAAACCGAAGGGGAGAGATGCGTCTTGGATCGTCCATAATGCTTCAGAC
GCTTCTAGCCTACCAGGTTGGTTAACAAGTTAATCCGCTAACTTATTGGCGCGTGAGCGA
CAGGACCGCGTCAGACTCATAGATACAGGGCTCATGGGGGCTATGTGTCTAATATGATCG
GCGACAAAGAGTTATGTAATGGCTTGGCTAGGAGACATAAAGGGGGACTTGATAGCGTTT
ACGAGCCTGTTCGGCCTCCCAAAGTTAACTAGATGAGACAGGATGTGCCCCGACACCCAC
GACTTCGTAAGGTAGAATAACGGACATAAGTCC
>Rosalind_4481
AAGGTGCTCAGAGACCTCGTTATGGATTGGTAACTATAGCAATTGCTTAAATCACGTTGT
TCAAATTTTGGGAACTGAATATGCTTCGGGCAATAGTATGAGTAGTCTAAATTGGGGAGT
GTAAGTGCGATTGGACACCACAAAGACAGGTAGTGAATGGGAGAGATTTGTTTGTAGCGC
GTTCGTGCGCGGGACGAGAAATGAATATCCTATTATCTGAAACCCGCCGCTGGGGCTGTA
GCGCCAAGAGCTTTCAGCGGGAGCTCCATGCGTGGAATCTTGCATCTACAATCACATATT
GGTAAGTAGCAACACTGACTGCAAGTACCACTCCCAGGAGAAGACTAGCCATTCAGTGTC
GCCGCTCACAAAGGGCGTAAAATGACATTCATGACGGCTAGCAGCGGACCACGATCCGTG
GCTCGCCGACACTCGGAACCATTCTTGTCTAATAGCTCAGCCCCAGGCTTTTCAACAGGG
GGCGACGCGACGAGCCTAATCGTTACGGATAAGGAGTGCGCACTAACTCGTCATCGGGGA
TAGACCAATTCTTGGAAAAGCAATCCTTAATATGATAGCTACTTGATGCATCTGTCGGCC
GGGGGACTGGACTGTCCTGAAATTGCTTAGGACTATATTTGAGCTTCCACTCCCACCCAG
GGGTGAGCAGATCCTGCCAAACGCGTATCCACTTAGATAAGCTCTTTAGCAAGGGGGCAG
CCTTTTTTCATCATGGTCTGCATTCGTGACTGAAATAATTCATCTCCACTGTACGTTACC
ATACCCTGACCACAATTTTTCCCAATGGGGTCATGCAAACGTACACACGTTTTGCGGCTG
GCTGAATTGCCGACTCATTTGTCCCGTATGCTAGCCCTGCTTGGATTCATAATTGTCTCG
CTCCGGACGTATTCGGGCCTGTGACAATCTTCCCACCTCATAGAACGCCCCAGAATACTC
GTTTTGCTGATGTCGCAGAACATTCTCCTCAGA
>Rosalind_0954
CTAATCTTGCGAATCAATCACAGGTGCGTTGATCCAGAGTCGTAGTTTTACAGTATGCAA
TGTATATTCTTTCTGATGGGACGAGTTTGCATGCAGTAGTTGGGTACTATGCCAGTGCGA
GACCGTCCCTCACCTAAATGCTATGCAGGGTTTCTCTACGATCAAATAGTCAAGTTGCTC
AGCCTCATCACATTGTGAATCACGGACAGACTGTAATTGTCAGCGTGTTCTCTAGGCAAA
TCGCCTTCCTTCTATCGACCTCCTTAGGTCCCCGTGAGGATCTCCTTATCCTGAAAAGTA
CAATCGGATACTTAGATTCTTCGCTCACTCTAATAGGTGGCTATACAGAAGTTTTATGGA
TAAGGGGTGTACGAAATCTTCGAGGGTGTATACCGCTGCTAGAACTCCATACATGATAAC
AACCAATCCTTAGCTAGTATACGAGGGATATGATAACGTTCCACCACCTCTTAAACTTTT
AAATTTGATCGCGGGTGGCCGTCGAAGTGTACGTATGAGATTGGGGCGGTTGTAGTTGCC
AGTGAAAGGCATATGCGGATGGCCTTTGGGTCCTGGTCATTCTTTCTCGCAGGTCGAGCC
AGTGCCTCAAATGAAATTTTCTCCTTAGCAACGACTCCTTAGTTAGAGAAACCAATCCCC
CCATGCCTGCGGATCGTGGTCAGCATGACGTCTGGTTGAACCCTTAGCTGAACAGATGGC
GTATTGCCGTACGAGGGGACCTTATAGGCGGCCTACCACACCAGACGAAGAGTCCGAAGG
TACGCCAAACGCATATTCAGGACGTAAGTGGGAGGACCCTGAGCCTCATTGCCGACTGAA
GGTGAATCGCTGGCCCACTGCTAGTTCCTCCCTTCGCTAATGGTCACGGGAATATCGCCA
CCTCGTCGATGACGCTCGATTAGACCTGTAGGAACACAACATACTAGGTGGACACGGGAC
ACCGATTTACCCACGCCGGACAGTCGTTCTTAT
>Rosalind_3750
ACAGTGTCATGGGATCTGGAGACGTATCCAAGCTAAACGCGCGTTCTATACAGACGTCGA
AACACGGGGGGCGAACTGCTTTAGCGACATGCTCTTACTGAAGTCTAGACGCTAAGGGCT
TTAGACAGCGAATAGTGGTTGATAGGTATTGAGCCATCCGTGTAGAGCGTTAGAAGGCCA
CGGCTTACTTGGTTAAAAGCTGATTTGGGCGGTTACATTCTGGGGTTTAAATACTATCGA
GTATCGATGCTTTTCTATGTATTGAAGACTGGTAAGCTTTCCCCGACCAGGTCGCGCCAT
CGTACCTTCTGGGGAAACTAATGCGGCTGAGTCGGCGACTTCAGGATGTCCCGATACACG
CAGCGTCACAGGTAAACTCGCCTTATAACGCGTCCCCGTCGATAAGGCCGACCCTTTCAG
ATGCGCGGTGCTCCTTCGATTGTTGACGACGCCATCCGAGGTCCAGACGTCTGAGGCCAC
GTGATCGGCCCCCTGTTACTGAGAAGCAGATTACCCCTAAGAATCGTCCGTCGCCTAGTA
GTTGCCGCAACCGACGATACTTCTCCAACATAATCTAGCGTATTTATCAAAGCGTCGTCG
TATCTAGCCTTACGGACGTAATACGAATACCCCCTGCTCAGTGGGCATGTAATACGCCAA
CCAAAAACACGCCAGTTACGAGGAGTGGCACTGCTATAAACCTAGATGAGATCGCTGATG
CCACGAGGAACCTTAGTTGAGTCCGCTGAACCCGCCAGTTGGCTTTGCAGGTCCGCGTTG
TTACTATGACTAAAATATATGATGGATACGCGGACCACTCCTACAGATGCTAAAAGTCAA
ACCGGCACCTATTAGATTTTTAACGGTGCACTTCTAACCGACATAGCCCGCGACCAGGGG
TGAAATTGCATTACATACGATATGATCGCTCCCAGGTCAATGACCACTTGACCTGTGAGT
TTGCTTATTAAGGTGGCTTTAGGCAGCGTAAGC
>Rosalind_9350
ATGAATTTTTAGCGCAAATGAACCGCCTGCTTCCATTAAGTCCCCGCTGCAGAAACCTCG
TTTGTATTCAGAAAGTTCACCTGACAACGGGGCATAGGGTAAATAGATGCTATGTAAATC
TTAGGGCTTACGCGGCGACTTTGACTTTTTCAGCGAACAGAGGCGAAGGCGACCAGCGTC
ATAGGTCTTCATACCGAAACAACAGGGGAGCATGGCCAATCACTGTCACTAACTCACGGG
ACTCCGCCTTGCTCGCCGGTGCCATATCGTACTGACGTAACTCATTGAATTCCATAGAAC
TTGGTTTAGGCCACCTCCGCCGAAACCCGTGGTGGTAAGTCAAGCGAGGACACCGGAAAT
TCCGACCCCGGTTCCCAACACAGGGCTATTCATCACATTTGGTGTACGTATTGATCCTTA
ATTGCCAGAGTCCTACTCGTTGATGTACGATCCACTTAAGTAAGGTCGGGCGTTCTACCG
CGCGGCGCATACCGGACATTATAGCTTAGGCCCCCCAGCTCTATTGTTATTACTATATCC
CTAATTCTAGAAGGGAAATTGTAAGATCAATTCCCGGCAGGTGGGCAGGAACAGACGTCG
AGCACCATTCGTAGTAAAGGTCTTTCTCGGTGTGTAGCGTTGACAAATCTGCAACCCAAC
CTTGTACTCTTCGCTGAACAATAGGTGCATTTCAAGACCGAGCTTGGCGCTGTTTCCTGA
CTGCAGCATGGGCAAAATTCTCGTAGGCAAGTGATCAATTAGCGGAACGCATTGGAAAAA
TTTGTTGGCACAATCCGGCACAGGTACTGATACCCCTCGATGTCGCAGTGCCGAGTCACC
CATCGCATGATCTGAGGTTGGTGCTGCCAGCGCTCTCCGAACAGGAGTCGTAGTTGCACT
CATGGCCGCTTTACGACGGGAGAAACTTACAGTAGCCTTGTAACAACTTTGTAAATCGTT
CATGGACTATCGTGAGGCAGACTTCTATTGTCC
>Rosalind_6074
CGAGGTAACAGTTGTCCGTTCTTTGTAGATTGCCTGGGGTGAAGGTACTAGTTAGCAATG
ATCAGAAGAAAATAGAGCCAGCCGGACTCTCGGGGCGGTACCAGGGTCGAGGAATCTGGG
TAAGTTTCCTATGTGATGAACAGGGTTTTCGATGGTAACGATGTGAACGACCCTGGGTCG
GGTTCAGCCCTCCTAACGAAACACGTGCTTCAGAAAAATAGTTGCAACCTGTTGTTGTCA
ACCTAGTCCTATAGAGTATGTTACTCGGCTATACTCAGGACCTATCCAGACCGCCACTCT
TTCTCTGTGTTAAAACCCCACCATATAAGATCCGTCCTCCCTTTTCACCGCCTTTACAGC
AGGGAGCCGTTGAGCAGGGCCAATGACGCCAAGACTTTACTAAAGTGACTGGTAGGTTCA
TTCTACCTATCCCTTTGCGTATTGATGTTTAGTCTGGTTTCAGGTACAGGTAAACCAGGT
GGCTGGTGCCATACTCGCTAAACAAATGTGGGGGCGCGAAAGATCTGGTGCAGGTTGACT
ACGATTTTATAGAGCAGTACACCGTGCTAGTCAGCATGAGTGGAGACACCTGAAATAAGT
GACGAGGTTGTCCAATGTATAGGACGACAGTTGCAGGGTGCACTGCAACAGAGTTATAAC
CATTACGTTGACTTAACACATGATTGTTAAAATGCTTCGACCCAAGACTCGGCGGGTCAA
AGTAAACCATTACGCGCGGGTGTCTGTAGCTACGGGTCAGCAGGGACCTAGCTATTACGA
GATAGGAAGGCCCACGTACCTAGGGGTCCCTTTTTCGGGTCTTTACCTGGTCAGCGAAGC
CCCGAAACGTGAACTCCAGTGATAACAGGTTAACGGCTTCTGGTGACGACTCTATCGAGT
TGTCAATGTAGCTTACAGGTACTATCGGGAATAATGTCGGGGGTGAACGTTGCGGTTTAA
AGTGGCTCAGCAAGCATATACACCTAGGTTGCG
Try using format strings:
f'{expression}'
str.join()
dict.items()
Code:
d = {'A': [5, 3, 3, 3, 1, 4, 2, 1, 2, 3], 'C': [2, 1, 3, 2, 1, 2, 2, 1, 3, 3], 'G': [1, 3, 2, 4, 3, 2, 1, 3, 3, 0], 'T': [2, 3, 2, 1, 5, 2, 5, 5, 1, 3]}
for k, v in d.items():  # loop over your output
    g = " ".join(str(count) for count in v)  # join the list values with spaces
    print(f'{k}: {g}')  # format the line as "A: 5 3 3 ..."
Result:
A: 5 3 3 3 1 4 2 1 2 3
C: 2 1 3 2 1 2 2 1 3 3
G: 1 3 2 4 3 2 1 3 3 0
T: 2 3 2 1 5 2 5 5 1 3
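Building on this, here is a minimal sketch that produces the whole submission, the consensus string followed by the four count lines, in one go. The helper name format_rosalind_answer and the tiny example profile are made up for illustration. In the question's script the arguments would be result and profile_matrix.

```python
def format_rosalind_answer(consensus, profile):
    """Return the consensus string followed by one 'X: counts' line per nucleotide."""
    lines = ["".join(consensus)]  # list of characters -> single string
    for nucleotide in "ACGT":  # fixed order, as in the expected output
        counts = " ".join(str(count) for count in profile[nucleotide])
        lines.append(f"{nucleotide}: {counts}")
    return "\n".join(lines)


# Small made-up profile, only to show the layout.
example_profile = {"A": [2, 0, 0], "C": [0, 1, 0], "G": [0, 0, 3], "T": [1, 2, 0]}
example_consensus = ["A", "T", "G"]
print(format_rosalind_answer(example_consensus, example_profile))
```

Replacing the two print calls at the end of the script with print(format_rosalind_answer(result, profile_matrix)) should give text that can be pasted without manual editing.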

prediction to actual label and export result to csv

import pandas as pd
import os
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
predictions = model.predict_generator(Br_test_generator, steps=test_steps_per_epoch)
predicted_classes = np.argmax(predictions, axis=1)
predicted_classes
output= array([3, 1, 0, 3, 5, 0, 0, 0, 6, 0, 0, 3, 6, 0, 1, 0, 0, 2, 2, 2, 2, 2,
1, 1, 0, 2, 2, 6, 0, 0, 0, 1, 1, 0, 0, 2, 0, 1, 1, 1, 1, 1, 1, 1,
6, 0, 5, 1, 3, 1, 0, 2, 2, 1, 1, 1, 1, 2, 2, 2, 4, 1, 5, 1, 0, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 5, 2, 2, 5, 2, 5, 5, 5, 2, 2, 2, 2,
1, 3, 5, 5, 2, 2, 5, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 2, 1, 2, 1, 5,
2, 2, 2, 5, 3, 1, 3, 3, 1, 3, 3, 3, 1, 1, 0, 1, 5, 0, 2, 5, 5, 4,
4, 4, 4, 4, 6, 4, 4, 4, 5, 0, 4, 4, 4, 4, 4, 5, 6, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 1, 2, 2, 5, 5, 6, 5,
5, 6, 1, 6, 4, 5, 4, 1, 4, 5, 0, 2, 5, 5, 5, 2, 2, 2, 6, 6, 5, 6,
6, 6, 6, 6, 4, 6, 2, 6, 6, 2, 0, 2, 5, 6, 6, 6, 4, 4, 0, 6])
true_classes = Bre_test_generator.classes
class_labels = list(Bre_test_generator.class_indices.keys())
class_labels
output=['1B', '2B', '3B', 'CA', 'FB', 'MB', 'NB']
I want to map each entry of predicted_classes to its corresponding label in class_labels, and I also want to export the result to a CSV file.
The CSV should have two columns: the image ID and the predicted class label.
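One possible approach, sketched with the standard csv module: index class_labels with each predicted class index and write one row per image. The helper write_predictions, the column names and the image IDs below are hypothetical. With a Keras directory-based generator the IDs could probably be taken from Bre_test_generator.filenames, and the row order only matches the predictions if the generator was created with shuffle=False. Both points are assumptions about a setup the question does not show.

```python
import csv
import io


def write_predictions(handle, image_ids, predicted_classes, class_labels):
    """Write one 'image_id,predicted_label' row per prediction to an open file."""
    writer = csv.writer(handle)
    writer.writerow(["image_id", "predicted_label"])
    for image_id, class_index in zip(image_ids, predicted_classes):
        # int() also accepts NumPy integer scalars such as those from np.argmax
        writer.writerow([image_id, class_labels[int(class_index)]])


class_labels = ['1B', '2B', '3B', 'CA', 'FB', 'MB', 'NB']
predicted_classes = [3, 1, 0]  # first three predictions from the question
image_ids = ["img_000.png", "img_001.png", "img_002.png"]  # hypothetical IDs

buffer = io.StringIO()  # stands in for open("predictions.csv", "w", newline="")
write_predictions(buffer, image_ids, predicted_classes, class_labels)
print(buffer.getvalue())
```

To write a real file, pass a handle opened with open("predictions.csv", "w", newline="") instead of the in-memory buffer.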

reshape numpy 3D array to 2D

I have a very big array with shape (32, 3, 1e6).
I need to reshape it to shape (3, 32e6).
On a small example, how do I go from this::
>>> m3_3_5
array([[[8, 4, 1, 0, 0],
[6, 8, 5, 5, 2],
[1, 1, 1, 1, 1]],
[[8, 7, 1, 0, 3],
[2, 8, 5, 5, 2],
[1, 1, 1, 1, 1]],
[[2, 4, 0, 2, 3],
[2, 5, 5, 3, 2],
[1, 1, 1, 1, 1]]])
to this::
>>> res3_15
array([[8, 4, 1, 0, 0, 8, 7, 1, 0, 3, 2, 4, 0, 2, 3],
[6, 8, 5, 5, 2, 2, 8, 5, 5, 2, 2, 5, 5, 3, 2],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
I did try various combinations with reshape with no success::
>>> dd.T.reshape(3, 15)
array([[8, 8, 2, 6, 2, 2, 1, 1, 1, 4, 7, 4, 8, 8, 5],
[1, 1, 1, 1, 1, 0, 5, 5, 5, 1, 1, 1, 0, 0, 2],
[5, 5, 3, 1, 1, 1, 0, 3, 3, 2, 2, 2, 1, 1, 1]])
>>> dd.reshape(15, 3).T.reshape(3, 15)
array([[8, 0, 8, 2, 1, 8, 0, 8, 2, 1, 2, 2, 5, 2, 1],
[4, 0, 5, 1, 1, 7, 3, 5, 1, 1, 4, 3, 5, 1, 1],
[1, 6, 5, 1, 1, 1, 2, 5, 1, 1, 0, 2, 3, 1, 1]])
a.transpose([1,0,2]).reshape(3,15) will do what you want (I am basically following the comments by @hpaulj).
In [14]: a = np.array([[[8, 4, 1, 0, 0],
                        [6, 8, 5, 5, 2],
                        [1, 1, 1, 1, 1]],
                       [[8, 7, 1, 0, 3],
                        [2, 8, 5, 5, 2],
                        [1, 1, 1, 1, 1]],
                       [[2, 4, 0, 2, 3],
                        [2, 5, 5, 3, 2],
                        [1, 1, 1, 1, 1]]])
In [15]: a.transpose([1,0,2]).reshape(3,15)
Out[15]:
array([[8, 4, 1, 0, 0, 8, 7, 1, 0, 3, 2, 4, 0, 2, 3],
[6, 8, 5, 5, 2, 2, 8, 5, 5, 2, 2, 5, 5, 3, 2],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
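Why the transpose is needed: reshape only regroups the elements in their existing order, so the axis of length 3 has to be moved to the front first. A small sketch that checks this on a made-up (4, 3, 5) array and uses -1 so that the second dimension does not have to be computed by hand. The variable names are illustrative only.

```python
import numpy as np

blocks, rows, cols = 4, 3, 5
a = np.arange(blocks * rows * cols).reshape(blocks, rows, cols)

# Move the length-3 axis to the front, then merge the remaining two axes.
res = a.transpose(1, 0, 2).reshape(rows, -1)

# Reference result: the blocks a[0], a[1], ... placed side by side.
expected = np.concatenate(list(a), axis=1)

print(res.shape)
print(np.array_equal(res, expected))
```

For the array in the question the same call would be a.transpose(1, 0, 2).reshape(3, -1). Note that reshaping a transposed array generally has to copy the data, which matters for an array of that size.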
You can also get the desired behavior with np.hstack:
# g is your (3,3,5) array from above
reshaped = np.hstack([g[i, :, :] for i in range(3)])  # list of the three (3, 5) slices
reshaped_simpler = np.hstack(g)  # produces output equivalent to the statement above
print(reshaped)  # shape (3, 15)
Output
array([[8, 4, 1, 0, 0, 8, 7, 1, 0, 3, 2, 4, 0, 2, 3],
[6, 8, 5, 5, 2, 2, 8, 5, 5, 2, 2, 5, 5, 3, 2],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])

Adding an array to the end of another Python

I'm very new to Python and I have been given the task of collecting several arrays into another array, inside a loop.
So if you had
a = np.array([2,3,4,3,4,4,5,3,2,3,4])
and
b = np.array([1,1,1,1,1,2,23,2,3,3,3])
and
c = np.array([])
and wanted the result
c = [[2,3,4,3,4,4,5,3,2,3,4],
[1,1,1,1,1,2,23,2,3,3,3]]
so that c[0,:] would give [2,3,4,3,4,4,5,3,2,3,4].
I tried using c = [c, np.array(a)], and then in the next iteration c = [c, np.array(b)],
but if I do c[0,:] I get the error message list indices must be integers not tuples
EDIT:
When I print c it gives [array([2,3,4,3,4,4,5,3,2,3,4],dtype = uint8)]
Do you have any ideas?
In [10]: np.vstack((a,b))
Out[10]:
array([[ 2, 3, 4, 3, 4, 4, 5, 3, 2, 3, 4],
[ 1, 1, 1, 1, 1, 2, 23, 2, 3, 3, 3]])
EDIT: Here's an example of using it in a loop to gradually build a matrix:
In [14]: c = np.random.randint(0, 10, 10)
In [15]: c
Out[15]: array([9, 5, 9, 7, 3, 0, 1, 9, 2, 0])
In [16]: for _ in range(10):
   ....:     c = np.vstack((c, np.random.randint(0, 10, 10)))
   ....:
In [17]: c
Out[17]:
array([[9, 5, 9, 7, 3, 0, 1, 9, 2, 0],
[0, 8, 1, 9, 7, 5, 4, 2, 1, 2],
[2, 1, 4, 2, 9, 6, 7, 1, 3, 2],
[6, 0, 7, 9, 1, 9, 8, 5, 9, 8],
[8, 1, 0, 9, 6, 6, 6, 4, 8, 5],
[0, 0, 5, 0, 6, 9, 9, 4, 6, 9],
[4, 0, 9, 8, 6, 0, 2, 2, 7, 0],
[1, 3, 4, 8, 2, 2, 8, 7, 7, 7],
[0, 0, 4, 8, 3, 6, 5, 6, 5, 7],
[7, 1, 3, 8, 6, 0, 0, 3, 9, 0],
[8, 5, 7, 4, 7, 2, 4, 8, 6, 7]])
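Each np.vstack call allocates a new array, so for long loops a common alternative is to collect the rows in a plain Python list and stack once at the end. A minimal sketch using the a and b from the question. The list name rows is made up for this example.

```python
import numpy as np

a = np.array([2, 3, 4, 3, 4, 4, 5, 3, 2, 3, 4])
b = np.array([1, 1, 1, 1, 1, 2, 23, 2, 3, 3, 3])

rows = []  # plain list, cheap to append to inside the loop
for row in (a, b):  # stands in for whatever the real loop produces
    rows.append(row)

c = np.vstack(rows)  # one 2D array, built once after the loop
print(c.shape)
print(c[0, :])
```

After stacking, c is a real 2D array, so c[0, :] works as intended.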
The most NumPy-idiomatic way is to use np.array:
>>> c = np.array((a,b))
>>>
>>> c
array([[ 2, 3, 4, 3, 4, 4, 5, 3, 2, 3, 4],
[ 1, 1, 1, 1, 1, 2, 23, 2, 3, 3, 3]])
You may try this:
>>> c = [list(a), list(b)]
>>> c
[[2, 3, 4, 3, 4, 4, 5, 3, 2, 3, 4], [1, 1, 1, 1, 1, 2, 23, 2, 3, 3, 3]]
You can concatenate arrays in numpy. For this to work, they must have the same size in all dimensions except the concatenation direction.
If you just say
>>> c = np.concatenate([a,b])
you will get
>>> c
array([ 2, 3, 4, 3, 4, 4, 5, 3, 2, 3, 4, 1, 1, 1, 1, 1, 2,
23, 2, 3, 3, 3])
So in order to achieve what you want you first have to add another dimension to your vectors a and b like so
>>> a[None,:]
array([[2, 3, 4, 3, 4, 4, 5, 3, 2, 3, 4]])
or equivalently
>>> a[np.newaxis,:]
array([[2, 3, 4, 3, 4, 4, 5, 3, 2, 3, 4]])
So you could do the following:
>>> c = np.concatenate([a[None,:],b[None,:]],axis = 0)
>>> c
array([[ 2, 3, 4, 3, 4, 4, 5, 3, 2, 3, 4],
[ 1, 1, 1, 1, 1, 2, 23, 2, 3, 3, 3]])

convert matrix to image

How would I go about converting a list of lists of ints into a matrix plot in Python?
The example data set is:
[[3, 5, 3, 5, 2, 3, 2, 4, 3, 0, 5, 0, 3, 2],
[5, 2, 2, 0, 0, 3, 2, 1, 0, 5, 3, 5, 0, 0],
[2, 5, 3, 1, 1, 3, 3, 0, 0, 5, 4, 4, 3, 3],
[4, 1, 4, 2, 1, 4, 5, 1, 2, 2, 0, 1, 2, 3],
[5, 1, 1, 1, 5, 2, 5, 0, 4, 0, 2, 4, 4, 5],
[5, 1, 0, 4, 5, 5, 4, 1, 3, 3, 1, 1, 0, 1],
[3, 2, 2, 4, 3, 1, 5, 5, 0, 4, 3, 2, 4, 1],
[4, 0, 1, 3, 2, 1, 2, 1, 0, 1, 5, 4, 2, 0],
[2, 0, 4, 0, 4, 5, 1, 2, 1, 0, 3, 4, 3, 1],
[2, 3, 4, 5, 4, 5, 0, 3, 3, 0, 2, 4, 4, 5],
[5, 2, 4, 3, 3, 0, 5, 4, 0, 3, 4, 3, 2, 1],
[3, 0, 4, 4, 4, 1, 4, 1, 3, 5, 1, 2, 1, 1],
[3, 4, 2, 5, 2, 5, 1, 3, 5, 1, 4, 3, 4, 1],
[0, 1, 1, 2, 3, 1, 2, 0, 1, 2, 4, 4, 2, 1]]
To give you an idea of what I'm looking for, the function MatrixPlot in Mathematica gives me this image for this data set:
Thanks!
You may try
from pylab import *
A = rand(5,5)
figure(1)
imshow(A, interpolation='nearest')
grid(True)
Perhaps matshow() from matplotlib is what you need.
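A minimal sketch of the matshow route, assuming matplotlib is installed. It uses only the top-left corner of the example data, selects the non-interactive Agg backend so that it also runs without a display, and saves to an in-memory buffer. The variable names are illustrative.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

data = [[3, 5, 3, 5],
        [5, 2, 2, 0],
        [2, 5, 3, 1]]  # top-left corner of the example data set

fig, ax = plt.subplots()
image = ax.matshow(data)  # one coloured cell per matrix entry
fig.colorbar(image)  # legend mapping colours to values

png_buffer = io.BytesIO()  # in-memory stand-in for a file such as "matrix.png"
fig.savefig(png_buffer, format="png")
plt.close(fig)
```

In an interactive session, plt.show() can be called instead of saving the figure.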
You can also use pyplot from matplotlib; the code follows:
from matplotlib import pyplot as plt
plt.imshow(
[[3, 5, 3, 5, 2, 3, 2, 4, 3, 0, 5, 0, 3, 2],
[5, 2, 2, 0, 0, 3, 2, 1, 0, 5, 3, 5, 0, 0],
[2, 5, 3, 1, 1, 3, 3, 0, 0, 5, 4, 4, 3, 3],
[4, 1, 4, 2, 1, 4, 5, 1, 2, 2, 0, 1, 2, 3],
[5, 1, 1, 1, 5, 2, 5, 0, 4, 0, 2, 4, 4, 5],
[5, 1, 0, 4, 5, 5, 4, 1, 3, 3, 1, 1, 0, 1],
[3, 2, 2, 4, 3, 1, 5, 5, 0, 4, 3, 2, 4, 1],
[4, 0, 1, 3, 2, 1, 2, 1, 0, 1, 5, 4, 2, 0],
[2, 0, 4, 0, 4, 5, 1, 2, 1, 0, 3, 4, 3, 1],
[2, 3, 4, 5, 4, 5, 0, 3, 3, 0, 2, 4, 4, 5],
[5, 2, 4, 3, 3, 0, 5, 4, 0, 3, 4, 3, 2, 1],
[3, 0, 4, 4, 4, 1, 4, 1, 3, 5, 1, 2, 1, 1],
[3, 4, 2, 5, 2, 5, 1, 3, 5, 1, 4, 3, 4, 1],
[0, 1, 1, 2, 3, 1, 2, 0, 1, 2, 4, 4, 2, 1]], interpolation='nearest')
plt.show()
The output would be:
