I have a function that builds an adjacency matrix. To make the matrix easier for humans to read, I decided to print the row index like this:
Now I want to print the column index in the same way, but I can't get it right. The best result I get is this:
Any ideas or suggestions on how I can print the column indexes neatly?
The source code is here:
def generate_adjacency_matrix(vertices):
    # Create empty matrix
    matrix = [['.' for _ in range(len(vertices))] for _ in range(len(vertices))]
    # Fill matrix
    for row in range(len(matrix)):
        for num in range(len(matrix)):
            if num in vertices[row]:
                matrix[row][num] = '1'
    # Print column numbers
    numbers = list(range(len(matrix)))
    for i in range(len(numbers)):
        numbers[i] = str(numbers[i])
    print(' ', numbers)
    # Print matrix and row numbers
    for i in range(len(matrix)):
        if len(str(i)) == 1:
            print(str(i) + ' ', matrix[i])
        else:
            print(i, matrix[i])
In case it matters, the parameter of my function is a dictionary that looks like:
{0:[1],
1:[0,12,8],
2:[3,8,15]
....
20:[18]
}
If you know you're only going up to 20, then just pad everything to 2 characters.
For the header row (zfill is a string method, so convert to str first):
numbers[i] = str(numbers[i]).zfill(2)
For the other rows, set to ". " or ".1" or something else that looks neat.
That would seem to be the easiest way.
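A minimal sketch of the padding idea, assuming Python 3 and a vertices dictionary shaped like the one in the question. The helper name format_padded_matrix and its width parameter are made up for illustration. It builds the text itself instead of printing lists, so every header label and every cell takes the same width:

```python
def format_padded_matrix(vertices, width=2):
    """Return the adjacency matrix as text with every field padded to `width`."""
    n = len(vertices)
    # Header: blank corner cell, then zero-padded column numbers.
    header = ' ' * width + ' ' + ' '.join(str(c).zfill(width) for c in range(n))
    lines = [header]
    for r in range(n):
        cells = ('1' if c in vertices[r] else '.' for c in range(n))
        # Row label is zero-padded; cells are right-aligned to the same width.
        lines.append(str(r).zfill(width) + ' ' + ' '.join(cell.rjust(width) for cell in cells))
    return '\n'.join(lines)


# Hypothetical three-vertex graph, not the data from the question.
example = {0: [1], 1: [0, 2], 2: [1]}
print(format_padded_matrix(example))
```

For graphs with 100 or more vertices, width would have to be raised to 3.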
An alternative is to use two column header rows, one above the other: the first holds the tens digit and the second the units digit. That lets you keep a cell width of 1 in the table as well, which you may need.
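The two-row header could be sketched like this, assuming Python 3 and fewer than 100 vertices (so the tens part is a single digit). The function name format_matrix_two_row_header is hypothetical, and the ring-shaped example graph is invented for the demo:

```python
def format_matrix_two_row_header(vertices):
    """Return the adjacency matrix as text with tens/units column headers."""
    n = len(vertices)
    label_width = len(str(n - 1))   # width of the widest row label
    pad = ' ' * (label_width + 1)   # blank space above the row labels
    # Tens digit (blank for columns 0-9) on the first line, units on the second.
    tens = ''.join(str(c // 10) if c >= 10 else ' ' for c in range(n))
    units = ''.join(str(c % 10) for c in range(n))
    lines = [pad + tens, pad + units]
    for r in range(n):
        row = ''.join('1' if c in vertices[r] else '.' for c in range(n))
        lines.append(str(r).rjust(label_width) + ' ' + row)
    return '\n'.join(lines)


# Hypothetical 12-vertex ring: vertex i points to vertex i + 1 (wrapping around).
ring = {i: [(i + 1) % 12] for i in range(12)}
print(format_matrix_two_row_header(ring))
```

Each cell stays one character wide, and a column number is read top to bottom.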
I am trying to apply the following code to this column:
Test
"Find the number behind the lines"
"Look at the sky"
"It is a such wonderful day today"
In the following code, docs is a list of documents; in my case they should be the rows in the Test column.
D = np.zeros((len(docs), len(docs)))
for i in range(len(docs)):
    for j in range(len(docs)):
        if i == j:
            continue
        if i > j:
            D[i, j] = D[j, i]
How can I apply it to my column?
The code above assumes that docs is the list of strings/rows (each a list of words) and computes the array of pairwise distances D. The problem is applying it to a column.
The expected output (which I cannot produce with the code above, unfortunately) would be the similarity between a reference sentence and the other sentences. i and j are my indices, and they run through each row in the Test column. The algorithm I am going to use is the mover's distance.
def f(docs):
    D = np.zeros((len(docs), len(docs)))
    for i in range(len(docs)):
        for j in range(len(docs)):
            if i == j:
                continue
            if i > j:
                D[i, j] = D[j, i]
df.Test.apply(lambda x: f(x))
From your question, I understood that you want to label the rows and columns with the strings in docs. If that's right, try this:
import numpy as np
import pandas as pd
docs = ["Find the number behind the lines", "Look at the sky", "It is a such wonderful day today"]
D = np.zeros((len(docs), len(docs)))
df = pd.DataFrame(D, columns=docs, index=docs)
print(df)
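One way to feed a column into the pairwise loop is to turn it into a plain list first and pass the whole list to the function, rather than calling apply on each row. The sketch below assumes Python 3 and uses the Jaccard distance between word sets purely as a stand-in; it is not the mover's distance from the question. The names jaccard_distance and pairwise_distances are hypothetical:

```python
def jaccard_distance(a, b):
    """Placeholder distance: 1 - (shared words / all words), on lowercase word sets."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    if not union:
        return 0.0
    return 1.0 - len(words_a & words_b) / len(union)


def pairwise_distances(docs, distance):
    """Fill a symmetric matrix D where D[i][j] = distance(docs[i], docs[j])."""
    n = len(docs)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):      # compute the upper triangle only
            D[i][j] = distance(docs[i], docs[j])
            D[j][i] = D[i][j]          # mirror into the lower triangle
    return D


# The three rows of the Test column; with pandas this would be
# docs = df["Test"].tolist()
docs = [
    "Find the number behind the lines",
    "Look at the sky",
    "It is a such wonderful day today",
]
D = pairwise_distances(docs, jaccard_distance)
for row in D:
    print(row)
```

Swapping in a real distance only requires passing a different function as the second argument.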
I need to check whether the numbers in my NxM matrix (a NumPy array) are in gradeScale. If, for example, the number 8 is in my matrix, I would like to append the number to an empty list and the row number to another list.
So how do I check whether a number in my matrix isn't in gradeScale? I have tried different types of loops, but they don't work.
wrongNumber = []
Rows = []
gradeScale = np.array([-3,0,2,4,7,10,12])
if there is a number in matrix which is not in gradeScale
    wrongNumber.append[number]
    Rows.append[rownumber]
print("the grade {} in line {} is out of range",format(wrongNumber),
      format(Rows))
You can use numpy.ndarray.shape to go through your rows.
for row in range(matrix.shape[0]):
    for x in matrix[row]:
        if x not in gradeScale:
            wrongNumber.append(x)
            Rows.append(row)
In addition, you do not use format correctly. Your print statement should be
print("The grade {} in line {} is out of range".format(wrongNumber, Rows))
The following post has more information on formatting: String formatting in Python.
Example
import numpy as np
wrongNumber = []
Rows = []
matrix = np.array([[1,2],[3,4],[5,6],[7,8]])
gradeScale = [1,3,4,5,8]
for row in range(matrix.shape[0]):
    for x in matrix[row]:
        if x not in gradeScale:
            wrongNumber.append(x)
            Rows.append(row)
print("The grades {} in lines {} (respectively) are out of range.".format(wrongNumber, Rows))
Output
The grades [2, 6, 7] in lines [0, 2, 3] (respectively) are out of range.
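A for loop with enumerate() is probably what you are looking for. Note that enumerate(matrix) yields whole rows, so an inner loop over each row is still needed.
Example:
for rowNumber, row in enumerate(matrix):
    for number in row:
        if number not in gradeScale:
            wrongNumber.append(number)
            Rows.append(rowNumber)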
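The same check can also be done without explicit loops. This is a sketch assuming NumPy is installed, reusing the example matrix and gradeScale from the answer above; np.isin builds a boolean mask and np.nonzero returns the indices where the mask is True:

```python
import numpy as np

matrix = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
gradeScale = np.array([1, 3, 4, 5, 8])

# Boolean mask: True where the entry is NOT one of the allowed grades.
invalid = ~np.isin(matrix, gradeScale)
# Row indices of the invalid entries, in row-major order.
rows = np.nonzero(invalid)[0]

# tolist() converts NumPy scalars to plain Python ints for clean printing.
wrongNumber = matrix[invalid].tolist()
Rows = rows.tolist()
print("The grades {} in lines {} (respectively) are out of range.".format(wrongNumber, Rows))
```

Boolean indexing with matrix[invalid] walks the array in the same row-major order as np.nonzero, so the two lists stay aligned.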
I have an exercise. The input data will contain the total count of pairs to process in the first line.
The following lines will contain pairs themselves - one pair at each line.
Answer should contain the results separated by spaces.
My code:
n = int(raw_input())
sum = 0
for i in range(n):
    y = raw_input().split(" ")
    for i in y:
        sum = sum + int(i)
print sum
With my code, I get the sum of everything together, but I want the results separated by spaces. Thanks for your help.
With your current code you get the total sum of all the given numbers. To get the sum per line, you need to initialize your counter inside the outer loop and then print it. Since you want everything printed on the same line, there are several ways to do it, such as saving the sums in a list or telling print not to emit a newline, which in Python 2 is done by adding a comma at the end, as in print x, . With that in mind, the changes needed are:
n = int(raw_input())
for i in range(n):
    pairs = raw_input().split()  # by default split uses whitespace
    pair_sum = 0
    for p in pairs:
        pair_sum += int(p)  # a += b is the same as a = a + b
    print pair_sum,
print ""  # print a newline so any later print does not continue on the same line
That was the version that prints per line; next is the version using a list:
n = int(raw_input())
resul_per_line = []
for i in range(n):
    pairs = raw_input().split()  # by default split uses whitespace
    pair_sum = 0
    for p in pairs:
        pair_sum += int(p)  # a += b is the same as a = a + b
    resul_per_line.append(str(pair_sum))  # convert each number to a string to use with join below
print " ".join(resul_per_line)
With either of the above, say for example that the input data is
3
1 2
40 50
600 700
then the result would be
3 90 1300
Some parts of the above code can be simplified by using built-in functions like map and sum; for example, this part
pair_sum = 0
for p in pairs:
    pair_sum += int(p)
can become
pair_sum = sum( map(int,pairs) )
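The answers here target Python 2 (raw_input, the print statement). For Python 3, the same idea could be sketched as below. The function name sum_pairs is made up, and it takes the input as a list of lines instead of reading stdin so that it is easy to try out:

```python
def sum_pairs(lines):
    """Return the per-line sums, joined by spaces, for the given input lines."""
    n = int(lines[0])  # first line: number of pairs that follow
    sums = [sum(map(int, line.split())) for line in lines[1:n + 1]]
    return " ".join(map(str, sums))


# Same sample input as above.
print(sum_pairs(["3", "1 2", "40 50", "600 700"]))  # prints: 3 90 1300
```

To read from standard input instead, the lines could come from sys.stdin.read().splitlines().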
Uh oh, it looks like you're reusing the same variable i in the inner loop as the outer loop -- this is bad practice and can lead to bugs down the road.
What you're currently doing is adding both elements of each pair to sum and then printing that at the end. You can fix this in two different ways:
You can sum each pair, convert the sum to a string, and then concatenate it with the rest of the sums as strings, or
You can print the sum of each pair immediately after summing it with print sum, (note the trailing comma), which prints the number without the newline so that all the results appear on a single line.
So I have defined a two-dimensional list in Python using:
column = 3
row = 2
Matrix = [['' for i in range(column)] for j in range(row)]
Then I started adding values to it:
Matrix[0][0] += 'A'
Matrix[1][0] += 'AB'
Matrix[2][0] += 'ABC'
Matrix[0][1] += 'X'
Matrix[1][1] += 'XY'
Matrix[2][1] += 'XYZ'
Then I started printing, hoping for some sort of format:
for i in range(0, row, 1):
    for j in range(0, column, 1):
        print(Matrix[i][j] + '\t')
I was hoping to get a result like
A AB ABC
X XY XYZ
But actually I got:
A
AB
ABC
X
XY
XYZ
Just wondering what is wrong with my code...
The print function adds a newline at the end.
In Python 2, a trailing comma signals that you don't want a newline. In Python 3, a trailing comma inside the call has no effect, so pass end="" instead.
Python 3
print(Matrix[i][j], "\t", end="")
Python 2.7
print Matrix[i][j],"\t",
You only want a newline for each i, not for each j. print adds a newline implicitly, so you need to suppress it explicitly:
Python 3:
for i in range(0, row, 1):
    for j in range(0, column, 1):
        print(Matrix[i][j] + '\t', end="")  # <-- end="" means no newline
    print('')  # <-- implicit newline, only in row loop
Python 2:
for i in range(0, row, 1):
    for j in range(0, column, 1):
        print Matrix[i][j] + '\t',  # <-- comma at the end means no newline
    print('')  # <-- implicit newline, only in row loop
You can use the sep argument in Python 3:
for row in zip(*Matrix):
    print(*row, sep='\t')
You have the rows as columns in your matrix, so you'll need to zip it to get at the rows first.
Then, you can print the individual elements in the row, with a TAB between them.
In Python 2, this would be:
import itertools
for row in itertools.izip(*Matrix):
    print('\t'.join(row))
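To see what the zip step does, here is a small Python 3 sketch. It assumes the matrix really has three inner lists of two items each, which is what the question's assignments imply (the question's own Matrix has only two inner lists, so this layout is an assumption):

```python
# Filled the way the question fills it: each inner list holds one column.
Matrix = [['A', 'X'], ['AB', 'XY'], ['ABC', 'XYZ']]

# zip(*Matrix) pairs up the j-th element of every inner list, i.e. it transposes.
rows = list(zip(*Matrix))
for row in rows:
    print(*row, sep='\t')
```

After the transpose, rows holds the tuples ('A', 'AB', 'ABC') and ('X', 'XY', 'XYZ'), and each one is printed on its own line.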
First, you may want to check how you added your values. With only two
rows, Matrix[2][0] raises 'IndexError: list index out of range'.
Adding your values should look like this:
Matrix[0][0] = 'A'
Matrix[0][1] = 'AB'
Matrix[0][2] = 'ABC'
Matrix[1][0] = 'X'
Matrix[1][1] = 'XY'
Matrix[1][2] = 'XYZ'
After that, all you have to do is iterate over your rows, calling join() on each row.
for i in range(row):
    print '\t'.join(Matrix[i])
That will print your desired result:
A AB ABC
X XY XYZ
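The loop above uses the Python 2 print statement. A Python 3 sketch of the same join-based approach is shown here; the helper name format_matrix is hypothetical, and it returns the text so that the result can be inspected before printing:

```python
def format_matrix(matrix):
    """Join the cells of each row with tabs and the rows with newlines."""
    return '\n'.join('\t'.join(row) for row in matrix)


# Two rows of three cells, indexed as Matrix[row][column].
Matrix = [['A', 'AB', 'ABC'], ['X', 'XY', 'XYZ']]
print(format_matrix(Matrix))
```

join only accepts strings, so a matrix of numbers would need str() applied to each cell first.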
I have a nested list comprehension that creates a list of six lists of ~29,000 items each. I'm trying to parse this list of final data and create six separate dictionaries from it. Right now the code is very unpythonic; I need the right statement to accomplish the following:
1.) Create six dictionaries from a single statement.
2.) Scale to a list of any length, i.e., without hardcoding a counter as shown.
I've run into multiple issues, and have tried the following:
1.) Using while loops
2.) Using break statements, which break out of the innermost loop but then do not properly create the other dictionaries. Also break statements controlled by a binary switch.
3.) if/else conditions for n indices; the indices iterate from 1 to 29,000, then repeat.
Note the ellipses designate code omitted for brevity.
# Parse csv files for samples, creating a dictionary of key, value pairs and multiple lists.
with open('genes_1') as f:
    cread_1 = list(csv.reader(f, delimiter = '\t'))
    sample_1_values = [j for i, j in (sorted([x for x in {i: float(j)
                       for i, j in cread_1}.items()], key = lambda v: v[1]))]
    sample_1_genes = [i for i, j in (sorted([x for x in {i: float(j)
                      for i, j in cread_1}.items()], key = lambda v: v[1]))]
...
# Compute row means.
mean_values = []
for i, (a, b, c, d, e, f) in enumerate(zip(sample_1_values, sample_2_values, sample_3_values, sample_4_values, sample_5_values, sample_6_values)):
    mean_values.append((a + b + c + d + e + f)/6)
# Provide proper gene names for mean values and replace original data values by corresponding means.
sample_genes_list = [i for i in sample_1_genes, sample_2_genes, sample_3_genes, sample_4_genes, sample_5_genes, sample_6_genes]
sample_final_list = [sorted(zip(sg, mean_values)) for sg in sample_genes_list]
# Create multiple dictionaries from normalized values for each dataset.
class BreakIt(Exception): pass
try:
    count = 1
    for index, items in enumerate(sample_final_list):
        sample_1_dict_normalized = {}
        for index, (genes, values) in enumerate(items):
            sample_1_dict_normalized[genes] = values
            count = count + 1
            if count == 29595:
                raise BreakIt
except BreakIt:
    pass
...
try:
    count = 1
    for index, items in enumerate(sample_final_list):
        sample_6_dict_normalized = {}
        for index, (genes, values) in enumerate(items):
            if count > 147975:
                sample_6_dict_normalized[genes] = values
            count = count + 1
            if count == 177570:
                raise BreakIt
except BreakIt:
    pass
# Pull expression values to qualify overexpressed proteins.
print 'ERG values:'
print 'Sample 1:', round(sample_1_dict_normalized.get('ERG'), 3)
print 'Sample 6:', round(sample_6_dict_normalized.get('ERG'), 3)
Your code is too long for me to give an exact answer, so I will answer very generally.
First, you are using enumerate for no reason. If you don't need both the index and the value, you probably don't need enumerate.
This part:
with open('genes.csv') as f:
    cread_1 = list(csv.reader(f, delimiter = '\t'))
    sample_1_dict = {i: float(j) for i, j in cread_1}
    sample_1_list = [x for x in sample_1_dict.items()]
    sample_1_values_sorted = sorted(sample_1_list, key=lambda expvalues: expvalues[1])
    sample_1_genes = [i for i, j in sample_1_values_sorted]
    sample_1_values = [j for i, j in sample_1_values_sorted]
    sample_1_graph_raw = [float(j) for i, j in cread_1]
should be (a) using a list named samples and (b) much shorter, since you don't really need to extract all this information from sample_1_dict and move it around right now. It can be something like:
samples = [None] * 6
for k in range(6):
    with open('genes.csv') as f:  # but something specific to k
        cread = list(csv.reader(f, delimiter = '\t'))
        samples[k] = {i: float(j) for i, j in cread}
After that, calculating the sum and mean will be way more natural.
In this part:
class BreakIt(Exception): pass
try:
    count = 1
    for index, items in enumerate(sample_final_list):
        sample_1_dict_normalized = {}
        for index, (genes, values) in enumerate(items):
            sample_1_dict_normalized[genes] = values
            count = count + 1
            if count == 29595:
                raise BreakIt
except BreakIt:
    pass
you should be (a) iterating over the samples list mentioned earlier, and (b) not using count at all, since you can iterate naturally over samples or sample[i].list or something like that.
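As a rough illustration of dropping the counter, assuming sample_final_list is a list with one list of (gene, value) pairs per sample, as built in the question: dict() accepts such a list directly, so a single comprehension creates one dictionary per sample regardless of length. The data below is invented for the demo:

```python
# Hypothetical stand-in for sample_final_list: one list of (gene, value)
# pairs per sample. The real lists hold ~29,000 pairs each.
sample_final_list = [
    [('ERG', 1.5), ('TP53', 2.0)],
    [('ERG', 2.5), ('TP53', 0.5)],
]

# dict() accepts an iterable of (key, value) pairs, so no counter is needed.
normalized = [dict(pairs) for pairs in sample_final_list]

print('Sample 1:', round(normalized[0]['ERG'], 3))
print('Sample 2:', round(normalized[1]['ERG'], 3))
```

The dictionaries live in a list (normalized[0], normalized[1], ...) instead of six separately named variables, which is what lets the statement scale to any number of samples.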
Your code has several problems. You should put your code in functions that preferably do one thing each. Then you can call a function for each sample without repeating the same code six times (I assume that is what the ellipsis is hiding). Give each function a self-describing name and a docstring that explains what it does. There is quite a bit of unnecessary code; some of this might become obvious once you have it in functions. Since functions take arguments, you can pass in your 29595, for example.
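A sketch of what such functions might look like, assuming Python 3 and tab-delimited files with a gene name and a value per row. The names read_sample and row_means are hypothetical, and the in-memory "files" contain made-up values so the example runs without the real data:

```python
import csv
import io


def read_sample(handle):
    """Parse tab-delimited 'gene<TAB>value' rows into a {gene: float} dict."""
    return {gene: float(value) for gene, value in csv.reader(handle, delimiter='\t')}


def row_means(samples):
    """Return the mean value at each rank across all samples.

    Each sample's values are sorted first, as in the question, so position k
    holds the k-th smallest value of that sample.
    """
    ranked = [sorted(sample.values()) for sample in samples]
    return [sum(values) / len(values) for values in zip(*ranked)]


# In-memory stand-ins for the real files; the contents are made up.
files = [io.StringIO("ERG\t1.0\nTP53\t3.0\n"), io.StringIO("ERG\t4.0\nTP53\t2.0\n")]
samples = [read_sample(f) for f in files]
print(row_means(samples))
```

With real data, each io.StringIO would be replaced by an open file handle, and the number of samples is simply the length of the list.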