Python-2: Assign line in text file to a variable - python

I'm currently trying to make a scoreboard for a competition from a text file for a uni project.
Each competitor has a competitor_no, a competitor_name, and 3 scores, each on a separate line in the text file (so each competitor's data takes 5 lines), e.g.:
1
Eliza Ianson
9.19
11.79
21.66
2
Cece Durant
17.03
7.02
17.72
I need to add the 3 scores to get the overall score. I have 50 competitors and need to compare them all to display the top 3, printing each competitor's number, name and overall score. Is there a way to assign the value of a line to a variable?
Obviously I am going to need to read 5 lines at a time to get the required information for each competitor and process the data. This is the code I have in my program so far:
file = open('veggies_2014.dat')
for line in file:
    first_place = []
    second_place = []
    third_place = []
    a = 1
    b = 2
    c = 3
    d = 4
    e = 5
    if line == a:
        competitor_no = line
    elif line == b:
        competitor_name = line
    elif line == c:
        cucumber = line
    elif line == d:
        carrot = line
    elif line == e:
        runner_bean = line
    a += 5
    b += 5
    c += 5
    d += 5
    e += 5
    score = float(cucumber) + float(carrot) + float(runner_bean)
    print(score)
    if score > first:
        first = score
        first_place.append(competitor_no)
        first_place.append(competitor_name)
        first_place.append(score)
    elif score > second:
        second = score
        second_place.append(competitor_no)
        second_place.append(competitor_name)
        second_place.append(score)
    elif score > third:
        third = score
        third_place.append(competitor_no)
        third_place.append(competitor_name)
        third_place.append(score)
file.close()
print (first_place)
print (second_place)
print (third_place)
I can get the score if statement to work when I am just dealing with a file containing numbers, but having to include the name is where I seem to be stumbling.
Any suggestions?

Since you know the order in which the information appears in the file, try reading the file in blocks of five lines at a time and processing each block accordingly. In the implementation below, I store the information in a 2D list, which is then sorted by score so that the highest score comes first.
data = []
count = 1
with open("veggies_2014.dat") as f:
    line = f.readline()
    while line and int(line) == count:
        score = 0
        entry = []
        entry.append(f.readline().strip()) #read line with c_name, dropping the trailing newline
        for i in range(3):
            score += float(f.readline()) #add scores
        entry.append(score)
        data.append(entry)
        line = f.readline()
        count += 1
print(data)
data = sorted(data, key = lambda l:l[1], reverse = True)
print("")
print(data)
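To get from the sorted list to the top three that the question asks for, one option is to parse the file into (number, name, total) records and pick the three largest totals. This is only a minimal sketch, assuming the five-line layout described in the question; `read_competitors` and `top_three` are hypothetical helper names, and the in-memory sample stands in for the real data file.

```python
import heapq
import io


def read_competitors(lines):
    """Group non-empty lines into (number, name, total) tuples, five lines per competitor."""
    cleaned = [line.strip() for line in lines if line.strip()]
    records = []
    # Step through complete five-line blocks; a trailing partial block is ignored.
    for start in range(0, len(cleaned) - 4, 5):
        number, name, s1, s2, s3 = cleaned[start:start + 5]
        total = round(float(s1) + float(s2) + float(s3), 2)
        records.append((int(number), name, total))
    return records


def top_three(records):
    """Return up to three records with the highest totals, best first."""
    return heapq.nlargest(3, records, key=lambda record: record[2])


# In-memory stand-in for the data file (assumption: same layout as veggies_2014.dat).
sample = io.StringIO(
    u"1\nEliza Ianson\n9.19\n11.79\n21.66\n2\nCece Durant\n17.03\n7.02\n17.72\n"
)
for place, (number, name, total) in enumerate(top_three(read_competitors(sample)), start=1):
    print("%d. %s %s %s" % (place, number, name, total))
```

With a real file, passing the open file object to `read_competitors` in place of `sample` should work the same way, since both yield one line at a time.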

You could do something like this (not tested, but you get the idea):
Basically, every time count % 5 comes back to 0, you save the score in personDict with the person's name as the key.
count = 0
c1_no = None
c1_name = None
personDict = dict()
with open("veggies_2014.dat") as f:
    score = 0
    for line in f:
        if count % 5 == 0:
            # competitor number line: store the previous competitor's total first
            if c1_name is not None:
                personDict[c1_name] = score
            c1_no = line.strip()
            score = 0
        elif count % 5 == 1:
            c1_name = line.strip()
        elif count % 5 in (2, 3, 4):
            score += float(line)
        count += 1
# the loop ends before the last competitor is stored, so store it here
if c1_name is not None:
    personDict[c1_name] = score
# now do whatever you want with personDict; you will have something like {'Emily': 10} assuming Emily's scores were 2, 3, 5 etc.

Here is what I came up with:
file_content = '''1
Eliza Ianson
9.19
11.79
21.66
2
Cece Durant
17.03
7.02
17.72
3
Foo Bar
10
9.5
11.2
4
Superman
7.9
12.15
9.75
'''
def chunks(l, n):
    """Yield successive n-sized chunks from l.
    From http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks-in-python
    """
    for i in xrange(0, len(l), n):
        yield l[i:i+n]
users = []
for user_data in chunks(file_content.splitlines(), 5):
    user_id = int(user_data[0])
    user_name = user_data[1]
    scores = list(map(float, user_data[2:]))
    overall_score = sum(scores)
    users.append(dict(
        user_id=user_id,
        user_name=user_name,
        scores=scores,
        overall_score=overall_score
    ))
top_3 = sorted(users, key=lambda x: x['overall_score'], reverse=True)[:3]

This is the finished code I used, with help from some of the answers, though in my application I have split it into separate functions so that parts can be reused.
data = []
count = 1
print("")
print("For 2014 Results please enter: veggies_2014.txt ")
print("For 2015 Results please enter: veggies_2015.txt")
print("For 2016 Results please enter: veggies_2016.txt ")
fname = raw_input("Enter file name: ")
with open(fname) as f:
    line = f.readline()
    while line and float(line) == count:
        score = 0
        entry = []
        entry.append(count)
        entry.append(f.readline().strip())
        for i in range(3):
            score += float(f.readline()) #add scores
        score = round(score, 2)
        entry.append(score)
        data.append(entry)
        line = f.readline()
        count += 1
data = sorted(data, key = lambda l:l[2], reverse = True)
print("")
final=[]
for line in data:
    for i in line:
        final.append(i)
I added this code to display the scoreboard more cleanly.
a=0
b=3
x=1
for i in range(0,3):
    place=[]
    place = final[a:b]
    string = " ".join(str(x) for x in place)
    print (str(x) + ". " + string)
    a += 3
    b += 3
    x += 1
print("")

Related

Looking for resistance genes in water sample using kmers [Python]

I need some help with my code. I need to look for the presence of resistance genes in a water sample. That translates to having a huge file of reads coming from the water sample and a file of resistance genes. My problem is making the code run in under 5 minutes, which is not happening right now. The issue probably lies in discarding reads as fast as possible, that is, in having a smart method that only analyzes meaningful reads. Do you have any suggestions? I cannot use any non-standard Python library.
This is my code
import time
def build_lyb(TargetFile):
    TargetFile = open(TargetFile)
    res_gen = {}
    for line in TargetFile:
        if line.startswith(">"):
            header = line[:-1]
            res_gen[header] = ""
        else:
            res_gen[header] += line[:-1]
    return res_gen
def build_kmers(sequence, k_size):
    kmers = []
    n_kmers = len(sequence) - k_size + 1
    for i in range(n_kmers):
        kmer = sequence[i:i + k_size]
        kmers.append(kmer)
    return kmers
def calculation(kmers, g):
    matches = []
    for i in range(0, len(genes[g])):
        matches.append(0)
    k = 0
    while k < len(kmers):
        if kmers[k] in genes[g]:
            pos = genes[g].find(kmers[k])
            for i in range(pos, pos+19):
                matches[i] = 1
            k += 19
        else:
            k += 1
    return matches
def coverage(matches, g):
    counter = 0
    for i in matches[g]:
        if i >= 1:
            counter += 1
    cov = counter/len(res_genes[g])*100
    return cov
st = time.time()
genes = build_lyb("resistance_genes.fsa")
infile = open('test2.txt', 'r')
res_genes = {}
Flag = False
n_line = 0
for line in infile:
    n_line += 1
    if line.startswith("+"):
        Flag = False
    if Flag:
        kmers = build_kmers(line[:-1], 19)
        for g in genes:
            counter = 18
            k = 20
            while k <= 41:
                if kmers[k] in genes[g]:
                    counter += 19
                    k += 19
                else:
                    k += 1
            if counter >= 56:
                print(n_line)
                l1 = calculation(kmers, g)
                if g in res_genes:
                    l2 = res_genes[g]
                    lr = [sum(i) for i in zip(l1, l2)]
                    res_genes[g] = lr
                else:
                    res_genes[g] = l1
    if line.startswith('#'):
        Flag = True
for g in res_genes:
    print(g)
    for i in genes[g]:
        print(i, " ", end='')
    print('')
    for i in res_genes[g]:
        print(i, " ", end='')
    print('')
    print(coverage(res_genes, g))
et = time.time()
elapsed_time = et-st
print("Execution time:", elapsed_time, "s")
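One possible way to discard reads quickly, sketched here as a suggestion rather than a tested fix for the code above, is to index every k-mer of every gene in a dictionary once, so that each read k-mer costs a single dictionary lookup instead of a substring search through every gene. The function names `build_kmer_index` and `candidate_genes`, the `min_hits` threshold and the toy sequences are all assumptions made for illustration; only the standard library is used.

```python
def build_kmer_index(genes, k):
    """Map every k-mer found in any gene to the set of gene names containing it."""
    index = {}
    for name, sequence in genes.items():
        for i in range(len(sequence) - k + 1):
            index.setdefault(sequence[i:i + k], set()).add(name)
    return index


def candidate_genes(read, index, k, min_hits=2):
    """Return the gene names that share at least min_hits sampled k-mers with the read."""
    hits = {}
    # Sample non-overlapping k-mers from the read; each lookup is one dictionary probe.
    for i in range(0, len(read) - k + 1, k):
        for name in index.get(read[i:i + k], ()):
            hits[name] = hits.get(name, 0) + 1
    return {name for name, count in hits.items() if count >= min_hits}


# Toy data (assumption: real genes and reads are far longer, and k would be 19).
genes = {">geneA": "ACGTACGTTTGACCA", ">geneB": "GGGGCCCCAAAATTT"}
index = build_kmer_index(genes, k=4)
print(candidate_genes("ACGTACGT", index, k=4))  # read shares k-mers with geneA only
print(candidate_genes("TTTTTTTT", index, k=4))  # no shared k-mers: read can be skipped
```

Reads for which `candidate_genes` returns an empty set could be skipped entirely, and the more expensive position-by-position coverage calculation would then run only on the genes returned for the remaining reads.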

Counting Iterations in for loop [duplicate]

This question already has answers here:
Accessing the index in 'for' loops
(26 answers)
Closed 4 years ago.
I am writing a program that reads data from a file with readlines and, if a specific word exists in a line, lets the user modify its quantity. The issue is: how can I get the iteration number (line number) in the for loop while it is searching for the word in each line?
My Code:
f= open('guru99.txt','r')
action = int(input('Enter number'))
if action == 1:
    for line in lines:
        if 'banana' in line:
            print(line)
            print('You already purchased banana! You can change the quantity')
            edit = line.split()
            qty = int(input('Enter new quantity'))
            print(bline)
        else:
            continue
For example, if the word banana is found on line number 3, how can I edit this program to show the iteration number or index of the for loop?
Add a variable that keeps track of the number of iterations completed so far:
f= open('guru99.txt','r')
lines = f.readlines()
i =1
for line in lines:
    print([i],line)
    i = i+1
count = 0
action = int(input('Enter line number'))
if action == 1:
    for line in lines:
        count += 1
        print(count)
        if 'banana' in line:
            print(line)
            print('You already purchased banana! You can change the quantity')
            edit = line.split()
            qty = int(input('Enter new quantity'))
            edit[-1] = (int(edit[-1]) / float(edit[3]))
            bprice = (edit[-1])
            edit[3] = str(qty)
            edit[-1] = str(int(qty * bprice))
            bline = ' '.join(edit)
            print(bline)
        else:
            continue
f= open('guru99.txt','r')
lines = f.readlines()
i =1
line_counter = 0
for line in lines:
    print([i],line)
    i = i+1
action = int(input('Enter line number'))
if action == 1:
    for line in lines:
        line_counter += 1
        if 'banana' in line:
            print('You are on line: ' + str(line_counter))
            print('You already purchased banana! You can change the quantity')
            edit = line.split()
            qty = int(input('Enter new quantity'))
            edit[-1] = (int(edit[-1]) / float(edit[3]))
            bprice = (edit[-1])
            edit[3] = str(qty)
            edit[-1] = str(int(qty * bprice))
            bline = ' '.join(edit)
            print(bline)
        else:
            continue
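The manual counters above can also be replaced by the built-in `enumerate`, which yields the index together with each item. A minimal sketch, assuming the lines have already been read into a list; `find_lines` is a hypothetical helper name and the sample lines are made up:

```python
def find_lines(lines, word):
    """Return (line_number, text) pairs for every line containing word, counting from 1."""
    return [(number, line.rstrip("\n"))
            for number, line in enumerate(lines, start=1)
            if word in line]


# Stand-in for f.readlines() (assumption: one purchased item per line).
lines = ["apple 2 10\n", "cherry 1 5\n", "banana 3 30\n"]
for number, text in find_lines(lines, "banana"):
    print("Found on line %d: %s" % (number, text))
```

Passing `start=1` makes the numbering match what a person would call the line number; without it, counting starts at 0.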

local variable 'moodsc' referenced before assignment

In the code below I get the following error:
"local variable 'moodsc' referenced before assignment"
I'm new to programming and Python, and I'm struggling to interpret other questions on similar topics. Any context around this specific code would be helpful.
import re
import json
import sys
def moodScore(sent, myTweets):
    scores = {} # initialize an empty dictionary
    new_mdsc = {} # initialize an empty dictionary
    txt = {}
    for line in sent:
        term, score = line.split("\t") # The file is tab-delimited. "\t" means "tab character"
        scores[term] = int(score) # Convert the score to an integer.
    data = [] # initialize an empty list
    for line in myTweets:
        tweet = json.loads(line)
        if "text" in tweet and "lang" in tweet and tweet["lang"] == "en":
            clean = re.compile("\W+")
            clean_txt = clean.sub(" ", tweet["text"]).strip()
            line = clean_txt.lower().split()
            moodsc = 0
            pos = 0
            neg = 0
            count = 1
            for word in range(0, len(line)):
                if line[word] in scores:
                    txt[word] = int(scores[line[word]])
                else:
                    txt[word] = int(0)
                moodsc += txt[word]
            print txt
            if any(v > 0 for v in txt.values()):
                pos = 1
            if any(v < 0 for v in txt.values()):
                neg = 1
            for word in range(0, len(line)): # score each word in line
                if line[word] not in scores:
                    if str(line[word]) in new_mdsc.keys():
                        moodsc2 = new_mdsc[str(line[word])][0] + moodsc
                        pos2 = new_mdsc[str(line[word])][1] + pos
                        neg2 = new_mdsc[str(line[word])][2] + neg
                        count2 = new_mdsc[str(line[word])][3] + count
                        new_mdsc[str(line[word])] = [moodsc2, pos2, neg2, count2]
                    else:
                        new_mdsc[str(line[word])] = [moodsc, pos, neg, count]
def new_dict():
    for val in new_mdsc.values():
        comp = val[0] / val[3]
        val.append(comp)
    for key, val in new_mdsc.items():
        print (key, val[4])
def main():
    sent_file = open(sys.argv[1])
    tweet_file = open(sys.argv[2])
    moodScore(sent_file, tweet_file)
    # new_dict()
if __name__ == '__main__':
    main()
OK @joshp, I think you need to make some variables global. Since the error is 'moodsc referenced before assignment', I think the code only gets as far as moodsc += txt[word], but you may also have trouble with pos and neg.
Try declaring global moodsc, global pos, etc. before you define them. If that doesn't work, try global moodsc just before moodsc += txt[word], and so on. You may need to use global in both places for it to work; I often find in my own code that I need it both at the definition and wherever else the variable is used (at the start of each function where it appears).
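For context, this error is typically raised when a function reads a local name on a code path where no assignment to it has run yet, for example when the initialisation sits inside an if block that was skipped. Whether that is what happens in the question depends on the original indentation, which is not recoverable here, so the following is only a minimal sketch of that situation and one possible fix; `total_buggy` and `total_fixed` are made-up names, not part of the question's code.

```python
def total_buggy(values):
    for value in values:
        if value > 0:
            total = 0      # only assigned when this branch runs
        total += value     # fails if no positive value has been seen yet
    return total


def total_fixed(values):
    total = 0              # initialised once, before any code path reads it
    for value in values:
        total += value
    return total


try:
    total_buggy([-1, 2])
except UnboundLocalError as error:
    print("buggy version failed: %s" % error)

print(total_fixed([-1, 2]))
```

In this sketch the fix is to move the initialisation to a place that always runs before the first read, which needs no global declarations.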

Python Sudoku Checker 9X9

while True:
    try:
        file = input("Enter a filename: ")
        fi = open(file, "r")
        infile = fi.read()
        grid = [list (i) for i in infile.split()] #Puts the sudoku puzzle into a list in order to check that the total number is valid
        check = len(grid)
        print("The total number in this puzzle is:",check) #Counts the amount of numbers in the sudoku puzzle
        break
    except FileNotFoundError:
        print ("The inputted file does not exist")
def check(infile):
    count = 0
    for j in range (0,9):
        for n in range(0,9):
            if infile[j].count(infile[j][n]) <= 1:
                count = count + 0
            else:
                count = count + 1
cols = [[row[i] for row in infile] for i in[0,1,2,3,4,5,6,7,8]]
leg = 0
for i in range(0,9):
    for j in range(0,9):
        if cols[i].count(cols[i][j]) <= 1:
            leg = leg + 0
        else:
            leg = leg + 1
angel = []
for t in range(3):
    ang = infile[t]
    for u in range(3):
        angel.append(ang[u])
foot = 0
for be in range(9):
    if angel.count(angel[be]) <= 1:
        foot = foot + 0
    else:
        foot = foot + 1
if count + leg + foot == 0:
    print("Valid")
else:
    print ("Invalid")
def inputs():
    x = raw_input()
    ls = []
    while x != '':
        x1 =x.split(' ')
        ls.append(x1)
        if len(infile) >=9:
            print (check(infile))
            infile = []
        x = raw_input()
inputs()
actual error:
Traceback (most recent call last):
File "E:/Computer Programming/Assignment/check 2.py", line 22, in <module>
cols = [[row[i] for row in infile] for i in[0,1,2,3,4,5,6,7,8]]
File "E:/Computer Programming/Assignment/check 2.py", line 22, in <listcomp>
cols = [[row[i] for row in infile] for i in[0,1,2,3,4,5,6,7,8]]
File "E:/Computer Programming/Assignment/check 2.py", line 22, in <listcomp>
cols = [[row[i] for row in infile] for i in[0,1,2,3,4,5,6,7,8]]
IndexError: string index out of range
Why does it say that my string index is out of range? Is there another way to create a 9x9 sudoku checker that checks for recurring numbers? I need to make sure that there are 9 numbers in each column and that they are between 1 and 9.
First, a few comments:
Never write:
cols = [[row[i] for row in infile] for i in[0,1,2,3,4,5,6,7,8]]
Write this instead:
cols = [[row[i] for row in infile] for i in range(0,9)]
Never give a variable the same name as a function defined in your code (check and check()).
Don't write code at module level. Put everything in functions and call the entry-point function at the end of the file, under an if __name__ == "__main__" condition, so that importing your module from another module does not execute module-level code.
Don't open files without closing them. Use the context manager instead: with open('myfile', 'r') as f: ...
Your code has a useless, or at least wrong, use of while (do you really mean to loop forever on an exception?). Use command-line arguments instead, so the shell can help your user pick a file that actually exists.
Now that all that is clear, here is the answer to your actual question:
infile is the string returned by fi.read() (if I am reading your mis-indented Python code correctly), so iterating over it yields one character at a time, and each row is a one-character string. row[i] is therefore out of bounds for any i above 0.
Even if you iterate over the lines instead, an empty line or a line with fewer than 9 columns would still push row[i] out of bounds.
Here's an attempt at refactoring your code, though I've left a number of design problems in place:
def check(infile):
    count = 0
    for j in range (0,9):
        for n in range(0,9):
            if infile[j].count(infile[j][n]) <= 1:
                count = count + 0
            else:
                count = count + 1
def inputs():
    x = raw_input()
    ls = []
    while x != '':
        x1 =x.split(' ')
        ls.append(x1)
        if len(infile) >=9:
            print (check(infile))
            infile = []
        x = raw_input()
def check_grid():
    cols = [[row[i] for row in infile] for i in range(0,9)]
    leg = 0
    for i in range(0,9):
        for j in range(0,9):
            if cols[i].count(cols[i][j]) <= 1:
                leg = leg + 0
            else:
                leg = leg + 1
    angel = []
    for t in range(3):
        ang = infile[t]
        for u in range(3):
            angel.append(ang[u])
    foot = 0
    for be in range(9):
        if angel.count(angel[be]) <= 1:
            foot = foot + 0
        else:
            foot = foot + 1
    if count + leg + foot == 0:
        print("Valid")
    else:
        print ("Invalid")
    inputs()
def sudoku_checker():
    try:
        file = input("Enter a filename: ")
        fi = open(file, "r")
        infile = fi.read()
        grid = [list (i) for i in infile.split()] #Puts the sudoku puzzle into a list in order to check that the total number is valid
        # Counts the amount of numbers in the sudoku puzzle
        print("The total number in this puzzle is:",len(grid))
        check_grid()
    except FileNotFoundError:
        print ("The inputted file does not exist")
if __name__ == "__main__":
    sudoku_checker()
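As an alternative to counting duplicates cell by cell, a completed grid can be checked by comparing each row, column and 3x3 box with the set of digits 1 to 9. This is a minimal sketch, not the refactoring above: `is_valid_sudoku` and `parse_grid` are hypothetical names, and the input format (one row of nine digits per line, optional spaces) is an assumption. The size check up front guards against the short rows that cause the IndexError.

```python
def parse_grid(text):
    """Split text into rows of single-character cells, ignoring spaces and blank lines."""
    return [list(line.replace(" ", "")) for line in text.splitlines() if line.strip()]


def is_valid_sudoku(grid):
    """Return True if grid is 9x9 and every row, column and 3x3 box holds 1-9 exactly once."""
    digits = set("123456789")
    if len(grid) != 9 or any(len(row) != 9 for row in grid):
        return False  # wrong shape: stop before any out-of-range index is attempted
    rows = [set(row) for row in grid]
    cols = [set(row[c] for row in grid) for c in range(9)]
    boxes = [set(grid[r + dr][c + dc] for dr in range(3) for dc in range(3))
             for r in range(0, 9, 3) for c in range(0, 9, 3)]
    # Nine cells that form exactly the set 1-9 cannot contain a duplicate.
    return all(group == digits for group in rows + cols + boxes)


solved = """534678912
672195348
198342567
859761423
426853791
713924856
961537284
287419635
345286179"""
print(is_valid_sudoku(parse_grid(solved)))
```

Because each group has exactly nine cells, comparing it with the nine-digit set checks both the 1 to 9 range and the absence of repeats in one step.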

Parsing Data from live website in Python Enumerate problem!

The following script is supposed to fetch a specific line number and parse it from a live website. It works for like 30 loops but then it seems like enumerate(f) stops working correctly... the "i" in the for loop seems to stop at line 130 instead of like 200 something. Could this be due to the website I'm trying to fetch data from or something else? Thanks!!
import sgmllib
class MyParser(sgmllib.SGMLParser):
    "A simple parser class."
    def parse(self, s):
        "Parse the given string 's'."
        self.feed(s)
        self.close()
    def __init__(self, verbose=0):
        "Initialise an object, passing 'verbose' to the superclass."
        sgmllib.SGMLParser.__init__(self, verbose)
        self.divs = []
        self.descriptions = []
        self.inside_div_element = 0
    def start_div(self, attributes):
        "Process a hyperlink and its 'attributes'."
        for name, value in attributes:
            if name == "id":
                self.divs.append(value)
                self.inside_div_element = 1
    def end_div(self):
        "Record the end of a hyperlink."
        self.inside_div_element = 0
    def handle_data(self, data):
        "Handle the textual 'data'."
        if self.inside_div_element:
            self.descriptions.append(data)
    def get_div(self):
        "Return the list of hyperlinks."
        return self.divs
    def get_descriptions(self, check):
        "Return a list of descriptions."
        if check == 1:
            self.descriptions.pop(0)
        return self.descriptions
    def rm_descriptions(self):
        "Remove all descriptions."
        self.descriptions.pop()
import urllib
import linecache
import sgmllib
tempLine = ""
tempStr = " "
tempStr2 = ""
myparser = MyParser()
count = 0
user = ['']
oldUser = ['none']
oldoldUser = [' ']
array = [" ", 0]
index = 0
found = 0
k = 0
j = 0
posIndex = 0
a = 0
firstCheck = 0
fCheck = 0
while a < 1000:
print a
f = urllib.urlopen("SITE")
a = a+1
for i, line in enumerate(f):
if i == 187:
print i
tempLine = line
print line
myparser.parse(line)
if fCheck == 1:
result = oldUser[0] is oldUser[1]
u1 = oldUser[0]
u2 = oldUser[1]
tempStr = oldUser[1]
if u1 == u2:
result = 1
else:
result = user is oldUser
fCheck = 1
user = myparser.get_descriptions(firstCheck)
tempStr = user[0]
firstCheck = 1
if result:
array[index+1] = array[index+1] +0
else:
j = 0
for z in array:
k = j+2
tempStr2 = user[0]
if k < len(array) and tempStr2 == array[k]:
array[j+3] = array[j+3] + 1
index = j+2
found = 1
break
j = j+1
if found == 0:
array.append(tempStr)
array.append(0)
oldUser = user
found = 0
print array
elif i > 200:
print "HERE"
break
print array
f.close()
Perhaps the number of lines on that web page are fewer than you think? What does this give you?:
print max(i for i, _ in enumerate(urllib.urlopen("SITE")))
Aside: Your indentation is stuffed after the while a < 1000: line. Excessive empty lines and one-letter names don't assist the understanding of your code.
enumerate is not broken. Instead of such speculation, inspect your data. Suggestion: replace
for i, line in enumerate(f):
by
lines = list(f)
print "=== a=%d linecount=%d ===" % (a, len(lines))
for i, line in enumerate(lines):
    print " a=%d i=%d line=%r" % (a, i, line)
Examine the output carefully.
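Once the line count is known to vary between requests, the lookup itself can be made defensive. A minimal sketch, assuming the response has been read into a list of lines as suggested above; `line_at` is a hypothetical helper name and the three-line page is made up:

```python
def line_at(lines, index):
    """Return lines[index], or None when fewer than index + 1 lines were received."""
    lines = list(lines)
    if index >= len(lines):
        # Report the shortfall instead of silently skipping the iteration.
        print("only %d lines received, wanted index %d" % (len(lines), index))
        return None
    return lines[index]


# Stand-in for list(f) (assumption: the real page is normally 200+ lines long).
page = ["<html>\n", "<div id='user'>alice</div>\n", "</html>\n"]
print(repr(line_at(page, 1)))
print(repr(line_at(page, 187)))
```

A None result then signals a short or truncated response for that request, which can be logged or retried rather than mistaken for a problem with enumerate.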
