Sorting strings with integers and text in Python - python

I'm making a stupid little game that saves your score in a highscores.txt file.
My problem is sorting the lines. Here's what I have so far.
Maybe an alphanumeric sorter for python would help? Thanks.
import os.path
import string
def main():
    #Check if the file exists
    file_exists = os.path.exists("highscores.txt")
    score = 500
    name = "Nicholas"
    #If the file doesn't exist, create one with the high scores format.
    if file_exists == False:
        f = open("highscores.txt", "w")
        f.write('Guppies High Scores\n1000..........Name\n750..........Name\n600..........Name\n450..........Name\n300..........Name')
    new_score = str(score) + ".........." + name
    f = open("highscores.txt", "r+")
    words = f.readlines()
    print words
main()

After words = f.readlines(), try something like:
headers = words.pop(0)
def myway(aline):
    i = 0
    while aline[i].isdigit():
        i += 1
    score = int(aline[:i])
    return score
words.sort(key=myway, reverse=True)
words.insert(0, headers)
The key (;-) idea is to make a function that returns the "sorting key" from each item (here, a line). I'm trying to write it in the simplest possible way: see how many leading digits there are, then turn them all into an int, and return that.
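As a self-contained illustration of the same key-function idea, here is a minimal sketch written for Python 3. The sample lines and the name leading_int are made up for the example; unlike the version above, it also guards against lines that contain no digits or consist only of digits.

```python
# Hypothetical sample lines in the question's "score..........name" format.
lines = [
    "Guppies High Scores\n",
    "300..........Name\n",
    "1000..........Name\n",
    "750..........Name\n",
]

def leading_int(line):
    """Return the run of leading digits in `line` as an int (0 if there is none)."""
    i = 0
    while i < len(line) and line[i].isdigit():
        i += 1
    return int(line[:i]) if i else 0

# Keep the header out of the sort, then put it back on top.
header, scores = lines[0], lines[1:]
scores.sort(key=leading_int, reverse=True)
result = [header] + scores
print("".join(result), end="")
```

The sort compares the integers returned by the key function, so 1000 ranks above 750 even though "1000" sorts before "750" as text.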

I'd like to encourage you to store your high scores in a more robust format. In particular I suggest JSON.
import json # in the standard library since Python 2.6
# import simplejson as json # for older Python 2.x versions
d = {}
d["version"] = 1
d["highscores"] = [[100, "Steve"], [200, "Ken"], [400, "Denise"]]
s = json.dumps(d)
print s
# prints:
# {"version": 1, "highscores": [[100, "Steve"], [200, "Ken"], [400, "Denise"]]}
d2 = json.loads(s)
for score, name in sorted(d2["highscores"], reverse=True):
    print "%5d\t%s" % (score, name)
# prints:
# 400 Denise
# 200 Ken
# 100 Steve
Using JSON will keep you from having to write your own parser to recover data from saved files such as high score tables. You can just tuck everything into a dictionary and trivially get it all back.
Note that I tucked in a version number, the version number of your high score save format. If you ever change the save format of your data, having a version number in there will be a very good thing.
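To show how the version number might be used when reading a file back, here is a small sketch for Python 3. The function names, the SAVE_VERSION constant and the temporary file path are assumptions made for the example, not part of the answer above.

```python
import json
import os
import tempfile

SAVE_VERSION = 1  # hypothetical version of the save format

def save_scores(path, highscores):
    """Write the scores plus a format version to `path` as JSON."""
    with open(path, "w") as f:
        json.dump({"version": SAVE_VERSION, "highscores": highscores}, f)

def load_scores(path):
    """Read the scores back, refusing files written in an unknown format."""
    with open(path) as f:
        data = json.load(f)
    if data.get("version") != SAVE_VERSION:
        raise ValueError("unsupported save format: %r" % data.get("version"))
    # JSON has no tuple type, so each entry comes back as a [score, name] list.
    return sorted(data["highscores"], reverse=True)

# A temporary directory keeps the example from touching real files.
path = os.path.join(tempfile.mkdtemp(), "highscores.json")
save_scores(path, [[100, "Steve"], [200, "Ken"], [400, "Denise"]])
loaded = load_scores(path)
for score, name in loaded:
    print("%5d\t%s" % (score, name))
```

If the format ever changes, load_scores is the one place that would need to branch on the stored version.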

I guess something went wrong when you pasted from Alex's answer, so here is your code with a sort in there
import os.path
def main():
    #Check if the file exists
    file_exists = os.path.exists("highscores.txt")
    score = 500
    name = "Nicholas"
    #If the file doesn't exist, create one with the high scores format.
    if file_exists == False:
        f = open("highscores.txt", "w")
        f.write('Guppies High Scores\n1000..........Name\n750..........Name\n600..........Name\n450..........Name\n300..........Name')
        f.close() # flush the new file to disk before reopening it
    new_score = str(score) + ".........." + name +"\n"
    f = open("highscores.txt", "r+")
    words = f.readlines()
    headers = words.pop(0)
    def anotherway(aline):
        score=""
        for c in aline:
            if c.isdigit():
                score+=c
            else:
                break
        return int(score)
    words.append(new_score)
    words.sort(key=anotherway, reverse=True)
    words.insert(0, headers)
    print "".join(words)
main()

What you want is probably what's generally known as a "Natural Sort". Searching for "natural sort python" gives many results, but there's some good discussion on ASPN.
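For reference, one common way to build a natural sort key is to split each string into digit and non-digit runs and compare the digit runs as integers. The sketch below (Python 3) is only one of several variants; the name natural_key and the sample strings are made up for the example.

```python
import re

def natural_key(text):
    """Split text into alternating non-digit and digit runs.

    re.split with a capturing group always yields non-digit runs at even
    positions and digit runs at odd positions, so the digit runs can be
    converted to int and compared numerically.
    """
    parts = re.split(r"(\d+)", text)
    return [int(part) if i % 2 else part.lower() for i, part in enumerate(parts)]

names = ["score10", "score9", "score100", "score1"]
plain = sorted(names)                     # text order: score1, score10, score100, score9
natural = sorted(names, key=natural_key)  # numeric order: score1, score9, score10, score100

# The same key works on lines in the question's format.
scores = ["500..........Nicholas", "1000..........Name", "750..........Name"]
ranked = sorted(scores, key=natural_key, reverse=True)
print(natural)
print(ranked)
```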

Doing a simple string sort on your
new_score = str(score) + ".........." + name
items isn't going to work since, for example str(1000) < str(500). In other words, 1000 will come before 500 in an alphanumeric sort.
Alex's answer is good in that it demonstrates the use of a sort key function, but here is another solution which is a bit simpler and has the added advantage of visually aligning the high score displays.
What you need to do is right align your numbers in a fixed field of the maximum size of the scores, thus (assuming 5 digits max and ver < 3.0):
new_score = "%5d........%s" % (score, name)
or for Python ver 3.x:
new_score = "{0:5d}........{1}".format(score, name)
For each new_score append it to the words list (you could use a better name here) and sort it reversed before printing. Or you could use the bisect.insort library function rather than doing a list.append.
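Here is a minimal sketch of that approach in Python 3, assuming scores never exceed five digits. The helper format_entry and the sample names are invented for the example.

```python
import bisect

def format_entry(score, name, width=5):
    """Right-align the score in a fixed-width field so text order matches numeric order."""
    return "{0:{1}d}........{2}".format(score, width, name)

entries = []  # kept in ascending order by bisect.insort
for score, name in [(500, "Nicholas"), (1000, "Ann"), (75, "Bob")]:
    bisect.insort(entries, format_entry(score, name))

# insort keeps the list ascending, so reverse it for a highest-first table.
table = entries[::-1]
print("\n".join(table))
```

This works because a space sorts before any digit, so a shorter number padded on the left compares lower. A score wider than the field would break the ordering, which is why the maximum width has to be fixed up front.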
Also, a more Pythonic form than
if file_exists == False:
is:
if not file_exists:

Related

Is there any way to put a single result from a CSV into a variable?

I'm making a program in school where users are quizzed on certain topics and their results are saved into a csv file. I've managed to print off the row with the highest score, but this doesn't look very neat.
with open ('reportForFergusTwo.csv', 'r') as highScore:
    highScoreFinder=highScore
    valid3=False
    for row in highScoreFinder:
        if subjectInput in row:
            if difficultyInput in row:
                if ('10' or '9' or '8' or '7' or '6' or '5' or '4' or '3' or '2' or '1') in row:
                    valid3=True
                    print("The highest score for this quiz is:",row)
For example: it says, "The highest score for this quiz is: chemistry,easy,10,Luc16" but I would prefer it to say something like "The highest score for this quiz is: 10" and "This score was achieved by: Luc16", rather than just printing the whole row off, with unnecessary details like what the quiz was on.
My CSV file looks like this:
Subject,Difficulty,Score,Username
language,easy,10,Luc16
chemistry,easy,10,Luc16
maths,easy,9,Luc16
chemistry,easy,5,Eri15
chemistry,easy,6,Waf1
chemistry,easy,0,Eri15
I thought that maybe if I could find a way to take the individual results (the score and username) and put them into their own individual variables, then it would be much easier to present it the way I want, and be able to reference them later on in the function if I need them to be displayed again.
I'm just fairly new to coding and curious if this can be done, so I can improve the appearance of my code.
Edit: To solve the issue, I used str.split() to break up the individual fields in the rows of my CSV, so that they could be selected and held by a variable. The accepted answer shows the solution I used, but this is my final code in case this wasn't clear:
with open ('details.csv', 'r') as stalking:
    valid4=False
    for row in stalking:
        # split each line of the file into its comma-separated fields
        splitter=row.strip().split(',')
        if user in splitter[3]:
            valid4=True
            print("Here are the details for user {}... ".format(user))
            name=splitter[0]
            age=splitter[1]
            year=splitter[2]
            print("Name: {}".format(name))
            print("Age: {}".format(age))
            print("Year Group: {}".format(year))
            postReport()
    if valid4==False:
        print("Sorry Fergus, this user doesn't seem to be in our records.")
with open("reportForFergusTwo.csv", "r") as highScore:
    subject = []
    difficulty = []
    score = []
    name = []
    next(highScore)  # skip the header row
    for line in highScore:
        fields = line.strip().split(',')
        subject.append(fields[0])
        difficulty.append(fields[1])
        score.append(int(fields[2]))  # compare scores as numbers, not as text
        name.append(fields[3])
ind = score.index(max(score))
print("The highest score for this quiz is: ", max(score))
print("This was achieved by ", name[ind])
with opens (and will close) the .csv file.
Then, four empty lists are created.
Next, I skip the header row, loop through every remaining line in the file, and split each line using a comma as the delimiter. This produces a list of four fields, each of which is appended to its own list. The score is converted to int so that max() compares numbers rather than strings (as text, '9' would rank above '10').
You can use str.split() to break up the rows of your CSV so that you can individually reference the fields:
split_row = row.split(',')
score = split_row[2]
user = split_row[3]
print("The highest score for this quiz is: " + score)
print("This score was achieved by: " + user)
You can use the csv library:
import csv
with open("data", "r") as f:
    reader = csv.reader(f)
    # skip header
    next(reader)
    # organize data in 2D array
    data = [ [ sub, dif, int(score), name ] for sub, dif, score, name in reader ]
    # sort by score
    data.sort(key=lambda x: x[2], reverse=True)
    # pretty print
    print("The highest score for this quiz is:", data[0][2])
    print("This score was achieved by:", data[0][3])
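If the score and username should be picked per quiz (subject and difficulty), csv.DictReader lets each field be addressed by its column name. The sketch below is for Python 3; the function best_result is a made-up name, and the question's sample file is inlined through io.StringIO so the example runs on its own.

```python
import csv
import io

# The question's sample file, inlined so the sketch is self-contained.
SAMPLE = """Subject,Difficulty,Score,Username
language,easy,10,Luc16
chemistry,easy,10,Luc16
maths,easy,9,Luc16
chemistry,easy,5,Eri15
chemistry,easy,6,Waf1
chemistry,easy,0,Eri15
"""

def best_result(csv_file, subject, difficulty):
    """Return (score, username) for the top score of one quiz, or None if no row matches."""
    rows = [row for row in csv.DictReader(csv_file)
            if row["Subject"] == subject and row["Difficulty"] == difficulty]
    if not rows:
        return None
    top = max(rows, key=lambda row: int(row["Score"]))
    return int(top["Score"]), top["Username"]

score, username = best_result(io.StringIO(SAMPLE), "chemistry", "easy")
print("The highest score for this quiz is:", score)
print("This score was achieved by:", username)
```

With a real file, the io.StringIO object would be replaced by the file object from open(), ideally opened with newline="" as the csv documentation recommends.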

python newbie - where is my if/else wrong?

Complete beginner so I'm sorry if this is obvious!
I have a file which is name | +/- or IG_name | 0 in a long list like so -
S1 +
IG_1 0
S2 -
IG_S3 0
S3 +
S4 -
dnaA +
IG_dnaA 0
Everything which starts with IG_ has a corresponding name. I want to add the + or - to the IG_name. e.g. IG_S3 is + like S3 is.
The information is gene names and strand information, IG = intergenic region. Basically I want to know which strand the intergenic region is on.
What I think I want:
open file
for every line, if the line starts with IG_*
    find the line with *
    print("IG_" and the line it found)
else
    print line
What I have:
with open(sys.argv[2]) as geneInfo:
    with open(sys.argv[1]) as origin:
        for line in origin:
            if line.startswith("IG_"):
                name = line.split("_")[1]
                nname = name[:-3]
                for newline in geneInfo:
                    if re.match(nname, newline):
                        print("IG_"+newline)
            else:
                print(line)
where origin is the mixed list and geneInfo has only the names not IG_names.
With this code I end up with a list containing only the else statements.
S1 +
S2 -
S3 +
S4 -
dnaA +
My problem is that I don't know what is wrong to search so I can (attempt) to fix it!
Below is some step-by-step annotated code that hopefully does what you want (though instead of using print I have aggregated the results into a list so you can actually make use of it). I'm not quite sure what happened with your existing code (especially how you're processing two files?)
s_dict = {}
ig_list = []
with open('genes.txt', 'r') as infile: # Simulating reading the file you pass in sys.argv
    for line in infile:
        if line.startswith('IG_'):
            ig_list.append(line.split()[0]) # Collect all our IG values for later
        else:
            s_name, value = line.split() # Separate out the S value and its operator
            s_dict[s_name] = value.strip() # Add to dictionary to map S to operator
# Now you can go back through your list of IG values and append the appropriate operator
pulled_together = []
for item in ig_list:
    s_value = item.split('_')[1]
    # The following will look for the operator mapped to the S value. If it is
    # not found, it will instead give you 'not found'
    corresponding_operator = s_dict.get(s_value, 'Not found')
    pulled_together.append([item, corresponding_operator])
print ('List structure')
print (pulled_together)
print ('\n')
print('Printout of each item in list')
for item in pulled_together:
    print(item[0] + '\t' + item[1])
nname = name[:-3]
Python's slicing is very powerful, but it can be tricky to understand correctly.
When you write [:-3], you take everything except the last three items. The catch is that if the sequence has three items or fewer, Python does not raise an error; it silently returns an empty result (an empty string here, since name is a string).
I think this is where things go wrong: there are not many characters per line, so the slice may return something other than what you expect. If you could show exactly what you want it to return there, with an example, it would help a lot, as I don't really know what you're trying to get with your slicing.
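A quick way to see what the slice does is to try it on a single line. The snippet below (Python 3) uses one line from the question's sample data; the variable names are only for illustration.

```python
line = "IG_S3 0\n"            # one line of the question's input, newline included
name = line.split("_")[1]     # "S3 0\n"
by_slice = name[:-3]          # drops the last three characters: " ", "0", "\n"
by_split = name.split()[0]    # takes the first whitespace-separated token instead

# The slice depends on the exact line ending: without the trailing newline
# it cuts into the name itself.
no_newline = "IG_S3 0".split("_")[1][:-3]

# Slicing more characters than the string has gives an empty string, not an error.
short = "ab"[:-3]

print(repr(by_slice), repr(by_split), repr(no_newline), repr(short))
```

Splitting on whitespace does not depend on how many characters follow the name, so it tends to be the more robust choice here.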
Does this do what you want?
from __future__ import print_function
import sys
# Read and store all the gene info lines, keyed by name
gene_info = dict()
with open(sys.argv[2]) as gene_info_file:
    for line in gene_info_file:
        tokens = line.split()
        name = tokens[0].strip()
        gene_info[name] = line
# Read the other file and lookup the names
with open(sys.argv[1]) as origin_file:
    for line in origin_file:
        if line.startswith("IG_"):
            name = line.split("_")[1]
            nname = name[:-3].strip()
            if nname in gene_info:
                lookup_line = gene_info[nname]
                print("IG_" + lookup_line)
            else:
                pass # what do you want to do in this case?
        else:
            print(line)
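If the gene lines and the IG_ lines live in the same list, as in the question's sample, the lookup can also be done in two passes over that one list. This is only a sketch for Python 3: the function name annotate_intergenic is invented, and it assumes the part after "IG_" must match a gene name exactly, leaving the 0 in place when no match exists.

```python
def annotate_intergenic(lines):
    """Copy each gene's strand onto the matching IG_ line; unmatched IG_ lines keep their 0."""
    # First pass: remember the strand of every ordinary gene.
    strands = {}
    for line in lines:
        name, strand = line.split()
        if not name.startswith("IG_"):
            strands[name] = strand
    # Second pass: rewrite the IG_ lines using the dictionary.
    out = []
    for line in lines:
        name, strand = line.split()
        if name.startswith("IG_"):
            strand = strands.get(name[len("IG_"):], strand)
        out.append("{} {}".format(name, strand))
    return out

sample = ["S1 +", "IG_1 0", "S2 -", "IG_S3 0", "S3 +", "S4 -", "dnaA +", "IG_dnaA 0"]
annotated = annotate_intergenic(sample)
print("\n".join(annotated))
```

Two passes are needed because an IG_ line can appear before the gene it refers to, as IG_S3 does in the sample.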

How can I create an average from a text file in Python 3.5.2?

I've been struggling with this for two days now and I can't seem to find any help. I need to search the file for a student ID (1001 is my test ID being used) and then add the numbers in each line that takes place below each occurrence of the student ID together in order to get an average.
filename = input("Enter file name: \n"
"Example: Grade Data.txt \n")
myFile = open(filename, "r")
selectSID = input("Enter SID: \n")
gradesNum = myFile.read().count(selectSID)
grades = myFile.read()
gradetotal = sum()
average = (gradetotal/gradesNum)
print(average)
The text file that is being opened looks like this:
1001
95
1002
99
1001
96
1002
0
1001
84
1002
25
1001
65
1002
19
This looks like homework, so I don't want to write the code for you, but here is some pseudocode (there are multiple ways to achieve what you want; this is just a simple beginner-level approach):
Open file to read
    get two lines from the file
    is the line1 interesting to me?
        yes -> store value from line2 in an array
        no -> ignore line2
close file
get average
Some useful references:
Python I/O
Powerful things in python to help with I/O
Built-in functions to help with basic operations like sum
from collections import defaultdict
# with open('filename') as f:
#     file = [int(i) for i in f]
# in this case, it's the list below
file = [1001,95,1002,99,1001,96,1002,0,1001,84,1002,25,1001,65,1002,19]
infos = defaultdict(list)
sids = file[::2] # select sid info
grades = file[1::2] # select grade info
for sid,grade in zip(sids,grades):
    infos[sid].append(grade)
print(infos[1001])
print(infos[1002])
out:
[95, 96, 84, 65]
[99, 0, 25, 19]
At this point, you can take the sum, average, max or min, or whatever else you want.
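To go from the grouped grades to an average, something like the following sketch could be used (Python 3). The function name grades_by_sid is made up, and the file contents from the question are inlined as a string so the example runs without a file; student IDs are kept as strings because that is how input() would supply them.

```python
from collections import defaultdict

def grades_by_sid(lines):
    """Group grades by student ID, assuming IDs and grades alternate line by line."""
    values = [line.strip() for line in lines if line.strip()]
    grades = defaultdict(list)
    for sid, grade in zip(values[::2], values[1::2]):
        grades[sid].append(int(grade))
    return grades

# The question's file contents, inlined for the example.
sample_lines = "1001\n95\n1002\n99\n1001\n96\n1002\n0\n1001\n84\n1002\n25\n1001\n65\n1002\n19\n".splitlines()

grades = grades_by_sid(sample_lines)
average_1001 = sum(grades["1001"]) / len(grades["1001"])
average_1002 = sum(grades["1002"]) / len(grades["1002"])
print(average_1001, average_1002)
```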
Please don't use this code for your homework (use @Aditya's method); you need to learn the basics before using fancy libraries. However, I just learned about collections.defaultdict and I wanted to use it. Watch this video for a great demo on defaultdict.
import collections
import statistics
# This little guy will hold all of our grades
# https://youtu.be/lyDLAutA88s is a great video using it
grades = collections.defaultdict(list)
def get_next_num(file):
    """get the next line of a file,
    remove any whitespace surrounding it,
    and turn it into an integer"""
    return int(next(file).strip())
with open('tmp.txt') as myfile:
    while True:
        try:
            # seriously, watch the video
            grades[get_next_num(myfile)].append(get_next_num(myfile))
        except StopIteration: # end of file
            break
student_id = int(input('Enter student ID. Choices: {} : '.format(list(grades.keys()))))
print(statistics.mean(grades[student_id]))
Updated Answer:
Okay, so I think I understand your question now... Same thing, except I suggest using a list, and as long as the file stays in the same format (SID, Score, so on...), this should work, and requires minimal understanding of Python (i.e No weird libraries like glob):
filename = input("Enter file name: \n"
                 "Example: Grade Data.txt \n")
myFile = open(filename, "r")
selectSID = input("Enter SID: \n")
raw = myFile.read() ## Raw contents of file.
val = raw.count( selectSID ) ## Returns number of occurrences
print( "Occurrences: ", val ) ## Or do something else...
lines = raw.split("\n") ## Create a list containing each new line
scores = [] ## A list that will contain all your scores
while selectSID in lines:
    val = lines.index( selectSID ) ## Returns where it is in the list,
    score = lines[ val+1 ] ## Gets the item at that position (index) Because the score is one line after the SID
    scores.append( int(score) ) ## Adds the score to the list. --Suggest you look into how to safely capture "int"s (try, except, etc) so the program doesn't crash if the score isn't a number (Advanced)
    lines.remove( selectSID ) ## automatically removes first occurrence of the SID (cause that's the one we just used)
avg = sum(scores) / len(scores) ## sum() takes a list or tuple (a sequence) and adds all values (must be all numbers); len() is just the length.
This stores the average in avg without printing it (in Python 3 the division gives a float); with your file and SID 1001, the only output is:
Occurrences: 4
Regardless of whether this answered your question, my tip for learning the basics is to understand data types and what they can do.
In your case, you will mainly need to focus on strings (text) and integers (whole numbers). Using Python's IDLE, declare a variable, type its name followed by a dot, and press Tab to scroll through the available methods.
Example:
>>> myString = "Hello World"
>>> myString.[TAB] #--> [Context Menu Here]
Once you pick one from the list, enter an opening parenthesis "(", and it will give a brief description of what it does.
Hope that helps, and sorry for the lengthy reply (I was trying to explain and give pointers (tips) since you said you were a noob)

How to retrieve a specific string from a specific list from a file with JSON in Python

I got code that actually saves three different data strings in the same list. Each list is separated and I'm not even sure if that made sense... So I'm going to just paste the code:
filename = "datosdeusuario.txt"
leyendo = open(filename, 'r+')
buffer = leyendo.read()
leyendo.close()
if not user.name in buffer:
    escribiendo = open(filename, 'a')
    escribiendo.write(json.dumps([user.name, string2, string3])+"\n")
    escribiendo.close()
    print(str(horaactual)[:19] + ": Data from " + user.name + " added")
That code is saving information like this (and working almost perfect):
["saelyth", "thisisanexampleforstring2", "andthisisanexampleforstring3"]
["anothername", "thisisanothernexampleforstring2", "andthisisanotherexampleforstring3"]
["John Connor", "exampleforstring2", "exampleforstring3"]
So my actual question and problem is... What is the best way to get a specific string from that file?
For example, let's say I want to retrieve the first string, the second string and the third string only if the user.name is John Connor (I mean, all three values from the list where the name is found). I can't seem to find a proper way to do it googling. The expected code would be something like this:
if user.name is found in any list(position1) of the file datosdeusuario.txt
then
    retrieveddata1 = List where value1 is user.name(get value1)
    retrieveddata2 = List where value1 is user.name(get value2)
    retrieveddata3 = List where value1 is user.name(get value3)
I have no idea how to do that. That's why I just made up that code to explain the situation.
I am not sure if this is what you want but:
filename = "datosdeusuario.txt"
f = open(filename,"r")
filedata = f.read()
f.close()
sp1 = filedata.split("\n")
a = ""
for x in sp1:
    if "joao m" in x:
        a = x
if(len(a) > 0):
    sp2 = a.split('"')
    values = []
    for x in sp2:
        if not(x == ", " or x == "]" or x == "[" or len(x) == 0):
            values.append(x)
    print values[0] #should be the name
    print values[1] #value 2
    print values[2] #value 3
else: #No username in the file
    #do something
    pass
Maybe this would work:
import json
retrieveddata1 = retrieveddata2 = retrieveddata3 = None
filename = "datosdeusuario.txt"
for line in open(filename, 'r'):
    retrieved = json.loads(line)
    if retrieved[0] == 'John Connor': # the comparison is case-sensitive
        retrieveddata1 = retrieved[0]
        retrieveddata2 = retrieved[1]
        retrieveddata3 = retrieved[2]
        break
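The same line-by-line json.loads idea can be wrapped in a small function that returns the whole record or None. This is a sketch for Python 3: find_record is a made-up name, and the three sample lines from the question are held in a list so the example does not need the real file.

```python
import json

def find_record(lines, username):
    """Return the first JSON list whose first element equals `username`, or None."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. a trailing newline at the end of the file
        record = json.loads(line)
        if record and record[0] == username:
            return record
    return None

# The question's sample data; with a real file, pass the open file object instead.
sample_lines = [
    '["saelyth", "thisisanexampleforstring2", "andthisisanexampleforstring3"]\n',
    '["anothername", "thisisanothernexampleforstring2", "andthisisanotherexampleforstring3"]\n',
    '["John Connor", "exampleforstring2", "exampleforstring3"]\n',
]

record = find_record(sample_lines, "John Connor")
if record is not None:
    retrieveddata1, retrieveddata2, retrieveddata3 = record
    print(retrieveddata1, retrieveddata2, retrieveddata3)
```

Comparing the parsed first element, rather than searching the raw text, avoids false matches when a username happens to appear inside one of the other strings.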
I suggest that you use the general solution to the problem of extracting data from JSON objects, namely jsonpath. PyPi has this library https://pypi.python.org/pypi/jsonpath/
if jsonpath.jsonpath(jsondata,'$[0]') == 'John Connor':
result = jsonpath.jsonpath(jsondata,'$[2]')
Yes, the documentation for the Python library sucks but the code is straightforward to read through, and JSONPATH expressions are documented fairly well by other implementors.
Consider making your code clearer by adding something like:
NAME = '$[0]'
STUFF = '$[1]'
OTHERSTUFF = '$[2]'

Sorting and aligning the contents of a text file in Python

In my program I have a text file that I read from and write to. However, I would like to display the contents of the text file in an aligned and sorted manner. The contents currently read:
Emily, 6
Sarah, 4
Jess, 7
This is my code where the text file is read and printed:
elif userCommand == 'V':
    print "High Scores:"
    scoresFile = open("scores1.txt", 'r')
    scores = scoresFile.read().split("\n")
    for score in scores:
        print score
    scoresFile.close()
Would I have to convert this information into lists in order to be able to do this? If so, how do I go about doing this?
When writing to the file, I have added a '\n' character to the end, as each record should be printed on a new line.
Thank you
You could use the csv module, and then use sorted to sort.
Let's say scores1.txt has the following contents:
Richard,100
Michael,200
Ricky,150
Chaung,100
Test:
import csv
reader=csv.reader(open("scores1.txt"),dialect='excel')
items=sorted(reader)
for x in items:
    print x[0],x[1]
...
Chaung 100
Michael 200
Richard 100
Ricky 150
Looks like nobody's answered the "aligned" part of your request. Also, it's not clear whether you want the results sorted alphabetically by name, or rather by score. In the first case, alphabetical order (assuming Python 2.6):
with open("scores1.txt", 'r') as scoresFile:
    names_scores = [[x.strip() for x in l.split(',', 1)] for l in scoresFile]
# compute column widths
name_width = max(len(name) for name, score in names_scores)
score_width = max(len(score) for name, score in names_scores)
# sort and print
names_scores.sort()
for name, score in names_scores:
    print "%*s %*s" % (name_width, name, score_width, score)
If you want descending order by score, just change the names_scores.sort() line to two:
def getscore_int(name_score): return int(name_score[1])
names_scores.sort(key=getscore_int, reverse=True)
To sort stuff in Python, you can use sort()/sorted().
To print, you can use print with format specifiers, str.rjust/str.ljust, pprint, etc.
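Putting those pieces together, a minimal Python 3 sketch using str.ljust and str.rjust might look like this. The question's three records are inlined as a list of lines, and all variable names are chosen for the example.

```python
# The question's file contents, one "name, score" record per line.
raw_lines = ["Emily, 6\n", "Sarah, 4\n", "Jess, 7\n"]

records = []
for line in raw_lines:
    if not line.strip():
        continue  # ignore blank lines such as a trailing newline
    name, score = [field.strip() for field in line.split(",", 1)]
    records.append((name, int(score)))

# Highest score first; sort on the numeric score, not on the text.
records.sort(key=lambda rec: rec[1], reverse=True)

# Pad names on the right and scores on the left so the columns line up.
name_width = max(len(name) for name, _ in records)
score_width = max(len(str(score)) for _, score in records)
table = ["{} {}".format(name.ljust(name_width), str(score).rjust(score_width))
         for name, score in records]
print("\n".join(table))
```

Sorting by name instead would only need records.sort() with no key, since each tuple starts with the name.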
