Handling variable length command tuple with try...except

Handling variable length command tuple with try...except - python

I'm writing a Python 3 script that does tabulation for forestry timber counts.
The workers will radio the species, diameter, and height in logs of each tree they mark to the computer operator. The computer operator will then enter a command such as this:
OAK 14 2
which signifies that the program should increment the count of Oak trees of fourteen inches in diameter and two logs in height.
However, the workers also sometimes call in more than one of the same type of tree at a time. So the program must also be able to handle this command:
OAK 16 1 2
which would signify that we're increasing the count by two.
The way I have the parser set up is thus:
key=cmdtup[0]+"_"+cmdtup[1]+"_"+cmdtup[2]
try:
trees[key]=int(trees[key])+int(cmdtup[3])
except KeyError:
trees[key]=int(cmdtup[3])
except IndexError:
trees[key]=int(trees[key])+1
If the program is commanded to store a tree it hasn't stored before, a KeyError will go off, and the handler will set the dict entry instead of increasing it. If the third parameter is omitted, an IndexError will be raised, and the handler will treat it as if the third parameter was 1.
Issues occur, however, if we're in both situations at once; the program hasn't heard of Oak trees yet, and the operator hasn't specified a count. KeyError goes off, but then generates an IndexError of its own, and Python doesn't like it when exceptions happen in exception handlers.
I suppose the easiest way would be to simply remove one or the other except and have its functionality be done in another way. I'd like to know if there's a more elegant, Pythonic way to do it, though. Is there?

I would do something like this:
def parse(cmd, trees):
res = cmd.split() # split the string by spaces, yielding a list of strings
if len(res) == 3: # if we got 3 parameters, set the fourth to 1
res.append(1)
for i in range(1,4): # convert parameters 1-3 to integers
res[i] = int(res[i])
key = tuple(res[x] for x in range(3)) # convert list to tuple, as lists cannot be dictionary indexes
trees[key] = trees.get(key,0) + res[3] # increase the number of entries, creating it if needed
trees={}
# test data
parse("OAK 14 2", trees)
parse("OAK 16 1 2", trees)
parse("OAK 14 2", trees)
parse("OAK 14 2", trees)
# print result
for tree in trees:
print(tree, "=", trees[tree])
yielding
('OAK', 16, 1) = 2
('OAK', 14, 2) = 3
Some notes:
no error handling here, you should handle the case when a value supposed to be a number isn't or the input is wrong in any other way
instead of strings, I use tuples as a dictionary index

You could use collections.Counter, which returns 0 rather than a KeyError if the key isn't in the dictionary.
Counter Documentation:
Counter objects have a dictionary interface except that they return a zero count for missing items instead of raising a KeyError
Something like this:
from collections import Counter
counts = Counter()
def update_counts(counts, cmd):
cmd_list = cmd.split()
if len(cmd_list) == 3:
tree = tuple(cmd_list)
n = 1
else:
*tree, n = tuple(cmd_list)
counts[tree] += n
Same notes apply as in uselpa's answer. Another nice thing with Counter is that if you want to, e.g., look at weekly counts, you just do something like sum(daily_counts).
Counter works even better if you're starting from a list of commands:
from collections import Counter
from itertools import repeat
raw_commands = get_commands() # perhaps read a file
command_lists = [c.split() for c in raw_commands]
counts = Counter(parse(command_lists))
def parse(commands):
for c in commands:
if len(c) == 3:
yield tuple(c)
elif len(c) == 4
yield from repeat(tuple(c[0:2]), times=c[3])
From there you can use the update_counts function above to add new trees, or you can start collecting the commands in another text file and then generate a second Counter object for the next day, the next week, etc.

In the end, the best way was to simply remove the IndexError handler, change cmdtup to a list, and insert the following:
if len(cmdtup) >= 3:
cmdtup.append(1)

Related

Variable table width with .format

I'm trying to display data from a csv in a text table. I've got to the point where it displays everything that I need, however the table width still has to be set, meaning if the data is longer than the number set then issues begin.
I currently print the table using .format to sort out formatting, is there a way to set the width of the data to a variable that is dependant on the length of the longest piece of data?
for i in range(len(list_l)):
if i == 0:
print(h_dashes)
print('{:^1s}{:^26s}{:^1s}{:^26s}{:^1s}{:^26s}{:^1s}{:^26s}{:^1s}'.format('|', (list_l[i][0].upper()),'|', (list_l[i][1].upper()),'|',(list_l[i][2].upper()),'|', (list_l[i][3].upper()),'|'))
print(h_dashes)
else:
print('{:^1s}{:^26s}{:^1s}{:^26s}{:^1s}{:^26s}{:^1s}{:^26s}{:^1s}'.format('|', list_l[i][0], '|', list_l[i][1], '|', list_l[i][2],'|', list_l[i][3],'|'))
I realise that the code is far from perfect, however I'm still a newbie so it's piecemeal from various tutorials

You can actually use a two-pass approach to first get the correct lengths. As per your example with four fields per line, the following shows the basic idea you can use.
What follows is an example of the two-pass approach, first to get the maximum lengths for each field, the other to do what you're currently doing (with the calculated rather than fixed lengths):
# Can set MINIMUM lengths here if desired, eg: lengths = [10, 0, 41, 7]
lengths = [0] * 4
fmtstr = None
for pass in range(2):
for i in range(len(list_l)):
if pass == 0:
# First pass sets lengths as per data.
for field in range(4):
lengths[field] = max(lengths[field], len(list_l[i][field])
else:
# Second pass prints the data.
# First, set format string if not yet set.
if fmtstr is None:
fmtstr = '|'
for item in lengths:
fmtstr += '{:^%ds}|' % (item)
# Now print item (and header stuff if first item).
if i == 0: print(h_dashes)
print(fmtstr.format(list_l[i][0].upper(), list_l[i][1].upper(), list_l[i][2].upper(), list_l[i][3].upper()))
if i == 0: print(h_dashes)
The construction of the format string is done the first time you process an item in pass two.
It does so by taking a collection like [31,41,59] and giving you the string:
|{:^31s}|{:^41s}|{:^59s}|
There's little point using all those {:^1s} format specifiers when the | is not actually a varying item - you may as well code it directly into the format string.

Removing a concrete value from a set using function

(Newbie question) I want to write a Python program that removes a concrete item from a set if it is present in the set.
When the set is predefined, the code goes like this:
set = {1,2,3,4,4,5,6,10}
set.discard(4)
print(set)
What would be a way to write this, so that it applies to any set of values not known beforehand? I tried the following but it didn´t work. Is there a method along those lines that does?
def set(items):
if i in items == 4:
set.discard(4)
else:
print("The number 4 is not in the set.")
print(set({1,2,4,6}))

This will discard the 4 in any set passed to the function:
def discard4(items):
if 4 in items:
items.discard(4)
return "Discarded"
else:
return "The number 4 is not in the set."
print(discard4({1,2,6})) # will print "The number 4 is not in the set."
print(discard4({1,2,4,6})) # will print "Discarded"

Assign variable if list index out of range python error

How to pass a string to a variable if an index error is found? Consider the code:
for l1, l2 in zip(open('file1.list'), open ('file2.list')):
a=fasta1[int(l1)]
b=fasta2[int(l2)]
alignments = pairwise2.align.globalxx(a,b)
top_aln = alignments[0]
aln_a, aln_b, score, begin, end = top_aln
print aln_a+'\n'+aln_b
outfast1 = aln_a
outfast2 = aln_b
A number of these functions must be imported (pairwise2 align),
but the file.lists are single column text files with one sequence id (text and numbers) per line, that are used to extract from the fasta1 and fasta2 text files.
Basically, I want to try: each list command ( a=fasta1[int(l1)]) and if there is no error (the id is in range), do as normal (assign variables a and b for that iteration), but if NOT, assign the 'a' variable some placeholder text like 'GGG':
for l1, l2 in zip(open('file1.list'), open ('file2.list')):
try:
a=fasta1[int(l1)]
except IndexError,e:
a="GGG"
continue
try:
b=fasta2[int(l2)]
except (IndexError):
b="CCC"
continue
This code doesn't quite work (when integrated with above code), which isn't surprising given my lack of python prowess, but I don't quite know why. I actually get no text output, despite the print calls... Am I thinking about this right? If there is NO error in the index, I just want it to go on and do the pairwise alignment (with the first a and b variables) and then print some text to stdout.
Any ideas?

Python's conditional (aka ternary) expressions can one-line this for you. They're often criticized for lack of readability, but I think this example reads well enough.
a = fasta1[int(l1)] if int(l1) < len(fasta1) else "GGG"

You don't need continue, because it will skip that iteration of the loop. Consider the following:
for l1, l2 in zip(open('file1.list'), open ('file2.list')):
a = 'GGG'
b = 'CCC'
try:
a = fasta1[int(l1)]
b = fasta2[int(l2)]
except IndexError:
pass

Python: creating a dictionary that writes high scores to a file

First: you don't have to code this for me, unless you're a super awesome nice guy. But since you're all great at programming and understand it so much better than me and all, it might just be easier (since it's probably not too many lines of code) than writing paragraph after paragraph trying to make me understand it.
So - I need to make a list of high scores that updates itself upon new entries. So here it goes:
First step - done
I have player-entered input, which has been taken as a data for a few calculations:
import time
import datetime
print "Current time:", time1.strftime("%d.%m.%Y, %H:%M")
time1 = datetime.datetime.now()
a = raw_input("Enter weight: ")
b = raw_input("Enter height: ")
c = a/b
Second step - making high score list
Here, I would need some sort of a dictionary or a thing that would read the previous entries and check if the score (c) is (at least) better than the score of the last one in "high scores", and if it is, it would prompt you to enter your name.
After you entered your name, it would post your name, your a, b, c, and time in a high score list.
This is what I came up with, and it definitely doesn't work:
list = [("CPU", 200, 100, 2, time1)]
player = "CPU"
a = 200
b = 100
c = 2
time1 = "20.12.2012, 21:38"
list.append((player, a, b, c, time1))
list.sort()
import pickle
scores = open("scores", "w")
pickle.dump(list[-5:], scores)
scores.close()
scores = open("scores", "r")
oldscores = pickle.load(scores)
scores.close()
print oldscores()
I know I did something terribly stupid, but anyways, thanks for reading this and I hope you can help me out with this one. :-)

First, don't use list as a variable name. It shadows the built-in list object. Second, avoid using just plain date strings, since it is much easier to work with datetime objects, which support proper comparisons and easy conversions.
Here is a full example of your code, with individual functions to help divide up the steps. I am trying not to use any more advanced modules or functionality, since you are obviously just learning:
import os
import datetime
import cPickle
# just a constants we can use to define our score file location
SCORES_FILE = "scores.pickle"
def get_user_data():
time1 = datetime.datetime.now()
print "Current time:", time1.strftime("%d.%m.%Y, %H:%M")
a = None
while True:
a = raw_input("Enter weight: ")
try:
a = float(a)
except:
continue
else:
break
b = None
while True:
b = raw_input("Enter height: ")
try:
b = float(b)
except:
continue
else:
break
c = a/b
return ['', a, b, c, time1]
def read_high_scores():
# initialize an empty score file if it does
# not exist already, and return an empty list
if not os.path.isfile(SCORES_FILE):
write_high_scores([])
return []
with open(SCORES_FILE, 'r') as f:
scores = cPickle.load(f)
return scores
def write_high_scores(scores):
with open(SCORES_FILE, 'w') as f:
cPickle.dump(scores, f)
def update_scores(newScore, highScores):
# reuse an anonymous function for looking
# up the `c` (4th item) score from the object
key = lambda item: item[3]
# make a local copy of the scores
highScores = highScores[:]
lowest = None
if highScores:
lowest = min(highScores, key=key)
# only add the new score if the high scores
# are empty, or it beats the lowest one
if lowest is None or (newScore[3] > lowest[3]):
newScore[0] = raw_input("Enter name: ")
highScores.append(newScore)
# take only the highest 5 scores and return them
highScores.sort(key=key, reverse=True)
return highScores[:5]
def print_high_scores(scores):
# loop over scores using enumerate to also
# get an int counter for printing
for i, score in enumerate(scores):
name, a, b, c, time1 = score
# #1 50.0 jdi (20.12.2012, 15:02)
print "#%d\t%s\t%s\t(%s)" % \
(i+1, c, name, time1.strftime("%d.%m.%Y, %H:%M"))
def main():
score = get_user_data()
highScores = read_high_scores()
highScores = update_scores(score, highScores)
write_high_scores(highScores)
print_high_scores(highScores)
if __name__ == "__main__":
main()
What it does now is only add new scores if there were no high scores or it beats the lowest. You could modify it to always add a new score if there are less than 5 previous scores, instead of requiring it to beat the lowest one. And then just perform the lowest check after the size of highscores >= 5

The first thing I noticed is that you did not tell list.sort() that the sorting should be based on the last element of each entry. By default, list.sort() will use Python's default sorting order, which will sort entries based on the first element of each entry (i.e. the name), then mode on to the second element, the third element and so on. So, you have to tell list.sort() which item to use for sorting:
from operator import itemgetter
[...]
list.sort(key=itemgetter(3))
This will sort entries based on the item with index 3 in each tuple, i.e. the fourth item.
Also, print oldscores() will definitely not work since oldscores is not a function, hence you cannot call it with the () operator. print oldscores is probably better.

Here are the things I notice.
These lines seem to be in the wrong order:
print "Current time:", time1.strftime("%d.%m.%Y, %H:%M")
time1 = datetime.datetime.now()
When the user enters the height and weight, they are going to be read in as strings, not integers, so you will get a TypeError on this line:
c = a/b
You could solve this by casting a and b to float like so:
a = float(raw_input("Enter weight: "))
But you'll probably need to wrap this in a try/catch block, in case the user puts in garbage, basically anything that can't be cast to a float. Put the whole thing in a while block until they get it right.
So, something like this:
b = None
while b == None:
try:
b = float(raw_input("Enter height: "))
except:
print "Weight should be entered using only digits, like '187'"
So, on to the second part, you shouldn't use list as a variable name, since it's a builtin, I'll use high_scores.
# Add one default entry to the list
high_scores = [("CPU", 200, 100, 2, "20.12.2012, 4:20")]
You say you want to check the player score against the high score, to see if it's best, but if that's the case, why a list? Why not just a single entry? Anyhow, that's confusing me, not sure if you really want a high score list, or just one high score.
So, let's just add the score, no matter what:
Assume you've gotten their name into the name variable.
high_score.append((name, a, b, c, time1))
Then apply the other answer from #Tamás

You definitely don't want a dictionary here. The whole point of a dictionary is to be able to map keys to values, without any sorting. What you want is a sorted list. And you've already got that.
Well, as Tamás points out, you've actually got a list sorted by the player name, not the score. On top of that, you want to sort in downward order, not upward. You could use the decorate-sort-undecorate pattern, or a key function, or whatever, but you need to do something. Also, you've put it in a variable named list, which is a very bad idea, because that's already the name of the list type.
Anyway, you can find out whether to add something into a sorted list, and where to insert it if so, using the bisect module in the standard library. But it's probably simpler to just use something like SortedCollection or blist.
Here's an example:
highscores = SortedCollection(scores, key=lambda x: -x[3])
Now, when you finish the game:
highscores.insert_right((player, a, b, newscore, time1))
del highscores[-1]
That's it. If you were actually not in the top 10, you'll be added at #11, then removed. If you were in the top 10, you'll be added, and the old #10 will now be #11 and be removed.
If you don't want to prepopulate the list with 10 fake scores the way old arcade games used to, just change it to this:
highscores.insert_right((player, a, b, newscore, time1))
del highscores[10:]
Now, if there were already 10 scores, when you get added, #11 will get deleted, but if there were only 3, nothing gets deleted, and now there are 4.
Meanwhile, I'm not sure why you're writing the new scores out to a pickle file, and then reading the same thing back in. You probably want to do the reading before adding the highscore to the list, and then do the writing after adding it.
You also asked how to "beautify the list". Well, there are three sides to that.
First of all, in the code, (player, a, b, c, time1) isn't very meaningful. Giving the variables better names would help, of course, but ultimately you still come down to the fact that when accessing list, you have to do entry[3] to get the score or entry[4] to get the time.
There are at least three ways to solve this:
Store a list (or SortedCollection) of dicts instead of tuples. The code gets a bit more verbose, but a lot more readable. You write {'player': player, 'height': a, 'weight': b, 'score': c, 'time': time1}, and then when accessing the list, you do entry['score'] instead of entry[3].
Use a collection of namedtuples. Now you can actually just insert ScoreEntry(player, a, b, c, time1), or you can insert ScoreEntry(player=player, height=a, weight=b, score=c, time=time1), whichever is more readable in a given case, and they both work the same way. And you can access entry.score or as entry[3], again using whichever is more readable.
Write an explicit class for score entries. This is pretty similar to the previous one, but there's more code to write, and you can't do indexed access anymore, but on the plus side you don't have to understand namedtuple.
Second, if you just print the entries, they look like a mess. The way to deal with that is string formatting. Instead of print scores, you do something like this:
print '\n'.join("{}: height {}, weight {}, score {} at {}".format(entry)
for entry in highscores)
If you're using a class or namedtuple instead of just a tuple, you can even format by name instead of by position, making the code much more readable.
Finally, the highscore file itself is an unreadable mess, because pickle is not meant for human consumption. If you want it to be human-readable, you have to pick a format, and write the code to serialize that format. Fortunately, the CSV format is pretty human-readable, and most of the code is already written for you in the csv module. (You may want to look at the DictReader and DictWriter classes, especially if you want to write a header line. Again, there's the tradeoff of a bit more code for a lot more readability.)

Matching Binary operators in Tuples to Dictionary Items

So, I'm working on a Pybrain-type project and I'm stuck on part of it.
So far the program takes in a tuple and assigns a variable to it using 'one of them fancy vars()['string'] statements. Specifically, it takes in a tuple of numbers and assigns it to a 'layerx' value, where x is the number of the layer (in order, layer 1, 2, 3, etc), such that the numbers are the dimensions of that layer.
The part of the program I desperately and humbly come to you for help in is what should be the next step in the program; it takes in a tuple of tuples (the number of tuples must = the number of layers), and the tuples contain 1/0's.
It is supposed to determine what type of Pybrain Layer to use in what layer, and then plugs in that layer's dimension value and, essentially, creates that layer-variable. I've...played with it for a while, and I've gotten a really...twisted...confusing block of code.
Please pardon the convoluted variable names, I thought I was being smart by making them somewhat specific:
moduleconbuff = 0
modulebuffer = 'module'
correspondinglayerbuff = 0
moduleconfigcopy = tuple(moduleconfig)
try: #Always triggers except, but it's pretty screwed up
while correspondinglayerbuff <= len(self.layers): #keeps track of how many layer/module pairs have been assigned
for elm in moduleconfigcopy:
for x in elm:
if x == 1:
moduledimmension = [layerbuff+'%s'%(correspondinglayerbuff)]
modulesdict = {1: pybrain.GaussianLayer(moduledimmension), 2: pybrain.LinearLayer(moduledimmension),\
3: pybrain.LSTMLayer(moduledimmension),4: pybrain.SigmoidLayer(moduledimmension),5: pybrain.TanhLayer(moduledimmension)} #this dict pairs integers with pybrain modules
vars()[modulebuffer +'%s'%(correspondinglayerbuff)]=modulesdict(moduleconbuff) #should return something like 'Module1 = pybrain.GaussianLayer(5) when complete
print vars()[modulebuffer+'%s'%(correspondinglayerbuff)]
moduleconbuff=0
correspondinglayerbuff+=1
print 'Valid: ', moduleconfigcopy, elm
continue
else:
elm = elm[1:]
print 'Invalid: ', moduleconfigcopy, elm
moduleconbuff+=1
except:
print 'Invalid!!!'
I honestly lost track of what was going on in it. The tuple "moduleconfig" in the beginning
was supposed to be a tuple of tuples (nested tuples) with binary operators, it was supposed to stop when one of the tuples has a 1, match that operator with the right module in Pybrain, and then plug this in so the corresponding layer = that module with the dimmensions already listed.
Obviously something went terribly wrong, and it's so fargone that my brain can't make any sense of it...it's lost all it's reason and every time I look at it I get scared...please help me or tell me I created an abomination or something, I guess...

One huge hindrance that's affecting code readability for you is variable naming and style. I've tried to clean it up a little bit for you. It still might not work, but now it's a LOT easier to see what's going on. Please refer to PEP 8, the Python style guide
For starters, I renamed some variables, below. Note that in python, variables should be all lowercase, with separate words connected by an underscore. Constants should be ALL_UPPERCASE:
assigned_layers = correspondinglayerbuff = 0
tuple_of_tuples = moduleconfigcopy = ((0, 1), (0, 0, 1), (0, 1))
dimension = moduledimension
MOD_BUFFER = modulebuffer = 'buffer'
c_buff = moduleconbuff = 0
And here is the while loop (with variable names replaced, and properly indented, with the try... except block removed:
while assigned_layers <= len(self.layers):
for element_tuple in tuple_of_tuples:
for item in element_tuple:
if item: # in python, 0 is treated as boolean False, 1 or any other value is treated as boolean True.
dimension = [layerbuff + str(assigned_layers)] #what is layerbuff?
modules_dict = {
1: pybrain.GaussianLayer(dimension),
2: pybrain.LinearLayer(dimension),
3: pybrain.LSTMLayer(dimension),
4: pybrain.SigmoidLayer(dimension),
5: pybrain.TanhLayer(dimension)
} # Notice how this dict is much easier to read.
vars()[MOD_BUFFER + str(assigned_layers)] = modules_dict[c_buff] #modules_dict is a dict and not a callable object
c_buff = 0
assigned_layers +=1
#No need for continue here, since that's what the if...else does here.
else:
element_tuple = element_tuple[1:] #what is this for?
print 'Invalid: ', tuple_of_tuples, element_tuple
I'm not sure exactly what you are trying to do in this line:
vars()[MOD_BUFFER + str(assigned_layers)] = modules_dict[c_buff] #modules_dict is a dict and not a callable object
Also, you originally had modules_dict(moduleconbuff) which will raise a TypeError as a dict is not a callable object. I'm assuming you meant to retrieve a value by key.
As I said, I'm not quite sure what your trying to do here (probably because I haven't seen the rest of your code), but renaming your variables and using good style should go a long way towards you being able to debug your code. I will continue to edit if you answer my questions/comment.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.