Range for nargs in argparse - python

I have a script which merges multiple video and audio files. Now I have a parameter which allows four values:
# -A FILENAME LANGUAGE POSITION SPEED
$ script.py [... more parameters ...] -A audio.mp3 eng -1 1 [... more parameters ...]
Now I want the third and fourth to be optional. Currently I have two ideas but maybe there is a better solution:
Set nargs to + and throw an error if 1 or more than 4 parameters are supplied. Maybe the type parameter can catch this. Problem would be that it isn't visible in the help that 2 to 4 values are required.
Have 4 different parameters for all combinations. This would allow to have the position optional. Problem is that I then need four parameter names.
The parameter also might appear multiple times (action is append).

I would suggest having -A take a single, comma-separated string (or use the delimiter of your choice), and supply a custom metavar for the help message.
import argparse

def av_file_type(value):
    data = tuple(value.split(","))
    n = len(data)
    if n < 2:
        raise argparse.ArgumentTypeError("Too few arguments")
    elif n == 2:
        return data + (default_position, default_speed)
    elif n == 3:
        return data + (default_speed,)
    elif n == 4:
        return data
    else:
        raise argparse.ArgumentTypeError("Too many arguments")

p.add_argument("-A", action='append', type=av_file_type,
               metavar='filename,language[,position[,speed]]')
With nargs='+', it would be extremely non-trivial to format the help string the way you like.

I think the things you want to happen are:
allow the user to input 2, 3 or 4 arguments. '+' allows that.
tell the user how many arguments they can give. If the code doesn't do what you want, you can always give a custom usage, description, or help.
object if they enter 1 or more than 4. You can test entries in 3 places - with a custom type, a custom action, or after parse_args.
type won't help you here, because it handles each argument separately. If I enter p.parse_args('-A one two three'.split()), the type function is called 3 times, once for each of the argument strings. It does not see all the strings together.
action might work, since it sees all the argument values that parse_args thinks -A wants. This would be all the strings between one -A and the next -A (or other flag). But since you want to append, you need to model your custom action on the argparse._AppendAction class.
Checking the namespace after the fact may be your best choice. You'll have a list of lists, and you can check the number of elements in each of the sublists. You can use parser.error(your_message) to generate an argparse-style message.
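A minimal sketch of that last option, checking the parsed namespace afterwards. The parser setup, the help text, and the helper name parse_audio_args are assumptions made for illustration; they are not taken from the original script:

```python
import argparse

parser = argparse.ArgumentParser()
# nargs="+" collects every value up to the next flag; append keeps one list per -A.
parser.add_argument("-A", nargs="+", action="append", default=[],
                    metavar="VALUE",
                    help="FILENAME LANGUAGE [POSITION [SPEED]] (2 to 4 values)")


def parse_audio_args(argv):
    """Parse argv and reject any -A group that does not have 2 to 4 values."""
    args = parser.parse_args(argv)
    for group in args.A:
        if not 2 <= len(group) <= 4:
            # parser.error prints the usage line plus the message and exits with status 2.
            parser.error("-A expects 2 to 4 values, got %d" % len(group))
    return args
```

Because the range is stated in the help text and enforced after parsing, the user sees both the expected count and an argparse-style error when it is violated.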
There is a Python bug issue about enabling a nargs range value: http://bugs.python.org/issue11354. I proposed a patch that would accept nargs='{m,n}', which is modeled on the re feature. In fact it ends up using re matching to allocate strings to various actions. Read that issue if you want to know more about what SethMMorton is talking about.

Based on chepner's answer I developed a more advanced “subparser”:
import argparse

audio_parameters = ["f", "l", "p", "s", "b", "o"]

def audio_parser(value):
    data = {
        "l": None,
        "p": -1,
        "s": 1,
        "b": None,
        "o": 0,
    }
    found = set()
    if value[0] in audio_parameters and value[1] == "=":
        start = 0
        while start >= 0:
            end = start
            parameter = value[start]
            found.add(parameter)
            # search for the next ',x=' block, where x is an audio_parameter
            while end >= 0:
                # try the next ',' after the last one found
                end = value.find(",", end + 2)
                # exit the loop when a block is found (or when no ',' is left)
                if end >= 0 and value[end + 1] in audio_parameters and value[end + 1] not in found and value[end + 2] == "=":
                    end += 1
                    break
            if parameter in audio_parameters:
                parameter_value = value[start + 2:end - 1 if end > 0 else len(value)]
                if parameter_value != "":
                    data[parameter] = parameter_value
            start = end
    else:
        i = 0
        for splitted in value.split(","):
            if i >= len(audio_parameters):
                # raise (not return) so argparse reports the error
                raise argparse.ArgumentTypeError("Too many arguments")
            if len(splitted) > 0:
                data[audio_parameters[i]] = splitted
            i += 1
    if "f" in data:
        return data
    else:
        raise argparse.ArgumentTypeError("Too few arguments")
This allows the proposed file[,lang[,pos[,speed]]] but also more advanced selecting specific values. For example to set only the file, language and speed f=file,s=speed,l=lang does work, and this in any order. It also allows something which might look like a parameter name, but which doesn't exist or was already used. Both might have been parsed by the simple version (f=file,x=stillname,s=speed,l=lang). The f parameter there is then file,x=stillname. It also allows something like f=file,f=overwrites because it accepts only the first occurrence. So if the file name contains ,b= you can simply write b=,f=file,b=haha.
A mixed mode like file,l=lang is not possible. And as you might have seen, that parameter got way more complex and has now 6 subparameters which makes it almost impossible to use one parameter name for each combination. And a structure like '{n,m}' is also not as flexible as you can't easily omit values.
One thing I noticed though, a metavar with [] doesn't work.

Related

Same input gives different output for a program

Novice programmer here. I am trying to write a program wherein it will take UIDs from user and validate them based on certain rules. The rules are:
1. It must contain at least 2 uppercase English alphabet characters.
2. It must contain at least 3 digits (0-9).
3. It should only contain alphanumeric characters (A-Z, a-z and 0-9).
4. No character should repeat.
5. There must be exactly 10 characters in a valid UID.
I am putting in the code. Also apologies for this big code (I am a newbie)
# UID Validation
n=int(input()) #for iterations
uid=[]
# char=[]
valid=1
upper=0
numeric=0
# take input first of everycase
for x in range (0,n):
    num=input()
    uid.append(num)
# print (uid)
for i in uid:
    # print (i)
    # to count each word and number
    count={}
    for char in i:
        count[char]=count.get(char,0)+1
    for j in i:
        if j.isupper():
            upper=upper+1
        elif j.isnumeric():
            numeric=numeric+1
    # print('numeric =', numeric)
    # print('upper =', upper)
    # Check conditions
    while valid==1:
        if len(i)!= 10:
            valid= 0
            # print('invalid for word count')
        elif i.isalnum()== False: #alphanumeric
            valid=0
            # print('invalid for alnum')
        elif upper<2: #minimum alphabet and numbers
            valid=0
            # print('invalid for min alphabet')
        elif numeric<3:
            valid=0
            # print('invalid for min numeric')
        else:
            for k,v in count.items(): #no repitation
                if v>1:
                    valid=0
        # to check if given UID is valid or not
        if valid==1:
            print ('Valid')
        elif valid==0:
            print('Invalid')
        valid=1
        break
I have written the code, but I am facing a problem with one input only: the UID tag 2TB1YVIGNM.
It is an invalid tag. My program reports it as invalid when I run it alone or first in a batch of many. But if I run the program and input 2 tags, with "2TB1YVIGNM" being the second one, it shows it as "Valid". Mind you, this is only happening with this particular tag.
There are several other tags which run fine. Some of them are mentioned here:
77yS77UXtS
d72MJ4Rerf
OA778K96P2
2TB1YVIGNM "EXCEPT THIS TAG"
9JC86fM1L7
3w2F84OSw5
GOeGU49JDw
8428COZZ9C
WOPOX413H2
1h5dS6K3X8
Fq6FN44C6P
The output should be:
Invalid
Valid
Invalid
Invalid
Valid
Invalid
Invalid
Invalid
Invalid
Valid
Invalid
My output is this:
Invalid
Valid
Invalid
Valid
Valid
Invalid
Invalid
Invalid
Invalid
Valid
Invalid
To solve your problem you need to set upper and numeric back to 0 for each uid:
for i in uid:
    upper = 0
    numeric = 0
    count={}
P.S.: Since you are a newbie, I would suggest reading PEP 8; it will make your code more readable and prettier.
P.P.S.: There is no need to manually count how many times each character occurs in a string. That operation is already implemented in Python; look at collections.Counter for more details.
And I agree with the comment that for this type of task it is better to use regex.
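As a rough sketch of how the Counter and regex suggestions could fit together (the function name is_valid_uid and the exact patterns are assumptions for illustration, not the original poster's code), each rule can be checked inside one function so that no state leaks from one UID to the next:

```python
import re
from collections import Counter


def is_valid_uid(uid):
    """Return True if uid satisfies all five rules from the question."""
    # Exactly 10 characters, all of them ASCII letters or digits.
    if not re.fullmatch(r"[A-Za-z0-9]{10}", uid):
        return False
    # No character may occur more than once.
    if max(Counter(uid).values()) > 1:
        return False
    # At least 2 uppercase letters and at least 3 digits.
    return len(re.findall(r"[A-Z]", uid)) >= 2 and len(re.findall(r"[0-9]", uid)) >= 3


for tag in ["d72MJ4Rerf", "2TB1YVIGNM"]:
    print("Valid" if is_valid_uid(tag) else "Invalid")
```

Since the counts are computed fresh on every call, the stale upper and numeric totals that caused the wrong "Valid" cannot occur here.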
You could extract pieces of logic into functions and call them:
#It must contain at least 2 uppercase English alphabet characters.
def has_at_least_two_uppercase(potential_uid):
    # isupper() is used because char.upper() == char is also True for digits
    return sum([char.isupper() for char in potential_uid]) >= 2

#No character should repeat.
def has_unique_chars(potential_uid):
    return len(set(potential_uid)) == len(potential_uid)

#There must be exactly 10 characters in a valid UID.
def is_proper_length(potential_uid:str, proper_length:int = 10)-> bool:
    return len(potential_uid) == proper_length

#It must contain at least 3 digits ( 0-9 ).
def has_three_digits(potential_uid):
    return sum([char.isnumeric() for char in potential_uid]) >= 3

#It should only contain alphanumeric characters (A -Z ,a -z & 0 -9 )
# Defining a function for this may be an overkill to be honest
def is_alphanumeric(potential_uid):
    return potential_uid.isalnum()

def is_valid_uid(potential_uid):
    if has_at_least_two_uppercase(potential_uid) is False:
        return False
    if has_unique_chars(potential_uid) is False:
        return False
    if is_proper_length(potential_uid) is False:
        return False
    if has_three_digits(potential_uid) is False:
        return False
    if is_alphanumeric(potential_uid) is False:
        return False
    return True
Side notes:
use is to check for True/False
use True/False and not 1/0 for boolean conditions (like valid variable)
[OPTIONAL / homework]
use docstrings instead of comments
add type hints (see is_proper_length as an example)
you can use all() and pass all the calls into it, but the ifs will short-circuit out of the function without checking all the conditions (it all depends on the problem: number of conditions, length of the UID, number of UIDs to be checked, etc.), and you can play around with the order of the checks, e.g. if the length is not right there's no need to check the rest (but that is premature optimization in a way, which is discouraged in general)
parametrize your functions further if need be, define params for number of upper to check, numeric and so on
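For the all() and parametrization notes, here is one possible sketch. The names CHECKS, passes_all_checks, has_min_uppercase and has_min_digits are made up for this example. When all() is given a generator expression rather than a list, it also stops at the first failing check:

```python
def is_proper_length(uid, proper_length=10):
    return len(uid) == proper_length


def has_unique_chars(uid):
    return len(set(uid)) == len(uid)


def has_min_uppercase(uid, minimum=2):
    # The threshold is a parameter, so other minimums can be tested too.
    return sum(char.isupper() for char in uid) >= minimum


def has_min_digits(uid, minimum=3):
    return sum(char.isdigit() for char in uid) >= minimum


# Checks listed first run first: the length test comes before the counting ones.
CHECKS = (is_proper_length, str.isalnum, has_unique_chars,
          has_min_uppercase, has_min_digits)


def passes_all_checks(uid):
    # A generator expression lets all() stop at the first check that fails.
    return all(check(uid) for check in CHECKS)
```

Passing a list of already computed results, as in all([a(uid), b(uid)]), would run every check before all() sees any of them; the generator form avoids that.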

Variable table width with .format

I'm trying to display data from a csv in a text table. I've got to the point where it displays everything that I need, however the table width still has to be set, meaning if the data is longer than the number set then issues begin.
I currently print the table using .format to sort out formatting. Is there a way to set the width of the data to a variable that is dependent on the length of the longest piece of data?
for i in range(len(list_l)):
    if i == 0:
        print(h_dashes)
        print('{:^1s}{:^26s}{:^1s}{:^26s}{:^1s}{:^26s}{:^1s}{:^26s}{:^1s}'.format('|', (list_l[i][0].upper()),'|', (list_l[i][1].upper()),'|',(list_l[i][2].upper()),'|', (list_l[i][3].upper()),'|'))
        print(h_dashes)
    else:
        print('{:^1s}{:^26s}{:^1s}{:^26s}{:^1s}{:^26s}{:^1s}{:^26s}{:^1s}'.format('|', list_l[i][0], '|', list_l[i][1], '|', list_l[i][2],'|', list_l[i][3],'|'))
I realise that the code is far from perfect, however I'm still a newbie so it's piecemeal from various tutorials
You can use a two-pass approach: the first pass finds the maximum length of each field, and the second does what you're currently doing, but with the calculated lengths rather than fixed ones. Following your example with four fields per line, the basic idea looks like this:
# Can set MINIMUM lengths here if desired, eg: lengths = [10, 0, 41, 7]
lengths = [0] * 4
fmtstr = None
for pass_num in range(2):  # "pass" is a reserved word, so it cannot be a variable name
    for i in range(len(list_l)):
        if pass_num == 0:
            # First pass sets lengths as per data.
            for field in range(4):
                lengths[field] = max(lengths[field], len(list_l[i][field]))
        else:
            # Second pass prints the data.
            # First, set format string if not yet set.
            if fmtstr is None:
                fmtstr = '|'
                for item in lengths:
                    fmtstr += '{:^%ds}|' % (item)
            # Now print item (and header stuff if first item).
            if i == 0: print(h_dashes)
            print(fmtstr.format(list_l[i][0].upper(), list_l[i][1].upper(), list_l[i][2].upper(), list_l[i][3].upper()))
            if i == 0: print(h_dashes)
The construction of the format string is done the first time you process an item in pass two.
It does so by taking a collection like [31,41,59] and giving you the string:
|{:^31s}|{:^41s}|{:^59s}|
There's little point using all those {:^1s} format specifiers when the | is not actually a varying item - you may as well code it directly into the format string.
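The same idea can be written more compactly by letting the format spec take its width from an argument. The following is only a sketch: format_table and its sample rows are hypothetical, and it assumes every row has the same number of string cells, with the first row being the header:

```python
def format_table(rows):
    """Return the lines of a text table whose columns fit the longest cell."""
    # Width of each column = length of its longest cell.
    widths = [max(len(cell) for cell in column) for column in zip(*rows)]
    # One dash per character of a row: the cell widths plus the "|" separators.
    dashes = "-" * (sum(widths) + len(widths) + 1)

    def format_row(cells):
        # "{:^{}}" centres the cell, taking the width from the second argument.
        return "|" + "|".join("{:^{}}".format(cell, width)
                              for cell, width in zip(cells, widths)) + "|"

    header, *body = rows
    lines = [dashes, format_row([cell.upper() for cell in header]), dashes]
    lines.extend(format_row(row) for row in body)
    return lines


rows = [["name", "city"], ["Al", "Rome"], ["Barbara", "Oslo"]]
print("\n".join(format_table(rows)))
```

Here the dashed rule is computed from the column widths as well, so it grows with the data instead of being a fixed string.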

GET Request Flask

I have written something that works, but I am 100% sure that there is an even more efficient and faster way of doing what I did.
The code that I have written, essentially uses OpenBayes' library and creates a network with its nodes, relationships between nodes, and the probabilities and distributions associated with each of the nodes. Now, I was creating a GET request using Flask, in order to process the conditional probabilities by simply sending the request.
I will send some evidence (given values), and set the node in which I want its probability (observed value). Mathematically it looks like this:
Observed Value = O and Evidence = En, where n ≥ 1
P( O | E1, E2, ..., En)
My final goal would be to have a client/server ping the server hosting this code(with the right parameters) and constantly give me the final values of the observed probability, given the evidence (which could be 1 or more values). The code I have written so far for the GET request portion is:
@app.route('/evidence/evidence=<evidence>&observed=<obv>', methods=['GET'])
def get_evidence(evidence, obv):
    # Take <evidence> and <obv> split them up. For example:
    # 'cloudy1rain0sprinkler1' to 'cloudy1', 'rain0' and 'sprinkler1', all in a nice list.
    analyzeEvidence, observedNode = evidence.upper().strip(), obv.upper().strip()
    string, count, newCount, listOfEvidence = "", 0, 0, {}
    counter = sum(character.isdigit() for character in analyzeEvidence)
    # This portion is to set up all the evidences.
    for y in xrange(0, counter):
        string, newCount = "", count
        for x in xrange(newCount, len(analyzeEvidence)):
            count += 1
            if analyzeEvidence[x].isalpha() == True:
                string += str(analyzeEvidence[x])
            elif analyzeEvidence[x].isdigit() == True and string in allNodes:
                if int(analyzeEvidence[x]) == 1 or int(analyzeEvidence[x]) == 0:
                    listOfEvidence[string] = int(analyzeEvidence[x])
                    break
                else: abort(400)
                break
            else: abort(400)
    net.SetObs(listOfEvidence) # This would set the evidence like this: {"CLOUDY": 1, "RAIN":0}
    # This portion is to set up one single observed value
    string = ""
    for x in xrange(0, len(observedNode)):
        if observedNode[x].isalpha() == True:
            string += str(observedNode[x])
            if string == "WETGRASS":
                string = "WET GRASS"
        elif observedNode[x].isdigit() == True and string in allNodes:
            if int(observedNode[x]) == 1 or int(observedNode[x]) == 0:
                observedValue = int(observedNode[x])
                observedNode = string
                break
            else: abort(400)
        else: abort(400)
    return str(net.Marginalise(observedNode)[observedValue]) # Output returned is the value like: 0.7452
Given my code, is there any way to optimize it? Also, Is there a better way of passing these parameters that doesn't take so many lines like my code does? I was planning on setting fixed key parameters, but because my number of evidence can change per request, I thought this would be one way in doing so.
You can easily split your evidence input into a list of strings with this:
import re
# 'cloudy1rain0sprinkler1' => ['cloudy1', 'rain0', 'sprinkler1'].
evidence_dict = {}
input_evidence = 'cloudy1rain0sprinkler1'
# looks for a sequence of letters followed by one or more digits
evidence_list = re.findall(r'([a-z]+\d+)', input_evidence.lower())
for evidence in evidence_list:
    name, val, _ = re.split(r'(\d+)', evidence)
    if name in allNodes:
        evidence_dict[name] = int(val)  # convert, otherwise the value stays a string
# evidence_dict = {'cloudy': 1, 'rain': 0, 'sprinkler': 1}
You should be able to do something similar with the observations.
I would suggest you use an HTTP POST. That way you can send a JSON object which will already have the separation of variable names and values done for you, all you'll have to do is check that the variable names sent are valid in allNodes. It will also allow your variable list to grow somewhat arbitrarily.
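To make the JSON suggestion concrete, here is a hedged sketch of the validation step only, using the standard library. The payload layout (an "evidence" object plus an "observed" object), the ALL_NODES set, and the function name parse_query are all assumptions for illustration; the Flask view itself would read the request body and could turn a ValueError into abort(400):

```python
import json

# Assumed node names, based on the examples in the question.
ALL_NODES = {"CLOUDY", "RAIN", "SPRINKLER", "WET GRASS"}


def parse_query(body, all_nodes=ALL_NODES):
    """Return (evidence, observed_node, observed_value) from a JSON request body.

    Raises ValueError for unknown node names or values other than 0 and 1.
    """
    payload = json.loads(body)
    evidence = {name.upper(): value for name, value in payload["evidence"].items()}
    node = payload["observed"]["node"].upper()
    value = payload["observed"]["value"]
    # Validate every evidence entry and the observed node in one pass.
    for name, val in list(evidence.items()) + [(node, value)]:
        if name not in all_nodes or val not in (0, 1):
            raise ValueError("bad node or value: %r=%r" % (name, val))
    return evidence, node, value


body = '{"evidence": {"cloudy": 1, "rain": 0}, "observed": {"node": "wet grass", "value": 1}}'
print(parse_query(body))
```

With this shape the number of evidence entries can vary per request without any string splitting, and node names containing spaces need no special case.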

Nested Loop 'If' Statement Won't Print Value of Tuple

Current assignment is building a basic text adventure. I'm having trouble with the following code. The current assignment uses only functions, and that is the way the rules of the assignment state it must be done.
def make_selections(response):
    repeat = True
    while repeat == True:
        selection = raw_input('-> ')
        for i, v in enumerate(response):
            i +=1 # adds 1 to the index to make list indices correlate to a regular 1,2,3 style list
            if selection == i:
                print v[1]
            else:
                print "There's an error man, what are you doing?!?!?"

firstResponse = 'You chose option one.'
secondResponse = 'You chose option two.'
thirdResponse = 'You chose option three.'
responses = [(0, firstResponse), (1, secondResponse),( 0, thirdResponse)]
make_selections(responses)
My intention in that code is to make it so if the user selects a 1, it will return firstResponse, if the user selects 2 it will return secondResponse, etc.
I am basically just bug testing the code to make sure it produces the appropriate response, hence the "Error man..." string, but for some reason it just loops through the error message without printing the appropriate response string. Why is this?
I know that this code is enumerating the list of tuples and I can call them properly, as I can change the code to the following and get the expected output:
for i, v in enumerate(response):
    i += 1 # adds 1 to the index to make list indices correlate to a regular 1,2,3 style list
    print i, v
Also, two quick asides before anyone asks:
I know there is currently no way to get out of this while loop. I'm just making sure each part of my code works before I move on to the next part. Which brings me to the point of the tuples.
When I get the code working, a 0 will produce the response message and loop again, asking the user to make a different selection, whereas a 1 will produce the appropriate response, break out of the loop, and move on to the next 'room' in the story... this way I can have as many 'rooms' for as long of a story as I want, the player does not have to 'die' each time they make an incorrect selection, and each 'room' can have any arbitrary amount of options and possible responses to choose from and I don't need to keep writing separate loops for each room.
There are a few problems here.
First, there's no good reason to iterate through all the numbers just to see if one of them matches selection; you already know that will be true if 1 <= selection <= len(response), and you can then just do response[selection-1] to get the v. (If you know anything about dicts, you might be able to see an even more convenient way to write this whole thing… but if not, don't worry about it.)
But if you really want to do this exhaustive search, you shouldn't print out There is an error man after any mismatch, because then you're always going to print it at least twice. Instead, you want to only print it if all of them failed to match. You can do this by keeping track of a "matched" flag, or by using a break and an else: clause on your for loop, whichever seems simpler, but you have to do something. See break and continue Statements, and else Clauses on Loops in the tutorial for more details.
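Both approaches can be sketched as follows. This is written for Python 3 (the question uses Python 2), the function names are hypothetical, and selection is assumed to have been converted to an int already:

```python
def pick_response(selection, responses):
    """Return the response text for a 1-based selection, or None if out of range."""
    if 1 <= selection <= len(responses):
        return responses[selection - 1][1]
    return None


def pick_response_by_search(selection, responses):
    """Same result via an exhaustive search, using an else clause on the for loop."""
    for number, (_, text) in enumerate(responses, start=1):
        if selection == number:
            break  # found a match, so the else clause below is skipped
    else:
        # Runs only when the loop finished without hitting break.
        return None
    return text


responses = [(0, 'You chose option one.'),
             (1, 'You chose option two.'),
             (0, 'You chose option three.')]
print(pick_response(2, responses))
print(pick_response_by_search(5, responses))
```

In either version the error message would be printed once by the caller when None comes back, instead of once per mismatching list entry.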
But the biggest problem is that raw_input returns a string, and there's no way a string is ever going to be equal to a number. For example, try '1' == 1 in your interactive interpreter, and it'll say False. So, what you need to do is convert the user's input into a number so you can compare it. You can do that like this:
try:
    selection = int(selection)
except ValueError:
    print "That's not a number!"
    continue
Seems like this is a job for dictionaries in python. Not sure if your assignment allows this, but here's my code:
def make_selections(response):
    selection = raw_input('-> ')
    print response.get(selection, err_msg)

resp_dict = {
    '1':'You chose option one.',
    '2':'You chose option two.',
    '3':'You chose option three.'
}
err_msg = 'Sorry, you must pick one of these choices: %s'%sorted(resp_dict.keys())
make_selections(resp_dict)
The problem is that you are comparing a string to an integer. Selection is raw input, so it comes in as a str. Convert it to an int and it will evaluate as you expect.
You can check the type of a variable by using type(var). For example, print type(selection) right after you take the input will print <type 'str'>.
def make_selections(response):
    repeat = True
    while repeat == True:
        selection = raw_input('-> ')
        for i, v in enumerate(response):
            i +=1 # adds 1 to the index to make list indices correlate to a regular 1,2,3 style list
            if int(selection) == i:
                print v[1]
            else:
                print "There's an error man, what are you doing?!?!?"

Handling variable length command tuple with try...except

I'm writing a Python 3 script that does tabulation for forestry timber counts.
The workers will radio the species, diameter, and height in logs of each tree they mark to the computer operator. The computer operator will then enter a command such as this:
OAK 14 2
which signifies that the program should increment the count of Oak trees of fourteen inches in diameter and two logs in height.
However, the workers also sometimes call in more than one of the same type of tree at a time. So the program must also be able to handle this command:
OAK 16 1 2
which would signify that we're increasing the count by two.
The way I have the parser set up is thus:
key=cmdtup[0]+"_"+cmdtup[1]+"_"+cmdtup[2]
try:
    trees[key]=int(trees[key])+int(cmdtup[3])
except KeyError:
    trees[key]=int(cmdtup[3])
except IndexError:
    trees[key]=int(trees[key])+1
If the program is commanded to store a tree it hasn't stored before, a KeyError will go off, and the handler will set the dict entry instead of increasing it. If the third parameter is omitted, an IndexError will be raised, and the handler will treat it as if the third parameter was 1.
Issues occur, however, if we're in both situations at once; the program hasn't heard of Oak trees yet, and the operator hasn't specified a count. KeyError goes off, but then generates an IndexError of its own, and Python doesn't like it when exceptions happen in exception handlers.
I suppose the easiest way would be to simply remove one or the other except and have its functionality be done in another way. I'd like to know if there's a more elegant, Pythonic way to do it, though. Is there?
I would do something like this:
def parse(cmd, trees):
    res = cmd.split() # split the string by spaces, yielding a list of strings
    if len(res) == 3: # if we got 3 parameters, set the fourth to 1
        res.append(1)
    for i in range(1,4): # convert parameters 1-3 to integers
        res[i] = int(res[i])
    key = tuple(res[x] for x in range(3)) # convert list to tuple, as lists cannot be dictionary indexes
    trees[key] = trees.get(key,0) + res[3] # increase the number of entries, creating it if needed

trees={}
# test data
parse("OAK 14 2", trees)
parse("OAK 16 1 2", trees)
parse("OAK 14 2", trees)
parse("OAK 14 2", trees)
# print result
for tree in trees:
    print(tree, "=", trees[tree])
yielding
('OAK', 16, 1) = 2
('OAK', 14, 2) = 3
Some notes:
no error handling here, you should handle the case when a value supposed to be a number isn't or the input is wrong in any other way
instead of strings, I use tuples as a dictionary index
You could use collections.Counter, which returns 0 rather than a KeyError if the key isn't in the dictionary.
Counter Documentation:
Counter objects have a dictionary interface except that they return a zero count for missing items instead of raising a KeyError
Something like this:
from collections import Counter
counts = Counter()
def update_counts(counts, cmd):
    cmd_list = cmd.split()
    if len(cmd_list) == 3:
        tree = tuple(cmd_list)
        n = 1
    else:
        # the key must be hashable (a tuple, not a list) and the count an int
        tree = tuple(cmd_list[:3])
        n = int(cmd_list[3])
    counts[tree] += n
Same notes apply as in uselpa's answer. Another nice thing with Counter is that if you want to, e.g., look at weekly counts, you just do something like sum(daily_counts, Counter()).
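A small sketch of combining several days of counts; the sample data is invented for illustration. Summing Counter objects needs an empty Counter as the start value, because the default start value of 0 cannot be added to a Counter:

```python
from collections import Counter

# Hypothetical daily tallies keyed by (species, diameter, height).
monday = Counter({("OAK", "14", "2"): 3})
tuesday = Counter({("OAK", "14", "2"): 1, ("OAK", "16", "1"): 2})
daily_counts = [monday, tuesday]

# Counter() as the start value keeps every intermediate sum a Counter.
weekly = sum(daily_counts, Counter())
print(weekly[("OAK", "14", "2")])  # 4
```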
Counter works even better if you're starting from a list of commands:
from collections import Counter
from itertools import repeat

def parse(commands):
    for c in commands:
        if len(c) == 3:
            yield tuple(c)
        elif len(c) == 4:
            # repeat the (species, diameter, height) key once per counted tree
            yield from repeat(tuple(c[0:3]), times=int(c[3]))

raw_commands = get_commands() # perhaps read a file
command_lists = [c.split() for c in raw_commands]
counts = Counter(parse(command_lists))
From there you can use the update_counts function above to add new trees, or you can start collecting the commands in another text file and then generate a second Counter object for the next day, the next week, etc.
In the end, the best way was to simply remove the IndexError handler, change cmdtup to a list, and insert the following:
if len(cmdtup) >= 3:
    cmdtup.append(1)
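Put together, that final approach might look like the sketch below. The function name record and the sample commands are assumptions for illustration; this version tests len(cmdtup) == 3 so that a command which already carries a count is left untouched:

```python
def record(trees, command):
    """Add one command such as 'OAK 14 2' or 'OAK 16 1 2' to the trees dict."""
    cmdtup = command.split()  # a list, so a default count can be appended
    if len(cmdtup) == 3:
        cmdtup.append(1)  # no count given: treat it as one tree
    key = cmdtup[0] + "_" + cmdtup[1] + "_" + cmdtup[2]
    try:
        trees[key] = int(trees[key]) + int(cmdtup[3])
    except KeyError:
        # First time this species/diameter/height combination is seen.
        trees[key] = int(cmdtup[3])


trees = {}
for command in ["OAK 14 2", "OAK 16 1 2", "OAK 14 2"]:
    record(trees, command)
print(trees)  # {'OAK_14_2': 2, 'OAK_16_1': 2}
```

With the count always present, only the KeyError handler remains, so no exception can be raised from inside another handler.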
