I'm trying to get this script to loop over the contents of a tuple with a for loop (I'm not sure where to put it in my code) and pass each item to a command. For this example I've used find as the command. Which option the user picks (sp1 or sp2) determines how much of the tuple is used.
import sys, subprocess, os, string
cmd = '/bin/find '
tuple = ('apple', 'banana', 'cat', 'dog')
sp1 = tuple[0:1]
sp2 = tuple[2:3]
def find():
    find_cmd = subprocess.Popen(cmd + " * -name {}".format(type),
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE, shell=True)
    output, err = find_cmd.communicate()
    find = output
    find_temp = []
    find_temp = string.split(find)
    if find_temp[0] == ' ':
        print("Found nothing")
    else:
        print("Found {}".format(find_temp))
type_input = input("Are you looking for fruit or animals? ")
if type_input == "fruit":
    type = sp1
elif type_input == "animals":
    type = sp2
else:
    print("syntax error")
    exit()
find()
You're close, but the code doesn't yet do what you're aiming for, and it can be made simpler.
Rather than doing this weird tuple slicing thing, just give them a real name:
import sys, subprocess, os, string
# No need for the trailing space here
FIND = '/bin/find'
fruit = ('apple', 'banana')
animals = ('cat', 'dog')
Alternatively, you could use a dictionary:
find_params = {
    'fruit': ['apple', 'banana'],
    'animals': ['cat', 'dog'],
}
In your comment you mentioned:
my tuple is a bit larger and both variables use some of the same values...
This would keep me from typing many of the same values into two separate lists.
You can still take a nice approach:
cheeses = ('Wensleydale', 'Edam', 'Cheddar', 'Gouda')
spam = ('Spam with eggs', 'Spam on eggs', 'Spam')
confectionaries = ('crunchy frog', 'spring surprise', 'cockroach cluster',
'anthrax ripple', 'cherry fondue')
food = cheeses + spam + confectionaries
Even if you just need a subset, you can still do something like:
food = cheeses + spam[:2] + confectionaries[-1:]
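If the two categories share some values, one way to avoid retyping them is to keep the shared values in their own tuple and build each category by concatenation. This is a minimal sketch; the names `shared`, `fruit_only`, and `animals_only`, and the value 'kiwi', are made up for illustration and do not come from the question:

```python
# Hypothetical values: 'kiwi' stands in for a name that belongs to both groups.
shared = ('kiwi',)
fruit_only = ('apple', 'banana')
animals_only = ('cat', 'dog')

# Each category is built by concatenation, so shared values are typed once.
find_params = {
    'fruit': fruit_only + shared,
    'animals': animals_only + shared,
}

for category, names in find_params.items():
    print(category, names)
```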
You should take parameter(s) for your find command instead. Also, no need to concatenate and then use a format string. Just use a format string for all the things:
def find(what, cmd=FIND):
    # `what` is a tuple or list of names; run one find per name
    for name in what:
        find_cmd = subprocess.Popen(
            '{cmd} * -name {name}'.format(cmd=cmd, name=name),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE, shell=True)
        output, err = find_cmd.communicate()
        # communicate() returns bytes on Python 3; decode before splitting
        find_temp = output.decode().split()
        if not find_temp:
            print("Found nothing")
        else:
            print("Found {}".format(find_temp))
Now you can either use the variables, or what they asked for:
type_input = input("Are you looking for fruit or animals? ")
try:
    find(find_params[type_input])
except KeyError:
    sys.exit('Unknown parameter {!r}'.format(type_input))
# Or use the variables
if type_input == "fruit":
    find(fruit)
elif type_input == "animals":
    find(animals)
else:
    sys.exit('Unknown parameter {!r}'.format(type_input))
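As a sketch of the loop the question asks about, the helper below runs find once per name and passes the arguments as a list instead of a shell string, which sidesteps quoting problems. `build_find_args` and `find_each` are hypothetical names, and searching from `.` rather than `*` is an assumption, not something the question specifies:

```python
import subprocess

FIND = '/bin/find'


def build_find_args(name, root='.', cmd=FIND):
    """Return the argument list for a single `find <root> -name <name>` call."""
    return [cmd, root, '-name', name]


def find_each(names, root='.', cmd=FIND):
    """Run find once per name and return {name: [matching paths]}."""
    results = {}
    for name in names:
        completed = subprocess.run(
            build_find_args(name, root, cmd),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # decode output to str
        )
        results[name] = completed.stdout.splitlines()
    return results
```

Calling `find_each(find_params['fruit'])` would then give one list of paths per name, and an empty list plays the role of "Found nothing".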
Last time I got some help making a website name generator. I feel bad, but I'm stuck at the moment and need some help again to improve it. In my code there's a .txt file called combined which included these lines.
After that I created variables to add to the domain:
web = 'web'
suffix = 'co.id'
And then I write it out so that each output line is written to Combined.txt:
output_count = 50
subdomain_count = 2
for i in range(output_count):
    out = []
    for j in range(subdomain_count):
        out.append(random.choice(Test))
    out.append(web)
    out.append(suffix)
    Example.write('.'.join(out)+"\n")
with open("dictionaries/examples.txt") as f:
    websamples = [line.rstrip() for line in f]
I want more variety in the output: instead of just login.download.web.co.id there would also be names like login-download.web.co.id or login.download-web.co.id. In the code I used Example.write('.'.join(out)+"\n") so that . is the separator between the words. I was thinking of adding more by writing a similar line of code and saving to a different .txt file, but I feel like that would be too long. Is there a way to vary the separator between words, using - or _ instead of just a . in the output?
Thanks!
Sure, just iterate through a list of delimiters and write one address per delimiter.
web = 'web'
suffix = 'co.id'
output_count = 50
subdomain_count = 2
delimeters = ['_', '-', '.']
for i in range(output_count):
    out = []
    for j in range(subdomain_count):
        out.append(random.choice(Test))
    for delimeter in delimeters:
        addr = delimeter.join(out)
        addrs = '.'.join([addr, web, suffix])
        print(addrs)
        Example.write(addrs + '\n')
output
my_pay.web.co.id
my-pay.web.co.id
my.pay.web.co.id
pay_download.web.co.id
pay-download.web.co.id
pay.download.web.co.id
group_login.web.co.id
group-login.web.co.id
group.login.web.co.id
install_group.web.co.id
install-group.web.co.id
install.group.web.co.id
...
...
update
import itertools
Test = ['download', 'login', 'my', 'ip', 'site', 'ssl', 'pay', 'install']
delimeters = [ '-', '.']
web = 'web'
suffix = 'co.id'
output_count = 50
subdomain_count = 2
for combo in itertools.combinations(Test, 2):
    out = ''
    for i, d in enumerate(delimeters):
        out = d.join(combo)
        out = delimeters[i-1].join([out, web])
        addr = '.'.join([out, suffix])
        print(addr)
        # Example.write(addr+'\n')
output
download-login.web.co.id
download.login-web.co.id
download-my.web.co.id
download.my-web.co.id
download-ip.web.co.id
download.ip-web.co.id
download-site.web.co.id
download.site-web.co.id
download-ssl.web.co.id
download.ssl-web.co.id
download-pay.web.co.id
download.pay-web.co.id
download-install.web.co.id
download.install-web.co.id
login-my.web.co.id
login.my-web.co.id
login-ip.web.co.id
login.ip-web.co.id
login-site.web.co.id
login.site-web.co.id
login-ssl.web.co.id
login.ssl-web.co.id
login-pay.web.co.id
login.pay-web.co.id
login-install.web.co.id
login.install-web.co.id
my-ip.web.co.id
my.ip-web.co.id
my-site.web.co.id
my.site-web.co.id
my-ssl.web.co.id
my.ssl-web.co.id
my-pay.web.co.id
my.pay-web.co.id
my-install.web.co.id
my.install-web.co.id
ip-site.web.co.id
ip.site-web.co.id
ip-ssl.web.co.id
ip.ssl-web.co.id
ip-pay.web.co.id
ip.pay-web.co.id
ip-install.web.co.id
ip.install-web.co.id
site-ssl.web.co.id
site.ssl-web.co.id
site-pay.web.co.id
site.pay-web.co.id
site-install.web.co.id
site.install-web.co.id
ssl-pay.web.co.id
ssl.pay-web.co.id
ssl-install.web.co.id
ssl.install-web.co.id
pay-install.web.co.id
pay.install-web.co.id
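If every combination of delimiters is wanted rather than the two alternating patterns shown above, `itertools.product` can pick a delimiter for each gap independently. This is only a sketch: the function name `variants` is hypothetical, and it assumes the suffix is always attached with a dot.

```python
import itertools

web = 'web'
suffix = 'co.id'
delimiters = ('-', '.')


def variants(words, delimiters=delimiters):
    """Yield every way of choosing a delimiter for each gap before the suffix."""
    parts = list(words) + [web]
    gaps = len(parts) - 1
    for seps in itertools.product(delimiters, repeat=gaps):
        name = parts[0]
        for sep, part in zip(seps, parts[1:]):
            name += sep + part
        yield name + '.' + suffix


for addr in variants(('login', 'download')):
    print(addr)
```

With two delimiters and two gaps this yields four addresses per word pair, so the output grows quickly as delimiters are added.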
As an alternative to replacing the final output, you could make the separator random:
import random
seperators = ['-', '_', '.']
Example.write(random.choice(seperators).join(out)+"\n")
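Note that the line above draws one separator and uses it for every gap in a given name. If each gap should get its own random separator, something along these lines might work; `random_join` is a hypothetical helper name, not part of the original code:

```python
import random


def random_join(parts, separators=('-', '_', '.'), rng=random):
    """Join parts, drawing a separator independently for every gap."""
    pieces = [parts[0]]
    for part in parts[1:]:
        pieces.append(rng.choice(separators))
        pieces.append(part)
    return ''.join(pieces)


print(random_join(['login', 'download', 'web']) + '.co.id')
```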
In order to ensure compliance with RFC 1035 I would suggest:
from random import choices as CHOICES, choice as CHOICE
output_count = 50
subdomain_count = 2
web = 'web'
suffix = 'co.id'
dotdash = '.-'
filename = 'output.txt'
Test = [
    'auth',
    'access',
    'account',
    'admin'
    # etc
]
with open(filename, 'w') as output:
    for _ in range(output_count):
        sd = CHOICE(dotdash).join(CHOICES(Test, k=subdomain_count))
        print('.'.join((sd, web, suffix)), file=output)
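To check generated names against the label rules, a small validator can help. The sketch below encodes the preferred name syntax from RFC 1035 (each label starts with a letter, ends with a letter or digit, contains only letters, digits, or hyphens in between, and is at most 63 characters). The function name `is_valid_hostname` and the 253-character overall limit are assumptions made for illustration:

```python
import re

# One label in RFC 1035's preferred syntax: starts with a letter, ends with a
# letter or digit, and has only letters, digits, or hyphens in between.
LABEL_RE = re.compile(r'[A-Za-z]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?')


def is_valid_hostname(name):
    """Check a dotted name against the label rules above."""
    if len(name) > 253:  # assumption: the usual limit for the text form
        return False
    return all(LABEL_RE.fullmatch(label) for label in name.split('.'))


print(is_valid_hostname('login-download.web.co.id'))
print(is_valid_hostname('my_pay.web.co.id'))
```

This is why the snippet above only draws from '.' and '-': an underscore fails the label check.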
Below is my example code:
from fuzzywuzzy import fuzz
import json
from itertools import zip_longest
synonyms = open("synonyms.json","r")
synonyms = json.loads(synonyms.read())
vendor_data = ["i7 processor","solid state","Corei5 :1135G7 (11th Generation)","hard drive",
               "ddr 8gb","something1", "something2",
               "something3","HT (100W) DDR4-2400"]
buyer_data = ["i7 processor 12 generation","corei7:latest technology"]
vendor = []
buyer = []
for item,value in synonyms.items():
    for k,k2 in zip_longest(vendor_data,buyer_data):
        for v in value:
            if fuzz.token_set_ratio(k,v) > 70:
                if item in k:
                    vendor.append(k)
                else:
                    vendor.append(item+" "+k)
            else:
                #didnt get only "something" strings here !
                if fuzz.token_set_ratio(k2,v) > 70:
                    if item in k2:
                        buyer.append(k2)
                    else:
                        buyer.append(item+" "+k2)
vendor = list(set(vendor))
buyer = list(set(buyer))
vendor,buyer
Note: a "something" string can be anything, like "battery" or "display", etc.
synonyms json
{
    "processor":["corei5","core","corei7","i5","i7","ryzen5","i5 processor","i7 processor","processor i5","processor i7","core generation","core gen"],
    "ram":["DDR4","memory","DDR3","DDR","DDR 8gb","DDR 8 gb","DDR 16gb","DDR 16 gb","DDR 32gb","DDR 32 gb","DDR4-"],
    "ssd":["solid state drive","solid drive"],
    "hdd":["Hard Drive"]
}
What do I need?
I want to add all "something" strings to the vendor list dynamically.
NOTE: a "something" string can be anything in future.
I want to add to the vendor list every string that does not match any synonym with a fuzz score above 70. Basically, I want to keep the left-out data too.
For example, like below:
current output
['processor Corei5 :1135G7 (11th Generation)',
'i7 processor',
'ram HT (100W) DDR4-2400',
'ram ddr 8gb',
'hdd hard drive',
'ssd solid state']
expected output below
['processor Corei5 :1135G7 (11th Generation)',
'i7 processor',
'ram HT (100W) DDR4-2400',
'ram ddr 8gb',
'hdd hard drive',
'ssd solid state',
'something1',
'something2',
'something3'] #something string need to be added in vendor list dynamically.
What silly mistake am I making? Thank you.
Here's my attempt:
from fuzzywuzzy import process, fuzz
synonyms = {'processor': ['corei5', 'core', 'corei7', 'i5', 'i7', 'ryzen5', 'i5 processor', 'i7 processor', 'processor i5', 'processor i7', 'core generation', 'core gen'], 'ram': ['DDR4', 'memory', 'DDR3', 'DDR', 'DDR 8gb', 'DDR 8 gb', 'DDR 16gb', 'DDR 16 gb', 'DDR 32gb', 'DDR 32 gb', 'DDR4-'], 'ssd': ['solid state drive', 'solid drive'], 'hdd': ['Hard Drive']}
vendor_data = ['i7 processor', 'solid state', 'Corei5 :1135G7 (11th Generation)', 'hard drive', 'ddr 8gb', 'something1', 'something2', 'something3', 'HT (100W) DDR4-2400']
buyer_data = ['i7 processor 12 generation', 'corei7:latest technology']
def find_synonym(s: str, min_score: int = 60):
    results = process.extractBests(s, choices=synonyms, score_cutoff=min_score)
    if not results:
        return None
    return results[0][-1]
def process_data(l: list, min_score: int = 60):
    matches = []
    no_matches = []
    for item in l:
        syn = find_synonym(item, min_score=min_score)
        if syn is not None:
            new_item = f'{syn} {item}' if syn not in item else item
            matches.append(new_item)
        elif any(fuzz.partial_ratio(s, item) >= min_score for s in synonyms.keys()):
            # one of the synonyms is already in the item string
            matches.append(item)
        else:
            no_matches.append(item)
    return matches, no_matches
For process_data(vendor_data) we get:
(['i7 processor',
'ssd solid state',
'processor Corei5 :1135G7 (11th Generation)',
'hdd hard drive',
'ram ddr 8gb',
'ram HT (100W) DDR4-2400'],
['something1', 'something2', 'something3'])
And for process_data(buyer_data):
(['i7 processor 12 generation', 'processor corei7:latest technology'], [])
I had to lower the cut-off score to 60 to also get results for ddr 8gb. The process_data function returns 2 lists: One with matches with words from the synonyms dict and one with items without matches. If you want exactly the output you listed in your question, just concatenate the two lists like this:
matches, no_matches = process_data(vendor_data)
matches + no_matches # ['i7 processor', 'ssd solid state', 'processor Corei5 :1135G7 (11th Generation)', 'hdd hard drive', 'ram ddr 8gb', 'ram HT (100W) DDR4-2400', 'something1', 'something2', 'something3']
I have tried to come up with a decent answer (certainly not the cleanest one)
import json
from itertools import zip_longest
from fuzzywuzzy import fuzz
synonyms = open("synonyms.json", "r")
synonyms = json.loads(synonyms.read())
vendor_data = ["i7 processor", "solid state", "Corei5 :1135G7 (11thGeneration)", "hard drive", "ddr 8gb", "something1",
"something2",
"something3", "HT (100W) DDR4-2400"]
buyer_data = ["i7 processor 12 generation", "corei7:latest technology"]
vendor = []
buyer = []
for k, k2 in zip_longest(vendor_data, buyer_data):
    has_matched = False
    for item, value in synonyms.items():
        for v in value:
            if fuzz.token_set_ratio(k, v) > 70:
                if item in k:
                    vendor.append(k)
                else:
                    vendor.append(item + " " + k)
                if has_matched or k2 is None:
                    break
                else:
                    has_matched = True
            if fuzz.token_set_ratio(k2, v) > 70:
                if item in k2:
                    buyer.append(k2)
                else:
                    buyer.append(item + " " + k2)
                if has_matched or k is None:
                    break
                else:
                    has_matched = True
        else:
            continue  # match not found
        break  # match is found
    else:  # only evaluates on normal loop end
        # Only something strings
        # do something with the new input values
        continue
vendor = list(set(vendor))
buyer = list(set(buyer))
I hope you can achieve what you want with this code. Check the docs if you don't know what a for-else loop does. TL;DR: the else clause executes when the loop terminates normally (not with a break). Note that I put the synonyms loop inside the data loop. This is because we can't know in advance which synonym group an entry belongs to; also, sometimes the vendor data entry is a processor while the buyer data entry is memory. Also note that I have assumed an item can't match more than once. If it can, you would need a more advanced check (for example, keep a counter and break when it reaches 2).
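For readers unfamiliar with for-else, here is a minimal standalone illustration of the behaviour described above. The function name `first_match` and the sample data are made up for the example:

```python
def first_match(items, predicate):
    """Return the first item satisfying predicate, or None if there is none."""
    for item in items:
        if predicate(item):
            found = item
            break          # leaving via break skips the else clause
    else:
        found = None       # runs only when the loop finished without break
    return found


print(first_match(['ssd', 'ram', 'hdd'], lambda s: s.startswith('r')))
print(first_match(['ssd', 'hdd'], lambda s: s.startswith('r')))
```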
EDIT:
I took another look at the question and came up with maybe a better answer:
v_dict = dict()
for spec in vendor_data[:]:
    for item, choices in synonyms.items():
        if process.extractOne(spec, choices)[1] > 70:  # don't forget to import process from fuzzywuzzy
            v_dict[spec] = item
            break
    else:
        v_dict[spec] = "Something new"
This code matches the strings to the correct type. For example: {'i7 processor': 'processor', 'solid state': 'ssd', 'Corei5 :1135G7 (11thGeneration)': 'processor', 'hard drive': 'ssd', 'ddr 8gb': 'ram', 'something1': 'Something new', 'something2': 'Something new', 'something3': 'Something new', 'HT (100W) DDR4-2400': 'ram'}. You can change "Something new" to whatever you like. You could also do: v_dict[spec] = 0 (on a match) and v_dict[spec] = 1 (on no match). You could then sort the dict ->
it = iter(v_dict.values())
print(sorted(v_dict.keys(), key=lambda x: next(it)))
Which would give the wanted results (more or less), all the recognised items will be first, and then all the unrecognised items. You could do some more advanced sorting on this dict if you want. I think this code gives you enough flexibility to reach your goal.
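The iterator trick above depends on the sort key being called once per key in dictionary order. A simpler way to get the same ordering, sketched here with made-up 0/1 flags, is to use the dictionary's own lookup as the key:

```python
# Hypothetical flags: 0 for a recognised item, 1 for an unrecognised one.
v_dict = {
    'i7 processor': 0,
    'something1': 1,
    'ddr 8gb': 0,
    'something2': 1,
}

# sorted() is stable, so items with the same flag keep their original order.
ordered = sorted(v_dict, key=v_dict.get)
print(ordered)
```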
If I understand correctly, what you are trying to do is match keywords specified by a customer and/or vendor against a predefined database of keywords you have.
First, I would highly recommend using a reversed mapping of the synonyms, so it's faster to lookup, especially when the dataset will grow.
Second, considering the fuzzywuzzy API, it looks like you simply want the best match, so extractOne is a solid choice for that.
Now, extractOne returns the best match and a score:
>>> process.extractOne("cowboys", choices)
("Dallas Cowboys", 90)
I would split the algorithm into two:
A generic part that simply gets the best match, which should always exist (even if it's not a great one)
A filter, where you could adjust the sensitivity of the algorithm, based on different criteria of your application. This sensitivity threshold should set the minimal match quality. If you're below this threshold, just use "untagged" for the category for example.
Here is the final code, which I think is very simple and easy to understand and expand:
import json
from fuzzywuzzy import process
def load_synonyms():
    with open('synonyms.json') as fin:
        synonyms = json.load(fin)
    # Reversing the map makes it much easier to lookup
    reversed_synonyms = {}
    for key, values in synonyms.items():
        for value in values:
            reversed_synonyms[value] = key
    return reversed_synonyms
def load_vendor_data():
    return [
        "i7 processor",
        "solid state",
        "Corei5 :1135G7 (11thGeneration)",
        "hard drive",
        "ddr 8gb",
        "something1",
        "something2",
        "something3",
        "HT (100W) DDR4-2400"
    ]
def load_customer_data():
    return [
        "i7 processor 12 generation",
        "corei7:latest technology"
    ]
def get_tag(keyword, synonyms):
    THRESHOLD = 80
    DEFAULT = 'general'
    tag, score = process.extractOne(keyword, synonyms.keys())
    return synonyms[tag] if score > THRESHOLD else DEFAULT
def main():
    synonyms = load_synonyms()
    customer_data = load_customer_data()
    vendor_data = load_vendor_data()
    data = customer_data + vendor_data
    tags_dict = { keyword: get_tag(keyword, synonyms) for keyword in data }
    print(json.dumps(tags_dict, indent=4))
if __name__ == '__main__':
    main()
When running with the specified inputs, the output is:
{
"i7 processor 12 generation": "processor",
"corei7:latest technology": "processor",
"i7 processor": "processor",
"solid state": "ssd",
"Corei5 :1135G7 (11thGeneration)": "processor",
"hard drive": "hdd",
"ddr 8gb": "ram",
"something1": "general",
"something2": "general",
"something3": "general",
"HT (100W) DDR4-2400": "ram"
}
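The same two-step idea (take the best match, then filter by a threshold) can be tried without fuzzywuzzy by swapping in the standard library's `difflib`. This is a rough sketch, not a drop-in replacement: difflib scores differently from fuzzywuzzy, the synonym list is a lower-cased subset of the one in the question, and the 0.8 cutoff and 'untagged' default are assumptions:

```python
import difflib

# Subset of the synonyms from the question, lower-cased for comparison.
SYNONYMS = {
    'processor': ['corei5', 'corei7', 'i5 processor', 'i7 processor'],
    'ram': ['ddr4', 'memory', 'ddr 8gb'],
    'ssd': ['solid state drive', 'solid drive'],
    'hdd': ['hard drive'],
}

# Reverse the mapping once: synonym -> category.
REVERSED = {syn: tag for tag, syns in SYNONYMS.items() for syn in syns}


def get_tag(keyword, cutoff=0.8, default='untagged'):
    """Return the category of the closest synonym, or default below cutoff."""
    matches = difflib.get_close_matches(
        keyword.lower(), list(REVERSED), n=1, cutoff=cutoff)
    return REVERSED[matches[0]] if matches else default


for keyword in ('hard drive', 'DDR 8gb', 'something1'):
    print(keyword, '->', get_tag(keyword))
```

Because difflib compares whole strings, partial matches such as "HT (100W) DDR4-2400" would need a lower cutoff or token-level comparison, which is where fuzzywuzzy's token-based scorers help.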
text="Brand.*/Smart Planet.#/Color.*/Yellow.#/Type.*/Sandwich Maker.#/Power Source.*/Electrical."
I have this kind of string and I am having trouble splitting it into 2 lists. The output should look approximately like this:
name = ['Brand','Color','Type','Power Source']
value = ['Smart Planet','Yellow','Sandwich Maker','Electrical']
Is there any solution for this?
name = []
value = []
text = text.split('.#/')
for i in text:
    i = i.split('.*/')
    name.append(i[0])
    value.append(i[1])
This is one approach using re.split and list slicing.
Ex:
import re
text="Brand.*/Smart Planet.#/Color.*/Yellow.#/Type.*/Sandwich Maker.#/Power Source.*/Electrical."
data = [i for i in re.split(r"[^A-Za-z\s]+", text) if i]
name = data[::2]
value = data[1::2]
print(name)
print(value)
Output:
['Brand', 'Color', 'Type', 'Power Source']
['Smart Planet', 'Yellow', 'Sandwich Maker', 'Electrical']
You can use regex to split the text, and populate the lists in a loop.
The two-name unpacking raises a ValueError on any piece that does not contain exactly one .*/ separator, so malformed input is not silently misparsed.
import re
name, value = [], []
for ele in re.split(r'\.#\/', text):
    k, v = ele.split('.*/')
    name.append(k)
    value.append(v)
>>> print(name, value)
['Brand', 'Color', 'Type', 'Power Source'] ['Smart Planet', 'Yellow', 'Sandwich Maker', 'Electrical.']
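The output above still carries the trailing '.' on the last value. A plain-string variant that strips it and reports malformed pieces explicitly could look like the sketch below; `parse_pairs` is a hypothetical helper name, and it assumes the string ends with a single terminating '.':

```python
def parse_pairs(text, pair_sep='.#/', kv_sep='.*/'):
    """Split 'name.*/value.#/name.*/value.' into two parallel lists."""
    names, values = [], []
    # Drop the single trailing '.' that terminates the whole string.
    body = text[:-1] if text.endswith('.') else text
    for chunk in body.split(pair_sep):
        key, sep, val = chunk.partition(kv_sep)
        if not sep:
            raise ValueError('missing {!r} in {!r}'.format(kv_sep, chunk))
        names.append(key)
        values.append(val)
    return names, values


text = ("Brand.*/Smart Planet.#/Color.*/Yellow.#/Type.*/Sandwich Maker"
        ".#/Power Source.*/Electrical.")
name, value = parse_pairs(text)
print(name)
print(value)
```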
text="Brand.*/Smart Planet.#/Color.*/Yellow.#/Type.*/Sandwich Maker.#/Power Source.*/Electrical."
name=[]
value=[]
word=''
for i in range(len(text)):
    temp=i
    if text[i]!='.' and text[i]!='/' and text[i]!='*' and text[i]!='#':
        word=word+''.join(text[i])
    elif temp+1<len(text) and temp+2<=len(text):
        if text[i]=='.' and text[temp+1]=='*' and text[temp+2]=='/':
            name.append(word)
            word=''
        elif text[i]=='.' and text[temp+1]=='#' and text[temp+2]=='/':
            value.append(word)
            word=''
    else:
        value.append(word)
print(name)
print(value)
This will work.
I know that you can use split() to split a user input into two, but how would you split input that consists of multiple variables ? For example:
User input:
Shawn=14:soccer#2991842
What I would like to do:
name = Shawn
age = 14
course = soccer
idnumber = 2991842
What's the best way to do such thing ?
import re
user_input = 'Shawn=14:soccer#2991842'
keys = ['name', 'age', 'course', 'idnumber']
values = re.split('[=:#]', user_input)
print(dict(zip(keys, values)))
Out[114]: {'age': '14', 'course': 'soccer', 'idnumber': '2991842', 'name': 'Shawn'}
I think Regex will work best here:
>>> from re import split
>>> mystr = "Shawn=14:soccer#2991842"
>>> split(r"\W", mystr)
['Shawn', '14', 'soccer', '2991842']
>>> lst = split(r"\W", mystr)
>>> name = lst[0]
>>> name
'Shawn'
>>> age = lst[1]
>>> age
'14'
>>> course = lst[2]
>>> course
'soccer'
>>> idnumber = lst[3]
>>> idnumber
'2991842'
>>>
The above is a step-by-step demonstration. You can actually just do:
name, age, course, idnumber = split(r"\W", mystr)
Here's how I would do it.
def splitStr(text):
    temp = text.split(':')
    temp_nameAge = temp[0].split('=')
    temp_courseId = temp[1].split('#')
    name = temp_nameAge[0]
    age = int(temp_nameAge[1])
    course = temp_courseId[0]
    idnumber = int(temp_courseId[1])
    print('Name = %s, age = %i, course = %s, id_number = %i' % (name, age, course, idnumber))
Another thing you can do is use split like: string.split(":").
Then you can change the format to "name:age:course:number"
You could just keep splitting the splits...
text2split = "Shawn=14:soccer#2991842"
name = text2split.split('=')[0]
age = text2split.split('=')[1].split(':')[0]
course = text2split.split('=')[1].split(':')[1].split('#')[0]
idnumber = text2split.split('=')[1].split(':')[1].split('#')[1]
This isn't the most elegant way to do it, but it'll work so long as text2split always has the same delimiters.
If you are ok with storing them under dictionary keys, you could use named group references
import re
x='shawn=14:soccer#2991842'
re.match(r'(?P<name>.*?)=(?P<age>.*):(?P<course>.*?)#(?P<idnumber>.*)', x).groupdict()
{'idnumber': '2991842', 'course': 'soccer', 'age': '14', 'name': 'shawn'}
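Building on the named-group idea, the sketch below also converts the numeric fields and rejects input that does not fit the pattern. `RECORD_RE` and `parse_record` are hypothetical names, and the assumption that age and idnumber are always digits comes from the single example in the question:

```python
import re

# Assumes the layout name=age:course#idnumber with numeric age and idnumber.
RECORD_RE = re.compile(
    r'(?P<name>[^=]+)=(?P<age>\d+):(?P<course>[^#]+)#(?P<idnumber>\d+)')


def parse_record(line):
    """Parse one record into a dict, converting the numeric fields to int."""
    match = RECORD_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError('unexpected format: {!r}'.format(line))
    record = match.groupdict()
    record['age'] = int(record['age'])
    record['idnumber'] = int(record['idnumber'])
    return record


print(parse_record('Shawn=14:soccer#2991842'))
```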
I have some problems with writing some Gerrit http://code.google.com/p/gerrit/ hooks.
http://gerrit.googlecode.com/svn/documentation/2.2.0/config-hooks.html
If I parse the command line for
patchset-created --change --change-url --project --branch --uploader --commit --patchset
def main():
    if (len(sys.argv) < 2):
        showUsage()
        exit()
    if (sys.argv[1] == 'update-projects'):
        updateProjects()
        exit()
    need = ['action=', 'change=', 'change-url=', 'commit=', 'project=', 'branch=', 'uploader=',
            'patchset=', 'abandoner=', 'reason=', 'submitter=', 'comment=', 'CRVW=', 'VRIF=' , 'patchset=' , 'restorer=', 'author=']
    print sys.argv[1:]
    print '-----'
    optlist, args = getopt.getopt(sys.argv[1:], '', need)
    id = url = hash = who = comment = reason = codeReview = verified = restorer = ''
    print optlist
    for o, a in optlist:
        if o == '--change': id = a
        elif o == '--change-url': url = a
        elif o == '--commit': hash = a
        elif o == '--action': what = a
        elif o == '--uploader': who = a
        elif o == '--submitter': who = a
        elif o == '--abandoner': who = a
        elif o == '--author' : who = a
        elif o == '--branch': branch = a
        elif o == '--comment': comment = a
        elif o == '--CRVW' : codeReview = a
        elif o == '--VRIF' : verified = a
        elif o == '--patchset' : patchset = a
        elif o == '--restorer' : who = a
        elif o == '--reason' : reason = a
Command line input:
--change I87f7802d438d5640779daa9ac8196aeb3eec8c2a
--change-url http://<hostname>:8080/308
--project private/bar
--branch master
--uploader xxxxxxx-xxxxx xxxxxxx (xxxxxxxxxxxxx.xxxxxxx#xxx-xxxx.xx)
--commit 49aae9befaf27a5fede51b498f0660199f47b899 --patchset 1
print sys.argv[1:]
['--action', 'new',
'--change','I87f7802d438d5640779daa9ac8196aeb3eec8c2a',
'--change-url',
'http://<hostname>:8080/308',
'--project', 'private/bar',
'--branch', 'master',
'--uploader', 'xxxxxxx-xxxxx', 'xxxxxxx', '(xxxxxxxxxxxxx.xxxxxxx#xxx-xxxx.xx)',
'--commit', '49aae9befaf27a5fede51b498f0660199f47b899',
'--patchset', '1']
print optlist
[('--action', 'new'),
('--change', 'I87f7802d438d5640779daa9ac8196aeb3eec8c2a'),
('--change-url', 'http://<hostname>:8080/308'),
('--project', 'private/bar'),
('--branch', 'master'),
('--uploader', 'xxxxxxx-xxxxx')]
I don't know why the script generates
'--uploader', 'xxxxxxx-xxxxx', 'xxxxxxx', '(xxxxxxxxxxxxx.xxxxxxx#xxx-xxxx.xx)'
and not
'--uploader', 'xxxxxxx-xxxxx xxxxxxx (xxxxxxxxxxxxx.xxxxxxx#xxx-xxxx.xx)'
because as a result the script doesn't parse --commit --patchset ...
When I parse comment-added, everything works:
Command line input:
-change I87f7802d438d5640779daa9ac8196aeb3eec8c2a
--change-url http://<hostname>.intra:8080/308
--project private/bar
--branch master
--author xxxxxxx-xxxxx xxxxxxx (xxxxxxxxxxxxx.xxxxxxx#xxx-xxxx.xx)
--commit 49aae9befaf27a5fede51b498f0660199f47b899
--comment asdf
--CRVW 0
--VRIF 0
print sys.argv[1:]
['--action', 'comment',
'--change', 'I87f7802d438d5640779daa9ac8196aeb3eec8c2a',
'--change-url',
'http://<hostname>:8080/308',
'--project', 'private/bar',
'--branch', 'master',
'--author', 'xxxxxxx-xxxxx xxxxxxx (xxxxxxxxxxxxx.xxxxxxx#xxx-xxxx.xx)', <<< That's right!
'--commit', '49aae9befaf27a5fede51b498f0660199f47b899',
'--comment', 'asdf',
'--CRVW', '0',
'--VRIF', '0']
As the options names and values are space-separated, you have to put the values in quotes if they contain spaces themselves.
If you write --uploader xxxxxxx-xxxxx xxxxxxx (xxxxxxxxxxxxx.xxxxxxx#xxx-xxxx.xx), the last two strings will actually end up in args from the line
optlist, args = getopt.getopt(sys.argv[1:], '', need)
as they are not associated with --uploader
You should quote an argument, if it contains spaces, like for all commandline tools:
--uploader "xxxxxxx-xxxxx xxxxxxx (xxxxxxxxxxxxx.xxxxxxx#xxx-xxxx.xx)"
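To see the effect of the quotes without involving Gerrit, `shlex.split` can be used to mimic shell-style word splitting. The uploader value below is a made-up placeholder in the same shape as the one in the question:

```python
import shlex

# Hypothetical command lines; only the quoting differs.
unquoted = '--uploader Jane Doe (jane#example.org) --patchset 1'
quoted = '--uploader "Jane Doe (jane#example.org)" --patchset 1'

unquoted_argv = shlex.split(unquoted)
quoted_argv = shlex.split(quoted)

print(unquoted_argv)  # the uploader is spread over three separate arguments
print(quoted_argv)    # the uploader arrives as one argument
```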
You may also consider using gnu_getopt() as it would allow you to mix option and non-option arguments.
From the Documentation
The getopt() function stops processing options as soon as a non-option argument is encountered
If you use gnu_getopt, the remaining options, namely commit and patchset, will still be parsed correctly even though the uploader argument is missing its quotes.
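The difference can be seen side by side in a short sketch. The argument values are placeholders, and the behaviour shown for `gnu_getopt` assumes the POSIXLY_CORRECT environment variable is not set:

```python
import getopt

# Placeholder arguments: the uploader name was split into two words.
argv = ['--uploader', 'Jane', 'Doe', '--commit', '49aae9b', '--patchset', '1']
longopts = ['uploader=', 'commit=', 'patchset=']

# getopt() stops at the first non-option argument ('Doe').
opts, rest = getopt.getopt(argv, '', longopts)
print(opts, rest)

# gnu_getopt() keeps scanning, so --commit and --patchset are still parsed.
gnu_opts, gnu_rest = getopt.gnu_getopt(argv, '', longopts)
print(gnu_opts, gnu_rest)
```

Note that in both cases --uploader only receives the first word, so quoting the value is still needed to get the full name.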