I have some trouble understanding how the format() method of strings works.
Suppose that I set a string variable with keyword arguments:
s = '{hello} {person_name}'
I could either assign this value to another variable or print it. In the latter case, the result would be {hello} {person_name}.
I could also use the format() method while printing s and assign some values to the keywords:
print(s.format(hello='hello', person_name='Alice'))
In this case, the result is hello Alice. Of course, I could also assign it to a new variable.
My problem arises when I want to use format only on one keyword:
print(s.format(hello='hello'))
or
a = s.format(hello='hello')
Both of them throw an error:
KeyError: 'person_name'
I want to be able to run something like:
s = '{hello} {person_name}'
a = s.format(hello='hello')
if something:
    b = a.format(person_name='Alice')
else:
    b = a.format(person_name='Bob')
print(b)
Is something like this possible or should I set all keywords when I use format()?
In your use case, you might consider escaping the {person_name} in the string:
# double brace the person_name to escape it for the first format
s = '{hello} {{person_name}}'
a = s.format(hello='hello')
# a = 'hello {person_name}'
if something:
    b = a.format(person_name='Alice')
    # b = 'hello Alice'
else:
    b = a.format(person_name='Bob')
    # b = 'hello Bob'
print(b)
With this method, however, you must fill in the variables in the order in which you escaped them, i.e. you must assign hello first and then person_name. If you need to be flexible about the order, I would suggest collecting the variables in a dict and passing them all at once:
# dict approach
s = '{hello} {person_name}'
# determine the first variable
d = {'hello': 'hello'}
# ... do something
d.update({'person_name': 'Alice'})
# unpack the dictionary as kwargs into your format method
b = s.format(**d)
# b = 'hello Alice'
This gives you a bit more flexibility on the order of things. But you must call .format() only once every variable is present in your dict (each needs at least a default value); otherwise it will still raise a KeyError.
If you want to be fancier and print the field names themselves when a variable is absent, you can write your own wrapper function as well:
# wrapper approach
# We'll make use of regex to keep things simple and versatile
import re

def my_format(message, **kwargs):
    # build a regex pattern to catch words+digits within the braces {}
    pat = re.compile(r'{[\w\d]+}')
    # build a dictionary based on the identified variables within the message provided
    msg_args = {v.strip('{}'): v for v in pat.findall(message)}
    # update the dictionary with provided keyword args
    msg_args.update(kwargs)
    # ... and of course, print it
    print(message.format(**msg_args))

s = 'Why {hello} there {person}'
my_format(s, hello='hey')
# Why hey there {person}
my_format(s, person='Alice')
# Why {hello} there Alice
You can choose the default display (used in the absence of a variable) by modifying the v in the dictionary comprehension.
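As a rough sketch of that last point, the variant below takes the placeholder template as a parameter and returns the string instead of printing it. The default parameter and the '<{}>' template are my own assumptions for illustration, not part of the original wrapper.

```python
import re

def my_format(message, default='<{}>', **kwargs):
    # capture only the field name inside the braces
    pat = re.compile(r'{(\w+)}')
    # every field starts out as its placeholder, e.g. 'person' -> '<person>'
    msg_args = {name: default.format(name) for name in pat.findall(message)}
    # values that were actually provided override the placeholders
    msg_args.update(kwargs)
    return message.format(**msg_args)

s = 'Why {hello} there {person}'
print(my_format(s, hello='hey'))
# Why hey there <person>
```

Returning the string rather than printing it also lets you assign the result to a variable.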
As per PEP 3101, if the index or keyword refers to an item that does not exist, then an IndexError/KeyError should be raised.
But you can create your own custom formatter class like this:
from string import Formatter

class MyStringFormatter(Formatter):
    def get_value(self, key, args, kwds):
        try:
            return super().get_value(key, args, kwds)
        except KeyError:
            return "{%s}" % key

fmt = MyStringFormatter()
DEMO
s = "{hello} {person_name}"
keywords = {'hello': 'hello'}
a = fmt.format(s, **keywords)
print(a)
# This will print hello {person_name}
something = False
if something:
    person_name = {'person_name': 'Alice'}
else:
    person_name = {'person_name': 'Bob'}
b = fmt.format(a, **person_name)
print(b)
# This will print `hello Bob` if `something` is False, 'hello Alice' otherwise.
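A related sketch, in case subclassing Formatter feels heavy: str.format_map() (Python 3.2+) accepts any mapping, so a small dict subclass with __missing__ can hand back the placeholder for unknown keys. KeepMissing is a name I made up; note that a missing field written with a conversion or format spec (such as {x:>10}) would not be preserved exactly by this approach.

```python
class KeepMissing(dict):
    """Hypothetical helper: unknown keys are rendered back as '{key}'."""
    def __missing__(self, key):
        return '{' + key + '}'

s = '{hello} {person_name}'
# first pass: only hello is known, person_name survives as a placeholder
a = s.format_map(KeepMissing(hello='hello'))
print(a)
# hello {person_name}

# second pass: fill in the remaining field
b = a.format_map(KeepMissing(person_name='Bob'))
print(b)
# hello Bob
```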
Is something like this possible or should I set all keywords when I use format()?
PEP-3101 says:
If the index or keyword refers to an item that does not exist, then an IndexError/KeyError should be raised.
So yes, if you are going to use keywords you would have to specify them all.
I think you have to define all the keywords while using format().
I would suggest a different approach using *args:
def printHello(*args):
    print(' '.join(args))
printHello('hello', 'Alice')
# hello Alice
printHello('hello')
# hello
You can send any number of words into this function.
I have two similar codes that need to be parsed and I'm not sure of the most pythonic way to accomplish this.
Suppose I have two similar "codes"
secret_code_1 = 'asdf|qwer-sdfg-wert$$otherthing'
secret_code_2 = 'qwersdfg-qw|er$$otherthing'
Both codes end with $$otherthing and contain a number of values separated by -.
At first I thought of using functools.wraps to separate some of the common logic from the logic specific to each type of code, something like this:
from functools import wraps

def parse_secret(f):
    @wraps(f)
    def wrapper(code, *args):
        _code = code.split('$$')[0]
        return f(code, *_code.split('-'))
    return wrapper

@parse_secret
def parse_code_1b(code, a, b, c):
    a = a.split('|')[0]
    return (a,b,c)

@parse_secret
def parse_code_2b(code, a, b):
    b = b.split('|')[1]
    return (a,b)
However, doing it this way makes it unclear what parameters you should actually pass to the parse_code_* functions, i.e.
parse_code_1b(secret_code_1)
parse_code_2b(secret_code_2)
So to keep the formal parameters of the function easier to reason about I changed the logic to something like this:
def _parse_secret(parse_func, code):
    _code = code.split('$$')[0]
    return parse_func(code, *_code.split('-'))

def _parse_code_1(code, a, b, c):
    """
    a, b, and c are descriptive parameters that explain
    the different components in the secret code
    returns a tuple of the decoded parts
    """
    a = a.split('|')[0]
    return (a,b,c)

def _parse_code_2(code, a, b):
    """
    a and b are descriptive parameters that explain
    the different components in the secret code
    returns a tuple of the decoded parts
    """
    b = b.split('|')[1]
    return (a,b)

def parse_code_1(code):
    return _parse_secret(_parse_code_1, code)

def parse_code_2(code):
    return _parse_secret(_parse_code_2, code)
Now it's easier to reason about what you pass to the functions:
parse_code_1(secret_code_1)
parse_code_2(secret_code_2)
However this code is significantly more verbose.
Is there a better way to do this? Would an object-oriented approach with classes make more sense here?
repl.it example
Functional approaches are more concise and make more sense.
We can start from expressing concepts in pure functions, the form that is easiest to compose.
Strip $$otherthing and split values:
parse_secret = lambda code: code.split('$$')[0].split('-')
Take one of inner values:
take = lambda value, index: value.split('|')[index]
Replace one of the values with its inner value:
parse_code = lambda values, p, q: \
    [take(v, q) if p == i else v for (i, v) in enumerate(values)]
These 2 types of codes have 3 differences:
Number of values
Position to parse "inner" values
Position of "inner" values to take
And we can compose parse functions by describing these differences. The split values are kept packed in a list so that things are easier to compose.
compose = lambda length, p, q: \
    lambda code: parse_code(parse_secret(code)[:length], p, q)
parse_code_1 = compose(3, 0, 0)
parse_code_2 = compose(2, 1, 1)
And use composed functions:
secret_code_1 = 'asdf|qwer-sdfg-wert$$otherthing'
secret_code_2 = 'qwersdfg-qw|er$$otherthing'
results = [parse_code_1(secret_code_1), parse_code_2(secret_code_2)]
print(results)
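For comparison, here is a sketch of the same composition written with a nested def instead of chained lambdas. make_parser is a hypothetical name of mine; the three parameters mirror the three differences listed above (number of values, which value to look inside, which inner part to keep).

```python
def make_parser(length, position, part):
    """Build a parser for one code type (hypothetical helper)."""
    def parse(code):
        # strip $$otherthing, split on '-', and keep the expected number of values
        values = code.split('$$')[0].split('-')[:length]
        # replace the value at `position` with one of its '|'-separated parts
        values[position] = values[position].split('|')[part]
        return values
    return parse

parse_code_1 = make_parser(3, 0, 0)
parse_code_2 = make_parser(2, 1, 1)

print(parse_code_1('asdf|qwer-sdfg-wert$$otherthing'))
# ['asdf', 'sdfg', 'wert']
print(parse_code_2('qwersdfg-qw|er$$otherthing'))
# ['qwersdfg', 'er']
```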
I believe something like this could work:
secret_codes = ['asdf|qwer-sdfg-wert$$otherthing', 'qwersdfg-qw|er$$otherthing']
def parse_code(code):
    _code = code.split('$$')
    if '-' in _code[0]:
        return _parse_secrets(_code[1], *_code[0].split('-'))
    return _parse_secrets(_code[0], *_code[1].split('-'))

def _parse_secrets(code, a, b, c=None):
    """
    a, b, and c are descriptive parameters that explain
    the different components in the secret code
    returns a tuple of the decoded parts
    """
    if c is not None:
        return a.split('|')[0], b, c
    return a, b.split('|')[1]

for secret_code in secret_codes:
    print(parse_code(secret_code))
Output:
('asdf', 'sdfg', 'wert')
('qwersdfg', 'er')
I'm not sure about your secret data structure, but if the position of an element containing a | is also the index of the part you want to keep, you could do something like this and handle an (almost) unlimited number of secrets:
def _parse_secrets(code, *data):
    """
    data is descriptive parameters that explain
    the different components in the secret code
    returns a tuple of the decoded parts
    """
    i = 0
    decoded_secrets = []
    for secret in data:
        if '|' in secret:
            decoded_secrets.append(secret.split('|')[i])
        else:
            decoded_secrets.append(secret)
        i += 1
    return tuple(decoded_secrets)
I'm really not sure what exactly you mean, but I came up with an idea which might be what you are looking for.
What about using a simple function like this:
def split_secret_code(code):
    return [code] + code[:code.find("$$")].split("-")
And then just use:
parse_code_1(*split_secret_code(secret_code_1))
I'm not sure exactly what constraints you're working with, but it looks like:
There are different types of codes with different rules
The number of dash separated args can vary
Which arg has a pipe can vary
Straightforward Example
This is not too hard to solve, and you don't need fancy wrappers, so I would just drop them because it adds reading complexity.
def pre_parse(code):
    dash_code, otherthing = code.split('$$')
    return dash_code.split('-')

def parse_type_1(code):
    dash_args = pre_parse(code)
    dash_args[0], toss = dash_args[0].split('|')
    return dash_args

def parse_type_2(code):
    dash_args = pre_parse(code)
    toss, dash_args[1] = dash_args[1].split('|')
    return dash_args
# Example call
parse_type_1(secret_code_1)
Trying to answer question as stated
You can supply arguments in this way by using python's native decorator pattern combined with *, which rolls/unrolls positional arguments into a tuple, so you don't need to know exactly how many there are.
def dash_args(code):
    dash_code, otherthing = code.split('$$')
    return dash_code.split('-')

def pre_parse(f):
    def wrapper(code):
        # HERE is where the outer function, the wrapper,
        # supplies arguments to the inner function.
        return f(code, *dash_args(code))
    return wrapper

@pre_parse
def parse_type_1(code, *args):
    new_args = list(args)
    new_args[0], toss = args[0].split('|')
    return new_args

@pre_parse
def parse_type_2(code, *args):
    new_args = list(args)
    toss, new_args[1] = args[1].split('|')
    return new_args

# Example call:
parse_type_1(secret_code_1)
More Extendable Example
If for some reason you needed to support many variations on this kind of parsing, you could use a simple OOP setup, like
class BaseParser(object):
    def get_dash_args(self, code):
        dash_code, otherthing = code.split('$$')
        return dash_code.split('-')

class PipeParser(BaseParser):
    def __init__(self, arg_index, split_index):
        self.arg_index = arg_index
        self.split_index = split_index

    def parse(self, code):
        args = self.get_dash_args(code)
        pipe_arg = args[self.arg_index]
        args[self.arg_index] = pipe_arg.split('|')[self.split_index]
        return args

# Example call
pipe_parser_1 = PipeParser(0, 0)
pipe_parser_1.parse(secret_code_1)
pipe_parser_2 = PipeParser(1, 1)
pipe_parser_2.parse(secret_code_2)
My suggestion attempts the following:
to be non-verbose enough
to separate common and specific logic in a clear way
to be sufficiently extensible
Basically, it separates common and specific logic into different functions (you could do the same using OOP). The thing is that it uses a mapper variable that contains the logic to select a specific parser, according to each code's content. Here it goes:
def parse_common(code):
    """
    Provides common parsing logic.
    """
    encoded_components = code.split('$$')[0].split('-')
    return encoded_components

def parse_code_1(code, components):
    """
    Specific parsing for type-1 codes.
    """
    components[0] = components[0].split('|')[0]  # decoding some type-1 component
    return tuple(components)

def parse_code_2(code, components):
    """
    Specific parsing for type-2 codes.
    """
    components[1] = components[1].split('|')[1]  # decoding some type-2 component
    return tuple(components)

def parse_code_3(code, components):
    """
    Specific parsing for type-3 codes.
    """
    components[2] = components[2].split('||')[0]  # decoding some type-3 component
    return tuple(components)

# ... and so on, if more codes need to be added ...
# Maps specific parser, according to the number of components
CODE_PARSER_SELECTOR = [
    (3, parse_code_1),
    (2, parse_code_2),
    (4, parse_code_3)
]

def parse_code(code):
    # executes common parsing
    components = parse_common(code)
    # selects specific parser
    parser_info = [s for s in CODE_PARSER_SELECTOR if len(components) == s[0]]
    if parser_info:
        parse_func = parser_info[0][1]
        return parse_func(code, components)
    else:
        raise RuntimeError('No parser found for code: %s' % code)
secret_codes = [
    'asdf|qwer-sdfg-wert$$otherthing',  # type 1
    'qwersdfg-qw|er$$otherthing',  # type 2
    'qwersdfg-hjkl-yui||poiuy-rtyu$$otherthing'  # type 3
]
print([parse_code(c) for c in secret_codes])
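As a variation on the selector list, a dict keyed by the number of components turns the lookup into a single step. This is only a sketch under the same assumption as above (the component count identifies the code type); PARSERS_BY_COUNT is a made-up name, and the parsers here take just the component list.

```python
def parse_common(code):
    # common logic: drop the $$ trailer and split the values
    return code.split('$$')[0].split('-')

def parse_code_1(components):
    components[0] = components[0].split('|')[0]
    return tuple(components)

def parse_code_2(components):
    components[1] = components[1].split('|')[1]
    return tuple(components)

# hypothetical registry: number of components -> specific parser
PARSERS_BY_COUNT = {3: parse_code_1, 2: parse_code_2}

def parse_code(code):
    components = parse_common(code)
    try:
        parser = PARSERS_BY_COUNT[len(components)]
    except KeyError:
        raise ValueError('No parser found for code: %s' % code)
    return parser(components)

print(parse_code('asdf|qwer-sdfg-wert$$otherthing'))
# ('asdf', 'sdfg', 'wert')
```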
Are you married to the string parsing? If you are passing values and have no need for the variable names, you can "pack" them into an integer.
If you are working with cryptography, you can build a long hexadecimal number from the characters and pass it as an int with "stop" bytes (0000, for example, since the character "0" is actually 48; try chr(48)). If you are married to a string, I would suggest a low byte value as the identifier (for example 1; try chr(1)), so that you can scan the integer by shifting it 8 bits at a time and applying an 8-bit mask to get each byte (this would look like (secret_code>>8)&0xff).
Hashing works in a similar manner: a variable's name and its value can each be parsed as an integer, joined with a stop marker, and retrieved when needed.
Let me give you an example of hashing:
# let's say
a = 1
# a sort of hashing would be
hash = ord('a')+(0b00<<8)+(1<<16)
# where the hashed a would be 65633 as an integer on a 64-bit computer
# and then you just need to find the 0b00, aka the separator
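To make the shifting and masking concrete, here is a small sketch that packs a few byte values into one integer and reads them back. pack_bytes and unpack_bytes are names I made up for illustration; the layout (lowest byte first) is an assumption chosen to match the 65633 example.

```python
def pack_bytes(values):
    """Pack byte values into one int, first value in the lowest 8 bits."""
    packed = 0
    for position, value in enumerate(values):
        packed |= (value & 0xff) << (8 * position)
    return packed

def unpack_bytes(packed, count):
    """Read `count` bytes back by shifting and applying an 8-bit mask."""
    return [(packed >> (8 * position)) & 0xff for position in range(count)]

# name byte 'a' (97), separator byte 0, value byte 1
packed = pack_bytes([ord('a'), 0, 1])
print(packed)
# 65633
print(unpack_bytes(packed, 3))
# [97, 0, 1]
```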
If you want to use only the values (names don't matter), then you need to hash only each variable's value, so the packed value is a lot smaller: there is no name part and no need for a separator (0b00). You can also use separators cleverly to divide the data at one level (0b00), two levels (0b00, 0b00<<8), etc.
a = 1
hash = a<<8 #if you want to shift it 1 byte
But if you want to hide it and you need a cryptography example, you can apply the above methods and then scramble, shift (a->b), or just convert it to another type later. You just need to figure out the order of the operations you are doing, since a-STOP-b-PASS-c is not equal to a-PASS-b-STOP-c.
You can find the bitwise operators here: binary operators
But keep in mind that 65 is a number and 65 is a character as well; it only matters where those bytes are sent. If they are sent to a graphics card they are pixels, if they are sent to an audio card they are sounds, and if they are sent to mathematical processing they are numbers. As programmers, that is our playground.
But if this does not answer your problem, you can always use map.
def mapProcces(proccesList,listToMap):
    currentProcces = proccesList.pop(0)
    listToMap = map( currentProcces, listToMap )
    if proccesList != []:
        return mapProcces( proccesList, listToMap )
    else:
        return list( listToMap )
then you could map it:
mapProcces([str.lower,str.upper,str.title],"stackowerflow")
Or you can simply replace every separator with a space and then split on spaces.
secret_code_1 = 'asdf|qwer-sdfg-wert$$otherthing'
separ = "|,-,$".split(",")
secret_code_1 = [x if x not in separ else " " for x in secret_code_1]  # replaces separators with spaces
secret_code_1 = "".join(secret_code_1)  # converts the list to a string
secret_code_1 = secret_code_1.split(" ")  # splits the string into a list
secret_code_1 = filter(None, secret_code_1)  # filters out empty strings ''
first,second,third,fourth,other = secret_code_1
And there you have it: your secret_code_1 is split and assigned to a fixed number of variables. Of course, " " is used only as a placeholder; you can use whatever you want. You can replace every separator with "someseparator" and then split on "someseparator". You can also use the str.replace function to make it clearer.
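A sketch of the str.replace variant mentioned above, assuming the same three separator characters. Calling split() with no argument splits on runs of whitespace, so the empty string produced by the doubled $ never appears and no filter step is needed.

```python
secret_code_1 = 'asdf|qwer-sdfg-wert$$otherthing'

normalized = secret_code_1
for sep in ('|', '-', '$'):
    # turn each separator into a space
    normalized = normalized.replace(sep, ' ')

# split() without arguments collapses consecutive spaces
parts = normalized.split()
print(parts)
# ['asdf', 'qwer', 'sdfg', 'wert', 'otherthing']

first, second, third, fourth, other = parts
```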
I hope this helps
I think you need to provide more information about exactly what you're trying to achieve and what the constraints are. For instance, how many times can $$ occur? Will there always be a | divider? That kind of thing.
To answer your question broadly, an elegant, pythonic way to do this is to use Python's unpacking feature combined with split. For example:
secret_code_1 = 'asdf|qwer-sdfg-wert$$otherthing'
first_part, last_part = secret_code_1.split('$$')
By using this technique, in addition to simple if blocks, you should be able to write an elegant parser.
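A minimal sketch of what such a parser might look like, assuming exactly one $$ per code and at most one | per value (both assumptions of mine, since the question doesn't pin the constraints down):

```python
def parse_code(code):
    # unpacking fails loudly if there is not exactly one '$$'
    body, trailer = code.split('$$')
    parts = []
    for part in body.split('-'):
        if '|' in part:
            # keep both halves; the caller decides which one it needs
            left, right = part.split('|')
            parts.append((left, right))
        else:
            parts.append(part)
    return parts, trailer

print(parse_code('asdf|qwer-sdfg-wert$$otherthing'))
# ([('asdf', 'qwer'), 'sdfg', 'wert'], 'otherthing')
```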
If I understand it correctly, you want to be able to define your functions as if the parsed arguments are passed, but want to pass the unparsed code to the functions instead.
You can do that very similarly to the first solution you presented.
from functools import wraps

def parse_secret(f):
    @wraps(f)
    def wrapper(code):
        args = code.split('$$')[0].split('-')
        return f(*args)
    return wrapper

@parse_secret
def parse_code_1(a, b, c):
    a = a.split('|')[0]
    return (a,b,c)

@parse_secret
def parse_code_2(a, b):
    b = b.split('|')[1]
    return (a,b)
For the secret codes mentioned in the examples,
secret_code_1 = 'asdf|qwer-sdfg-wert$$otherthing'
print (parse_code_1(secret_code_1))
>> ('asdf', 'sdfg', 'wert')
secret_code_2 = 'qwersdfg-qw|er$$otherthing'
print (parse_code_2(secret_code_2))
>> ('qwersdfg', 'er')
I didn't fully understand your question or your code, but maybe a simple way to do it is with a regular expression?
import re
secret_code_1 = 'asdf|qwer-sdfg-wert$$otherthing'
secret_code_2 = 'qwersdfg-qw|er$$otherthing'
def parse_code(code):
    regex = re.search(r'([\w-]+)\|([\w-]+)\$\$([\w]+)', code)  # regular expression
    return regex.group(3), regex.group(1).split("-"), regex.group(2).split("-")
otherthing, first_group, second_group = parse_code(secret_code_2)
print(otherthing) # otherthing, string
print(first_group) # first group, list
print(second_group) # second group, list
The output:
otherthing
['qwersdfg', 'qw']
['er']
I ran into this situation while writing Python code for my project and began to think about the problem.
The problem is: given a string containing a function name with its arguments, how do we get the function name and the arguments, given the number of arguments in the function?
My first thought was:
s = 'func(a,b)'
index = s.find('(')
if index != -1:
    arg_list = s[index+1:-1].split(',')
    func_name = s[:index]
But as I thought about it more, I realised: what if a function is passed as an argument to another function and has its own arguments?
func1(func2(a,b,c),func3(d,e))
With my above approach I will end up with the right function name, but arg_list will contain
["func2(a","b","c)","func3(d","e)"]
How to generically solve this situation?
If your language looks sufficiently like Python, use ast.parse():
import ast
def parse(s):
    tree = ast.parse(s)
    print(ast.dump(tree))

parse('f(a,b)')
All the information you need will be encoded in tree.
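As a sketch of how that information could be pulled out of the tree, the helper below walks ast.Call nodes recursively. describe and parse_call are names I made up, and the sketch only handles plain names and calls to plain names; anything else (literals, attribute access, keyword arguments) would need extra cases.

```python
import ast

def describe(node):
    """Return (name, [args]) for a call, or the identifier for a plain name."""
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        return (node.func.id, [describe(arg) for arg in node.args])
    if isinstance(node, ast.Name):
        return node.id
    raise ValueError('unsupported expression: %s' % ast.dump(node))

def parse_call(s):
    # mode='eval' parses a single expression; its root node is in .body
    tree = ast.parse(s, mode='eval')
    return describe(tree.body)

print(parse_call('func1(func2(a,b,c),func3(d,e))'))
# ('func1', [('func2', ['a', 'b', 'c']), ('func3', ['d', 'e'])])
```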
>>> import pyparsing as pyp
>>> def get_name_and_args(a_string):
...     index = a_string.find('(')
...     if index == -1:
...         raise Exception("No '(' found")
...     else:
...         root_function, a_string = a_string[:index], a_string[index:]
...         data = {}
...         data[root_function] = pyp.nestedExpr().parseString(a_string).asList()[0][0].split(',')
...         return data
...
>>> print(get_name_and_args("func(a,b)"))
{'func': ['a', 'b']}
This solves the simpler example you gave using the pyparsing module. I wasn't sure exactly how you wanted the output formatted, and it doesn't work for the nested example. However, this should be enough to get you started.
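For the nested case, one dependency-free idea is to split on commas only when the parenthesis depth is zero. This is a rough sketch (split_call is a made-up name) that assumes balanced parentheses and no commas or parentheses inside string literals.

```python
def split_call(s):
    """Split 'name(arg, ...)' into the name and its top-level arguments."""
    index = s.find('(')
    if index == -1 or not s.endswith(')'):
        raise ValueError('not a call expression: %r' % s)
    name, inner = s[:index], s[index + 1:-1]
    args, current, depth = [], [], 0
    for ch in inner:
        if ch == ',' and depth == 0:
            # a comma outside any nested parentheses ends the current argument
            args.append(''.join(current).strip())
            current = []
        else:
            if ch == '(':
                depth += 1
            elif ch == ')':
                depth -= 1
            current.append(ch)
    if current:
        args.append(''.join(current).strip())
    return name, args

print(split_call('func1(func2(a,b,c),func3(d,e))'))
# ('func1', ['func2(a,b,c)', 'func3(d,e)'])
```

Each returned argument that still contains a '(' can be passed to split_call again to descend one level.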
In Python, I'm trying to implement a pseudo-ternary operator within a template string. A value is inserted into the string if kwargs has a specific key.
The re module has a way to do exactly what I need: with re.sub(), you can pass a function to be called on matches. What I can't do is pass **kwargs to it. Code follows:
import re
template_string = "some text (pseudo_test?val_if_true:val_if_false) some text"
def process_pseudo_ternary(match, **kwargs):
    if match.groups()[0] in kwargs:
        return match.groups()[1]
    else:
        return match.groups()[2]

def process_template(ts, **kwargs):
    m = re.compile(r'\((.*)\?(.*):(.*)\)')
    return m.sub(process_pseudo_ternary, ts)

print(process_template(template_string, **{'pseudo_test':'yes-whatever', 'other_value':42}))
The line if match.groups()[0] in kwargs: is the problem, of course, as process_pseudo_ternary's kwargs are empty.
Any ideas on how to pass these? m.sub(function, string) doesn't take extra arguments for the function.
The final string is to be: some text val_if_true some text (because the dictionary has the key named 'pseudo_test').
Feel free to redirect me to a different implementation of a ternary operator in a string. I'm aware of Python conditional string formatting. I need the ternary to be in the string, not in the string's formatting tuple/dict.
If I understand it correctly, you could use something like functools.partial (http://docs.python.org/library/functools.html#functools.partial):
return m.sub(partial(process_pseudo_ternary, custom_1=True, custom_2=True), ts)
EDIT: Changed a little, to match your code better.
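Applied to the code in the question, this might look like the sketch below: partial() binds the keyword arguments up front, and re.sub() then calls the resulting function with just the match object. The regex and function names are taken from the question; unpacking match.groups() into three names is my own shortening.

```python
import re
from functools import partial

def process_pseudo_ternary(match, **kwargs):
    test, if_true, if_false = match.groups()
    return if_true if test in kwargs else if_false

def process_template(ts, **kwargs):
    pattern = re.compile(r'\((.*)\?(.*):(.*)\)')
    # bind kwargs now; sub() supplies the match as the only positional argument
    return pattern.sub(partial(process_pseudo_ternary, **kwargs), ts)

template_string = "some text (pseudo_test?val_if_true:val_if_false) some text"
print(process_template(template_string, pseudo_test='yes-whatever', other_value=42))
# some text val_if_true some text
```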
I have a small Python script which I use every day. It basically reads a file and, for each line, applies different string functions like strip(), replace(), etc. I'm constantly editing the file and commenting things in and out to change the functions. Depending on the file I'm dealing with, I use different functions. For example, I have a file where, for each line, I need to use line.replace(' ','') and line.strip()...
What's the best way to make all of these part of my script, so that I can assign numbers to the functions and just say: apply functions 1 and 4 to each line?
First of all, many functions in the string module, including strip and replace, are deprecated. The following answer uses string methods instead. (Instead of string.strip(" Hello "), I use the equivalent of " Hello ".strip().)
Here's some code that will simplify the job for you. The following code assumes that whatever methods you call on your string, that method will return another string.
class O(object):
    c = str.capitalize
    r = str.replace
    s = str.strip

def process_line(line, *ops):
    i = iter(ops)
    while True:
        try:
            op = next(i)
            args = next(i)
        except StopIteration:
            break
        line = op(line, *args)
    return line
The O class exists so that your highly abbreviated method names don't pollute your namespace. When you want to add more string methods, you add them to O in the same format as those given.
The process_line function is where all the interesting things happen. First, here is a description of the argument format:
The first argument is the string to be processed.
The remaining arguments must be given in pairs.
The first argument of the pair is a string method. Use the shortened method names here.
The second argument of the pair is a list representing the arguments to that particular string method.
The process_line function returns the string that emerges after all these operations have been performed.
Here is some example code showing how you would use the above code in your own scripts. I've separated the arguments of process_line across multiple lines to show the grouping of the arguments. Of course, if you're just hacking away and using this code in day-to-day scripts, you can compress all the arguments onto one line; this actually makes it a little easier to read.
f = open("parrot_sketch.txt")
for line in f:
    p = process_line(
        line,
        O.r, ["He's resting...", "This is an ex-parrot!"],
        O.c, [],
        O.s, []
    )
    print(p)
Of course, if you very specifically wanted to use numerals, you could name your functions O.f1, O.f2, O.f3… but I'm assuming that wasn't the spirit of your question.
If you insist on numbers, you can't do much better than a dict (as gimel suggests) or list of functions (with indices zero and up). With names, though, you don't necessarily need an auxiliary data structure (such as gimel's suggested dict), since you can simply use getattr to retrieve the method to call from the object itself or its type. E.g.:
def all_lines(somefile, methods):
    """Apply a sequence of methods to all lines of some file and yield the results.
    Args:
        somefile: an open file or other iterable yielding lines
        methods: a string that's a whitespace-separated sequence of method names.
            (note that the methods must be callable without arguments beyond the
            str to which they're being applied)
    """
    tobecalled = [getattr(str, name) for name in methods.split()]
    for line in somefile:
        for tocall in tobecalled: line = tocall(line)
        yield line
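A quick usage sketch, with a list of strings standing in for an open file (the sample lines are made up; the function is repeated so the snippet runs on its own):

```python
def all_lines(somefile, methods):
    # look each method name up on str once, before touching any line
    tobecalled = [getattr(str, name) for name in methods.split()]
    for line in somefile:
        for tocall in tobecalled:
            line = tocall(line)
        yield line

sample = ['  Hello World  \n', '\tSECOND line\n']
result = list(all_lines(sample, 'strip lower'))
print(result)
# ['hello world', 'second line']
```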
It is possible to map string operations to numbers:
>>> import string
>>> ops = {1:string.split, 2:string.replace}
>>> my = "a,b,c"
>>> ops[1](",", my)
[',']
>>> ops[1](my, ",")
['a', 'b', 'c']
>>> ops[2](my, ",", "-")
'a-b-c'
>>>
But maybe string descriptions of the operations will be more readable.
>>> ops2={"split":string.split, "replace":string.replace}
>>> ops2["split"](my, ",")
['a', 'b', 'c']
>>>
Note:
Instead of using the string module, you can use the str type for the same effect.
>>> ops={1:str.split, 2:str.replace}
To map names (or numbers) to different string operations, I'd do something like
OPERATIONS = dict(
    strip = str.strip,
    lower = str.lower,
    removespaces = lambda s: s.replace(' ', ''),
    maketitle = lambda s: s.title().center(80, '-'),
    # etc
)

def process(myfile, ops):
    for line in myfile:
        for op in ops:
            line = OPERATIONS[op](line)
        yield line
which you use like this:
for line in process(afile, ['strip', 'removespaces']):
    ...
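If you really do want numbers ("apply functions 1 and 4"), the same idea works with a list, where the number is simply the 1-based position. A sketch, with a made-up set of four operations:

```python
# position in the list is the function's number (1-based)
OPERATIONS = [
    str.strip,                       # 1
    str.lower,                       # 2
    str.title,                       # 3
    lambda s: s.replace(' ', ''),    # 4
]

def process(lines, numbers):
    for line in lines:
        for n in numbers:
            line = OPERATIONS[n - 1](line)
        yield line

sample = ['  Hello World  \n']
print(list(process(sample, [1, 4])))
# ['HelloWorld']
```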