How to make it shorter (Pythonic)? - python

I have to check whether any of a long list of words occurs in a string. The code looks like this:
if "string_1" in var_string or "string_2" in var_string or "string_3" in var_string or "string_n" in var_string:
    do_something()
How can I make it more readable and clearer?

This is one way:
words = ['string_1', 'string_2', ...]
if any(word in var_string for word in words):
    do_something()
Reference: any()
Update:
For completeness, if you want to execute the function only if all words are contained in the string, you can use all() instead of any().
Also note that this construct does no unnecessary work: any() returns as soon as it encounters a true value, and because a generator expression produces the Boolean values lazily, the remaining words are never tested. This is the same short-circuit evaluation that applies to ordinary Boolean expressions.
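A minimal sketch of that short-circuit behaviour. The helper contains() and the checked list are hypothetical names, added only to record which words actually get tested; the sample string is made up.

```python
# Hypothetical helper that records every membership test it performs.
checked = []

def contains(word, text):
    checked.append(word)
    return word in text

words = ['string_1', 'string_2', 'string_3']
var_string = 'prefix string_2 suffix'

# any() stops at the first True, so 'string_3' is never tested.
found_any = any(contains(word, var_string) for word in words)

# all() stops at the first False, so only 'string_1' is tested here.
checked_before_all = len(checked)
found_all = all(contains(word, var_string) for word in words)
tested_by_all = checked[checked_before_all:]
```

If the generator expression were replaced by a list comprehension, every word would be tested before any() or all() saw the first result.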

import re
if re.search("string_1|string_2|string_n", var_string): print(True)
The beauty of Python's regex module is that it returns either a match object (which gives information about what matched) or None, which can be used as a false value in a test.

With regex that would be:
import re
words = ['string_1', 'string_2', ...]
if re.search('|'.join([re.escape(w) for w in words]), var_string):
    blahblah
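A small sketch of why the re.escape() call matters. The words and the sample string are made up for illustration; the point is that an unescaped '.' matches any character.

```python
import re

words = ['a.b', 'c++']          # hypothetical words containing regex metacharacters
var_string = 'we use axb here'  # contains neither word literally

# Escaped: each word is matched literally, so nothing is found.
escaped = '|'.join(re.escape(w) for w in words)
literal_hit = re.search(escaped, var_string)

# Unescaped: '.' matches any character, so 'a.b' also matches 'axb'.
loose_hit = re.search('a.b', var_string)
```

If the words are known to be plain alphanumeric strings, the escaping changes nothing, but it is cheap insurance when they come from user input.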

Have you looked at filter?
filter( lambda x: x in var_string, ["myString", "nextString"])
which then can be combined with map to get this
map( doSomething, filter(lambda x: x in var_string, ["myString", "nextString"] ) )
EDIT:
of course that doesn't do what you want. Go with the any solution. For some reason I thought you wanted it done every time instead of just once.
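That said, if you ever need to know which words matched rather than just whether one did, the filter idea is still useful. A rough sketch, with a made-up var_string:

```python
var_string = 'this has myString but not the other one'
candidates = ['myString', 'nextString']  # example words, as in the answer above

# Collect the words that occur in var_string.
matched = [word for word in candidates if word in var_string]

# In Python 3, filter() returns a lazy iterator, so wrap it in list() to see the result.
matched_with_filter = list(filter(lambda x: x in var_string, candidates))
```

Both give the same list; the comprehension is usually considered the more readable of the two.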

>>> import re
>>> string="word1testword2andword3last"
>>> c=re.compile("word1|word2|word3")
>>> c.search(string)
<_sre.SRE_Match object at 0xb7715d40>
>>> string="blahblah"
>>> c.search(string)
>>>
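As a sketch of what the match object gives you beyond a true/false answer (same pattern and string as in the session above; the variable names are only illustrative):

```python
import re

text = "word1testword2andword3last"
pattern = re.compile("word1|word2|word3")

match = pattern.search(text)       # first match only
first_word = match.group(0)        # the text that matched
first_span = match.span()          # (start, end) indices of that match

all_words = pattern.findall(text)  # every non-overlapping match, left to right
```

Compiling the pattern once is mainly worthwhile when the same pattern is applied to many strings.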

one more way to achieve this
check = lambda a: any(y for y in ['string_%s'%x for x in range(0,10)] if y in a)
print(check('hello string_1'))

Related

During string concatenation, how to add delimiter only if variable is set?

How can I add the delimiter only if the variable has a value? In the code below I am trying to avoid two consecutive underscores, as in foo_bar__baz. a, b, and d will always be set; only c is optional. Is there a more Pythonic way?
>>> a_must='foo'
>>> b_must='bar'
>>> c_optional=''
>>> d_must='baz'
>>>
>>> f'{a_must}_{b_must}_{c_optional}_{d_must}' if c_optional else f'{a_must}_{b_must}_{d_must}'
'foo_bar_baz'
This is in Python 3.6.
You can write the conditional inside the f-string itself:
f'{a_must}_{b_must}_{c_optional+"_" if c_optional else ""}{d_must}'
Output:
'foo_bar_baz'
To be a little more flexible, something like this would work:
variables = [a_must, b_must, c_optional, d_must]
'_'.join([x for x in variables if x])
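The same idea can be wrapped in a small helper. This is only a sketch; join_nonempty is a made-up name, and it treats any falsy part (empty string or None) as "not set".

```python
def join_nonempty(*parts, sep='_'):
    """Join the parts with sep, skipping empty strings and None."""
    return sep.join(p for p in parts if p)

with_optional = join_nonempty('foo', 'bar', 'qux', 'baz')
without_optional = join_nonempty('foo', 'bar', '', 'baz')
```

This scales to any number of optional parts without adding a conditional for each one.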
You can build a list of tokens and use str.join to join the list into a string with _ as the delimiter:
tokens = [a_must, b_must]
if c_optional:
    tokens.append(c_optional)
tokens.append(d_must)
print('_'.join(tokens))
Your solution works fine, it just needed a little formatting. I added the print statement for testing.
a_must='foo'
b_must='bar'
c_optional=''
d_must='baz'
if c_optional:
    result = f'{a_must}_{b_must}_{c_optional}_{d_must}'
else:
    result = f'{a_must}_{b_must}_{d_must}'
print(result)

Easier way to write this adhering to pylint

Is there a better way to write something like this ?
if 'legal' in href_link or 'disclaimer' in href_link or 'contact' in href_link or 'faq' in href_link or 'terms' in href_link or 'log' in href_link:
    continue
preferably in a single line...Where do I look?
Use the built-in any:
items = ('legal', 'disclaimer', 'contact', 'faq', 'terms', 'log')
if any(x in href_link for x in items):
    continue
You can pass the iterable directly to any() to get a true one-liner, but it is more readable this way.
You could build a regular expression. I'm not sure about efficiency; you'd have to compare with @MosesKoledoye's nice answer.
To match against alternatives you use the pipe |. You'd need something like legal|disclaimer|contact|faq|terms|log as a pattern.
You can build that by joining a string '|' with the values:
>>> values = {'legal', 'disclaimer', 'contact', 'faq', 'terms', 'log'}
>>> pattern = '|'.join(values)
>>> pattern
'terms|log|faq|legal|contact|disclaimer'
Using the re (regular expression) module:
>>> import re
>>> href_link = 'link_to_disclaimer.html'
>>> if re.search(pattern, href_link):
...     print('matches')
...
matches
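A quick sketch that checks the regex and the any() approach agree. The sample links are made up, and re.escape() is added on the assumption that the values might one day contain regex metacharacters.

```python
import re

values = ('legal', 'disclaimer', 'contact', 'faq', 'terms', 'log')
# Compile once; re.escape guards against values that contain regex metacharacters.
pattern = re.compile('|'.join(re.escape(v) for v in values))

links = ['link_to_disclaimer.html', 'products.html', 'blog/post.html']  # made-up sample links

for href_link in links:
    by_any = any(v in href_link for v in values)
    by_regex = pattern.search(href_link) is not None
    assert by_any == by_regex  # both approaches are plain substring tests

# Note that 'blog/post.html' is skipped too, because 'log' is a substring of 'blog'.
skipped = [link for link in links if pattern.search(link)]
```

Both versions match substrings, so short words such as log will also hit blog or catalog; whether that is acceptable depends on your links.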
@MosesKoledoye's answer is probably the best one for you: condensing the six uniform tests into one iteration certainly makes for much better code.
But you might instead have been asking "How can I break a long conditional to fit into 79 characters?". In other words, you might have been asking about code formatting rather than how to code. In which case my preferred answer is to format it something like this:
if (a in b or
        c in d or
        e not in f or
        g not in h):
    continue

Python: Dividing a string into substrings

I have a bunch of mathematical expressions stored as strings. Here's a short one:
stringy = "((2+2)-(3+5)-6)"
I want to break this string up into a list that contains ONLY the information in each "sub-parenthetical phrase" (I'm sure there's a better way to phrase that.) So my yield would be:
['2+2','3+5']
I have a couple of ideas about how to do this, but I keep running into a "okay, now what" issue.
For example:
for x in stringy:
    substring = stringy[stringy.find('(')+1 : stringy.find(')')+1]
    stringlist.append(substring)
Works just peachy to return 2+2, but that's about as far as it goes, and I am completely blanking on how to move through the remainder...
One way using regex:
import re
stringy = "((2+2)-(3+5)-6)"
for exp in re.findall(r"\(([\s\d+*/-]+)\)", stringy):
    print(exp)
Output
2+2
3+5
You could use regular expressions like the following:
import re
x = "((2+2)-(3+5)-6)"
re.findall(r"(?<=\()[0-9+/*-]+(?=\))", x)
Result:
['2+2', '3+5']
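If you would rather avoid regular expressions, a single pass over the string also works. This is a rough sketch that assumes the parentheses are balanced; innermost_groups is a made-up name.

```python
def innermost_groups(expression):
    """Return the contents of every parenthesized group that contains no nested group."""
    groups = []
    start = None  # index just after the most recent '(' that has not been closed
    for index, char in enumerate(expression):
        if char == '(':
            start = index + 1
        elif char == ')' and start is not None:
            groups.append(expression[start:index])
            start = None  # outer groups are skipped because they contain nested ones
    return groups

result = innermost_groups("((2+2)-(3+5)-6)")
```

Unlike the character-class regexes above, this does not care what is inside the parentheses, so it also handles variables or function names.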

How do I write a regex to replace a word but keep its case in Python?

Is this even possible?
Basically, I want to turn these two calls to sub into a single call:
re.sub(r'\bAword\b', 'Bword', mystring)
re.sub(r'\baword\b', 'bword', mystring)
What I'd really like is some sort of conditional substitution notation like:
re.sub(r'\b([Aa])word\b', '(?1=A:B,a:b)word')
I only care about the capitalization of the first character. None of the others.
You can have functions to parse every match:
>>> import re
>>> def f(match):
...     return chr(ord(match.group(0)[0]) + 1) + match.group(0)[1:]
...
>>> re.sub(r'\b[aA]word\b', f, 'aword Aword')
'bword Bword'
OK, here's the solution I came up with, thanks to the suggestions to use a replace function.
re.sub(r'\b[Aa]word\b', lambda x: ('B' if x.group()[0].isupper() else 'b') + 'word', 'Aword aword.')
You can pass a lambda function which uses the Match object as a parameter as the replacement function:
import re
re.sub(r'\baword\b',
       lambda m: m.group(0)[0].lower() == m.group(0)[0] and 'bword' or 'Bword',
       'Aword aword',
       flags=re.I)
# returns: 'Bword bword'
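Use capture groups (r'\1'). Note that this reinserts the captured first letter unchanged, so it preserves case only when the replacement starts with the same letter as the original: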
re.sub(r'\b([Aa])word\b', r'\1word', "hello Aword")
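The replacement-function answers above can be generalized to arbitrary words. A sketch, assuming both words are non-empty and Python 3.6+ (for the scoped inline flag (?i:...)); replace_keep_case is a made-up name.

```python
import re

def replace_keep_case(old, new, text):
    """Replace whole-word `old` with `new`, copying the case of the first letter."""
    def fix_case(match):
        if match.group(0)[0].isupper():
            return new[0].upper() + new[1:]
        return new[0].lower() + new[1:]
    # Only the first letter of `old` is matched case-insensitively.
    first, rest = re.escape(old[0]), re.escape(old[1:])
    pattern = r'\b(?i:' + first + r')' + rest + r'\b'
    return re.sub(pattern, fix_case, text)

result = replace_keep_case('aword', 'bword', 'Aword aword AWORD.')
```

As in the question, only the capitalization of the first character is considered, so AWORD is left alone.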

Python: Use Regular expression to remove something

I've got a string that looks like this:
ABC(a =2,b=3,c=5,d=5,e=Something)
I want the result to be like
ABC(a =2,b=3,c=5)
What's the best way to do this? I'd prefer to use a regular expression in Python.
Sorry, the input has changed; the raw string is now:
ABC(a =2,b=3,c=5,dddd=5,eeee=Something)
import re
longer = "ABC(a =2,b=3,c=5,d=5,e=Something)"
shorter = re.sub(r',\s*d=\d+,\s*e=[^)]+', '', longer)
# shorter: 'ABC(a =2,b=3,c=5)'
Once the OP knows how many elements are in the list, this also works:
shorter = re.sub(r',\s*d=[^)]+', '', longer)
It cuts , d= and everything after it, but not the closing parenthesis.
Non regex
>>> s="ABC(a =2,b=3,c=5,d=5,e=Something)"
>>> ','.join(s.split(",")[:-2])+")"
'ABC(a =2,b=3,c=5)'
If you want a regex that always removes the last 2:
>>> import re
>>> s="ABC(a =2,b=3,c=5,d=5,e=6,f=7,g=Something)"
>>> re.sub(r"(.*)(,.[^,]*,.[^,]*)\Z", r"\1)", s)
'ABC(a =2,b=3,c=5,d=5,e=6)'
>>> s="ABC(a =2,b=3,c=5,d=5,e=Something)"
>>> re.sub(r"(.*)(,.[^,]*,.[^,]*)\Z", r"\1)", s)
'ABC(a =2,b=3,c=5)'
If it's always the first 3:
>>> s="ABC(a =2,b=3,c=5,d=5,e=Something)"
>>> re.sub(r"([^,]+,[^,]+,[^,]+)(,.*)", r"\1)", s)
'ABC(a =2,b=3,c=5)'
>>> s="ABC(q =2,z=3,d=5,d=5,e=Something)"
>>> re.sub(r"([^,]+,[^,]+,[^,]+)(,.*)", r"\1)", s)
'ABC(q =2,z=3,d=5)'
import re
re.sub(r',d=\d*,e=[^\)]*','', your_string)
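Since the key names changed in the edited question (dddd, eeee), a version that does not depend on the names may be more robust. A sketch, assuming the values themselves never contain commas or parentheses; keep_first_args is a made-up name.

```python
def keep_first_args(call_text, count):
    """Keep only the first `count` comma-separated arguments inside the parentheses."""
    name, _, rest = call_text.partition('(')
    # Drop the single closing parenthesis, if present.
    inner = rest[:-1] if rest.endswith(')') else rest
    args = inner.split(',')
    return name + '(' + ','.join(args[:count]) + ')'

result = keep_first_args("ABC(a =2,b=3,c=5,dddd=5,eeee=Something)", 3)
```

This works for both the original and the edited input string, because it counts arguments instead of matching d= and e=.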
