I'm writing this code
for n in filtered_list:
    for a in range(1,3):
        duplicate = str(filenameRegex.group(1) + "(" + n + ")" + filenameRegex.group(2))
Is there a more concise way to write this, specifically the "(" + n + ")" part? I was thinking of something like %s, but I couldn't work out by trial and error how to use it in this case.
You can use %s here: it converts any value with str(), so it works whether n is a string or an integer. %d also works, but only if n is an integer:
for n in filtered_list:
    for a in range(1, 3):
        duplicate = "%s(%d)%s" % (filenameRegex.group(1), n, filenameRegex.group(2))
This is old-school formatting, though. In Python 3.6 and later you can use f-strings:
for n in filtered_list:
    for a in range(1, 3):
        duplicate = f"{filenameRegex.group(1)}({n}){filenameRegex.group(2)}"
You can try it like this:
duplicate = "%s(%s)%s"%(filenameRegex.group(1),n,filenameRegex.group(2))
or
duplicate = "{0}({1}){2}".format(filenameRegex.group(1),n,filenameRegex.group(2))
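To compare the three styles side by side, here is a minimal sketch. The pattern behind filenameRegex was not shown in the question, so the regex and the file name below are assumptions chosen only to give group(1) and group(2) something to return.

```python
import re

# Hypothetical stand-in for the asker's filenameRegex: base name in group 1,
# extension in group 2. The real pattern was not shown in the question.
filenameRegex = re.match(r"(.*)(\.\w+)$", "report.txt")
n = 2

# %-formatting: %s calls str() on each argument, so n may be an int or a str.
percent_style = "%s(%s)%s" % (filenameRegex.group(1), n, filenameRegex.group(2))

# str.format with automatic field numbering.
format_style = "{}({}){}".format(filenameRegex.group(1), n, filenameRegex.group(2))

# f-string (Python 3.6+).
fstring_style = f"{filenameRegex.group(1)}({n}){filenameRegex.group(2)}"

print(percent_style, format_style, fstring_style)
```

All three produce the same string, so the choice is mostly a matter of readability and the Python version you need to support.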
I want to output something that looks like this (with 20 items per line, and two variables are needed, one for indexing, i, and one for the book name, bookname, in this case "Psalm"):
\hyperlink{Psalm1}{1} & \hyperlink{Psalm2}{2} & \hyperlink{Psalm3}{3} ...
Using Python (simplification of my loop, but enough to show the key error) I attempted:
for indx in range(1, 150, 20):
    line = r""" \\hline
\\hyperlink{{{bn}{i}}}{{{i}}} & \\hyperlink{{{bn}{i+1}}}{{{i+1}}} & \\hyperlink{{{bn}{i+2}}}{{{i+2}}} ...
""".format( i=indx, bn = bookname)
What's the best way to recode this to avoid the KeyError on i+1?
Here is an example of generating the string \hyperlink{Psalm1}{1} using different methods:
i = 1
# string concatenation
formatted = r"\hyperlink{Psalm" + str(i) + "}{" + str(i) + "}"
# old-style formatting
formatted = r"\hyperlink{Psalm%d}{%d}" % (i, i)
# str.format
formatted = r"\hyperlink{{Psalm{0}}}{{{0}}}".format(i)
# f-string
formatted = rf"\hyperlink{{Psalm{i}}}{{{i}}}"
For this particular case I find old-style formatting cleaner, as it doesn't require doubling the curly brackets.
To print 20 strings on each line, you can pass a generator that produces the formatted strings to str.join().
Full code:
stop = 150
step = 20
for i in range(1, stop + 1, step):
    print(
        " & ".join(
            r"\hyperlink{Psalm%d}{%d}" % (n, n)
            for n in range(i, min(i + step, stop + 1))
        )
    )
Or you can use a "one-liner":
stop = 150
step = 20
print(
    "\n".join(  # use " &\n".join(...) if you need trailing '&'
        " & ".join(
            r"\hyperlink{Psalm%d}{%d}" % (n, n)
            for n in range(i, min(i + step, stop + 1))
        )
        for i in range(1, stop + 1, step)
    )
)
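The question also asked for the book name to be a variable. As a rough sketch of the same join-based approach with bookname passed in, the function name hyperlink_rows and its parameters below are my own and not from the original posts:

```python
def hyperlink_rows(bookname, stop, step=20):
    """Return one string per table row, with `step` hyperlinks in each row."""
    rows = []
    for start in range(1, stop + 1, step):
        # Old-style formatting avoids doubling the literal LaTeX braces.
        cells = (
            r"\hyperlink{%s%d}{%d}" % (bookname, n, n)
            for n in range(start, min(start + step, stop + 1))
        )
        rows.append(" & ".join(cells))
    return rows


rows = hyperlink_rows("Psalm", 150)
print("\n".join(rows))
```

Returning a list rather than printing directly makes it easy to append a row terminator or \hline to each row before writing the result out.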
Try using an f-string (literal braces must be doubled):
for i in range(1, 150, 20):
    print(f"\\hyperlink{{Psalm{i}}}{{{i}}} & \\hyperlink{{Psalm{i+1}}}{{{i+1}}} & \\hyperlink{{Psalm{i+2}}}{{{i+2}}}")
I have a string:
a = 'babababbaaaaababbbab'
And it needs to be shortened so it looks like this:
(ba)3(b)2(a)5ba(b)3ab
So basically it needs to take all repeating characters and write how many times they are repeating instead of printing them.
I managed to do half of this:
from itertools import groupby
a = 'babababbaaaaababbbab'
grouped = ["".join(grp) for patt,grp in groupby(a)]
solved = [str(len(i)) + i[0] for i in grouped if len(i) >= 2]
but this only handles repeated single characters, not repeated patterns. I understand that I could do this by searching for the 'ab' pattern in the string, but it needs to work for every possible string. Has anyone encountered something similar?
You can easily do this with regex:
>>> import re
>>> repl = lambda match: '({}){}'.format(match.group(1), len(match.group()) // len(match.group(1)))
>>> re.sub(r'(.+?)\1+', repl, 'babababbaaaaababbbab')
'(ba)3(b)2(a)5ba(b)3ab'
Not much to explain here. The pattern (.+?)\1+ matches repeating character sequences, and the lambda function rewrites them to the form (sequence)number.
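For reuse, the same regex idea can be wrapped in a function. This is only a sketch of the approach above; the name compress_repeats is mine:

```python
import re


def compress_repeats(text):
    """Rewrite each run of a repeated substring as (substring)count."""
    def repl(match):
        unit = match.group(1)  # the shortest repeating unit found at this position
        return "({}){}".format(unit, len(match.group()) // len(unit))

    # (.+?) lazily captures a unit; \1+ requires at least one more copy of it.
    return re.sub(r"(.+?)\1+", repl, text)


print(compress_repeats("babababbaaaaababbbab"))
```

Because the regex engine scans left to right and the group is lazy, it takes the first repetition it finds at each position, which is not guaranteed to be the shortest possible encoding for every input.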
This is what I came up with. The code is a mess, but I just wanted some quick fun, so I left it like this:
a = 'babababbaaaaababbbab'

def compress(text):
    for i in range(1, len(text) // 2):
        for j, c in enumerate(text[:-i if i > 0 else len(text)]):
            pattern = text[j:i+j]
            new_text = pattern_repeats_processor(pattern, text, j)
            if new_text != text:
                return compress(new_text)
    return text

def pattern_repeats_processor(pattern, text, i):
    chunk = pattern
    count = 1
    while chunk == pattern and i + (count + 1) * len(pattern) < len(text):
        chunk = text[i + count * len(pattern): i + (count + 1) * len(pattern)]
        if chunk == pattern:
            count = count + 1
        else:
            break
    if count > 1:
        return text[:i] + '(' + pattern + ')' + str(count) + text[i + (count + 0) * len(pattern):]
    return text

print(compress(a))
print(a)
It makes
babababbaaaaababbbab =>
(ba)3(b)2(a)5ba(b)3ab
P.S. Of course, Rowing's answer is miles better, and pretty impressive too.
I'm not sure exactly what you're looking for, but I hope this helps.
A=a.count('a')
B=a.count('b')
AB=a.count('ab')
BAB=a.count('bab')
BA=a.count('ba')
print(A,'(a)',B,'(b)',AB,'(ab)',BAB,'(bab)',BA,'(ba)')
I need to produce output with a specified number of spaces. It is a table with several columns; to save the output to a file I use this line:
save_line = ('%8s' % label[each_atom + str(int(k) + 1)] +
             '%10s' % str(int(i) + 1) +
             '\n' +
             '%2s' % x[i] +
             '%20s' % y[i] +
             '%20s' % z[i] +
             '\n')
but the '%2s' % x[i] doesn't produce two spaces in the output. I can't use + " " + there. Any ideas what I can do?
Here is output of my code:
C1 1
2.482705 1.332897 13.175184
And finally, here is how my output should look (this is an example from another input; my task is to produce mine based on it):
C1 1
2.42416980 4.14117720 4.71196000
Changing the number of spaces between any other columns is no problem. The only one that doesn't work is the first one in every second row. It doesn't matter that the numbers don't match; the problem is the spaces.
Combine those templates into one:
save_line = "%8s%10s\n%2s%20s%20s\n" % (
    label[each_atom + str(int(k) + 1)],
    str(int(i) + 1),
    x[i],
    y[i],
    z[i])
The right side of the % operator is best written as a tuple. A single non-tuple value also works, but if that value happens to be a tuple itself, it is unpacked as the argument list, which is a great way to get output you do not expect. If you only want to format one item:
print("Hello %s!" % ("world",))
Note the trailing comma. This is because
("world")
is a string (with parentheses around it), while
("world",)
is a tuple containing one item.
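The difference only shows up when the value being formatted is itself a tuple. A small sketch (the variable names are illustrative only):

```python
point = (1, 2)

error = None
try:
    # The tuple is unpacked as two arguments, but there is only one placeholder.
    "%s" % point
except TypeError as err:
    error = str(err)

# Wrapping the value in a one-item tuple formats the tuple as a single value.
wrapped = "%s" % (point,)

# A single non-tuple value works with or without the wrapping tuple.
plain = "Hello %s!" % "world"
explicit = "Hello %s!" % ("world",)

print(error)
print(wrapped, plain, explicit)
```

Always writing the right-hand side as a tuple avoids having to think about which of these cases you are in.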
but the '%2s'%x[i] doesn't produce two spaces in output. I cant use +" "+ there. Any ideas what I can do?
Why?
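The number after % is the minimum width of the resulting field. The value is right-aligned and padded with spaces on the left only if it is shorter than that width.
print('%3s' % 'why')
>>why
print('%4s' % 'why')
>> why
With '%4s' the field is 4 characters wide, so the string 'why' becomes ' why'.
A 3-letter word gets only 1 space on the left if you use '%4s'. A value longer than the width, such as your coordinate with '%2s', is printed without any padding.
You can also use '-' to left-align.
If your string is 's' and you use '%-2s', you get the new string 's ', with 1 space after your string. You can use this in your string formatting tasks as well.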
>>> print("%-5s%s" % ("N", "Q"))
N    Q
>>> print("%5s%s" % ("N", "Q"))
    NQ
>>> print("%5s%5s" % ("N", "Q"))
    N    Q
>>> '%.1s' % 'Hello!'
'H'
>>> '%.2s' % 'Hello!'
'He'
You can also pass the precision as a value in the tuple by writing * in its place:
>>> '%.*s' % (2, 'Hello!')
'He'
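Applied to the original question, a short sketch of why '%2s' adds nothing and two ways to get the leading spaces. The sample value is taken from the output shown in the question; the variable names are mine:

```python
value = "2.482705"

# The value is already longer than 2 characters, so width 2 adds no padding.
no_padding = "%2s" % value

# Option 1: put the two spaces literally into the template.
literal_spaces = "  %s" % value

# Option 2: use a field width larger than the value (here 10 = 8 + 2).
wide_field = "%10s" % value

# '-' left-aligns instead, padding on the right.
left_aligned = "%-10s|" % value

print(repr(no_padding), repr(literal_spaces), repr(wide_field), repr(left_aligned))
```

A fixed width keeps columns aligned when the values differ in length, whereas literal spaces always add exactly two characters.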
Thank you,
I am writing a for loop in my program that writes data to a file. I want the output to be formatted as follows:
frame001 + K.1
frame002 + K.2
...
frame099 + K.99
frame100 + K.100
So far I am doing
for f in range(1, 100):
    file.write('frame' + str(f) + ' + K.' + str(f) + '\n')
I have no problem getting the K part to come out correctly as K.1-K.100, but I don't know how to add leading zeros so that the output runs from frame00F to frameFFF with the appropriate number of preceding zeros.
Using str.format:
>>> 'frame{0:03d} + K.{0}\n'.format(1)
'frame001 + K.1\n'
>>> 'frame{0:03d} + K.{0}\n'.format(100)
'frame100 + K.100\n'
BTW, range(1, 100) will not yield 100. If you want 100 to be included, that should be range(1, 101).
If you are using an old version of Python (2.5 or earlier), use the % operator (string formatting operator) instead. Unlike str.format, you need to pass the argument once per placeholder:
>>> 'frame%03d + K.%d\n' % (1, 1)
'frame001 + K.1\n'
>>> 'frame%03d + K.%d\n' % (100, 100)
'frame100 + K.100\n'
If you don't want to repeat the arguments, you can pass a mapping instead, with a slightly different format specifier:
>>> 'frame%(i)03d + K.%(i)d\n' % {'i': 1}
'frame001 + K.1\n'
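On Python 3.6 or later, the same format specification works inside an f-string. A minimal sketch of the whole loop, collecting the lines in a list rather than writing to the asker's file object:

```python
# range(1, 101) so that 100 is included.
lines = [f"frame{f:03d} + K.{f}" for f in range(1, 101)]

# The text that would be written to the file, one entry per line.
output = "\n".join(lines) + "\n"

print(lines[0])
print(lines[-1])
```

The 03d specifier pads with zeros to a minimum width of three, so numbers above 999 would simply come out wider rather than being truncated.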
I want to be able to parse something like "10.[3-25].0.X" into the actual list of ip addresses described by this rule, so for the above example rule the list would be [10.3.0.0, 10.3.0.1....10.25.0.255]. What's the best way to do it?
So far the only thing I was able to come up with is the following awful-looking function:
import re

def expand_wildcard(wc):  # hypothetical name; the original def line was not included
    wc = ''.join(wc.split()).upper()
    wc = re.sub(r'(?<![\[-])(\d+)(?![\]-])', r'[\1-\1]', wc)
    wc = re.sub(r'X', r'[0-255]', wc).split('.')
    ips = []
    for i in range(int(re.findall(r'(\d+)-(\d+)', wc[0])[0][0]), int(re.findall(r'(\d+)-(\d+)', wc[0])[0][1]) + 1):
        for j in range(int(re.findall(r'(\d+)-(\d+)', wc[1])[0][0]), int(re.findall(r'(\d+)-(\d+)', wc[1])[0][1]) + 1):
            for k in range(int(re.findall(r'(\d+)-(\d+)', wc[2])[0][0]), int(re.findall(r'(\d+)-(\d+)', wc[2])[0][1]) + 1):
                for p in range(int(re.findall(r'(\d+)-(\d+)', wc[3])[0][0]), int(re.findall(r'(\d+)-(\d+)', wc[3])[0][1]) + 1):
                    ips.append(str(i) + '.' + str(j) + '.' + str(k) + '.' + str(p))
    return ips
Any improvement ideas would be greatly appreciated.
You could make this a lot simpler.
First, instead of writing the exact same thing four times, use a loop or a listcomp:
ranges = [range(int(re.findall(r'(\d+)-(\d+)', wc[i])[0][0]),
                int(re.findall(r'(\d+)-(\d+)', wc[i])[0][1]) + 1)
          for i in range(4)]
You can also turn the nested loop into a flat loop over the cartesian product:
for i, j, k, p in itertools.product(*ranges):
And you can turn that long string-concatenation mess into a simple format or join call:
ips.append('{}.{}.{}.{}'.format(i, j, k, p)) # OR
ips.append('.'.join(map(str, (i, j, k, p))))
And that means you don't need to split out the 4 components in the first place:
for components in itertools.product(*ranges):
    ips.append('{}.{}.{}.{}'.format(*components)) # OR
    ips.append('.'.join(map(str, components)))
And now that the loop is so trivial, you can turn it into a listcomp:
ips = ['{}.{}.{}.{}'.format(*components)
       for components in itertools.product(*ranges)]
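Putting those steps together, one possible end-to-end sketch looks like this. It parses each octet directly instead of reusing the question's regex substitutions, performs no validation, and the function names are my own:

```python
import itertools
import re


def octet_range(part):
    """Turn 'X', '7' or '[3-25]' into a range of ints (no validation, by assumption)."""
    if part == "X":
        return range(256)
    numbers = [int(x) for x in re.findall(r"\d+", part)]
    # A single number such as '7' gives range(7, 8); '[3-25]' gives range(3, 26).
    return range(numbers[0], numbers[-1] + 1)


def expand(rule):
    """Expand a rule such as '10.[3-25].0.X' into a list of address strings."""
    ranges = [octet_range(part) for part in rule.upper().split(".")]
    return [".".join(map(str, components))
            for components in itertools.product(*ranges)]


ips = expand("10.[3-25].0.X")
print(len(ips), ips[0], ips[-1])
```

itertools.product varies the last octet fastest, so the addresses come out in ascending order.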
Here's a possible example using itertools.product. The idea is to first evaluate the "template" (e.g. 1.5.123.2-5, 23.10-20.X.12, ...) octet by octet (each yielding a list of values) and then take the cartesian product of those lists.
import itertools
import re
import sys
def octet(s):
    """
    Takes a string which represents a single octet template.
    Returns a list of values. Basic sanity checks.
    """
    if s == 'X':
        return range(256)
    try:
        low, high = [int(val) for val in s.strip('[]').split('-')]
        if low > high or low < 0 or high > 255:
            raise RuntimeError('That is no valid range.')
        return range(low, high + 1)
    except ValueError as err:
        # Not a "low-high" range, so treat it as a single number.
        number = int(s)
        if not 0 <= number <= 255:
            raise ValueError('Only 0-255 allowed.')
        return [number]

if __name__ == '__main__':
    try:
        template = sys.argv[1]
        octets = [octet(s) for s in template.split('.')]
        for parts in itertools.product(*octets):
            print('.'.join(map(str, parts)))
    except IndexError as err:
        print('Usage: %s IP-TEMPLATE' % (sys.argv[0]))
        sys.exit(1)
(Small) Examples:
$ python ipregex.py '1.5.123.[2-5]'
1.5.123.2
1.5.123.3
1.5.123.4
1.5.123.5
$ python ipregex.py '23.[19-20].[200-240].X'
23.19.200.0
23.19.200.1
23.19.200.2
...
23.20.240.253
23.20.240.254
23.20.240.255
case:1
ip= re.search(r'(\d{1,3}.){3}\d{1,3}','192.168.1.100')
print(ip.group())
o/p==>192.168.1.100
case:2
ips= re.findall(r'(\d{1,3}.){3}\d{1,3}','192.168.1.100')
print(ips)
o/p==> ['1.']
case:3
ips= re.findall(r'(?:\d{1,3}.){3}\d{1,3}','192.168.1.100')
print(ips)
o/p==>['192.168.1.100']
Why does the regex that worked in case 1 (search) not work in case 2 (findall)?
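The three cases can be reproduced with the sketch below. re.findall returns the contents of the capturing group when the pattern contains exactly one group, and a repeated group only keeps its last repetition, which is where '1.' comes from. The dot is escaped here on the assumption that a literal dot was intended.

```python
import re

text = "192.168.1.100"

# One capturing group: findall returns what the group captured last ('1.').
with_group = re.findall(r"(\d{1,3}\.){3}\d{1,3}", text)

# Non-capturing group: findall returns the whole match.
without_group = re.findall(r"(?:\d{1,3}\.){3}\d{1,3}", text)

# finditer yields match objects, so group() gives the whole match either way.
whole_matches = [m.group() for m in re.finditer(r"(\d{1,3}\.){3}\d{1,3}", text)]

print(with_group, without_group, whole_matches)
```

re.search returns a match object, and calling group() on it with no arguments returns the whole match, which is why case 1 printed the full address.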