Aligning integers (basic Python) - python

Hey guys I need some help aligning my integers. I will show you what my code is, what my output is, and what I want my output to be. Thanks!
Code:
test_sign='#'
test_numbers=[100000,5000000,7000000]
test_calc_list=[]
test_sum=sum(test_numbers)
test_list=['Testcase1','Testcase2','Testcase3']
test_sign_list=[]
for x in test_numbers:
    test_calc=round((x/float(test_sum)*10))
    test_calc_list.append(test_calc)
for y in test_calc_list:
    y=int(y)
    signs=y*test_sign
    test_sign_list.append(signs)
for z in range(len(test_list)):
    print "%8s"%test_list[z]+":",test_sign_list[z],test_numbers[z]
Output:
Testcase1: 100000
Testcase2: #### 5000000
Testcase3: ###### 7000000
Desired output:
Testcase1:         100000
Testcase2: ####   5000000
Testcase3: ###### 7000000

This might be a good time to learn {}-formatting, instead of learning more in-depth about the (not-quite-deprecated, but discouraged) %-formatting.
Especially since the only %-formatting you're using seems to be incorrect. (There's no good reason to use %8s for a string you know is going to be 9 characters long…)
So:
print '{}: {:<6} {:>7}'.format(test_list[z], test_sign_list[z], test_numbers[z])
See String Formatting for details on all the options.
As a side note, I think your loop would be more readable this way:
for test, sign, number in zip(test_list, test_sign_list, test_numbers):
    print '{}: {:<6} {:>7}'.format(test, sign, number)
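If the column widths are not known in advance, they can be computed from the data and passed in as nested replacement fields. A minimal sketch in Python 3, reusing the lists from the question; the helper name format_rows is made up for illustration and is not part of the answer above:

```python
test_list = ['Testcase1', 'Testcase2', 'Testcase3']
test_sign_list = ['', '####', '######']
test_numbers = [100000, 5000000, 7000000]


def format_rows(labels, signs, numbers):
    """Return one aligned line per row (hypothetical helper)."""
    # Compute each column width once, from the longest entry.
    sign_width = max(len(s) for s in signs)
    number_width = max(len(str(n)) for n in numbers)
    return [
        # The nested {sw} and {nw} fields supply the widths at format time.
        '{}: {:<{sw}} {:>{nw}}'.format(label, sign, number,
                                       sw=sign_width, nw=number_width)
        for label, sign, number in zip(labels, signs, numbers)
    ]


for line in format_rows(test_list, test_sign_list, test_numbers):
    print(line)
```

With the sample data this gives the same layout as the hard-coded widths 6 and 7, but it keeps working if a longer bar or a larger number shows up.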

Option one, specify length in format:
http://docs.python.org/2/library/string.html#format-specification-mini-language
"width is a decimal integer defining the minimum field width. If not specified, then the field width will be determined by the content."
Option two, pre-pad strings using ljust, rjust and center:
http://docs.python.org/2/library/string.html#string.ljust
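A rough sketch of option two, written for Python 3 and assuming the same three lists as in the question (the widths 6 and 7 are picked by hand to fit the sample data):

```python
labels = ['Testcase1', 'Testcase2', 'Testcase3']
signs = ['', '####', '######']
numbers = [100000, 5000000, 7000000]

lines = []
for label, sign, number in zip(labels, signs, numbers):
    # ljust pads on the right (left-aligned text),
    # rjust pads on the left (right-aligned text).
    lines.append(label + ': ' + sign.ljust(6) + ' ' + str(number).rjust(7))

print('\n'.join(lines))
```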

Change
print "%8s"%test_list[z]+":",test_sign_list[z],test_numbers[z]
to
print "%8s: %-6s %7i" % (test_list[z], test_sign_list[z], test_numbers[z])

strings = ["abc", "sakjfslkdfnds", "7"]
maxlength = max(map(len, strings))
for index, string in enumerate(strings):
    print("Testcase%d: %s" % (index, string.rjust(maxlength, ".")))
Leave out the "." argument if you just want spaces.

Related

Python: Find and increment a number in a string

I can't find a solution to this, so I'm asking here. I have a string that consists of several lines and in the string I want to increase exactly one number by one.
For example:
[CENTER]
[FONT=Courier New][COLOR=#00ffff][B][U][SIZE=4]{title}[/SIZE][/U][/B][/COLOR][/FONT]
[IMG]{cover}[/IMG]
[IMG]IMAGE[/IMG][/CENTER]
[QUOTE]
{description_de}
[/QUOTE]
[CENTER]
[IMG]IMAGE[/IMG]
[B]Duration: [/B]~5 min
[B]Genre: [/B]Action
[B]Subgenre: [/B]Mystery, Scifi
[B]Language: [/B]English
[B]Subtitles: [/B]German
[B]Episodes: [/B]01/5
[IMG]IMAGE[/IMG]
[spoiler]
[spoiler=720p]
[CODE=rich][color=Turquoise]
{mediaInfo1}
[/color][/code]
[/spoiler]
[spoiler=1080p]
[CODE=rich][color=Turquoise]
{mediaInfo2}
[/color][/code]
[/spoiler]
[/spoiler]
[hide]
[IMG]IMAGE[/IMG]
[/hide]
[/CENTER]
I'm getting this string from a request and I want to increment the episode by 1. So from 01/5 to 02/5.
What is the best way to make this possible?
I tried to solve this via regex but failed miserably.
Assuming the number you want to change is always after a given pattern, e.g. "Episodes: [/B]", you can use this code:
def increment_episode_num(request_string, episode_pattern="Episodes: [/B]"):
    # position of the first character after the pattern
    idx = request_string.find(episode_pattern) + len(episode_pattern)
    episode_count = int(request_string[idx:idx+2])
    return request_string[:idx]+f"{(episode_count+1):0>2}"+request_string[idx+2:]
For example, given your string:
req_str = """[B]Duration: [/B]~5 min
[B]Genre: [/B]Action
[B]Subgenre: [/B]Mystery, Scifi
[B]Language: [/B]English
[B]Subtitles: [/B]German
[B]Episodes: [/B]01/5
"""
res = increment_episode_num(req_str)
print(res)
which gives you the desired output:
[B]Duration: [/B]~5 min
[B]Genre: [/B]Action
[B]Subgenre: [/B]Mystery, Scifi
[B]Language: [/B]English
[B]Subtitles: [/B]German
[B]Episodes: [/B]02/5
As @Barmar suggested in the comments, and following the example from the documentation of re, here is a version that also pads the result with the right number of zeroes:
import re
pattern = r"(?<=Episodes: \[/B\])[\d]+?(?=/\d)"
def add_one(matchobj):
    number = str(int(matchobj.group(0)) + 1)
    return "{0:0>2}".format(number)
result = re.sub(pattern, add_one, request)
The pattern uses look-ahead and look-behind to capture only the number that corresponds to Episodes, and should work whether it's in the format 01/5 or 1/5, but always returns in the format 01/5. Of course, you can expand the function so it recognizes the format, or even so it can add different numbers instead of only 1.
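One possible way to expand the function as suggested, sketched under the assumption that the counter is always the run of digits directly after "Episodes: [/B]" and before "/<total>". The names EPISODE_RE and increment_episode and the step parameter are made up for illustration:

```python
import re

# Assumption: the episode counter sits right after "Episodes: [/B]"
# and is followed by "/<total>".
EPISODE_RE = re.compile(r"(?<=Episodes: \[/B\])\d+(?=/\d)")


def increment_episode(text, step=1):
    """Add `step` to the episode number, keeping its zero-padded width."""
    def bump(match):
        digits = match.group(0)
        # zfill pads back to the original number of digits ("01" -> "02").
        return str(int(digits) + step).zfill(len(digits))
    # count=1 changes only the first match.
    return EPISODE_RE.sub(bump, text, count=1)


print(increment_episode("[B]Episodes: [/B]01/5"))
print(increment_episode("[B]Episodes: [/B]9/12"))
```

Unlike the fixed "{0:0>2}" padding, this keeps whatever width the input used, so 01/5 becomes 02/5 while 9/12 becomes 10/12.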

Significant figures in a string format not working correctly

So I want to print a simple line using data from a curve fit as such:
print('{}, {}, {}'.format(popt[0], popt[1], chi_squared))
which outputs:
0.33274149918645834, 0.9185831984664338, 19.685835082519155
However, as soon as I put a significant figures constraint on it, it outputs the first number multiple times, as seen below, and I am unsure why:
print('{0:.4f}, {0:.4f}, {0:.4f}'.format(popt[0], popt[1], chi_squared))
0.3327, 0.3327, 0.3327
Note: From this, I have got 2 answers. 1 how to fix the issue and 2 that I'm an idiot.
The 0 before the : means to use argument number 0, which means they'll all be popt[0]. Just omit it:
print('{:.4f}, {:.4f}, {:.4f}'.format(popt[0], popt[1], chi_squared))
The zero in "{0:<whatever>}" means "print the 0th element". You don't have to pass the index of the elements to print explicitly, so you can simply write: "{:.4f} {:.4f}".format(3.14159, 1.234567)
Because every placeholder is {0}, only the first argument gets printed. Number the placeholders instead:
print('{0:.4f}, {1:.4f}, {2:.4f}'.format(popt[0], popt[1], chi_squared))
I Love F-strings:
print(f"{popt[0]:.4f}, {popt[1]:.4f}, {chi_squared:.4f}")
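To see the difference side by side, here is a small sketch using the values printed in the question as stand-ins for popt and chi_squared. It also shows the g presentation type, which may be closer to what "significant figures" in the title suggests, since .4f fixes the number of digits after the decimal point:

```python
# Example values standing in for popt and chi_squared from the question.
popt = [0.33274149918645834, 0.9185831984664338]
chi_squared = 19.685835082519155

# Repeating index 0 formats the first argument three times.
repeated = '{0:.4f}, {0:.4f}, {0:.4f}'.format(popt[0], popt[1], chi_squared)

# Empty field names are numbered 0, 1, 2 automatically.
automatic = '{:.4f}, {:.4f}, {:.4f}'.format(popt[0], popt[1], chi_squared)

# .4f keeps four digits after the decimal point;
# .4g keeps four significant figures.
significant = '{:.4g}, {:.4g}, {:.4g}'.format(popt[0], popt[1], chi_squared)

print(repeated)     # 0.3327, 0.3327, 0.3327
print(automatic)    # 0.3327, 0.9186, 19.6858
print(significant)  # 0.3327, 0.9186, 19.69
```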

How to add characters to a variable/integer name - Python

There might be a question like this but I can't find it.
I want to be able to add a variable/integer by name, e.g.
num = 5
chr(0x2075)
Now the 2nd line would return 5 in superscript but I want to put the word num into the Unicode instead so something like chr(0x207+num) would return 5 in superscript.
Any ideas? Thanks in advance
chr(0x2070 + num)
As given in the comment, if you want to get the character at U+207x, this is correct.
But this is not the proper way to find the superscript of a number, because U+2071 is ⁱ (superscript "i") while U+2072 and U+2073 are not yet assigned.
>>> chr(0x2070 + 1)
'ⁱ'
The real superscripts ¹ (U+00B9), ² (U+00B2), ³ (U+00B3) are out of place.
>>> chr(0xb9), chr(0xb2), chr(0xb3)
('¹', '²', '³')
Unfortunately, like most things Unicode, the only sane solution here is to hard code it:
def superscript_single_digit_number(x):
    return u'⁰¹²³⁴⁵⁶⁷⁸⁹'[x]
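For numbers with more than one digit, the same hard-coded table can be applied with str.translate. A minimal Python 3 sketch; the names SUPERSCRIPT_DIGITS and to_superscript are made up here, and it only handles plain non-negative integers:

```python
# Map ASCII digits to their superscript counterparts, which are scattered
# across U+00B2, U+00B3, U+00B9 and the U+2070 block.
SUPERSCRIPT_DIGITS = str.maketrans('0123456789', '⁰¹²³⁴⁵⁶⁷⁸⁹')


def to_superscript(number):
    """Return the decimal digits of a non-negative int as superscripts."""
    return str(number).translate(SUPERSCRIPT_DIGITS)


num = 5
print(to_superscript(num))
print(to_superscript(123))
```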

Iterate over two big arrays at once

I have to iterate over two arrays, each of size 1000x1000. I already reduced the resolution to 100x100 to make the iteration faster, but it still takes about 15 minutes for ONE array!
So I tried to iterate over both at the same time, for which I found this:
for index, (x,y) in ndenumerate(izip(x_array,y_array)):
but then I get the error:
ValueError: too many values to unpack
Here is my full Python code. I hope you can help me make this a lot faster, because this is for my master's thesis and in the end I have to run it about 100 times...
area_length=11
d_circle=(area_length-1)/2
xdis_new=xdis.copy()
ydis_new=ydis.copy()
ie,je=xdis_new.shape
while (np.isnan(np.sum(xdis_new))) and (np.isnan(np.sum(ydis_new))):
    xdis_interpolated=xdis_new.copy()
    ydis_interpolated=ydis_new.copy()
    # itx=np.nditer(xdis_new,flags=['multi_index'])
    # for x in itx:
    # print 'next x and y'
    for index, (x,y) in ndenumerate(izip(xdis_new,ydis_new)):
        if np.isnan(x):
            print 'index',index[0],index[1]
            print 'interpolate'
            # define indizes of interpolation area
            i1=index[0]-(area_length-1)/2
            if i1<0:
                i1=0
            i2=index[0]+((area_length+1)/2)
            if i2>ie:
                i2=ie
            j1=index[1]-(area_length-1)/2
            if j1<0:
                j1=0
            j2=index[1]+((area_length+1)/2)
            if j2>je:
                j2=je
            # -->
            print 'i1',i1,'','i2',i2
            print 'j1',j1,'','j2',j2
            area_values=xdis_new[i1:i2,j1:j2]
            print area_values
            b=area_values[~np.isnan(area_values)]
            if len(b)>=((area_length-1)/2)*4:
                xi,yi=meshgrid(arange(len(area_values[0,:])),arange(len(area_values[:,0])))
                weight=zeros((len(area_values[0,:]),len(area_values[:,0])))
                d=zeros((len(area_values[0,:]),len(area_values[:,0])))
                weight_fac=zeros((len(area_values[0,:]),len(area_values[:,0])))
                weighted_area=zeros((len(area_values[0,:]),len(area_values[:,0])))
                d=sqrt((xi-xi[(area_length-1)/2,(area_length-1)/2])*(xi-xi[(area_length-1)/2,(area_length-1)/2])+(yi-yi[(area_length-1)/2,(area_length-1)/2])*(yi-yi[(area_length-1)/2,(area_length-1)/2]))
                weight=1/d
                weight[where(d==0)]=0
                weight[where(d>d_circle)]=0
                weight[where(np.isnan(area_values))]=0
                weight_sum=np.sum(weight.flatten())
                weight_fac=weight/weight_sum
                weighted_area=area_values*weight_fac
                print 'weight'
                print weight_fac
                print 'values'
                print area_values
                print 'weighted'
                print weighted_area
                m=nansum(weighted_area)
                xdis_interpolated[index]=m
                print 'm',m
            else:
                print 'insufficient elements'
        if np.isnan(y):
            print 'index',index[0],index[1]
            print 'interpolate'
            # define indizes of interpolation area
            i1=index[0]-(area_length-1)/2
            if i1<0:
                i1=0
            i2=index[0]+((area_length+1)/2)
            if i2>ie:
                i2=ie
            j1=index[1]-(area_length-1)/2
            if j1<0:
                j1=0
            j2=index[1]+((area_length+1)/2)
            if j2>je:
                j2=je
            # -->
            print 'i1',i1,'','i2',i2
            print 'j1',j1,'','j2',j2
            area_values=ydis_new[i1:i2,j1:j2]
            print area_values
            b=area_values[~np.isnan(area_values)]
            if len(b)>=((area_length-1)/2)*4:
                xi,yi=meshgrid(arange(len(area_values[0,:])),arange(len(area_values[:,0])))
                weight=zeros((len(area_values[0,:]),len(area_values[:,0])))
                d=zeros((len(area_values[0,:]),len(area_values[:,0])))
                weight_fac=zeros((len(area_values[0,:]),len(area_values[:,0])))
                weighted_area=zeros((len(area_values[0,:]),len(area_values[:,0])))
                d=sqrt((xi-xi[(area_length-1)/2,(area_length-1)/2])*(xi-xi[(area_length-1)/2,(area_length-1)/2])+(yi-yi[(area_length-1)/2,(area_length-1)/2])*(yi-yi[(area_length-1)/2,(area_length-1)/2]))
                weight=1/d
                weight[where(d==0)]=0
                weight[where(d>d_circle)]=0
                weight[where(np.isnan(area_values))]=0
                weight_sum=np.sum(weight.flatten())
                weight_fac=weight/weight_sum
                weighted_area=area_values*weight_fac
                print 'weight'
                print weight_fac
                print 'values'
                print area_values
                print 'weighted'
                print weighted_area
                m=nansum(weighted_area)
                ydis_interpolated[index]=m
                print 'm',m
            else:
                print 'insufficient elements'
        else:
            print 'no need to interpolate'
    xdis_new=xdis_interpolated
    ydis_new=ydis_interpolated
Some advice:
Profile your code to see what is the slowest part. It may not be the iteration but the computations that need to be done each time.
Reduce function calls as much as possible. Function calls are not for free in Python.
Rewrite the slowest part as a C extension and then call that C function in your Python code (see Extending and Embedding the Python interpreter).
This page has some good advice as well.
You specifically asked for iterating two arrays in a single loop. Here is a way to do that
l1 = ["abc", "def", "hi"]
l2 = ["ghi", "jkl", "lst"]
for f,s in zip(l1,l2):
    print("%s : %s" % (f,s))
The above works in Python 2 and Python 3. In Python 2 you can use itertools.izip instead of zip to avoid building an intermediate list.
You may use this as your for loop:
for index, x in ndenumerate((x_array,y_array)):
But it won't help you much, because your computer can't do two things at the same time.
Profiling is definitely a good start to identify where all the time spent actually goes.
I usually use the cProfile module, as it requires minimal overhead and gives me more than enough information.
import cProfile
import pstats
cProfile.run('main()', "ProfileData.txt", 'tottime')
p = pstats.Stats('ProfileData.txt')
p.sort_stats('cumulative').print_stats(100)
In your example, you would have to wrap your code in a main() function and put this snippet at the very end of your file.
Comment #1: You don't want to use ndenumerate on the izip iterator: it treats the iterator itself as a single element instead of walking through the (x, y) pairs, which is why the unpacking fails.
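For Python 3, a comparable sketch that keeps the statistics in memory instead of writing ProfileData.txt. The main() function below is only a placeholder workload; the real interpolation code would go in its place:

```python
import cProfile
import io
import pstats


def main():
    """Placeholder workload; replace with the real entry point."""
    return sum(i * i for i in range(100000))


profiler = cProfile.Profile()
profiler.enable()
result = main()
profiler.disable()

# Collect the report in a string instead of printing it directly.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats('cumulative').print_stats(5)  # show at most five entries
print(report.getvalue())
```

Sorting by cumulative time shows which top-level calls dominate; sorting by 'tottime' instead shows where the time is spent inside individual functions.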
Comment #2:
i1=index[0]-(area_length-1)/2
if i1<0:
    i1=0
could be simplified to i1 = max(index[0]-(area_length-1)/2, 0), and you could store your (area_length+/-1)/2 in specific variables.
Idea #1 : try to iterate on flat versions of the arrays, i.e. with something like
for (i, (x, y)) in enumerate(izip(xdis_new.flat,ydis_new.flat)):
You could get the original indices via divmod(i, xdis_new.shape[-1]), as you should be iterating by rows first.
Idea #2 : Iterate only on the nans, i.e. indexing your arrays with np.isnan(xdis_new)|np.isnan(ydis_new), that could save you some iterations
EDIT #1
You probably don't need to initialize d, weight_fac and weighted_area in your loop, as you compute them separately.
Your weight[where(d==0)] can be simplified to weight[d==0], and likewise for the other where() calls.
Do you need weight_fac ? Can't you just compute weight then normalize it in place ? That should save you some temporary arrays.
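The clamping from Comment #2 and the index recovery from Idea #1 can be sketched without NumPy. The helper names window_bounds and unflatten are invented for this illustration, and the sketch assumes an odd area_length, as in the question:

```python
def window_bounds(i, j, shape, area_length=11):
    """Return (i1, i2, j1, j2) for the window centred on (i, j), clipped to the array."""
    half = (area_length - 1) // 2   # integer division, also under Python 3
    rows, cols = shape
    i1 = max(i - half, 0)           # clamp the lower edge to 0
    i2 = min(i + half + 1, rows)    # clamp the upper edge to the array size
    j1 = max(j - half, 0)
    j2 = min(j + half + 1, cols)
    return i1, i2, j1, j2


def unflatten(flat_index, n_cols):
    """Recover (row, column) from a row-major flat index."""
    return divmod(flat_index, n_cols)


print(window_bounds(0, 0, (100, 100)))    # (0, 6, 0, 6)
print(window_bounds(50, 50, (100, 100)))  # (45, 56, 45, 56)
print(unflatten(205, 100))                # (2, 5)
```

Computing half once also removes the repeated (area_length-1)/2 expressions, and the explicit // avoids float indices if the code is ever run under Python 3.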

python String Formatting Operations

Faulty code:
pos_1 = 234
pos_n = 12890
min_width = len(str(pos_n)) # is there a better way for this?
# How can I use min_width as the minimal width of the two conversion specifiers?
# I don't understand the Python documentation on this :(
raw_str = '... from %(pos1)0*d to %(posn)0*d ...' % {'pos1':pos_1, 'posn': pos_n}
Required output:
... from 00234 to 12890 ...
______________________EDIT______________________
New code:
# I changed my code according the second answer
pos_1 = 10234 # can be any value between 1 and pos_n
pos_n = 12890
min_width = len(str(pos_n))
raw_str = '... from % *d to % *d ...' % (min_width, pos_1, min_width, pos_n)
New Problem:
There is one extra whitespace (I marked it _) in front of the integer values, for integers with min_width digits:
print raw_str
... from _10234 to _12890 ...
Also, I wonder if there is a way to add Mapping keys?
pos_1 = 234
pos_n = 12890
min_width = len(str(pos_n))
raw_str = '... from %0*d to %0*d ...' % (min_width, pos_1, min_width, pos_n)
Concerning using a mapping type as second argument to '%':
I presume you mean something like '%(mykey)d' % {'mykey': 3}, right? I think you cannot use this together with the "%*d" syntax, since there is no way to provide the necessary width arguments with a dict.
But why don't you generate your format string dynamically:
fmt = '... from %%%dd to %%%dd ...' % (min_width, min_width)
# assuming min_width is e.g. 7 fmt would be: '... from %7d to %7d ...'
raw_string = fmt % pos_values_as_tuple_or_dict
This way you decouple the width issue from the formatting of the actual values, and you can use a tuple or a dict for the latter, as it suits you.
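If {}-formatting is an option, named fields and a dynamic width can be combined directly, because the width may itself be a nested replacement field. A minimal sketch using the values from the question; the field names pos1, posn and width are chosen here for illustration:

```python
pos_1 = 234
pos_n = 12890
min_width = len(str(pos_n))

# {width} inside the format spec is filled in before the number is formatted;
# the leading 0 asks for zero padding.
template = '... from {pos1:0{width}d} to {posn:0{width}d} ...'
raw_str = template.format(pos1=pos_1, posn=pos_n, width=min_width)
print(raw_str)  # ... from 00234 to 12890 ...
```

The same template also accepts a dict via template.format(**values), which covers the mapping-key part of the question.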
"1234".rjust(13,"0")
Should do what you need
addition:
a = ["123", "12"]
max_width = sorted([len(i) for i in a])[-1]
put max_width instead of 13 above and put all your strings in a single array a (which seems to me much more usable than having a stack of variables).
additional nastiness:
(Using array of numbers to get closer to your question.)
a = [123, 33, 0 ,223]
[str(x).rjust(sorted([len(str(i)) for i in a])[-1],"0") for x in a]
Who said Perl is the only language to easily produce braindumps in? If regexps are the godfather of complex code, then list comprehension is the godmother.
(I am relatively new to python and rather convinced that there must be a max-function on arrays somewhere, which would reduce above complexity. .... OK, checked, there is. Pity, have to reduce the example.)
[str(x).rjust(max(len(str(i)) for i in a),"0") for x in a]
And please observe below comments on "not putting calculation of an invariant (the max value) inside the outer list comprehension".
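A version with the invariant hoisted out of the comprehension might look like this. It is a sketch for non-negative integers, using str.zfill as a shorthand for rjust(width, "0"):

```python
a = [123, 33, 0, 223]

# Compute the invariant once, outside the comprehension.
width = max(len(str(x)) for x in a)

# zfill pads with zeros on the left up to the given width.
padded = [str(x).zfill(width) for x in a]
print(padded)  # ['123', '033', '000', '223']
```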
