Using regexes to relabel and remove redundant items in a string - python

I am trying to use Python's regex features to relabel some identifiers in some text.
Here is an example of the text. I am essentially trying to number all the v's in numerical order.
#r=v4 "v4"
A -> C : B
Cell * kcat * B * A / (km + A)
#r=v4 "v4"
C -> C+D
Cell * v2_k * C
#r=v4 "v4"
C -> : D
Cell * kcat2 * D * C / (km2 + C)
#r=v4 "v4"
C -> C+D
Cell * v2_k * C
So the desired output would be
#r=v1 "v1"
A -> C : B
Cell * kcat * B * A / (km + A)
#r=v2 "v2"
C -> C+D
Cell * v2_k * C
#r=v3 "v3"
C -> : D
Cell * kcat2 * D * C / (km2 + C)
#r=v4 "v4"
C -> C+D
Cell * v2_k * C
However, there is also a complication. If you look carefully you can see that the 'v2' and 'v4' elements are identical. This is therefore redundant information for me and needs to be removed.
My Code:
string='''
#r=v4 "v4"
A -> C : B
Cell * kcat * B * A / (km + A)
#r=v4 "v4"
C -> C+D
Cell * v2_k * C
#r=v4 "v4"
C -> : D
Cell * kcat2 * D * C / (km2 + C)
#r=v4 "v4"
C -> C+D
Cell * v2_k * C
'''
import re
pattern=re.compile('#r=(.*)')
for i in range(len(re.findall(pattern,string))):
    print(re.sub(pattern,'#r=v{} "v{}"'.format(str(i+1),str(i+1)),string))
This however does not give me the desired output. Does anybody know how to do what I want? Thanks

Probable solution:
string='''#r=v4 "v4"
A -> C : B
Cell * kcat * B * A / (km + A)
#r=v4 "v4"
C -> C+D
Cell * v2_k * C
#r=v4 "v4"
C -> : D
Cell * kcat2 * D * C / (km2 + C)
#r=v4 "v4"
C -> C+D
Cell * v2_k * C'''
i = 0
for strg in string.splitlines():
    if strg == '#r=v4 "v4"':
        i += 1
        print('#r=v{} "v{}"'.format(i,i))
    else:
        print(strg)
Output:
#r=v1 "v1"
A -> C : B
Cell * kcat * B * A / (km + A)
#r=v2 "v2"
C -> C+D
Cell * v2_k * C
#r=v3 "v3"
C -> : D
Cell * kcat2 * D * C / (km2 + C)
#r=v4 "v4"
C -> C+D
Cell * v2_k * C
You can just as easily concatenate the lines to get the full text with relabeled identifiers, like this:
i = 0
new_text = ""
for strg in string.splitlines():
    if strg == '#r=v4 "v4"':
        i += 1
        new_text += '#r=v{} "v{}"\n'.format(i,i)
    else:
        new_text += strg + '\n'
For a slightly more difficult case, where the headers are not all the same:
i = 0
for strg in string.splitlines():
    if strg in ['#r=v4 "v4"','#r=v2 "v2"','#r=v3 "v3"'] : # list every header, as long as there aren't a huge number of cases
        i += 1
        print('#r=v{} "v{}"'.format(i,i))
    else:
        print(strg)
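Neither version above removes the duplicated element, and the original attempt fails because each `re.sub` call rewrites every header with the same number. A minimal sketch of one way to do both steps, assuming every element starts with a line beginning `#r=` and that two elements count as identical when their bodies match exactly (the names `HEADER`, `relabel` and `sample` are made up for this illustration):

```python
import re

# A header is any whole line that starts with '#r='.
HEADER = re.compile(r'^#r=.*$', re.MULTILINE)

def relabel(text):
    """Drop elements whose body repeats, then number the rest v1, v2, ..."""
    # Splitting on the headers leaves the bodies; [0] is the text before
    # the first header, which is discarded here.
    bodies = [body.strip() for body in HEADER.split(text)[1:]]
    unique = []
    for body in bodies:
        if body not in unique:  # keep the first occurrence only
            unique.append(body)
    return '\n'.join('#r=v{0} "v{0}"\n{1}'.format(n, body)
                     for n, body in enumerate(unique, 1))

sample = '''#r=v4 "v4"
A -> C : B
Cell * kcat * B * A / (km + A)
#r=v4 "v4"
C -> C+D
Cell * v2_k * C
#r=v4 "v4"
C -> : D
Cell * kcat2 * D * C / (km2 + C)
#r=v4 "v4"
C -> C+D
Cell * v2_k * C'''

print(relabel(sample))
```

With the sample text this keeps three elements, numbered v1 to v3, because the fourth body repeats the second.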


Can't exit while loop on Simpson's Rule

I am trying to calculate an integral using Simpson's Rule. The catch is that the accepted value of the integral must satisfy the following condition: compute Ih and Ih/2, and if |Ih - Ih/2| < error the loop is complete. Otherwise, repeat the process with half the step h, which means computing |Ih/2 - Ih/4|, and so on.
while True:
    ###Ih part
    h = (b - a) / N
    y1 = np.linspace(a, b, N)
    Ez11 = (np.sqrt(ra ** 2 + R ** 2 - 2 * ra * R * np.cos(y1 - fa))) / (
        (aa ** 2 - 2 * ra * R * np.cos(y1 - fa)) ** (3 / 2))
    I11 = (h/3) * (Ez11[0] + 2*sum(Ez11[:N-2:2]) \
        + 4*sum(Ez11[1:N-1:2]) + Ez11[N-1])
    #####Ih/2 part
    h = (b-a)/(2*N)
    y2 = np.linspace(a, b, 2*N)
    Ez22 = (np.sqrt(ra ** 2 + R ** 2 - 2 * ra * R * np.cos(y2 - fa))) / (
        (aa ** 2 - 2 * ra * R * np.cos(y2 - fa)) ** (3 / 2))
    print(Ez22)
    I22 = (h/ 3) * (Ez22[0] + 2 * sum(Ez22[:N - 2:2]) \
        + 4 * sum(Ez22[1:N - 1:2]) + Ez22[N - 1])
    # error condition I1=Ih I2=Ih/2
    if np.abs(I11 - I22) < error:
        break
    else:
        N = 2*N # h/2
        print(np.abs(I11 - I22))
As far as I can tell, my approach should be correct. However, the loop goes on and on and never stops.
My code is as follows:
import numpy as np
from scipy.integrate import simps
import scipy.integrate as integrate
import scipy.special as special
# variables
a = 0
b = np.pi * 2
N = 100
ra = 0.1 # ρα
R = 0.05
fa = 35 * (np.pi / 180) # φα
za = 0.4
Q = 10 ** (-6)
k = 9 * 10 ** 9
aa = np.sqrt(ra ** 2 + R ** 2 + za ** 2)
error = 0.55 * 10 ** (-8)
h=(b-a)/N
I1 = np.nan
I11 = np.nan
#Simpsons section
############ Ez
#automated Simpson
while True:
    ###Ih part
    y1 = np.linspace(a, b, N)
    Ez1 = (np.sqrt(ra ** 2 + R ** 2 - 2 * ra * R * np.cos(y1 - fa))) / (
        (aa ** 2 - 2 * ra * R * np.cos(y1 - fa)) ** (3 / 2))
    print(len(Ez1))
    I1 = simps(Ez1, y1)
    #####Ih/2 part
    y2 = np.linspace(a, b, 2*N)
    Ez2 = (np.sqrt(ra ** 2 + R ** 2 - 2 * ra * R * np.cos(y2 - fa))) / (
        (aa ** 2 - 2 * ra * R * np.cos(y2 - fa)) ** (3 / 2))
    I2 = simps(Ez2, y2)
    # error condition I1=Ih I2=Ih/2
    if np.abs(I1 - I2) < error:
        break
    else:
        N *= 2 # h/2
#custom-made Simpson
N = 100
while True:
    ###Ih part
    h = (b - a) / N
    y1 = np.linspace(a, b, N)
    Ez11 = (np.sqrt(ra ** 2 + R ** 2 - 2 * ra * R * np.cos(y1 - fa))) / (
        (aa ** 2 - 2 * ra * R * np.cos(y1 - fa)) ** (3 / 2))
    I11 = (h/3) * (Ez11[0] + 2*sum(Ez11[:N-2:2]) \
        + 4*sum(Ez11[1:N-1:2]) + Ez11[N-1])
    #####Ih/2 part
    h = (b-a)/(2*N)
    y2 = np.linspace(a, b, 2*N)
    Ez22 = (np.sqrt(ra ** 2 + R ** 2 - 2 * ra * R * np.cos(y2 - fa))) / (
        (aa ** 2 - 2 * ra * R * np.cos(y2 - fa)) ** (3 / 2))
    print(Ez22)
    I22 = (h/ 3) * (Ez22[0] + 2 * sum(Ez22[:N - 2:2]) \
        + 4 * sum(Ez22[1:N - 1:2]) + Ez22[N - 1])
    # error condition I1=Ih I2=Ih/2
    if np.abs(I11 - I22) < error:
        break
    else:
        N = 2*N # h/2
        print(np.abs(I11 - I22))
print(I1)
print(I11)
Simpson's Rule is as follows:
After a while it gets stuck in this situation:
The 5.23 part is the absolute difference of those two values, which shouldn't be that high.
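A few things in the custom-made loop look like likely causes. `np.linspace(a, b, N)` returns N points, so the spacing is (b - a)/(N - 1) rather than the `h = (b - a)/N` used in the formula. The Ih/2 sums still slice with `N` although `Ez22` holds `2*N` samples, so only half of the samples are summed. The slice `[:N-2:2]` also starts at index 0 and counts the first endpoint again. Below is a minimal sketch of the step-halving scheme with those points addressed, written in plain Python so it stands alone. The names `simpson` and `refine`, and the `max_doublings` safety cap, are assumptions for this illustration and not part of the original code.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson's rule on n equal subintervals (n must be even)."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    # n subintervals need n + 1 sample points, endpoints included.
    ys = [f(a + i * h) for i in range(n + 1)]
    odd = sum(ys[1:-1:2])    # indices 1, 3, ..., n - 1
    even = sum(ys[2:-1:2])   # indices 2, 4, ..., n - 2
    return (h / 3) * (ys[0] + 4 * odd + 2 * even + ys[-1])

def refine(f, a, b, n=100, error=0.55e-8, max_doublings=20):
    """Halve h until two successive estimates differ by less than error."""
    coarse = simpson(f, a, b, n)
    for _ in range(max_doublings):
        n *= 2
        fine = simpson(f, a, b, n)
        if abs(coarse - fine) < error:
            return fine, n
        coarse = fine  # reuse the finer estimate as the next Ih
    raise RuntimeError("no convergence within max_doublings")

value, n_used = refine(math.sin, 0.0, math.pi)
print(value, n_used)
```

The check uses the integral of sin over [0, pi], whose exact value is 2, so the result is easy to verify before swapping in the real integrand.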

Apply a function to two columns of dataframe and create a new column

I want to apply this function on the following columns but I could not.
def damage(a,b):
    l=5
    if (a==b+0.3) or ((a>=1.5* b) and (a<=1.9 * b)):
        l=1
    elif (a>=2* b) and (a<=2.9 * b) :
        l=2
    elif (a>=4) or (a>= 3* b):
        l=3
    elif (a==b) or (a<=b+0.3) or (a<= 1.5 *b):
        l=0
    return l
df['1st_day_damage'] =df[df['Cr-1'],df['Cr']].apply(damage)
It's better to use vectorized code. apply is slow.
Coding in the blind since you didn't provide any example:
import numpy as np

a = df['Cr-1']
b = df['Cr']
# Conditions are checked in order; the first one that holds picks the value.
df['1st_day_damage'] = np.select(
    [
        (a == b + 0.3) | ((1.5 * b <= a) & (a <= 1.9 * b)),
        (2 * b <= a) & (a <= 2.9 * b),
        (a >= 4) | (a >= 3 * b),
        (a == b) | (a <= b + 0.3) | (a <= 1.5 * b)
    ],
    [
        1,
        2,
        3,
        0
    ],
    default=5
)
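For comparison, the row-wise form that the question was reaching for can be written with `DataFrame.apply(..., axis=1)`. The sketch below runs both approaches on a small made-up frame (the sample values are invented for illustration, since the question gave no data) to show that they agree:

```python
import numpy as np
import pandas as pd

def damage(a, b):
    # Same rules as in the question, with early returns.
    if a == b + 0.3 or 1.5 * b <= a <= 1.9 * b:
        return 1
    if 2 * b <= a <= 2.9 * b:
        return 2
    if a >= 4 or a >= 3 * b:
        return 3
    if a == b or a <= b + 0.3 or a <= 1.5 * b:
        return 0
    return 5

# Invented sample data, one row per branch of the function.
df = pd.DataFrame({'Cr-1': [1.0, 1.6, 2.5, 5.0, 1.95],
                   'Cr':   [1.0, 1.0, 1.0, 1.0, 1.0]})

# Row-wise: pass both columns of each row to the scalar function.
row_wise = df.apply(lambda row: damage(row['Cr-1'], row['Cr']), axis=1)

# Vectorized: the same rules expressed as boolean masks.
a, b = df['Cr-1'], df['Cr']
vectorized = np.select(
    [(a == b + 0.3) | ((1.5 * b <= a) & (a <= 1.9 * b)),
     (2 * b <= a) & (a <= 2.9 * b),
     (a >= 4) | (a >= 3 * b),
     (a == b) | (a <= b + 0.3) | (a <= 1.5 * b)],
    [1, 2, 3, 0],
    default=5,
)

print(row_wise.tolist())
print(vectorized.tolist())
```

Note that exact comparisons such as `a == b + 0.3` are sensitive to floating-point rounding in either version.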

Asking advice on optimizing a while loop

This is part of an algorithm regarding the RKF method
t = self.a
x = np.array(self.x0)
h = self.hmax
T = np.array( [t] )
X = np.array( [x] )
k= [0]*6
while t < self.b:
    if t + h > self.b:
        h = self.b - t
    k[0] = h * self.f(t, x)
    k[1] = h * self.f(t + a2 * h, x + b21 * k[0] )
    k[2] = h * self.f(t + a3 * h, x + b31 * k[0] + b32 * k[1])
    k[3] = h * self.f(t + a4 * h, x + b41 * k[0] + b42 * k[1] + b43 * k[2])
    k[4] = h * self.f(t + a5 * h, x + b51 * k[0] + b52 * k[1] + b53 * k[2] + b54 * k[3])
    k[5] = h * self.f(t + a6 * h, x + b61 * k[0] + b62 * k[1] + b63 * k[2] + b64 * k[3] + b65 * k[4])
    r = abs( r1 * k[0] + r3 * k[2] + r4 * k[3] + r5 * k[4] + r6 * k[5] ) / h
    r = r / (self.atol+self.rtol*(abs(x)+abs(k[0])))
    if len( np.shape( r ) ) > 0:
        r = max( r )
    if r <= 1:
        t = t + h
        x = x + c1 * k[0] + c3 * k[2] + c4 * k[3] + c5 * k[4]
        T = np.append( T, t )
        X = np.append( X, [x], 0 )
    h = h * min( max( 0.94 * sqrt(sqrt( 1 / r )), 0.1 ), 4.0 )
    if h > self.hmax:
        h = self.hmax
    elif h < self.hmin or t==t-h:
        raise RuntimeError("Error: Could not converge to the required tolerance.")
        break
This works just fine, but I was wondering whether it is possible to make it even faster and more efficient?
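One candidate for a cheap improvement is the result storage: `np.append` returns a new array on every call, so growing `T` and `X` that way copies all earlier steps each time a step is accepted. A minimal sketch of the alternative, collecting accepted steps in Python lists and converting once at the end, is shown below. `collect_steps` and its `steps` argument are hypothetical stand-ins for the accepted-step branch of the loop, not part of the original class.

```python
import numpy as np

def collect_steps(x0, steps):
    """Accumulate (t, x) pairs in lists, then build the arrays once."""
    t = 0.0
    x = np.asarray(x0, dtype=float)
    T = [t]
    X = [x]
    for h, dx in steps:   # stand-in for one accepted RKF step
        t = t + h
        x = x + dx
        T.append(t)       # list append does not copy earlier entries
        X.append(x)
    return np.array(T), np.array(X)  # single conversion at the end

T, X = collect_steps([1.0, 2.0],
                     [(0.5, np.array([1.0, 1.0])),
                      (0.25, np.array([0.0, -1.0]))])
print(T)
print(X)
```

Whether this matters in practice depends on how many steps are taken and how expensive `self.f` is, so it is worth profiling before and after.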

How to create a vertical histogram using built-in Python modules?

Basically I need to create a vertical histogram that cascades downwards.
My code so far:
a = 1
b = 8
c = 6
d = 7
x = [a, b, c, d]
z = max(x)
print(z)
i = 0
while i < z:
    i += 1
    a -= 1
    b -= 1
    c -= 1
    d -= 1
    if a >= 0:
        print("*".ljust(5), end="")
    if b >= 0:
        print("*".ljust(5), end="")
    if c >= 0:
        print("*".ljust(5), end="")
    if d >= 0:
        print("*".ljust(5))
output obtained:
*    *    *    *
*    *    *
*    *    *
*    *    *
*    *    *
*    *    *
*    *
*
Required output:
*    *    *    *
     *    *    *
     *    *    *
     *    *    *
     *    *    *
     *    *    *
     *         *
     *
ps: I'm new to all this so please excuse my ignorance 😁
Your code is almost working as is, but the *s are shifting over between columns.
If I change the *s to be the variable they are for, your current output looks like this:
a b c d
b c d
b c d
b c d
b c d
b c d
b d
b
You just need to print some whitespace when your if conditions come up False. So each one becomes
if a >= 0:
    print("*".ljust(5), end="")
else:
    print(" ".ljust(5), end="")
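The same idea can be written once for any number of columns instead of repeating the `if`/`else` per variable. This is a minimal sketch; the function name `vertical_histogram` and the `width` parameter are made up for the example:

```python
def vertical_histogram(values, width=5):
    """Return a histogram whose columns of '*' hang down from the top."""
    lines = []
    for row in range(max(values)):
        # A column is still 'alive' on this row if its value exceeds the row index;
        # otherwise pad with spaces so later columns stay in place.
        cells = [("*" if v > row else " ").ljust(width) for v in values]
        lines.append("".join(cells).rstrip())
    return "\n".join(lines)

print(vertical_histogram([1, 8, 6, 7]))
```

Building each row as a string and printing it at the end also guarantees a newline after every row, even when the last column has run out.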

Given an N-side square matrix, is there a way to find the ring value of a cell without using loops or if conditions?

For instance, imagine you have a 6-side square matrix.
These are the cells cartesian indices:
(0,0) (0,1) (0,2) (0,3) (0,4) (0,5)
(1,0) (1,1) (1,2) (1,3) (1,4) (1,5)
(2,0) (2,1) (2,2) (2,3) (2,4) (2,5)
(3,0) (3,1) (3,2) (3,3) (3,4) (3,5)
(4,0) (4,1) (4,2) (4,3) (4,4) (4,5)
(5,0) (5,1) (5,2) (5,3) (5,4) (5,5)
A 6-side square has 3 rings:
A A A A A A
A B B B B A
A B C C B A
A B C C B A
A B B B B A
A A A A A A
QUESTION: What's the function that takes the coordinates of a cell, the side N of the square and returns the ring value accordingly? Ex:
f(x = 1, y = 2, N = 6) = B
A,B,C... can be any numerical value: 1,2,3 ... or 0,1,2 ... or whatever. What matters is that they are congruent for any N. Ex:
N = 1 => A = 1
N = 2 => A = 1
N = 3 => A = 1, B = 2
N = 4 => A = 1, B = 2
N = 5 => A = 1, B = 2, C = 3
N = 6 => A = 1, B = 2, C = 3
N = 7 => A = 1, B = 2, C = 3, D = 4
...
Using if conditions the problem is easily solved.
Given a pair (x,y) and the square side N:
# (N + 1)//2 is the number of rings in an N-side square
for k in range(1, (N + 1)//2 + 1):
    if x == k-1 or y == k-1 or x == N-k or y == N-k:
        return k
This seems like a very expensive way to find the ring value of the cell though.
I have been trying to find the function using diagonals, sum of the coordinates, difference of the coordinates ... of the cells, but I still couldn't find anything.
Has anyone ever encountered this problem?
Is there a way to solve it?
Looks like a math problem to solve.
EDIT: Updated the function; it should now handle both even and odd N past the midpoint. I'm not sure how to turn this into a single mathematical equation as the OP requested, though.
import math
def ring_finder(x, y, N, outer_ring = 0):
    '''
    x and y are the coordinates of a cell, N is the length of the side of square
    Returns the correct ring count starting from outer_ring value (default, 0)
    '''
    if x >= N or y >= N:
        print("coordinates outside square, please check")
        return None
    no_of_squares = math.ceil(N/2)
    x = N - x - 1 if x >= no_of_squares else x
    y = N - y - 1 if y >= no_of_squares else y
    return min(x, y) + outer_ring
ring_finder(5, 5, 6)
ring_finder(1, 2, 6)
I think this function does what you want:
def ring_id(n, i, j):
    even = n % 2 == 0
    n_2 = n // 2
    i = i - n_2
    if even and i >= 0:
        i += 1
    i = abs(i)
    j = j - n_2
    if even and j >= 0:
        j += 1
    j = abs(j)
    ring_id = i + max(j - i, 0)
    return n_2 - ring_id
Small test with letters:
import string
def print_rings(n):
    ring_names = string.ascii_uppercase
    for i in range(n):
        for j in range(n):
            rid = ring_id(n, i, j)
            print(ring_names[rid], end=' ')
        print()
print_rings(6)
# A A A A A A
# A B B B B A
# A B C C B A
# A B C C B A
# A B B B B A
# A A A A A A
print_rings(7)
# A A A A A A A
# A B B B B B A
# A B C C C B A
# A B C D C B A
# A B C C C B A
# A B B B B B A
# A A A A A A A
EDIT: If you insist on not having the word if in your function, you can (somewhat awkwardly) rewrite the above function as:
def ring_id(n, i, j):
    even = 1 - n % 2
    n_2 = n // 2
    i = i - n_2
    i += even * (i >= 0)
    i = abs(i)
    j = j - n_2
    j += even * (j >= 0)
    j = abs(j)
    ring_id = i + max(j - i, 0)
    return n_2 - ring_id
Or if you want it looking more "formula-like" (albeit unreadable and with more repeated computation):
def ring_id(n, i, j):
    i2 = abs(i - (n // 2) + (1 - n % 2) * (i >= (n // 2)))
    j2 = abs(j - (n // 2) + (1 - n % 2) * (j >= (n // 2)))
    # Subtract the whole distance i2 + max(j2 - i2, 0), as in the versions above
    return (n // 2) - (i2 + max(j2 - i2, 0))
This is not any more or less "mathematical" though, it is fundamentally the same logic.
The ring value is the complement of the distance to the center of the array, in the "infinity norm" sense.
N/2 - max(|X - (N-1)/2|, |Y - (N-1)/2|), rounded down.
This assigns the value 0 for A, 1 for B and so on.
To avoid the half integers, you can use
(N - max(|2X - N + 1|, |2Y - N + 1|)) / 2, again rounded down.
The max and abs functions may involve hidden ifs, but you can't avoid that.
def Ring(X, Y, N):
    return (N - max(abs(2 * X - N + 1), abs(2 * Y - N + 1))) // 2
for N in range(1, 8):
    for X in range(N):
        for Y in range(N):
            print(chr(Ring(X, Y, N) + 65), '', end= '')
        print()
    print()
A

A A
A A

A A A
A B A
A A A

A A A A
A B B A
A B B A
A A A A

A A A A A
A B B B A
A B C B A
A B B B A
A A A A A

A A A A A A
A B B B B A
A B C C B A
A B C C B A
A B B B B A
A A A A A A

A A A A A A A
A B B B B B A
A B C C C B A
A B C D C B A
A B C C C B A
A B B B B B A
A A A A A A A
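Since N - 1 - |2X - N + 1| equals 2 * min(X, N - 1 - X), the closed form above reduces to the distance from the cell to the nearest border: min(X, Y, N - 1 - X, N - 1 - Y). A small brute-force check of that equivalence is sketched below; the function names are made up for this illustration.

```python
def ring_formula(x, y, n):
    # Closed form from the answer above (0 for the outermost ring).
    return (n - max(abs(2 * x - n + 1), abs(2 * y - n + 1))) // 2

def ring_border(x, y, n):
    # Distance to the nearest of the four borders.
    return min(x, y, n - 1 - x, n - 1 - y)

def agree(max_n=20):
    """Compare both forms on every cell of every square up to max_n."""
    return all(ring_formula(x, y, n) == ring_border(x, y, n)
               for n in range(1, max_n + 1)
               for x in range(n)
               for y in range(n))

print(agree())
print(ring_border(1, 2, 6))
```

Like the original, this still relies on `min`, which may hide a comparison internally, but it avoids explicit loops and `if` statements.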
