Replacing pairs of variables in a file

I am working on a problem where my goal is to replace variables both in a file and in the file's name.
The issue is that I have to change a pair of variables at the same time, for all combinations (generally 24 combinations).
I know how to create all combinations of strings, but I want to put lists inside and iterate over them.
a = [ 'distance', 'T1', 'T2', 'gamma' ]
new_list = list(itertools.permutations(a, 2))
I created the function to pass my values:
def replace_variables(distance ='0', T1 ='0', T2 = '0', gamma = '0'):
    template_new = template.replace('*distance*', distance).replace('*T1*', T1).replace('*T2*', T2).replace('*gamma*', gamma)
    input_file = input_name.replace('one','T1'+T1).replace('two','T2'+T2).replace('phi','PHI'+Phi).replace('distance','R'+distance)
    return template_new, input_file
When I call that function, I can pass only the names of the variables:
for i in new_list:
    elem1 = i[0]
    elem2 = i[1]
    template_new, input_file = replace_variables(elem1, elem2)
    print(input_file)
Though I need to use lists:
distance = ['-3','+3']
T1 = ['-3', '+3']
T2 = ['-3', '+3']
gamma = ['-3', '+3']
For each pair of variables, I need to change the values in the file and in the file name, for example:
original file: name_file_R_T1_T2_gamma.txt
will be replaced by:
name_file_3_3_0_0.txt, name_file_3_-3_0_0.txt, name_file_-3_3_0_0.txt,
name_file_3_3_0_0.txt, name_file_3_0_3_0.txt, name_file_3_0_-3_0.txt,
and so forth.
The original template looks like:
template = """
R = 3.0 *distance* cm
THETA1 = 60. *T1* degree
THETA2 = 2.0 *T2* degree
GAMMA = 0 *gamma* degree
"""
and I want to obtain:
template = """
R = 3.0 +3 cm
THETA1 = 60. +3 degree
THETA2 = 2.0 +0 degree
GAMMA = 0 +0 degree
"""
and so forth

I think I almost tackled the above problem:
#!/usr/bin/env python
import itertools
import copy
def replace_variables(i, distance ='0', T1 ='0', T2 = '0', gamma = '0' ):
    k_ = copy.deepcopy(i)
    k_[0][0] = '-2'
    k_[1][0] = '2'
    template_new = template.replace('*distance*', distance).replace('*T1*', T1).replace('*T2*', T2).replace('*gamma*', gamma)
    input_file = input_name.replace('one','T1'+T1).replace('two','T2'+T2).replace('gamma','gamma'+gamma).replace('distance','R'+distance)
    f = open(template_new, 'w')
    f.write(template_new)
    f.close()
input_name = 'name_file_distance_T1_T2_gamma.txt'
template = """
R = 3.0 *distance* cm
THETA1 = 60. *T1* degree
THETA2 = 2.0 *T2* degree
GAMMA = 0 *gamma* degree
"""
a = [['distance','+2','-2'], ['T1','+2','-2'], ['T2','+2','-2'], ['gamma','+2','-2']]
new_list = list(itertools.permutations(a, 2))
for i in new_list:
    replace_variables(i, x, y)
However, I ran into two problems:
1) My code does not change the values of the variables (apart from the default ones) in the replace_variables function, so I get:
name_file_Rdistance_T1T1_T20_gamma0.txt, and so on.
I think this is because of the default arguments passed to the function.
2) My function does not create separate files.
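A minimal sketch of one way to get there, under some assumptions: "pairs" is taken to mean unordered pairs (itertools.combinations rather than permutations), each chosen variable takes the values '-3' and '+3', and the remaining variables stay at '0'. The helper names all_pair_settings and render, and the exact file naming scheme, are illustrative and not part of the original code. With 6 pairs and 4 value choices per pair, this gives the 24 cases mentioned in the question:

```python
import itertools

TEMPLATE = """
R = 3.0 *distance* cm
THETA1 = 60. *T1* degree
THETA2 = 2.0 *T2* degree
GAMMA = 0 *gamma* degree
"""

NAMES = ["distance", "T1", "T2", "gamma"]
VALUES = ["-3", "+3"]  # assumed values for the two variables being varied


def all_pair_settings(names=NAMES, values=VALUES, default="0"):
    """Yield one {name: value} dict per pair of variables and value choice."""
    for pair in itertools.combinations(names, 2):
        for chosen in itertools.product(values, repeat=2):
            settings = dict.fromkeys(names, default)  # everything starts at the default
            settings.update(zip(pair, chosen))        # override only the chosen pair
            yield settings


def render(settings):
    """Return (file_name, file_text) for one dict of settings."""
    text = TEMPLATE
    for name in NAMES:
        text = text.replace("*" + name + "*", settings[name])
    file_name = "name_file_" + "_".join(settings[name] for name in NAMES) + ".txt"
    return file_name, text


outputs = [render(settings) for settings in all_pair_settings()]
print(len(outputs))   # 24
print(outputs[0][0])  # name_file_-3_-3_0_0.txt
```

Looking values up by name in a dict avoids the problem in the first attempt, where the variable names themselves were passed positionally and ended up as the values of distance and T1. Each case can then be written with open(file_name, 'w'); note that the second attempt opens template_new (the file contents) instead of the file name, which is likely why no separate files appear.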

Related

When I try to print the mean and standard deviation, I am prompted with a name error: variable not defined

I am trying to print the mean and standard deviation, but in its current form the code doesn't recognize anything inside the loop. How would I go about correcting this so that it displays what is intended? When I try to print the mean, it says ex is not defined.
import numpy as np
p = 0.44
q = 0.56
mu_1 = 26.5
sigma = 4.3
mu_2 = 76.4
n = 7
print( 'total number of jobs =', n)
lst_times = []
j = 0
def calc_avg_std(n):
    while j < 100:
        m = np.random.binomial(n,p)
        easy_jobs = np.random.normal(mu_1,sigma,m)
        n_chall = n-m
        chall_jobs = np.random.exponential(mu_2,n_chall)
        totalTime = sum(easy_jobs) + sum(chall_jobs)
        lst_times.append(totalTime)
        j = j + 1
    ex = (mu_1 * p) + (mu_2 * q)
    ex2 = (p *((mu_1**2)))+ (q*(mu_2**2)*2)
    var = ex2-(ex**2)
    stdev = np.sqrt(var)
    return [ex , stdev]
print(' mean is',ex)
I tried this code without the def and return and it runs properly, but the professor insists that they should be used.

def is used to create a function. When you use return, you return the values to the caller.
Replace your last print with the following lines:
# call the function and keep the return values
mean, stdev = calc_avg_std(n)
# print the returned values
print(mean)
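As a side note, the function in the question would still fail once it is called: j = j + 1 assigns to j inside the function while j is defined outside it, so Python treats j as a local name and raises UnboundLocalError. A minimal sketch of a self-contained version follows. It assumes the standard-library random module as a stand-in for NumPy so that it runs without extra packages, splits the simulation into a separate helper (simulate_total_times is a hypothetical name), and keeps the question's formulas for ex, ex2 and var unchanged:

```python
import math
import random

P, Q = 0.44, 0.56
MU_1, SIGMA, MU_2 = 26.5, 4.3, 76.4


def simulate_total_times(n, runs=100):
    """Simulate the total time for n jobs, repeated `runs` times."""
    times = []
    for _ in range(runs):  # a local loop counter replaces the global j
        m = sum(random.random() < P for _ in range(n))  # binomial(n, P) draw
        easy_jobs = [random.gauss(MU_1, SIGMA) for _ in range(m)]
        # expovariate takes a rate, so a mean of MU_2 corresponds to 1 / MU_2
        chall_jobs = [random.expovariate(1.0 / MU_2) for _ in range(n - m)]
        times.append(sum(easy_jobs) + sum(chall_jobs))
    return times


def calc_avg_std():
    """Return (ex, stdev) using the formulas from the question as written."""
    ex = (MU_1 * P) + (MU_2 * Q)
    ex2 = (P * MU_1**2) + (Q * MU_2**2 * 2)
    var = ex2 - ex**2
    return ex, math.sqrt(var)


times = simulate_total_times(7)
mean, stdev = calc_avg_std()  # the names only exist here because they were returned
print("mean is", mean)
print("standard deviation is", stdev)
```

The values computed inside calc_avg_std are visible to the caller only through the return value, which is the point the answer makes.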

Differential equation change of variables with sympy

I have an ordinary differential equation like this:
DiffEq = Eq(-ℏ*ℏ*diff(Ψ,x,2)/(2*m) + m*w*w*(x*x)*Ψ/2 - E*Ψ , 0)
I want to perform a variable change :
sp.Eq(u , x*sqrt(m*w/ℏ))
sp.Eq(Ψ, H*exp(-u*u/2))
How can I do this with sympy?

Use the following function:
def variable_change(ODE,dependent_var,
                    independent_var,
                    new_dependent_var = None,
                    new_independent_var= None,
                    dependent_var_relation = None,
                    independent_var_relation = None,
                    order = 2):
    if new_dependent_var is None:
        new_dependent_var = dependent_var
    if new_independent_var is None:
        new_independent_var = independent_var
    # change of independent variable
    if new_independent_var != independent_var:
        for i in range(order, -1, -1):
            # replace the i-th derivative
            a = D(dependent_var , independent_var, i )
            ξ = Function("ξ")(independent_var)
            b = D( dependent_var.subs(independent_var, ξ), independent_var ,i)
            rel = solve(independent_var_relation, new_independent_var)[0]
            for j in range(order, 0, -1):
                b = b.subs( D(ξ,independent_var,j), D(rel,independent_var,j))
            b = b.subs(ξ, new_independent_var)
            rel = solve(independent_var_relation, independent_var)[0]
            b = b.subs(independent_var, rel)
            ODE = ODE.subs(a,b)
        ODE = ODE.subs(independent_var, rel)
    # change of dependent variable
    if new_dependent_var != dependent_var:
        ODE = (ODE.subs(dependent_var.subs(independent_var,new_independent_var) , (solve(dependent_var_relation, dependent_var)[0])))
        ODE = ODE.doit().expand()
    return ODE.simplify()
For the example posted:
from sympy import *
from sympy import diff as D
E, ℏ ,w,m,x,u = symbols("E, ℏ , w,m,x,u")
Ψ ,H = map(Function, ["Ψ ","H"])
Ψ ,H = Ψ(x), H(u)
DiffEq = Eq(-ℏ*ℏ*D(Ψ,x,2)/(2*m) + m*w*w*(x*x)*Ψ/2 - E*Ψ,0)
display(DiffEq)
display(Eq(u , x*sqrt(m*w/ℏ)))
display(Eq(Ψ, H*exp(-u*u/2)))
newODE = variable_change(ODE = DiffEq,
                         independent_var = x,
                         new_independent_var= u,
                         independent_var_relation = Eq(u , x*sqrt(m*w/ℏ)),
                         dependent_var = Ψ,
                         new_dependent_var = H,
                         dependent_var_relation = Eq(Ψ, H*exp(-u*u/2)),
                         order = 2)
display(newODE)
Under this substitution the differential equation outputted is then:
Eq((-E*H + u*w*ℏ*D(H, u) + w*ℏ*H/2 - w*ℏ*D(H, (u, 2))/2)*exp(-u**2/2), 0)
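The result can be checked by hand with the chain rule. The derivation below is a worked sketch using the symbols from the question (w for the angular frequency, primes for derivatives with respect to u); it is not output of the function above.

```latex
% Independent variable: u = x \sqrt{m w / \hbar}
\frac{d}{dx} = \sqrt{\frac{m w}{\hbar}}\,\frac{d}{du}
\quad\Longrightarrow\quad
-\frac{\hbar^{2}}{2m}\frac{d^{2}\Psi}{dx^{2}} = -\frac{\hbar w}{2}\frac{d^{2}\Psi}{du^{2}},
\qquad
\frac{m w^{2} x^{2}}{2}\,\Psi = \frac{\hbar w\,u^{2}}{2}\,\Psi

% Dependent variable: \Psi = H(u)\, e^{-u^{2}/2}
\frac{d^{2}\Psi}{du^{2}} = \left( H'' - 2uH' + (u^{2}-1)H \right) e^{-u^{2}/2}

% Substituting both into the original equation:
\left( -\frac{\hbar w}{2} H'' + \hbar w\,u\,H' + \frac{\hbar w}{2} H - E\,H \right) e^{-u^{2}/2} = 0
```

The u^2 terms cancel, which leaves the same four terms as in the SymPy output.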

If anyone is wondering how to do this in CoCalc notebooks, or anywhere else you can mix Sage and Python: here I define basically the same variables and functions as in the accepted answer, and after the substitution the result is converted back to Sage:
# Sage objects
var("E w m x u")
var("h_bar", latex_name = r'\hbar')
Ψ = function("Ψ")(x)
H = function('H')(u)
DiffEq = (-h_bar*h_bar*Ψ.diff(x, 2)/(2*m) + m*w*w*(x*x)*Ψ/2 - E*Ψ == 0)
display(DiffEq)
display(u == x*sqrt(m*w/h_bar))
display(Ψ == H*exp(-u*u/2))
# Function is purely sympy
newODE = variable_change(
    ODE = DiffEq._sympy_(),
    independent_var = x._sympy_(),
    new_independent_var = u._sympy_(),
    independent_var_relation = (u == x*sqrt(m*w/h_bar))._sympy_(),
    dependent_var = Ψ._sympy_(),
    new_dependent_var = H._sympy_(),
    dependent_var_relation = (Ψ == H*exp(-u*u/2))._sympy_(),
    order = 2
)
display(newODE._sage_())
Note that the only difference is that here everything is converted to SymPy before being passed as an argument to OP's function (it'll probably break if you don't!). After you call _sympy_() only once on a variable or expression, every sympy object gets a _sage_() method to convert back.
The result given was:
# Sage object again
1/2*(2*h_bar*u*w*diff(H(u), u) + h_bar*w*H(u) - h_bar*w*diff(H(u), u, u) - 2*E*H(u))*e^(-1/2*u^2) == 0
Which is just OP's result, but Sage handles operands a little bit differently.
Note: in order to avoid overriding stuff on Sage after importing everything from SymPy, you may want to import only diff as D, Function and solve from the main library. You might also want to rename sympy's solve to something else to avoid overriding Sage's own sage.symbolic.relation.solve.

Why is my interpolation not working properly in my function?

I have fairly long code that processes spectra, and along the way I need an interpolation of some points. I used to have all this code written line by line without any functions, and it all worked properly, but now I'm converting it to two large functions so that I can call it on other models more easily in the future. Below is my code. (I have more code after the last line here that plots some things, but that's not relevant to my issue: I've tested this with a bunch of print lines and learned that the issue arises when I call the interpolation function inside my process function.)
import re
import numpy as np
import scipy.interpolate
# Required files and lists
filename = 'bpass_spectra.txt' # number of columns = 4
extinctionfile = 'ExtinctionLawPoints.txt' # R_V = 4.0
datalist = []
if filename == 'bpass_spectra.txt':
    filetype = 4
else:
    filetype = 1
if extinctionfile == 'ExtinctionLawPoints.txt':
    R_V = 4.0
else:
    R_V = 1.0 #to be determined
# Constants
h = 4.1357e-15 # Planck's constant [eV s]
c = float(3e8) # speed of light [m/s]
# Inputs
beta = 2.0 # power used in extinction law
R = 1.0 # star formation rate [Msun/yr]
z = 1.0 # redshift
M_gas = 1.0 # mass of gas
M_halo = 2e41 # mass of dark matter halo
# Read spectra file
f = open(filename, 'r')
rawlines = f.readlines()
met = re.findall('Z\s=\s(\d*\.\d+)', rawlines[0])
del rawlines[0]
for i in range(len(rawlines)):
    newlist = rawlines[i].split(' ')
    datalist.append(newlist)
# Read extinction curve data file
rawpoints = open(extinctionfile, 'r').readlines()
def interpolate(R_V, rawpoints, Elist, i):
    pointslist = []
    if R_V == 4.0:
        for i in range(len(rawpoints)):
            newlst = re.split('(?!\S)\s(?=\S)|(?!\S)\s+(?=\S)', rawpoints[i])
            pointslist.append(newlst)
        pointslist = pointslist[3:]
        lambdalist = [float(item[0]) for item in pointslist]
        k_abslist = [float(item[4]) for item in pointslist]
        xvallist = [(c*h)/(lamb*1e-6) for lamb in lambdalist]
        k_interp = scipy.interpolate.interp1d(xvallist, k_abslist)
        return k_interp(Elist[i])
# Processing function
def process(interpolate, filetype, datalist, beta, R, z, M_gas, M_halo, met):
    speclist = []
    if filetype == 4:
        metallicity = float(met[0])
        Elist = [float(item[0]) for item in datalist]
        speclambdalist = [h*c*1e9/E for E in Elist]
        met1list = [float(item[1]) for item in datalist]
        speclist.extend(met1list)
        klist, Tlist = [None]*len(speclist), [None]*len(speclist)
        if metallicity > 0.0052:
            DGRlist = [50.0*np.exp(-2.21)*metallicity]*len(speclist) # dust to gas ratio
        elif metallicity <= 0.0052:
            DGRlist = [((50.0*metallicity)**3.15)*np.exp(-0.96)]*len(speclist)
        for i in range(len(speclist)):
            if Elist[i] <= 4.1357e-3: # frequencies <= 10^12 Hz
                klist[i] = 0.1*(float(Elist[i])/(1000.0*h))**beta # extinction law [cm^2/g]
            elif Elist[i] > 4.1357e-3: # frequencies > 10^12 Hz
                klist[i] = interpolate(R_V, rawpoints, Elist, i) # interpolated function's value at Elist[i]
        print "KLIST (INTERPOLATION) ELEMENTS 0 AND 1000:", klist[0], klist[1000]
    return
The output from the print line is KLIST (INTERPOLATION) ELEMENTS 0 AND 1000: 52167.31734159269 52167.31734159269.
When I run my old code without functions, I print klist[0] and klist[1000] like I do here and get different values for each. In this new code, I get back two values that are the same from this line. This shouldn't be the case, so it must not be interpolating correctly inside my function (maybe it's not performing it on each point correctly in the loop?). Does anyone have any insight? It would be unreasonable to post my entire code with all the used text files here (they're very large), so I'm not expecting anyone to run it, but rather examine how I use and call my functions.
Edit: Below is the original version of my code up to the interpolation point without the functions (which works).
import re
import numpy as np
import scipy.interpolate
filename = 'bpass_spectra.txt'
extinctionfile = 'ExtinctionLawPoints.txt' # from R_V = 4.0
pointslist = []
datalist = []
speclist = []
# Constants
h = 4.1357e-15 # Planck's constant [eV s]
c = float(3e8) # speed of light [m/s]
# Read spectra file
f = open(filename, 'r')
rawspectra = f.readlines()
met = re.findall('Z\s=\s(\d*\.\d+)', rawspectra[0])
del rawspectra[0]
for i in range(len(rawspectra)):
    newlist = rawspectra[i].split(' ')
    datalist.append(newlist)
# Read extinction curve data file
rawpoints = open(extinctionfile, 'r').readlines()
for i in range(len(rawpoints)):
    newlst = re.split('(?!\S)\s(?=\S)|(?!\S)\s+(?=\S)', rawpoints[i])
    pointslist.append(newlst)
pointslist = pointslist[3:]
lambdalist = [float(item[0]) for item in pointslist]
k_abslist = [float(item[4]) for item in pointslist]
xvallist = [(c*h)/(lamb*1e-6) for lamb in lambdalist]
k_interp = scipy.interpolate.interp1d(xvallist, k_abslist)
# Create new lists
Elist = [float(item[0]) for item in datalist]
speclambdalist = [h*c*1e9/E for E in Elist]
z1list = [float(item[1]) for item in datalist]
speclist.extend(z1list)
met = met[0]
klist = [None]*len(speclist)
Loutlist = [None]*len(speclist)
Tlist = [None]*len(speclist)
# Define parameters
b = 2.0 # power used in extinction law (beta)
R = 1.0 # star formation rate [Msun/yr]
z = 1.0 # redshift
Mgas = 1.0 # mass of gas
Mhalo = 2e41 # mass of dark matter halo
if float(met) > 0.0052:
    DGRlist = [50.0*np.exp(-2.21)*float(met)]*len(speclist)
elif float(met) <= 0.0052:
    DGRlist = [((50.0*float(met))**3.15)*np.exp(-0.96)]*len(speclist)
for i in range(len(speclist)):
    if float(Elist[i]) <= 4.1357e-3: # frequencies <= 10^12 Hz
        klist[i] = 0.1*(float(Elist[i])/(1000.0*h))**b # extinction law [cm^2/g]
    elif float(Elist[i]) > 4.1357e-3: # frequencies > 10^12 Hz
        klist[i] = k_interp(Elist[i]) # interpolated function's value at Elist[i]
print "KLIST (INTERPOLATION) ELEMENTS 0 AND 1000:", klist[0], klist[1000]
The output from this print line is KLIST (INTERPOLATION) ELEMENTS 0 AND 1000 7779.275435560996 58253.589270674354.

You are passing i as an argument to interpolate, and then also using i as the loop variable within interpolate. Once the for i in range(len(rawpoints)) loop has run, i is left at len(rawpoints)-1, whatever value was passed in. The interpolate function therefore always returns the same value, k_interp(Elist[len(rawpoints)-1]). You need to either use a different loop variable (e.g. for not_i in range(len(rawpoints))) or use a different name for the index into Elist. Consider the following change to interpolate:
def interpolate(R_V, rawpoints, Elist, j):
    pointslist = []
    if R_V == 4.0:
        for i in range(len(rawpoints)):
            newlst = re.split('(?!\S)\s(?=\S)|(?!\S)\s+(?=\S)', rawpoints[i])
            pointslist.append(newlst)
        pointslist = pointslist[3:]
        lambdalist = [float(item[0]) for item in pointslist]
        k_abslist = [float(item[4]) for item in pointslist]
        xvallist = [(c*h)/(lamb*1e-6) for lamb in lambdalist]
        k_interp = scipy.interpolate.interp1d(xvallist, k_abslist)
        return k_interp(Elist[j])
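The shadowing effect can be reproduced without the spectra files. The sketch below is a stripped-down illustration with made-up function names (pick_shadowed, pick_fixed) and made-up data; it only mirrors the structure of interpolate, not its calculations:

```python
def pick_shadowed(values, i):
    """The loop reuses the name i, so the argument is overwritten."""
    for i in range(3):
        pass  # stands in for the file-parsing loop
    return values[i]  # i is always 2 here, whatever was passed in


def pick_fixed(values, j):
    """The index argument has its own name and survives the loop."""
    for i in range(3):
        pass
    return values[j]


values = [10, 20, 30, 40, 50]
shadowed = [pick_shadowed(values, k) for k in range(len(values))]
fixed = [pick_fixed(values, k) for k in range(len(values))]
print(shadowed)  # [30, 30, 30, 30, 30]
print(fixed)     # [10, 20, 30, 40, 50]
```

Separately, the function version re-parses rawpoints and rebuilds the interpolator on every call. Building k_interp once outside the loop, as the original script did, avoids repeating that work for each point.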

Python convert string into variable then use in function

I've generated variable names pN for N = 0 to n:
p0, p1, p2, p3, p4,....,pn
So far I got:
P = ['p0', 'p1', 'p2',..., 'pn']
What I want to do now is convert them into variable names to be called upon and used in a function.
Convert 'p0' to p0, and so on for all of the values in
P = ['p0', 'p1', 'p2',..., 'pn']
So it becomes
P = [p0, p1, p2,..., pn]
Then generate functions using the variables:
Fi = P[i]/2
ie:
F0 = p0/2
F1 = p1/2
F2 = p2/2
Problem is I don't know how to convert 'pn' to pn properly, then generate functions based on pn. How do I do that? Or rather, what am I doing wrong?
Here's what I got:
# Define the constants
# ====================
k = 1
m = 1
# Generate the names
# ==================
n = 10 # Number of terms
P = [] # Array for momenta names
Q = [] # Array for position names
for i in range (n): # Generate the names for
    p = "p"+str(i) # momentum
    q = "q"+str(i) # position
    # Convert the names into variables
    exec("%s = %d" % p)
    exec("%s = %d" % q)
    # Put the names into the arrays
    P.append(p)
    Q.append(q)
# Print the names out for error checking
print(P)
print(Q)
# Generate the functions
# ======================
Q_dot = []
for i in range(n):
    def Q_dot(P[i]):
        q_dot = P[i] / m
    Q_dot.append(q_dot)
    i += 1
Note: I want to keep my variables as variables because I'm using the functions to solve a system of differential equations. My goal is to create a list of differential equations with the same format.
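One common alternative to creating variables with exec is to keep the values in a list (or a dict keyed by name) and to generate the functions with a small factory. The sketch below is only an illustration under that assumption: make_q_dot is a hypothetical helper, the momenta are treated as plain numbers, and each generated function takes the whole list of momenta and picks out its own entry.

```python
m = 1.0  # mass
n = 10   # number of terms

# The names are kept as labels only; no variables are created from them.
names = ["p%d" % i for i in range(n)]


def make_q_dot(i, mass=m):
    """Return a function computing q_dot_i = p_i / mass from a list of momenta."""
    def q_dot(p):
        return p[i] / mass
    return q_dot


# One function per term, all with the same form
Q_dot = [make_q_dot(i) for i in range(n)]

# Example state: momenta stored by position, with a dict for lookup by name
p = [float(i) for i in range(n)]
state = dict(zip(names, p))

rates = [f(p) for f in Q_dot]
print(state["p3"], rates[3])  # 3.0 3.0
```

Using a factory function gives each q_dot its own value of i. Defining the functions directly inside the loop would make them all share the loop variable's final value.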

Finding correlation coefficient from 2 lists

I am working on a project with many functions that operate on a couple of lists of data. I've already separated the lists, and I have defined some functions that I know work correctly, namely a mean function and a standard deviation function. My issue is that when testing my lists I get a correct mean and a correct standard deviation, but an incorrect correlation coefficient. Could my math be off here? I need to find the correlation coefficient using only Python's standard library.
MY CODE:
def correlCo(someList1, someList2):
    # First establish the means and standard deviations for both lists.
    xMean = mean(someList1)
    yMean = mean(someList2)
    xStandDev = standDev(someList1)
    yStandDev = standDev(someList2)
    zList1 = []
    zList2 = []
    # Create 2 new lists taking (a[i]-a's Mean)/standard deviation of a
    for x in someList1:
        z1 = ((float(x)-xMean)/xStandDev)
        zList1.append(z1)
    for y in someList2:
        z2 = ((float(y)-yMean)/yStandDev)
        zList2.append(z2)
    # Mapping out the lists to be float values instead of string
    zList1 = list(map(float,zList1))
    zList2 = list(map(float,zList2))
    # Multiplying each value from the lists
    zFinal = [a*b for a,b in zip(zList1,zList2)]
    totalZ = 0
    # Taking the sum of all the products
    for a in zFinal:
        totalZ += a
    # Finally calculating correlation coefficient
    r = (1/(len(someList1) - 1)) * totalZ
    return r
SAMPLE RUN:
I have a list of [1,2,3,4,4,8] and [3,3,4,5,8,9]
I expect the correct answer of r = 0.8848, but get r = .203727
EDIT: To include the mean and standard deviation functions I have made.
def mean(someList):
    total = 0
    for a in someList:
        total += float(a)
    mean = total/len(someList)
    return mean
def standDev(someList):
    newList = []
    sdTotal = 0
    listMean = mean(someList)
    for a in someList:
        newNum = (float(a) - listMean)**2
        newList.append(newNum)
    for z in newList:
        sdTotal += float(z)
    standardDeviation = sdTotal/(len(newList))
    return standardDeviation

The Pearson correlation can be calculated with numpy's corrcoef.
import numpy
numpy.corrcoef(list1, list2)[0, 1]

Pearson Correlation Coefficient
Code (modified)
def mean(someList):
    total = 0
    for a in someList:
        total += float(a)
    mean = total/len(someList)
    return mean
def standDev(someList):
    listMean = mean(someList)
    dev = 0.0
    for i in range(len(someList)):
        dev += (someList[i]-listMean)**2
    dev = dev**(1/2.0)
    return dev
def correlCo(someList1, someList2):
    # First establish the means and standard deviations for both lists.
    xMean = mean(someList1)
    yMean = mean(someList2)
    xStandDev = standDev(someList1)
    yStandDev = standDev(someList2)
    # r numerator
    rNum = 0.0
    for i in range(len(someList1)):
        rNum += (someList1[i]-xMean)*(someList2[i]-yMean)
    # r denominator
    rDen = xStandDev * yStandDev
    r = rNum/rDen
    return r
print(correlCo([1,2,3,4,4,8], [3,3,4,5,8,9]))
Output
0.884782972876

Normally, according to the standard deviation formula, you should divide dev by the number of samples (the length of the list) before taking the square root, right?
I mean:
dev += ((someList[i]-listMean)**2)/len(someList)

Your standard deviation is wrong: you forgot to take the square root.
That function actually returns the variance, not the standard deviation.
#DeathPox
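A minimal standard-library sketch of the z-score approach from the question with that fix applied. The names sample_std and correl_co are illustrative. Besides taking the square root, it divides by n - 1 rather than n, an assumption made here so that the standard deviation matches the 1/(n - 1) factor in the final step:

```python
import math


def mean(values):
    return sum(float(v) for v in values) / len(values)


def sample_std(values):
    """Sample standard deviation: square root of the (n - 1)-normalised variance."""
    m = mean(values)
    squared = sum((float(v) - m) ** 2 for v in values)
    return math.sqrt(squared / (len(values) - 1))


def correl_co(xs, ys):
    """Pearson r as the sum of z-score products divided by (n - 1)."""
    x_mean, y_mean = mean(xs), mean(ys)
    x_std, y_std = sample_std(xs), sample_std(ys)
    total = sum(
        ((float(x) - x_mean) / x_std) * ((float(y) - y_mean) / y_std)
        for x, y in zip(xs, ys)
    )
    return total / (len(xs) - 1)


r = correl_co([1, 2, 3, 4, 4, 8], [3, 3, 4, 5, 8, 9])
print(round(r, 4))  # 0.8848
```

If the standard deviation divided by n while the last step divided by n - 1, the two normalisations would not cancel and the result for this data would come out above 1.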
