I want to create a plot of (T/Tmax vs R/R0) for different values of Pa in a single plot like below. I have written this code that appends values to a list but all values of (T/Tmax vs R/R0) are appended in single list which does not give a good plot. What can I do to have such a plot? Also how can I make an excel sheet from the data from the loop where column 1 is T/Tmax list and column 2,3,4...are corresponding R/R0 values for different pa?
KLMDAT1 = []
KLMDAT2 = []
for j in range(z):
pa[j] = 120000-10000*j
i = 0
R = R0
q = 0
T = 0
while (T<Tmax):
k1 = KLM_RKM(i*dT,R,q,pa[j])
k2 = KLM_RKM((i+0.5)*dT,R,q+0.5*dT*k1,pa[j])
k3 = KLM_RKM((i+0.5)*dT,R,q+0.5*dT*k2,pa[j])
k4 = KLM_RKM((i+1)*dT,R,q+dT*k3,pa[j])
q = q +1/6.0*dT*(k1+2*k2+2*k3+k4)
R = R+dT*q
if(R>0):
KLMDAT1.append(T / Tmax)
KLMDAT2.append(R / R0)
if(R>Rmax):
Rmax = R
if (abs(q)>c or R < 0):
break
T=T+dT
i = i+1
wb.save('KLM.xlsx')
np.savetxt('KLM.csv',[KLMDAT1, KLMDAT2])
plt.plot(KLMDAT1, KLMDAT2)
plt.show()
You are plotting it wrong. Your first variable needs to be T/Tmax. So initialize an empty T list, append T values to it, divide it by Tmax, and then plot twice: first KLMDAT1 and then KLMDAT2. Following pseudocode explains it
KLMDAT1 = []
KLMDAT2 = []
T_list = [] # <--- Initialize T list here
for j in range(z):
...
while (T<Tmax):
...
T=T+dT
T_list.append(T) # <--- Append T here
i = i+1
# ... rest of the code
plt.plot(np.array(T_list)/Tmax, KLMDAT1) # <--- Changed here
plt.plot(np.array(T_list)/Tmax, KLMDAT2) # <--- Changed here
plt.show()
Related
Let me start by saying that I know this error message has posts about it, but I'm not sure what's wrong with my code. The block of code works just fine for the first two loops, but then fails. I've even tried removing the first two loops from the data to rule out issues in the 3rd loop, but no luck. I did have it set to print out the unsorted temporary list, and it just prints an empty array for the 3rd loop.
Sorry for the wall of comments in my code, but I'd rather have each line commented than cause confusion over what I'm trying to accomplish.
TL;DR: I'm trying to find and remove outliers from a list of data, but only for groups of entries that have the same number in column 0.
Pastebin with data
import numpy as np, csv, multiprocessing as mp, mysql.connector as msc, pandas as pd
import datetime
#Declare unsorted data array
d_us = []
#Declare temporary array for use in loop
tmp = []
#Declare sorted data array
d = []
#Declare Sum variable
tot = 0
#Declare Mean variable
m = 0
#declare sorted final array
sort = []
#Declare number of STDs
t = 1
#Declare Standard Deviation variable
std = 0
#Declare z-score variable
z_score
#Timestamp for output files
nts = datetime.datetime.now().timestamp()
#Create output file
with open(f"calib_temp-{nts}.csv", 'w') as ctw:
pass
#Read data from CSV
with open("test.csv", 'r', newline='') as drh:
fr_rh = csv.reader(drh, delimiter=',')
for row in fr_rh:
#append data to unsorted array
d_us.append([float(row[0]),float(row[1])])
#Sort array by first column
d = np.sort(d_us)
#Calculate the range of the data
l = round((d[-1][0] - d[0][0]) * 10)
#Declare the starting value
s = d[0][0]
#Declare the ending value
e = d[-1][0]
#Set the while loop counter
n = d[0][0]
#Iterate through data
while n <= e:
#Create array with difference column
for row in d:
if row[0] == n:
diff = round(row[0] - row[1], 1)
tmp.append([row[0],row[1],diff])
#Convert to numpy array
tmp = np.array(tmp)
#Sort numpy array
sort = tmp[np.argsort(tmp[:,2])]
#Calculate sum of differences
for row in tmp:
tot = tot + row[2]
#Calculate mean
m = np.mean(tot)
#Calculate Standard Deviation
std = np.std(tmp[:,2])
#Calculate outliers and write to output file
for y in tmp:
z_score = (y[2] - m)/std
if np.abs(z_score) > t:
with open(f"calib_temp-{nts}.csv", 'a', newline='') as ct:
c = csv.writer(ct, delimiter = ',')
c.writerow([y[0],y[1]])
#Reset Variables
tot = 0
m = 0
n = n + 0.1
tmp = []
std = 0
z_score = 0
Do this before the loop:
#Create output file
ct = open(f"calib_temp-{nts}.csv", 'w')
c = csv.writer(ct, delimiter = ',')
Then change the loop to this. Note that I have moved your initializations to the top of the loop, so you don't need to initialize them twice. Note the if tmp: line, which solves the numpy exception.
#Iterate through data
while n <= e:
tot = 0
m = 0
tmp = []
std = 0
z_score = 0
#Create array with difference column
for row in d:
if row[0] == n:
diff = round(row[0] - row[1], 1)
tmp.append([row[0],row[1],diff])
#Sort numpy array
if tmp:
#Convert to numpy array
tmp = np.array(tmp)
sort = tmp[np.argsort(tmp[:,2])]
#Calculate sum of differences
for row in tmp:
tot = tot + row[2]
#Calculate mean
m = np.mean(tot)
#Calculate Standard Deviation
std = np.std(tmp[:,2])
#Calculate outliers and write to output file
for y in tmp:
z_score = (y[2] - m)/std
if np.abs(z_score) > t:
c.writerow([y[0],y[1]])
#Reset Variables
n = n + 0.1
I have the following program, it seems that the amp and period at the end print out a list of list(see below). And I am unable to plot them (I want to plot period against amp)
I have tried methods in How to make a flat list out of list of lists? to combine the output of amp and period so that they are plot-table, but nothing worked.
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import solve_ivp
def derivatives(t,y,q,F):
return [y[1], -np.sin(y[0])-q*y[1]+F*np.sin((2/3)*t)]
t = np.linspace(0.0, 100, 10000)
#initial conditions
theta0 = np.linspace(0.0,np.pi,100)
q = 0.0 #alpha / (mass*g), resistive term
F = 0.0 #G*np.sin(2*t/3)
for i in range (0,100):
sol = solve_ivp(derivatives, (0.0,100.0), (theta0[i], 0.0), method = 'RK45', t_eval = t,args = (q,F))
velocity = sol.y[1]
time = sol.t
zero_cross = 0
value = []
amp = []
period = []
for k in range (len(velocity)-1):
if (velocity[k+1]*velocity[k]) < 0:
zero_cross += 1
value.append(k)
else:
zero_cross += 0
zero_cross = zero_cross - zero_cross % 2 # makes the total number of zero-crossings even
if zero_cross != 0:
amp.append(theta0[i])
# period calculated using the time evolved between the first and last zero-crossing detected
period.append((2*(time[value[zero_cross - 1]] - time[value[0]])) / (zero_cross -1))
If I print out amp inside the loop, it displays as follows:
[0.03173325912716963]
[0.06346651825433926]
[0.0951997773815089]
[0.12693303650867852]
[0.15866629563584814]
[0.1903995547630178]
[0.2221328138901874]
[0.25386607301735703]
[0.28559933214452665]
[0.3173325912716963]
[0.3490658503988659]
[0.3807991095260356]
[0.4125323686532052]
[0.4442656277803748]
[0.47599888690754444]
[0.5077321460347141]
[0.5394654051618837]
[0.5711986642890533]
[0.6029319234162229]
[0.6346651825433925]
[0.6663984416705622]
[0.6981317007977318]
[0.7298649599249014]
[0.7615982190520711]
[0.7933314781792408]
[0.8250647373064104]
[0.85679799643358]
[0.8885312555607496]
[0.9202645146879193]
[0.9519977738150889]
[0.9837310329422585]
[1.0154642920694281]
[1.0471975511965979]
[1.0789308103237674]
[1.110664069450937]
[1.1423973285781066]
[1.1741305877052763]
[1.2058638468324459]
[1.2375971059596156]
[1.269330365086785]
[1.3010636242139548]
[1.3327968833411243]
[1.364530142468294]
[1.3962634015954636]
[1.4279966607226333]
[1.4597299198498028]
[1.4914631789769726]
[1.5231964381041423]
[1.5549296972313118]
[1.5866629563584815]
[1.618396215485651]
[1.6501294746128208]
[1.6818627337399903]
[1.71359599286716]
[1.7453292519943295]
[1.7770625111214993]
[1.8087957702486688]
[1.8405290293758385]
[1.872262288503008]
[1.9039955476301778]
[1.9357288067573473]
[1.967462065884517]
[1.9991953250116865]
[2.0309285841388562]
[2.0626618432660258]
[2.0943951023931957]
[2.126128361520365]
[2.1578616206475347]
[2.1895948797747042]
[2.221328138901874]
[2.2530613980290437]
[2.284794657156213]
[2.3165279162833827]
[2.3482611754105527]
[2.379994434537722]
[2.4117276936648917]
[2.443460952792061]
[2.475194211919231]
[2.5069274710464007]
[2.53866073017357]
[2.57039398930074]
[2.6021272484279097]
[2.633860507555079]
[2.6655937666822487]
[2.6973270258094186]
[2.729060284936588]
[2.7607935440637577]
[2.792526803190927]
[2.824260062318097]
[2.8559933214452666]
[2.887726580572436]
[2.9194598396996057]
[2.9511930988267756]
[2.982926357953945]
[3.0146596170811146]
[3.141592653589793]
[Finished in 3.822s]
I am not sure what type of output that is and how to handle, any help would be appreciated!
You are declaring the lists inside the loop, which means they will be reset to empty at every iteration. Consider declaring amp, period, and any array that should be set to empty only once (as initial state) before the loop, like so:
#initialize arrays, executes only once before the loop
amp = []
period = []
for i in range (0,100):
#your logic here, plus appending values to `amp` and `period`
#now `amp` and `period` should contain all desired values
I have a fairly long code that processes spectra, and along the way I need an interpolation of some points. I used to have all this code written line-by-line without any functions, and it all worked properly, but now I'm converting it to two large functions so that I can call it on other models more easily in the future. Below is my code (I have more code after the last line here that plots some things, but that's not relevant to my issue, since I've tested this with a bunch of print lines and learned that my issue arises when I call the interpolation function inside my process function.
import re
import numpy as np
import scipy.interpolate
# Required files and lists
filename = 'bpass_spectra.txt' # number of columns = 4
extinctionfile = 'ExtinctionLawPoints.txt' # R_V = 4.0
datalist = []
if filename == 'bpass_spectra.txt':
filetype = 4
else:
filetype = 1
if extinctionfile == 'ExtinctionLawPoints.txt':
R_V = 4.0
else:
R_V = 1.0 #to be determined
# Constants
h = 4.1357e-15 # Planck's constant [eV s]
c = float(3e8) # speed of light [m/s]
# Inputs
beta = 2.0 # power used in extinction law
R = 1.0 # star formation rate [Msun/yr]
z = 1.0 # redshift
M_gas = 1.0 # mass of gas
M_halo = 2e41 # mass of dark matter halo
# Read spectra file
f = open(filename, 'r')
rawlines = f.readlines()
met = re.findall('Z\s=\s(\d*\.\d+)', rawlines[0])
del rawlines[0]
for i in range(len(rawlines)):
newlist = rawlines[i].split(' ')
datalist.append(newlist)
# Read extinction curve data file
rawpoints = open(extinctionfile, 'r').readlines()
def interpolate(R_V, rawpoints, Elist, i):
pointslist = []
if R_V == 4.0:
for i in range(len(rawpoints)):
newlst = re.split('(?!\S)\s(?=\S)|(?!\S)\s+(?=\S)', rawpoints[i])
pointslist.append(newlst)
pointslist = pointslist[3:]
lambdalist = [float(item[0]) for item in pointslist]
k_abslist = [float(item[4]) for item in pointslist]
xvallist = [(c*h)/(lamb*1e-6) for lamb in lambdalist]
k_interp = scipy.interpolate.interp1d(xvallist, k_abslist)
return k_interp(Elist[i])
# Processing function
def process(interpolate, filetype, datalist, beta, R, z, M_gas, M_halo, met):
speclist = []
if filetype == 4:
metallicity = float(met[0])
Elist = [float(item[0]) for item in datalist]
speclambdalist = [h*c*1e9/E for E in Elist]
met1list = [float(item[1]) for item in datalist]
speclist.extend(met1list)
klist, Tlist = [None]*len(speclist), [None]*len(speclist)
if metallicity > 0.0052:
DGRlist = [50.0*np.exp(-2.21)*metallicity]*len(speclist) # dust to gas ratio
elif metallicity <= 0.0052:
DGRlist = [((50.0*metallicity)**3.15)*np.exp(-0.96)]*len(speclist)
for i in range(len(speclist)):
if Elist[i] <= 4.1357e-3: # frequencies <= 10^12 Hz
klist[i] = 0.1*(float(Elist[i])/(1000.0*h))**beta # extinction law [cm^2/g]
elif Elist[i] > 4.1357e-3: # frequencies > 10^12 Hz
klist[i] = interpolate(R_V, rawpoints, Elist, i) # interpolated function's value at Elist[i]
print "KLIST (INTERPOLATION) ELEMENTS 0 AND 1000:", klist[0], klist[1000]
return
The output from the print line is KLIST (INTERPOLATION) ELEMENTS 0 AND 1000: 52167.31734159269 52167.31734159269.
When I run my old code without functions, I print klist[0] and klist[1000] like I do here and get different values for each. In this new code, I get back two values that are the same from this line. This shouldn't be the case, so it must not be interpolating correctly inside my function (maybe it's not performing it on each point correctly in the loop?). Does anyone have any insight? It would be unreasonable to post my entire code with all the used text files here (they're very large), so I'm not expecting anyone to run it, but rather examine how I use and call my functions.
Edit: Below is the original version of my code up to the interpolation point without the functions (which works).
import re
import numpy as np
import scipy.interpolate
filename = 'bpass_spectra.txt'
extinctionfile = 'ExtinctionLawPoints.txt' # from R_V = 4.0
pointslist = []
datalist = []
speclist = []
# Constants
h = 4.1357e-15 # Planck's constant [eV s]
c = float(3e8) # speed of light [m/s]
# Read spectra file
f = open(filename, 'r')
rawspectra = f.readlines()
met = re.findall('Z\s=\s(\d*\.\d+)', rawspectra[0])
del rawspectra[0]
for i in range(len(rawspectra)):
newlist = rawspectra[i].split(' ')
datalist.append(newlist)
# Read extinction curve data file
rawpoints = open(extinctionfile, 'r').readlines()
for i in range(len(rawpoints)):
newlst = re.split('(?!\S)\s(?=\S)|(?!\S)\s+(?=\S)', rawpoints[i])
pointslist.append(newlst)
pointslist = pointslist[3:]
lambdalist = [float(item[0]) for item in pointslist]
k_abslist = [float(item[4]) for item in pointslist]
xvallist = [(c*h)/(lamb*1e-6) for lamb in lambdalist]
k_interp = scipy.interpolate.interp1d(xvallist, k_abslist)
# Create new lists
Elist = [float(item[0]) for item in datalist]
speclambdalist = [h*c*1e9/E for E in Elist]
z1list = [float(item[1]) for item in datalist]
speclist.extend(z1list)
met = met[0]
klist = [None]*len(speclist)
Loutlist = [None]*len(speclist)
Tlist = [None]*len(speclist)
# Define parameters
b = 2.0 # power used in extinction law (beta)
R = 1.0 # star formation ratw [Msun/yr]
z = 1.0 # redshift
Mgas = 1.0 # mass of gas
Mhalo = 2e41 # mass of dark matter halo
if float(met) > 0.0052:
DGRlist = [50.0*np.exp(-2.21)*float(met)]*len(speclist)
elif float(met) <= 0.0052:
DGRlist = [((50.0*float(met))**3.15)*np.exp(-0.96)]*len(speclist)
for i in range(len(speclist)):
if float(Elist[i]) <= 4.1357e-3: # frequencies <= 10^12 Hz
klist[i] = 0.1*(float(Elist[i])/(1000.0*h))**b # extinction law [cm^2/g]
elif float(Elist[i]) > 4.1357e-3: # frequencies > 10^12 Hz
klist[i] = k_interp(Elist[i]) # interpolated function's value at Elist[i]
print "KLIST (INTERPOLATION) ELEMENTS 0 AND 1000:", klist[0], klist[1000]
The output from this print line is KLIST (INTERPOLATION) ELEMENTS 0 AND 1000 7779.275435560996 58253.589270674354.
You are passing i as an argument to interpolate, and then also using i in a loop within interpolate. Once i is used within the for i in range(len(rawpoints)) loop in interpolate, it will be set to some value: len(rawpoints)-1. The interpolate function will then always return the same value k_interp(Elist[i]), which is equivalent to k_interp(Elist[len(rawpoints)-1]). You will need to either define a new variable within your loop (e.g. for not_i in range(len(rawpoints))), or use a different variable for the Elist argument. Consider the following change to interpolate:
def interpolate(R_V, rawpoints, Elist, j):
pointslist = []
if R_V == 4.0:
for i in range(len(rawpoints)):
newlst = re.split('(?!\S)\s(?=\S)|(?!\S)\s+(?=\S)', rawpoints[i])
pointslist.append(newlst)
pointslist = pointslist[3:]
lambdalist = [float(item[0]) for item in pointslist]
k_abslist = [float(item[4]) for item in pointslist]
xvallist = [(c*h)/(lamb*1e-6) for lamb in lambdalist]
k_interp = scipy.interpolate.interp1d(xvallist, k_abslist)
return k_interp(Elist[j])
I've generated a variable names for pN from N=0 to n:
p0, p1, p2, p3, p4,....,pn
So far I got:
P = ['p0', 'p1', 'p2',..., 'pn']
What I want to do now is convert them into variable names to be called upon and used in a function.
Convert 'p0' to p0, and so on for all of the values in
P = ['p0', 'p1', 'p2',..., 'pn']
So it becomes
P = [p0, p1, p2,..., pn]
Then generate functions using the variables:
Fi = P[i]/2
ie:
F0 = p0/2
F1 = p1/2
F2 = p2/2
Problem is I don't know how to convert 'pn' to pn properly, then generate functions based on pn. How do I do that? Or rather, what am I doing wrong?
Here's what I got:
# Define the constants
# ====================
k = 1
m = 1
# Generate the names
# ==================
n = 10 # Number of terms
P = [] # Array for momenta names
Q = [] # Array for positon names
for i in range (n): # Generate the names for
p = "p"+str(i) # momentum
q = "q"+str(i) # positon
# Convert the names into variales
exec("%s = %d" % p)
exec("%s = %d" % q)
# Put the names into the arrays
P.append(p)
Q.append(q)
# Print the names out for error checking
print(P)
print(Q)
# Generate the functions
# ======================
Q_dot = []
for i in range(n):
def Q_dot(P[i]):
q_dot = P[i] / m
Q_dot.append(q_dot)
i += 1
Note: I want to keep my variables as variables because I'm using the functions to solve a system of differential equations. My goal is to create a list of differential equations with the same format.
Question: How could I peform the following task more efficiently?
My problem is as follows. I have a (large) 3D data set of points in real physical space (x,y,z). It has been generated by a nested for loop that looks like this:
# Generate given dat with its ordering
x_samples = 2
y_samples = 3
z_samples = 4
given_dat = np.zeros(((x_samples*y_samples*z_samples),3))
row_ind = 0
for z in range(z_samples):
for y in range(y_samples):
for x in range(x_samples):
row = [x+.1,y+.2,z+.3]
given_dat[row_ind,:] = row
row_ind += 1
for row in given_dat:
print(row)`
For the sake of comparing it to another set of data, I want to reorder the given data into my desired order as follows (unorthodox, I know):
# Generate data with desired ordering
x_samples = 2
y_samples = 3
z_samples = 4
desired_dat = np.zeros(((x_samples*y_samples*z_samples),3))
row_ind = 0
for z in range(z_samples):
for x in range(x_samples):
for y in range(y_samples):
row = [x+.1,y+.2,z+.3]
desired_dat[row_ind,:] = row
row_ind += 1
for row in desired_dat:
print(row)
I have written a function that does what I want, but it is horribly slow and inefficient:
def bad_method(x_samp,y_samp,z_samp,data):
zs = np.unique(data[:,2])
xs = np.unique(data[:,0])
rowlist = []
for z in zs:
for x in xs:
for row in data:
if row[0] == x and row[2] == z:
rowlist.append(row)
new_data = np.vstack(rowlist)
return new_data
# Shows that my function does with I want
fix = bad_method(x_samples,y_samples,z_samples,given_dat)
print('Unreversed data')
print(given_dat)
print('Reversed Data')
print(fix)
# If it didn't work this will throw an exception
assert(np.array_equal(desired_dat,fix))
How could I improve my function so it is faster? My data sets usually have roughly 2 million rows. It must be possible to do this with some clever slicing/indexing which I'm sure will be faster but I'm having a hard time figuring out how. Thanks for any help!
You could reshape your array, swap the axes as necessary and reshape back again:
# (No need to copy if you don't want to keep the given_dat ordering)
data = np.copy(given_dat).reshape(( z_samples, y_samples, x_samples, 3))
# swap the "y" and "x" axes
data = np.swapaxes(data, 1,2)
# back to 2-D array
data = data.reshape((x_samples*y_samples*z_samples,3))
assert(np.array_equal(desired_dat,data))