I am trying to read a file using Python. I want to collect the values that follow 'AVERAGE VALUE' and 'STANDARD DEVIATION' under each 'DISTRIBUTION OF OFFSET_0' heading into lists that can be written to a CSV file for analysis.
A part of the file is given below.
DISTRIBUTION OF BLB_FLIP_0
* NOMINAL VALUE = 5.0000E-01
* AVERAGE VALUE = 5.0000E-01
* STANDARD DEVIATION = 0.0 ( 0.0%)
* STANDARD DEVIATION BASED ON NOMINAL RUN = 0.0 ( 0.0%)
DISTRIBUTION OF OFFSET_0
* NOMINAL VALUE = 4.0000E-03
* AVERAGE VALUE = 1.4000E-02
* STANDARD DEVIATION = 1.9987E-02 (142.8%)
* STANDARD DEVIATION BASED ON NOMINAL RUN = 2.0484E-02 (512.1%)
EXTRACT for TRANSIENT ANALYSIS
PARAM TEMP = -4.0000E+01
PARAM VH = 1.3200E+00
TEMPERATURE = -4.0000E+01 Celsius
*T_OUT_F10_0 = 4.0302E-04
*BL_FLIP_0 = 1.1210E+00
*BLB_FLIP_0 = 1.1200E+00
*OFFSET_0 = 1.0000E-03
DISTRIBUTION OF T_OUT_F10_0
* NOMINAL VALUE = 4.0302E-04
* AVERAGE VALUE = 4.3982E-04
* STANDARD DEVIATION = 3.5741E-05 ( 8.1%)
* STANDARD DEVIATION BASED ON NOMINAL RUN = 4.8746E-05 (12.1%)
DISTRIBUTION OF BL_FLIP_0
* NOMINAL VALUE = 1.1210E+00
* AVERAGE VALUE = 1.1394E+00
* STANDARD DEVIATION = 1.7869E-02 ( 1.6%)
* STANDARD DEVIATION BASED ON NOMINAL RUN = 2.4372E-02 ( 2.2%)
DISTRIBUTION OF BLB_FLIP_0
* NOMINAL VALUE = 1.1200E+00
* AVERAGE VALUE = 1.1200E+00
* STANDARD DEVIATION = 0.0 ( 0.0%)
* STANDARD DEVIATION BASED ON NOMINAL RUN = 0.0 ( 0.0%)
DISTRIBUTION OF OFFSET_0
* NOMINAL VALUE = 1.0000E-03
* AVERAGE VALUE = 1.9400E-02
* STANDARD DEVIATION = 1.7869E-02 (92.1%)
* STANDARD DEVIATION BASED ON NOMINAL RUN = 2.4372E-02 (2437.2%)
However, when I run the following Python code, the value gets stuck at the first place where the loop finds 'DISTRIBUTION OF OFFSET_0'; that is, the index does not change as I read through the list of lines. I am unable to determine the error. Any lead is appreciated.
import csv
import numpy as np
i_file = open ("setup.aex", "r")
file_matrix = i_file.readlines()
#print (file_matrix)
mean = []
sd = []
for i in file_matrix:
    print(file_matrix.index (i))
    if ('DISTRIBUTION OF OFFSET_0' in i) & ('AVERAGE VALUE' in (file_matrix.index(i)+2)):
        print(file_matrix.index(i))
        mean_loc = file_matrix[(file_matrix.index(i))+2]
        mean_el = float(mean_loc.split ("AVERAGE VALUE = ")[1])
        print (mean_el)
        mean.append(mean_el)
        sd_loc = file_matrix[(file_matrix.index(i))+3]
        sd_inter =sd_loc.split (" * STANDARD DEVIATION = ")[1]
        sd_el = float (sd_inter.split (" (")[0])
        print (sd_el)
        sd.append(sd_el)
print (mean)
print (sd)
Thanks in Advance.
You can fix your code according to the comments. In case you want to take this project any further, I've made a more streamlined version to get the data. It gradually consumes the input and builds a dictionary with all the data in it.
import json

i_file = open("log.log", "r")
data = {}

def linevalue(line):
    # Split "* NAME = VALUE" into a cleaned (name, value) pair.
    name, value = line.split("=")
    name = name.replace("*", "").strip()
    value = value.strip()
    return name, value

try:
    while line := next(i_file):
        indent = len(line) - len(line.lstrip())
        line = line.strip()
        is_value = "=" in line
        if indent == 0 and len(line.strip()) > 0:
            title = line
            data[title] = {}
        elif (indent == 2 or indent == 4) and is_value:
            name, value = linevalue(line)
            data[title][name] = value
        elif indent == 4: #is a title
            name = line
            data[title][name] = {}
            for i in range(4):
                line = next(i_file)
                sub_name, sub_value = linevalue(line)
                data[title][name][sub_name] = sub_value
except StopIteration:
    pass
finally:
    i_file.close()  # close the file; del would only remove the name

with open("output.json", "w") as fp:
    json.dump(data, fp)
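As for why the index in the question never moves: list.index(i) returns the position of the first element equal to i, and every 'DISTRIBUTION OF OFFSET_0' line is identical, so each lookup lands on the first block again. A minimal sketch of a direct fix is shown below. It tracks the current block while iterating instead of calling index. The function name extract_offset_stats and the in-memory SAMPLE are illustrative assumptions, not part of the original code; the sample lines are copied from the excerpt in the question.

```python
import csv
import io

# Two OFFSET_0 blocks copied from the excerpt in the question.
SAMPLE = [
    " DISTRIBUTION OF BLB_FLIP_0",
    " * AVERAGE VALUE = 5.0000E-01",
    " * STANDARD DEVIATION = 0.0 ( 0.0%)",
    " DISTRIBUTION OF OFFSET_0",
    " * NOMINAL VALUE = 4.0000E-03",
    " * AVERAGE VALUE = 1.4000E-02",
    " * STANDARD DEVIATION = 1.9987E-02 (142.8%)",
    " * STANDARD DEVIATION BASED ON NOMINAL RUN = 2.0484E-02 (512.1%)",
    " EXTRACT for TRANSIENT ANALYSIS",
    " *OFFSET_0 = 1.0000E-03",
    " DISTRIBUTION OF OFFSET_0",
    " * NOMINAL VALUE = 1.0000E-03",
    " * AVERAGE VALUE = 1.9400E-02",
    " * STANDARD DEVIATION = 1.7869E-02 (92.1%)",
]

def extract_offset_stats(lines, header="DISTRIBUTION OF OFFSET_0"):
    """Collect mean and standard deviation from every block under `header`."""
    means, sds = [], []
    in_block = False
    for line in lines:
        text = line.strip()
        if text.startswith("DISTRIBUTION OF"):
            in_block = (text == header)      # entering a new block
        elif not text.startswith("*"):
            in_block = False                 # any other heading ends the block
        elif in_block and "=" in text:
            name, value = text.lstrip("* ").split("=", 1)
            name = name.strip()
            if name == "AVERAGE VALUE":
                means.append(float(value))
            elif name == "STANDARD DEVIATION":
                # drop the trailing "(142.8%)" part before converting
                sds.append(float(value.split("(")[0]))
    return means, sds

means, sds = extract_offset_stats(SAMPLE)

# Write the pairs as CSV; an in-memory buffer stands in for a real file here.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["mean", "sd"])
writer.writerows(zip(means, sds))
csv_text = buffer.getvalue()
print(csv_text)
```

With a real file, passing the open file object (for example the result of open("setup.aex")) in place of SAMPLE should work the same way, since the function only iterates over lines.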
I am trying to print the mean and standard deviation; however, in its current form the code doesn't recognize anything defined inside the function. How would I go about correcting this so that it displays what is intended? When I try to print the mean, it says ex is not defined.
import numpy as np
p = 0.44
q = 0.56
mu_1 = 26.5
sigma = 4.3
mu_2 = 76.4
n = 7
print( 'total number of jobs =', n)
lst_times = []
j = 0
def calc_avg_std(n):
    while j < 100:
        m = np.random.binomial(n,p)
        easy_jobs = np.random.normal(mu_1,sigma,m)
        n_chall = n-m
        chall_jobs = np.random.exponential(mu_2,n_chall)
        totalTime = sum(easy_jobs) + sum(chall_jobs)
        lst_times.append(totalTime)
        j = j + 1
    ex = (mu_1 * p) + (mu_2 * q)
    ex2 = (p *((mu_1**2)))+ (q*(mu_2**2)*2)
    var = ex2-(ex**2)
    stdev = np.sqrt(var)
    return [ex , stdev]
print(' mean is',ex)
I tried this code without the def and return and it runs properly, but the professor insists that they should be used.
def is used to create a function. When you use return, you return the values to the caller.
Replace your last print with the following lines:
# call the function and keep the return values
mean, stdev = calc_avg_std(n)
# print the returned values
print(mean)
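Note that the function as posted also assigns to j, which makes j local to the function, so reading it in the while condition would fail before the loop starts. A minimal sketch of a version that keeps its counter and list local is given below. It swaps NumPy's random functions for the standard library random module so that it runs without extra dependencies, and the trials and seed parameters are assumptions added for illustration. The formulas for ex and ex2 are kept exactly as in the question.

```python
import math
import random

P, Q = 0.44, 0.56
MU_1, SIGMA, MU_2 = 26.5, 4.3, 76.4

def calc_avg_std(n, trials=100, seed=None):
    """Simulate `trials` runs and return (mean, stdev, simulated_times)."""
    rng = random.Random(seed)
    times = []  # local list instead of a module-level one
    for _ in range(trials):  # local loop counter, no global j needed
        # Binomial(n, P) draw built from n Bernoulli trials.
        m = sum(rng.random() < P for _ in range(n))
        easy_jobs = [rng.gauss(MU_1, SIGMA) for _ in range(m)]
        # expovariate takes a rate, i.e. 1 / mean.
        chall_jobs = [rng.expovariate(1.0 / MU_2) for _ in range(n - m)]
        times.append(sum(easy_jobs) + sum(chall_jobs))
    # Same formulas as in the question.
    ex = (MU_1 * P) + (MU_2 * Q)
    ex2 = (P * MU_1**2) + (Q * MU_2**2 * 2)
    stdev = math.sqrt(ex2 - ex**2)
    return ex, stdev, times

# The caller receives the values through the return statement.
mean, stdev, times = calc_avg_std(7, seed=0)
print("mean is", mean)
print("standard deviation is", stdev)
```

Returning the simulated times alongside the two statistics is a design choice for this sketch: it avoids relying on a global list and lets the caller compare the simulation against the formula-based values.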
I want to create a program that asks for input until an empty line is entered. The user input is stored in a list, and the program then calculates the mean and standard deviation of the inputs. I wrote the following code, but it shows some data type errors for the stdev function.
def main():
    print("Enter the data, one value per line.\nEnd by entering empty line.")
    a = []
    prompt = ""
    line = input(prompt)
    while line:
        a.append(float(line))
        line = input(prompt)
    meanfunction(a)
    stdev(a)
    print("The mean of given data was: ",meanfunction(a))
    print("The standard deviation of given data was: ",stdev(a))
def meanfunction(data):
    average = sum(data) / len(data)
    average_f = "{:.2f}".format(average)
    return average_f
def variance(data):
    n = len(data)
    mean = sum(data) / n
    deviations = [(x - mean) ** 2 for x in data]
    variance = sum(deviations) / (n)
    variance_f = "{:.2f}".format(variance)
    return variance_f
def stdev(data):
    import math
    var = variance(data)
    std_dev = math.sqrt(var)
    return std_dev
if __name__ == "__main__":
    main()
Having fixed your indentation, the issue seems to be the format string in variance(data) just before the return line. You use the output of variance as an input to the stdev function, but variance returns a string. meanfunction does the same thing.
Generally, for these mathematical functions, it is best to have them do only what they are supposed to do: return a number, as you already do in stdev. Deal with making it pretty when it comes to actually printing it to the screen.
Also, making variable names more descriptive than "a" is nice, especially when we come back to look at our old code! Lastly, we usually want to put imports at the very top.
import math

def main():
    print("Enter the data, one value per line.\n"
          "End by entering an empty line.")
    user_values = []
    prompt = ""
    line = input(prompt)
    while line:
        user_values.append(float(line))
        line = input(prompt)
    # Format only at print time; the functions themselves return numbers.
    print(f"The mean of the given data was: {meanfunction(user_values):.2f} ")
    print(f"The standard deviation of the given data was: {stdev(user_values):.2f}")

def meanfunction(data):
    average = sum(data) / len(data)
    return average

def variance(data):
    n = len(data)
    mean = sum(data) / n
    deviations = [(x - mean) ** 2 for x in data]
    # Sample variance: divides by n - 1, so it needs at least two values.
    variance = sum(deviations) / (n - 1)
    return variance

def stdev(data):
    var = variance(data)
    std_dev = math.sqrt(var)
    return std_dev

if __name__ == "__main__":
    main()
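One detail worth noticing is that the rewritten variance divides by n - 1 (the sample variance), whereas the code in the question divided by n (the population variance). As a quick cross-check, the standard library's statistics module offers both variants. The sketch below uses a made-up list of values purely for illustration.

```python
import statistics

# Hypothetical input values, chosen only to make the arithmetic easy to follow.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.mean(values)
sample_sd = statistics.stdev(values)       # divides by n - 1
population_sd = statistics.pstdev(values)  # divides by n

print(f"mean: {mean:.2f}")
print(f"sample standard deviation: {sample_sd:.2f}")
print(f"population standard deviation: {population_sd:.2f}")
```

Which of the two is appropriate depends on whether the entered values are treated as a sample or as the whole population.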
I'm trying to calculate the standard deviation of all the data in the "ClosePrices" column; see the pastebin https://pastebin.com/JtGr672m
We need to calculate one standard deviation over all 1029 floats.
This is my code:
ins1 = open("bijlage.txt", "r")
for line in ins1:
    numbers = [(n) for n in number_strings]
    i = i + 1
ClosePriceSD = []
ClosePrice = float(data[0][5].replace(',', '.'))
ClosePriceSD.append(ClosePrice)
def sd_calc(data):
    n = 1029
    if n <= 1:
        return 0.0
    mean, sd = avg_calc(data), 0.0
    # calculate stan. dev.
    for el in data:
        sd += (float(el) - mean)**2
    sd = math.sqrt(sd / float(n-1))
    return sd
def avg_calc(ls):
    n, mean = len(ls), 0.0
    if n <= 1:
        return ls[0]
    # calculate average
    for el in ls:
        mean = mean + float(el)
    mean = mean / float(n)
    return mean
print("Standard Deviation:")
print(sd_calc(ClosePriceSD))
print()
So what I'm trying to calculate is the standard deviation of all the floats under the "ClosePrices" column.
I have the line "ClosePrice = float(data[0][5].replace(',', '.'))", which I expected to feed all the floats under ClosePrice into the calculation, but it only uses data[0][5]. I want one standard deviation calculated from all 1029 floats under ClosePrice.
I think your error is in the for loop at the beginning. You have for line in ins1, but then you never use line inside the loop. In the loop you also use number_strings and data, which are not defined beforehand.
Here is how you can extract the data from your txt file.
with open("bijlage.txt", "r") as ff:
    ll = ff.readlines()  # read the file into a list, one element per line
data = []
for line in ll[1:]:  # skip the first line, which is a header
    d = line.split(';')[5]  # split the line on semicolons and keep the element with index 5
    data.append(float(d.replace(',', '.')))  # replace the decimal comma with a dot and convert to float
print(data)  # data is a list with all the numbers you want
You should be able to calculate mean and standard deviation from here.
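As a rough end-to-end sketch of that last step, the example below parses a few semicolon-separated lines and passes the resulting list to the standard library's statistics module. The SAMPLE text, its column names other than ClosePrice, and its values are invented for illustration; the only details taken from the thread are the semicolon separator, the decimal comma, and ClosePrice sitting at index 5.

```python
import io
import statistics

# Hypothetical stand-in for bijlage.txt: a header line plus three data rows.
SAMPLE = """Date;Open;High;Low;Volume;ClosePrice
2017-01-02;10,0;10,5;9,8;1000;10,2
2017-01-03;10,2;10,6;10,1;1200;10,4
2017-01-04;10,4;10,9;10,3;900;10,6
"""

def read_close_prices(handle, column=5):
    """Return the values of one column as floats, skipping the header line."""
    lines = handle.read().splitlines()
    prices = []
    for line in lines[1:]:
        if not line.strip():
            continue  # ignore blank lines
        field = line.split(";")[column]
        prices.append(float(field.replace(",", ".")))
    return prices

close_prices = read_close_prices(io.StringIO(SAMPLE))
mean = statistics.mean(close_prices)
sd = statistics.stdev(close_prices)  # sample standard deviation (n - 1)

print("Mean:", mean)
print("Standard Deviation:", sd)
```

With the real file, an object returned by open("bijlage.txt") could be passed in place of the io.StringIO wrapper.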
You didn't really specify what the issue/error is. Although this probably doesn't help if it is a school project, you could install scipy, which has a standard deviation function. In this case, just put your array in as a parameter. Could you elaborate on what you're having trouble with? Is the current code giving an error?
Edit:
Looking at the data, you want the 6th element in each line (ClosePrice). If your function is working and all you need is a list of the ClosePrice values, this is what I would suggest.
import math

with open("bijlage.txt", "r") as ins1:
    lines = [line.rstrip('\n') for line in ins1]

data = []
for line in lines[1:]:  # skip the header line
    fields = line.split(';')
    # ClosePrice is the 6th field; convert the decimal comma to a dot
    data.append(float(fields[5].replace(',', '.')))

def sd_calc(data):
    n = len(data)  # use the actual count rather than a hard-coded 1029
    if n <= 1:
        return 0.0
    mean, sd = avg_calc(data), 0.0
    # calculate stan. dev.
    for el in data:
        sd += (float(el) - mean)**2
    sd = math.sqrt(sd / float(n-1))
    return sd

def avg_calc(ls):
    n, mean = len(ls), 0.0
    if n <= 1:
        return ls[0]
    # calculate average
    for el in ls:
        mean = mean + float(el)
    mean = mean / float(n)
    return mean

print("Standard Deviation:")
print(sd_calc(data))
print()
I have a fairly long script that processes spectra, and along the way I need an interpolation of some points. I used to have all this code written line-by-line without any functions, and it all worked properly, but now I'm converting it to two large functions so that I can call it on other models more easily in the future. Below is my code. I have more code after the last line here that plots some things, but that's not relevant to my issue: I've tested this with a bunch of print lines and learned that the issue arises when I call the interpolation function inside my process function.
import re
import numpy as np
import scipy.interpolate
# Required files and lists
filename = 'bpass_spectra.txt' # number of columns = 4
extinctionfile = 'ExtinctionLawPoints.txt' # R_V = 4.0
datalist = []
if filename == 'bpass_spectra.txt':
    filetype = 4
else:
    filetype = 1
if extinctionfile == 'ExtinctionLawPoints.txt':
    R_V = 4.0
else:
    R_V = 1.0 #to be determined
# Constants
h = 4.1357e-15 # Planck's constant [eV s]
c = float(3e8) # speed of light [m/s]
# Inputs
beta = 2.0 # power used in extinction law
R = 1.0 # star formation rate [Msun/yr]
z = 1.0 # redshift
M_gas = 1.0 # mass of gas
M_halo = 2e41 # mass of dark matter halo
# Read spectra file
f = open(filename, 'r')
rawlines = f.readlines()
met = re.findall('Z\s=\s(\d*\.\d+)', rawlines[0])
del rawlines[0]
for i in range(len(rawlines)):
    newlist = rawlines[i].split(' ')
    datalist.append(newlist)
# Read extinction curve data file
rawpoints = open(extinctionfile, 'r').readlines()
def interpolate(R_V, rawpoints, Elist, i):
    pointslist = []
    if R_V == 4.0:
        for i in range(len(rawpoints)):
            newlst = re.split('(?!\S)\s(?=\S)|(?!\S)\s+(?=\S)', rawpoints[i])
            pointslist.append(newlst)
    pointslist = pointslist[3:]
    lambdalist = [float(item[0]) for item in pointslist]
    k_abslist = [float(item[4]) for item in pointslist]
    xvallist = [(c*h)/(lamb*1e-6) for lamb in lambdalist]
    k_interp = scipy.interpolate.interp1d(xvallist, k_abslist)
    return k_interp(Elist[i])
# Processing function
def process(interpolate, filetype, datalist, beta, R, z, M_gas, M_halo, met):
    speclist = []
    if filetype == 4:
        metallicity = float(met[0])
        Elist = [float(item[0]) for item in datalist]
        speclambdalist = [h*c*1e9/E for E in Elist]
        met1list = [float(item[1]) for item in datalist]
        speclist.extend(met1list)
    klist, Tlist = [None]*len(speclist), [None]*len(speclist)
    if metallicity > 0.0052:
        DGRlist = [50.0*np.exp(-2.21)*metallicity]*len(speclist) # dust to gas ratio
    elif metallicity <= 0.0052:
        DGRlist = [((50.0*metallicity)**3.15)*np.exp(-0.96)]*len(speclist)
    for i in range(len(speclist)):
        if Elist[i] <= 4.1357e-3: # frequencies <= 10^12 Hz
            klist[i] = 0.1*(float(Elist[i])/(1000.0*h))**beta # extinction law [cm^2/g]
        elif Elist[i] > 4.1357e-3: # frequencies > 10^12 Hz
            klist[i] = interpolate(R_V, rawpoints, Elist, i) # interpolated function's value at Elist[i]
    print "KLIST (INTERPOLATION) ELEMENTS 0 AND 1000:", klist[0], klist[1000]
    return
The output from the print line is KLIST (INTERPOLATION) ELEMENTS 0 AND 1000: 52167.31734159269 52167.31734159269.
When I run my old code without functions, I print klist[0] and klist[1000] like I do here and get different values for each. In this new code, I get back two values that are the same from this line. This shouldn't be the case, so it must not be interpolating correctly inside my function (maybe it's not performing it on each point correctly in the loop?). Does anyone have any insight? It would be unreasonable to post my entire code with all the used text files here (they're very large), so I'm not expecting anyone to run it, but rather examine how I use and call my functions.
Edit: Below is the original version of my code up to the interpolation point without the functions (which works).
import re
import numpy as np
import scipy.interpolate
filename = 'bpass_spectra.txt'
extinctionfile = 'ExtinctionLawPoints.txt' # from R_V = 4.0
pointslist = []
datalist = []
speclist = []
# Constants
h = 4.1357e-15 # Planck's constant [eV s]
c = float(3e8) # speed of light [m/s]
# Read spectra file
f = open(filename, 'r')
rawspectra = f.readlines()
met = re.findall('Z\s=\s(\d*\.\d+)', rawspectra[0])
del rawspectra[0]
for i in range(len(rawspectra)):
    newlist = rawspectra[i].split(' ')
    datalist.append(newlist)
# Read extinction curve data file
rawpoints = open(extinctionfile, 'r').readlines()
for i in range(len(rawpoints)):
    newlst = re.split('(?!\S)\s(?=\S)|(?!\S)\s+(?=\S)', rawpoints[i])
    pointslist.append(newlst)
pointslist = pointslist[3:]
lambdalist = [float(item[0]) for item in pointslist]
k_abslist = [float(item[4]) for item in pointslist]
xvallist = [(c*h)/(lamb*1e-6) for lamb in lambdalist]
k_interp = scipy.interpolate.interp1d(xvallist, k_abslist)
# Create new lists
Elist = [float(item[0]) for item in datalist]
speclambdalist = [h*c*1e9/E for E in Elist]
z1list = [float(item[1]) for item in datalist]
speclist.extend(z1list)
met = met[0]
klist = [None]*len(speclist)
Loutlist = [None]*len(speclist)
Tlist = [None]*len(speclist)
# Define parameters
b = 2.0 # power used in extinction law (beta)
R = 1.0 # star formation rate [Msun/yr]
z = 1.0 # redshift
Mgas = 1.0 # mass of gas
Mhalo = 2e41 # mass of dark matter halo
if float(met) > 0.0052:
    DGRlist = [50.0*np.exp(-2.21)*float(met)]*len(speclist)
elif float(met) <= 0.0052:
    DGRlist = [((50.0*float(met))**3.15)*np.exp(-0.96)]*len(speclist)
for i in range(len(speclist)):
    if float(Elist[i]) <= 4.1357e-3: # frequencies <= 10^12 Hz
        klist[i] = 0.1*(float(Elist[i])/(1000.0*h))**b # extinction law [cm^2/g]
    elif float(Elist[i]) > 4.1357e-3: # frequencies > 10^12 Hz
        klist[i] = k_interp(Elist[i]) # interpolated function's value at Elist[i]
print "KLIST (INTERPOLATION) ELEMENTS 0 AND 1000:", klist[0], klist[1000]
The output from this print line is KLIST (INTERPOLATION) ELEMENTS 0 AND 1000 7779.275435560996 58253.589270674354.
You are passing i as an argument to interpolate and then also using i as the loop variable inside interpolate. Once the for i in range(len(rawpoints)) loop has run, i is left at len(rawpoints)-1, regardless of the value that was passed in. The interpolate function therefore always returns the same value, k_interp(Elist[len(rawpoints)-1]). You need to either use a different loop variable (e.g. for not_i in range(len(rawpoints))) or give the index parameter a different name. Consider the following change to interpolate:
def interpolate(R_V, rawpoints, Elist, j):
    pointslist = []
    if R_V == 4.0:
        for i in range(len(rawpoints)):
            newlst = re.split('(?!\S)\s(?=\S)|(?!\S)\s+(?=\S)', rawpoints[i])
            pointslist.append(newlst)
    pointslist = pointslist[3:]
    lambdalist = [float(item[0]) for item in pointslist]
    k_abslist = [float(item[4]) for item in pointslist]
    xvallist = [(c*h)/(lamb*1e-6) for lamb in lambdalist]
    k_interp = scipy.interpolate.interp1d(xvallist, k_abslist)
    return k_interp(Elist[j])
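The effect is easy to reproduce in isolation. The sketch below uses two hypothetical helper functions (pick_shadowed and pick_fixed, not part of the original code) and a short list to show how a loop that reuses the parameter name overwrites the argument, while a differently named parameter keeps its value.

```python
def pick_shadowed(values, i):
    # The loop reuses the name i, so the argument passed in is overwritten.
    for i in range(3):
        pass
    return values[i]  # i is always 2 here, whatever the caller passed


def pick_fixed(values, j):
    # The loop variable i no longer collides with the index parameter j.
    for i in range(3):
        pass
    return values[j]


values = [10, 20, 30, 40]
shadowed = [pick_shadowed(values, k) for k in range(4)]
fixed = [pick_fixed(values, k) for k in range(4)]

print(shadowed)  # [30, 30, 30, 30]
print(fixed)     # [10, 20, 30, 40]
```

Separately, since interpolate re-parses rawpoints and rebuilds the interpolator on every call, it may be worth building k_interp once outside the loop and only evaluating it inside, as the original function-free version did.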
I inherited a project in the middle of pandemonium, and to make matters worse I am just learning Python.
I managed to implement a polynomial fit in my code, and the results
are the same as the ones posted in the examples of this web page.
[z = numpy.polyfit(x, y, 5)]
However, I would like to know how to modify the program so I can input one of the known values of y to find x.
In other words, I have x and y arrays where:
- array x holds values for known weights in kilograms (0.0, 0.5, 1.0, 1.5, 2.0)
- array y holds the corresponding readings (0.074581967, 0.088474754, 0.106797419, 0.124461935, 0.133726833)
I have a program that reads a weight load and the tension created by a Phidget and generates the arrays to be used by this program.
What I need to accomplish is to read the next value from the Phidget and convert the reading into kilograms, based on the provided arrays.
Is there a way to do this? I feel I am only missing a line of code, but I don't know how to use the values returned from the formula (z).
Thanks in advance.
CODE ADDED AS REQUESTED
from Phidgets.PhidgetException import PhidgetException
from Phidgets.Devices.Bridge import Bridge, BridgeGain
import datetime
import os
import re
import sys
import time
import numpy
wgh = list()
avg = list()
buf = list()
x = []
y = []
def nonlinear_regression(): # reads the calibration mapping and generates the conversion coefficients
    fd = open("calibration.csv", "r")
    for line in fd:
        [v0, v1] = line.split(",")
        x.append(float(v0))
        y.append(float(v1[:len(v1) - 1]))
    xdata = numpy.array(x)
    ydata = numpy.array(y)
    z = numpy.polyfit(x, y, 5)
    return z
def create_data_directory(): # create the data directory
    if not os.path.exists("data"):
        os.makedirs("data")
def parse_config(): # get config-file value
    v = int()
    config = open("config.properties", "r")
    for line in config:
        toks = re.split(r"[\n= ]+", line)
        if toks[0] == "record_interval":
            v = int(toks[1])
    return v
def read_bridge_data(event): # read the data
    buf.append(event.value)
def record_data(f_name, date, rms):
    if not os.path.isfile(f_name):
        fd = open(f_name, "w")
        fd.write("time,weight\n")
        fd.write(datetime.datetime.strftime(date, "%H:%M"))
        fd.write(",")
        fd.write(str(rms) + "\n")
        fd.close()
    else:
        fd = open(f_name, "a")
        fd.write(datetime.datetime.strftime(date, "%H:%M"))
        fd.write(",")
        fd.write(str(rms) + "\n")
        fd.close()
    print("Data recorded.")
def release_bridge(event): # release the phidget device
    try:
        event.device.closePhidget()
    except:
        print("Phidget bridge could not be released properly.")
        sys.exit(1)
def main():
    create_data_directory()
    RECORD_INTERVAL = parse_config() # get the config-file value
    calibrate = nonlinear_regression() # get calibration function; use like: calibrate(some_input)
    bridge = Bridge()
    try:
        bridge.setOnBridgeDataHandler(read_bridge_data)
        bridge.setOnDetachHandler(release_bridge) # when the phidget gets physically detached
        bridge.setOnErrorhandler(release_bridge) # asynchronous exception (i.e. keyboard interrupt)
    except:
        print("Phidget bridge event binding failed.")
        sys.exit(1)
    try:
        bridge.openPhidget()
        bridge.waitForAttach(3000)
    except:
        print("Phidget bridge opening failed.")
        sys.exit(1)
    last_record = int()
    while (True):
        date = datetime.datetime.now()
        f_name = "data\\" + datetime.datetime.strftime(date, "%B_%d_%Y") + ".csv"
        curr = time.time() * 1000
        if (curr - last_record) > (RECORD_INTERVAL * 1000):
            try:
                bridge.setDataRate(10)
                last = time.time() * 1000
                bridge.setEnabled(0, True)
                while (time.time() * 1000 - last) < 1000: # collects over 1 sec
                    pass
                bridge.setEnabled(0, False)
            except:
                print("Phidget bridge data reading error.")
                bridge.setEnabled(0, False)
                bridge.closePhidget()
                sys.exit(1)
            vol = sum(buf) / len(buf)
            del buf[:]
            last_record = curr
            record_data(f_name, date, vol) # replace curr with calibrated data
            #THIS IS WHERE I WILL LIKE TO INCORPORATE THE CHANGES TO SAVE THE WEIGHT
            #record_data(f_name, date, conversion[0] * vol + conversion[1]) # using the linear conversion function
        else:
            time.sleep(RECORD_INTERVAL - 1) # to reduce the CPU's busy-waiting
if __name__ == "__main__":
    main()
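The conversion function from the calibration is returned by numpy.polyfit as an array of polynomial coefficients, highest power first. Since you passed 5 for the degree argument of polyfit, you will get an array of six coefficients:
f(x) = a*x^5 + b*x^4 + c*x^3 + d*x^2 + e*x + f
where a, b, c, d, e, and f are the elements of the z array returned by nonlinear_regression.
Keep in mind that z was fitted with the weights as x and the readings as y, so this polynomial maps kilograms to readings. To convert a reading into kilograms, you would fit the reverse direction instead, for example numpy.polyfit(y, x, degree), and evaluate that polynomial at the reading.
To apply the conversion formula, simply use the power operator **, the elements of z, and the value of vol: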
vol_calibrated = vol**5 * z[0] + vol**4 * z[1] + vol**3 * z[2] + vol**2 * z[3] + vol * z[4] + z[5]
Or more generally:
degree = len(z) - 1
vol_calibrated = sum(vol**(degree-i) * coeff for i, coeff in enumerate(z))
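The general form above can also be written with Horner's method, which avoids computing the powers explicitly. The sketch below is a plain-Python illustration under the assumption that the coefficients are ordered highest power first, as numpy.polyfit returns them; the helper name evaluate_polynomial and the example coefficients are made up for the demonstration. NumPy's own numpy.polyval(z, vol) performs the same evaluation.

```python
def evaluate_polynomial(coeffs, x):
    """Evaluate a polynomial whose coefficients are ordered highest power first."""
    result = 0.0
    for coeff in coeffs:
        # Horner's method: fold in one coefficient per step.
        result = result * x + coeff
    return result


# Hypothetical coefficients for 2*x^2 - 3*x + 1, not real calibration data.
z = [2.0, -3.0, 1.0]
vol = 2.0

horner_value = evaluate_polynomial(z, vol)

# The explicit sum from the answer gives the same result.
degree = len(z) - 1
explicit_value = sum(vol ** (degree - i) * coeff for i, coeff in enumerate(z))

print(horner_value)    # 3.0
print(explicit_value)  # 3.0
```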