To start off, I have two files; let's call them fileA and fileB.
In fileB I have two sequences; let's call them initial and final. Each sequence has exactly 32 values, most of which are simple expressions that differ slightly from each other, hence the 32 unique values. For simplicity's sake, let's scope them down to 5 each. For initial it would look something like this:
~fileB
T1 = 60
initial = [0.112, 0.233, 0.322*T1, 0.55*T1, 0.665*T1]
The variable T1 never changes, so initial is constant. The second sequence is called "final".
For final I have:
T2 = 120
k_0 = T2**2 - T1**2
final = [x * k_0 for x in initial]
This gives me the values I want for final, and it gives me a sequence of the same length. In fileA I want to iterate over multiple T2 values and get an "answer" for each respective T2 value. However, since I am new, I'm limiting myself to doing this only for the very first final value.
So now on to fileA:
~fileA
import fileB
import math
import numpy as np
from sympy import Integral, solve, symbols
answer = []
T2 = np.arange(120, 400, 10)
x = symbols('x')
int1 = Integral(x**2 - 1, x)
eq1 = int1.doit()
for i in T2:
    k = k_0*final[0]
    answer.append(solve(eq1 - k, x))
This is where things get tricky, as I want it to evaluate this ONLY for the first "final" value
final[0]
but I want it to re-evaluate the two variables
k_0 = T2**2 - T1**2
and
answer = []
at each and every T2 value. How can I do this so that I end up with an array/table that looks like the following?
T2 (header) Answer(header)
value_1 Value_1
value_2 Value_2
value_3 Value_3
value_4 Value_4
.... ....
If you need me to explain it better or have questions, feel free to ask.
In case it matters, I'm using Python 3.6/3.7 in the Anaconda distribution.
Okay, my question was a bit confusing, but I figured out how to solve it.
The first step was to write a list comprehension for T2 instead of a NumPy array, like so:
T2 = [20 + x*1.01 +273.15 for x in range(40)]
Then I assigned the integral I wanted solved to a variable; let's call it int1.
int1 = Integral((1-x)**-2,x)
Then I had to create two new, separate empty sequences.
answer = []
X = []
For the first sequence (answer) I did the following:
for i,temp in enumerate(t2):
    k_0 = math.exp((ee/r)*((1/t1)-(1/t2[i])))
    k = initial_k[0]*k_0
    v_0 = 2
    Fa_0 = 5
    Ca_0 = Fa_0/v_0
    Q = (k*Ca_0*V)/v_0
    eq1 = int1.doit()
    answer.append(solve(eq1 - Q,x))
While this gives the numerical answers I was looking for, each individual answer is returned as a sequence nested inside the outer sequence.
answer = [[value1],[value2],[value3],[value4]]
This causes problems when trying to use the numerical answers in other equations or operations.
To fix this, I used the same technique as above and collected all of the float values into a single flat sequence:
for i,val in enumerate(answer):
    X.append(float(answer[i][0]))
This finally allowed me to use the float values originally stored in answer[] by simply transferring them to a new sequence where they are not nested.
Ca = [(1-x)*Ca_0 for x in X]
Fa = [(1-x)*Fa_0 for x in X]
Finally, I was able to get the table I originally wanted!
Temperature  Conversion           Ca                  Fa
293.15       0.44358038322375287  1.3910490419406178  2.7820980838812357
303.15       0.44398389128120275  1.390040271796993   2.780080543593986
313.15       0.44436136324002395  1.38909659189994    2.77819318379988
323.15       0.4447152402068642   1.3882118994828396  2.7764237989656793
I'm not completely sure why I had to do this, but it worked. Any help or insight into this would be appreciated.
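On the nesting: as far as I know, SymPy's solve returns a list of solutions by default, because an equation can have several roots, so appending each result to answer produces a list of lists. A minimal sketch, assuming a made-up right-hand side Q = 2 (not one of the values above) so the root is easy to check by hand:

```python
from sympy import Integral, solve, symbols

x = symbols('x')
eq1 = Integral((1 - x)**-2, x).doit()  # antiderivative, equal to 1/(1 - x)

Q = 2  # hypothetical value, chosen only for illustration
solutions = solve(eq1 - Q, x)  # a list, even when there is only one root

# Take the first (here: only) root and convert it to a plain float.
conversion = float(solutions[0])
print(solutions, conversion)
```

Indexing with [0] inside the loop, before appending, would avoid the second pass over answer.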
I have a .json file with multiple values, separated by commas. I would like to go through the file and output each value sequentially as a variable, the first one being x1, the second one x2, and so on. The point is that I can then use these values in an equation later on.
The output would look like this:
x1 = 0.0234
x2 = 0.512
x3 = 0.9782
I'm pretty sure I need to use a for loop after this:
g = open('beat_times_knownsong1.json')
another_song = json.load(g)
EDIT: this is some of the .json data:
0.023219954648526078,
0.5108390022675737,
0.9752380952380952,
1.4628571428571429,
1.9504761904761905,
2.414875283446712,
2.9024943310657596,
3.3668934240362813,
3.8545124716553287,
4.31891156462585,
4.806530612244898,
5.270929705215419,
5.758548752834467,
6.222947845804988,
6.710566893424036,
7.174965986394557,
They're just numbers increasing in value. If I just do:
g = open('beat_times_knownsong1.json')
another_song = json.load(g)
for beat in another_song:
    print(beat)
then it just prints the values. I would like it to save each value to an "x" variable, counting up from x1 = 0.023219954648526078 to x2, x3, and so on.
Use enumerate to get the index of the current value in the list you are iterating over.
There are different ways in Python to turn a string into a variable name. One is the built-in locals function, which returns a dict reflecting the local variables defined in the current scope. At module level this dict is the real namespace, so adding keys to it creates variables; inside a function, changes made through locals() are not guaranteed to affect the actual local variables, so this only works reliably at the top level of a script.
Example:
g = open('beat_times_knownsong1.json')
another_song = json.load(g)
for i, beat in enumerate(another_song):
    locals()[f"x{i}"] = beat
print(x0)
print(x1)
...
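One caveat with this approach, sketched below under the assumption that the loop runs inside a function (the helper name make_variables is made up for illustration): writing to locals() there does not create real local variables, whereas the same assignment through globals() at module level does.

```python
def make_variables(values):
    # Hypothetical helper: tries to create x1, x2, ... inside a function.
    for i, value in enumerate(values, start=1):
        locals()[f"x{i}"] = value
    try:
        return x1  # looked up as a global name, which was never created
    except NameError:
        return None

beats = [0.0232, 0.5108, 0.9752]  # shortened, rounded sample of the beat times
result_in_function = make_variables(beats)

# At module level, globals() is the real namespace, so this does work.
for i, value in enumerate(beats, start=1):
    globals()[f"x{i}"] = value
first_beat = globals()["x1"]
print(result_in_function, first_beat)
```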
Why not use a dictionary instead of separate variables?
g = open('beat_times_knownsong1.json')
another_song = json.load(g)
variable_dict = {}
for i, beat in enumerate(another_song):
    variable_dict["x" + str(i)] = beat
You can then get the value you want like this:
variable_dict["x1"]
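A small variation, in case it helps: numbering from 1 so that "x1" is the first value, as asked in the question. This is only a sketch; the inline string stands in for beat_times_knownsong1.json (assumed to hold a JSON array, shortened here), and the final equation is made up.

```python
import json

# Stand-in for the file contents; the real data would come from json.load(g).
raw_text = "[0.023219954648526078, 0.5108390022675737, 0.9752380952380952]"
another_song = json.loads(raw_text)

# Number the beats from 1 so that "x1" is the first value.
variable_dict = {f"x{i}": beat for i, beat in enumerate(another_song, start=1)}

# The values can then go straight into an equation (this one is made up).
interval = variable_dict["x2"] - variable_dict["x1"]
print(interval)
```

Since json.load already returns a list, another_song[0], another_song[1] and so on give the same values without any extra names.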
I'm trying to calculate the alcohol by volume (abv) of some beer by using variables from 2 separate lists (which I took from a dictionary entry). I'm having trouble getting the values from both lists to be applied to the equation that I have for abv (and it's probably not possible to have a for loop with an and statement like the one I have below). Is it possible to get variables from two separate lists to be subbed into the same equation in one for loop?
Right now it's telling me that I have a type error where 'bool' object is not iterable. Here's what I've tried so far in terms of coding:
beers = {"SG": [1.050, 1.031, 1.077, 1.032, 1.042, 1.055, 1.019, 1.089, 1.100, 1.032],
"FG": [1.010, 1.001, 1.044, 1.003, 1.003, 1.013, 1.002, 1.020, 1.056, 1.000],
"grad student 1": [5.264, 3.983, 4.101, 7.216, 2.313, 4.876, 2.255, 8.991, 5.537, 4.251],
"grad student 2": [5.211, 3.008, 4.117, 3.843, 5.168, 5.511, 3.110, 8.903, 5.538, 4.255]}
#separating the SG and FG values from the dictionary entry
SG_val = beers["SG"]
FG_val = beers['FG']
def find_abv(SG = SG_val, FG = FG_val):
abv_list = []
i = 0.0
j = 0.0
for i in SG_val and j in FG_val:
abv = (((1.05/0.79)*((i - j)/j))*100)
abv_list.append(abv)
return abv_list
find_abv()
print(abv_list)
You cannot use and to iterate over two variables in a single for loop. You can use the zip function to do that:
def find_abv(SG=SG_val, FG=FG_val):
    abv_list = []
    for i, j in zip(SG, FG):
        abv = (((1.05/0.79)*((i - j)/j))*100)
        abv_list.append(abv)
    return abv_list
abv_list = find_abv()
print(abv_list)
You also need to assign the result of find_abv() to a variable in order to print it, which your code does not seem to do.
Another point: using SG_val and FG_val inside the loop of find_abv is pointless, since the function already has the SG and FG parameters.
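The same pairing can be written more compactly. A sketch, assuming the formula from the question is the intended one, and passing the lists in explicitly rather than through default arguments:

```python
beers = {
    "SG": [1.050, 1.031, 1.077, 1.032, 1.042, 1.055, 1.019, 1.089, 1.100, 1.032],
    "FG": [1.010, 1.001, 1.044, 1.003, 1.003, 1.013, 1.002, 1.020, 1.056, 1.000],
}

def find_abv(sg_values, fg_values):
    # zip pairs the first SG with the first FG, the second with the second, ...
    return [(1.05 / 0.79) * ((sg - fg) / fg) * 100
            for sg, fg in zip(sg_values, fg_values)]

abv_list = find_abv(beers["SG"], beers["FG"])
print([round(abv, 3) for abv in abv_list])
```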
You can't iterate through multiple lists by joining them with and in a for loop. Currently, your function is trying to iterate through (SG_val and j in FG_val), which evaluates to a boolean and therefore cannot be iterated through.
If the two lists will always have the same number of items, then you could simply iterate through the indexes:
# len(SG_val) returns the length of SG_val
for i in range(len(SG_val)):
    abv = (((1.05/0.79)*((SG_val[i] - FG_val[i])/FG_val[i]))*100)
    abv_list.append(abv)
# put the return outside of the for loop so that it can finish iterating before returning the value
return abv_list
If the lists aren't always going to be the same length, then you can write for i in range(min(len(SG_val), len(FG_val))): instead of for i in range(len(SG_val)): so that it iterates until it reaches the end of the shorter list.
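For comparison, a short sketch with two made-up lists of different lengths. zip stops at the end of the shorter list, so it gives the same pairs as indexing up to the length of the shorter one:

```python
sg_values = [1.050, 1.031, 1.077]   # hypothetical, three readings
fg_values = [1.010, 1.001]          # hypothetical, only two readings

# Index-based: stop at the length of the shorter list.
shortest = min(len(sg_values), len(fg_values))
by_index = [(sg_values[i], fg_values[i]) for i in range(shortest)]

# zip-based: zip silently drops the unpaired trailing item.
by_zip = list(zip(sg_values, fg_values))

print(by_index == by_zip)
```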
Also, to output the value returned by the function you have to assign it to something and then print it or just print it directly:
abv_list = find_abv()
print(abv_list)
# or
print(find_abv())
Background: I have two catalogues containing the positions of spatial objects. My aim is to find the objects that appear in both catalogues, allowing at most a certain angular distance between matched positions. One catalogue is called bss and the other is called super.
Here is the full code I wrote:
import numpy as np
def crossmatch(bss_cat, super_cat, max_dist):
matches=[]
no_matches=[]
def find_closest(bss_cat,super_cat):
dist_list=[]
def angular_dist(ra1, dec1, ra2, dec2):
r1 = np.radians(ra1)
d1 = np.radians(dec1)
r2 = np.radians(ra2)
d2 = np.radians(dec2)
a = np.sin(np.abs(d1-d2)/2)**2
b = np.cos(d1)*np.cos(d2)*np.sin(np.abs(r1 - r2)/2)**2
rad = 2*np.arcsin(np.sqrt(a + b))
d = np.degrees(rad)
return d
for i in range(len(bss_cat)): #The problem arises here
for j in range(len(super_cat)):
distance = angular_dist(bss_cat[i][1], bss_cat[i][2], super_cat[j][1], super_cat[j][2]) #While this is supposed to produce single floating point values, it produces numpy.ndarray consisting of three entries
dist_list.append(distance) #This list now contains numpy.ndarrays instead of numpy.float values
for k in range(len(dist_list)):
if dist_list[k] < max_dist:
element = (bss_cat[i], super_cat[j], dist_list[k])
matches.append(element)
else:
element = bss_cat[i]
no_matches.append(element)
return (matches,no_matches)
When called on its own, the function angular_dist(ra1, dec1, ra2, dec2) produces a single numpy.float value, as expected. But when it is used inside the for loop in the crossmatch(bss_cat, super_cat, max_dist) function, it produces numpy.ndarrays instead of numpy.float values. I've noted this inside the code as well. I don't know where the code goes wrong. Please help.
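One thing that may be worth checking, and this is only a guess since the catalogues themselves are not shown: NumPy functions broadcast, so angular_dist returns an array whenever any of its arguments is an array, for example if bss_cat[i][1] is itself a three-element sequence. A minimal sketch with made-up coordinates:

```python
import numpy as np

def angular_dist(ra1, dec1, ra2, dec2):
    # Haversine formula, same as in the question; inputs in degrees.
    r1, d1 = np.radians(ra1), np.radians(dec1)
    r2, d2 = np.radians(ra2), np.radians(dec2)
    a = np.sin(np.abs(d1 - d2) / 2) ** 2
    b = np.cos(d1) * np.cos(d2) * np.sin(np.abs(r1 - r2) / 2) ** 2
    return np.degrees(2 * np.arcsin(np.sqrt(a + b)))

# Scalar inputs give a scalar result.
scalar_result = angular_dist(0.0, 0.0, 90.0, 0.0)

# If one argument is an array (hypothetical values), the result is an array too.
array_result = angular_dist(np.array([0.0, 10.0, 20.0]), 0.0, 90.0, 0.0)

print(np.ndim(scalar_result), np.shape(array_result))
```

Printing np.shape(bss_cat[i][1]) inside the loop would show whether the catalogue entries are scalars or arrays.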
I'm writing code to analyze an (8477960, 1) column vector. I am not sure if the while loops in my code are running infinitely, or if the way I've written things is just really slow.
This is a section of my code up to the first while loop, which I cannot get to run to completion.
import numpy as np
import pandas as pd
data = pd.read_csv(r'C:\Users\willo\Desktop\TF_60nm_2_2.txt')
def recursive_low_pass(rawsignal, startcoeff, endcoeff, filtercoeff):
# The current signal length
ni = len(rawsignal) # signal size
rougheventlocations = np.zeros(shape=(100000, 3))
# The algorithm parameters
# filter coefficient
a = filtercoeff
raw = np.array(rawsignal).astype(np.float)
# thresholds
s = startcoeff
e = endcoeff # for event start and end thresholds
# The recursive algorithm
# loop init
ml = np.zeros(ni)
vl = np.zeros(ni)
s = np.zeros(ni)
ml[0] = np.mean(raw) # local mean init
vl[0] = np.var(raw) # local variance init
i = 0 # sample counter
numberofevents = 0 # number of detected events
# main loop
while i < (ni - 1):
i = i + 1
# local mean low pass filtering
ml[i] = a * ml[i - 1] + (1 - a) * raw[i]
# local variance low pass filtering
vl[i] = a * vl[i - 1] + (1 - a) * np.power([raw[i] - ml[i]],2)
# local threshold to detect event start
sl = ml[i] - s * np.sqrt(vl[i])
I'm not getting any error messages, but I've let the program run for more than 10 minutes without any results, so I assume I'm doing something incorrectly.
You should try to vectorize this process rather than accessing and processing individual indexes (otherwise, why use numpy?).
The other thing is that you seem to be doing unnecessary work (unless we're not seeing the whole function).
The line:
sl = ml[i] - s * np.sqrt(vl[i])
assigns the variable sl, which you're not using inside the loop (or anywhere else). This assignment performs a whole vector multiplication by s, which is all zeroes. If you do need the sl variable, you should calculate it outside of the loop using the last encountered values of ml[i] and vl[i], which you can store in temporary variables instead of computing it on every iteration.
If ni is in the millions, this unnecessary vector multiplication (of millions of zeros) is going to be very costly.
You probably didn't mean to overwrite the value of s = startcoeff with s = np.zeros(ni) in the first place.
In order to vectorize these calculations, you would need an accumulate-style operation with customized functions. NumPy has no np.accumulate function; it offers accumulate only as a method of ufuncs (for example np.add.accumulate), so a custom recurrence would need a custom ufunc.
The non-numpy equivalent would be as follows (using itertools instead):
from itertools import accumulate
ml = [np.mean(raw)]+[0]*(ni-1)
mlSums = accumulate(zip(ml,raw),lambda r,d:(a*r[0] + (1-a)*d[1],0))
ml = [v for v,_ in mlSums]
vl = [np.var(raw)]+[0]*(ni-1)
vlSums = accumulate(zip(vl,raw,ml),lambda r,d:(a*r[0] + (1-a)*(d[1]-d[2])**2,0,0))
vl = [v for v,_,_ in vlSums]
In each case, the ml / vl vectors are initialized with the base value at index zero and the rest filled with zeroes.
The accumulate(zip(...)) calls go through the array and call the lambda function with the current sum in r and the paired elements in d. For the ml calculation, this corresponds to r = (ml[i-1],_) and d = (0,raw[i]).
Because accumulate outputs the same data type as it is given as input (here, zipped tuples), the actual result is only the first value of each tuple in the mlSums/vlSums sequences.
This took 9.7 seconds to process for 8,477,960 items in the lists.
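A quick way to convince yourself that the accumulate version reproduces the loop is to run both on a tiny input. A sketch with made-up sample values and a made-up coefficient a, using the statistics module instead of NumPy to keep it dependency-free:

```python
from itertools import accumulate
from statistics import mean

raw = [1.0, 2.0, 4.0, 3.0, 5.0]  # hypothetical signal
a = 0.9                           # hypothetical filter coefficient
ni = len(raw)

# Reference: explicit loop, as in the question.
ml_loop = [mean(raw)] + [0.0] * (ni - 1)
for i in range(1, ni):
    ml_loop[i] = a * ml_loop[i - 1] + (1 - a) * raw[i]

# accumulate version, following the answer.
ml_init = [mean(raw)] + [0.0] * (ni - 1)
ml_sums = accumulate(zip(ml_init, raw),
                     lambda r, d: (a * r[0] + (1 - a) * d[1], 0))
ml_acc = [v for v, _ in ml_sums]

print(ml_loop == ml_acc)
```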
This is my third thread in StackOverflow.
I think I already learnt a LOT by reading threads here and clearing my doubts.
I'm trying to transform an Excel table in my own Python script. I've done a lot, and now that I'm almost finished with the script, I'm getting an error message that I can't really understand. Here is my code (I tried to give as much information as possible):
def _sensitivity_analysis(datasource):
#datasource is a list with data that may be used for HBV_model() function;
datasource_length = len(datasource) #returns the size of the data time series
sense_param = parameter_vector #collects the parameter data from the global vector (parameter_vector);
sense_index = np.linspace(0, 11, 12) #Vector that reflects the indexes of parameters that must be analyzed (0 - 11)
sense_factor = np.linspace(0.5, 2, 31) #Vector with the variance factors that multiply the original parameter value;
ns_sense = [] #list that will be filled with Nasch-Sutcliff values (those numbers will be data for sensitivity analysis)
for i in range(sense_factor.shape[0]): #start column loop
ns_sense.append([]) #create column in ns_sense matrix
for j in range(sense_index.shape[0]): #start row loop
aux = sense_factor[i]*sense_param[j] #Multiplies the param[j] value by the factor[i] value
print(i,j,aux) #debug purposes
sense_param[j] = aux #substitutes the original parameter value by the modified one
hbv = _HBV_model(datasource, sense_param) #run the model calculations (works awesomely!)
sqrdiff = _square_diff() #does square-difference calculations for Nasch-Sutcliff;
average = _qcalc_qmed() #does square-difference calculations for Nasch-Sutcliff [2];
nasch = _nasch_sutcliff(sqrdiff, average) #Returns the Nasch-Sutcliff calculation value
ns_sense[i].insert(j, nasch) #insert the value into ns_sense(i, j) for further uses;
sense_param = np.array([np.float64(catchment_area), np.float64(thresh_temp),
np.float64(degreeday_factor), np.float64(field_capacity),
np.float64(shape_coeficient), np.float64(model_paramC),
np.float64(surfaceflow_param), np.float64(thresh_surface_level),
np.float64(interflow_param), np.float64(baseflow_param),
np.float64(percolation_param), np.float64(soilmoist_param)]) #restores sense_param to original values
for i in range(len(datasource)): #HBV_model() transforms original data (index = 5) in a fully calculated data (index 17)
for j in range(12): #in order to return it to original state before a new loop
datasource[i].pop() #data is popped out;
print(ns_sense) #debug purposes
So, when I run _sensitivity_analysis(datasource) I receive this message:
File "<ipython-input-47-c9748eaba818>", line 4, in <module>
aux = sense_factor[i]*sense_param[j]
IndexError: index 3652 is out of bounds for axis 0 with size 31;
I'm fully aware that it is complaining about an index that cannot be accessed because it does not exist.
To explain my situation: datasource is a list with index [3652]. But I can't see why the console is trying to access index 3652, as I'm not asking it to do so. The only place where I use such a value is the final loop:
for i in range(len(datasource)):
I'm really lost. I'd really appreciate it if you could help me! If you need more info, I can provide it.
Guess: sense_factor = np.linspace(0.5, 2, 31) has 31 elements; you ask for element 3652 and it naturally blows up. i takes this value in the final loop. Rewrite the final loop as:
for k in range(len(datasource)):
    for m in range(12):
        datasource[k].pop()
However, your code has many issues: you should not be using indexes at all. Instead, use for loops directly on the arrays.
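A sketch of what looping over the values directly might look like, with made-up parameter values and a placeholder in place of the model run (run_model is hypothetical, standing in for the _HBV_model and Nasch-Sutcliff steps). Working on a copy also avoids having to restore sense_param afterwards:

```python
import numpy as np

original_params = np.array([10.0, 0.5, 3.0])  # hypothetical parameter vector
sense_factor = np.linspace(0.5, 2, 4)         # fewer factors than the 31 above

def run_model(params):
    # Placeholder score; the real code would run the model here.
    return float(params.sum())

ns_sense = []
for factor in sense_factor:
    row = []
    for index, value in enumerate(original_params):
        trial = original_params.copy()  # leave the original values untouched
        trial[index] = factor * value   # perturb one parameter at a time
        row.append(run_model(trial))
    ns_sense.append(row)

print(np.array(ns_sense).shape)
```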
You reused your variable names, here:
for i in range(sense_factor.shape[0]):
    ...
    for j in range(sense_index.shape[0]):
and then here:
for i in range(len(datasource)):
    for j in range(12):
so in aux = sense_factor[i]*sense_param[j], you're using the wrong value of i, and it's basically a fluke that you're not using the wrong value of j too.
Don't reuse variable names in the same scope.
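The underlying behaviour is that a for loop variable keeps its last value after the loop ends. A tiny sketch with made-up sizes, mirroring the situation in the error message:

```python
sense_factor = [0.5, 1.0, 2.0]  # hypothetical, 3 factors
datasource = list(range(10))    # hypothetical, 10 records

for i in range(len(sense_factor)):
    pass
after_first_loop = i            # last index of sense_factor

for i in range(len(datasource)):
    pass
after_second_loop = i           # now the last index of datasource

# Indexing sense_factor with the stale i is now out of range.
try:
    sense_factor[i]
    stale_index_ok = True
except IndexError:
    stale_index_ok = False

print(after_first_loop, after_second_loop, stale_index_ok)
```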