I'm running a multi-objective optimisation with Pymoo (0.5.0) using NSGA-III, and some of the generated candidates in my population have NaN parameters. This results in my evaluate function (which is a call to a neural network) returning NaN. The optimisation is running and producing the desired results, but I'd like to know why some of the candidate parameters are NaN. Here is the code for the problem.
Problem setup:
opt_name = "sopt00001KaRxD60fLn2"
pop_size = 165
n_gen = 350
cross_over_pb = 0.9
mutation_pb = 0.1
# Fixed params
band ="KaRx"
arc = "RevF"
source_spec = "sweep99p1"
lens_diameter = 60.0
source_z = 5.0
r_lam = 0.1
use_original_transformation = 0 # false
source_x0 = 0.0
target_scans = [0, 70, 50]
# Optimisation param ranges
lens_material_delta_n = [1.5, 3.6]
lens_thick = [5, 35]
lens_radii_back = [39, 22500]
lens_radii_front = [39, 22500]
source_width = [2, 20]
source_x = [12, 20]
params_lower_lim = [lens_thick[0], lens_radii_front[0], lens_radii_back[0], source_width[0], source_x[0], source_x[0],
                    lens_material_delta_n[0], -1, -1, -1, 0, -1, -1, -1, 0]
params_upper_lim = [lens_thick[1], lens_radii_front[1], lens_radii_back[1], source_width[1], source_x[1], source_x[1],
                    lens_material_delta_n[1], 1, 1, 1, 1, -1, -1, -1, 1]
n_var = len(params_lower_lim)
assert n_var == len(params_upper_lim), "Upper and lower parameter limits are not equal length!"
# Other required params
if band == "KaRx":
freq_center = 19.45
freq_width = 3.5
Evaluate function:
class ProblemWrapper(Problem):
    def _evaluate(self, params, out, *args, **kwargs):
        res = []
        for param in params:
            source_x70 = source_x_f(param[4], param[5], source_x, 50, r_lam, target_scans, freq_center, freq_width)
            source_x50 = source_x_f(param[4], param[5], source_x, 70, r_lam, target_scans, freq_center, freq_width)
            res.append(smeep(band,
                             lens_diameter, param[0],
                             param[1], param[2],
                             param[3],
                             source_x0, source_x70, source_x50,
                             source_z,
                             param[6], param[7], param[8], param[9], param[10], param[11], param[12], param[13], param[14],
                             r_lam, use_original_transformation,
                             arc,
                             source_spec,
                             target_scans))
        out['F'] = np.array(res)
Algorithm settings:
ref_dirs = get_reference_directions("das-dennis", 3, n_partitions=12)
problem = ProblemWrapper(n_var=n_var,
                         n_obj=len(target_scans),
                         xl=params_lower_lim,
                         xu=params_upper_lim)
algorithm = NSGA3(
    pop_size=pop_size,
    ref_dirs=ref_dirs,
    sampling=get_sampling("real_random"),
    crossover=get_crossover("real_sbx", prob=cross_over_pb),
    mutation=get_mutation("real_pm", prob=mutation_pb)
)
Execution:
res = minimize(problem=problem,
               algorithm=algorithm,
               termination=("n_gen", n_gen),
               save_history=True,
               verbose=True)
It looks like the only affected parameters are the poly6 (param[11]), poly7 (param[12]) and poly8 (param[13]) terms, and it differs from candidate to candidate. I confess I have not tried any different crossover or mutation schemes, but these seemed the best choices from the documentation.
Thanks in advance!
The NaNs arise because the limits for your parameters 11, 12 and 13 are equal (-1 and -1 in all cases).
If you look at the code for the polynomial mutation (real_pm), you have the following lines:
delta1 = (X - xl) / (xu - xl)
delta2 = (xu - X) / (xu - xl)
where xu and xl are the upper and lower bounds of the parameters. In your case, that would cause a divide-by-0.
Since the limits are the same (if this is intentional), these parameters are not actually part of the optimization, and you should remove them from the lists.
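To see the mechanism concretely, here is a minimal sketch (plain NumPy mimicking those two lines, not a call into pymoo itself) of what happens when a parameter's bounds coincide:
import numpy as np

# Hypothetical single parameter whose lower and upper bounds are both -1,
# as for params 11-13 above; its value must then also be -1.
xl = np.array([-1.0])
xu = np.array([-1.0])
X = np.array([-1.0])

delta1 = (X - xl) / (xu - xl)  # 0 / 0 -> nan
delta2 = (xu - X) / (xu - xl)  # 0 / 0 -> nan
print(delta1, delta2)          # [nan] [nan]
The NaN then propagates through the mutated value into your evaluate call. Dropping the fixed parameters from params_lower_lim/params_upper_lim and passing them to smeep as constants avoids the division entirely.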
I am given the following bond (the data was shown as an image; the prices are reproduced in the code below) and need to fit the Vasicek model to this data.
My attempt is the following:
import numpy as np
import sympy as sym
import scipy.optimize
years = np.array([1, 2, 3, 4, 7, 10])
pric = np.array([0, .93, .85, .78, .65, .55, .42])
X = sym.symbols("a b sigma")
a, b, s = X
rt1_rt = np.diff(pric)
ab_rt = np.array([a*(b-r) for r in pric[1:] ])
term = rt1_rt - ab_rt
def normpdf(x, mean, sd):
    var = sd**2
    denom = (2*sym.pi*var)**.5
    num = sym.E**(-(x-mean)**2/(2*var))
    return num/denom
pdfs = np.array([sym.log(normpdf(x, 0, s)) for x in term])
func = 0
for el in pdfs:
    func += el
func = func.factor()
lmd = sym.lambdify(X, func)
def target_fun(params):
    return lmd(*params)
result = scipy.optimize.least_squares(target_fun, [10, 10, 10])
I don't think it outputs the correct solution.
Your code is almost correct.
You want to maximize your likelihood function, therefore you need to place a minus sign in front of lmd in your objective:
def target_fun(params):
    return -lmd(*params)
Additionally, the initial values are usually set to less than 1. Picking 10 is not the best choice as the algorithm might converge to a saddle point.
Consider [0.01, 0.01, 0.01].
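Putting both fixes together, a minimal sketch of the corrected call, assuming lmd and target_fun as defined above. Note that since target_fun returns a scalar (the negative log-likelihood), scipy.optimize.minimize is arguably a more direct fit than least_squares, which squares its residuals:
import scipy.optimize

# minimize the negative log-likelihood, starting from small positive values
result = scipy.optimize.minimize(target_fun, x0=[0.01, 0.01, 0.01])
a_hat, b_hat, sigma_hat = result.x
print(result.fun, a_hat, b_hat, sigma_hat)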
I am following a book which has the following code:
import numpy as np
np.random.seed(1)
streetlights = np.array([[1, 0, 1], [0, 1, 1], [0, 0, 1], [1, 1, 1]])
walk_vs_stop = np.array([[1, 1, 0, 0]]).T
def relu(x):
    return (x > 0) * x

def relu2deriv(output):
    return output > 0
alpha = 0.2
hidden_layer_size = 4
# random weights from the first layer to the second
weights_0_1 = 2*np.random.random((3, hidden_layer_size)) -1
# random weights from the second layer to the output
weights_1_2 = 2*np.random.random((hidden_layer_size, 1)) -1
for iteration in range(60):
    layer_2_error = 0
    for i in range(len(streetlights)):
        layer_0 = streetlights[i : i + 1]
        layer_1 = relu(np.dot(layer_0, weights_0_1))
        layer_2 = relu(np.dot(layer_1, weights_1_2))
        layer_2_error += np.sum((layer_2 - walk_vs_stop[i : i + 1])) ** 2
        layer_2_delta = layer_2 - walk_vs_stop[i : i + 1]
        layer_1_delta = layer_2_delta.dot(weights_1_2.T) * relu2deriv(layer_1)
        weights_1_2 -= alpha * layer_1.T.dot(layer_2_delta)
        weights_0_1 -= alpha * layer_0.T.dot(layer_1_delta)
    if iteration % 10 == 9:
        print(f"Error: {layer_2_error}")
Which outputs:
# Error: 0.6342311598444467
# Error: 0.35838407676317513
# Error: 0.0830183113303298
# Error: 0.006467054957103705
# Error: 0.0003292669000750734
# Error: 1.5055622665134859e-05
I understand everything except this part, which is not explained, and I am not sure why it is the way it is:
weights_0_1 = 2*np.random.random((3, hidden_layer_size)) -1
weights_1_2 = 2*np.random.random((hidden_layer_size, 1)) -1
I don't understand:
Why is the whole matrix multiplied by 2, and why is there a -1?
If I change the 2 to 3, my error becomes much lower: # Error: 5.616513576418916e-13
I tried changing the 2 and the -1 to many other numbers; most of the time I get # Error: 2.0, or the error is much worse than with the combination of 3 and -1.
I can't seem to grasp the relationship and the purpose of multiplying the random weights by a number and subtracting a number afterwards.
P.S. The idea of the network is to learn a streetlight pattern: when people should go and when they should stop, depending on which combination of lights in the streetlight is on/off.
There are many ways to initialize a neural network, and it's an active research subject, as initialization can have a great impact on performance and training time. Some rules of thumb:
avoid having only one value for all weights, as they would all update the same way
avoid having weights that are too large, which could make your gradients too high
avoid having weights that are too small, which could make your gradients vanish
In your case, the goal is just to have weights in [-1, 1]:
np.random.random gives you a float in [0, 1)
multiplying by 2 gives you something in [0, 2)
subtracting 1 gives you a number in [-1, 1)
2*np.random.random((3, 4)) - 1 is a way to generate 3*4 = 12 random numbers from a uniform distribution over the half-open interval [-1, +1), i.e. including -1 but excluding +1.
This is equivalent to the more readable
np.random.uniform(-1, 1, (3, 4))
I have a graph degree list, degrees, of size 10000 (e.g. [1, 14, 4, 14, 6, 1, ...]). I am trying to calculate the entropy of this list as follows.
First, I find the probability of each unique value in the list:
uniqueDegreeList = list(set(degrees))
a = 0
for i in uniqueDegreeList:
    p = degrees.count(i) / len(degrees)
    print(p)
    a += p
Output:
0.5054494550544946
0.24577542245775422
0.12188781121887811
0.06379362063793621
0.031596840315968405
0.0150984901509849
0.007799220077992201
0.0034996500349965005
0.0024997500249975004
0.0010998900109989002
0.0008999100089991
0.00039996000399960006
0.00019998000199980003
And:
print(a)
>> 1.0
This part is working. Then I try to find the entropy of the list:
import math

S = 0
for i in uniqueDegreeList:
    p = degrees.count(i) / len(degrees)
    S -= p * math.log(p, 2)
And when I print S I get 1.99. Entropy should not be more than 1, so why do I get 1.99?
I want to use TensorFlow to calculate a hashcode's mAP (mean average precision), but I don't know how to use tensor calculations directly.
The code using NumPy is the following:
import numpy as np
import time
import os

# read train and test binary codes
CURRENT_DIR = os.getcwd()

def getNowTime():
    # not shown in the original; a minimal stand-in timestamp helper
    return time.strftime('%Y-%m-%d %H:%M:%S')

def getCode(train_codes, train_groudTruth, test_codes, test_groudTruth):
    line_number = 0
    with open(CURRENT_DIR + '/result.txt', 'r') as f:
        for line in f:
            temp = line.strip().split('\t')
            if line_number < 10000:
                test_codes.append([i if i == 1 else -1 for i in map(int, list(temp[0]))])
                list2 = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
                list2[int(temp[1])] = 1
                test_groudTruth.append(list2)  # get test ground truth (0-9)
            else:
                train_codes.append([i if i == 1 else -1 for i in map(int, list(temp[0]))])  # change to -1, 1
                list2 = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
                list2[int(temp[1])] = 1
                train_groudTruth.append(list2)  # get train ground truth (0-9)
            line_number += 1
    print('read data finish')

def getHammingDist(code_a, code_b):
    dist = 0
    for i in range(len(code_a)):
        if code_a[i] != code_b[i]:
            dist += 1
    return dist

if __name__ == '__main__':
    print(getNowTime(), 'start!')
    train_codes = []
    train_groudTruth = []
    test_codes = []
    test_groudTruth = []
    # get ground truth and binary codes
    getCode(train_codes, train_groudTruth, test_codes, test_groudTruth)
    train_codes = np.array(train_codes)
    train_groudTruth = np.array(train_groudTruth)
    test_codes = np.array(test_codes)
    test_groudTruth = np.array(test_groudTruth)
    numOfTest = 10000
    # generate Hamming matrix and ground-truth matrix, 10000 * 50000
    gt_martix = np.dot(test_groudTruth, np.transpose(train_groudTruth))
    print(getNowTime(), 'gt_martix finish!')
    ham_martix = np.dot(test_codes, np.transpose(train_codes))  # Hamming distance maps to dot value
    print('ham_martix finish!')
    # sort the Hamming matrix; argsort returns the indices that would sort an array
    sorted_ham_martix_index = np.argsort(ham_martix, axis=1)
    # calculate mAP
    print('sort ham_matrix finished, start calculating mAP')
    apall = np.zeros((numOfTest, 1), np.float64)
    for i in range(numOfTest):
        x = 0.0
        p = 0
        test_oneLine = sorted_ham_martix_index[i, :]
        length = test_oneLine.shape[0]
        num_return_NN = 5000  # top 5000
        for j in range(num_return_NN):
            if gt_martix[i][test_oneLine[length - j - 1]] == 1:  # traverse in reverse: largest dot products first
                x += 1
                p += x / (j + 1)
        if p == 0:
            apall[i] = 0
        else:
            apall[i] = p / x
    mAP = np.mean(apall)
    print('mAP:', mAP)
I want to re-write the code above using tensor operations (like tf.equal(), tf.reduce_sum(), and so on).
For example, this is how I calculate the validation accuracy on images:
logits = self._model(x_valid)
valid_preds = tf.argmax(logits, axis=1)
valid_preds = tf.to_int32(valid_preds)
self.valid_acc = tf.equal(valid_preds, y_valid)
self.valid_acc = tf.to_int32(self.valid_acc)
self.valid_acc = tf.to_float(tf.reduce_sum(self.valid_acc))/tf.to_float(self.batch_size)
I want to calculate the hashcode's mAP (mean average precision) in the same way, with tf.XX operations. How could I do this? Thanks!
You can just calculate the y_score (or predictions) and then use sklearn.metrics to calculate the average precision:
from sklearn.metrics import average_precision_score
predictions = model.predict(x_test)
average_precision_score(y_test, predictions)
If you just want to calculate average precision based on the validation set predictions, you can use the vector of predicted probabilities and the vector of true labels in this scikit-learn function.
If you really want to use a TensorFlow function, there is tf.metrics.average_precision_at_k.
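For reference, a minimal sketch of that function against the TF 1.x API. The shapes, class count and k below are made up for illustration; in your setting the labels would be the true class ids and the scores would come from the Hamming-distance ranking:
import numpy as np
import tensorflow as tf  # written against the TF 1.x API

# Hypothetical example: 4 queries, scores over 10 classes, AP evaluated at k=3.
labels = tf.constant([2, 0, 9, 4], dtype=tf.int64)             # true class ids
scores = tf.constant(np.random.rand(4, 10), dtype=tf.float32)  # predicted scores

mean_ap, update_op = tf.metrics.average_precision_at_k(labels, scores, k=3)

with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())  # the metric keeps state in local variables
    sess.run(update_op)                         # accumulate this batch
    print(sess.run(mean_ap))                    # mean average precision so far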
For more info about average precision you can see this article.
I've got two music files: one lossless, with a little sound gap at the beginning (at the moment it's just silence, but it could be anything: a sinusoid or just some noise), and one mp3:
In [1]: plt.plot(y[:100000])
Out[1]: (waveform plot of the lossless file; image not included)
In [2]: plt.plot(y2[:100000])
Out[2]: (waveform plot of the mp3; image not included)
These lists are similar but not identical, so I need to cut this gap: find the first occurrence of one list in the other with the lowest delta error.
And here's my solution (5.7065 sec.):
error = []
for i in range(25000):
    y_n = y[i:100000]
    y2_n = y2[:100000-i]
    error.append(abs(y_n - y2_n).mean())
start = np.array(error).argmin()
print(start, error[start])  # 23057 0.0100046
Is there any pythonic way to solve this?
Edit:
After calculating the mean distance between special points (e.g. where data == 0.5), I reduced the search area from 25000 to 2000. This gives a reasonable time of 0.3871 s:
a = np.where(y[:100000].round(1) == 0.5)[0]
b = np.where(y2[:100000].round(1) == 0.5)[0]
mean = int((a - b[:len(a)]).mean())
delta = 1000
error = []
for i in range(mean - delta, mean + delta):
    ...
What you are trying to do is a cross-correlation of the two signals.
This can be done easily using signal.correlate from the scipy library:
import scipy.signal
import numpy as np
# limit your signal length to speed things up
lim = 25000
# do the actual correlation
corr = scipy.signal.correlate(y[:lim], y2[:lim], mode='full')
# The offset is the maximum of your correlation array,
# itself being offset by (lim - 1):
offset = np.argmax(corr) - (lim - 1)
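Once you have the offset, trimming the gap is straightforward. A sketch of a possible continuation (the sign convention depends on which signal you pass first to correlate, so verify against your data):
# Hypothetical continuation: a positive offset means the content of y2
# starts `offset` samples later in y, so dropping y's leading samples aligns them.
if offset > 0:
    y_aligned, y2_aligned = y[offset:], y2
else:
    y_aligned, y2_aligned = y, y2[-offset:]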
You might want to take a look at this answer to a similar problem.
Let's generate some data first
import numpy as np
import matplotlib.pyplot as plt

N = 1000
y1 = np.random.randn(N)
y2 = y1 + np.random.randn(N) * 0.05
y2[0:int(N / 10)] = 0
In these data, y1 and y2 are almost the same (note the small added noise), but the first 10% of y2 is empty (similar to your example).
We can now calculate the absolute difference between the two vectors and find the first element for which the absolute difference is below a sensitivity threshold:
abs_delta = np.abs(y1 - y2)
THRESHOLD = 1e-2
sel = abs_delta < THRESHOLD
ix_start = np.where(sel)[0][0]
fig, axes = plt.subplots(3, 1)
ax = axes[0]
ax.plot(y1, '-')
ax.set_title('y1')
ax.axvline(ix_start, color='red')
ax = axes[1]
ax.plot(y2, '-')
ax.axvline(ix_start, color='red')
ax.set_title('y2')
ax = axes[2]
ax.plot(abs_delta)
ax.axvline(ix_start, color='red')
ax.set_title('abs diff')
This method works if the overlapping parts are indeed "almost identical". You will have to think of smarter alignment ways if the similarity is low.
I think what you are looking for is correlation. Here is a small example.
import numpy as np
equal_part = [0, 1, 2, 3, -2, -4, 5, 0]
y1 = equal_part + [0, 1, 2, 3, -2, -4, 5, 0]
y2 = [1, 2, 4, -3, -2, -1, 3, 2] + y1
np.argmax(np.correlate(y1, y2, 'same'))
Out:
7
So this returns the time-difference, where the correlation between both signals is at its maximum. As you can see, in the example the time difference should be 8, but this depends on your data...
Also note that, ideally, both signals should have the same length.
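If your signals differ in length, a minimal sketch (a hypothetical helper, using zero padding) of making them comparable before correlating:
import numpy as np

def pad_to_same_length(a, b):
    # zero-pad the shorter sequence at the end so both have equal length
    n = max(len(a), len(b))
    a = np.pad(np.asarray(a, dtype=float), (0, n - len(a)))
    b = np.pad(np.asarray(b, dtype=float), (0, n - len(b)))
    return a, b

y1_p, y2_p = pad_to_same_length(y1, y2)
lag = np.argmax(np.correlate(y1_p, y2_p, 'same'))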