I was trying to understand the effective difference between these two pieces of code. Both were written for a school assignment, but only the first one works as it should. I've been unable to work out what goes wrong in the second one, so I'd be fantastically grateful if someone could shed some light on this problem.
First code:
def classify(self, obj):
    if sum([c[0].classify(obj)*c[1] for c in self.classifiers]) > 0:
        return 1
    else:
        return -1
def update_weights(self, best_error, best_classifier):
    w = self.data_weights
    for index in range(len(self.data_weights)):
        if self.standard.classify(self.data[index]) == best_classifier.classify(self.data[index]):
            s = -1
        else:
            s = 1
        self.data_weights[index] = self.data_weights[index]*math.exp(s*error_to_alpha(best_error))
Second code:
def classify(self, obj):
    score = 0
    for c, alpha in self.classifiers:
        score += alpha * c.classify(obj)
    if score > 0:
        return 1
    else:
        return -1
def update_weights(self, best_error, best_classifier):
    alpha = error_to_alpha(best_error)
    for d, w in zip(self.data, self.data_weights):
        if self.standard.classify(d) == best_classifier.classify(d):
            w *= w * math.exp(alpha)
        else:
            w *= w * math.exp(-1.0*alpha)
The second version doesn't modify the weights.
In the first you explicitly modify the weights array with the line
self.data_weights[index] = ...
but in the second you are only modifying w:
w *= ...
(and you have an extra factor of w, since w *= w * x computes w * w * x). In the second case, w is a variable that is initialised from an entry of data_weights, but it is a new variable. It is not the same thing as the array entry, and rebinding it does not change the array itself.
So when you later go to look at data_weights in the second case, it will not have been updated.
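A minimal sketch of the difference, using a plain list in place of self.data_weights (the names weights and factor are made up for illustration): rebinding the loop variable leaves the list untouched, while assigning through an index writes the new value back.

```python
weights = [1.0, 2.0, 3.0]
factor = 0.5  # hypothetical stand-in for math.exp(+/-alpha)

# Rebinding the loop variable: w is a separate name, so the list is untouched.
for w in weights:
    w *= factor
unchanged = list(weights)

# Assigning through the index writes the new value back into the list.
for index, w in enumerate(weights):
    weights[index] = w * factor
updated = list(weights)

print(unchanged)  # [1.0, 2.0, 3.0]
print(updated)    # [0.5, 1.0, 1.5]
```

Applied to the second update_weights, that would probably mean looping with enumerate over zip(self.data, self.data_weights) and assigning to self.data_weights[index], without the extra factor of w.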
I'm simplifying an engineering problem as much as possible for this question:
I have a working code like this:
import numpy as np
# FUNCTION DEFINITION
def Calculations(a, b): # a function defined to work based on 2 arguments, a and b
    A = a * b - a
    B = a * b - b
    d = A - B
    return(A, B, d, a, b)
# STORE LIST CREATION
A_list = []
B_list = []
d_list = []
a_list = []
b_list = [] # I will need this list later
# 1st sequential iteration in a for loop
length = np.arange(60, 62.5, 0.5)
for l in length:
    lower = 50 # this is what I want the program to update based on d
    upper = 70.5 # this is what I want the program to update based on d
    step = 0.5
    width = np.arange(lower, upper, step)
    # 2nd for loop, but here I wouldn't like a sequential iteration
    for w in width:
        A_list.append(Calculations(l, w)[0])
        B_list.append(Calculations(l, w)[1])
        d_list.append(Calculations(l, w)[2])
        a_list.append(Calculations(l, w)[3])
        b_list.append(Calculations(l, w)[4])
print(A_list, " \n")
print(B_list, " \n")
print(d_list, " \n")
print(a_list, " \n")
print(b_list, " \n")
This is the way I have it now, but not how I want it to work.
Here, the program iterates each time through the values of length(l) in a sequential manner, meaning it evaluates everything for l=60, then for l=60.5 and so on... this is ok, but then, for l=60 it evaluates first for w=50, then for w=50.5 and so on...
What I want is that, for l=60, it evaluates a random value (let's call it n) between 50 (lower) and 70.5 (upper) on a grid with spacing 0.5 (step). It then finds a particular d as one of the returned results: if d is negative, the n it used becomes the new upper; if d is positive, that n becomes the new lower. It continues like this until d is zero.
I will keep trying to figure it out by myself, but any help would be appreciated.
PS:
As I said, this example is a simplification of my real problem. As side questions I would like to ask:
The real condition for the while loop to break is not d being zero but d being as close as possible to zero, or, phrased another way, the min() of the abs() values in d_list. I tried something like:
for value in d_list:
if value = min(abs(d_list)):
print(A_list, " \n")
print(B_list, " \n")
print(d_list, " \n")
print(a_list, " \n")
print(b_list, " \n")
but that's not correct.
I don't want to use a condition such as if d < 0.2, because sometimes I will get d's like 0.6 and that may be ok. Neither do I want a condition like if d < 1, because then if, for example, d = 0.005, I would get a lot of d's before that one which also satisfy d < 1, and I only want one for each l.
I also need to find the associated values in the returned lists, for that specific d
EDIT
I made a mistake earlier in the conditions for new upper and lower based on the obtained value of d, I fixed that.
Also, I tried solving the problem like this:
length = np.arange(60, 62.5, 0.5)
for l in length:
    lower_w = 59.5 # this is what I want the program to update based on d
    upper_w = 63 # this is what I want the program to update based on d
    step = 0.5
    width = np.arange(lower_w, upper_w, step)
    np.random.shuffle(width)
    for w in width:
        while lower_w < w < upper_w:
            A_list.append(Calculations(l,w)[0])
            B_list.append(Calculations(l,w)[1])
            d_list.append(Calculations(l,w)[2])
            a_list.append(Calculations(l,w)[3])
            b_list.append(Calculations(l,w)[4])
            for element in d_list:
                if element < 0:
                    upper = w
                else:
                    lower = w
                if abs(element) < 1 :
                    break
But the while loop does not get to break...
Use np.random.shuffle to pick the elements of width in a random order:
width = np.arange(lower, upper, step)
np.random.shuffle(width)
But here you don't really want the second loop, you just want to pick one element at random, so use np.random.choice(width):
length = np.arange(60, 62.5, 0.5)
lower = 50 # this is what I want the program to update based on d
upper = 70.5 # this is what I want the program to update based on d
step = 0.5
for l in length:
    width = np.arange(lower, upper, step)
    if len(width) == 0:
        width = [lower]
    w = np.random.choice(width)
    (A, B, d, a, b) = Calculations(l, w)
    A_list.append(A)
    B_list.append(B)
    d_list.append(d)
    a_list.append(l)
    b_list.append(w)
    if d < 0:
        lower = w
    elif d:
        upper = w
No need to pass a and b in the return of the Calculations function, you can just append the original parameters to a_list and b_list.
Note that you will run into an error if your lower and upper bounds are identical, because np.arange then returns an empty array and np.random.choice cannot pick from it, so you need to fill in the list with a bound if it comes back empty.
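For the side question about taking the d closest to zero for each l, one option (a sketch only, and it may not carry over to the real problem) is to evaluate the whole width grid at once and pick the entry with the smallest |d| using np.argmin. The names calculations and closest_to_zero are made up here; the arithmetic is the same as Calculations in the question, trimmed to the three computed values.

```python
import numpy as np

def calculations(a, b):
    # Same arithmetic as Calculations in the question.
    A = a * b - a
    B = a * b - b
    return A, B, A - B

def closest_to_zero(l, lower=50.0, upper=70.5, step=0.5):
    """Return (w, d) for the grid width whose d is nearest to zero."""
    width = np.arange(lower, upper, step)
    _, _, d = calculations(l, width)   # evaluated element-wise on the array
    best = np.argmin(np.abs(d))        # index of the smallest |d|
    return float(width[best]), float(d[best])

for l in np.arange(60, 62.5, 0.5):
    w, d = closest_to_zero(l)
    print(l, w, d)
```

This gives exactly one (w, d) pair per l without needing a threshold such as d < 1; the matching A and B could be looked up with the same index.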
I'm a beginner in Python and I'm trying to calculate the angles (-26.6 and 18.4) for the figure below, and likewise for the rest of the squares, using Python code.
I have found the code below and I'm trying to understand it properly. How does it work here? Could someone clarify, please?
Python Code:
import math
def computeDegree(a,b,c):
    babc = (a[0]-b[0])*(c[0]-b[0])+(a[1]-b[1])*(c[1]-b[1])
    norm_ba = math.sqrt((a[0]-b[0])**2 + (a[1]-b[1])**2)
    norm_bc = math.sqrt((c[0]-b[0])**2 + (c[1]-b[1])**2)
    norm_babc = norm_ba * norm_bc
    radian = math.acos(babc/norm_babc)
    degree = math.degrees(radian)
    return round(degree, 1)
def funcAngle(p, s, sn):
    a = (s[0]-p[0], s[1]-p[1])
    b = (sn[0]-p[0], sn[1]-p[1])
    c = a[0] * b[1] - a[1] * b[0]
    if p != sn:
        d = computeDegree(s, p, sn)
    else:
        d = 0
    if c > 0:
        result = d
    elif c < 0:
        result = -d
    elif c == 0:
        result = 0
    return result
p = (1,4)
s = (2,2)
listSn= ((1,2),(2,3),(3,2),(2,1))
for sn in listSn:
    print(funcAngle(p,s,sn))
The results
I expected to get the angles in the picture such as -26.6, 18.4 ...
Essentially, this uses the definition of the dot product to solve for the angle. You can read more about it at this link (also where I found these images).
To solve for the angle you first need to convert your 3 input points into two vectors.
# Vector from b to a
# BA = (a[0] - b[0], a[1] - b[1])
BA = a - b
# Vector from b to c
# BC = (c[0] - b[0], c[1] - b[1])
BC = c - b
Using the two vectors you can then find the angle between them by first finding the value of the dot product with the second formula.
# babc = (a[0]-b[0])*(c[0]-b[0])+(a[1]-b[1])*(c[1]-b[1])
dot_product = BA[0] * BC[0] + BA[1] * BC[1]
Then, going back to the first definition, you can divide by the lengths of the two input vectors, and the resulting value is the cosine of the angle between the vectors. It may be hard to read with the array notation, but it's just using the Pythagorean theorem.
# Length/magnitude of vector BA
# norm_ba = math.sqrt((a[0]-b[0])**2 + (a[1]-b[1])**2)
length_ba = math.sqrt(BA[0]**2 + BA[1]**2)
# Length/magnitude of vector BC
# norm_bc = math.sqrt((c[0]-b[0])**2 + (c[1]-b[1])**2)
length_bc = math.sqrt(BC[0]**2 + BC[1]**2)
# Then using acos (essentially inverse of cosine), you can get the angle
# radian = math.acos(babc/norm_babc)
angle = math.acos(dot_product / (length_ba * length_bc))
Most of the other stuff is just there to catch cases where the program might accidentally try to divide by zero. Hopefully this helps to explain why it looks the way it does.
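Putting the pieces together, here is a compact sketch that returns the signed angle directly. Note that it swaps in a different technique from the code in the question: instead of acos followed by a separate sign check, it passes the cross and dot products to math.atan2, which gives the sign for free and avoids the division. The function name signed_angle is made up for this example.

```python
import math

def signed_angle(p, s, sn):
    """Signed angle in degrees at p, measured from p->s to p->sn."""
    ax, ay = s[0] - p[0], s[1] - p[1]      # vector from p to s
    bx, by = sn[0] - p[0], sn[1] - p[1]    # vector from p to sn
    dot = ax * bx + ay * by
    cross = ax * by - ay * bx
    return round(math.degrees(math.atan2(cross, dot)), 1)

p = (1, 4)
s = (2, 2)
for sn in ((1, 2), (2, 3), (3, 2), (2, 1)):
    print(sn, signed_angle(p, s, sn))
```

With the points from the question, the first two results are -26.6 and 18.4, the values the asker expected. One behavioural difference to be aware of: for vectors pointing in exactly opposite directions this returns 180.0, whereas funcAngle returns 0 because its cross product is zero.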
Edit: I answered this question because I was bored and didn't see harm in explaining the math behind that code; however, in the future try to avoid asking questions like 'how does this code work'.
Let's start with funcAngle since it calls computeDegree later.
The first thing it does is define a as a two item tuple. A lot of this code seems to use two item tuples, with the two parts referenced by v[0] and v[1] or similar. These are almost certainly two dimensional vectors of some sort.
I'm going to write these as 𝐯 for the vector and vₓ and vᵧ since they're probably the two components.
[don't look too closely at that second subscript, it's totally a y and not a gamma...]
a is the vector difference between s and p: i.e.
a = (s[0]-p[0], s[1]-p[1])
is aₓ=sₓ-pₓ and aᵧ=sᵧ-pᵧ; or just 𝐚=𝐬-𝐩 in vector notation.
b = (sn[0]-p[0], sn[1]-p[1])
again; 𝐛=𝐬𝐧-𝐩
c = a[0] * b[1] - a[1] * b[0]
c=aₓbᵧ-aᵧbₓ; c is the cross product of 𝐚 and 𝐛 (and is just a number)
if p != sn:
    d = computeDegree(s, p, sn)
else:
    d = 0
I'd take the above in reverse: if 𝐩 and 𝐬𝐧 are the same, then we already know the angle between them is zero (and it's possible the algorithm fails badly) so don't compute it. Otherwise, compute the angle (we'll look at that later).
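if c > 0:
    result = d
elif c < 0:
    result = -d
elif c == 0:
    result = 0
The sign of c says which way 𝐛 is turned relative to 𝐚: positive means counter-clockwise (that's the right hand rule), and that's fine; negative means clockwise, so we negate the angle. If c is zero the two vectors lie along the same line and the code just returns 0, even when they point in opposite directions.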
return result
Pass the number we've just worked out to some other code.
You can probably invoke this code by adding something like:
print(funcAngle((1,0),(0,1),(2,2)))
at the end and running it. (Haven't actually tested these numbers)
So this function works out a and b to get c; all just to negate the angle if it's pointing the wrong way. None of these variables are actually passed to computeDegree.
so, computeDegree():
def computeDegree(a,b,c):
First thing to note is that the variables from before have been renamed. funcAngle passed s, p and sn, but now they're called a, b and c. And note that the order they're passed in isn't the same as the order funcAngle received them in, which is nasty and confusing.
babc = (a[0]-b[0])*(c[0]-b[0])+(a[1]-b[1])*(c[1]-b[1])
babc = (aₓ-bₓ)(cₓ-bₓ)+(aᵧ-bᵧ)(cᵧ-bᵧ)
If 𝐚' and 𝐜' are 𝐚-𝐛 and 𝐜-𝐛 respectively, this is just
a'ₓc'ₓ+a'ᵧc'ᵧ, or the dot product of 𝐚' and 𝐜'.
norm_ba = math.sqrt((a[0]-b[0])**2 + (a[1]-b[1])**2)
norm_bc = math.sqrt((c[0]-b[0])**2 + (c[1]-b[1])**2)
norm_ba = √[(aₓ-bₓ)² + (aᵧ-bᵧ)²] (and norm_bc likewise).
This looks like the length of the hypotenuse of 𝐚' (and 𝐜' respectively)
norm_babc = norm_ba * norm_bc
which we then multiply together
radian = math.acos(babc/norm_babc)
We use the arccosine (inverse cosine, cos^-1) function, with the length of those multiplied hypotenuses as the hypotenuse and that dot product as the adjacent length...
degree = math.degrees(radian)
return round(degree, 1)
but that's in radians, so we convert to degrees and round it for nice formatting.
Ok, so now it's in maths, rather than Python, but that's still not very easy to understand.
(sidenote: this is why descriptive variable names and documentation are everyone's friends!)
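To illustrate that last point, here is computeDegree rewritten with descriptive names. The names angle_at_vertex, first, vertex and second are invented for this sketch, and two things are additions rather than part of the original: the explicit error when a point coincides with the vertex, and the clamp before acos (which guards against rounding pushing the cosine slightly outside [-1, 1]). The arithmetic is otherwise the same.

```python
import math

def angle_at_vertex(first, vertex, second):
    """Unsigned angle in degrees between vertex->first and vertex->second."""
    to_first = (first[0] - vertex[0], first[1] - vertex[1])
    to_second = (second[0] - vertex[0], second[1] - vertex[1])
    dot = to_first[0] * to_second[0] + to_first[1] * to_second[1]
    lengths = math.hypot(*to_first) * math.hypot(*to_second)
    if lengths == 0:
        # Addition: the original would raise ZeroDivisionError here.
        raise ValueError("a point coincides with the vertex")
    cosine = max(-1.0, min(1.0, dot / lengths))  # addition: clamp for acos
    return round(math.degrees(math.acos(cosine)), 1)

print(angle_at_vertex((1, 0), (0, 0), (0, 1)))  # 90.0
```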
I have done the Recursive function in Python that works:
def Rec(n):
    if (n<=5):
        return 2*n
    elif (n>=6):
        return Rec(n-6)+2*Rec(n-4)+4*Rec(n-2)
print (Rec(50))
But I can't think of an iterative one
I am sure I will need to use a loop and possibly have 4 variables to store the previous values, imitating a stack.
For your particular question, assuming you have an input n, the following code should calculate the function iteratively in python.
val = []
for i in range(6):
    val.append(2*i)
for i in range(6,n+1):
    val.append( val[i-6] + 2*val[i-4] + 4*val[i-2] )
print(val[n])
I get this answer:
$ python test.py
Rec(50) = 9142785252232708
Kist(50) = 9142785252232708
Using the code below. The idea is that your function needs a "window" of previous values - Kn-6, Kn-4, Kn-2 - and that window can be "slid" along as you compute new values.
So, for some value like "14", you would have a window of K8, K9, ... K13. Just compute using those values, then drop K8 since you'll never use it again, and append K14 so you can use it in computing K15..20.
def Rec(n):
    if (n<=5):
        return 2*n
    elif (n>=6):
        return Rec(n-6)+2*Rec(n-4)+4*Rec(n-2)
def Kist(n):
    if n <= 5:
        return 2 * n
    # Seed the window with K0..K5 (k avoids reusing the name n)
    KN = [2*k for k in range(6)]
    for i in range(6, n+1):
        kn = KN[-6] + 2 * KN[-4] + 4 * KN[-2]
        KN.append(kn)
        KN = KN[-6:]  # keep only the last six values
    return KN[-1]
print("Rec(50) =", Rec(50))
print("Kist(50) =", Kist(50))
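The same sliding window can also be kept at a fixed size with collections.deque(maxlen=6), which drops the oldest value automatically on each append. This is only a sketch of that variant; the name kist_deque is made up, and rec_memo is there purely as a cached reference to compare against.

```python
from collections import deque
from functools import lru_cache

@lru_cache(maxsize=None)
def rec_memo(n):
    # The recursive definition from the question, with caching so it runs quickly.
    if n <= 5:
        return 2 * n
    return rec_memo(n - 6) + 2 * rec_memo(n - 4) + 4 * rec_memo(n - 2)

def kist_deque(n):
    if n <= 5:
        return 2 * n
    window = deque((2 * k for k in range(6)), maxlen=6)  # K0..K5
    for _ in range(6, n + 1):
        # window[0] is K(i-6), window[2] is K(i-4), window[4] is K(i-2)
        window.append(window[0] + 2 * window[2] + 4 * window[4])
    return window[-1]

print(kist_deque(50) == rec_memo(50))
```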
I'm trying to find a Python library or method that will create a formula describing an arbitrary list of integers like: [1,1,2,4,3,1,2,3,4,1,...]
For example:
from some_awesome_library import magic_method
seq = [1,1,2,4,3,1,2,3,4,1,4,3,2,2,4,3]
def my_func(sequence):
    equation = magic_method(sequence)
    return equation
print(my_func(seq))
The order of the sequence matters, but it has certain rules. For example, all integers will be between 1 and 4, and there will be an equal number of each integer within the sequence.
I've looked into numpy.polyfit and scipy.optimize.leastsq. I suspect that scipy is what I need, but it'd be great to have confirmation of that approach and any suggestions for the types of mathematical functions I should look into using (I'm not much of a maths person, I just studied up to college calculus). Maybe some sort of modulo function? Maybe a sine wave?
Thanks in advance for any help or suggestions you have.
EDIT: Thanks for your comments below. I'm curious about Sudoku puzzles, specifically N=2 puzzles. I suspect that if you take the entire solution space and line them up in a certain order that patterns will emerge that might be useful for solving Sudoku faster. I've got a class that represents the solution space called SolutionManager and returns "slices" of the solution space that look like the list of integers shown above. Below is an image of one such example for an N=2 puzzle solution space (generated with Jupyter Notebook):
I think I can see patterns in the data, but I'm trying to figure out how to develop formulas that represent these patterns. I also suspect that reordering the solutions will make for simpler equations describing those patterns.
To test that, I'm trying to write a genetic algorithm that will reorder the solutions described in the SolutionManager according to how simple the equations describing them are. I'm stuck on writing the fitness function, which should rate the SolutionManager instance by how simple its describing equations are.
Code for the SolutionManager is below:
class SolutionManager:
    """Object that contains all possible Solutions for an n=2 Sudoku array"""
    def __init__(self, solutions_file):
        # One 16-digit grid per line; "with" closes the file afterwards
        with open(solutions_file, 'r') as input_file:
            grids = [r.strip() for r in input_file.readlines()]
        list_of_solutions = []
        i = 0
        for grid in grids:
            list_of_cubes = []
            i += 1
            for r in range(4):
                for c in range(4):
                    pos = r * 4 + c
                    digit = int(grid[pos])
                    cube = Cube(i, c, r, digit)
                    list_of_cubes.append(cube)
            list_of_solutions.append(Solution(list_of_cubes))
        self.solutions = list_of_solutions
        assert isinstance(self.solutions, list)
        assert all(isinstance(x, Solution) for x in self.solutions)
    def get_vertical_slice(self, slice_index):
        """Get a vertical slice of the Solution Space"""
        assert 0 <= slice_index < 4  # valid column indices are 0..3
        slice = []
        for sol in self.solutions:
            slice.append(sol.get_column(slice_index))
        return slice
    def get_horizontal_slice(self, slice_index):
        """Get a horizontal slice of the Solution Space"""
        assert 0 <= slice_index < 4  # valid row indices are 0..3
        slice = []
        for sol in self.solutions:
            slice.append(sol.get_row(slice_index))
        return slice
    def sort_solutions_by_vertical_axis(self, axis_index):
        """Sorts the solutions by a vertical axis using an algorithm"""
        pass
class Solution:
    def __init__(self, cubes):
        assert (len(cubes) == 16)
        self.solution_cubes = cubes
    def get_column(self, c):
        # Filter on the column attribute (the original compared row here)
        return list(_cube for _cube in self.solution_cubes if _cube.column == c)
    def get_row(self, r):
        # Filter on the row attribute (the original compared column here)
        return list(_cube for _cube in self.solution_cubes if _cube.row == r)
    def get_column_value(self, c, v):
        # The original referenced an undefined name r
        single_cube = list(_cube for _cube in self.solution_cubes if _cube.column == c and _cube.value == v)
        assert (len(single_cube) == 1)
        return single_cube[0]
    def get_row_value(self, r, v):
        # The original referenced an undefined name c
        single_cube = list(_cube for _cube in self.solution_cubes if _cube.row == r and _cube.value == v)
        assert (len(single_cube) == 1)
        return single_cube[0]
    def get_position(self, r, c):
        single_cube = list(_cube for _cube in self.solution_cubes if _cube.column == c and _cube.row == r)
        assert (len(single_cube) == 1)
        return single_cube[0]
class Cube:
    def __init__(self, d, c, r, v):
        self.depth = d
        self.column = c
        self.row = r
        self.value = v
    def __str__(self):
        return str(self.value)
I've written some Python code to calculate a certain quantity from a cosmological simulation. It does this by checking whether a particle is contained within a box of size 8,000^3, starting at the origin and advancing the box once all particles contained within it have been found. As I am counting ~2 million particles altogether, and the total size of the simulation volume is 150,000^3, this is taking a long time.
I'll post my code below, does anybody have any suggestions on how to improve it?
Thanks in advance.
from __future__ import division
import numpy as np
def check_range(pos, i, j, k):
    a = 0
    if i <= pos[2] < i+8000:
        if j <= pos[3] < j+8000:
            if k <= pos[4] < k+8000:
                a = 1
    return a
def sigma8(data):
    N = []
    to_do = data
    print('Counting number of particles per cell...')
    for k in range(0,150001,8000):
        for j in range(0,150001,8000):
            for i in range(0,150001,8000):
                temp = []
                n = []
                for count in range(len(to_do)):
                    n.append(check_range(to_do[count],i,j,k))
                    to_do[count][1] = n[count]
                    if to_do[count][1] == 0:
                        temp.append(to_do[count])
                #Only particles that have not been found are
                # searched for again
                to_do = temp
                N.append(sum(n))
            print('Next row')
        print('Next slice, %i still to find' % len(to_do))
    print('Calculating sigma8...')
    if not sum(N) == len(data):
        return 'Error!\nN measured = {0}, total N = {1}'.format(sum(N), len(data))
    else:
        return 'sigma8 = %.4f, variance = %.4f, mean = %.4f' % (np.sqrt(sum((N-np.mean(N))**2)/len(N))/np.mean(N), np.var(N),np.mean(N))
I'll try to post some code, but my general idea is the following: create a Particle class that knows about the box that it lives in, which is calculated in the __init__. Each box should have a unique name, which might be the coordinate of the bottom left corner (or whatever you use to locate your boxes).
Get a new instance of the Particle class for each particle, then use a Counter (from the collections module).
Particle class looks something like:
import math
# static consts - outside so that every instance of Particle doesn't take them along
# for the ride...
MAX_X = 150000
X_STEP = 8000
Y_STEP = 8000  # assumed equal to X_STEP, as the boxes in the question are cubes
Z_STEP = 8000
# etc.
class Particle(object):
    def __init__(self, data):
        # x, y, z assumed to sit at indices 2, 3, 4, as in check_range in the question
        self.x = data[2]
        self.y = data[3]
        self.z = data[4]
        self.compute_box_label()
    def compute_box_label(self):
        x_label = int(math.floor(self.x / X_STEP))
        y_label = int(math.floor(self.y / Y_STEP))
        z_label = int(math.floor(self.z / Z_STEP))
        self.box_label = str(x_label) + '-' + str(y_label) + '-' + str(z_label)
Anyway, I imagine your sigma8 function might look like:
def sigma8(data):
    import collections as col
    particles = [Particle(x) for x in data]
    boxes = col.Counter([x.box_label for x in particles])
    counts = boxes.most_common()
    #some other stuff
counts will be a list of tuples which map a box label to the number of particles in that box. (Here we're treating particles as indistinguishable.)
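List comprehensions are usually somewhat faster than the equivalent explicit loops (I think the reason is that you're relying more on the underlying C, but I'm not the person to ask), and Counter is (supposedly) highly optimized as well. The bigger win here, though, is that each particle is looked at once, instead of once for every box.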
Note: None of this code has been tested, so you shouldn't try the cut-and-paste-and-hope-it-works method here.
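Since the question already imports numpy, the same one-pass idea can also be sketched without a class at all: compute each particle's cell index with a floor division and count with np.bincount. This is an assumption-laden sketch, not the asker's method: count_per_cell is a made-up name, positions is assumed to be an (N, 3) array holding only the x, y, z columns, and coordinates are assumed to lie in [0, 150000].

```python
import numpy as np

def count_per_cell(positions, box_size=150000.0, cell_size=8000.0):
    """Count particles in each cubic cell, including empty cells."""
    positions = np.asarray(positions, dtype=float)
    n_cells = int(np.ceil(box_size / cell_size))        # 19 per axis for the defaults
    idx = np.floor(positions / cell_size).astype(int)   # (N, 3) integer cell indices
    idx = np.clip(idx, 0, n_cells - 1)                  # keep particles on the edge in range
    flat = np.ravel_multi_index(tuple(idx.T), (n_cells,) * 3)
    return np.bincount(flat, minlength=n_cells ** 3)

rng = np.random.default_rng(0)
positions = rng.uniform(0, 150000, size=(100000, 3))   # made-up test data
counts = count_per_cell(positions)
# Same ratio as the question's sigma8 expression: std of the counts over their mean.
print(counts.sum(), counts.std() / counts.mean())
```

Using minlength keeps the empty cells in the result, which matters because the original loop appends a count for every cell, including those with no particles.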