How to calculate distance from points in lists? - python

I have two groups of lists, A and O. Both hold the x, y, and z coordinates of points. I want to calculate the distance between corresponding points in A and O. I used a for loop, but it only gives me one result. It should give me 8 numbers. I would really appreciate it if someone could have a look. It's the last step in my project.
Ax = [-232.34, -233.1, -232.44, -233.02, -232.47, -232.17, -232.6, -232.29, -231.65]
Ay = [-48.48, -49.48, -50.81, -51.42, -51.95, -52.25, -52.83, -53.63, -53.24]
Az = [-260.77, -253.6, -250.25, -248.88, -248.06, -247.59, -245.82, -243.98, -243.76]
Ox = [-302.07, -302.13, -303.13, -302.69, -303.03, -302.55, -302.6, -302.46, -302.59]
Oy = [-1.73, -3.37, -4.92, -4.85, -5.61, -5.2, -5.91, -6.41, -7.4]
Oz = [-280.1, -273.02, -269.74, -268.32, -267.45, -267.22, -266.01, -264.79, -264.96]
distance = []
for xa in A1:
    for ya in A2:
        for za in A3:
            for x1 in o1:
                for y1 in o2:
                    for z1 in o3:
                        distance += distance
                        distance = (((xa-x1)**2)+((ya-y1)**2)+((za-z1)**2))**(1/2)
print(distance)

Other people have provided fixes to your immediate problem. I would also recommend that you start using numpy and avoid all of those for loops. Numpy lets you vectorize your code, which basically offloads all of the looping to very efficient compiled C implementations. For instance, you can replace your whole nested for loop with the following vectorized implementation:
import numpy as np
# Convert your arrays to numpy arrays
Ax = np.asarray(Ax)
Ay = np.asarray(Ay)
Az = np.asarray(Az)
Ox = np.asarray(Ox)
Oy = np.asarray(Oy)
Oz = np.asarray(Oz)
# Find the distance in a single, vectorized operation
np.sqrt(np.sum(((Ax-Ox)**2, (Ay-Oy)**2, (Az-Oz)**2), axis=0))
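A variation on the same idea, as a rough sketch: stack the coordinates into one array per group and let np.linalg.norm do the square root of the summed squares. The names A and O for the stacked arrays are my own, not from the question.

```python
import numpy as np

# Coordinates copied from the question.
Ax = [-232.34, -233.1, -232.44, -233.02, -232.47, -232.17, -232.6, -232.29, -231.65]
Ay = [-48.48, -49.48, -50.81, -51.42, -51.95, -52.25, -52.83, -53.63, -53.24]
Az = [-260.77, -253.6, -250.25, -248.88, -248.06, -247.59, -245.82, -243.98, -243.76]
Ox = [-302.07, -302.13, -303.13, -302.69, -303.03, -302.55, -302.6, -302.46, -302.59]
Oy = [-1.73, -3.37, -4.92, -4.85, -5.61, -5.2, -5.91, -6.41, -7.4]
Oz = [-280.1, -273.02, -269.74, -268.32, -267.45, -267.22, -266.01, -264.79, -264.96]

# Hypothetical names: A and O are (N, 3) arrays with one row per point.
A = np.column_stack((Ax, Ay, Az))
O = np.column_stack((Ox, Oy, Oz))

# Euclidean norm of each row of the difference, one distance per point pair.
distance = np.linalg.norm(A - O, axis=1)
print(distance)
```

Keeping each point as a row tends to make later steps (filtering points, adding a fourth coordinate) easier than juggling six separate arrays.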

Your first issue is this:
distance = (((xa-x1)**2)+((ya-y1)**2)+((za-z1)**2))**(1/2)
This despite you defining distance as a list. You're replacing a list of values with a single value. What you want is
distance.append((((xa-x1)**2)+((ya-y1)**2)+((za-z1)**2))**(1/2))
which will add this value to the end of the list.
Second thing: your workflow could be improved. Instead of using that many for loops, try this. You know that A1, A2, A3, o1, o2, and o3 all have the same length, so:
distance = []
for i in range(len(A1)):  # runs once per point
    xa, ya, za = A1[i], A2[i], A3[i]  # these values correspond to each other
    x1, y1, z1 = o1[i], o2[i], o3[i]  # all are in the same position in their respective lists
    distance.append((((xa-x1)**2)+((ya-y1)**2)+((za-z1)**2))**(1/2))
print(distance)

You need to be appending to distance not assigning it. You should be doing something like this inside of your for loops:
distance.append((((xa-x1)**2)+((ya-y1)**2)+((za-z1)**2))**(1/2))

By nesting all those loops, you execute each "subloop" in full on every iteration of its "parent loop", and so on, resulting in far more iterations than necessary and some mixed-up data. As other answers have mentioned, you're also reassigning distance to the value of the last calculation of the innermost loop on every pass.
You can do all of this a lot more efficiently by zipping the data.
distance = []
for ptA, ptB in zip(zip(Ax, Ay, Az), zip(Ox, Oy, Oz)):
    distance.append(pow(sum(pow(a - b, 2) for a, b in zip(ptA, ptB)), 0.5))
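To get a feel for how much extra work the six nested loops do, here is a small counting sketch. It uses index ranges as stand-ins for the six lists and assumes 9 values per list, as in the data posted in the question.

```python
from itertools import product

n = 9  # each coordinate list in the question holds 9 values

# Six nested loops visit every combination of one value from each list.
nested_iterations = sum(1 for _ in product(range(n), repeat=6))

# zip pairs items up by position, so it visits each index exactly once.
zipped_iterations = sum(1 for _ in zip(*[range(n)] * 6))

print(nested_iterations)  # 531441, i.e. 9**6
print(zipped_iterations)  # 9
```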

Your nested loops are not merely inefficient, but incorrect. You are going through every combination of x, y, and z values for both sets of points.
Here's a list comprehension to accomplish the task:
distance = [((xa-x1)**2 + (ya-y1)**2 + (za-z1)**2)**(0.5)
            for (xa, ya, za, x1, y1, z1) in zip(Ax, Ay, Az, Ox, Oy, Oz)]
The zip call produces groups of the corresponding coordinate values. These are then unpacked into individual values for a given pair of points. Then the distance is calculated and added to the resulting list. Here is the result:
[86.14803712215387, 85.25496701072612, 86.50334270997855, 86.02666679582558, 86.61455593605497, 86.90445212991106, 86.65519315078585, 87.10116761559514, 87.08173861378742]
Note that the (1/2) in your formula works in Python 3, but not in Python 2, where integer division makes 1/2 evaluate to 0. I've used 0.5, which will work for both. Using math.sqrt() might be an even better idea.
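For reference, a minimal sketch of the math.sqrt() variant mentioned above. It uses only the first three points from the question to keep it short, and the loop variable names are my own. On Python 3.8 or newer, math.dist gives the same distance from two coordinate tuples.

```python
import math

# First three points from the question (shortened for illustration).
Ax, Ay, Az = [-232.34, -233.1, -232.44], [-48.48, -49.48, -50.81], [-260.77, -253.6, -250.25]
Ox, Oy, Oz = [-302.07, -302.13, -303.13], [-1.73, -3.37, -4.92], [-280.1, -273.02, -269.74]

# math.sqrt behaves the same in Python 2 and 3, unlike **(1/2).
distance = [math.sqrt((xa - xo)**2 + (ya - yo)**2 + (za - zo)**2)
            for xa, ya, za, xo, yo, zo in zip(Ax, Ay, Az, Ox, Oy, Oz)]

# Python 3.8+ only: math.dist takes the two points as sequences.
distance_alt = [math.dist(pa, po)
                for pa, po in zip(zip(Ax, Ay, Az), zip(Ox, Oy, Oz))]

print(distance)
```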

Related

Variables with indexes and sums with indexes in mosek

I have to find solutions to an integer programming problem:
I am using Mosek's Fusion API (Python). Now the constraints are easy to put in, I am more worried about the actual objective. The problem for me is: How can I tell mosek that I want to sum over all i, j, or k, and define what they are, what their bounds are, etc.?
This is a simplified version of a self-caching problem in the context of servers. So i here means a server, j means an object to cache, but in this version there's one object, so this I guess is not important. k means server too, so e.g. d(ik) means the distance from the server i to the server k.
But whatever I want to achieve, I don't know how to write this objective. For now I have something like this:
from mosek.fusion import Domain, Model, Expr, ObjectiveSense
alpha = 4 # alpha is the same for all i and j
demand = 1 # w is the same for all i and k
n = 6 # number of servers
distances_matrix = [[...], [...], ...]
with Model("lo1") as M:
    x = M.variable("x", n, Domain.integral(Domain.inRange(0, 1)))
    y = M.variable("y", n, Domain.integral(Domain.inRange(0, 1)))
    alpha_times_x = Expr.mul(alpha, x)
    demand_times_dist_times_y = Expr.mul(demand, distances_matrix, y)
    M.objective("obj", ObjectiveSense.Minimize, )
    M.solve()
    print(x.level())
    print(y.level())
Now of course the demand_times_dist_times_y is wrong, because I want to get the distance from i to k from the matrix. And the x above is fine since xs are: {x0, x1, x2, x3, x4, x5, x6}, but the ys would have to be {y11, y12, y13, y14, y15, y16, y21, y22, ..., y66}, so I guess I defined them wrong.
So e.g. how can I define that i,k are in {1,2,3,4,5,6} and create an Expr.sum by e.g. k? And how would I define those two sums at the beginning of the objective?
I don't know if that answers the question, but if you have, say
x = M.variable("x", n, Domain.integral(Domain.inRange(0, 1)))
then sum_i x_i is obtained with
Expr.sum(x)
Similarly, if now alpha is a numerical array of length n then sum_i (alpha_i*x_i) is obtained with
Expr.sum( Expr.mulElm(alpha,x) )
or even
Expr.dot( alpha, x )
and so on. You never explicitly specify the summation index, you are summing all entries of whatever appears inside the Expr.sum and similar methods.
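The point about never naming the summation index can be illustrated outside Mosek. The sketch below is plain NumPy, not Fusion code: d, y, and the size n are made-up stand-ins for the distance matrix and a two-index 0/1 array, and it only shows that a double sum over i and k equals "multiply elementwise, then sum every entry".

```python
import numpy as np

n = 3                                      # hypothetical number of servers
rng = np.random.default_rng(0)
d = rng.uniform(1.0, 10.0, size=(n, n))    # stand-in for distances d[i, k]
y = rng.integers(0, 2, size=(n, n))        # stand-in for 0/1 values y[i, k]
demand = 1                                 # same demand for all i and k, as in the question

# Double sum written with explicit indices i and k.
explicit = sum(demand * d[i, k] * y[i, k] for i in range(n) for k in range(n))

# Same quantity without naming the indices: elementwise product, then sum all entries.
index_free = demand * np.sum(d * y)

print(explicit, index_free)
```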

Incorrect results for simple 2D transformation

I'm attempting a 2D transformation using the nudged package.
The code is really simple:
import nudged
import numpy as np
# Domain data
x_d = [2538.87, 1294.42, 3002.49, 2591.56, 2881.37, 891.906, 1041.24, 2740.13, 1928.55, 3335.12, 3771.76, 1655.0, 696.772, 583.242, 2313.95, 2422.2]
y_d = [2501.89, 4072.37, 2732.65, 2897.21, 808.969, 1760.97, 992.531, 1647.57, 2407.18, 2868.68, 724.832, 1938.11, 1487.66, 1219.14, 672.898, 145.059]
# Range data
x_r = [3.86551776277075, 3.69693290266126, 3.929110096606081, 3.8731112887391532, 3.9115924127798536, 3.6388068074815862, 3.6590261077461577, 3.892482104449016, 3.781816183438835, 3.97464058821231, 4.033173444601999, 3.743901522907265, 3.6117470568340906, 3.5959585708147728, 3.8338853650390945, 3.8487836817639334]
y_r = [1.6816478101135388, 1.8732008327428353, 1.7089144628920678, 1.729386055302033, 1.4767657611559102, 1.5933812675900505, 1.5003232598807479, 1.5781629182153942, 1.670867507106891, 1.7248363641300841, 1.4654588884234485, 1.6143557610354264, 1.5603626129237362, 1.5278835570641824, 1.4609066190929916, 1.397111300807424]
# Random domain data
x, y = np.random.uniform(0., 4000., (2, 1000))
# Define domain and range points
dom, ran = (x_d, y_d), (x_r, y_r)
# Obtain transformation dom --> ran
trans = nudged.estimate(dom, ran)
# Apply the transformation to the (x, y) points
x_t, y_t = trans.transform((x, y))
where (x_d, y_d) and (x_r, y_r) are the 1 to 1 correlated "domain" and "range" points, and (x, y) are all the points in the (x_d, y_d) (domain) system that I want to transform to the (x_r, y_r) (range) system.
This is the result I get:
where:
trans.get_matrix()
[[-0.0006459232439068067, -0.0007947429558548157, 6.534164085946009], [0.0007947429558548157, -0.0006459232439068067, 2.515279819707991], [0, 0, 1]]
trans.get_rotation()
2.2532603497070713
trans.get_scale()
0.0010241255796531702
trans.get_translation()
[6.534164085946009, 2.515279819707991]
These are the final transformed dom values with the original ran points overlaid:
This is clearly not right and I can't figure out what I'm doing wrong.
I was able to figure out your issue. It is simply that nudged has somewhat problematic notation, which is poorly documented.
The estimate function accepts a list of coordinate pairs. You effectively have to transpose dom and ran to get this to work. I suggest either switching to numpy arrays, or using list(map(list, zip(...))) to do the transpose.
The Transform.transform method is extremely restrictive, and requires that the inner pairs be of type list. Not tuple, not any other sequence, but specifically list. Your attempt to call trans.transform((x, y)) only happened to work by pure luck. transform determined that the first element is not a list, and attempted to transform (x, y) as a single pair of numbers. Luckily for you, numpy operators are vectorized, so you can process an entire array as a single unit.
Here is a working version of your code that generates the correct plots using mostly python:
import numpy as np
import matplotlib.pyplot as plt
from nudged import estimate
# Domain data
x_d = [2538.87, 1294.42, 3002.49, 2591.56, 2881.37, 891.906, 1041.24, 2740.13, 1928.55, 3335.12, 3771.76, 1655.0, 696.772, 583.242, 2313.95, 2422.2]
y_d = [2501.89, 4072.37, 2732.65, 2897.21, 808.969, 1760.97, 992.531, 1647.57, 2407.18, 2868.68, 724.832, 1938.11, 1487.66, 1219.14, 672.898, 145.059]
# Range data
x_r = [3.86551776277075, 3.69693290266126, 3.929110096606081, 3.8731112887391532, 3.9115924127798536, 3.6388068074815862, 3.6590261077461577, 3.892482104449016, 3.781816183438835, 3.97464058821231, 4.033173444601999, 3.743901522907265, 3.6117470568340906, 3.5959585708147728, 3.8338853650390945, 3.8487836817639334]
y_r = [1.6816478101135388, 1.8732008327428353, 1.7089144628920678, 1.729386055302033, 1.4767657611559102, 1.5933812675900505, 1.5003232598807479, 1.5781629182153942, 1.670867507106891, 1.7248363641300841, 1.4654588884234485, 1.6143557610354264, 1.5603626129237362, 1.5278835570641824, 1.4609066190929916, 1.397111300807424]
# Random domain data
uni = np.random.uniform(0., 4000., (2, 1000))
# Define domain and range points
dom = list(map(list, zip(x_d, y_d)))
ran = list(map(list, zip(x_r, y_r)))
# Obtain transformation dom --> ran
trans = estimate(dom, ran)
# Apply the transformation to the (x, y) points
tra = trans.transform(uni)
fig, ax = plt.subplots(2, 2)
ax[0][0].scatter(x_d, y_d)
ax[0][0].set_title('dom')
ax[0][1].scatter(x_r, y_r)
ax[0][1].set_title('ran')
ax[1][0].scatter(*uni)
ax[1][1].scatter(*tra)
I left in your hack with uni, since I did not feel like converting the array of random values to a nested list. The resulting plot looks like this:
My overall recommendation is to submit a number of bug reports to the nudged library based on these findings.
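The transposition step described above can be tried on its own, without nudged installed. This sketch uses only the first three domain points from the question and shows the list-of-lists shape that results. The numpy variant is one possible alternative, not something the library requires.

```python
import numpy as np

# First three domain points from the question (shortened for illustration).
x_d = [2538.87, 1294.42, 3002.49]
y_d = [2501.89, 4072.37, 2732.65]

# (x_d, y_d) is two long rows; zip turns it into one [x, y] pair per point.
dom = list(map(list, zip(x_d, y_d)))
print(dom)  # [[2538.87, 2501.89], [1294.42, 4072.37], [3002.49, 2732.65]]

# The same transpose via numpy, converted back to nested plain lists.
dom_np = np.column_stack((x_d, y_d)).tolist()
```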

python (sympy) implicit function: get values instead of plot?

I am new to sympy but I already get a nice output when I plot the implicit function (actually the formula for Cassini's ovals) using sympy:
from sympy import plot_implicit, symbols, Eq, solve
x, y = symbols('x y')
k=2.7
a=3
eq = Eq((x**2 + y**2)**2-2*a**2*(x**2-y**2), k**4-a**4)
plot_implicit(eq)
Now is it actually possible to somehow get the x and y values corresponding to the plot? or alternatively solve the implicit equation without plotting at all?
thanks! :-)
This is an answer addressing your
is it actually possible to somehow get the x and y values corresponding to the plot?
and I say "addressing" because it's not possible to get the x and y values used to draw the curves, since the curves are not drawn using a sequence of 2D points. More on this later.
TL;DR
pli = plot_implicit(...)
series = pli[0]
data, action = series.get_points()
data = np.array([(x_int.mid, y_int.mid) for x_int, y_int in data])
Let's start with your code
from sympy import plot_implicit, symbols, Eq, solve
x, y = symbols('x y')
k=2.7
a=3
eq = Eq((x**2 + y**2)**2-2*a**2*(x**2-y**2), k**4-a**4)
and plot it, with a twist: we save the Plot object and print it
pli = plot_implicit(eq)
print(pli)
to get
Plot object containing:
[0]: Implicit equation: Eq(-18*x**2 + 18*y**2 + (x**2 + y**2)**2, -27.8559000000000) for x over (-5.0, 5.0) and y over (-5.0, 5.0)
We are interested in this object indexed by 0,
ob = pli[0]
print(dir(ob))
that gives (ellipses are mine)
['__class__', …, 'get_points', …, 'var_y']
The name get_points sounds full of promise, doesn't it?
print(ob.get_points())
that gives (edited for clarity and with a big cut)
([
[interval(-3.759774, -3.750008), interval(-0.791016, -0.781250)],
[interval(-3.876961, -3.867195), interval(-0.634768, -0.625003)],
[interval(-3.837898, -3.828133), interval(-0.693361, -0.683596)],
[interval(-3.847664, -3.837898), interval(-0.673830, -0.664065)],
...
[interval(3.837895, 3.847661), interval(0.664064, 0.673830)],
[interval(3.828130, 3.837895), interval(0.683596, 0.693362)],
[interval(3.867192, 3.876958), interval(0.625001, 0.634766)],
[interval(3.750005, 3.759770), interval(0.781255, 0.791021)]
], 'fill')
What is this? The documentation of plot_implicit says
plot_implicit, by default, uses interval arithmetic to plot functions.
Following the source code of plot_implicit.py and plot.py one realizes that, in this case, the actual plotting (speaking of the matplotlib backend) is just a line of code
self.ax.fill(x, y, facecolor=s.line_color, edgecolor='None')
where x and y are constructed from the list of intervals, as returned from .get_points(), as follows
x, y = [], []
for intervals in interval_list:
    intervalx = intervals[0]
    intervaly = intervals[1]
    x.extend([intervalx.start, intervalx.start,
              intervalx.end, intervalx.end, None])
    y.extend([intervaly.start, intervaly.end,
              intervaly.end, intervaly.start, None])
so that for each couple of intervals matplotlib is directed to draw a filled rectangle, small enough that the eye sees a continuous line (note the use of None to have disjoint rectangles).
We can conclude that the list of couples of intervals
l_xy_intervals = ((pli[0]).get_points())[0]
represents rectangular areas where the implicit expression you are plotting is
"true enough"
You can do this, even with interval math, by taking the midpoint of each interval. Starting from your code and changing it slightly to save the plot_implicit object in a variable called g, we have:
from sympy import plot_implicit, symbols, Eq, solve
x, y = symbols('x y')
k=2.7
a=3
eq = Eq((x**2 + y**2)**2-2*a**2*(x**2-y**2), k**4-a**4)
g = plot_implicit(eq)
Now let's save in a variable named ptos the intervals that were used to draw the plot.
ptos = g[0].get_points()[0]
This way ptos[0][0] will be the first interval in the x axis and ptos[0][1] will be its pair in the y axis. The intervals have a property called mid which gives the middle point of the interval. So you can suppose that ptos[0][0].mid, ptos[0][1].mid will be a pair x,y "true enough" to be one of our numerical solutions.
This way, a data frame composed of these midpoint pairs can be generated with:
import numpy as np
import pandas as pd
intervs = np.array(ptos, dtype='object')
meio = lambda x0: x0.mid
px = list(map(meio, intervs[:,0]))
py = list(map(meio, intervs[:,1]))
dados = pd.DataFrame({'x': px, 'y': py})
dados.head()
Which in this example would give us:
x y
0 -1.177733 0.598826
1 -1.175389 0.596483
2 -1.175389 0.598826
3 -1.173045 0.596483
4 -1.173045 0.598826
This idea of getting the intervals middle points can be used whenever one needs to move from "interval math" to "standard" point level math. Hope this helps. Regards.
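On the second half of the question (solving without plotting at all): for this particular equation there is a route that skips sympy's plotting entirely. This is my own sketch, not taken from the answers above. Treating the equation as a quadratic in y**2 and keeping the non-negative root gives y**2 = sqrt(4*a**2*x**2 + k**4) - x**2 - a**2. The grid of 200 x values is an arbitrary choice.

```python
import numpy as np

k, a = 2.7, 3.0   # constants from the question

# For k < a the curve exists only where a**2 - k**2 <= x**2 <= a**2 + k**2.
x_min = np.sqrt(a**2 - k**2)
x_max = np.sqrt(a**2 + k**2)
x = np.linspace(x_min, x_max, 200)

# Non-negative root of the quadratic in y**2.
y_sq = np.sqrt(4 * a**2 * x**2 + k**4) - x**2 - a**2
y = np.sqrt(np.clip(y_sq, 0.0, None))   # clip guards against tiny negative round-off

# (x, y) is the upper half of the right-hand oval; mirror in x and y for the rest.
# Check: plug the points back into the original implicit equation.
residual = (x**2 + y**2)**2 - 2 * a**2 * (x**2 - y**2) - (k**4 - a**4)
print(np.max(np.abs(residual)))
```

The printed residual should be at round-off level, which confirms that the points satisfy the original equation. This only works because the equation happens to be solvable for y. For a general implicit equation, the interval midpoints remain the fallback.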

How to unravel data interpolated with griddata

I have interpolated a function on a grid with scipy.interpolate.griddata like so
interpolated_quantity = scipy.interpolate.griddata(old_points, old_array, (grid_x, grid_y, grid_z), method='nearest')
What I would like to do is convert this into a set of four 1-D arrays: three with the position of each cell and one with the corresponding value of the interpolated quantity in each cell.
So far I'm using a very slow operation:
arrays = {"x": [], "y": [], "z": [], "prop": []}
base_gridx = linspace(xmin,xmax,abs(ngridx)+1)
base_gridy = linspace(ymin,ymax,abs(ngridy)+1)
base_gridz = linspace(zmin,zmax,abs(ngridz)+1)
# cell centres are the midpoints between consecutive grid edges
cx = (base_gridx[1:]+base_gridx[:-1])/2.
cy = (base_gridy[1:]+base_gridy[:-1])/2.
cz = (base_gridz[1:]+base_gridz[:-1])/2.
data_len = len(cx)*len(cy)*len(cz)
for ii in arange(0,len(cx)):
    for jj in arange(0,len(cy)):
        for kk in arange(0,len(cz)):
            arrays["x"].append(cx[ii])
            arrays["y"].append(cy[jj])
            arrays["z"].append(cz[kk])
            arrays["prop"].append(interpolated_quantity[ii][jj][kk])
This works, but it just takes a huge amount of time. Do you think there might be a faster way to do this? Maybe using ravel?
It is as simple as you suggest. The four arrays are:
grid_x.ravel()
grid_y.ravel()
grid_z.ravel()
interpolated_quantity.ravel()
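A small check that ravel reproduces the ordering of the triple loop. This sketch assumes the grids were built with np.meshgrid(..., indexing="ij") (or np.mgrid), so that the first array axis follows x. The cell centres and the values array are made-up stand-ins for the real data.

```python
import numpy as np

# Hypothetical cell centres, standing in for cx, cy, cz from the question.
cx = np.array([0.5, 1.5])
cy = np.array([10.0, 20.0, 30.0])
cz = np.array([100.0, 200.0])

# indexing="ij" makes axis 0 follow x, axis 1 follow y, axis 2 follow z.
grid_x, grid_y, grid_z = np.meshgrid(cx, cy, cz, indexing="ij")

# Stand-in for interpolated_quantity: one distinct value per cell.
values = np.arange(grid_x.size, dtype=float).reshape(grid_x.shape)

# Flattened version: four 1-D arrays, no Python loops.
flat_rows = list(zip(grid_x.ravel(), grid_y.ravel(), grid_z.ravel(), values.ravel()))

# Loop version from the question, for comparison.
loop_rows = [(cx[i], cy[j], cz[k], values[i, j, k])
             for i in range(len(cx))
             for j in range(len(cy))
             for k in range(len(cz))]

print(flat_rows == loop_rows)  # True
```

If the grids came from np.meshgrid with its default indexing="xy", the first two axes are swapped and the ravel order would not match the loop order.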

Accessing coordinates within multidimensional arrays - python

I am looking to put the x and y values of the coordinate grid into their own separate arrays in order to perform functions such as Pythagoras etc.
Here's my code below.
x1d = np.linspace(-xlen,xlen,res)
y1d = np.linspace(-ylen,ylen,res)
from itertools import product
coordinates = list(product(x1d, y1d))
xcoord = coordinates[:][:][0]
print np.shape(coordinates), np.shape(xcoord), coordinates
I get that the code above will give me
coordinates = [[x1,y1],[x2,y2],...,[xn,yn]].
How would one go about extracting the following arrays?
xcoord = [x1,x2,...,xn]
ycoord = [y1,y2,...,yn]
Is this the right solution for generating a 2D grid of points where I can perform functions upon each individual x,y point, assigning a resultant value to that point?
Thanks!
You could also unpack the pairs with zip (use itertools.izip in Python 2; it no longer exists in Python 3):
x, y = zip(*coordinates)
# x=(x1,x2,...,xn)
# y=(y1,y2,...,yn)
In regards to the grid, have a look at numpy's meshgrid which could be useful for you. You can use it like so (taken from the example on the linked website):
x=np.arange(-5,5,.1)
y=np.arange(-5,5,.1)
xx,yy=np.meshgrid(x,y,sparse=True)
z = np.sin(xx**2 + yy**2) / (xx**2 + yy**2)
h = plt.contourf(x,y,z)
Treating it as a normal list, you can use a list comprehension:
xcord, ycord = [e[0] for e in coordinates], [e[1] for e in coordinates]
Hope this helps!
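For the Pythagoras use case in the question, a meshgrid-based sketch might look like the following. The values of xlen, ylen, and res are placeholders, since the question does not give them.

```python
import numpy as np

# Placeholder values; the question does not state xlen, ylen, or res.
xlen, ylen, res = 2.0, 1.0, 5

x1d = np.linspace(-xlen, xlen, res)
y1d = np.linspace(-ylen, ylen, res)

# xx[i, j] and yy[i, j] hold the x and y coordinate of grid point (i, j).
xx, yy = np.meshgrid(x1d, y1d)

# Distance of every grid point from the origin, computed in one call.
r = np.hypot(xx, yy)

# Flat 1-D coordinate arrays, if a list of points is still needed.
xcoord, ycoord = xx.ravel(), yy.ravel()

print(r.shape, xcoord.shape)  # (5, 5) (25,)
```

Because r has the same shape as xx and yy, each result stays attached to its grid point, which is what the "assign a resultant value to that point" part of the question asks for.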
