Using awkward-array with zip/unzip with two different physics objects - python

I'm trying to reproduce parts of the Higgs discovery in the Higgs --> 4 leptons channel with open data, making use of awkward-array. I can do it when the leptons are the same (e.g. 4 muons) with zip/unzip, but is there a way to do it in the 2-muon/2-electron channel? I started with the example in the HSF tutorial
https://hsf-training.github.io/hsf-training-scikit-hep-webpage/04-awkward/index.html
So I now have the following. First get the input file
curl http://opendata.cern.ch/record/12361/files/SMHiggsToZZTo4L.root --output SMHiggsToZZTo4L.root
Then I do the following
import numpy as np
import matplotlib.pylab as plt
import uproot
import awkward as ak

# Helper functions
def energy(m, px, py, pz):
    E = np.sqrt((m**2) + (px**2 + py**2 + pz**2))
    return E

def invmass(E, px, py, pz):
    m2 = (E**2) - (px**2 + py**2 + pz**2)
    if m2 < 0:
        m = -np.sqrt(-m2)
    else:
        m = np.sqrt(m2)
    return m

def convert(pt, eta, phi):
    px = pt * np.cos(phi)
    py = pt * np.sin(phi)
    pz = pt * np.sinh(eta)
    return px, py, pz
####
# Read in the file
infile_name = 'SMHiggsToZZTo4L.root'
infile = uproot.open(infile_name)
# Convert to Cartesian
muon_pt = infile['Events/Muon_pt'].array()
muon_eta = infile['Events/Muon_eta'].array()
muon_phi = infile['Events/Muon_phi'].array()
muon_q = infile['Events/Muon_charge'].array()
muon_mass = infile['Events/Muon_mass'].array()
muon_px, muon_py, muon_pz = convert(muon_pt, muon_eta, muon_phi)
muon_energy = energy(muon_mass, muon_px, muon_py, muon_pz)
# Do the magic
nevents = len(infile['Events/Muon_pt'].array())
# nevents = 1000 # For testing
max_entries = nevents
muons = ak.zip({
    "px": muon_px[0:max_entries],
    "py": muon_py[0:max_entries],
    "pz": muon_pz[0:max_entries],
    "e": muon_energy[0:max_entries],
    "q": muon_q[0:max_entries],
})
quads = ak.combinations(muons, 4)
mu1, mu2, mu3, mu4 = ak.unzip(quads)
mass_try = (mu1.e + mu2.e + mu3.e + mu4.e)**2 - ((mu1.px + mu2.px + mu3.px + mu4.px)**2 + (mu1.py + mu2.py + mu3.py + mu4.py)**2 + (mu1.pz + mu2.pz + mu3.pz + mu4.pz)**2)
mass_try = np.sqrt(mass_try)
qtot = mu1.q + mu2.q + mu3.q + mu4.q
plt.hist(ak.flatten(mass_try[qtot==0]), bins=100,range=(0,300));
And the histogram looks good!
So how would I do this for 2-electron + 2-muon combinations? I would guess there's a way to make lepton_xxx arrays? But I'm not sure how to do this elegantly (quickly) such that I could also create a flag to keep track of what the lepton combinations are?
Thanks!
Matt

This could be answered in a variety of ways:
1. make a union array (mixed data types) of electrons and muons
2. make an array of electrons and muons that are the same type, but have a flag to indicate flavor (electron vs muon)
3. use ak.combinations with n=2 for the muons, ak.combinations with n=2 again for the electrons, and then combine them with ak.cartesian (and deal with tuples of tuples, rather than one level of tuples, which would mean two calls to ak.unzip)
4. break the electron and muon collections down into single-charge collections. You'll want exactly one positive muon, one negative muon, one positive electron, and one negative electron, so that would be an ak.cartesian of the four collections.
I'll go with the last method because I've decided that it's easiest.
Another thing you probably want to know about is the Vector library. After
import vector
vector.register_awkward()
you don't have to do explicit coordinate transformations or mass calculations. I'll be using that. Here's how I read in the data:
infile = uproot.open("/tmp/SMHiggsToZZTo4L.root")
muon_branch_arrays = infile["Events"].arrays(filter_name="Muon_*")
electron_branch_arrays = infile["Events"].arrays(filter_name="Electron_*")
muons = ak.zip({
    "pt": muon_branch_arrays["Muon_pt"],
    "phi": muon_branch_arrays["Muon_phi"],
    "eta": muon_branch_arrays["Muon_eta"],
    "mass": muon_branch_arrays["Muon_mass"],
    "charge": muon_branch_arrays["Muon_charge"],
}, with_name="Momentum4D")
electrons = ak.zip({
    "pt": electron_branch_arrays["Electron_pt"],
    "phi": electron_branch_arrays["Electron_phi"],
    "eta": electron_branch_arrays["Electron_eta"],
    "mass": electron_branch_arrays["Electron_mass"],
    "charge": electron_branch_arrays["Electron_charge"],
}, with_name="Momentum4D")
And this reproduces your plot:
quads = ak.combinations(muons, 4)
quad_charge = quads["0"].charge + quads["1"].charge + quads["2"].charge + quads["3"].charge
mu1, mu2, mu3, mu4 = ak.unzip(quads[quad_charge == 0])
plt.hist(ak.flatten((mu1 + mu2 + mu3 + mu4).mass), bins=100, range=(0, 200));
The quoted number slices (e.g. "0" and "1") are picking tuple fields, rather than array entries; it's a manual ak.unzip. (The fields could have had real names if I had given a fields argument to ak.combinations.)
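As a side note, here is a quick sketch of the same selection with named tuple fields (the field names are my own choice, not something the file provides):
quads = ak.combinations(muons, 4, fields=["mu1", "mu2", "mu3", "mu4"])
quad_charge = quads.mu1.charge + quads.mu2.charge + quads.mu3.charge + quads.mu4.charge
mu1, mu2, mu3, mu4 = ak.unzip(quads[quad_charge == 0])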
For the two muons, two electrons case, let's make four distinct collections.
muons_plus = muons[muons.charge > 0]
muons_minus = muons[muons.charge < 0]
electrons_plus = electrons[electrons.charge > 0]
electrons_minus = electrons[electrons.charge < 0]
The ak.combinations function (with default axis=1) returns the Cartesian product of each list in the array of lists with itself, excluding pairings of an item with itself, (x, x) (unless you ask for them with replacement=True), and excluding one of each symmetric pair (x, y)/(y, x).
If you want just a plain Cartesian product of lists from one array with lists from another array, that's ak.cartesian. The four collections muons_plus, muons_minus, electrons_plus, electrons_minus are non-overlapping and we want each four-lepton group to have exactly one from each, so that's a plain Cartesian product.
mu1, mu2, e1, e2 = ak.unzip(ak.cartesian([muons_plus, muons_minus, electrons_plus, electrons_minus]))
plt.hist(ak.flatten((mu1 + mu2 + e1 + e2).mass), bins=100, range=(0, 200));
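To see the difference between the two functions on something small, here is a toy sketch (hand-made arrays, unrelated to the physics arrays above):
a = ak.Array([[1, 2, 3]])
b = ak.Array([["x", "y"]])
print(ak.combinations(a, 2).tolist())  # [[(1, 2), (1, 3), (2, 3)]] -- no (x, x) and no symmetric duplicates
print(ak.cartesian([a, b]).tolist())   # [[(1, 'x'), (1, 'y'), (2, 'x'), (2, 'y'), (3, 'x'), (3, 'y')]]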
Separating particles by flavor (electron vs muon) but not by charge is an artifact of the typing constraints imposed by C++. Electrons are measured in different ways from muons, so they have different attributes in the dataset. In a statically typed language like C++, we couldn't put them in the same collection because they differ by type (have different attributes). But charges only differ by value (an integer), so there was no reason they had to be in different collections.
But now, the only thing distinguishing a type is (a) what attributes the objects have and (b) what objects with that set of attributes are named. Here, I named them "Momentum4D" because that allowed Vector to recognize them as Lorentz vectors and give them Lorentz vector methods. But they could have had other attributes (e.g. e_over_p for electrons, but not for muons). charge is a non-Momentum4D attribute.
So if we're going to the trouble to put different-flavor leptons in different arrays, why not put different-charge leptons in different arrays, too? Physically, flavor and charge are both particle properties at the same level of distinction, so we probably want to do the analysis that way. No reason to let a C++ type distinction get in the way of the analysis logic!
Also, you probably want
nevents = infile["Events"].num_entries
to get the number of entries from a TTree without reading the array data from it.

Related

Pyomo accessing/retrieving dual variables - shadow price with binary variables

I am pretty new to optimization in general and pyomo in particular, so I apologize in advance for any rookie mistakes.
I have defined a simple unit commitment exercise (example 3.1 from [1]) using [2] as starting point. I got the correct result and my code runs, but I have a few questions regarding how to access stuff.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import shutil
import sys
import os.path
import pyomo.environ as pyo
import pyomo.gdp as gdp  # necessary if you use booleans to select active and inactive units

def bounds_rule(m, n, param='Cap_MW'):
    # m because it passes the model
    # n because it needs a variable from each set; in this case there was only m.N
    return (0, Gen[n][param])  # returns lower and upper bounds

def unit_commitment():
    m = pyo.ConcreteModel()
    m.dual = pyo.Suffix(direction=pyo.Suffix.IMPORT_EXPORT)
    N = Gen.keys()
    m.N = pyo.Set(initialize=N)
    m.Pgen = pyo.Var(m.N, bounds=bounds_rule)  # amount of generation
    m.Rgen = pyo.Var(m.N, bounds=bounds_rule)  # amount of reserve
    # m.OnOff = pyo.Var(m.N, domain=pyo.Binary)  # boolean on/off marker
    # objective
    m.cost = pyo.Objective(expr=sum(m.Pgen[n]*Gen[n]['energy_$MWh'] + m.Rgen[n]*Gen[n]['res_$MW'] for n in m.N), sense=pyo.minimize)
    # demand
    m.demandP = pyo.Constraint(rule=lambda m: sum(m.Pgen[n] for n in N) == Demand['ener_MWh'])
    m.demandR = pyo.Constraint(rule=lambda m: sum(m.Rgen[n] for n in N) == Demand['res_MW'])
    # machine production limits
    # m.lb = pyo.Constraint(m.N, rule=lambda m, n: Gen[n]['Cap_min']*m.OnOff[n] <= m.Pgen[n]+m.Rgen[n])
    # m.ub = pyo.Constraint(m.N, rule=lambda m, n: Gen[n]['Cap_MW']*m.OnOff[n] >= m.Pgen[n]+m.Rgen[n])
    m.lb = pyo.Constraint(m.N, rule=lambda m, n: Gen[n]['Cap_min'] <= m.Pgen[n]+m.Rgen[n])
    m.ub = pyo.Constraint(m.N, rule=lambda m, n: Gen[n]['Cap_MW'] >= m.Pgen[n]+m.Rgen[n])
    m.rc = pyo.Suffix(direction=pyo.Suffix.IMPORT)
    return m
Gen = {
    'GenA': {'Cap_MW': 100, 'energy_$MWh': 10, 'res_$MW': 0, 'Cap_min': 0},
    'GenB': {'Cap_MW': 100, 'energy_$MWh': 30, 'res_$MW': 25, 'Cap_min': 0},
}  # starting data
Demand = {
    'ener_MWh': 130, 'res_MW': 20,
}  # starting data
m = unit_commitment()
pyo.SolverFactory('glpk').solve(m).write()
m.display()
df = pd.DataFrame.from_dict([m.Pgen.extract_values(), m.Rgen.extract_values()]).T.rename(columns={0: "P", 1: "R"})
print(df)
print("Cost Function result: " + str(m.cost.expr()) + "$.")
print(m.rc.display())
print(m.dual.display())
print(m.dual[m.demandR])
da = {'duals': m.dual[m.demandP],
      'uslack': m.demandP.uslack(),
      'lslack': m.demandP.lslack(),
      }
db = {'duals': m.dual[m.demandR],
      'uslack': m.demandR.uslack(),
      'lslack': m.demandR.lslack(),
      }
duals = pd.DataFrame.from_dict([da, db]).T.rename(columns={0: "demandP", 1: "demandR"})
print(duals)
Here come my questions.
1 - Duals/shadow price: By definition, the shadow prices are the dual variables of the constraints (m.demandP and m.demandR). Is there a way to access these values and put them into a dataframe without the clunky thing I did, i.e. manually defining da and db and then building the dataframe from the two joined dictionaries? I would like something cleaner, like the df that holds the P and R results for each generator in the system.
2 - Usually in the unit commitment problem, binary variables are used to "mark" or "select" active and inactive units, hence the m.OnOff variable (commented line). From what I found in [3], duals don't exist in models containing binary variables, so I rewrote the problem without binaries. That is not a problem in this simplistic exercise in which all units run, but it will be for larger ones: I need the optimization to decide which units will and won't run, and I still need the shadow price. Is there a way to obtain the shadow price/duals in a problem containing binary variables?
I left the constraint definitions based on binary variables in the code in case someone finds them useful.
Note: The code also runs with the binary variables and gets the correct result; however, I couldn't figure out how to get the shadow price. Hence my question.
[1] Morales, J. M., Conejo, A. J., Madsen, H., Pinson, P., & Zugno, M. (2013). Integrating renewables in electricity markets: operational problems (Vol. 205). Springer Science & Business Media.
[2] https://jckantor.github.io/ND-Pyomo-Cookbook/04.06-Unit-Commitment.html
[3] Dual Variable Returns Nothing in Pyomo
To answer 1, you can dynamically get the constraint objects from your model using model.component_objects(pyo.Constraint), which returns an iterator over your constraints and keeps you from having to hard-code the constraint names. It gets tricky for indexed constraints because you have to do an extra step to get the slacks for each index, not just the constraint object. For the duals, you can iterate over the keys attribute to retrieve those values.
duals_dict = {str(key):m.dual[key] for key in m.dual.keys()}
u_slack_dict = {
# uslacks for non-indexed constraints
**{str(con):con.uslack() for con in m.component_objects(pyo.Constraint)
if not con.is_indexed()},
# indexed constraint uslack
# loop through the indexed constraints
# get all the indices then retrieve the slacks for each index of constraint
**{k:v for con in m.component_objects(pyo.Constraint) if con.is_indexed()
for k,v in {'{}[{}]'.format(str(con),key):con[key].uslack()
for key in con.keys()}.items()}
}
l_slack_dict = {
# lslacks for non-indexed constraints
**{str(con):con.lslack() for con in m.component_objects(pyo.Constraint)
if not con.is_indexed()},
# indexed constraint lslack
# loop through the indexed constraints
# get all the indices then retrieve the slacks for each index of constraint
**{k:v for con in m.component_objects(pyo.Constraint) if con.is_indexed()
for k,v in {'{}[{}]'.format(str(con),key):con[key].lslack()
for key in con.keys()}.items()}
}
# combine into a single df
df = pd.concat([pd.Series(d,name=name)
for name,d in {'duals':duals_dict,
'uslack':u_slack_dict,
'lslack':l_slack_dict}.items()],
axis='columns')
Regarding 2, I agree with Erwin's comment about solving with the binary variables to get the optimal solution, then removing the binary restriction but fixing the variables to their optimal values to get some dual variable values.
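A rough sketch of that fix-and-resolve idea against the model from the question (this assumes the commented-out m.OnOff variable and its constraints are re-enabled; changing each variable's domain after fixing it is one common way to drop the integrality):
solver = pyo.SolverFactory('glpk')
solver.solve(m)                            # first solve: MIP with m.OnOff binary

for n in m.N:
    m.OnOff[n].fix(pyo.value(m.OnOff[n]))  # freeze the commitment decisions at their optimal values
    m.OnOff[n].domain = pyo.Reals          # drop the binary restriction

solver.solve(m)                            # second solve: a pure LP, so m.dual is now populated
print(m.dual[m.demandP], m.dual[m.demandR])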

I'm having trouble with my program running too long. I'm not sure if it's running infinitely or if it's just really slow

I'm writing code to analyze a column vector of shape (8477960, 1). I am not sure if the while loops in my code are running infinitely, or if the way I've written things is just really slow.
This is a section of my code up to the first while loop, which I cannot get to run to completion.
import numpy as np
import pandas as pd

data = pd.read_csv(r'C:\Users\willo\Desktop\TF_60nm_2_2.txt')

def recursive_low_pass(rawsignal, startcoeff, endcoeff, filtercoeff):
    # The current signal length
    ni = len(rawsignal)  # signal size
    rougheventlocations = np.zeros(shape=(100000, 3))
    # The algorithm parameters
    # filter coefficient
    a = filtercoeff
    raw = np.array(rawsignal).astype(float)
    # thresholds
    s = startcoeff
    e = endcoeff  # for event start and end thresholds
    # The recursive algorithm
    # loop init
    ml = np.zeros(ni)
    vl = np.zeros(ni)
    s = np.zeros(ni)
    ml[0] = np.mean(raw)  # local mean init
    vl[0] = np.var(raw)   # local variance init
    i = 0                 # sample counter
    numberofevents = 0    # number of detected events
    # main loop
    while i < (ni - 1):
        i = i + 1
        # local mean low pass filtering
        ml[i] = a * ml[i - 1] + (1 - a) * raw[i]
        # local variance low pass filtering
        vl[i] = a * vl[i - 1] + (1 - a) * np.power([raw[i] - ml[i]], 2)
        # local threshold to detect event start
        sl = ml[i] - s * np.sqrt(vl[i])
I'm not getting any error messages, but I've let the program run for more than 10 minutes without any results, so I assume I'm doing something incorrectly.
You should try to vectorize this process rather than accessing/processing indexes (otherwise why use numpy).
The other thing is that you seem to be doing unnecessary work (unless we're not seeing the whole function).
the line:
sl = ml[i] - s * np.sqrt(vl[i])
assigns the variable sl which you're not using inside the loop (or anywhere else). This assignment performs a whole vector multiplication by s which is all zeroes. If you do need the sl variable, you should calculate it outside of the loop using the last encountered values of ml[i] and vl[i] which you can store in temporary variables instead of computing on every loop.
If ni is in the millions, this unnecessary vector multiplication (of millions of zeros) is going to be very costly.
You probably didn't mean to overwrite s = startcoeff with s = np.zeros(ni) in the first place.
In order to vectorize these calculations you will need an accumulate with a customized function (e.g. via np.frompyfunc(...).accumulate). The non-numpy equivalent would be as follows (using itertools instead):
The non-numpy equivalent would be as follows (using itertools instead):
from itertools import accumulate
ml = [np.mean(raw)]+[0]*(ni-1)
mlSums = accumulate(zip(ml,raw),lambda r,d:(a*r[0] + (1-a)*d[1],0))
ml = [v for v,_ in mlSums]
vl = [np.var(raw)]+[0]*(ni-1)
vlSums = accumulate(zip(vl,raw,ml),lambda r,d:(a*r[0] + (1-a)*(d[1]-d[2])**2,0,0))
vl = [v for v,_,_ in vlSums]
In each case, the ml / vl vectors are initialized with the base value at index zero and the rest filled with zeroes.
The accumulate(zip(... function calls go through the array and call the lambda function with the current sum in r and the paired elements in d. For the ml calculation, this corresponds to r = (ml[i-1],_) and d = (0,raw[i]).
Because accumulate outputs the same data type as it is given as input (which are zipped tuples), the actual result is only the first value of the tuples in the mlSums/vlSums lists.
This took 9.7 seconds to process for 8,477,960 items in the lists.
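As a further aside (my own sketch, not part of the original answer): the recursion ml[i] = a*ml[i-1] + (1-a)*raw[i] is a first-order IIR filter, so if SciPy is available it can be computed in a single call to scipy.signal.lfilter, and the variance recursion can then be applied the same way to (raw - ml)**2.
import numpy as np
from scipy.signal import lfilter

def low_pass(x, a, y0):
    # y[i] = a*y[i-1] + (1-a)*x[i], with y[0] forced to the seed value y0
    zi = np.array([y0 - (1 - a) * x[0]])  # initial filter state so that y[0] == y0
    y, _ = lfilter([1 - a], [1.0, -a], x, zi=zi)
    return y

# ml = low_pass(raw, a, np.mean(raw))
# vl = low_pass((raw - ml) ** 2, a, np.var(raw))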

Vectorizing a monte carlo simulation in python

I've recently been working on some code in Python to simulate a 2-dimensional U(1) gauge theory using Monte Carlo methods. Essentially I have an n by n by 2 array (call it Link) of unitary complex numbers (their magnitude is one). I randomly select an element of my Link array and propose a random change to the number at that site. I then compute the resulting change in the action that would occur due to that change. I then accept the change with a probability equal to min(1, exp(-dS)), where dS is the change in the action. The code for the iterator is as follows:
def iteration(j1, B0):
    global Link
    Staple = np.zeros((2), dtype=complex)
    for i0 in range(0, j1):
        x1 = np.random.randint(0, n)
        y1 = np.random.randint(0, n)
        u1 = np.random.randint(0, 1)
        Linkrxp1 = np.roll(Link, -1, axis=0)
        Linkrxn1 = np.roll(Link, 1, axis=0)
        Linkrtp1 = np.roll(Link, -1, axis=1)
        Linkrtn1 = np.roll(Link, 1, axis=1)
        Linkrxp1tn1 = np.roll(np.roll(Link, -1, axis=0), 1, axis=1)
        Linkrxn1tp1 = np.roll(np.roll(Link, 1, axis=0), -1, axis=1)
        Staple[0] = Linkrxp1[x1,y1,1]*Linkrtp1[x1,y1,0].conj()*Link[x1,y1,1].conj() + Linkrxp1tn1[x1,y1,1].conj()*Linkrtn1[x1,y1,0].conj()*Linkrtn1[x1,y1,1]
        Staple[1] = Linkrtp1[x1,y1,0]*Linkrxp1[x1,y1,1].conj()*Link[x1,y1,0].conj() + Linkrxn1tp1[x1,y1,0].conj()*Linkrxn1[x1,y1,1].conj()*Linkrxn1[x1,y1,0]
        uni = unitary()
        Linkprop = uni*Link[x1,y1,u1]
        dE3 = (Linkprop - Link[x1,y1,u1])*Staple[u1]
        dE1 = B0*np.real(dE3)
        d1 = np.random.binomial(1, np.minimum(np.exp(dE1), 1))
        d = np.random.uniform(low=0, high=1)
        if d1 >= d:
            Link[x1,y1,u1] = Linkprop
        else:
            Link[x1,y1,u1] = Link[x1,y1,u1]
At the beginning of the program I call a routine ("randommatrix", below) to generate K random unitary complex numbers which have small imaginary parts, plus their conjugates, and store them in an array called Cnum of length 2K. In the same routine I also go through my Link array and set each element to a random unitary complex number. The code is listed below.
def randommatrix():
    global Cnum
    global Link
    for i1 in range(0, K):
        C1 = np.random.normal(0, 1)
        Cnum[i1] = np.cos(C1) + 1j*np.sin(C1)
        Cnum[i1+K] = np.cos(C1) - 1j*np.sin(C1)
    for i3, i4 in itertools.product(range(0, n), range(0, n)):
        C2 = np.random.uniform(low=0, high=2*np.pi)
        Link[i3,i4,0] = np.cos(C2) + 1j*np.sin(C2)
        C2 = np.random.uniform(low=0, high=2*np.pi)
        Link[i3,i4,1] = np.cos(C2) + 1j*np.sin(C2)
The following routine is used during the iteration routine to get a random complex number with a small imaginary part (by retrieving a random element of the Cnum array we generated earlier).
def unitary():
    I1 = np.random.randint((0), (2*K-1))
    mat = Cnum[I1]
    return mat
Here is an example of what the iteration routine would be used for. I've written a routine called plaquette, which calculates the mean plaquette (real part of a 1 by 1 closed loop of link variables) for a given B0. The iteration routine is being used to generate new field configurations which are independent of previous configurations. After we get a new field configuration we calculate the plaquette for said configuration. We then repeat this process j1 times using a while loop, and at the end we end up with the mean plaquette.
def Plq(j1, B0):
    i5 = 0
    Lboot = np.zeros(j1)
    while i5 < j1:
        iteration(25000, B0)
        Linkrxp1 = np.roll(Link, -1, axis=0)
        Linkrtp1 = np.roll(Link, -1, axis=1)
        c0 = np.real(Link[:,:,0]*Linkrxp1[:,:,1]*Linkrtp1[:,:,0].conj()*Link[:,:,1].conj())
        i5 = i5 + 1
We need to define some variables before we run anything, so here are the initial variables which I define before defining any routines:
K = 20000
n = 50
a = 1.0
Link = np.zeros((n,n,2),dtype = complex)
Cnum = np.zeros((2*K), dtype = complex)
This code works, but it is painfully slow. Is there a way that I can use multiprocessing or something to speed this up?
You should use Cython and C data types (see the Cython documentation); it's built for fast computation.
You could use multiprocessing, potentially, in one of two cases.
If you have one object that multiple processes would need to share, you would need to use a Manager (see the multiprocessing docs), Lock, and Array to share the object between processes. However, there is no guarantee this will result in increased speed, since each process needs to lock Link to keep its prediction valid, assuming the predictions are affected by all elements in Link (if a process modifies an element at the same time another process is making a prediction for an element, the prediction wouldn't be based on the most current information).
If your predictions do not take into account the state of the other elements, i.e. each one only cares about its own element, then you could break your Link array into segments, divvy the chunks out to several processes in a process pool, and when done combine the segments back into one array. This would certainly save time, and you wouldn't have to use any additional multiprocessing mechanisms.
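A rough sketch of that second approach (the names update_segment and parallel_sweep are hypothetical, and it assumes each chunk really can be updated without looking at its neighbours, which here is only approximately true since the staples reach one site across the chunk boundary):
import numpy as np
from multiprocessing import Pool

def update_segment(segment):
    # placeholder for the local Metropolis updates on this chunk of Link;
    # each worker only reads and writes its own slice, so no locking is needed
    return segment

def parallel_sweep(link, n_workers=4):
    # split the lattice along the first axis, update each chunk in its own
    # process, then stitch the pieces back together
    chunks = np.array_split(link, n_workers, axis=0)
    with Pool(n_workers) as pool:
        updated = pool.map(update_segment, chunks)
    return np.concatenate(updated, axis=0)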

Decompress Bing Maps GeoData (borders) with Python

I have been trying to decompress the Bing Maps location/border shape data using Python. My end goal is to have custom regions/borders created from combining multiple counties and cities, and save the location data to our database for faster and more accurate location-based analysis.
My strategy is as follows, but I'm a little stuck on #2, since I can't seem to accurately decompress the code:
1. Retrieve County/City borders from the Bing Maps GeoData API - they refer to it as "shape"
2. Decompress their "shape" data to get the latitude and longitude of the border points
3. Remove the points that have the same lat/lng as other shapes (the goal is to make one large shape of multiple counties, as opposed to 5-6 separate shapes)
4. Compress the end result and save it in the database
The function I am using seems to work for the example of 'vx1vilihnM6hR7mEl2Q' provided in the Point Compression Algorithm documentation. However, when I insert something a little more complex, like Cook County, the formula seems to be off (tested by inserting several of the points into different polygon mapping/drawing applications that also use Bing Maps). It basically creates a line at the south side of Chicago that vigorously goes East and West into Indiana, without much North-South movement. Without knowing what the actual coordinates of any counties are supposed to be, I'm not sure how to figure out where I'm going wrong.
Any help is greatly appreciated, even if it is a suggestion of a different strategy.
Here is the python code (sorry for the overuse of the decimal format - my poor attempt to ensure the error wasn't a result of inadvertently losing precision):
import math
from decimal import Decimal

safeCharacters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-'

def decodeBingBorder(compressedData):
    latLng = []
    pointsArray = []
    point = []
    lastLat = Decimal(0)
    lastLng = Decimal(0)
    # Assigns the number of each character based on its index in 'safeCharacters'.
    # Numbers under 32 indicate it is the last number of the combination of the point, and a new point is begun
    for char in compressedData:
        num = Decimal(safeCharacters.index(char))
        if num < 32:
            point.append(num)
            pointsArray.append(point)
            point = []
        else:
            num -= Decimal(32)
            point.append(num)
    # Loops through each point to determine the lat/lng of each point
    for pnt in pointsArray:
        result = Decimal(0)
        # This reverses step 7 of the Point Compression Algorithm https://msdn.microsoft.com/en-us/library/jj158958.aspx
        for num in reversed(pnt):
            if result == 0:
                result = num
            else:
                result = result * Decimal(32) + num
        # This was pretty much taken from the Decompression Algorithm (not in Python format) at https://msdn.microsoft.com/en-us/library/dn306801.aspx
        # Determine which diagonal it's on
        diag = Decimal(int(round((math.sqrt(8 * result + 5) - 1) / 2)))
        # subtract the total number of points from lower diagonals, and get the X and Y from what's left over
        latY = Decimal(result - Decimal(diag * (diag + 1) / 2))
        lngX = Decimal(diag - latY)
        # undo the sign encoding
        if latY % 2 == 1:
            latY = (latY + Decimal(1)) * Decimal(-1)
        if lngX % 2 == 1:
            lngX = (lngX + Decimal(1)) * Decimal(-1)
        latY /= 2
        lngX /= 2
        # undo the delta encoding
        lat = latY + lastLat
        lng = lngX + lastLng
        lastLat = lat
        lastLng = lng
        # position the decimal point
        lat /= Decimal(100000)
        lng /= Decimal(100000)
        # append the point to the latLng list in string format, as opposed to Decimal format
        latLng.append([str(lat), str(lng)])
    return latLng
The compressed shape data:
1440iqu9vJ957r8pB_825syB6rh_gXh1-ntqB56sk2B2nq07Mwvq5f64r0m0Fni11ooE4kkvxEy4wzMuotr_DvsiqvFozvt-Lw9znxH-r5oxLv9yxCwhh7wKnk4sB8o0Rvv56D8snW5n1jBg50K4kplClkpqBpgl9F4h4X_sjMs85Ls6qQi6qvqBr188mBqk-pqIxxsx5EpsjosI-8hgIoygDigU94l_4C
This is the result:
[['41.46986', '-87.79031'], ['41.47033', '-87.52569'], ['41.469145',
'-87.23372'], ['41.469395', '-87.03741'], ['41.41014', '-86.7114'],
['41.397545', '-86.64553'], ['41.3691', '-86.47018'], ['41.359585',
'-86.41984'], ['41.353585', '-86.9637'], ['41.355725', '-87.43971'],
['41.35561', '-87.52716'], ['41.3555', '-87.55277'], ['41.354625',
'-87.63504'], ['41.355635', '-87.54018'], ['41.360745', '-87.40351'],
['41.362315', '-87.29262'], ['41.36214', '-87.43194'], ['41.360915',
'-87.44473'], ['41.35598', '-87.58256'], ['41.3551', '-87.59025'],
['41.35245', '-87.59828'], ['41.34782', '-87.60784'], ['41.34506',
'-87.61664'], ['41.34267', '-87.6219'], ['41.34232', '-87.62643'],
['41.33809', '-87.63286'], ['41.33646', '-87.63956'], ['41.32985',
'-87.65056'], ['41.33069', '-87.65596'], ['41.32965', '-87.65938'],
['41.33063', '-87.6628'], ['41.32924', '-87.66659'], ['41.32851',
'-87.71306'], ['41.327105', '-87.75963'], ['41.329515', '-87.64388'],
['41.32698', '-87.73614'], ['41.32876', '-87.61933'], ['41.328275',
'-87.6403'], ['41.328765', '-87.63857'], ['41.32866', '-87.63969'],
['41.32862', '-87.70802']]
As mentioned by rbrundritt, storing the data from Bing Maps is against the terms of use. However, there are other sources of this same data available, such as http://nationalmap.gov/boundaries.html
In the interest of solving the problem, and to store this and other coordinate data more efficiently, I fixed it by removing the 'round' function when calculating 'diag'. This should be what replaces it:
diag = int((math.sqrt(8 * result + 5) - 1) / 2)
All of the 'Decimal' crap I added is not necessary, so you can remove it if you wish.
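A quick way to sanity-check the fixed function is the short example string from the compression documentation that the question already used for validation:
points = decodeBingBorder('vx1vilihnM6hR7mEl2Q')
for lat, lng in points:
    print(lat, lng)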
You can also do
diag=int(round((sqrt(8 * number + 1)/ 2)-1/2.))
Don't forget to subtract longitude*2 from latitude to get N/E coordinates!
Maybe it will be useful: I found a bug in the code. The invert-pair function should be
diag = math.floor((math.sqrt(8 * result + 1) - 1) / 2)
After fixing this, your implementation works correctly.
You can't store the boundary data from the Bing Maps GeoData API or any data derived from it in a database. This is against the terms of use of the platform.

Substituting values using SymPy for initial conditions of odeint if they are present

I'm trying to solve a set of coupled ODEs using odeint in Python. I initially created the program to solve 3 specified equations (where I use sympy.subs for each value I knew was present), but now I wish to solve N many coupled ODEs. The issue I am running into is how to properly substitute for initial conditions given these equations, where I don't know which values are present (i.e. which ones must be substituted).
For example: For an input of a 3x3 matrix, the set of ODEs I have is:
v0' = -6*v1*v3 - 12*v2*v6
v1' = -3*v1*(6 + v4) - 9*v2*v7 + 3*(3 + v0)*v1
v2' = -6*v2*(9 + v8) + 6*(3 + v0)*v2
v3' = 3*v3*(3 + v0) - 9*v5*v6 - 3*(6 + v4)*v3
v4' = 6*v3*v1 - 6*v5*v7
v5' = 9*v3*v2 - 3*v5*(9 + v8) + 3*(6 + v4)*v5
v6' = 6*v6*(3 + v0) - 6*(9 + v8)*v6
v7' = 9*v6*v1 + 3*v7*(6 + v4) - 3*(9 + v8)*v7
v8' = 12*v6*v2 + 6*v7*v5
where I have initial values for v0-v8 (v0-v8 are set up as symbols through SymPy) in a vector, but without manually substituting in each value I don't know how to solve this.
Is there a way to substitute the values for v0-v8 without knowing which of v0-v8 are present? (For different size matrices, the number of initial values also changes - e.g. a 4x4 matrix has v0-v15.)
Edit: edited so that v0'-v8' are shown as functions of the coupled ODEs. When entering them into odeint, though, the equations are just implicitly equal to v0'-v8' by their position in the vector.
v0-v8 are created by:
v = sy.symbols('v0:%d'%matSize, commutative=False)
where matSize is an input int that corresponds to the size of the input matrix.
