My script calculates the location error from a set of equations for different values of x and y and appends each result to an initially empty list, t_error. However, there are two issues I need to resolve:
1: How to store the output in a 20-by-20 matrix instead of a 400-by-1 array.
2: How to make a contour plot (error surface) using x, y, and the output parameter, which is t_error in our case.
The sample script is below:
import pandas as pd
import numpy as np
import math
ev_loc= pd.read_csv("test_grid.txt", sep='\t',header=None)
x=np.array(ev_loc[1])
y=np.array(ev_loc[0])
v=3.5
t_error=[]
for s in x:
    for t in y:
        for i, j, k in [[73.9,33.1, 1.268571], [73.5,33.1, 1.268571], [73.4,33.1, 2.854286], [73.7,33.2, 0.317143],[73.7,33.0, 0.317143]]:
            u=((np.sqrt((t-j)**2 + (s-i)**2)/v)*111 - k)
            # squared residual, stored under its own name so the velocity v is not overwritten
            sq=u*u
            t_error.append(float(sq))
df_hr = pd.DataFrame(t_error)
numbers = np.array(df_hr)
window_size = 5
i = 0
moving_averages = []
while i < len(numbers) - window_size + 1:
    this_window = numbers[i : i + window_size]
    window_average = sum(this_window)
    moving_averages.append(window_average)
    i += 5
Error = pd.DataFrame(moving_averages)
Error.to_csv('test_total_error.csv')
print(Error)
The data of test_grid.txt is as below
import numpy as np
import matplotlib.pyplot as plt
x1=np.linspace(73,75,num=41)
y1=np.linspace(33,35,num=41)
v=3.5
t_error=[]
for i, j, k in [[71.91500,33.82850, 57.2], [72.32200,33.16267, 38.28], [72.57900, 33.61317, 37.48], [73.44883, 33.83300, 27.8], [71.52967,33.15267, 58.8],
                [73.27017,33.65167, 18.44], [73.14017, 33.75200, 29.97], [72.46550,32.63183, 39.98], [73.22900, 32.99867, 14.77], [72.67167, 31.92100, 48.71],
                [71.91817, 32.53983, 54.73],[71.92333,33.04400, 49.67],[71.74417,32.79617, 57.39]]:
    u=((np.sqrt((y1-j)**2 + (x1-i)**2)/v)*111 - k)
    c=np.sum(u)
    t_error.append(c)
plt.plot(t_error)
plt.show()
What is the error supposed to show?
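For the two issues raised in the question (a 20-by-20 result and a contour plot), here is a minimal sketch rather than a definitive answer. It assumes the five stations and the constants from the question's script, and it uses hypothetical 20-point axes in place of the columns of test_grid.txt, whose contents are not shown. The names error_surface and plot_error_surface are made up for illustration. np.meshgrid builds the grid, so the total squared error comes out as a 2D array directly.

```python
import numpy as np

# Station coordinates (x, y) and observed times, copied from the question's script.
STATIONS = np.array([
    [73.9, 33.1, 1.268571],
    [73.5, 33.1, 1.268571],
    [73.4, 33.1, 2.854286],
    [73.7, 33.2, 0.317143],
    [73.7, 33.0, 0.317143],
])

def error_surface(x, y, stations=STATIONS, velocity=3.5, km_per_degree=111.0):
    """Return the total squared error as an array of shape (len(x), len(y))."""
    # S[a, b] = x[a] and T[a, b] = y[b], matching the loop order `for s in x: for t in y`.
    S, T = np.meshgrid(x, y, indexing="ij")
    total = np.zeros(S.shape)
    for sx, sy, observed in stations:
        predicted = np.sqrt((T - sy) ** 2 + (S - sx) ** 2) / velocity * km_per_degree
        total += (predicted - observed) ** 2
    return total

def plot_error_surface(x, y, total):
    """Draw a filled contour plot of the error surface."""
    import matplotlib.pyplot as plt  # imported here so the computation runs without it
    # contourf expects Z with shape (len(y), len(x)), hence the transpose.
    plt.contourf(x, y, total.T, levels=20)
    plt.colorbar(label="total squared error")
    plt.xlabel("x")
    plt.ylabel("y")
    plt.show()

# Hypothetical 20-point axes standing in for the columns of test_grid.txt.
x = np.linspace(73.0, 74.0, 20)
y = np.linspace(33.0, 34.0, 20)
total = error_surface(x, y)
print(total.shape)  # (20, 20)
```

Calling plot_error_surface(x, y, total) then draws the surface. If the flat list of 400 per-point sums from the original script is kept instead, np.asarray(moving_averages).reshape(20, 20) should give the same layout, with rows following x and columns following y.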
I don't understand why the following code outputs the same random values for simulated_returns_pr from the second loop iteration onward (the same goes for the two charts produced by the function). I removed some code, but all the subsequent variables, which should differ, are also identical from the second iteration onward. I am missing something but cannot see what. Any contribution would be appreciated.
My code:
logR= timeseries
i=1
while i < 5:
    simulated_returns_pr= np.random.normal(loc=mean(logR)*30, scale=stdev(logR)*np.sqrt(30.), size=30)
    seed = 2
    N = 30
    def Brownian(seed, N):
        np.random.seed(seed)
        dt = 1./N # time step
        b = simulated_returns_pr*np.sqrt(dt)
        W = np.cumsum(b) # brownian path
        return W, b
    b = Brownian(seed, N)[1]
    W = Brownian(seed, N)[0]
    W = np.insert(W, 0, 0.)
    plt.rcParams['figure.figsize'] = (10,8)
    xb = np.linspace(1, len(b), len(b))
    plt.plot(xb, b)
    plt.title('Brownian Increments')
    plt.show()
    xw = np.linspace(1, len(W), len(W))
    plt.plot(xw, W)
    plt.title('Brownian Motion')
    plt.show()
    i += 1
Output simulated_returns_pr:
[ 0.012191 1.16322303 -0.23225735 -0.12357125 0.35687974 1.02187274
0.25248517 0.74665974 0.54373161 0.43677913 0.69960184 -0.81226681
0.50380517 -0.25108897 0.47459444 0.49541601 0.79958083 -0.20233765
0.5142276 -0.31340253 0.46332258 0.48350956 0.06662023 0.53800548
-0.01440759 -0.23280276 -0.07377719 -0.29948791 0.15798112 0.10707121]
[-0.10796927 0.07350919 -0.97356921 0.9275805 -0.80101665 -0.32191758
0.35499571 -0.52506813 -0.43075947 -0.35577774 0.37944815 1.25577886
0.12274682 -0.4609512 0.37320789 -0.19828379 0.09220437 0.69335439
-0.27465829 0.10637854 -0.3402222 0.02308293 0.2309978 -0.3959363
-0.06873477 -0.01706476 -0.21917336 -0.49603296 -0.61363441 0.02456247]
[-0.10796927 0.07350919 -0.97356921 0.9275805 -0.80101665 -0.32191758
0.35499571 -0.52506813 -0.43075947 -0.35577774 0.37944815 1.25577886
0.12274682 -0.4609512 0.37320789 -0.19828379 0.09220437 0.69335439
-0.27465829 0.10637854 -0.3402222 0.02308293 0.2309978 -0.3959363
-0.06873477 -0.01706476 -0.21917336 -0.49603296 -0.61363441 0.02456247]
[-0.10796927 0.07350919 -0.97356921 0.9275805 -0.80101665 -0.32191758
0.35499571 -0.52506813 -0.43075947 -0.35577774 0.37944815 1.25577886
0.12274682 -0.4609512 0.37320789 -0.19828379 0.09220437 0.69335439
-0.27465829 0.10637854 -0.3402222 0.02308293 0.2309978 -0.3959363
-0.06873477 -0.01706476 -0.21917336 -0.49603296 -0.61363441 0.02456247]
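A likely cause, assuming the loop body is as posted: Brownian calls np.random.seed(seed) with a fixed seed, which resets NumPy's global generator. The first np.random.normal call runs before any seeding, but every later one runs right after the generator was reset to the same state, so it returns the same values. The sketch below reproduces the effect and shows one possible fix, a generator created once before the loop. The helper names draws_with_reseed and draws_with_generator are hypothetical.

```python
import numpy as np

def draws_with_reseed(n_iter, seed=2, size=5):
    """Mimic the question's loop: draw, then reseed the global generator."""
    draws = []
    for _ in range(n_iter):
        draws.append(np.random.normal(size=size))
        np.random.seed(seed)  # same effect as the call inside Brownian()
    return draws

def draws_with_generator(n_iter, seed=2, size=5):
    """Create one generator before the loop and never reseed it."""
    rng = np.random.default_rng(seed)
    return [rng.normal(size=size) for _ in range(n_iter)]

repeated = draws_with_reseed(4)
print(np.array_equal(repeated[1], repeated[2]))  # True: identical from the second draw on

fresh = draws_with_generator(4)
print(np.array_equal(fresh[1], fresh[2]))  # False: each draw continues the same stream
```

In the posted code, removing the np.random.seed call from Brownian, or seeding once before the while loop, would probably have the same effect, since the function itself draws no random numbers.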
I have two loops that run over different x and y coordinates, and for each (x, y) pair a linear system is solved for force 1 and force 2 using the matrix method, i.e. solving Ax = C via the inverse of A. Each iteration gives its answer as a matrix whose first element is force 1 and whose second element is force 2 at those coordinates. Here's my code:
import numpy as np
from scipy import linalg
def Force():
    Force1 = np.zeros((160,90))
    Force2 = np.zeros((160,90))
    for x in np.arange(0,16.1,0.1):
        for y in np.arange(1,9.1,0.1):
            l1 = np.hypot(x,y)
            l2 = np.hypot(15-x,y)
            A = np.array([[(x/l1),((x-15)/l2)],[(y/l1),(y/l2)]])
            c = np.array([[0],[70*9.81]])
            F = linalg.solve(A,c)
            Force1[x,y] = F[0]
            Force2[x,y] = F[1]
            print("Force 1 = {} \nForce 2 = {}\n".format(F[0], F[1]))
So at each point (x, y) a matrix [[Force 1],[Force 2]] is solved. Now I would like to collect all the Force 1 values into an array Force1[x,y], and similarly for the Force 2 values, so that I can do
plt.imshow(Force1)
plt.imshow(Force2)
to plot two heatmaps. How would I go about doing that?
This solves your issue: you were trying to index Force1 and Force2 with floats. I've changed the for loops to use enumerate, which supplies integer indices, and tweaked the assignment so it stores F[0][0] and F[1][0].
import numpy as np
import matplotlib.pyplot as plt
from scipy import linalg
def Force():
    Force1 = np.zeros((160,90))
    Force2 = np.zeros((160,90))
    for i, x in enumerate(np.arange(0,16,0.1)):
        for j, y in enumerate(np.arange(1,9,0.1)):
            l1 = np.hypot(x,y)
            l2 = np.hypot(15-x,y)
            A = np.array([[(x/l1),((x-15)/l2)],[(y/l1),(y/l2)]])
            c = np.array([[0],[70*9.81]])
            F = linalg.solve(A,c)
            Force1[i, j] = F[0][0]
            Force2[i, j] = F[1][0]
            # print("Force 1 = {} \nForce 2 = {}\n".format(F[0], F[1]))
    plt.imshow(Force1)
    plt.show()
    plt.imshow(Force2)
    plt.show()
Force()
The generated plots are heatmaps of Force1 and Force2 (images not reproduced here).
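The same grid of solutions can also be computed without explicit Python loops. The sketch below is one possible variant, not the answer's method: it assumes the geometry and load from the code above and relies on np.linalg.solve accepting stacked 2x2 systems. The helper name force_grids is hypothetical.

```python
import numpy as np

def force_grids(xs, ys, weight=70 * 9.81):
    """Solve the 2x2 system at every (x, y) and return two arrays of shape (len(xs), len(ys))."""
    X, Y = np.meshgrid(xs, ys, indexing="ij")
    l1 = np.hypot(X, Y)
    l2 = np.hypot(15 - X, Y)
    # One 2x2 coefficient matrix per grid point, stacked along the leading axes.
    A = np.empty(X.shape + (2, 2))
    A[..., 0, 0] = X / l1
    A[..., 0, 1] = (X - 15) / l2
    A[..., 1, 0] = Y / l1
    A[..., 1, 1] = Y / l2
    # One right-hand side column [[0], [weight]] per grid point.
    c = np.zeros(X.shape + (2, 1))
    c[..., 1, 0] = weight
    F = np.linalg.solve(A, c)  # shape (len(xs), len(ys), 2, 1)
    return F[..., 0, 0], F[..., 1, 0]

xs = np.arange(0, 16, 0.1)
ys = np.arange(1, 9, 0.1)
force1, force2 = force_grids(xs, ys)
print(force1.shape == (xs.size, ys.size))  # True
```

The returned arrays can be passed to plt.imshow in the same way as Force1 and Force2. Sizing the result from the coordinate arrays also avoids hard-coding (160, 90).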
Consider I have these lists:
l = [5,6,7,8,9,10,5,15,20]
m = [10,5]
I want to get the index of m in l. I used list comprehension to do that:
[(i,i+1) for i,j in enumerate(l) if m[0] == l[i] and m[1] == l[i+1]]
Output : [(5,6)]
But if I have more numbers in m, I feel it's not the right way. So is there an easier approach in Python or with NumPy?
Another example:
l = [5,6,7,8,9,10,5,15,20,50,16,18]
m = [10,5,15,20]
The output should be:
[(5,6,7,8)]
The easiest way (using pure Python) would be to iterate over the items and first only check if the first item matches. This avoids doing sublist comparisons when not needed. Depending on the contents of your l this could outperform even NumPy broadcasting solutions:
def func(haystack, needle): # obviously needs a better name ...
    if not needle:
        return
    # just optimization
    lengthneedle = len(needle)
    firstneedle = needle[0]
    for idx, item in enumerate(haystack):
        if item == firstneedle:
            if haystack[idx:idx+lengthneedle] == needle:
                yield tuple(range(idx, idx+lengthneedle))
>>> list(func(l, m))
[(5, 6, 7, 8)]
In case you're interested in speed, I checked the performance of the approaches (borrowing from my setup here):
import random
import numpy as np
# strided_app is from https://stackoverflow.com/a/40085052/
def strided_app(a, L, S ): # Window len = L, Stride len/stepsize = S
    nrows = ((a.size-L)//S)+1
    n = a.strides[0]
    return np.lib.stride_tricks.as_strided(a, shape=(nrows,L), strides=(S*n,n))
def pattern_index_broadcasting(all_data, search_data):
    n = len(search_data)
    all_data = np.asarray(all_data)
    all_data_2D = strided_app(np.asarray(all_data), n, S=1)
    return np.flatnonzero((all_data_2D == search_data).all(1))
# view1D is from https://stackoverflow.com/a/45313353/
def view1D(a, b): # a, b are arrays
    a = np.ascontiguousarray(a)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(), b.view(void_dt).ravel()
def pattern_index_view1D(all_data, search_data):
    a = strided_app(np.asarray(all_data), L=len(search_data), S=1)
    a0v, b0v = view1D(np.asarray(a), np.asarray(search_data))
    return np.flatnonzero(np.in1d(a0v, b0v))
def find_sublist_indices(haystack, needle):
    if not needle:
        return
    # just optimization
    lengthneedle = len(needle)
    firstneedle = needle[0]
    restneedle = needle[1:]
    for idx, item in enumerate(haystack):
        if item == firstneedle:
            if haystack[idx+1:idx+lengthneedle] == restneedle:
                yield tuple(range(idx, idx+lengthneedle))
def Divakar1(l, m):
    return np.squeeze(pattern_index_broadcasting(l, m)[:,None] + np.arange(len(m)))
def Divakar2(l, m):
    return np.squeeze(pattern_index_view1D(l, m)[:,None] + np.arange(len(m)))
def MSeifert(l, m):
    return list(find_sublist_indices(l, m))
# Timing setup
timings = {Divakar1: [], Divakar2: [], MSeifert: []}
sizes = [2**i for i in range(5, 20, 2)]
# Timing
for size in sizes:
    l = [random.randint(0, 50) for _ in range(size)]
    m = [random.randint(0, 50) for _ in range(10)]
    larr = np.asarray(l)
    marr = np.asarray(m)
    for func in timings:
        # first timings:
        # res = %timeit -o func(l, m)
        # second timings:
        if func is MSeifert:
            res = %timeit -o func(l, m)
        else:
            res = %timeit -o func(larr, marr)
        timings[func].append(res)
%matplotlib notebook
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure(1)
ax = plt.subplot(111)
for func in timings:
    ax.plot(sizes,
            [time.best for time in timings[func]],
            label=str(func.__name__))
ax.set_xscale('log')
ax.set_yscale('log')
ax.set_xlabel('size')
ax.set_ylabel('time [seconds]')
ax.grid(which='both')
ax.legend()
plt.tight_layout()
If your l and m are lists, my function outperforms the NumPy solutions for all sizes (timing plot omitted).
But if you have them as NumPy arrays, Divakar's NumPy solutions are faster for large arrays (size > 1000 elements) (timing plot omitted).
You are basically looking for the starting indices of a list in another list.
Approach #1 : One approach would be to create sliding windows over the list being searched, giving us a 2D array, and then use NumPy broadcasting to compare the search list against each row of that 2D sliding-window array. Thus, one method would be -
# strided_app is from https://stackoverflow.com/a/40085052/
def strided_app(a, L, S ): # Window len = L, Stride len/stepsize = S
    nrows = ((a.size-L)//S)+1
    n = a.strides[0]
    return np.lib.stride_tricks.as_strided(a, shape=(nrows,L), strides=(S*n,n))
def pattern_index_broadcasting(all_data, search_data):
    n = len(search_data)
    all_data = np.asarray(all_data)
    all_data_2D = strided_app(np.asarray(all_data), n, S=1)
    return np.flatnonzero((all_data_2D == search_data).all(1))
out = np.squeeze(pattern_index_broadcasting(l, m)[:,None] + np.arange(len(m)))
Sample runs -
In [340]: l = [5,6,7,8,9,10,5,15,20,50,16,18]
     ...: m = [10,5,15,20]
     ...:
In [341]: np.squeeze(pattern_index_broadcasting(l, m)[:,None] + np.arange(len(m)))
Out[341]: array([5, 6, 7, 8])
In [342]: l = [5,6,7,8,9,10,5,15,20,50,16,18,10,5,15,20]
     ...: m = [10,5,15,20]
     ...:
In [343]: np.squeeze(pattern_index_broadcasting(l, m)[:,None] + np.arange(len(m)))
Out[343]:
array([[ 5,  6,  7,  8],
       [12, 13, 14, 15]])
Approach #2 : Another method would be to get the sliding windows and then take a row-wise scalar view of both the data being searched and the data being searched for, giving us 1D data to work with, like so -
# view1D is from https://stackoverflow.com/a/45313353/
def view1D(a, b): # a, b are arrays
    a = np.ascontiguousarray(a)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(), b.view(void_dt).ravel()
def pattern_index_view1D(all_data, search_data):
    a = strided_app(np.asarray(all_data), L=len(search_data), S=1)
    a0v, b0v = view1D(np.asarray(a), np.asarray(search_data))
    return np.flatnonzero(np.in1d(a0v, b0v))
out = np.squeeze(pattern_index_view1D(l, m)[:,None] + np.arange(len(m)))
2020 Versions
In search of easier, more compact approaches, we could look into scikit-image's view_as_windows, a built-in for getting sliding windows. I am assuming arrays as inputs for less messy code. For lists as input, we have to use np.asarray() as shown earlier.
Approach #3 : Basically a derivative of pattern_index_broadcasting that uses view_as_windows for a one-liner, with a as the larger data and b as the array to be searched for -
from skimage.util import view_as_windows
np.flatnonzero((view_as_windows(a,len(b))==b).all(1))[:,None]+np.arange(len(b))
Approach #4 : For a small number of matches of b in a, we could optimize by first looking for matches of b's first element, which reduces the dataset size for the search -
mask = a[:-len(b)+1]==b[0]
mask[mask] = (view_as_windows(a,len(b))[mask]==b).all(1)
out = np.flatnonzero(mask)[:,None]+np.arange(len(b))
Approach #5 : For a small b, we could simply loop over the elements of b and perform a bitwise and-reduction -
mask = np.bitwise_and.reduce([a[i:len(a)-len(b)+1+i]==b[i] for i in range(len(b))])
out = np.flatnonzero(mask)[:,None]+np.arange(len(b))
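As a NumPy-only alternative to view_as_windows, the sliding windows can also be taken with numpy.lib.stride_tricks.sliding_window_view. This is a minimal sketch that assumes NumPy 1.20 or later, where that function is available. The name pattern_indices and the empty-result handling are illustrative choices, not part of the approaches above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def pattern_indices(a, b):
    """Return one row of indices per occurrence of b in a, shape (n_matches, len(b))."""
    a = np.asarray(a)
    b = np.asarray(b)
    if b.size == 0 or b.size > a.size:
        return np.empty((0, b.size), dtype=int)
    # Read-only view of shape (len(a) - len(b) + 1, len(b)); no data is copied.
    windows = sliding_window_view(a, b.size)
    starts = np.flatnonzero((windows == b).all(axis=1))
    return starts[:, None] + np.arange(b.size)

l = [5, 6, 7, 8, 9, 10, 5, 15, 20, 50, 16, 18, 10, 5, 15, 20]
m = [10, 5, 15, 20]
print(pattern_indices(l, m).tolist())  # [[5, 6, 7, 8], [12, 13, 14, 15]]
```

Unlike as_strided, sliding_window_view checks the window size against the array, so an oversized window raises an error instead of reading out of bounds.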
Just making the point that @MSeifert's approach can, of course, also be implemented in NumPy:
def pp(h,n):
    nn = len(n)
    NN = len(h)
    c = (h[:NN-nn+1]==n[0]).nonzero()[0]
    if c.size==0: return
    for i,l in enumerate(n[1:].tolist(),1):
        c = c[h[i:][c]==l]
        if c.size==0: return
    return np.arange(c[0],c[0]+nn)
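from collections import defaultdict
def get_data(l1,l2):
    d=defaultdict(list)
    # map each element of l1 to the list of positions where it occurs
    for index,item in enumerate(l1):
        d[item].append(index)
    print(d)
This uses a defaultdict to store the indices of the elements of the other list.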
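The snippet above stops at printing the index map, so how it was meant to be used is an assumption. One possible continuation, sketched with hypothetical helper names build_index and find_sublist, is to treat the positions of the first element of m as candidate start points and confirm each with a slice comparison. It assumes both inputs are plain lists.

```python
from collections import defaultdict

def build_index(items):
    """Map each value to the list of positions where it occurs."""
    positions = defaultdict(list)
    for index, item in enumerate(items):
        positions[item].append(index)
    return positions

def find_sublist(haystack, needle):
    """Return a tuple of indices for every occurrence of needle in haystack."""
    if not needle:
        return []
    positions = build_index(haystack)
    n = len(needle)
    # Only positions holding needle[0] can start a match.
    return [
        tuple(range(start, start + n))
        for start in positions.get(needle[0], [])
        if haystack[start:start + n] == needle
    ]

l = [5, 6, 7, 8, 9, 10, 5, 15, 20, 50, 16, 18]
m = [10, 5, 15, 20]
print(find_sublist(l, m))  # [(5, 6, 7, 8)]
```

Building the index costs one pass over l, so this mainly pays off when the same l is searched for many different m.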