Some problems with the array dimensions, I guess - Python

With this code I want to find a minimum of a two-dimensional function using Newton's method:
from numpy import array, atleast_2d
from numpy.linalg import solve, norm
def newton2d(f, df, x, tol=1e-12, maxit=50):
    x = atleast_2d(x)
    for i in range(maxit):
        s = solve(df(x), f(x))
        x -= s
        if norm(s) < tol:
            print(x)
            print(i)
            break
f = lambda x: array([x[0]**2 - x[1]**4, x[0] - x[1]**3])
df = lambda x: array([[2*x[0], -4*x[1]**3], [1, -3*x[1]**2]])
x = array([0.7, 0.7])
newton2d(f, df, x)
I think this code should work, but I get the following error:
IndexError: index 1 is out of bounds for axis 0 with size 1
Thanks for any help!
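A possible explanation, offered as a sketch rather than a confirmed answer: atleast_2d turns the shape-(2,) start vector into shape (1, 2), so x[1] inside f asks for a second row that does not exist, which matches the IndexError above. Keeping x one-dimensional avoids it. The version below also returns the result instead of printing it and copies the start vector so the caller's array is not modified; both are my own choices, not part of the original code. Note that this iteration solves f(x) = 0, with df as the Jacobian of f.

```python
import numpy as np

def newton2d(f, df, x0, tol=1e-12, maxit=50):
    """Newton iteration for f(x) = 0, where df(x) is the Jacobian of f."""
    x = np.array(x0, dtype=float)  # 1-D float copy; no atleast_2d
    for i in range(maxit):
        s = np.linalg.solve(df(x), f(x))  # Newton step: J(x) s = f(x)
        x -= s
        if np.linalg.norm(s) < tol:
            return x, i  # converged: solution and iteration index
    raise RuntimeError("no convergence within maxit iterations")

f = lambda x: np.array([x[0]**2 - x[1]**4, x[0] - x[1]**3])
df = lambda x: np.array([[2*x[0], -4*x[1]**3], [1, -3*x[1]**2]])

root, iterations = newton2d(f, df, [0.7, 0.7])
print(root, iterations)
```

Raising on non-convergence is a design choice; returning the last iterate with a flag would work as well.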

Replace outlier values with NaN in numpy? (preserve length of array)

I have an array of magnetometer data with artifacts every two hours due to power cycling.
I'd like to replace those indices with NaN so that the length of the array is preserved.
Here's a code example, adapted from https://www.kdnuggets.com/2017/02/removing-outliers-standard-deviation-python.html.
import numpy as np
import plotly.express as px
# For pulling data from CDAweb:
from ai import cdas
import datetime
# Import data:
start = datetime.datetime(2016, 1, 24, 0, 0, 0)
end = datetime.datetime(2016, 1, 25, 0, 0, 0)
data = cdas.get_data(
    'sp_phys',
    'THG_L2_MAG_' + 'PG2',
    start,
    end,
    ['thg_mag_' + 'pg2']
)
x = data['UT']
y = data['VERTICAL_DOWN_-_Z']
def reject_outliers(y):  # y is the data in a 1D numpy array
    n = 5  # 5 std deviations
    mean = np.mean(y)
    sd = np.std(y)
    final_list = [x for x in y if (x > mean - 2 * sd)]
    final_list = [x for x in final_list if (x < mean + 2 * sd)]
    return final_list
px.scatter(reject_outliers(y))
print('Length of y: ')
print(len(y))
print('Length of y with outliers removed (should be the same): ')
print(len(reject_outliers(y)))
px.line(y=y, x=x)
# px.scatter(y) # It looks like the outliers are successfully dropped.
# px.line(y=reject_outliers(y), x=x) # This is the line I'd like to see work.
When I run 'px.scatter(reject_outliers(y))', it looks like the outliers are successfully getting dropped:
...but that's looking at the culled y vector relative to the index, rather than the datetime vector x as in the above plot. As the debugging text indicates, the vector is shortened because the outlier values are dropped rather than replaced.
How can I edit my 'reject_outliers()' function to set those values to NaN, or to adjacent values, so that the array keeps its length and I can plot my data?
Use else in the list comprehension along the lines of:
[x if x_condition else other_value for x in y]
Got a less compact version to work. Full code:
import numpy as np
import plotly.express as px
# For pulling data from CDAweb:
from ai import cdas
import datetime
# Import data:
start = datetime.datetime(2016, 1, 24, 0, 0, 0)
end = datetime.datetime(2016, 1, 25, 0, 0, 0)
data = cdas.get_data(
    'sp_phys',
    'THG_L2_MAG_' + 'PG2',
    start,
    end,
    ['thg_mag_' + 'pg2']
)
x = data['UT']
y = data['VERTICAL_DOWN_-_Z']
def reject_outliers(y):  # y is the data in a 1D numpy array
    mean = np.mean(y)
    sd = np.std(y)
    final_list = np.copy(y)
    for n in range(len(y)):
        final_list[n] = y[n] if y[n] > mean - 5 * sd else np.nan
        final_list[n] = final_list[n] if final_list[n] < mean + 5 * sd else np.nan
    return final_list
px.scatter(reject_outliers(y))
print('Length of y: ')
print(len(y))
print('Length of y with outliers removed (should be the same): ')
print(len(reject_outliers(y)))
# px.line(y=y, x=x)
px.line(y=reject_outliers(y), x=x) # This is the line I wanted to get working - check!
More compact answer, sent via email by a friend:
In numpy you can select/index based on a Boolean array, and then make assignment with it:
def reject_outliers(y):  # y is the data in a 1D numpy array
    n = 5  # 5 std deviations
    mean = np.mean(y)
    sd = np.std(y)
    final_list = y.copy()
    final_list[np.abs(y - mean) > n * sd] = np.nan
    return final_list
I also noticed that you didn’t use the value of n in your example code.
Alternatively, you can use the np.where function (https://numpy.org/doc/stable/reference/generated/numpy.where.html):
np.where(np.abs(y - mean) > n * sd, np.nan, y)
You don’t need the .copy() if you don’t mind modifying the input array.
Replace np.mean and np.std with np.nanmean and np.nanstd if you want the function to work on arrays that already contain nans, i.e. if you want to use this function recursively.
The answer about using if else in a list comprehension would work, but avoiding the list comprehension makes the function much faster if the arrays are large.
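To make the vectorized approach easy to try without the CDAweb download, here is a minimal sketch on made-up data. The synthetic signal, the spike positions, and the n=5 default argument are assumptions for illustration only. It combines np.where with np.nanmean and np.nanstd as suggested above, and converts the input to float first, since NaN cannot be stored in an integer array.

```python
import numpy as np

def reject_outliers(y, n=5):
    """Return a float copy of y with values more than n std devs from the mean set to NaN."""
    y = np.asarray(y, dtype=float)  # NaN needs a float array
    mean = np.nanmean(y)            # nan-aware, so the function can be re-applied
    sd = np.nanstd(y)
    return np.where(np.abs(y - mean) > n * sd, np.nan, y)

# Made-up data: a smooth signal with two artificial spikes
signal = np.sin(np.linspace(0, 2 * np.pi, 200))
signal[[50, 150]] = [40.0, -40.0]

cleaned = reject_outliers(signal)
print(len(signal), len(cleaned), np.isnan(cleaned).sum())
```

Because the output has the same length as the input, it can be passed straight to px.line(y=cleaned, x=x) with the original time vector.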

How to append the first element of a matrix onto a list over a loop?

I have two loops that run over different x and y coordinates. For each (x, y) pair, a linear system is solved for force 1 and force 2 using the matrix method, i.e. inverting A in Ax = C. Each iteration gives the answer as a matrix whose first element is force 1 and whose second element is force 2 at those coordinates. Here's my code:
import numpy as np
from scipy import linalg
def Force():
    Force1 = np.zeros((160,90))
    Force2 = np.zeros((160,90))
    for x in np.arange(0,16.1,0.1):
        for y in np.arange(1,9.1,0.1):
            l1 = np.hypot(x,y)
            l2 = np.hypot(15-x,y)
            A = np.array([[(x/l1),((x-15)/l2)],[(y/l1),(y/l2)]])
            c = np.array([[0],[70*9.81]])
            F = linalg.solve(A,c)
            Force1[x,y] = F[0]
            Force2[x,y] = F[1]
            print("Force 1 = {} \nForce 2 = {}\n".format(F[0], F[1]))
So at each point (x, y) a matrix [[Force 1], [Force 2]] is solved. Now I would like to collect all the Force 1 values into Force1[x,y], and similarly all the Force 2 values into Force2[x,y], so that I can do
plt.imshow(Force1)
plt.imshow(Force2)
to plot two heatmaps. How would I go about doing that?
This solves your issue - you were trying to assign to indices in Force1 and Force2 of type float. I've changed the for loops to use enumerate instead, and tweaked the assignment so it assigns F[0][0] and F[1][0].
import numpy as np
import matplotlib.pyplot as plt
from scipy import linalg
def Force():
    xs = np.arange(0, 16, 0.1)
    ys = np.arange(1, 9, 0.1)
    # Size the result arrays from the grids so that every cell gets filled
    Force1 = np.zeros((len(xs), len(ys)))
    Force2 = np.zeros((len(xs), len(ys)))
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            l1 = np.hypot(x, y)
            l2 = np.hypot(15 - x, y)
            A = np.array([[(x/l1), ((x-15)/l2)], [(y/l1), (y/l2)]])
            c = np.array([[0], [70*9.81]])
            F = linalg.solve(A, c)
            # F has shape (2, 1); store the scalars at integer indices
            Force1[i, j] = F[0][0]
            Force2[i, j] = F[1][0]
            # print("Force 1 = {} \nForce 2 = {}\n".format(F[0], F[1]))
    plt.imshow(Force1)
    plt.show()
    plt.imshow(Force2)
    plt.show()
Force()
The generated plots are two heatmaps, one of Force1 and one of Force2.
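As an aside, the double loop can be avoided entirely, because np.linalg.solve accepts a stack of matrices of shape (..., M, M). The following is only a sketch of that idea under the same geometry as the answer above; the names X, Y, force1, and force2 are mine, and the plotting calls are left out.

```python
import numpy as np

xs = np.arange(0, 16, 0.1)
ys = np.arange(1, 9, 0.1)
X, Y = np.meshgrid(xs, ys, indexing="ij")  # X[i, j] = xs[i], Y[i, j] = ys[j]

L1 = np.hypot(X, Y)
L2 = np.hypot(15 - X, Y)

# One 2x2 matrix per grid point: shape (len(xs), len(ys), 2, 2)
A = np.empty(X.shape + (2, 2))
A[..., 0, 0] = X / L1
A[..., 0, 1] = (X - 15) / L2
A[..., 1, 0] = Y / L1
A[..., 1, 1] = Y / L2

# One 2x1 right-hand side per grid point
c = np.zeros(X.shape + (2, 1))
c[..., 1, 0] = 70 * 9.81

F = np.linalg.solve(A, c)  # solves all systems in one call
force1 = F[..., 0, 0]
force2 = F[..., 1, 0]
print(force1.shape, force2.shape)
```

force1 and force2 can then be passed to plt.imshow in the same way as Force1 and Force2.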

pandas, correctly handle numpy arrays inside a row element

I'll give a minimal example where I would create numpy arrays inside row elements of a pandas.DataFrame.
TL;DR: see the screenshot of the DataFrame
This code finds the minimum of a certain function using scipy.optimize.brute, which returns the minimum, the variable value at which the minimum is found, and two numpy arrays: the grid points at which it evaluated the function and the function values there.
import numpy as np
import pandas as pd
import scipy.optimize
import itertools
sin = lambda r, phi, x: r * np.sin(phi * x)
def func(r, x):
    x0, fval, grid, Jout = scipy.optimize.brute(
        sin, ranges=[(-np.pi, np.pi)], args=(r, x), Ns=10, full_output=True)
    return dict(phi_at_min=x0[0], result_min=fval, phis=grid, result_at_grid=Jout)
rs = np.linspace(-1, 1, 10)
xs = np.linspace(0, 1, 10)
vals = list(itertools.product(rs, xs))
result = [func(r, x) for r, x in vals]
# idk whether this is the best way of generating the DataFrame, but it works
df = pd.DataFrame(vals, columns=['r', 'x'])
df = pd.concat((pd.DataFrame(result), df), axis=1)
df.head()
I expect that this is not how I am supposed to do this and should maybe expand the lists somehow. How do I handle this in a correct, beautiful, and clean way?
So, even though "beautiful and clean" is subject to interpretation, I'll give you mine, which should in turn give you some ideas. I'm leveraging a MultiIndex so that you can later easily select pairs of phi/result_at_grid for each point in the evaluation grid. I'm also using apply instead of creating two dataframes.
import numpy as np
import pandas as pd
import scipy.optimize
import itertools
sin = lambda r, phi, x: r * np.sin(phi * x)
def func(row):
    """
    Accepts a row of a dataframe (a pd.Series).
    df.apply(func, axis=1)
    returns a pd.Series with the initial (r,x) and the results
    """
    r = row['r']
    x = row['x']
    x0, fval, grid, Jout = scipy.optimize.brute(
        sin, ranges=[(-np.pi, np.pi)], args=(r, x), Ns=10, full_output=True)
    # Create a multi index series for the phis
    phis = pd.Series(grid)
    phis.index = pd.MultiIndex.from_product([['Phis'], phis.index])
    # same for result at grid
    result_at_grid = pd.Series(Jout)
    result_at_grid.index = pd.MultiIndex.from_product([['result_at_grid'], result_at_grid.index])
    # concat
    s = pd.concat([phis, result_at_grid])
    # Add these two float results
    s['phi_at_min'] = x0[0]
    s['result_min'] = fval
    # add the initial r,x to reconstruct the index later
    s['r'] = r
    s['x'] = x
    return s
rs = np.linspace(-1, 1, 10)
xs = np.linspace(0, 1, 10)
vals = list(itertools.product(rs, xs))
df = pd.DataFrame(vals, columns=['r', 'x'])
# Apply func to each row (axis=1)
results = df.apply(func, axis=1)
results.set_index(['r','x'], inplace=True)
results.head().T # Transposing so we can see the output in one go...
Now you can select all values at evaluation grid point 2, for example:
print(results.swaplevel(0,1, axis=1)[2].head()) # Showing only 5 first
                   Phis  result_at_grid
r    x
-1.0 0.000000 -1.745329        0.000000
     0.111111 -1.745329        0.193527
     0.222222 -1.745329        0.384667
     0.333333 -1.745329        0.571062
     0.444444 -1.745329        0.750415
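Another layout that may be worth considering is a "long" table with one row per (r, x, phi) combination, which keeps every cell a plain float. The sketch below uses made-up records in place of the scipy.optimize.brute output so that it runs on its own; the column names phi and result and the small grid are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the brute-force output: one record per (r, x) pair
phis = np.linspace(-np.pi, np.pi, 4)
records = [
    {"r": r, "x": x, "phis": phis, "result_at_grid": r * np.sin(phis * x)}
    for r in (-1.0, 1.0)
    for x in (0.5, 1.0)
]

# One small frame per record; the scalars r and x are broadcast along the arrays
frames = [
    pd.DataFrame({"r": rec["r"], "x": rec["x"],
                  "phi": rec["phis"], "result": rec["result_at_grid"]})
    for rec in records
]
long_df = pd.concat(frames, ignore_index=True)

# Row holding the smallest grid value for each (r, x) pair
best = long_df.loc[long_df.groupby(["r", "x"])["result"].idxmin()]
print(long_df.head())
print(best)
```

In this form the usual groupby, filtering, and plotting tools apply directly, at the cost of repeating r and x on every row.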

How to create numpy arrays from list of numbers

I am learning numerical computing in python and tried the following code to integrate a function:
import numpy as np
import scipy.integrate as spi
def integration(z):
    if np.isscalar(z):
        # spi.quad returns the integrated value and an error estimate
        y, err = spi.quad(lambda x: 1/np.sqrt(1+x), 0, z)
        print(y)  # result for scalar input
    else:
        for x in z:
            y, err = spi.quad(lambda x: 1/np.sqrt(1+x), 0, x)
            print(y)  # result for arrays
    return
But the result I get is not an array, and I need an array for further computation. I get the following result:
>>> z = np.linspace(0,1,10)
>>> integration(z)
0.0
0.108185106779
0.21108319357
0.309401076759
0.403700850309
......
How should I modify my code to get a numpy array?
Simple:
import numpy as np
import scipy.integrate as spi
def integration(z):
    if np.isscalar(z):
        z = np.asarray([z])
    # Float output, so that integer input does not truncate the results
    y = np.empty_like(z, dtype=float)
    for i in range(z.shape[0]):
        y[i], err = spi.quad(lambda x: 1/np.sqrt(1+x), 0, z[i])
    return y
Test:
>>> z = np.linspace(0,1,10)
>>> intg_z = integration(z)
>>> print(intg_z)
[ 0. 0.10818511 0.21108319 0.30940108 0.40370085 0.49443826
0.5819889 0.66666667 0.74873708 0.82842712]
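For this particular integrand the result can be checked against a closed form, since the antiderivative of 1/sqrt(1+x) is 2*sqrt(1+x). The sketch below is a compact variant of the answer (the list comprehension and the name upper are my own choices) together with that check.

```python
import numpy as np
import scipy.integrate as spi

def integration(z):
    """Integrate 1/sqrt(1+x) from 0 to each upper limit in z; returns a float array."""
    z = np.atleast_1d(np.asarray(z, dtype=float))  # accepts scalars, lists, arrays
    # spi.quad returns (value, error estimate); keep only the value
    return np.array([spi.quad(lambda x: 1 / np.sqrt(1 + x), 0, upper)[0]
                     for upper in z])

z = np.linspace(0, 1, 10)
numeric = integration(z)
analytic = 2 * (np.sqrt(1 + z) - 1)  # closed form of the same integral
print(np.allclose(numeric, analytic))
```

When a closed form exists, evaluating it directly on the array is of course much faster than calling quad once per element.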

how to apply a mask from one array to another array?

I've read the masked array documentation several times now, searched everywhere and feel thoroughly stupid. I can't figure out for the life of me how to apply a mask from one array to another.
Example:
import numpy as np
y = np.array([2,1,5,2]) # y axis
x = np.array([1,2,3,4]) # x axis
m = np.ma.masked_where(y>2, y) # filter out values larger than 2
print m
[2 1 -- 2]
print np.ma.compressed(m)
[2 1 2]
So this works fine.... but to plot this y axis, I need a matching x axis. How do I apply the mask from the y array to the x array? Something like this would make sense, but produces rubbish:
new_x = x[m.mask].copy()
new_x
array([3])
So, how on earth is that done (note the new x array needs to be a new array).
Edit:
Well, it seems one way to do this works like this:
>>> import numpy as np
>>> x = np.array([1,2,3,4])
>>> y = np.array([2,1,5,2])
>>> m = np.ma.masked_where(y>2, y)
>>> new_x = np.ma.masked_array(x, m.mask)
>>> print np.ma.compressed(new_x)
[1 2 4]
But that's incredibly messy! I'm trying to find a solution as elegant as IDL...
I had a similar issue, but involving many more masking commands and more arrays to apply them to. My solution is to do all the masking on one array and then use the final masked array as the condition in the masked_where command.
For example:
y = np.array([2,1,5,2]) # y axis
x = np.array([1,2,3,4]) # x axis
m = np.ma.masked_where(y>5, y) # filter out values larger than 5
new_x = np.ma.masked_where(np.ma.getmask(m), x) # applies the mask of m on x
The nice thing is you can now apply this mask to many more arrays without going through the masking process for each of them.
Why not simply
import numpy as np
y = np.array([2,1,5,2]) # y axis
x = np.array([1,2,3,4]) # x axis
m = np.ma.masked_where(y>2, y) # filter out values larger than 2
print list(m)
print np.ma.compressed(m)
# mask x the same way
m_ = np.ma.masked_where(y>2, x) # mask x where y is larger than 2
# print here the list
print list(m_)
print np.ma.compressed(m_)
This code is for Python 2.x.
Also, as proposed by joris, new_x = x[~m.mask].copy() does the job, giving an array:
>>> new_x
array([1, 2, 4])
This may not be 100% what the OP wanted to know, but it's a cute little piece of code I use all the time.
If you want to mask several arrays the same way, you can use this generalized function to mask a dynamic number of numpy arrays at once:
def apply_mask_to_all(mask, *arrays):
    assert all([arr.shape == mask.shape for arr in arrays]), "All arrays need to have the same shape as the mask"
    return tuple([arr[mask] for arr in arrays])
See this example usage:
# init 4 equally shaped arrays
x1 = np.random.rand(3,4)
x2 = np.random.rand(3,4)
x3 = np.random.rand(3,4)
x4 = np.random.rand(3,4)
# create a mask
mask = x1 > 0.8
# apply the mask to all arrays at once
x1, x2, x3, x4 = apply_mask_to_all(mask, x1, x2, x3, x4)
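For the original x/y example, the same idea can be written in a few lines of Python 3. This is a sketch, not the only way to do it: it uses np.ma.getmaskarray, which returns a full boolean array even when nothing is masked, and plain boolean indexing, which always produces a new array.

```python
import numpy as np

x = np.array([1, 2, 3, 4])  # x axis
y = np.array([2, 1, 5, 2])  # y axis

m = np.ma.masked_where(y > 2, y)  # mask values larger than 2
keep = ~np.ma.getmaskarray(m)     # True where the value is kept

new_x = x[keep]  # boolean indexing returns a copy, not a view
new_y = y[keep]
print(new_x, new_y)  # [1 2 4] [2 1 2]
```

The keep array can be reused on any number of arrays that have the same shape as y.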