I have a data structure that looks like a list of values, and I am trying to compute the 2D (x, y) Hermite functions from them using numpy. I'm trying to use numpy arrays as much as possible because of the performance boost you get from getting to Fortran as quickly as possible (in practice I expect x to hold many thousands of 3-element arrays). Specifically, my code looks like this:
import numpy as np
from numpy.polynomial.hermite import hermval2d
x = np.array([[1., 2., 3.], [4., 5., 6.]])
coefs = np.array([[[1., 0.],[0., 1.]], [[0., 1.], [1., 0.]]])
z = np.array([0., 0.])
z[:] = hermval2d(x[:,0], x[:,1], coefs[:])
This raises an error about the shape of the hermval2d result. Running hermval2d on its own, without the assignment, shows that the result is a 2x2 array:
In [XX]: hermval2d(x[:,0], x[:,1], coefs[:])
Out[XX]:
array([[ 9., 81.],
[ 6., 18.]])
I would expect hermval2d to return a scalar for each x, y, and coefficient matrix, which is what the documentation led me to expect. So what am I missing here? What's the score?
It's right there in the docs :)
hermval2d(x, y, c)
[...]
The shape of the result will be c.shape[2:] + x.shape
In your case c.shape[2:] is (2,) and x.shape is (2,), so the result has shape (2, 2): row i holds the Hermite values at every (x, y) pair for the 2D coefficient array c[:,:,i].
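If the intent is one scalar per point, with coefs[p] being the coefficient matrix for point p, one option is to build the pseudo-Vandermonde matrix once and contract it row by row. The following is only a sketch under that assumption about the intended layout of coefs; the variable names V, z and z_loop are illustrative and not from the question.

```python
import numpy as np
from numpy.polynomial.hermite import hermval2d, hermvander2d

x = np.array([[1., 2., 3.], [4., 5., 6.]])
coefs = np.array([[[1., 0.], [0., 1.]], [[0., 1.], [1., 0.]]])

# Assumption: coefs[p] is the (deg_x+1, deg_y+1) coefficient matrix for point p.
deg = [coefs.shape[1] - 1, coefs.shape[2] - 1]

# V[p, k] holds H_i(x_p) * H_j(y_p), flattened in the same row-major
# order as coefs[p].ravel().
V = hermvander2d(x[:, 0], x[:, 1], deg)

# One dot product per point: z[p] = sum_k V[p, k] * coefs[p].ravel()[k]
z = np.einsum('pk,pk->p', V, coefs.reshape(len(coefs), -1))

# Slow reference loop, one hermval2d call per point, for comparison.
z_loop = np.array([hermval2d(px, py, c) for (px, py, _), c in zip(x, coefs)])

print(z)  # one value per row of x
```

With this layout z has shape (2,), so it can be assigned to the preallocated z from the question.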
I'm trying to figure out how to do the following broadcast:
I have two tensors, of sizes (n1,N) and (n2,N)
What I want to do is multiply each row of the first tensor with each row of the second tensor, and then sum each of these row products, so that my final tensor has shape (n1,n2).
I tried this:
x1*torch.reshape(x2,(x2.size(dim=0),x2.size(dim=1),1))
But obviously this doesn't work, and I can't figure out how to do it.
What you are looking for is tensordot, which is available in both PyTorch and NumPy.
Since you want to compute the dot product along N, which is dimension 1 of x1 and dimension 1 of x2, you need to contract over axis 1 of both tensors by supplying ([1], [1]) to the dims argument of torch.tensordot. This means Torch will sum the products of x1 and x2 elements over axis 1 of x1 and axis 1 of x2 respectively. The arguments to supply to dims can be quite confusing at first.
import torch
x1 = torch.arange(6.).reshape(2,3)
# x1 is a tensor of shape (2,3):
# tensor([[0., 1., 2.],
#         [3., 4., 5.]])
x2 = torch.arange(9.).reshape(3,3)
# x2 is a tensor of shape (3,3):
# tensor([[0., 1., 2.],
#         [3., 4., 5.],
#         [6., 7., 8.]])
x = torch.tensordot(x1, x2, dims=([1],[1]))
# x is a tensor of shape (2,3):
# tensor([[ 5., 14., 23.],
#         [14., 50., 86.]])
What you describe seems to be effectively the same as performing a matrix multiplication between the first tensor and the transpose of the second tensor. This can be done as:
torch.matmul(x1, x2.T)
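The claim that contracting over axis 1 of both inputs equals multiplying by the transpose can be checked numerically. Below is a small sketch using NumPy, where np.tensordot takes the same pair of axis lists under the name axes; the einsum line is an extra cross-check that does not appear in either answer.

```python
import numpy as np

x1 = np.arange(6.).reshape(2, 3)   # shape (n1, N)
x2 = np.arange(9.).reshape(3, 3)   # shape (n2, N)

# Contract axis 1 of x1 with axis 1 of x2 -> shape (n1, n2)
via_tensordot = np.tensordot(x1, x2, axes=([1], [1]))

# Same result as a matrix product with the transpose
via_matmul = x1 @ x2.T

# Same contraction spelled out with index labels
via_einsum = np.einsum('in,jn->ij', x1, x2)

print(via_tensordot)
```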
I have a 150x4 matrix X which I created from a pandas dataframe using the following code:
X = df_new.as_matrix()
I have to normalize it using this function:
I know that μj is the mean value of j and that σj is the standard deviation of j, but I don't understand what j is. I'm also having a little trouble understanding what the bar on X means, and I'm confused by the commas in the equation (I don't know whether they have any significance).
Can anyone help me understand what this equation means so I can then write the normalization using sklearn?
You don't actually need to write code for the normalization yourself - it comes ready with sklearn.preprocessing.scale.
Here is an example from the docs:
>>> from sklearn import preprocessing
>>> import numpy as np
>>> X_train = np.array([[ 1., -1., 2.],
... [ 2., 0., 0.],
... [ 0., 1., -1.]])
>>> X_scaled = preprocessing.scale(X_train)
>>> X_scaled
array([[ 0. ..., -1.22..., 1.33...],
[ 1.22..., 0. ..., -0.26...],
[-1.22..., 1.22..., -1.06...]])
When used with the default setting axis=0, the normalization happens column-wise (i.e. for each column j, as in your equation). As a result, it is easy to confirm that the scaled data has zero mean and unit variance:
>>> X_scaled.mean(axis=0)
array([ 0., 0., 0.])
>>> X_scaled.std(axis=0)
array([ 1., 1., 1.])
The indexes for matrix X are row (i) and column (j). Hence, X,j means column j of matrix X. I.e. normalize each column of matrix X to z-scores.
You can do that using pandas:
df_new_zscores = (df_new - df_new.mean()) / df_new.std()
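One detail worth checking when comparing the two approaches: pandas' std() defaults to the sample standard deviation (ddof=1), while sklearn.preprocessing.scale divides by the population standard deviation (ddof=0). The sketch below uses plain NumPy on the example matrix from the sklearn docs to show the column-wise z-score under both conventions; the variable names are illustrative.

```python
import numpy as np

X = np.array([[1., -1., 2.],
              [2., 0., 0.],
              [0., 1., -1.]])

mu = X.mean(axis=0)                    # mean of each column j

# Population standard deviation (ddof=0), the convention sklearn's scale uses
sigma_pop = X.std(axis=0)
X_scaled = (X - mu) / sigma_pop

# Sample standard deviation (ddof=1), the pandas default for .std()
sigma_sample = X.std(axis=0, ddof=1)
X_scaled_sample = (X - mu) / sigma_sample

print(X_scaled.mean(axis=0), X_scaled.std(axis=0))
```

To reproduce the sklearn numbers in pandas, passing ddof=0 to df_new.std() should give the same scaling.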
I do not know pandas, but I think the equation means that the normalized matrix is obtained by subtracting the empirical mean and dividing by the empirical standard deviation, column by column.
You sometimes use this for Principal Component Analysis.
I have a numpy.array of length dim_array. I would like to apply a median filter to it, like scipy.signal.medfilt(data, window_len).
This doesn't work with my numpy.array, maybe because its shape is (dim_array, 1) and not (dim_array,).
How can I obtain such a filter?
Also, how can I obtain other filters, i.e. min, max, mean?
Based on this post, we could create sliding windows to get a 2D array with each window as a row. These windows would merely be views into the data array, so they need no extra memory and are pretty efficient. Then we simply apply those ufuncs along each row with axis=1.
Thus, for example, the sliding median could be computed like so (strided_app being the window helper from that post) -
np.median(strided_app(data, window_len,1),axis=1)
For the other ufuncs, just use the respective ufunc names there : np.min, np.max & np.mean. Please note this is meant to give a generic solution to use ufunc supported functionality.
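The strided_app helper is not shown in this thread. As a stand-in, NumPy 1.20 and later ship numpy.lib.stride_tricks.sliding_window_view, which also returns the windows as a view. The sketch below assumes that function is available and flattens the (dim_array, 1) array from the question first; the sample data are made up.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Made-up column vector of shape (dim_array, 1), as described in the question
data = np.array([[1.], [5.], [2.], [8.], [3.]])
window_len = 3

flat = data.ravel()                               # shape (dim_array,)
windows = sliding_window_view(flat, window_len)   # one window per row

sliding_median = np.median(windows, axis=1)
sliding_min = windows.min(axis=1)
sliding_max = windows.max(axis=1)
sliding_mean = windows.mean(axis=1)

print(sliding_median)
```

Only full windows are produced, so each result has dim_array - window_len + 1 entries, whereas scipy.signal.medfilt zero-pads the edges and keeps the input length.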
For the best performance, one must still look into specific functions that are built for those purposes. For the four requested functions, we have the builtins, like so -
Median : scipy.signal.medfilt.
Max : scipy.ndimage.maximum_filter1d.
Min : scipy.ndimage.minimum_filter1d.
Mean : scipy.ndimage.uniform_filter1d.
(Older SciPy versions also exposed these under scipy.ndimage.filters, a namespace that has since been deprecated.)
Because a median filter with window size 1 leaves the array unchanged, we are free to apply the median filter row-wise or column-wise.
For example, this code
from scipy.ndimage import median_filter
import numpy as np
arr = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])
median_filter(arr, size=3, cval=0, mode='constant')
# with cval=0, mode='constant' the input array is extended with zeros
# where the window overlaps the edges, just for visibility and ease of calculation
outputs the expected array, filtered with a (3, 3) window:
array([[0., 2., 0.],
[2., 5., 3.],
[0., 5., 0.]])
because median_filter automatically applies a scalar size to all dimensions, so we get the same result with:
median_filter(arr, size=(3, 3), cval=0, mode='constant')
Now, we can also apply median_filter row-wise by setting the first element of size to 1:
median_filter(arr, size=(1, 3), cval=0, mode='constant')
Output:
array([[1., 2., 2.],
[4., 5., 5.],
[7., 8., 8.]])
And column-wise, with the same logic:
median_filter(arr, size=(3, 1), cval=0, mode='constant')
Output:
array([[1., 2., 3.],
[4., 5., 6.],
[4., 5., 6.]])
I'm defining a function which will return a 3-d grid. Inside it, I use an already defined function that returns a 2-d array. I want to join these 2-d arrays to form the 3-d array during an iteration. I've looked at functions like meshgrid(), dstack() and concatenate(), but can't seem to get any of them to fit into the code.
The program models the spread of waves from a point source on the 2-d array, and the 3-d array shows how the displacement of the medium changes over the course of a wavelength.
def make_wave_snapshot(size,wavelength,phase):
    waves_array = np.zeros((size,size),float)
    if size%2==0:
        for y in range(size):
            for x in range(size):
                r = math.hypot((size/2 - x - 0.5),(size/2 - y - 0.5))
                d = np.sin((2*math.pi*r/wavelength)-phase)/np.sqrt(r)
                waves_array[y,x] = d
        dp.display_2d_array(waves_array) #This is in another module altogether
        return waves_array #Displays array showing values
    else:
        return 'Please use integer of size.'
def make_wave_sequence(size,wavelength,nsteps):
    waves_sequence = np.zeros((nsteps,size,size),float)
    if nsteps%1==0:
        for z in range(nsteps):
            make_wave_snapshot(size,wavelength,(2*math.pi*z/nsteps))
            waves_sequence = ???
        return waves_sequence #Displays array showing values
    else:
        return 'Please use positive integer for number of steps'
The issue is turning the 'wave_array's into a 'wave_sequence'. Generous commenting would be very appreciated if you write any code. Many thanks!
If I understand correctly you have a three dimensional array, something like:
wave = np.zeros((2, 2, 2), float)
([[[0., 0.],
[0., 0.]],
[[0., 0.],
[0., 0.]]])
And you want to insert a two dimensional array, returned from your function like:
([[ 1., 2.],
[ 3., 4.]])
Such that, after the first iteration of your for loop, your 3D array is:
([[[1., 2.],
[3., 4.]],
[[0., 0.],
[0., 0.]]])
If that is correct, then it's actually pretty simple and you're most of the way there. You can assign a 2D array to an "element" of your 3D array, as long as you select the correct entry:
for z in range(nsteps):
    waves_sequence[z] = make_wave_snapshot(size,wavelength,(2*math.pi*z/nsteps))
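For completeness, here is a self-contained sketch of the whole sequence builder using that indexed assignment. The snapshot function below is a simplified, vectorised stand-in for the one in the question (the display call from the other module is left out and size is assumed to be even); np.stack is shown as an alternative that the answer above does not mention.

```python
import numpy as np

def make_wave_snapshot(size, wavelength, phase):
    # Stand-in for the question's function: same formula, no display call.
    # Assumes an even size, so the distance r is never exactly zero.
    coords = size / 2 - np.arange(size) - 0.5
    r = np.hypot(coords[np.newaxis, :], coords[:, np.newaxis])  # r[y, x]
    return np.sin(2 * np.pi * r / wavelength - phase) / np.sqrt(r)

def make_wave_sequence(size, wavelength, nsteps):
    # Preallocate the 3-d array, then fill one 2-d slice per time step.
    waves_sequence = np.zeros((nsteps, size, size), float)
    for z in range(nsteps):
        phase = 2 * np.pi * z / nsteps
        waves_sequence[z] = make_wave_snapshot(size, wavelength, phase)
    return waves_sequence

seq = make_wave_sequence(4, 3.0, 5)

# Alternative: build the snapshots in a list and stack them on a new first axis.
stacked = np.stack([make_wave_snapshot(4, 3.0, 2 * np.pi * z / 5)
                    for z in range(5)])

print(seq.shape)
```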
In numpy, if you want to calculate the sine of each entry of a matrix (elementwise), then
a = numpy.arange(0,27,3).reshape(3,3)
numpy.sin(a)
will get the job done! If you want, say, the power of 2 of each entry,
a**2
will do it.
But if you have a sparse matrix, things seem more difficult. At least I haven't figured out a way to do that besides iterating over each entry of a lil_matrix and operating on it.
I've found this question on SO and tried to adapt this answer, but I was not successful.
The goal is to calculate the elementwise square root (or the power of 1/2) of a scipy.sparse matrix in CSR format.
What would you suggest?
The following trick works for any operation which maps zero to zero, and only for those operations, because it only touches the non-zero elements. I.e., it will work for sin and sqrt but not for cos.
Let X be some CSR matrix...
>>> import numpy as np
>>> from scipy.sparse import csr_matrix
>>> X = csr_matrix(np.arange(10).reshape(2, 5), dtype=float)
>>> X.A
array([[ 0., 1., 2., 3., 4.],
[ 5., 6., 7., 8., 9.]])
The non-zero elements' values are X.data:
>>> X.data
array([ 1., 2., 3., 4., 5., 6., 7., 8., 9.])
which you can update in-place:
>>> X.data[:] = np.sqrt(X.data)
>>> X.A
array([[ 0. , 1. , 1.41421356, 1.73205081, 2. ],
[ 2.23606798, 2.44948974, 2.64575131, 2.82842712, 3. ]])
Update: In recent versions of SciPy, you can do things like X.sqrt(), where X is a sparse matrix, to get a new copy with the square roots of the elements in X.
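The same trick can be wrapped so that the original matrix is left untouched. The helper name apply_zero_preserving below is hypothetical, not a SciPy function; it is a sketch that copies the matrix and transforms only the stored values, and it also shows why a function such as cos gives the wrong answer with this approach.

```python
import numpy as np
from scipy.sparse import csr_matrix

def apply_zero_preserving(matrix, func):
    # Hypothetical helper: apply func to the stored (non-zero) values only.
    # Only correct when func(0) == 0, e.g. np.sqrt or np.sin.
    result = matrix.copy()
    result.data = func(result.data)
    return result

X = csr_matrix(np.arange(10, dtype=float).reshape(2, 5))

roots = apply_zero_preserving(X, np.sqrt)   # correct: sqrt(0) == 0
builtin_roots = X.sqrt()                    # built-in equivalent

# cos(0) == 1, but the implicit zeros are never touched, so this is wrong:
bad_cos = apply_zero_preserving(X, np.cos)

print(roots.toarray())
```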