derivative with numpy.diff problems - python

I have this problem:
I have an array of 7 elements:
vector = [array([ 76.27789424]), array([ 76.06870298]), array([ 75.85016864]), array([ 75.71155968]), array([ 75.16982466]), array([ 73.08832948]), array([ 68.59935515])]
(this array is the result of a lot of operations)
Now I want to calculate the derivative with numpy.diff(vector), but I know that the type must be a numpy array. For this, I type:
vector = numpy.array(vector)
if I print the vector, now, the result is:
[[ 76.27789424]
 [ 76.06870298]
 [ 75.85016864]
 [ 75.71155968]
 [ 75.16982466]
 [ 73.08832948]
 [ 68.59935515]]
But if I try to calculate the derivative, the result is [].
Can you help me, please?
Thanks a lot!

vector is a list of arrays. To get a 1-D NumPy array, use a list comprehension and pass it to numpy.array:
>>> vector = numpy.array([x[0] for x in vector])
>>> numpy.diff(vector)
array([-0.20919126, -0.21853434, -0.13860896, -0.54173502, -2.08149518,
       -4.48897433])

vector = numpy.array(vector)
gives you a two-dimensional array with seven rows and one column:
>>> vector.shape
(7, 1)
The shape reads like: (length axis 0, length axis 1, length axis 2, ...)
As you can see, the last axis is axis 1 and its length is 1.
From the docs:
numpy.diff(a, n=1, axis=-1)
...
axis : int, optional
The axis along which the difference is taken, default is the last axis.
There is no way to take the difference of a single value, so let's use the first axis, which has a length of 7. Since axis counting starts at zero, the first axis is 0:
>>> np.diff(vector, axis=0)
array([[-0.20919126],
       [-0.21853434],
       [-0.13860896],
       [-0.54173502],
       [-2.08149518],
       [-4.48897433]])
Note that each order of the derivative makes the result one element shorter, so the new shape is (7-1, 1), which is (6, 1). Let's verify that:
>>> np.diff(vector, axis=0).shape
(6, 1)
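As a side note, a minimal alternative sketch: since the converted array has shape (7, 1), you can also flatten it with ravel and diff the resulting 1-D array:
>>> np.diff(np.array(vector).ravel())
array([-0.20919126, -0.21853434, -0.13860896, -0.54173502, -2.08149518,
       -4.48897433])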

Extract 2d ndarray from arbitrarily dimensional ndarray using index arrays

I want to extract parts of a numpy ndarray based on arrays of index positions for some of the dimensions. Let me show this with an example.
Example data
dummy = np.random.rand(5,2,100)
X = np.array([[0,1],[4,1],[2,0]])
dummy is the original ndarray with dimensionality 5x2x100. This dimensionality is arbitrary; it could just as well be 5x2x4x100.
X is a matrix of index values, here X[:,0] are the indices of the first dimension of dummy, X[:,1] those of the second dimension. The number of columns in X is always the number of dimensions in dummy minus 1.
Example output
I want to extract an ndarray of the following form for this example
[
dummy[0,1,:],
dummy[4,1,:],
dummy[2,0,:]
]
Complications
If the number of dimensions in dummy were fixed, this could just be done by dummy[X[:,0],X[:,1],:]. Sadly, the dimensionality can differ; e.g., dummy could be a 5x2x4x6x100 ndarray, and X would then correspondingly be 3x4. My attempts at dealing with this have not yielded the desired result.
dummy[X,:] yields a 3x2x2x100 ndarray for this example, the same as dummy[X].
Iteratively reducing dummy by doing something like dummy = dummy[X[:,i],:], with i an iterator over the number of columns of X, also does not reduce the ndarray in the example past 3x2x100.
I have a feeling that this should be pretty simple with numpy indexing, but I guess my search for a solution was missing the right terms for this.
Does anyone have a solution to this?
I will try to add some explanation to Michael Szczesny's answer.
First, notice that if you have an np.array with n dimensions and index it with m indices where m < n, it is the same as using : for the remaining dimensions. In your case, for example:
dummy[(0, 0)] == dummy[0, 0, :]
Given that, note that you can also pass an array as an index. Thus:
dummy[([0, 1], [0, 0])]
It would be the same as:
np.array([dummy[(0,0)], dummy[(1,0)]])
You can validate that using:
dummy[([0, 1], [0, 0])] == np.array([dummy[(0,0)], dummy[(1,0)]])
Finally, notice that:
(*X.T,)
# (array([0, 4, 2]), array([1, 1, 0]))
Here you are getting the indices for each dimension as separate arrays, so dummy[(*X.T,)] gives you:
[
dummy[0,1],
dummy[4,1],
dummy[2,0]
]
Which is the same as:
[
dummy[0,1,:],
dummy[4,1,:],
dummy[2,0,:]
]
Edit: Instead of using (*X.T,), you can use tuple(X.T), which to me makes more sense.
As Michael Szczesny wrote, the best solution is dummy[(*X.T,)].
Since X[:,0] holds the indices of the first dimension of dummy and X[:,1] the indices of the second dimension, if you transpose X (X.T) you'll have the indices of the first dimension of dummy as X.T[0] and the indices of the second dimension as X.T[1].
Now, to slice dummy as you want, you can specify the indices of the first and second dimensions like this:
dummy[(first_dim_indices, second_dim_indices)] == dummy[(X.T[0], X.T[1])]
To simplify the code (and to avoid transposing the X matrix twice), you can unpack X.T into a tuple as (*X.T,), so writing dummy[(*X.T,)] is the same as writing dummy[(X.T[0], X.T[1])].
This notation is also useful if the number of dimensions to slice is not fixed, because you unpack from X.T as many rows as there are dimensions to slice in dummy. For example, suppose you want to retrieve a 1-D array from dummy given the following indices:
first_dim: (0, 4, 2)
second_dim: (1, 1, 0)
third_dim: (9, 8, 7)
You can specify the indices of the 3 dimensions as X = np.array([[0,1,9],[4,1,8],[2,0,7]]) and dummy[(*X.T,)] is still valid.
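To tie both answers together, here is a minimal runnable sketch of the two cases, using the example arrays from the question (X3 is just an illustrative name for the three-column index matrix from the last example):
import numpy as np

dummy = np.random.rand(5, 2, 100)

# Index the first two dimensions, keep the third whole -> shape (3, 100)
X = np.array([[0, 1], [4, 1], [2, 0]])
print(dummy[(*X.T,)].shape)                               # (3, 100)
print(np.array_equal(dummy[(*X.T,)][0], dummy[0, 1, :]))  # True

# Index all three dimensions -> one value per row of X3, shape (3,)
X3 = np.array([[0, 1, 9], [4, 1, 8], [2, 0, 7]])
print(dummy[(*X3.T,)].shape)                              # (3,)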

Error with Padlen in signal.filtfilt in Python

I am working with the "scipy.signal" library in Python and I have the following code:
from scipy import signal
b = [0.001016, 0.00507999, 0.01015998, 0.01015998, 0.00507999, 0.001016]
a = [1., -3.0820186, 4.04351697, -2.76126457, 0.97291013, -0.14063199]
data = [[ 1.]
[ 1.]
[ 1.]
...]
# length = 264
y = signal.filtfilt(b, a, data)
But when I execute the code I get the following error message:
The length of the input vector x must be at least padlen, which is 18.
What could I do?
It appears that data is a two-dimensional array with shape (264, 1). By default, filtfilt filters along the last axis of the input array, so in your case it is trying to filter along an axis where the length of the data is 1, which is not long enough for the default padding method.
I assume you meant to interpret data as a one-dimensional array. You can add the argument axis=0
y = signal.filtfilt(b, a, data, axis=0)
to filter along the first dimension (i.e. down the column), in which case the output y will also have shape (264, 1). Alternatively, you can convert the input to a one-dimensional array by flattening it with np.ravel(data) or by using indexing to select the first (and only) column, data[:, 0]. (The latter will only work if data is, in fact, a numpy array and not a list of lists.) E.g.
y = signal.filtfilt(b, a, np.ravel(data))
In that case, the output y will also be a one-dimensional array, with shape (264,).
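A minimal runnable sketch of both options (the all-ones data below just stands in for the (264, 1) input from the question):
import numpy as np
from scipy import signal

# Coefficients from the question, with commas added so this runs
b = [0.001016, 0.00507999, 0.01015998, 0.01015998, 0.00507999, 0.001016]
a = [1., -3.0820186, 4.04351697, -2.76126457, 0.97291013, -0.14063199]

data = np.ones((264, 1))  # stand-in for the question's data

y = signal.filtfilt(b, a, data, axis=0)         # filter down the column
print(y.shape)                                  # (264, 1)

y_flat = signal.filtfilt(b, a, np.ravel(data))  # 1-D alternative
print(y_flat.shape)                             # (264,)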
Assuming you have a two-dimensional array with shape (264, 2), you can also use np.hsplit() to split data into two separate arrays like so:
import numpy as np
arr1, arr2 = np.hsplit(data,2)
You can view the shape of each individual array, for example:
print(arr1.shape)
Your code will then look something like this (note that arr1 and arr2 still have shape (264, 1), so axis=0 is still needed):
y1 = signal.filtfilt(b, a, arr1, axis=0)
y2 = signal.filtfilt(b, a, arr2, axis=0)

Element wise comparison between 1D and 2D array

I want to perform an element-wise comparison between a 1-D and a 2-D array. Each element of the 1-D array needs to be compared (e.g. greater) against the corresponding row of the 2-D array, creating a mask. Here is an example:
A = np.random.choice(np.arange(0, 10), (4,100)).astype(float)
B = np.array([5., 4., 8., 2. ])
I want to do
A<B
so that the first row of A is compared against B[0], which is 5., and the result is a boolean array.
If I try this I get:
operands could not be broadcast together with shapes (4,100) (4,)
Any ideas?
You need to insert an extra dimension into array B:
A < B[:, None]
This allows NumPy to properly match up the two shapes for broadcasting; B now has shape (4, 1) and the dimensions can be paired up:
(4, 100)
(4, 1)
The rule is that either the dimensions have the same length, or one of the lengths needs to be 1; here 100 can be paired with 1, and 4 can be paired with 4. Before the new dimension was inserted, NumPy tried to pair 100 with 4, which raised the error.
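A quick runnable check of the broadcast (mask is just an illustrative name):
import numpy as np

A = np.random.choice(np.arange(0, 10), (4, 100)).astype(float)
B = np.array([5., 4., 8., 2.])

mask = A < B[:, None]  # B[:, None] has shape (4, 1) and broadcasts across the columns
print(mask.shape)                            # (4, 100)
print(np.array_equal(mask[0], A[0] < B[0]))  # True: row 0 is compared against 5.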

numpy get std between datasets

I have a dataset array A. A is n×2. It can be plotted on the x and y axes.
A[:,1] gets me all of the y values and A[:,0] gets me all the x values.
Now, I have a few other dataset arrays that are similar to A. The x values are the same for these similar arrays. How do I calculate the standard deviation of the datasets? There should be a std value for each x, so in the end my result std should have a length of n.
I can do this the manual way with loops but I'm not sure how to do this using NumPy in a pythonic and simple manner.
Here is some sample data:
A=[[0,2.54],[1,254.5],[2,-43]]
B=[[0,3.34],[1,154.5],[2,-93]]
std_Array=[std(2.54,3.34),std(254.5,154.5),std(-43,-93)]
Suppose your arrays are all the same shape and they are in a list. Then to get the standard deviation of the first column of each you can do
arrays = [np.random.rand(10, 2) for _ in range(8)]
np.dstack(arrays).std(axis=0)[0]
This stacks the 2-D arrays into a 3-D array of shape (10, 2, 8) and then takes the std along the first axis, giving a 2 x 8 result (8 being the number of arrays). The first row of the result holds the std devs of the 8 sets of x-values.
If you post some sample data perhaps we could help more.
Is this pythonic enough?
std_Array = numpy.std((A,B), axis = 0)[:,1]
li_arr = [np.array(x)[: , 1] for x in [A , B]]
This produces a list of numpy arrays holding just the column you want; the result will be
[array([ 2.54, 254.5 , -43. ]), array([ 3.34, 154.5 , -93. ])]
Then you stack the values using column_stack:
arr = np.column_stack(li_arr)
This will be the stacked result:
array([[   2.54,    3.34],
       [ 254.5 ,  154.5 ],
       [ -43.  ,  -93.  ]])
And then finally:
np.std(arr, axis=1)
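For the sample data above, the one-liner from the second answer gives exactly the expected result (note that np.std computes the population standard deviation by default):
import numpy as np

A = np.array([[0, 2.54], [1, 254.5], [2, -43.]])
B = np.array([[0, 3.34], [1, 154.5], [2, -93.]])

std_array = np.std((A, B), axis=0)[:, 1]  # element-wise std across the datasets, y column only
print(std_array)                          # [ 0.4 50.  25. ]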

Numpy array of rank > 1 and one of dimensions == 0

I am implementing a function that reads data from a file into a multi-dimensional numpy array. The data is regularly structured in the sense of dimension lengths; however, some dimensions may be missing, in which case I would let the length of that dimension be 0. I have stumbled upon this behavior:
In [1]: np.random.random((3,3))
Out[1]:
array([[ 0.59756568,  0.47198749,  0.23442854],
       [ 0.29374254,  0.58289927,  0.40497268],
       [ 0.00481053,  0.63471263,  0.90053086]])
In [2]: np.random.random((0,3,3))
Out[2]: array([], shape=(0, 3, 3), dtype=float64)
OK, so I get an empty array. This makes sense if I look at it as the 2nd and 3rd dimensions being contained in the 1st, which is nil, and thus the whole array is nil. I would, however, expect np.random.random((3,3,0)) to be equivalent to np.random.random((3,3)). Instead,
In [3]: np.random.random((3,3,0))
Out[3]: array([], shape=(3, 3, 0), dtype=float64)
An empty array again.
Is this expected behavior? I understand the difference between arrays of shape (3,3) and (3,3,1) or (1,3,3), but I am looking for an explanation of why a dimension of length 0 degenerates the whole array rather than only that dimension. Is it just me, or is this one of those Python/numpy WTFs?
As I state in a comment, you are getting an empty array because the size of an array is always zero if any of the dimensions are zero. Can I ask what you are trying to do? If you want an empty 3rd dimension you can try something like the following:
>>> x = numpy.random.random((3,3))
>>> y = x[..., numpy.newaxis]
>>> y
array([[[ 0.92418241],
        [ 0.76716579],
        [ 0.82485034]],

       [[ 0.30571695],
        [ 0.71012271],
        [ 0.54609355]],

       [[ 0.98192734],
        [ 0.25505518],
        [ 0.75473749]]])
>>> y.shape
(3, 3, 1)
>>> x.shape
(3, 3)
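A minimal sketch of the size rule: the element count is the product of all dimension lengths, so any zero-length axis makes the whole array empty, while a length-1 axis keeps everything:
>>> a = numpy.random.random((3, 3, 0))
>>> a.size            # 3 * 3 * 0
0
>>> b = numpy.random.random((3, 3))[..., numpy.newaxis]
>>> b.shape, b.size   # the length-1 axis keeps all 9 elements
((3, 3, 1), 9)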
