I am a newbie using Python and Numpy.
I thought this would be simple and probably is.
I have an array of times.
For example:
times = (0.5, 0.75, 1.5)
This array will vary in size depending on the files loaded.
I simply want to find the difference in time between each subsequent element.
0.75 - 0.5
then
1.5 - 0.75
and so on for the number of elements in the array.
Then I put each result into one column.
I have tried various for loops but have been unable to do it. There must be a simple way?
Thanks,
Scott
How about this?
>>> import numpy as np
>>> a = np.array([0.5, 0.75, 1.5])
>>> np.diff(a)
array([ 0.25, 0.75])
>>>
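Since you mentioned putting each result into one column: a minimal follow-up sketch (assuming "one column" means a 2-D column vector) is to reshape the result of np.diff:
import numpy as np
times = np.array([0.5, 0.75, 1.5])
col = np.diff(times).reshape(-1, 1)  # shape (2, 1): one difference per row, single column
print(col)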
First, note that for most everyday uses, you probably won't require an array (which is a special datatype in NumPy). Python's workhorse means of data storage is the list, which is definitely worth reading up on if you're coming from a more rigid programming language. Lists are defined using square brackets:
times = [ 0.5, 0.75, 1.5 ]
Then, with special syntax called a list comprehension, we can create a new list that has (length-1) elements. This expression automatically figures out the size of the list needed.
diffs = [ times[i] - times[i-1] for i in range(1, len(times)) ]
And for the sample data provided, this returns:
[0.25, 0.75]
Or how about these (all variants on the same theme):
import numpy as np
a = np.array([0.5, 0.75, 1.5])
b = a[1:]-a[:-1]
print (b)
or without numpy:
a=[0.5, 0.75, 1.5]
a1=a[1:]
a2=a[:-1]
b=[ aa-a2[i] for i,aa in enumerate(a1) ]
or
a=[0.5, 0.75, 1.5]
c=[x-y for x,y in zip(a[1:],a[:-1])]
diffs = [b-a for a, b in zip(times[:-1], times[1:])]
[0.25, 0.75]
This approach requires no numpy, just plain Python: zip pairs up times[1:] (0.75, 1.5) with times[:-1] (0.5, 0.75) and subtracts each earlier time from the one that follows it.
this should do it.
import numpy as np
times = np.array([0.5, 0.75, 1.5, 2.0])
diff_times = np.zeros(len(times) - 1, dtype=float)
for i in range(1, len(times)):
    diff_times[i-1] = times[i] - times[i-1]
print(diff_times)
very basic:
liste = [1, 3, 8]
difference = []
for i in range(len(liste) - 1):
    diffs = abs(liste[i] - liste[i+1])
    difference.append(diffs)
print(difference)
I have to round every element inside a numpy array only to .5 or .0 values. I know the np.round() method, but it is not useful for this specific task since I can only use it to set the precision to a whole number of decimal places.
Here there is an example of what I should do:
x = np.array([2.99845, 4.51845, 0.33365, 0.22501, 2.48523])
x_rounded = some_function(x)
>>> x_rounded
array([3.0, 4.5, 0.5, 0.0, 2.5])
Is there a built-in method to do so or I have to create it?
If I have to create that method myself, is there an efficient way? I'm working on a big dataset, so I would like to avoid iterating over each element.
import numpy as np
x = np.array([2.99845, 4.51845, 0.33365, 0.22501, 2.48523])
np.round(2 * x) / 2  # double, round to the nearest integer, then halve: values land on multiples of 0.5
Output:
array([3. , 4.5, 0.5, 0. , 2.5])
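If you ever need fractions other than 0.5, the same multiply-round-divide trick generalizes. A small sketch with a hypothetical helper name (round_to_fraction is not a NumPy function):
import numpy as np

def round_to_fraction(x, fraction=0.5):
    # Round each element to the nearest multiple of `fraction`;
    # fraction=0.5 reproduces the answer above.
    return np.round(x / fraction) * fraction

x = np.array([2.99845, 4.51845, 0.33365, 0.22501, 2.48523])
print(round_to_fraction(x))  # [3.  4.5 0.5 0.  2.5]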
I am trying to take the reciprocal of every non zero value in a numpy array but am messing something up. Suppose:
norm = np.arange(0,11)
I would like the np.array that would be (maintaining the zeros in place)
[ 0, 1, 0.5 , 0.33, 0.25, 0.2 , 0.17, 0.14, 0.12, 0.11, 0.1]
If I set
mask = norm !=0
and I try
1/norm[mask]
I receive the expected result of
[1, 0.5 , 0.33, 0.25, 0.2 , 0.17, 0.14, 0.12, 0.11, 0.1]
However I'm trying to understand why is it that when I try the following assignment
norm[mask] = 1/norm[mask]
i get the following numpy array.
[0,1,0,0,0,0,0,0,0,0,0]
any ideas on why this is or how to achieve the desired np.array?
Are you sure you didn't accidentally change the value of norm?
Both
mask = norm != 0
norm[mask] = 1 / norm[mask]
and
norm[norm != 0] = 1 / norm[norm != 0]
do exactly what you want them to do. I also tried it using mask on the left side and norm != 0 on the right side, like you do above (why?), and it works fine.
EDIT BY FY: I misread the example. I thought original poster was starting with [0, .5, .333, .25] rather than with [0, 1, 2, 3, 4]. Poster is accidentally creating an int64 array rather than a floating point array, and everything is rounding down to zero. Change it to np.arange(0., 11.)
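A minimal sketch of that fix (just restating the edit above in runnable form):
import numpy as np
norm = np.arange(0., 11.)   # float dtype, so the reciprocals are not truncated to ints
mask = norm != 0
norm[mask] = 1 / norm[mask]
print(norm)                 # [0.  1.  0.5  0.33333333 ... 0.1]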
Another option is to use numpy.reciprocal, as documented here, with the where parameter, as follows:
import numpy as np
data = np.reciprocal(data,where= data!=0)
example:
In [1]: data = np.array([2.0, 4.0, 0.0])
In [2]: np.reciprocal(data, where=data != 0)
Out[2]: array([0.5 , 0.25, 0.  ])
Notice that this function is not intended to work with ints, which is why the values above are initialized with the .0 suffix.
If you're not sure of the type, you can always use data.astype(np.float64).
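One caveat worth adding (based on how where= behaves for NumPy ufuncs in general): if no out array is given, the entries skipped by where are left uninitialized, so it is safer to supply one explicitly. A small sketch:
import numpy as np
data = np.array([2.0, 4.0, 0.0])
out = np.zeros_like(data)                      # positions where the condition is False keep this 0
np.reciprocal(data, out=out, where=data != 0)
print(out)                                     # [0.5  0.25 0.  ]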
Here's a custom function that allows stepping through decimal increments:
def my_range(start, stop, step):
    i = start
    while i < stop:
        yield i
        i += step
It works like this:
out = list(my_range(0, 1, 0.1))
print(out)
[0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6, 0.7, 0.7999999999999999, 0.8999999999999999, 0.9999999999999999]
Now, there's nothing surprising about this. It's understandable that this happens because of floating point inaccuracies, and because 0.1 has no exact representation in memory. So those precision errors are expected.
Take numpy on the other hand:
import numpy as np
out = np.arange(0, 1, 0.1)
print(out)
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
What's interesting is that there are no visible precision errors introduced here. I thought this might have to do with what the __repr__ shows, so to confirm, I tried this:
x = list(my_range(0, 1.1, 0.1))[-1]
print(x.is_integer())
False
x = list(np.arange(0, 1.1, 0.1))[-1]
print(x.is_integer())
True
So, my function returns an incorrect upper value (it should be 1.0 but it is actually 1.0999999999999999), but np.arange does it correctly.
I'm aware of Is floating point math broken? but the point of this question is:
How does numpy do this?
The difference in endpoints is because NumPy calculates the length up front instead of ad hoc, because it needs to preallocate the array. You can see this in the _calc_length helper. Instead of stopping when it hits the end argument, it stops when it hits the predetermined length.
Calculating the length up front doesn't save you from the problems of a non-integer step, and you'll frequently get the "wrong" endpoint anyway, for example, with numpy.arange(0.0, 2.1, 0.3):
In [46]: numpy.arange(0.0, 2.1, 0.3)
Out[46]: array([ 0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1])
It's much safer to use numpy.linspace, where instead of the step size, you say how many elements you want and whether you want to include the right endpoint.
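For example, a rough equivalent of the arange call above, asking linspace for 8 points (the endpoint is included by default):
import numpy as np
np.linspace(0.0, 2.1, num=8)
# array([0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1])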
It might look like NumPy has suffered no rounding error when calculating the elements, but that's just due to different display logic. NumPy is truncating the displayed precision more aggressively than float.__repr__ does. If you use tolist to get an ordinary list of ordinary Python scalars (and thus the ordinary float display logic), you can see that NumPy has also suffered rounding error:
In [47]: numpy.arange(0, 1, 0.1).tolist()
Out[47]:
[0.0,
0.1,
0.2,
0.30000000000000004,
0.4,
0.5,
0.6000000000000001,
0.7000000000000001,
0.8,
0.9]
It's suffered slightly different rounding error - for example, in .6 and .7 instead of .8 and .9 - because it also uses a different means of computing the elements, implemented in the fill function for the relevant dtype.
The fill function implementation has the advantage that it uses start + i*step instead of repeatedly adding the step, which avoids accumulating error on each addition. However, it has the disadvantage that (for no compelling reason I can see) it recomputes the step from the first two elements instead of taking the step as an argument, so it can lose a great deal of precision in the step up front.
While arange does step through the range in a slightly different way, it still has the float representation issue:
In [1358]: np.arange(0,1,0.1)
Out[1358]: array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
The print hides that; convert it to a list to see the gory details:
In [1359]: np.arange(0,1,0.1).tolist()
Out[1359]:
[0.0,
0.1,
0.2,
0.30000000000000004,
0.4,
0.5,
0.6000000000000001,
0.7000000000000001,
0.8,
0.9]
or with another iteration
In [1360]: [i for i in np.arange(0,1,0.1)] # e.g. list(np.arange(...))
Out[1360]:
[0.0,
0.10000000000000001,
0.20000000000000001,
0.30000000000000004,
0.40000000000000002,
0.5,
0.60000000000000009,
0.70000000000000007,
0.80000000000000004,
0.90000000000000002]
In this case each displayed item is a np.float64, whereas in the first each is a plain Python float.
Aside from the different representation of lists and arrays, NumPy's arange works by multiplying instead of repeatedly adding. It's more like:
def my_range2(start, stop, step):
    i = 0
    while start + (i * step) < stop:
        yield start + (i * step)
        i += 1
Then the output is completely equal:
>>> np.arange(0, 1, 0.1).tolist() == list(my_range2(0, 1, 0.1))
True
With repeated addition you would "accumulate" floating point rounding errors. The multiplication is still affected by rounding but the error doesn't accumulate.
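A tiny illustration of that difference, using the 0.8 element from the outputs above:
# Repeated addition accumulates rounding error over eight additions...
acc = 0.0
for _ in range(8):
    acc += 0.1
print(acc)        # 0.7999999999999999
# ...while a single multiplication only rounds once.
print(8 * 0.1)    # 0.8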
As pointed out in the comments, that's not really what is happening. As far as I can see, it's more like:
import math

def my_range2(start, stop, step):
    length = math.ceil((stop - start) / step)
    # The next two lines are mostly so the function really behaves like NumPy does.
    # Remove them to get better accuracy...
    next = start + step
    step = next - start
    for i in range(length):
        yield start + (i * step)
But I'm not sure if that's exactly right either, because there's a lot more going on in NumPy.
I am having a problem fitting the following data to the range 0.1-1.0:
t=[0.23,0.76,0.12]
Obviously each item in the t-list falls within the range 0.1-1.0, but the output of my code indicates the opposite.
My attempt
import numpy as np
>>> g=np.arange(0.1,1.0,0.1)
>>> t=[0.23,0.76,0.12]
>>> t2=[x for x in t if x in g]
>>> t2
[]
Desired output:[0.23,0.76,0.12]
I clearly understand that using an interval of 0.1 will make it difficult to find any of the t-list items in the specified arange. I could have made some adjustments, but my range is fixed and my data is large, which makes it practically impossible to keep adjusting the range.
Any suggestions on how to get around this? thanks
Did you try to inspect g?
>>> g
array([ 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
So clearly none of your elements is in g.
Probably, you look for something like
>>> [x for x in t if 0.1<=x<=1.0]
[0.23, 0.76, 0.12]
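If t is large and already a NumPy array, a boolean mask does the same filtering without a Python-level loop (a sketch of the same idea, not part of the original answer):
import numpy as np
t = np.array([0.23, 0.76, 0.12])
t2 = t[(t >= 0.1) & (t <= 1.0)]   # keep only values inside the closed interval [0.1, 1.0]
print(t2)                         # [0.23 0.76 0.12]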
The documentation of the numpy.correlate command says that the cross correlation of two arrays is computed as the general definition for signal processing in the way:
z[k] = sum_n a[n] * conj(v[n+k])
This does not seem to be the case. It looks like the correlation is flipped. This would mean that either the sign in the last term of the formula is switched
z[k] = sum_n a[n] * conj(v[n-k])
or that the two input vectors are in the wrong order. A simple implementation of the given formula would be:
import numpy
x = [1.0, 2.0, 3.0]
y = [0.0, 0.5, 2.0]
y_padded = numpy.append([0.0, 0.0], y)
y_padded = numpy.append(y_padded, [0.0, 0.0])
crosscorr_numpy = numpy.correlate(x, y, mode='full')
crosscorr_self = numpy.zeros(5)
for k in range(5):
    for i in range(3):
        crosscorr_self[k] += x[i] * y_padded[i+k]
print(crosscorr_numpy)
print(crosscorr_self)
You can easily see that the resulting vector has the wrong order. I was very confused when it did not produce the results I expected and am pretty sure (after discussing it with my colleagues) that this is an error.
Which version of NumPy are you using? On my Debian Squeeze box:
In [1]: import numpy as np
In [2]: np.__version__
Out[2]: '1.4.1'
When I run your example, I get:
/usr/lib/pymodules/python2.6/numpy/core/numeric.py:677: DeprecationWarning:
The current behavior of correlate is deprecated for 1.4.0, and will be removed
for NumPy 1.5.0.
The new behavior fits the conventional definition of correlation: inputs are
never swapped, and the second argument is conjugated for complex arrays.
DeprecationWarning)
[ 2. 4.5 7. 1.5 0. ]
[ 0. 1.5 7. 4.5 2. ]
so you may be right about the (incorrect) behavior, but it has probably been fixed in a newer version.