Round a numpy array to .5 or .0 only - python

I have to round every element of a numpy array to the nearest .5 or .0 value. I know the np.around() method, but it is not useful for this task since it only lets me set the number of decimal places, e.g. a precision of one.
Here is an example of what I need:
x = np.array([2.99845, 4.51845, 0.33365, 0.22501, 2.48523])
x_rounded = some_function(x)
>>> x_rounded
array([3.0, 4.5, 0.5, 0.0, 2.5])
Is there a built-in method to do this, or do I have to create it?
If I have to create it, is there an efficient way? I'm working with a big dataset, so I would like to avoid iterating over each element.

import numpy as np
x = np.array([2.99845, 4.51845, 0.33365, 0.22501, 2.48523])
np.round(2 * x) / 2
Output:
array([3. , 4.5, 0.5, 0. , 2.5])
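The same scale, round, unscale trick should generalize to other step sizes. Below is a minimal sketch; the helper name round_to_step and its step parameter are made up for illustration and are not part of NumPy. Keep in mind that np.round rounds exact halves to the nearest even integer, so a value lying exactly between two multiples of the step (such as 0.25 with a step of 0.5) goes to the even multiple.

```python
import numpy as np

def round_to_step(values, step=0.5):
    """Round each element to the nearest multiple of `step` (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    # Scale so that multiples of `step` become integers, round, then scale back.
    return np.round(values / step) * step

x = np.array([2.99845, 4.51845, 0.33365, 0.22501, 2.48523])
x_rounded = round_to_step(x)          # nearest multiple of 0.5
x_quarters = round_to_step(x, 0.25)   # nearest multiple of 0.25
```

With the default step this reproduces np.round(2 * x) / 2 from the answer above.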

Related

apply max function to numpy.array

Good morning!
I have an np.array (1.1, 2.2, 3.3), and I want to pass the array to a simple max function, max(0, (x-1.5)**3), and I expect it to return an np.array (0, 0.343, 5.832).
I tried the following code and received an error.
aaa = np.array([1.1, 2.2, 3.3])
max(0, (aaa-1.5)**3)
How can I get the expected result?
You can do this without a list comprehension or a for loop by vectorizing: create an array of zeros and take the element-wise maximum of the two arrays:
import numpy as np
a = np.array((1.1,2.2,3.3))
b = np.zeros(len(a))
np.maximum((a-1.5)**3,b)
Output :
array([0. , 0.343, 5.832])
You should replace max() (which knows little about NumPy objects) with either numpy.maximum() or numpy.fmax().
Both work similarly: they compare two arrays element-wise and output the maximum, broadcasting inputs with different shapes.
They only differ in the way they treat NaNs: propagated with np.maximum() and ignored as much as possible with np.fmax().
In your example, the 0 gets broadcast to the shape of aaa:
import numpy as np
aaa = np.array([1.1, 2.2, 3.3])
np.fmax(0, (aaa - 1.5) ** 3)
# array([0. , 0.343, 5.832])
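To make the NaN difference concrete, here is a small sketch with a made-up input array that contains one NaN. It only illustrates the two functions named above.

```python
import numpy as np

a = np.array([1.0, np.nan, -2.0])

propagated = np.maximum(a, 0)  # a NaN in the input stays NaN in the output
ignored = np.fmax(a, 0)        # the NaN is skipped, so the other operand (0) wins

print(propagated)
print(ignored)
```

If the data can contain NaNs, the choice between the two decides whether those NaNs survive the clipping step.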
x = np.array([1.1, 2.2, 3.3])
y = np.array(list(map(lambda t: max(0, (t - 1.5)**3), x)))

How do I take the reciprocal of all non-zero entries in a numpy array

I am trying to take the reciprocal of every non zero value in a numpy array but am messing something up. Suppose:
norm = np.arange(0,11)
I would like the np.array that would be (maintaining the zeros in place)
[ 0, 1, 0.5 , 0.33, 0.25, 0.2 , 0.17, 0.14, 0.12, 0.11, 0.1]
If I set
mask = norm !=0
and I try
1/norm[mask]
I receive the expected result of
[1, 0.5 , 0.33, 0.25, 0.2 , 0.17, 0.14, 0.12, 0.11, 0.1]
However, I'm trying to understand why, when I try the following assignment,
norm[mask] = 1/norm[mask]
I get the following numpy array:
[0,1,0,0,0,0,0,0,0,0,0]
Any ideas on why this is, or how to achieve the desired np.array?
Are you sure you didn't accidentally change the value of norm?
Both
mask = norm != 0
norm[mask] = 1 / norm[mask]
and
norm[norm != 0] = 1 / norm[norm != 0]
do exactly what you want them to do. I also tried it using mask on the left side and norm != 0 on the right side, like you do above (why?), and it works fine.
EDIT BY FY: I misread the example. I thought original poster was starting with [0, .5, .333, .25] rather than with [0, 1, 2, 3, 4]. Poster is accidentally creating an int64 array rather than a floating point array, and everything is rounding down to zero. Change it to np.arange(0., 11.)
Another option is to use numpy.reciprocal with its where parameter, as follows:
import numpy as np
data = np.reciprocal(data, where=data != 0)
Example:
In [1]: data = np.array([2.0, 4.0, 0.0])
In [2]: np.reciprocal(data, where=data != 0)
Out[2]: array([0.5 , 0.25, 0.  ])
Note that this function is not intended to work with ints, which is why the values are initialized with the .0 suffix.
If you're not sure of the type, you can always use data.astype(np.float64).
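One caveat with the where parameter: when no out array is passed, the entries where the condition is False are left uninitialized, so the zeros are not guaranteed to come out as zeros. A minimal sketch that handles both this and the integer dtype problem follows; the helper name safe_reciprocal is hypothetical, not a NumPy function.

```python
import numpy as np

def safe_reciprocal(values):
    """Return 1/x for non-zero entries and 0 elsewhere (hypothetical helper)."""
    values = np.asarray(values, dtype=float)   # avoid integer arithmetic
    result = np.zeros_like(values)             # entries stay 0 where values == 0
    # Only positions where the mask is True are written into `result`.
    np.reciprocal(values, out=result, where=values != 0)
    return result

norm = np.arange(0, 11)          # integer array, as in the question
inverse = safe_reciprocal(norm)
```

Because the input is converted to float before the division, the original integer array is left unchanged.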

Comparing NumPy arange and custom range function for producing ranges with decimal increments

Here's a custom function that allows stepping through decimal increments:
def my_range(start, stop, step):
    i = start
    while i < stop:
        yield i
        i += step
It works like this:
out = list(my_range(0, 1, 0.1))
print(out)
[0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6, 0.7, 0.7999999999999999, 0.8999999999999999, 0.9999999999999999]
Now, there's nothing surprising about this. It's understandable this happens because of floating point inaccuracies and that 0.1 has no exact representation in memory. So, those precision errors are understandable.
Take numpy on the other hand:
import numpy as np
out = np.arange(0, 1, 0.1)
print(out)
array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
What's interesting is that no imprecision is visible here. I thought this might have to do with what the __repr__ shows, so to confirm, I tried this:
x = list(my_range(0, 1.1, 0.1))[-1]
print(x.is_integer())
False
x = list(np.arange(0, 1.1, 0.1))[-1]
print(x.is_integer())
True
So, my function returns an incorrect upper value (it should be 1.0 but it is actually 1.0999999999999999), but np.arange does it correctly.
I'm aware of "Is floating point math broken?", but the point of this question is:
How does numpy do this?
The difference in endpoints is because NumPy calculates the length up front instead of ad hoc, because it needs to preallocate the array. You can see this in the _calc_length helper. Instead of stopping when it hits the end argument, it stops when it hits the predetermined length.
Calculating the length up front doesn't save you from the problems of a non-integer step, and you'll frequently get the "wrong" endpoint anyway, for example, with numpy.arange(0.0, 2.1, 0.3):
In [46]: numpy.arange(0.0, 2.1, 0.3)
Out[46]: array([ 0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1])
It's much safer to use numpy.linspace, where instead of the step size, you say how many elements you want and whether you want to include the right endpoint.
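As a rough illustration of that advice, the sketch below builds the same kinds of ranges with numpy.linspace. The variable names are arbitrary; num, endpoint and retstep are regular linspace parameters.

```python
import numpy as np

# 11 evenly spaced points from 0 to 1, right endpoint included (the default).
with_endpoint = np.linspace(0, 1, 11)

# 10 points with a spacing of 0.1, right endpoint excluded, like arange(0, 1, 0.1).
without_endpoint = np.linspace(0, 1, 10, endpoint=False)

# retstep=True also returns the spacing that linspace computed.
points, spacing = np.linspace(0.0, 2.1, 8, retstep=True)
```

The number of elements is stated explicitly here, so it no longer depends on how a floating-point division of the range by the step happens to round.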
It might look like NumPy has suffered no rounding error when calculating the elements, but that's just due to different display logic. NumPy is truncating the displayed precision more aggressively than float.__repr__ does. If you use tolist to get an ordinary list of ordinary Python scalars (and thus the ordinary float display logic), you can see that NumPy has also suffered rounding error:
In [47]: numpy.arange(0, 1, 0.1).tolist()
Out[47]:
[0.0,
0.1,
0.2,
0.30000000000000004,
0.4,
0.5,
0.6000000000000001,
0.7000000000000001,
0.8,
0.9]
It's suffered slightly different rounding error - for example, in .6 and .7 instead of .8 and .9 - because it also uses a different means of computing the elements, implemented in the fill function for the relevant dtype.
The fill function implementation has the advantage that it uses start + i*step instead of repeatedly adding the step, which avoids accumulating error on each addition. However, it has the disadvantage that (for no compelling reason I can see) it recomputes the step from the first two elements instead of taking the step as an argument, so it can lose a great deal of precision in the step up front.
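The accumulation effect can be seen with plain Python floats, without NumPy. This is only a sketch of the general idea (repeated addition versus a single multiplication), not of NumPy's actual fill code.

```python
# Repeated addition: every += rounds, and the errors pile up.
total = 0.0
for _ in range(10):
    total += 0.1

# Single multiplication: the result is rounded only once.
product = 10 * 0.1

print(total)    # 0.9999999999999999
print(product)  # 1.0
```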
While arange does step through the range in a slightly different way, it still has the float representation issue:
In [1358]: np.arange(0,1,0.1)
Out[1358]: array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
The print hides that; convert it to a list to see the gory details:
In [1359]: np.arange(0,1,0.1).tolist()
Out[1359]:
[0.0,
0.1,
0.2,
0.30000000000000004,
0.4,
0.5,
0.6000000000000001,
0.7000000000000001,
0.8,
0.9]
or with another iteration
In [1360]: [i for i in np.arange(0,1,0.1)] # e.g. list(np.arange(...))
Out[1360]:
[0.0,
0.10000000000000001,
0.20000000000000001,
0.30000000000000004,
0.40000000000000002,
0.5,
0.60000000000000009,
0.70000000000000007,
0.80000000000000004,
0.90000000000000002]
In this case each displayed item is a np.float64, whereas in the first each is a float.
Aside from the different representation of lists and arrays, NumPy's arange works by multiplying instead of repeatedly adding. It's more like:
def my_range2(start, stop, step):
    i = 0
    while start+(i*step) < stop:
        yield start+(i*step)
        i += 1
Then the output is completely equal:
>>> np.arange(0, 1, 0.1).tolist() == list(my_range2(0, 1, 0.1))
True
With repeated addition you would "accumulate" floating point rounding errors. The multiplication is still affected by rounding but the error doesn't accumulate.
As pointed out in the comments it's not really what is happening. As far as I see it it's more like:
import math
def my_range2(start, stop, step):
    length = math.ceil((stop-start)/step)
    # The next two lines are mostly so the function really behaves like NumPy does
    # Remove them to get better accuracy...
    next_value = start + step
    step = next_value - start
    for i in range(length):
        yield start+(i*step)
But I'm not sure if that's exactly right either, because there's a lot more going on in NumPy.

Truncating decimal digits numpy array of floats

I want to truncate the float values within the numpy array, e.g.
2.34341232 --> 2.34
I read the post "truncate floating point", but it's for one float. I don't want to run a loop over the numpy array, as it would be quite expensive. Is there any built-in method in numpy which can do this easily? I need the output as a float, not a string.
Try out this modified version of numpy.trunc().
import numpy as np
def trunc(values, decs=0):
    return np.trunc(values*10**decs)/(10**decs)
Sadly, the numpy.trunc function doesn't support truncating to a given number of decimals. Luckily, multiplying the argument by a power of ten and dividing the result by the same power gives the expected result.
vec = np.array([-4.79, -0.38, -0.001, 0.011, 0.4444, 2.34341232, 6.999])
trunc(vec, decs=2)
which returns:
>>> array([-4.79, -0.38, -0. , 0.01, 0.44, 2.34, 6.99])
Use numpy.round (note that this rounds to the nearest value rather than truncating):
import numpy as np
a = np.arange(4) ** np.pi
a
=> array([ 0. , 1. , 8.82497783, 31.5442807 ])
a.round(decimals=2)
=> array([ 0. , 1. , 8.82, 31.54])
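The two answers above do different things, which a short side-by-side sketch may make clearer. The trunc helper is the one from the first answer; the sample values are chosen for illustration. The second value also shows a floating-point caveat of the scale-and-truncate approach: 0.29 * 100 evaluates to slightly less than 29, so it truncates to 0.28.

```python
import numpy as np

def trunc(values, decs=0):
    # Same scale, truncate, unscale approach as in the first answer.
    return np.trunc(values * 10**decs) / (10**decs)

vec = np.array([6.999, 0.29])

truncated = trunc(vec, decs=2)   # cuts digits off
rounded = np.round(vec, 2)       # goes to the nearest value

print(truncated)
print(rounded)
```

Which of the two is appropriate depends on whether values just below a boundary should be cut down or rounded to the nearest value.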

Loop for Subtracting within Array Elements

I am a newbie using Python and Numpy.
I thought this would be simple and probably is.
I have an array of times.
For example:
times = (0.5, 0.75, 1.5)
This array will vary in size depending on the files loaded.
I simply want to find the difference in time between each subsequent element.
0.75 - 0.5
then
1.5 - 0.75
and so on for the number of elements in the array.
Then I put each result into one column.
I have tried various for loops but have been unable to do it. There must be a simple way?
Thanks,
Scott
How about this?
>>> import numpy as np
>>> a = np.array([0.5, 0.75, 1.5])
>>> np.diff(a)
array([ 0.25, 0.75])
>>>
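Since the question also asks for the results in one column, here is a small sketch that reshapes the differences into a column. It assumes that "column" means a 2-D array with a single column, which is a guess about the intent.

```python
import numpy as np

times = np.array([0.5, 0.75, 1.5])

diffs = np.diff(times)            # array([0.25, 0.75])
same = times[1:] - times[:-1]     # what np.diff computes for a 1-D array

column = diffs.reshape(-1, 1)     # shape (2, 1): one difference per row
```

The result always has one element fewer than the input, whatever the length of the input array.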
First, note that for most everyday uses, you probably won't require an array (which is a special datatype in Numpy). Python's workhorse means of data storage is the list, and is definitely worth reading up on if you're coming from a more rigid programming language. Lists are actually defined using square brackets:
times = [ 0.5, 0.75, 1.5 ]
Then, with special syntax called a list comprehension, we can create a new list that has (length-1) elements. This expression automatically figures out the size of the list needed.
diffs = [ times[i] - times[i-1] for i in range(1, len(times)) ]
And for the sample data provided, returns:
[0.25, 0.75]
Or how about these (all variants on the same theme):
import numpy as np
a = np.array([0.5, 0.75, 1.5])
b = a[1:]-a[:-1]
print (b)
or without numpy:
a=[0.5, 0.75, 1.5]
a1=a[1:]
a2=a[:-1]
b=[ aa-a2[i] for i,aa in enumerate(a1) ]
or
a=[0.5, 0.75, 1.5]
c=[x-y for x,y in zip(a[1:],a[:-1])]
diffs = [b-a for a, b in zip(times[:-1], times[1:])]
[0.25, 0.75]
This approach requires no numpy, just plain Python. It subtracts times[:-1], which is (0.5, 0.75), from times[1:], which is (0.75, 1.5), element by element.
This should do it:
import numpy as np
times = np.array([0.5,0.75,1.5,2.0])
diff_times = np.zeros(len(times)-1,dtype =float)
for i in range(1,len(times)):
    diff_times[i-1] = times[i] - times[i-1]
print(diff_times)
Very basic:
liste = [1,3,8]
difference = []
for i in range(len(liste)-1):
    diffs = abs(liste[i] - liste[i+1])
    difference.append(diffs)
print(difference)
