numpy.arange divide by zero error - python

I have used numpy's arange function to make the following range:
a = n.arange(0,5,1/2)
This variable works fine by itself, but when I try putting it anywhere in my script I get an error that says
ZeroDivisionError: division by zero

First, your step evaluates to zero (on python 2.x that is). Second, you may want to check np.linspace if you want to use a non-integer step.
Docstring:
arange([start,] stop[, step,], dtype=None)
Return evenly spaced values within a given interval.
[...]
When using a non-integer step, such as 0.1, the results will often not
be consistent. It is better to use ``linspace`` for these cases.
In [1]: import numpy as np
In [2]: 1/2
Out[2]: 0
In [3]: 1/2.
Out[3]: 0.5
In [4]: np.arange(0, 5, 1/2.) # use a float
Out[4]: array([ 0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5])
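If you go the linspace route that the docstring recommends, an equivalent call would be something like the following (the point count of 10 is chosen here just to reproduce the 0.5 step):
In [5]: np.linspace(0, 5, 10, endpoint=False)   # 10 points, endpoint excluded -> step of 0.5
Out[5]: array([ 0. ,  0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5])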

If you're not using Python 3 (where / is true division), the expression 1/2 evaluates to zero, since Python 2 performs integer division on integer operands.
You can fix this by replacing 1/2 with 1./2 or 0.5, or put from __future__ import division at the top of your script.
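For example, in a Python 2 session the __future__ import changes the behaviour of / for integers:
>>> 1/2
0
>>> from __future__ import division
>>> 1/2
0.5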

Related

How to multiply a number with negative power in python

When I try to raise this to a negative power it just returns an error
I use:
A = np.array([[1,2,0], [2,4,-2], [0,-2,3]])
From the screenshot, I can see this is homework.
So it asks for the matrix inverse. In maths this is written as A^(-1)
import numpy as np
A = np.array([[1,2,0], [2,4,-2], [0,-2,3]])
np.linalg.inv(A)
array([[-2.  ,  1.5 ,  1.  ],
       [ 1.5 , -0.75, -0.5 ],
       [ 1.  , -0.5 ,  0.  ]])
In NumPy, you cannot raise integers to negative integer powers (Read this).
In plain Python, the ** operator returns the value without any error.
In [6]: A = 20
In [7]: print(A ** -1)
0.05
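To see the NumPy side of this restriction, a quick sketch (the exact error text can vary between NumPy versions): an integer array raised to a negative power errors out, while a float array works.
In [8]: np.array([1, 2, 4]) ** -1                # integer dtype -> error
ValueError: Integers to negative integer powers are not allowed.
In [9]: np.array([1, 2, 4], dtype=float) ** -1   # float dtype works
Out[9]: array([ 1.  ,  0.5 ,  0.25])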
You can also use pow(),
In [1]: A = 20
In [2]: pow(20, -1)
Out[2]: 0.05
If you're working with matrices, it's a good idea to ensure that they are instances of the numpy.matrix type rather than the more-generic numpy.ndarray.
import numpy as np
M = np.matrix([[ ... ]])
To convert an existing generic array to a matrix you can also pass it into np.asmatrix().
Once you have a matrix instance M, one way to get the inverse is M.I
To avoid the "integers not allowed" problem, ensure that the dtype of your matrix is floating-point, not integer (specify dtype=float in the call to matrix() or asmatrix())
To use a negative value as the power, you can also assign it to another variable, name it "pow", and give it that negative value.
Now put the below in your code.
pow = -3   # note: this name shadows the built-in pow() function
value = 5**pow
print(value)
Execute the code and you will see the result.
Hope it helps... 🤗🤗🤗

Numpy argsort vs Scipy.stats rankdata

I've recently used both of these functions, and am looking for input from anyone who can speak to the following:
do argsort and rankdata differ fundamentally in their purpose?
are there performance advantages with one over the other? (specifically: large vs small array performance differences?)
what is the memory overhead associated with importing rankdata?
Thanks in advance.
p.s. I could not create the new tags 'argsort' or 'rankdata'. If anyone with sufficient standing feels they should be added to this question, please do.
Do argsort and rankdata differ fundamentally in their purpose?
In my opinion, they do, slightly. The first gives you the positions the data would occupy if it were sorted, while the second gives the rank of the data. The difference can become apparent in the case of ties:
import numpy as np
from scipy import stats
a = np.array([ 5, 0.3, 0.4, 1, 1, 1, 3, 42])
almost_ranks = np.empty_like(a)
almost_ranks[np.argsort(a)] = np.arange(len(a))
print(almost_ranks)
print(almost_ranks+1)
print(stats.rankdata(a))
This results in (notice 3. 4. 5 vs. 4. 4. 4):
[6. 0. 1. 2. 3. 4. 5. 7.]
[7. 1. 2. 3. 4. 5. 6. 8.]
[7. 1. 2. 4. 4. 4. 6. 8.]
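As a side note, recent SciPy versions let you pick the tie-breaking rule via rankdata's method argument; method='ordinal' breaks ties by order of appearance and reproduces the argsort-style ranks above (1-based):
print(stats.rankdata(a, method='ordinal'))   # ranks: 7 1 2 3 4 5 6 8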
Are there performance advantages with one over the other?
(specifically: large vs small array performance differences?)
Both algorithms seem to me to have the same complexity, O(N log N). I would expect the NumPy implementation to be slightly faster, since it has a bit less overhead, plus it's NumPy. But you should test this yourself. Checking the code for scipy.stats.rankdata, it seems (at present, at least) to call np.unique among other functions, so I would guess it takes longer in practice.
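If you want to check this yourself, a rough timing sketch (exact numbers will of course depend on your machine and library versions):
import timeit

setup = "import numpy as np; from scipy import stats; a = np.random.rand(10**6)"
print(timeit.timeit("np.argsort(a)", setup=setup, number=10))
print(timeit.timeit("stats.rankdata(a)", setup=setup, number=10))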
what is the memory overhead associated with importing rankdata?
Well, importing rankdata imports scipy (if you had not done so before), so the overhead is essentially that of scipy itself...

How to write conditional code that's compatible with both plain Python values and NumPy arrays?

For writing “piecewise functions” in Python, I'd normally use if (in either the control-flow or ternary-operator form).
def spam(x):
    return x+1 if x>=0 else 1/(1-x)
Now, with NumPy, the mantra is to avoid working on single values in favour of vectorisation, for performance. So I reckon something like this would be preferred (as Leon remarks, the version below is wrong):
def eggs(x):
    y = np.zeros_like(x)
    positive = x>=0
    y[positive] = x+1
    y[np.logical_not(positive)] = 1/(1-x)
    return y
(Correct me if I've missed something here, because frankly I find this very ugly.)
Now, of course eggs will only work if x is actually a NumPy array, because otherwise x>=0 simply yields a single boolean, which can't be used for indexing (at least doesn't do the right thing).
Is there a good way to write code that looks more like spam but works idiomatically on NumPy arrays, or should I just use vectorize(spam)?
Use np.where. You'll get an array as the output even for plain number input, though.
def eggs(x):
    y = np.asarray(x)
    return np.where(y>=0, y+1, 1/(1-y))
This works for both arrays and plain numbers:
>>> eggs(5)
array(6.0)
>>> eggs(-3)
array(0.25)
>>> eggs(np.arange(-3, 3))
/home/praveen/.virtualenvs/numpy3-mkl/bin/ipython3:2: RuntimeWarning: divide by zero encountered in true_divide
array([ 0.25 , 0.33333333, 0.5 , 1. , 2. , 3. ])
>>> eggs(1)
/home/praveen/.virtualenvs/numpy3-mkl/bin/ipython3:3: RuntimeWarning: divide by zero encountered in long_scalars
# -*- coding: utf-8 -*-
array(2.0)
As ayhan remarks, this raises a warning, since 1/(1-x) gets evaluated for the whole range. But a warning is just that: a warning. If you know what you're doing, you can ignore the warning. In this case, you're only choosing 1/(1-x) from indices where it can never be inf, so you're safe.
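If the warning bothers you, one option (a sketch, not part of the original answer) is to silence it locally with np.errstate:
def eggs(x):
    y = np.asarray(x, dtype=float)
    # the branch that is never selected may still divide by zero;
    # ignore just that warning while evaluating both branches
    with np.errstate(divide='ignore'):
        return np.where(y >= 0, y + 1, 1/(1 - y))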
I would use numpy.asarray (which is a no-op if the argument is already a NumPy array) if I want to handle both numbers and NumPy arrays:
def eggs(x):
    x = np.asfarray(x)
    m = x>=0
    x[m] = x[m] + 1
    x[~m] = 1 / (1 - x[~m])
    return x
(here I used asfarray to enforce a floating-point type, since your function requires floating-point computations).
This is less efficient than your spam function for single inputs, and arguably uglier. However it seems to be the easiest choice.
EDIT: If you want to ensure that x is not modified (as pointed out by Leon), you can replace np.asfarray(x) with np.array(x, dtype=np.float64); the array constructor copies by default.
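A quick way to see the difference (a sketch, assuming the copying variant of eggs above): the caller's array should come back unchanged.
x = np.array([-3.0, 2.0])
print(eggs(x))   # [ 0.25  3.  ]
print(x)         # [-3.  2.] with the copying variant; overwritten in place with np.asfarray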

Matlab range in Python

I must translate some Matlab code into Python 3 and I often come across ranges of the form start:step:stop. When these arguments are all integers, I easily translate this command with np.arange(), but when some of the arguments are floats, especially the step parameter, I don't get the same output in Python. For example,
7:8 %In Matlab
7 8
If I want to translate it in Python I simply use :
np.arange(7,8+1)
array([7, 8])
But if I have, let's say :
7:0.3:8 %In Matlab
7.0000 7.3000 7.6000 7.9000
I can't translate it using the same logic :
np.arange(7, 8+0.3, 0.3)
array([ 7. , 7.3, 7.6, 7.9, 8.2])
In this case, I must not add the step to the stop argument.
But then, if I have :
7:0.2:8 %In Matlab
7.0000 7.2000 7.4000 7.6000 7.8000 8.0000
I can use my first idea :
np.arange(7,8+0.2,0.2)
array([ 7. , 7.2, 7.4, 7.6, 7.8, 8. ])
My problem comes from the fact that I am not translating hardcoded lines like these. In fact, each parameter of these ranges can change depending on the inputs of the function I am working on. Thus, I can sometimes have 0.2 or 0.3 as the step parameter. So basically, do you guys know if there is another numpy/scipy or whatever function that really acts like Matlab's range, or if I must add a little bit of code myself to make sure that my Python range ends up at the same number as Matlab's?
Thanks!
You don't actually need to add your entire step size to the max limit of np.arange, just a very tiny number, to make sure that the max is included. For example, the machine epsilon:
eps = np.finfo(np.float32).eps
Adding eps will give you the same result as MATLAB in all three of your scenarios:
In [13]: np.arange(7, 8+eps)
Out[13]: array([ 7., 8.])
In [14]: np.arange(7, 8+eps, 0.3)
Out[14]: array([ 7. , 7.3, 7.6, 7.9])
In [15]: np.arange(7, 8+eps, 0.2)
Out[15]: array([ 7. , 7.2, 7.4, 7.6, 7.8, 8. ])
Matlab docs for linspace say
linspace is similar to the colon operator, ":", but gives direct control over the number of points and always includes the endpoints. "lin" in the name "linspace" refers to generating linearly spaced values as opposed to the sibling function logspace, which generates logarithmically spaced values.
numpy's arange docs give similar advice:
When using a non-integer step, such as 0.1, the results will often not
be consistent. It is better to use linspace for these cases.
End of interval. The interval does not include this value, except
in some cases where step is not an integer and floating point
round-off affects the length of out.
So small differences in how the step size gets translated into a number of steps can produce differences in the length of the result. If you need consistency between the two codes, linspace is the better choice (in both).
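For instance, one way to mimic MATLAB's start:step:stop is to compute the number of points explicitly and let linspace handle the endpoints (a sketch; mrange is a hypothetical helper, and the small tolerance guards against round-off when the step divides the interval exactly):
import numpy as np

def mrange(start, step, stop):
    # number of points, with a small tolerance so e.g. a 0.2 step over [7, 8]
    # is not lost to floating-point round-off
    n = int(np.floor((stop - start) / step + 1e-10)) + 1
    return np.linspace(start, start + (n - 1) * step, n)

print(mrange(7, 0.3, 8))   # [ 7.   7.3  7.6  7.9]
print(mrange(7, 0.2, 8))   # [ 7.   7.2  7.4  7.6  7.8  8. ]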

Normalization using Numpy vs hard coded

import numpy as np
import math
def normalize(array):
    mean = sum(array) / len(array)
    deviation = [(float(element) - mean)**2 for element in array]
    std = math.sqrt(sum(deviation) / len(array))
    normalized = [(float(element) - mean)/std for element in array]
    numpy_normalized = (array - np.mean(array)) / np.std(array)
    print normalized
    print numpy_normalized
    print ""
normalize([2, 4, 4, 4, 5, 5, 7, 9])
normalize([1, 2])
normalize(range(5))
Outputs:
[-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
[-1.5 -0.5 -0.5 -0.5 0. 0. 1. 2. ]
[0.0, 1.414213562373095]
[-1. 1.]
[-1.414213562373095, -0.7071067811865475, 0.0, 0.7071067811865475, 1.414213562373095]
[-1.41421356 -0.70710678 0. 0.70710678 1.41421356]
Can someone explain to me why this code behaves differently in the second example, but similarly in the other two examples?
Did I do anything wrong in the hard coded example? What does NumPy do to end up with [-1, 1]?
As seaotternerd explains, you're using integers. And in Python 2 (unless you from __future__ import division), dividing an integer by an integer gives you an integer.
So, why aren't all three wrong? Well, look at the values. In the first one, the sum is 40 and the len is 8, and 40 / 8 = 5. And in the third one, 10 / 5 = 2. But in the second one, 3 / 2 = 1.5. Which is why only that one gets the wrong answer when you do integer division.
So, why doesn't NumPy also get the second one wrong? NumPy doesn't treat an array of integers as floats; it treats them as integers (print np.array(array).dtype and you'll see int64). However, as the docs for np.mean explain, "float64 intermediate and return values are used for integer inputs". And, although I don't know this for sure, I'd guess they designed it that way specifically to avoid problems like this.
As a side note, if you're interested in taking the mean of floats, there are other problems with just using sum / div. For example, the mean of [1, 2, 1e200, -1e200] really ought to be 0.75, but if you just do sum / div, you're going to get 0. (Why? Well, 1 + 2 + 1e200 == 1e200.) You may want to look at a simple stats library, even if you're not using NumPy, to avoid all these problems. In Python 3 (which would have avoided your problem in the first place), there's one in the stdlib, called statistics; in Python 2, you'll have to go to PyPI.
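For instance, on Python 3 (a quick illustration; statistics.mean sums exactly before dividing, so the huge values cancel):
>>> import statistics
>>> statistics.mean([1, 2, 1e200, -1e200])
0.75
>>> (1 + 2 + 1e200 + -1e200) / 4
0.0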
You aren't converting the numbers in the array to floats when calculating the mean. This isn't a problem for your first or third inputs, because they happen to work out neatly (as explained by @abarnert), but since the second input does not, and is composed exclusively of ints, you end up calculating the mean as 1 when it should be 1.5. This propagates through, resulting in your discrepancy with the results of using NumPy's functions.
If you replace the line where you calculate the mean with this, which forces Python to use float division:
mean = sum(array) / float(len(array))
you will ultimately get [-1, 1] as a result for the second set of inputs, just like NumPy.
