I have searched long and hard and cannot find a way to do this.
x = random.normal(100,100)
This is a variable x of type float. I want to pass all the elements of the first column as X coordinates and the elements of the second column as Y coordinates to the matplotlib.pyplot function. How do I do it?
Also, how do I determine the shape of a float array? In this case it is clearly 100x100, but float objects do not have a .shape attribute.
Your np.random.normal(100,100) is a simple, single float...
Like so?
import matplotlib.pyplot as plt
import numpy as np
data = np.random.normal(loc=100, scale=1, size=200)  # 2 * 100 values = 200 values drawn around a mean of 100
x = data[0::2]  # take even-indexed values as X
y = data[1::2]  # take odd-indexed values as Y
plt.scatter(x,y)
plt.plot(x,y)
plt.grid(True)
plt.show()
To elaborate slightly on @Patrick Artner's answer...
x = random.normal(100,100)
This generates a single random sample (one float, not an array) from a normal distribution with mean = 100 and standard deviation = 100. To make this clearer, you can pass the arguments by keyword:
x = np.random.normal(loc=100, scale=100)
Note: loc = mean and scale = standard deviation.
See numpy's documentation: https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.random.normal.html
To answer your question about how to determine the shape of a float array: read the array's .shape attribute (it is an attribute, not a function, so there are no parentheses). For example:
x = np.random.normal(0, 1, (100, 2))
print("The shape of x is %s" % (x.shape,))
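Putting the two answers together, here is a minimal sketch of how the original goal could be reached. It assumes that what was wanted is 100 points with two columns; the names `points`, `xs`, `ys` and `plot_columns` are made up for illustration.

```python
import numpy as np

# Assumption: 100 points with 2 columns, drawn with mean 100 and std 100.
rng = np.random.default_rng(0)
points = rng.normal(loc=100, scale=100, size=(100, 2))

xs = points[:, 0]  # first column -> X coordinates
ys = points[:, 1]  # second column -> Y coordinates


def plot_columns(xs, ys):
    """Scatter-plot ys against xs (matplotlib is imported only when called)."""
    import matplotlib.pyplot as plt

    plt.scatter(xs, ys)
    plt.grid(True)
    plt.show()


print(points.shape)  # shape of the array
# np.shape also accepts a plain float, for which it returns an empty tuple.
print(np.shape(np.random.normal(100, 100)))
```

Calling plot_columns(xs, ys) opens the scatter plot. The size argument is what turns the single float into an array.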
Let's say I have two arrays X and Y of floats (same length len) and a 2D array img (grayscale Image).
I want to calculate the values of img[X[i]][Y[i]] for i in {0..len} with a good approximation.
Is it enough to just convert X and Y to arrays of integers? Or is there a nice interpolation function that gives better results? (I looked for them but there are so many I got confused).
Thanks
import scipy.interpolate
y_interp = scipy.interpolate.interp1d(x, y)
print(y_interp(5.0))
scipy.interpolate.interp1d does linear interpolation by default and can be customized to handle error conditions (for example, points outside the data range, via bounds_error and fill_value).
Just using python I would do something like this:
# Calculate tops (just in case the rounding goes over this)
x_top, y_top = len(img) - 1, len(img[0]) - 1
for x, y in zip(map(round, X), map(round, Y)):
    x, y = min(x, x_top), min(y, y_top)  # Check tops
    val = img[x][y]  # Here is your value
map(round, X) applies the function round to each element of X.
zip takes two iterables and returns their elements in pairs.
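For a 2D image, one possible alternative to rounding is bilinear interpolation with scipy.ndimage.map_coordinates. The sketch below compares both on a small made-up image; `img`, `X` and `Y` are illustrative values, and mode="nearest" is one assumed choice for handling coordinates at the border.

```python
import numpy as np
from scipy import ndimage

# Hypothetical 3x4 grayscale image whose value at (row, col) is 4*row + col.
img = np.arange(12, dtype=float).reshape(3, 4)
X = np.array([0.5, 1.0, 1.75])  # fractional row coordinates
Y = np.array([1.5, 2.0, 0.25])  # fractional column coordinates

# Option 1: nearest pixel (round, then clip to the valid index range).
rows = np.clip(np.rint(X).astype(int), 0, img.shape[0] - 1)
cols = np.clip(np.rint(Y).astype(int), 0, img.shape[1] - 1)
nearest = img[rows, cols]

# Option 2: bilinear interpolation (order=1) at the exact float positions.
bilinear = ndimage.map_coordinates(img, [X, Y], order=1, mode="nearest")

print(nearest)
print(bilinear)
```

Rounding is cheaper; interpolation is usually the better approximation when the image varies smoothly between pixels.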
I am trying to multiply two or more different arrays by one constant factor. For example, I have two arrays from pressure measurements in bar and want to convert each array separately to Pa by multiplying every row by the factor 1e5. The result should also be two arrays. I thought about a for loop, but I am new to Python and I have no idea how to deal with it.
# for example
import numpy as np
p1=np.array([2,3,4]) # pressure measurement p1 in bar
p2=np.array([8,7,6]) # pressure measurement p2 in bar
# loop to multiply p1 and p2 separately with 1e5
# return
# p1[2e5,3e5,4e5]
# p2[8e5,7e5,6e5]
Can anybody help?
Thank you very much!
Jonas
NumPy arrays support scalar multiplication (it's a special case of broadcasting). Just directly multiply the array by the constant: p1 *= 1e5
If you get a UFuncTypeError, it means that the array's datatype cannot hold the result of the in-place operation. For example, a = np.array([1,2,3]) creates an array with an integer datatype (int64 or int32, depending on the platform), and NumPy's casting rules don't allow the float result of a *= 1e5 to be written back into it. To fix this, you can create a new array instead (a = a * 1e5), explicitly specify the datatype: a = np.array([1,2,3], dtype=float), or give the entries as floats: a = np.array([1.0,2.0,3.0])
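The difference between the in-place and the out-of-place form can be seen in a short sketch. The variable names are illustrative, and the example catches TypeError on the assumption that NumPy's casting error is a subclass of it.

```python
import numpy as np

p1 = np.array([2, 3, 4])  # integer dtype, because the entries are ints

# Out-of-place: NumPy allocates a new float64 array for the result.
p1_pa = p1 * 1e5

# In-place: the float result cannot be stored in the integer array.
in_place_failed = False
try:
    p1 *= 1e5
except TypeError as exc:  # NumPy raises a casting error here
    in_place_failed = True
    print("in-place multiply failed:", type(exc).__name__)

# In-place works once the array itself holds floats.
p2 = np.array([8, 7, 6], dtype=float)
p2 *= 1e5

print(p1_pa, p2)
```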
Use numpy.multiply for this:
x = np.array([2,3,4])
y = np.multiply(x, 1e5)
print(y)
Output:
[200000. 300000. 400000.]
x is not changed in the process
def multiply_two_arrays(a1, a2, factor):
    return a1*factor, a2*factor
a1, a2 = multiply_two_arrays(p1, p2, 10)
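If there are more than two arrays, one way to avoid writing a function per count is to keep them in a dictionary and convert them all at once. This is only a sketch; `BAR_TO_PA`, `pressures_bar` and `pressures_pa` are names made up for the example.

```python
import numpy as np

BAR_TO_PA = 1e5  # conversion factor from bar to pascal

# Any number of measurement arrays, keyed by name.
pressures_bar = {
    "p1": np.array([2, 3, 4]),
    "p2": np.array([8, 7, 6]),
}

# Each array is converted separately; the originals are left untouched.
pressures_pa = {name: values * BAR_TO_PA for name, values in pressures_bar.items()}

for name, values in pressures_pa.items():
    print(name, values)
```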
When calling:
interpolator = scipy.interpolate.RegularGridInterpolator((X, Y, Z), data, method='linear')
I get the error "The points in dimension 0 must be strictly ascending".
Why must the points have strictly ascending x values? Surely I can create an interpolator from data points that share the same x value, for example with the coordinates into the data array of
0,0,0 and 0,0,1
(or X = [0,0], Y = [0,0] and Z = [0,1])
I must be missing something about the input format, but can't see what.
Ok, it looks like RegularGridInterpolator isn't what I need, because it requires all values in the grid to be defined. LinearNDInterpolator is what I need.
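The two input formats differ as follows: RegularGridInterpolator expects one ascending 1-D coordinate vector per axis plus a full data cube, whereas LinearNDInterpolator expects a list of scattered points, one row per sample. A minimal sketch with made-up data (a linear function, so that linear interpolation reproduces it exactly):

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator, RegularGridInterpolator

# Regular grid: one strictly ascending coordinate vector per axis.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.0])
z = np.array([0.0, 1.0, 2.0, 3.0])
xx, yy, zz = np.meshgrid(x, y, z, indexing="ij")
data = xx + 2 * yy + 3 * zz  # shape (len(x), len(y), len(z))

grid_interp = RegularGridInterpolator((x, y, z), data, method="linear")
grid_value = grid_interp([[0.5, 0.5, 1.5]])[0]

# Scattered data: an (n_points, 3) array of coordinates, where repeated
# x values across points are fine, plus one value per point.
rng = np.random.default_rng(0)
points = rng.random((50, 3))
values = points[:, 0] + 2 * points[:, 1] + 3 * points[:, 2]

scattered_interp = LinearNDInterpolator(points, values)
query = points.mean(axis=0)  # centroid, guaranteed inside the convex hull
scattered_value = scattered_interp(query[None, :])[0]

print(grid_value, scattered_value)
```

Outside the convex hull of the scattered points, LinearNDInterpolator returns its fill_value, which is NaN by default.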
I saw in a tutorial (with no further explanation) that we can shift data to zero mean with x -= np.mean(x, axis=0) and normalize it with x /= np.std(x, axis=0). Can anyone elaborate on these two pieces of code? The only thing I got from the documentation is that np.mean calculates the arithmetic mean along a specific axis and np.std does the same for the standard deviation.
This is also called the z-score.
SciPy has a utility for it:
>>> from scipy import stats
>>> stats.zscore([ 0.7972, 0.0767, 0.4383, 0.7866, 0.8091,
... 0.1954, 0.6307, 0.6599, 0.1065, 0.0508])
array([ 1.1273, -1.247 , -0.0552, 1.0923, 1.1664, -0.8559, 0.5786,
0.6748, -1.1488, -1.3324])
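As a quick check that the two approaches agree, the sketch below compares stats.zscore with the manual mean/std computation on the same numbers. It relies on both stats.zscore and np.std using ddof=0 by default; `manual` and `z` are illustrative names.

```python
import numpy as np
from scipy import stats

a = np.array([0.7972, 0.0767, 0.4383, 0.7866, 0.8091,
              0.1954, 0.6307, 0.6599, 0.1065, 0.0508])

# Manual z-score: subtract the mean, then divide by the standard deviation.
manual = (a - np.mean(a)) / np.std(a)

# SciPy's utility (population standard deviation, ddof=0, by default).
z = stats.zscore(a)

print(np.allclose(manual, z))
```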
Follow the comments in the code below
import numpy as np
# create x
x = np.asarray([1,2,3,4], dtype=np.float64)
np.mean(x) # calculates the mean of the array x
x-np.mean(x) # this is equivalent to subtracting the mean of x from each value in x
x-=np.mean(x) # the -= can be read as x = x - np.mean(x)
np.std(x) # this calculates the standard deviation of the array
x/=np.std(x) # the /= can be read as x = x / np.std(x)
From the syntax you give, I conclude that your array is multidimensional. I will first discuss the case where x is just a one-dimensional array:
np.mean(x) computes the mean; through broadcasting, x - np.mean(x) subtracts that mean from every entry. x -= np.mean(x, axis=0) is equivalent to x = x - np.mean(x, axis=0). The same holds for x /= np.std(x, axis=0).
In the case of multidimensional arrays the same thing happens, but instead of computing the mean over the entire array, you compute the mean along the first "axis". Axis is the NumPy word for dimension. So if x is two-dimensional, then np.mean(x, axis=0) = [np.mean(x[:,0]), np.mean(x[:,1]), ...]. Broadcasting again ensures that the subtraction is applied to all elements.
Note that this only works with the first dimension; otherwise the shapes will generally not match for broadcasting. If you want to normalize with respect to another axis n, you need to do something like:
x -= np.expand_dims(np.mean(x, axis = n), n)
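An equivalent way to keep the reduced axis for broadcasting is the keepdims=True argument of np.mean. The sketch below, on made-up data, shows that both forms give the same result for n = 1.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=(4, 3))  # illustrative data
n = 1  # normalize along the second axis (within each row)

# np.mean(x, axis=1) has shape (4,), which does not broadcast against (4, 3).
# Re-inserting the reduced axis gives shape (4, 1), which does.
centered_expand = x - np.expand_dims(np.mean(x, axis=n), n)

# keepdims=True keeps the reduced axis with length 1 directly.
centered_keep = x - np.mean(x, axis=n, keepdims=True)

print(np.allclose(centered_expand, centered_keep))
print(centered_keep.mean(axis=n))  # approximately zero for every row
```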
The key here is the augmented assignment operators, which perform an operation on the original variable.
a += c is equivalent to a = a + c (for NumPy arrays, the update is done in place).
So indeed a (in your case x) has to be defined beforehand.
Each method takes an array/iterable (x) as input and outputs a value (or array if a multidimensional array was input), which is thus applied in your assignment operations.
The axis=0 parameter means that the mean or std operation is applied down the rows: for each column, you take the values from every row and compute their mean or std.
axis=1 would instead take the values of every column for a given row.
With both operations, you first remove the mean so that each column is centered around 0. Then, dividing by the std rescales the spread of the data around zero so that each column has unit standard deviation. Most values then typically lie within a few units of 0, but they are not confined to the [-1, +1] interval.
So now, each of your column values is centered around zero and standardized.
There are other scaling techniques, such as subtracting the minimum value and dividing by the range of values (min-max scaling).
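To make the difference between the two scalings concrete, here is a small sketch on randomly generated data (the array `x` and the names `standardized` and `minmax` are made up for the example):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=10.0, scale=3.0, size=(1000, 2))  # illustrative data

# Standardization: each column gets zero mean and unit standard deviation.
standardized = (x - x.mean(axis=0)) / x.std(axis=0)

# Min-max scaling: each column is mapped onto the interval [0, 1].
minmax = (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0))

print(standardized.mean(axis=0), standardized.std(axis=0))
print(np.abs(standardized).max())  # standardized values can exceed 1
print(minmax.min(axis=0), minmax.max(axis=0))
```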
I want to interpolate between values in each row of a matrix (x-values) given a fixed vector of y-values. I am using python and essentially I need something like scipy.interpolate.interp1d but with x values being a matrix input. I implemented this by looping, but I want to make the operation as fast as possible.
Edit
Below is an example of what I am doing right now. Note that my real matrix has many more rows, on the order of millions:
import numpy as np
x = np.linspace(0,1,100).reshape(10,10)
results = np.zeros(10)
for i in range(10):
    results[i] = np.interp(0.1,x[i],range(10))
As @Joe Kington suggested, you can use map_coordinates:
import scipy.ndimage as nd
# your data - make sure it is float/double
X = np.arange(100).reshape(10,10).astype(float)
# the points where you want to interpolate each row
y = np.random.rand(10) * (X.shape[1]-1)
# the rows at which you want the data interpolated -- all rows
r = np.arange(X.shape[0])
result = nd.map_coordinates(X, [r, y], order=1, mode='nearest')
The above, for the following y:
array([ 8.00091648, 0.46124587, 7.03994936, 1.26307275, 1.51068952,
5.2981205 , 7.43509764, 7.15198457, 5.43442468, 0.79034372])
Note, each value indicates the position in which the value is going to be interpolated for each row.
Gives the following result:
array([ 8.00091648, 10.46124587, 27.03994936, 31.26307275,
41.51068952, 55.2981205 , 67.43509764, 77.15198457,
85.43442468, 90.79034372])
which makes sense considering the nature of the aranged data, and the columns (y) at which it is interpolated.
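For the loop in the question itself (np.interp applied to every row of x with the same target and the same y-values), a pure NumPy vectorization is also possible. The following is only a sketch: the helper name `interp_rows` is made up, and it assumes every row of x is strictly increasing.

```python
import numpy as np


def interp_rows(target, x, fp):
    """Linearly interpolate fp at `target` for every row of x.

    Assumes each row of x is strictly increasing and len(fp) == x.shape[1].
    Targets outside a row's range are clamped to the end values, as in np.interp.
    """
    n_rows, n_cols = x.shape
    # The number of entries <= target is the index of the right-hand neighbour.
    hi = np.clip((x <= target).sum(axis=1), 1, n_cols - 1)
    lo = hi - 1
    rows = np.arange(n_rows)
    x_lo, x_hi = x[rows, lo], x[rows, hi]
    # Fractional position inside the interval, clamped to [0, 1].
    t = np.clip((target - x_lo) / (x_hi - x_lo), 0.0, 1.0)
    return fp[lo] + t * (fp[hi] - fp[lo])


x = np.linspace(0, 1, 100).reshape(10, 10)
fp = np.arange(10.0)

vectorized = interp_rows(0.55, x, fp)
looped = np.array([np.interp(0.55, x[i], fp) for i in range(10)])
print(np.allclose(vectorized, looped))
```

The boolean comparison allocates a temporary array of the same shape as x, so for millions of rows it may be worth processing the matrix in chunks.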