I think I'm following the same online tutorial as the one mentioned in this post:
How to convert deep learning gradient descent equation into python
I understand we have to calculate the cost and db, but my question is: why do they put axis=0 in both equations? In other words, what is axis=0 used for in this calculation, and what would the result be if you did the calculation without it?
import numpy as np
cost = -1*((np.sum(np.dot(Y,np.log(A))+np.dot((1-Y),(np.log(1-A))),axis=0))/m)
db = np.sum((A-Y),axis=0)/m
This is the kind of question you could have answered by trying it out in the interpreter yourself, probably in less time than it took to compose the question.
Another way is to look at the documentation; consulting the documentation is always a good habit. The documentation for np.sum() can be found here.
Some excerpts from the documentation, in case you'd rather not click through:
...
axis : None or int or tuple of ints, optional
Axis or axes along which a sum is performed. The default, axis=None,
will sum all of the elements of the input array. If axis is negative it
counts from the last to the first axis.
...
Some examples from the documentation:
>>> np.sum([0.5, 1.5])
2.0
>>> np.sum([[0, 1], [0, 5]])
6
>>> np.sum([[0, 1], [0, 5]], axis=0)
array([0, 6])
>>> np.sum([[0, 1], [0, 5]], axis=1)
array([1, 5])
Visualization
-----> axis = 1
| [[0, 1
| [0, 5]]
v
axis = 0
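To answer the last part of the question directly for the db line, here is a minimal sketch. It assumes A and Y are (m, 1) column vectors, which is an assumption on my part; the tutorial's actual shapes may differ:
import numpy as np

m = 4
A = np.array([[0.9], [0.2], [0.8], [0.4]])  # hypothetical activations, shape (m, 1)
Y = np.array([[1.0], [0.0], [1.0], [0.0]])  # hypothetical labels, shape (m, 1)

db_with_axis = np.sum(A - Y, axis=0) / m  # sums down the rows -> shape (1,) array, ~0.075
db_without = np.sum(A - Y) / m            # sums everything -> bare float, ~0.075
For a single column the numbers agree; the difference is that axis=0 keeps a one-element array, while omitting it collapses everything to a plain scalar.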
Just for clarity: in many deep learning frameworks, all parameters are treated as tensors, and scalars are simply rank-0 tensors. If you do a plain np.sum(), you flatten the tensor and sum up all components to produce a scalar, not a tensor. By explicitly passing an axis, you keep an array instead (in your case, a one-element array of shape (1,)). I don't know whether the code you linked in your question requires this, but I can imagine that it plays a role in some deep learning frameworks.
Here is a quick example that illustrates my point:
import numpy as np
x = np.ones((1, 10))
no_ax = np.sum(x)
ax0 = np.sum(x, axis=0)
ax1 = np.sum(x, axis=1)
print(no_ax, ax0, ax1)
Result:
(10.0, array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]), array([10.]))
Related
I have a problem with a NumPy array.
In particular, suppose I have the matrix
x = np.array([[1., 2., 3.], [4., 5., 6.]])
with shape (2, 3). I want to wrap each float in its own list, so as to obtain the array [[[1.], [2.], [3.]], [[4.], [5.], [6.]]] with shape (2, 3, 1).
I tried converting each float to a list (i.e., x[0][0] = [x[0][0]]), but it does not work.
Can anyone help me? Thanks.
What you want is to add another dimension to your NumPy array. One way of doing it is using reshape:
x = x.reshape(2,3,1)
output:
[[[1.]
  [2.]
  [3.]]

 [[4.]
  [5.]
  [6.]]]
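If you'd rather not hard-code the target shape, the same reshape works for any 2-D input; just a sketch:
import numpy as np

x = np.array([[1., 2., 3.], [4., 5., 6.]])
x3 = x.reshape(x.shape + (1,))  # append a trailing axis of length 1
print(x3.shape)  # (2, 3, 1)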
There is a function in NumPy that performs exactly what @Valdi_Bo mentions. You can use np.expand_dims to add a new dimension along axis 2, as follows:
x = np.expand_dims(x, axis=2)
Refer:
np.expand_dims
Actually, you want to add a dimension (not a level).
To do it, run:
result = x[...,np.newaxis]
Its shape is just (2, 3, 1).
Or save the result back to x.
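Since np.newaxis is just an alias for None, a quick shape check, for illustration:
import numpy as np

x = np.array([[1., 2., 3.], [4., 5., 6.]])
result = x[..., np.newaxis]  # same as x[..., None]
print(result.shape)          # (2, 3, 1)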
You are trying to add a new dimension to the NumPy array. There are multiple ways of doing this; other answers mentioned np.expand_dims, np.newaxis, np.reshape, etc. But I usually use the following, as I find it the most readable, especially when vectorizing multiple tensors with complex broadcasting operations (check this bounty question that I solved with this method).
>>> x[:, :, None].shape
(2, 3, 1)
>>> x[None, :, None, :, None].shape
(1, 2, 1, 3, 1)
Well, maybe this is overkill for the array you have, but definitely the most efficient solution is to use np.lib.stride_tricks.as_strided. This way, no data is copied.
import numpy as np
x = np.array([[1., 2., 3.], [4., 5., 6.]])
newshape = x.shape[:-1] + (x.shape[-1], 1)  # (2, 3) -> (2, 3, 1)
newstrides = x.strides + x.strides[-1:]     # reuse the last stride for the new axis
a = np.lib.stride_tricks.as_strided(x, shape=newshape, strides=newstrides)
results in:
array([[[1.],
        [2.],
        [3.]],

       [[4.],
        [5.],
        [6.]]])
>>> a.shape
(2, 3, 1)
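Because as_strided returns a view, writes to a show through to x, which is worth keeping in mind if you mutate the result. A quick check:
>>> np.shares_memory(a, x)
True
>>> a[0, 0, 0] = 99.0
>>> x[0, 0]
99.0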
Is there a way to perform multiple simultaneous (but unrelated) least-squares fits with different coefficient matrices in either numpy.linalg.lstsq or scipy.linalg.lstsq? For example, here is a trivial linear fit that I would like to be able to do with different x-values but the same y-values. Currently, I have to write a loop:
x = np.arange(12.0).reshape(4, 3)
y = np.arange(12.0, step=3.0)
m = np.stack((x, np.broadcast_to(1.0, x.shape)), axis=-1).transpose(1, 0, 2)
fit = np.stack(tuple(np.linalg.lstsq(w, y, rcond=-1)[0] for w in m), axis=0)
This results in a set of fits with the same slope and different intercepts, such that fit[n] contains the coefficients for the design matrix m[n].
Linear least squares is not a great example since it has a direct closed-form solution, and both functions already have an option for multiple y-values. However, it serves to illustrate my point.
Ideally, I would like to extend this to any "broadcastable" combination of a and b, where a.shape[-2] == b.shape[0] exactly, and the last dimensions have to either match or be one (or missing). I am not really hung up on which dimension of a is the one representing the different matrices: it was just convenient to make it the first one to shorten the loop.
Is there a built in method in numpy or scipy to avoid the Python loop? I am very much interested in using lstsq rather than manually transposing, multiplying and inverting the matrices.
You could use scipy.sparse.linalg.lsqr together with scipy.sparse.block_diag. I'm just not sure it will be any faster.
Example:
>>> import numpy as np
>>> from scipy.sparse import block_diag
>>> from scipy.sparse import linalg as sprsla
>>>
>>> x = np.random.random((3,5,4))
>>> y = np.random.random((3,5))
>>>
>>> for A, b in zip(x, y):
... print(np.linalg.lstsq(A, b))
...
(array([-0.11536962, 0.22575441, 0.03597646, 0.52014899]), array([0.22232195]), 4, array([2.27188101, 0.69355384, 0.63567141, 0.21700743]))
(array([-2.36307163, 2.27693405, -1.85653264, 3.63307554]), array([0.04810252]), 4, array([2.61853881, 0.74251282, 0.38701194, 0.06751288]))
(array([-0.6817038 , -0.02537582, 0.75882223, 0.03190649]), array([0.09892803]), 4, array([2.5094637 , 0.55673403, 0.39252624, 0.18598489]))
>>>
>>> sprsla.lsqr(block_diag(x), y.ravel())
(array([-0.11536962, 0.22575441, 0.03597646, 0.52014899, -2.36307163,
2.27693405, -1.85653264, 3.63307554, -0.6817038 , -0.02537582,
0.75882223, 0.03190649]), 2, 15, 0.6077437777160813, 0.6077437777160813, 6.226368324510392, 106.63227777368986, 1.3277892240815807e-14, 5.36589277249043, array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]))
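If you are open to the normal equations after all (the asker preferred lstsq, so treat this as a sketch, assuming each batch matrix is well-conditioned), the Python loop can be avoided entirely because np.linalg.solve broadcasts over leading dimensions:
import numpy as np

x = np.random.random((3, 5, 4))
y = np.random.random((3, 5))

AtA = np.einsum('bij,bik->bjk', x, x)  # batch of A^T A, shape (3, 4, 4)
Atb = np.einsum('bij,bi->bj', x, y)    # batch of A^T b, shape (3, 4)
coef = np.linalg.solve(AtA, Atb)       # all three systems solved at once
print(coef.shape)                      # (3, 4)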
I'm trying to add new columns to an empty NumPy array and am running into trouble. I've looked at a lot of other questions, but for some reason they don't seem to help me solve my problem, so I decided to ask my own question.
I have an empty NumPy array such that:
array1 = np.array([])
Let's say I have data of shape (100, 100) and want to append each column to array1 one by one. However, if I do, for example:
array1 = np.append(array1, some_data[:, 0])
array1 = np.append(array1, some_data[:, 1])
I noticed that this doesn't give me a (100, 2) matrix but a (200,) array. So I tried to specify the axis:
array1 = np.append(array1, some_data[:, 0], axis=1)
which produces an AxisError: axis 1 is out of bounds for array of dimension 1.
Next, I tried the np.c_[] method:
array1 = np.c_[array1, some_data[:, 0]]
which gives me a ValueError: all the input array dimensions except for the concatenation axis must match exactly.
Is there any way that I would be able to add columns to the NumPy array sequentially?
Thank you.
EDIT
I learned that my initial question didn't contain enough information for others to offer help, so I'm updating it to make up for that.
My main objective is to write a program that selects features in a greedy fashion. Basically, I'm trying to take the design matrix some_data, a (100, 100) matrix with floating point entries, and fit a linear regression model with an increasing number of features until I find the best set of features.
For example, since I have a total of 100 features, the first round would fit the model on each of the 100 features, select the best one and store it, then continue with the remaining 99.
That's what I'm trying to do in my head, but I got stuck from the beginning with the problem I mentioned.
You start with a (0,)-shaped array and an (n,)-shaped one:
In [482]: arr1 = np.array([])
In [483]: arr1.shape
Out[483]: (0,)
In [484]: arr2 = np.array([1,2,3])
In [485]: arr2.shape
Out[485]: (3,)
np.append uses concatenate (but with some funny business when axis is not provided):
In [486]: np.append(arr1, arr2)
Out[486]: array([1., 2., 3.])
In [487]: np.append(arr1, arr2,axis=0)
Out[487]: array([1., 2., 3.])
In [489]: np.concatenate([arr1, arr2])
Out[489]: array([1., 2., 3.])
And trying axis=1
In [488]: np.append(arr1, arr2,axis=1)
---------------------------------------------------------------------------
AxisError Traceback (most recent call last)
<ipython-input-488-457b8657453e> in <module>()
----> 1 np.append(arr1, arr2,axis=1)
/usr/local/lib/python3.6/dist-packages/numpy/lib/function_base.py in append(arr, values, axis)
4526 values = ravel(values)
4527 axis = arr.ndim-1
-> 4528 return concatenate((arr, values), axis=axis)
AxisError: axis 1 is out of bounds for array of dimension 1
Look at the whole message - the error occurs in the concatenate step. You can't concatenate 1d arrays along axis=1.
Using np.append or even np.concatenate iteratively is slow (it creates a new array each time) and hard to initialize correctly. It is a poor substitute for the widely used append-to-an-empty-list recipe, shown below.
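For reference, that recipe collects pieces in a plain Python list and concatenates once at the end. A sketch, with some_data standing in for the (100, 100) matrix from the question:
import numpy as np

some_data = np.random.random((100, 100))

cols = []                        # list append is cheap; no array copies yet
for j in range(some_data.shape[1]):
    cols.append(some_data[:, j])

array1 = np.stack(cols, axis=1)  # one concatenate at the very end
print(array1.shape)              # (100, 100)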
np.c_ is also just a cover function for concatenate.
There isn't just one empty array. np.array([[]]) and np.array([[[]]]) also have 0 elements.
If you want to add a column to an array, you need to start with a 2d array, and the column also needs to be 2d.
Here's an example of a proper concatenation of 2 2d arrays:
In [490]: np.concatenate([ np.zeros((3,0),int), np.arange(3)[:,None]], axis=1)
Out[490]:
array([[0],
[1],
[2]])
column_stack is another cover function for concatenate that makes sure the inputs are 2d. But even with that, getting an initial 'empty' array is tricky.
In [492]: np.column_stack([np.zeros(3,int), np.arange(3)])
Out[492]:
array([[0, 0],
[0, 1],
[0, 2]])
In [493]: np.column_stack([np.zeros((3,0),int), np.arange(3)])
Out[493]:
array([[0],
[1],
[2]])
np.c_ is a lot like column_stack, though implemented in a different way:
In [496]: np.c_[np.zeros(3,int), np.arange(3)]
Out[496]:
array([[0, 0],
[0, 1],
[0, 2]])
The basic message is that when using np.concatenate, you need to pay attention to dimensions. Its variants allow you to fudge things a bit, but you really need to understand that fudging to get things right, especially when starting from the poorly defined idea of an 'empty' array.
I usually use the concatenate method and do it like this:
# Some stuff
alldata = None
....
array1 = np.random.random((100,1))
if alldata is None: alldata = array1
...
array2 = np.random.random((100,1))
alldata = np.concatenate((alldata,array2),axis=1)
In case, you are working with vectors:
alldata = None
....
array1 = np.random.random((100,))
if alldata is None: alldata = array1[:,np.newaxis]
...
array2 = np.random.random((100,))
alldata = np.concatenate((alldata,array2[:,np.newaxis]),axis=1)
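Since the question's some_data already has a known final shape, a third option is to skip concatenation entirely and preallocate. A sketch (in the real greedy feature selection the column order would be chosen by the fitting loop, not sequential):
import numpy as np

some_data = np.random.random((100, 100))

alldata = np.empty_like(some_data)   # allocate the full (100, 100) result once
for j in range(some_data.shape[1]):
    alldata[:, j] = some_data[:, j]  # fill columns in place, no repeated copies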
Say I have a rank-k tensor X of shape [n1, n2, ..., nk] and a rank-(k-1) tensor IDX of shape [n2, n3, ..., nk], i.e. IDX has the same shape as the last (k-1) dimensions of X. The entries of IDX are all integers in [0, n1). I would like to fetch values from X where the position along the first dimension is specified by IDX and all the other dimensions are iterated through.
Example:
X = tf.constant([[[1, 2], [3, 4], [5, 6]],
                 [[7, 8], [9, 10], [11, 12]]]) # 2 x 3 x 2 tensor
IDX = tf.constant([[1,0], [1,1], [0,1]]) # 3 x 2 tensor
...
# would like to get [[7,2],[9,10],[5,12]]
How to achieve this in Tensorflow efficiently? Thanks!
You can wrap np.choose() in a Python function and embed it in your TensorFlow graph with tf.py_func(). But you would also need to define a gradient for your function if you want automatic gradient computation through the graph to be available for training. Defining a gradient for np.choose() might be a very tricky task, I suppose, if it is solvable at all.
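For what it's worth, here is a sketch that stays inside the graph (and keeps gradients) using tf.gather_nd: build one index triple (IDX[i, j], i, j) per output element. It assumes the 2 x 3 x 2 example from the question:
import tensorflow as tf

X = tf.constant([[[1, 2], [3, 4], [5, 6]],
                 [[7, 8], [9, 10], [11, 12]]])  # shape (2, 3, 2)
IDX = tf.constant([[1, 0], [1, 1], [0, 1]])     # shape (3, 2)

i, j = tf.meshgrid(tf.range(3), tf.range(2), indexing='ij')  # row/col grids, shape (3, 2)
indices = tf.stack([IDX, i, j], axis=-1)        # shape (3, 2, 3): (IDX[i,j], i, j)
result = tf.gather_nd(X, indices)               # [[7, 2], [9, 10], [5, 12]]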
Did you see the note for choose?
Notes
To reduce the chance of misinterpretation, even though the following
"abuse" is nominally supported, choices should neither be, nor be
thought of as, a single array, i.e., the outermost sequence-like container
should be either a list or a tuple.
That is, they want you to treat it like:
In [432]: list(X)
Out[432]: [array([1, 2]), array([3, 4]), array([5, 6])]
In [433]: np.choose(IDX,list(X))
Out[433]: array([3, 6])
The indexing equivalent is:
In [436]: X[IDX,np.arange(2)]
Out[436]: array([3, 6])
choose also has some mode options.
The docs also say it's equivalent to (minus these mode issues):
np.choose(a,c) == np.array([c[a[I]][I] for I in ndi.ndindex(a.shape)])
Another nuance with choose: it can't work with more than 32 choices.
In [440]: np.choose(IDX,np.ones((33,2)))
...
ValueError: Need at least 1 and at most 32 array objects.
In [442]: np.ones((33,2))[IDX,np.arange(2)]
Out[442]: array([ 1., 1.])
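On NumPy 1.15+, np.take_along_axis performs the same selection without the 32-choice limit. A sketch on the small example above:
import numpy as np

X = np.array([[1, 2], [3, 4], [5, 6]])
IDX = np.array([1, 2])

# pick X[IDX[j], j] for each column j; indices must have the same ndim as X
out = np.take_along_axis(X, IDX[None, :], axis=0)[0]
print(out)  # [3 6]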
So I'm new to Python AND data analysis, but have been tasked with creating a scatter plot. The data set that I'm using has many elements containing None values. When I use the polyfit method to create a trendline (best-fit line), I get errors for the Nones.
I've tried using lists and numpy arrays, with dismal results. I've also tried masked_array, masked_invalid, etc., in MULTIPLE configurations, but it kept giving me an array filled with Nones.
Is there a way of creating a trendline such that I don't need to remove the elements that have None values? I need them to keep my plot dimensions correct. I'm using Python 2.7. This is what I've got so far:
import matplotlib.pyplot as plt
import numpy as np
import numpy.ma as ma
import pylab
#The InterpolatedUnivariateSpline method popped up during my endeavor
#to extrapolate the trendline through the gaps in data.
#To be honest, I don't think its doing anything for me...
from scipy.interpolate import InterpolatedUnivariateSpline
fig, ax = plt.subplots(1,1)
ax.scatter(y, dbm, color = 'purple', marker = 'o', s = 100)
plt.xlim(min(y), max(y))
plt.xlabel('Temp - C')
dbm_array = np.asarray(dbm) #dbm and y are lists earlier in the program
y_array = np.asarray(y)
x = np.linspace(min(y), max(y), len(y))
order = 1
s = InterpolatedUnivariateSpline(y, dbm, k=order)
blah = s(x)
plt.plot(y, blah, '--k')
This gives me the scatter plot without the trendline, for some reason. No errors, so I guess I've got that going for me...
Thank you so much in advance!
First of all, if you have arrays, there should be no Nones in them, just nans. This is because None is an object that cannot be expressed as a number. So the first problem may be here. Let's have a look:
import numpy as np
a = np.array([None, 1, 2, 3, 4, None])
What do we get?
>>> a
array([None, 1, 2, 3, 4, None], dtype=object)
This is most certainly not what we wanted. It is an array of objects, which is most of the time not very useful. You cannot perform any calculations on it:
>>> 2*a
TypeError: unsupported operand type(s) for *: 'int' and 'NoneType'
This happens because the element-wise multiplication tries to multiply 2*None.
So, what you really want to have is:
>>> a = np.array([np.nan, 1, 2, 3, 4, np.nan])
>>> a
array([ nan, 1., 2., 3., 4., nan])
>>> a.dtype
dtype('float64')
>>> 2 * a
array([ nan, 2., 4., 6., 8., nan])
Now everything works as expected.
So, the first thing is to check that your input arrays have the correct form. If you then have problems with curve fitting, you may create an array without the nasty nans in there:
import numpy as np
a = np.array([[0,np.nan], [1, 1], [2, 1.5], [3.2, np.nan], [4, 5]])
b = a[~np.isnan(a[:,1])]
Let's see the contents of a and b:
>>> a
array([[ 0. , nan],
[ 1. , 1. ],
[ 2. , 1.5],
[ 3.2, nan],
[ 4. , 5. ]])
>>> b
array([[ 1. , 1. ],
[ 2. , 1.5],
[ 4. , 5. ]])
And this is what you want. The curve is fitted with b without any nans which have the habit of migrating around and making the results of calculations nans. (This is by design.)
How does this work, then? np.isnan(a[:,1]) returns a boolean array with True at each position with a nan in column 1 of a and False for each valid number. As this is exactly the opposite of what we want, we negate it by putting the ~ operator in front. Then the indexing picks only the rows that have numbers.
In case you have your X data and Y data in two different 1-D vectors, do this:
# original y data: Y
# original x data: X
# both have the same length
# calculate a mask to be used (a boolean vector)
msk = ~np.isnan(Y)
# use the mask to plot both X and Y only at the points where Y is not NaN
plt.plot(X[msk], Y[msk])
In some cases you may not have the X data at all, but you would like to number the points from, e.g. 0 onwards (as matplotlib does if you only give it one vector). There are a couple of possibilities, but this is one:
msk = ~np.isnan(Y)
X = np.arange(len(Y))
plt.plot(X[msk], Y[msk])
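Putting this together for the original trendline problem, here is a sketch with made-up sample values (the real y and dbm come from earlier in the asker's program):
import numpy as np
import matplotlib.pyplot as plt

y = np.array([10., 15., 20., 25., 30.])             # temperatures (x-axis)
dbm = np.array([-50., np.nan, -47., -45., np.nan])  # readings with gaps

msk = ~np.isnan(dbm)
coeffs = np.polyfit(y[msk], dbm[msk], 1)  # fit a line on valid points only
x = np.linspace(y.min(), y.max(), 100)    # full x-range keeps plot dimensions
plt.scatter(y, dbm, color='purple', marker='o', s=100)
plt.plot(x, np.polyval(coeffs, x), '--k')
plt.show()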