Time Series Clustering of Numpy Objects - python

Any idea or suggestion would be appreciated! I have several NumPy objects of the same form (u1, u2, u3, ...). Each of them looks like this:
Object 1:
[[Timestamp('2004-02-28 00:59:16'), 19.9884],
[Timestamp('2004-02-28 01:03:16'), 19.3024],
...
[Timestamp('2004-02-28 01:06:16'), 19.1652]]
Object 2:
[[Timestamp('2004-02-28 01:08:17'), 19.567],
[Timestamp('2004-02-28 01:10:16'), 19.5376],
...
[Timestamp('2004-02-28 01:26:47'), 19.4788]]
I would like to find which of these objects show the same "trends" in their time series by clustering them. I tried several approaches, including:
from sklearn.neighbors import NearestNeighbors
X = np.array([u1, u2, u3])
nbrs = NearestNeighbors(n_neighbors=2, algorithm='ball_tree').fit(X)
distances, indices = nbrs.kneighbors(X)
print(distances)
Some of my errors:
TypeError: float() argument must be a string or a number, not 'Timestamp'
ValueError: setting an array element with a sequence.
TypeError: only size-1 arrays can be converted to Python scalars
Conclusion
Can someone at least suggest what I should do? Thanks!

(1) Your first error means that each Timestamp must be converted to a number before scikit-learn can use it. Use the .value attribute, which gives nanoseconds since the Unix epoch (1970-01-01); dividing by 1e9 converts that to seconds. For lists:
u1 = list(map(lambda el: (el[0].value / 1e9, el[1]), u1))
u2 = list(map(lambda el: (el[0].value / 1e9, el[1]), u2))
...
(2) np.array([u1, u2, u3]) produces a 3D array instead of the 2D array of shape (n_samples, n_features) that scikit-learn estimators expect. This is probably the cause of the second error (a number was expected, but a sequence was found because of the extra dimension). Replace it with one of the following:
X = np.array(u1 + u2 + ...) # for lists
X = pd.concat([u1, u2, ...], axis=0) # for dataframes
The revised code can run. Output using your sample data:
[[ 0. 240.00098041]
[ 0. 180.00005229]
[ 0. 121.00066712]
[ 0. 119.00000363]
[ 0. 119.00000363]
[ 0. 991.00000174]]
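For reference, here is a minimal sketch of steps (1) and (2) together, assuming the objects are lists of [Timestamp, value] pairs as shown in the question. The u1 and u2 below are stand-ins rebuilt from the sample rows (the elided rows are left out), and to_numeric is a hypothetical helper name, not something from the question. It only checks the array shape and one Euclidean distance by hand instead of calling NearestNeighbors.

```python
import numpy as np
import pandas as pd

# Stand-ins for u1 and u2, rebuilt from the sample rows in the question.
u1 = [[pd.Timestamp('2004-02-28 00:59:16'), 19.9884],
      [pd.Timestamp('2004-02-28 01:03:16'), 19.3024],
      [pd.Timestamp('2004-02-28 01:06:16'), 19.1652]]
u2 = [[pd.Timestamp('2004-02-28 01:08:17'), 19.567],
      [pd.Timestamp('2004-02-28 01:10:16'), 19.5376],
      [pd.Timestamp('2004-02-28 01:26:47'), 19.4788]]


def to_numeric(series):
    """Hypothetical helper: replace each Timestamp with seconds since the Unix epoch."""
    return [(ts.value / 1e9, reading) for ts, reading in series]


# Concatenate the lists so that every row is one (seconds, reading) sample.
X = np.array(to_numeric(u1) + to_numeric(u2))
print(X.shape, X.dtype)

# Euclidean distance between the first two samples, computed by hand.
first_gap = np.linalg.norm(X[0] - X[1])
print(first_gap)
```

The distance between the first two rows should come out close to the 240.00098041 in the output above, since the timestamps are 240 seconds apart and the readings differ by about 0.686.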

Related

How to multiply a number with negative power in python

When I try to multiply this by a negative integer it just returns an error
I use:
A = np.array([[1,2,0], [2,4,-2], [0,-2,3]])
From the screenshot, I can see this is homework.
So it asks for the matrix inverse. In maths this is written as A^(-1)
import numpy as np
A = np.array([[1,2,0], [2,4,-2], [0,-2,3]])
np.linalg.inv(A)
array([[-2. , 1.5 , 1. ],
[ 1.5 , -0.75, -0.5 ],
[ 1. , -0.5 , 0. ]])
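If you want to double-check the result, a quick sketch like the following should do. It multiplies A by its inverse and compares against the identity, and also shows np.linalg.matrix_power, which accepts a negative exponent. The names A_inv and A_pow are just for illustration.

```python
import numpy as np

A = np.array([[1, 2, 0], [2, 4, -2], [0, -2, 3]])
A_inv = np.linalg.inv(A)

# A times its inverse should be the identity, up to rounding error.
print(np.allclose(A @ A_inv, np.eye(3)))

# matrix_power with a negative exponent inverts the matrix first.
A_pow = np.linalg.matrix_power(A.astype(float), -1)
print(np.allclose(A_pow, A_inv))
```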
In NumPy, you cannot raise an integer array to a negative integer power; doing so raises ValueError: Integers to negative integer powers are not allowed.
In plain Python, the ** operator returns the value without any error:
In [6]: A = 20
In [7]: print(A ** -1)
0.05
You can also use pow(),
In [1]: A = 20
In [2]: pow(20, -1)
Out[2]: 0.05
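To see the NumPy side of this, here is a small sketch with made-up example values. Note that ** on an array works element by element, so this gives reciprocals of the entries, not a matrix inverse.

```python
import numpy as np

ints = np.array([1, 2, 4])  # example values, integer dtype

error_message = None
try:
    ints ** -1
except ValueError as exc:
    # NumPy refuses integer arrays raised to negative integer powers.
    error_message = str(exc)
print("integer array:", error_message)

# Converting to float first makes the element-wise power work.
reciprocals = ints.astype(float) ** -1
print(reciprocals)
```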
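If you're working with matrices, one option is to use the numpy.matrix type rather than the more generic numpy.ndarray. Note that the NumPy documentation no longer recommends numpy.matrix, even for linear algebra, so np.linalg.inv on a regular array is the safer choice in new code.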
import numpy as np
M = np.matrix([[ ... ]])
To convert an existing generic array to a matrix you can also pass it into np.asmatrix().
Once you have a matrix instance M, one way to get the inverse is M.I
To avoid the "integers not allowed" problem, ensure that the dtype of your matrix is floating-point, not integer (specify dtype=float in the call to matrix() or asmatrix())
To use a negative power, store the exponent in a variable (called exponent here, because naming it pow would shadow the built-in pow() function) and use it with the ** operator:
exponent = -3
value = 5**exponent
print(value)
Run the code and you will see the result.
Hope it helps... 🤗🤗🤗

sum in python where list of list are in exponential

I would like to write the following summation in python
The coefficients are given as the following list
cn=[1.2,0.4,0.6]
vn=[1e-6,5e-5,1e-6]
gn=[4.5e3,6.5e3,9e3]
t=np.linspace(0,10,100)
I tried the following
import numpy as np
cn=[1.2,0.4,0.6]
vn=[1e-6,5e-5,1e-6]
gn=[4.5e3,6.5e3,9e3]
t=np.linspace(0,10,100)
yt=np.sum(cn *np.exp(-vn*(t-gn)**2))
but am getting the error
TypeError: bad operand type for unary -: 'list'
I would like to know where am getting it wrong or how to do this task
This runs:
import numpy as np
cn=np.array([1.2,0.4,0.6])
vn=np.array([1e-6,5e-5,1e-6])
gn=np.array([4.5e3,6.5e3,9e3])
t=np.linspace(0,10,3)
yt=np.sum(cn * np.exp(-vn * (t - gn)**2))
Transform the lists into numpy arrays.
Make sure the array shapes are compatible (i.e. you can't add arrays of different lengths unless they can be broadcast).
Example:
Add int to python list:
cn=[1.2,0.4,0.6]
cn+1
# output: TypeError: can only concatenate list (not "int") to list
Add int to numpy array:
cn=np.array([1.2,0.4,0.6])
cn+1
# output: array([2.2, 1.4, 1.6])
Add numpy arrays with different dimensions:
cn = np.arange(1,3)
t = np.arange(1,100)
cn + t
# output: ValueError: operands could not be broadcast together with shapes (2,) (99,)
Add numpy arrays with the same dimensions:
cn = np.arange(1,3)
t = np.arange(3,5)
cn + t
# output: array([4, 6])
Here is a lazy way of fixing it:
yt=np.sum(cn *np.exp(0-vn*(np.c_[t]-gn)**2), 1)
                     ^     ^------^        ^-^
I've highlighted the changes. The most important change is the np.c_ which does two things:
1) It converts t to an array.
2) It makes t a column vector.
Point 1) serves as a "germ" for converting all the other lists to arrays via overloaded arithmetic operators.
Exception: the unary - in front of vn hits vn before it gets the chance to become an array. We put a zero in front of the - to make it binary, thereby reducing its precedence and closing the array coercion chain. This is not the most obvious fix but the one involving the least typing.
Point 2) separates the time dimension from the summation dimension, which is likely the correct interpretation. We have to add an explicit axis argument to the sum, which is the 1 we inserted at the very end of the expression.
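A more explicit version of the same idea might look like this. It assumes the intended formula is y(t) = sum over n of cn * exp(-vn * (t - gn)**2), evaluated at every t; the formula itself is not shown in the question, so treat that as a guess. The intermediate name terms is only for illustration.

```python
import numpy as np

cn = np.array([1.2, 0.4, 0.6])
vn = np.array([1e-6, 5e-5, 1e-6])
gn = np.array([4.5e3, 6.5e3, 9e3])
t = np.linspace(0, 10, 100)

# t[:, None] has shape (100, 1); subtracting gn (shape (3,)) broadcasts to (100, 3).
terms = cn * np.exp(-vn * (t[:, None] - gn) ** 2)

# Sum over the coefficient axis, leaving one value per time point.
yt = terms.sum(axis=1)
print(yt.shape)
```

Here t[:, None] plays the same role as np.c_[t] (it makes t a column), and because vn is already an array the unary minus works without the 0- trick.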
I found two issues, which I fixed, but I am not sure the result is what you intended.
You don't need to convert the lists to numpy arrays, because arithmetic between an ndarray and a list yields an ndarray.
The two errors were:
1. the shape of t did not match the other arrays
2. you were trying to negate a Python list, which doesn't support unary minus
Also, since tn does not appear in your summation formula above, I doubt that you really want t to have length 3.
import numpy as np
cn=[1.2,0.4,0.6]
vn=[1e-6,5e-5,1e-6]
gn=[4.5e3,6.5e3,9e3]
t=np.linspace(0,10,3) # t had 100 points, which did not match the 3-element lists
yt=np.sum(cn *np.exp(-(vn*(t-gn)**2))) # negate the bracketed array result, not the list vn

How to generate numbers between 0-1 in 4200 steps

I want to generate floating point numbers between 0 and 1 that are not random. I would like the range to consist of 4200 values, so in Python I computed 1/4200 to get the step size needed to go from 0 to 1 in 4200 steps. This gave me 0.0002380952380952381, which I confirmed by checking that 0.0002380952380952381*4200 = 1 (in Python). I have tried:
y_axis = [0.1964457, 0.20904465, 0.22422191, 0.68414455, 0.5341106, 0.49412863]
x1 = [0.18536805, 0.22449078, 0.26378343 ,0.73328144 ,0.63372454, 0.60280087,0.49412863]
y2_axis = [0.18536805 0.22449078 0.26378343 ... 0.73328144 0.63372454 0.60280087] 0.49412863]
plt.plot(pl.frange(0,1,0.0002380952380952381) , y_axis)
plt.plot(x1,y2)
This returns: ValueError: x and y must have same first dimension, but have shapes (4201,) and (4200,)
I would like help resolving this; any other method that works would also be appreciated. I am sure other solutions are available and that this may be long-winded. Thank you.
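To generate the numbers, you can use a list comprehension (note that this yields 4201 values, because both 0 and 1 are included):
[i/4200 for i in range(4201)]
Numpy makes this really easy; linspace returns exactly the requested number of points, with both endpoints included by default: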
>>> import numpy as np
>>> np.linspace(0, 1, 4200)
array([ 0.00000000e+00, 2.38151941e-04, 4.76303882e-04, ...,
9.99523696e-01, 9.99761848e-01, 1.00000000e+00])
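Since the original error was an off-by-one (4201 vs 4200 points), it may help to compare the options side by side. This is only a sketch; the variable names are made up, and which variant you want depends on whether 1.0 itself should be included.

```python
import numpy as np

n = 4200

# Includes both 0 and 1, so the step is 1/(n-1).
both_ends = np.linspace(0, 1, n)

# Includes 0 but stops short of 1, so the step is exactly 1/n.
no_end = np.linspace(0, 1, n, endpoint=False)

# Same values as no_end, built from integers.
from_range = np.arange(n) / n

print(len(both_ends), both_ends[1], both_ends[-1])
print(len(no_end), no_end[1], no_end[-1])
```

All three have exactly 4200 entries, so any of them can be plotted against a y array of length 4200.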

Error "TypeError: type numpy.ndarray doesn't define __round__ method"

import numpy
......
# Prediction
predictions = model.predict(X_test)
# round predictions
rounded = [round(x) for x in predictions]
print(rounded)
"predictions" is a list of decimals between [0,1] with sigmoid output.
Why does it always report this error:
File "/home/abigail/workspace/ml/src/network.py", line 41, in <listcomp>
rounded = [round(x) for x in predictions]
TypeError: type numpy.ndarray doesn't define __round__ method
If I don't use round, it prints the decimals correctly. This round should be the Python built-in function, so why does it have anything to do with numpy?
Edited:
for x in predictions:
    print(x, end=' ')
The output is:
[ 0.79361773] [ 0.10443521] [ 0.90862566] [ 0.10312044] [ 0.80714297]
[ 0.23282401] [ 0.1730803] [ 0.55674052] [ 0.94095331] [ 0.11699325]
[ 0.1609294]
TypeError: type numpy.ndarray doesn't define __round__ method
You tried applying the built-in round to a numpy.ndarray, which isn't supported here.
Use numpy.round instead:
rounded = [numpy.round(x) for x in predictions]
x is a numpy array. You can also iterate over its elements:
rounded = [round(y) for x in predictions for y in x]
What is model? From what module? It looks like predictions is a 2d array. What is predictions.shape? The error indicates that the x in [x for x in predictions] is an array. It may be a single-element array, but it is nevertheless an array. You could try [x.shape for x in predictions] to see the shape of each element (row) of predictions.
I haven't had much occasion to use round, but evidently the Python function delegates the action to a .__round__ method (much as + delegates to __add__).
In [932]: round?
Docstring:
round(number[, ndigits]) -> number
Round a number to a given precision in decimal digits (default 0 digits).
This returns an int when called with one argument, otherwise the
same type as the number. ndigits may be negative.
Type: builtin_function_or_method
In [933]: x=12.34
In [934]: x.__round__?
Docstring:
Return the Integral closest to x, rounding half toward even.
When an argument is passed, work like built-in round(x, ndigits).
Type: builtin_function_or_method
In [935]: y=12
In [936]: y.__round__?
Docstring:
Rounding an Integral returns itself.
Rounding with an ndigits argument also returns an integer.
Type: builtin_function_or_method
Python integers have a different implementation than Python floats.
Python lists and strings don't define this method, so [1,2,3].__round__ raises AttributeError: 'list' object has no attribute '__round__', and round([1,2,3]) raises a TypeError like the one in the question.
The same goes for an ndarray. But numpy has defined a np.round function, and a numpy array has a .round method.
In [942]: np.array([1.23,3,34.34]).round()
Out[942]: array([ 1., 3., 34.])
In [943]: np.round(np.array([1.23,3,34.34]))
Out[943]: array([ 1., 3., 34.])
help(np.around) gives the fullest documentation of the numpy version(s).
===================
From your last print I can reconstruct part of your predictions as:
In [955]: arr = np.array([[ 0.79361773], [ 0.10443521], [ 0.90862566]])
In [956]: arr
Out[956]:
array([[ 0.79361773],
[ 0.10443521],
[ 0.90862566]])
In [957]: for x in arr:
...: print(x, end=' ')
...:
[ 0.79361773] [ 0.10443521] [ 0.90862566]
arr.shape is (3,1) - a 2d array with 1 column.
np.round works fine, without needing the iteration:
In [958]: np.round(arr)
Out[958]:
array([[ 1.],
[ 0.],
[ 1.]])
The iteration is what produces your error:
In [959]: [round(x) for x in arr]
TypeError: type numpy.ndarray doesn't define __round__ method
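Putting that together, a loop-free version could look like the sketch below. The predictions array here is a stand-in built from the printed values in the question, assuming model.predict returns shape (n, 1); the real array will be longer.

```python
import numpy as np

# Stand-in for model.predict(X_test): a 2d array with one column.
predictions = np.array([[0.79361773], [0.10443521], [0.90862566]])

# Round the whole array at once, flatten to 1-D, and convert to Python ints.
rounded = np.round(predictions).ravel().astype(int).tolist()
print(rounded)  # [1, 0, 1]
```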
I encountered the same error when I was trying the tutorial of Keras.
At first, I tried
rounded = [numpy.round(x) for x in predictions]
but it showed the result like this:
[array([1.], dtype=float32), array([0.],dtype=float32), ...]
then I tried this:
rounded = [float(numpy.round(x)) for x in predictions]
it showed the right outputs.
I think each numpy.round(x) call returns a one-element ndarray, whose printed form includes the dtype parameter; the values themselves are correct. Converting each element to float therefore gives the same output as the tutorial.
My machine is Linux Mint 17.3(ubuntu 14.04) x64, and python interpreter is python 3.5.2, anaconda3(4.1.1), numpy 1.11.2
You're using a function that stores its values with NumPy: predictions is a NumPy array rather than a regular Python list. Machine-learning libraries generally do this because NumPy stores large amounts of data far more efficiently than an ordinary Python list. You can refer to the following documentation to convert it to a regular list, which you can then use in a comprehension:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.tolist.html
Edit:
What happens if you try:
for x in predictions:
    for y in x:
        print(y, end=' ')
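As a rough sketch of the tolist route mentioned above: flatten first, convert to plain Python floats, and then the built-in round works as originally intended. The predictions array is again a stand-in taken from the printed values in the question.

```python
import numpy as np

# Stand-in for model.predict(X_test): a 2d array with one column.
predictions = np.array([[0.79361773], [0.10443521], [0.90862566]])

# ravel() flattens (n, 1) to (n,); tolist() yields plain Python floats.
values = predictions.ravel().tolist()

# The built-in round now sees ordinary floats, not arrays.
rounded = [round(v) for v in values]
print(rounded)  # [1, 0, 1]
```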
This was driving me nuts too. I had stored a reference to a scipy function with type <class 'scipy.interpolate.interpolate.interp1d'>. This was returning a single value of type <class 'numpy.ndarray'> containing a single float. I had assumed this was actually a float and it propagated back up through my library code until round produced the same error described above.
It was a case of debugging the call stack to check what actual type was being passed on after each function return. I then cast the return value from my original function call along the lines of result = float(interp1d_reference(x)). Then my code behaved as I had expected/wanted.

Numpy array being rounded? subtraction of small floats

I am assigning the elements of a numpy array to be equal to the subtraction of "small" valued, python float-type numbers. When I do this, and try to verify the results by printing to the command line, the array is reported as all zeros. Here is my code:
import numpy as np
np.set_printoptions(precision=20)
pc1x = float(-0.438765)
pc2x = float(-0.394747)
v1 = np.array([0,0,0])
v1[0] = pc1x-pc2x
print pc1x
print pc2x
print v1
The output looks like this:
-0.438765
-0.394747
[0 0 0]
I expected this for v1:
[-0.044018 0 0]
I am new to numpy, I admit, this may be an obvious mis-understanding of how numpy and float work. I thought that changing the numpy print options would fix, but no luck. Any help is great! Thanks!
You're declaring the array with v1 = np.array([0,0,0]), so numpy infers an integer dtype. The array keeps that dtype, so when you assign your small float to an element it is cast to int, which truncates it to 0 (hence all zeros). Declare it with
v1 = np.array([0,0,0],dtype=float)
NumPy has a wealth of NumPy-specific and platform-specific datatypes, which are detailed in the dtype docs page.
You are creating the array with an integer datatype (since you don't specify it, NumPy uses the type of the initial data you gave it). Make it a float:
>>> v1 = np.array([0,0,0], dtype=float)  # the np.float alias was removed in NumPy 1.24
>>> v1[0] = pc1x-pc2x
>>> print v1
[-0.04401800000000000157 0. 0. ]
Or change the incoming datatype:
>>> v1 = np.array([0.0, 0.0, 0.0])
>>> v1[0] = pc1x-pc2x
>>> print v1
[-0.04401800000000000157 0. 0. ]
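The difference is easy to reproduce side by side. The sketch below uses the numbers from the question; v_int and v_float are illustrative names, and np.zeros(3) is just another way to get a float array.

```python
import numpy as np

diff = -0.438765 - (-0.394747)  # about -0.044018

# Integer array: the assigned float is truncated to 0.
v_int = np.array([0, 0, 0])
v_int[0] = diff
print(v_int.dtype, v_int)

# Float array: the value is stored as is.
v_float = np.zeros(3)
v_float[0] = diff
print(v_float.dtype, v_float)
```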
