Kymatio Scattering Transform 1D Error 'The input must be contiguous.' - python

I am starting to work with the Kymatio library, in order to use the Scattering transform as an extractor of 1D signal characteristics. The final idea is to classify 1D signals.
I followed the example available at the link
https://www.kymat.io/gallery_1d/plot_classif_torch.html#sphx-glr-gallery-1d-plot-classif-torch-py
Based on this example, I imported three .mat files that contain the compiled data from the COOLL dataset (https://coolldataset.github.io/). Two variables were imported:
x2 contains the values of the appliance currents. x2 is a matrix with 840 rows and 4 * 8192 columns.
y2 contains the Label list. It has 840 positions, one for each appliance.
I'm trying to calculate the coefficients of the Scattering1D transform for each of the signals that x2 contains. For this, I am doing the following:
T = 32768
J = 8
Q = 12
if use_cuda:
    scattering.cuda()
    x2 = x2.cuda()
    y2 = y2.cuda()
Sx_all = scattering.forward(x2)
When I do this, the following error appears:
RuntimeError Traceback (most recent call last)
<ipython-input-62-26c538d90a70> in <module>()
1 #Sx_all = scattering(x2)
----> 2 Sx_all = scattering.forward(x2)
1 frames
/usr/local/lib/python3.6/dist-packages/kymatio/backend/torch_backend.py in input_checks(x)
9
10 if not x.is_contiguous():
---> 11 raise RuntimeError('The input must be contiguous.')
12
13 def _is_complex(x):
RuntimeError: The input must be contiguous.
This error does not appear when I run the original program, from the example available at https://www.kymat.io/gallery_1d/plot_classif_torch.html#sphx-glr-gallery-1d-plot-classif-torch-py.
What exactly does the error message 'The input must be contiguous' mean, and how do you suggest I fix the problem? I tried to read the library documentation but I still haven't solved the problem.

I believe that I have found the solution, using tensor.contiguous().
x2 = x_all_import['x_all']
x2 = torch.from_numpy(x2)
x2 = x2.contiguous();
y2 = y_all_import['y_all']
y2 = y2.flatten()
y2 = torch.from_numpy(y2)
y2 = y2.contiguous();
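
On what the message means: a tensor is contiguous when its elements sit in memory in row-major order without gaps, and that is what the check in the traceback (`x.is_contiguous()`) tests. A transposed or column-major (Fortran-ordered) array fails it, and that may be the layout of data loaded from .mat files. The sketch below illustrates the idea with NumPy only. The array `x_fortran` is a small made-up stand-in for the imported data, and the statement that `torch.from_numpy` keeps the NumPy memory layout (so the tensor inherits the non-contiguity) comes from general knowledge of PyTorch, not from the question.

```python
import numpy as np

# Hypothetical, small stand-in for the imported data, stored column-major
# (Fortran order) instead of row-major (C order).
x_fortran = np.asfortranarray(np.arange(12, dtype=np.float32).reshape(3, 4))

print(x_fortran.flags["C_CONTIGUOUS"])   # False: rows are not stored back to back
print(x_fortran.strides)                 # (4, 12): moving along a row skips 3 elements

# np.ascontiguousarray copies the values into row-major order; this is the
# NumPy counterpart of calling tensor.contiguous() in PyTorch.
x_rowmajor = np.ascontiguousarray(x_fortran)

print(x_rowmajor.flags["C_CONTIGUOUS"])  # True
print(x_rowmajor.strides)                # (16, 4)
```

The values are identical in both arrays; only the memory layout differs, which is why `.contiguous()` fixes the error without changing the data.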

pyclustering visualising xmeans when the matrix has more than three dimensions

I'm trying to cluster and visualise some data with xmeans from the pyclustering lib.
I copied the code directly from the example in the documentation,
from pyclustering.cluster import cluster_visualizer
from pyclustering.cluster.xmeans import xmeans
from pyclustering.cluster.center_initializer import kmeans_plusplus_initializer
from pyclustering.utils import read_sample
from pyclustering.samples.definitions import SIMPLE_SAMPLES
sample = X # read_sample(SIMPLE_SAMPLES.SAMPLE_SIMPLE3)
# Prepare initial centers - amount of initial centers defines amount of clusters from which X-Means will
# start analysis.
amount_initial_centers = 2
initial_centers = kmeans_plusplus_initializer(sample, amount_initial_centers).initialize()
# Create instance of X-Means algorithm. The algorithm will start analysis from 2 clusters, the maximum
# number of clusters that can be allocated is 20.
xmeans_instance = xmeans(sample, initial_centers, 20)
xmeans_instance.process()
# Extract clustering results: clusters and their centers
clusters = xmeans_instance.get_clusters()
centers = xmeans_instance.get_centers()
# Print total sum of metric errors
print("Total WCE:", xmeans_instance.get_total_wce())
# Visualize clustering results
visualizer = cluster_visualizer()
visualizer.append_clusters(clusters, sample)
visualizer.append_cluster(centers, None, marker='*', markersize=10)
visualizer.show()
The only difference is that I assigned to sample the value of my matrix X instead of loading a sample dataset.
When I try to visualise the clustering result I get this error:
Only objects with size dimension 1 (1D plot), 2 (2D plot) or 3 (3D plot) can be displayed. For multi-dimensional data use 'cluster_visualizer_multidim'.
My X matrix is generated in this way:
features = ["I", "Iu", other 7 column names]
data = df[features]
...
X = scaler.fit_transform(data)
Is there a way to visualise the clusters while plotting only two or three features at a time?
I can't find anything in the documentation.
I tried this:
visualizer.append_clusters(clusters, sample[:,[0,1]])
in order to visualise only the first two features and got this error
Only clusters with the same dimension of objects can be displayed on canvas.
EDIT:
I updated the code as suggested in the answer by annoviko but now I get the following error:
ValueError Traceback (most recent call last)
<ipython-input-69-6fd7d2ce5fcd> in <module>
20 visualizer.append_clusters(clusters, X)
21 visualizer.append_cluster(centers, None, marker='*', markersize=10)
---> 22 visualizer.show(pair_filter=[[0, 1], [0, 2]])
/usr/local/lib/python3.8/site-packages/pyclustering/cluster/__init__.py in show(self, pair_filter, **kwargs)
224 raise ValueError("There is no non-empty clusters for visualization.")
225
--> 226 cluster_data = self.__clusters[0].data or self.__clusters[0].cluster
227 dimension = len(cluster_data[0])
228
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
It is raised by visualizer.show(), and it happens even if I remove the pair_filter from within the function call.
In line with the error that you got:
Only objects with size dimension 1 (1D plot), 2 (2D plot) or 3 (3D plot) can be displayed. For multi-dimensional data use 'cluster_visualizer_multidim'.
You have to use cluster_visualizer_multidim, as the message says. The documentation (pyclustering 0.10.1) has an example: https://pyclustering.github.io/docs/0.10.1/html/dc/d6b/classpyclustering_1_1cluster_1_1cluster__visualizer__multidim.html
For example, if you have data with more than three dimensions (D > 3) and you want to display (x0, x1) and (x0, x2), you can do it in the following way:
from pyclustering.cluster import cluster_visualizer_multidim
visualizer = cluster_visualizer_multidim()
visualizer.append_clusters(clusters, sample_4d)
visualizer.show(pair_filter=[[0, 1], [0, 2]])
Here pair_filter specifies which pairs of features should be shown. In the example above, it shows only (x0, x1) - [0, 1] and (x0, x2) - [0, 2].
So, in your particular case, where you want to display only the first two features, it should be:
visualizer = cluster_visualizer_multidim()
visualizer.append_clusters(clusters, sample)
visualizer.show(pair_filter=[[0, 1]])
I think I should make the error message more readable and have it suggest the other class in its first sentence. Let me know if this helps (if it is still relevant for you).
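
Regarding the ValueError in the question's EDIT: line 226 of the traceback evaluates `self.__clusters[0].data or self.__clusters[0].cluster`, and Python's `or` needs a single truth value, which a multi-element NumPy array refuses to give. The sketch below reproduces only that pattern. `first_non_empty` is a hypothetical helper written for this illustration, not a pyclustering function, and `X` is a made-up stand-in for the scaled matrix.

```python
import numpy as np

# Hypothetical stand-in for the scaled feature matrix X (a NumPy array).
X = np.array([[0.1, 0.2, 0.3, 0.4],
              [0.5, 0.6, 0.7, 0.8]])

def first_non_empty(data, cluster):
    # Same pattern as line 226 in the traceback: `data or cluster`.
    return data or cluster

try:
    first_non_empty(X, [0, 1])
    raised = False
except ValueError:
    # NumPy will not reduce a multi-element array to a single bool.
    raised = True

# A plain list of lists has an ordinary truth value, so the expression works.
X_list = X.tolist()
picked = first_non_empty(X_list, [0, 1])
```

This suggests that passing `X.tolist()` instead of `X` to `append_clusters` may avoid the error. That has not been checked against pyclustering 0.10.1 here.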

I'm unable to transform my DataFrame into a Variable to store data-values/features (Linear Discriminant Analysis)

I'm using LDA to reduce two tables I've created, holds and latency, down from 9 and 18 features respectively (each also has a target column). I am currently trying to pass the features into a variable, but that doesn't seem to be working: I receive a KeyError(1) whenever I do this. My data is perfectly fine, and the code is below. If anyone could tell me what's wrong with it, I'd be very grateful. Here is a tail of both my DataFrames:
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
lda = LDA(n_components=2)
X = holds[[0,1,2,3,4,5,6,7,8]].values
Y = holds[9].values
X2 = latency[[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17]].values
Y2 = latency[9].values
This error has nothing to do with the LDA or scikit-learn in general.
The error is coming from the way you try to index the pandas dataframe that you have.
Use this:
X = holds.iloc[: , [0,1,2,3,4,5,6,7,8]].values
Y = holds.iloc[:, 9].values
Similarly, for X2 and Y2.
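
To illustrate the difference: `holds[[0, 1, ...]]` asks pandas for columns whose labels are the integers 0, 1, ..., while `.iloc` selects by position whatever the columns are called. The following is a minimal sketch with a made-up frame `holds_demo`; the column names are assumptions, since the question does not show the real ones.

```python
import pandas as pd

# Hypothetical stand-in for `holds`: named columns rather than integer labels.
holds_demo = pd.DataFrame({"h0": [1.0, 2.0], "h1": [3.0, 4.0], "target": ["a", "b"]})

try:
    holds_demo[[0, 1]]            # looks for columns *labelled* 0 and 1
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True    # no column carries those labels

# iloc selects by position, independent of the column names.
X_demo = holds_demo.iloc[:, [0, 1]].values   # first two columns as a NumPy array
Y_demo = holds_demo.iloc[:, 2].values        # third column (the target here)
```

Note that if latency really has 18 feature columns followed by the target, the target would sit at position 18 rather than 9. The question does not show the layout, so this is only a guess.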

Why does `scipy.interpolate.griddata` fail for readonly arrays?

I have some data which I try to interpolate using scipy.interpolate.griddata. In my use-case I marked some of the numpy arrays read-only, which apparently breaks the interpolation:
import numpy as np
from scipy import interpolate
x0 = 10 * np.random.randn(100, 2)
y0 = np.random.randn(100)
x1 = np.random.randn(3, 2)
x0.flags.writeable = False
# x1.flags.writeable = False
interpolate.griddata(x0, y0, x1)
yields the following exception:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-14-a6e09dbdd371> in <module>()
6 # x1.flags.writeable = False
7
----> 8 interpolate.griddata(x0, y0, x1)
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/interpolate/ndgriddata.pyc in griddata(points, values, xi, method, fill_value, rescale)
216 ip = LinearNDInterpolator(points, values, fill_value=fill_value,
217 rescale=rescale)
--> 218 return ip(xi)
219 elif method == 'cubic' and ndim == 2:
220 ip = CloughTocher2DInterpolator(points, values, fill_value=fill_value,
scipy/interpolate/interpnd.pyx in scipy.interpolate.interpnd.NDInterpolatorBase.__call__ (scipy/interpolate/interpnd.c:3930)()
scipy/interpolate/interpnd.pyx in scipy.interpolate.interpnd.LinearNDInterpolator._evaluate_double (scipy/interpolate/interpnd.c:5267)()
scipy/interpolate/interpnd.pyx in scipy.interpolate.interpnd.LinearNDInterpolator._do_evaluate (scipy/interpolate/interpnd.c:6006)()
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/interpolate/interpnd.so in View.MemoryView.memoryview_cwrapper (scipy/interpolate/interpnd.c:17829)()
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/interpolate/interpnd.so in View.MemoryView.memoryview.__cinit__ (scipy/interpolate/interpnd.c:14104)()
ValueError: buffer source array is read-only
Clearly, the interpolation function doesn't like that the arrays are write-protected. However, I don't understand why they want to change this – I certainly don't expect my input to be mutated by a call to the interpolation function and this is also not mentioned in the documentation as far as I can tell. Why would the function behave like this?
Note that setting x1 readonly instead of x0 leads to a similar error.
The relevant code is written in Cython, and when Cython requests a memoryview of the input array, it always asks for a writeable one, even if you don't need it.
Since an array flagged as non-writeable will refuse to provide a writeable memoryview, the code fails, even though it didn't need to write to the array in the first place.
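
One possible workaround, assuming the cost of a copy is acceptable, is to pass a writable copy to the interpolation (for example `interpolate.griddata(x0.copy(), y0, x1)`). The sketch below uses NumPy only and does not call SciPy. It shows that the buffer exported by a write-protected array is read-only and that a copy is writable again.

```python
import numpy as np

x0 = 10 * np.random.randn(100, 2)
x0.flags.writeable = False

# The buffer exported by a write-protected array is read-only, so a consumer
# that insists on a writable buffer is refused.
print(memoryview(x0).readonly)       # True

# A copy owns fresh memory and is writable; the original stays protected.
x0_writable = x0.copy()
print(x0_writable.flags.writeable)   # True
print(x0.flags.writeable)            # False
```

The copy holds the same values, so the interpolation result should not change. Whether the exception still occurs at all depends on the SciPy and Cython versions in use.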

warning during py-faster-rcnn training on custom datasets

While training py-faster-rcnn on a custom dataset following the instructions at https://github.com/deboc/py-faster-rcnn/blob/master/help/Readme.md
I encountered some errors like
AttributeError: 'numpy.ndarray' object has no attribute 'toarray' in py-faster-rcnn
which I managed to bypass by editing https://github.com/rbgirshick/py-faster-rcnn/blob/master/lib/roi_data_layer/roidb.py
gt_overlaps = roidb[i]['gt_overlaps']
gt_overlaps = sp.sparse.csr_matrix(gt_overlaps).toarray()
However, during the training process, I received a warning twice
RuntimeWarning: invalid value encountered in log
targets_dw = np.log(gt_widths / ex_widths)
in the file https://github.com/rbgirshick/py-faster-rcnn/blob/master/lib/fast_rcnn/bbox_transform.py
Are the results going to be affected by this?
Do I need to do something different?
Maybe you should try modifying "lib/datasets/pascal_voc.py".
In the function "_load_pascal_annotation()", the coordinates should be read without subtracting 1.
The correct version is:
x1 = float(bbox.find('xmin').text)
y1 = float(bbox.find('ymin').text)
x2 = float(bbox.find('xmax').text)
y2 = float(bbox.find('ymax').text)
The reason is that in your own data x1 or y1 may be 0; subtracting 1 then makes the coordinate negative, which causes the error.
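
As for whether the results are affected: the warning means that `np.log` received a negative ratio and returned NaN for at least one regression target. The sketch below shows the mechanism with made-up boxes. The width formula `xmax - xmin + 1` is assumed to mirror bbox_transform.py, and the corrupted second box is purely illustrative.

```python
import numpy as np

# Hypothetical boxes as (xmin, ymin, xmax, ymax). The second proposal box is
# corrupted so that xmax < xmin, e.g. after a coordinate went out of range.
gt_boxes = np.array([[10.0, 10.0, 50.0, 40.0],
                     [10.0, 10.0, 50.0, 40.0]])
ex_boxes = np.array([[12.0, 8.0, 48.0, 42.0],
                     [60.0, 8.0, 20.0, 42.0]])

gt_widths = gt_boxes[:, 2] - gt_boxes[:, 0] + 1.0
ex_widths = ex_boxes[:, 2] - ex_boxes[:, 0] + 1.0   # second entry is negative

with np.errstate(invalid="ignore"):   # silence the RuntimeWarning for this demo
    targets_dw = np.log(gt_widths / ex_widths)

print(ex_widths)              # [ 37. -39.]
print(np.isnan(targets_dw))   # [False  True]
```

A NaN target would normally propagate into the bounding-box regression loss, so it seems safer to fix the annotations or the loader than to ignore the warning.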

Kalman filter implementation in python for speed estimation

I am trying to implement a Kalman filter in Python to predict speed one step ahead.
The measurement matrix H:
H=np.diag([1,1])
H
Result:
array([[1, 0],
[0, 1]])
For the measurement vector, datafile is a CSV file containing time in one column and speed in another:
measurements=np.vstack((mx,my,datafile.speed))
# length of measurement
m=measurements.shape[1]
print(measurements.shape)
Output: (3, 1069)
Kalman
for filterstep in range(m-1):
    # Time Update
    # =============================
    # Project the state ahead
    x = A*x
    # Project the error covariance ahead
    P = A*P*A.T + Q
    # Measurement Update (correction)
    # ===================================
    # if there is GPS measurement
    if GPS[filterstep]:
        # Compute the Kalman Gain
        S = (H*P*H).T + R
        S_inv = S.inv()
        K = (P*H.T)*S_inv
        # Update the estimate via z
        Z = measurements[:,filterstep].reshape(H.shape[0],1)
        y = Z - (H*x)
        x = x + (K*y)
        # Update the error covariance
        P = (I - (K*H))*P
    # Save states for Plotting
    x0.append(float(x[0]))
    x1.append(float(x[1]))
    Zx.append(float(Z[0]))
    Zy.append(float(Z[1]))
    Px.append(float(P[0,0]))
    Py.append(float(P[1,1]))
    Kx.append(float(K[0,0]))
    Ky.append(float(K[1,0]))
The error is:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-80-9b15fccbaca8> in <module>()
20
21 #Update the estimate via z
---> 22 Z = measurements[:,filterstep].reshape(H.shape[0],1)
23 y=Z-(H*x)
24 x = x + (K*y)
ValueError: total size of new array must be unchanged
How can I fix this error?
This line is incorrect:
S =(H*P*H).T + R
The correct code is:
S =(H*P*H.T) + R
I'm having trouble following what the measurements are. You stated
" array([[1, 0], [0, 1]]) For measurement vector datafile is csv file containing time as one column and speed in another column"
So that reads to me as a CSV file with two columns, one time, and one speed. In that case you have only one measurement at each time, the speed. For a single measurement, your H-matrix should be a row-vector.
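
The shape mismatch behind the ValueError can be sketched as follows: each column of the (3, 1069) measurement array holds three values, but `reshape(H.shape[0], 1)` asks for two. The numbers below are made up, the 2-element state is inferred from the 2x2 H in the question, and the choice of which state component is the speed (the second one) is an assumption. Plain NumPy arrays with `@` are used here instead of the `*` on matrix objects that the question's code implies.

```python
import numpy as np

# One column of the (3, 1069) measurement array from the question: mx, my, speed.
column = np.array([0.0, 0.0, 12.5])       # values are made up

H_square = np.diag([1, 1])                # the 2x2 H from the question
try:
    column.reshape(H_square.shape[0], 1)  # 3 values cannot fill a (2, 1) array
    reshape_failed = False
except ValueError:
    reshape_failed = True

# With speed as the only measurement, H becomes a 1x2 row vector.
H_row = np.array([[0.0, 1.0]])            # assumption: speed is the second state
x = np.array([[0.0], [10.0]])             # assumed 2-element state column vector
Z = np.array([[12.5]])                    # single measurement, shape (1, 1)
y = Z - H_row @ x                         # innovation, shape (1, 1)
```

With a row-vector H, R and S become 1x1 as well, so the number of measurement rows, `H.shape[0]` and the shape of R have to agree.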
