How to plot with dtaidistance - python

I tried an example using dtaidistance and dtw, but it did not plot the result.
from dtaidistance import dtw
from dtaidistance import dtw_visualisation as dtwvis
import numpy as np
s1 = np.array([0., 0, 1, 2, 1, 0, 1, 0, 0, 2, 1, 0, 0])
s2 = np.array([0., 1, 2, 3, 1, 0, 0, 0, 2, 1, 0, 0, 0])
path = dtw.warping_path(s1, s2)
dtwvis.plot_warping(s1, s2, path, filename="warp.png")
It should have looked like the warping plot in the dtaidistance documentation. Unfortunately, nothing showed up.
I tried adding plt.show() at the end of the code (after importing pyplot, of course), but in this case it did not help.
What is the reason it does not plot the graph, like in the example above?

The picture in the example is not meant to be displayed from within Python; it is saved as warp.png to the directory the script is run from.
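If instead you want the figure on screen, here is a minimal sketch, assuming plot_warping returns the Matplotlib figure and axes when filename is omitted (recent dtaidistance versions behave this way, but check the docs for your version):
from dtaidistance import dtw
from dtaidistance import dtw_visualisation as dtwvis
import matplotlib.pyplot as plt
import numpy as np
s1 = np.array([0., 0, 1, 2, 1, 0, 1, 0, 0, 2, 1, 0, 0])
s2 = np.array([0., 1, 2, 3, 1, 0, 0, 0, 2, 1, 0, 0, 0])
path = dtw.warping_path(s1, s2)
# Assumption: with no filename, plot_warping returns (fig, ax)
# instead of writing a PNG, so the plot can be shown interactively.
fig, ax = dtwvis.plot_warping(s1, s2, path)
plt.show()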

How does scipy.ndimage.filters.convolve work when the mode is 'reflect'?

I am trying to figure out how to do this with NumPy, so I can then convert it to C++ from scratch. I have figured out how to do it when the mode is constant. The way that is done is shown below.
import numpy as np
from scipy import signal
a = np.array([[1, 2, 0, 0], [5, 3, 0, 4], [0, 0, 0, 7], [9, 3, 0, 0]])
k = np.array([[1,0,0],[0,1,0],[0,0,0]])
a = np.pad(a, 1)
k = np.flip(k)
output = signal.convolve(a, k, 'valid')
This comes out to the same output as scipy.ndimage.filters.convolve(a, k, mode='constant'). So I thought that when the mode was reflect it would work the same way, except that the line a = np.pad(a, 1) would become a = np.pad(a, 1, mode='reflect'). However, that does not seem to be the case. Could someone explain how it would work from scratch using numpy and scipy.signal.convolve? Thank you.
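One likely explanation, offered here as a hedged note worth verifying against your own outputs: the two libraries use the name reflect for different paddings. scipy.ndimage's mode='reflect' repeats the edge sample (a b c | c b a), which np.pad calls mode='symmetric'; np.pad's mode='reflect' skips the edge sample and corresponds to ndimage's mode='mirror'. A sketch of the same construction with symmetric padding:
import numpy as np
from scipy import signal, ndimage
a = np.array([[1, 2, 0, 0], [5, 3, 0, 4], [0, 0, 0, 7], [9, 3, 0, 0]])
k = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 0]])
# ndimage's 'reflect' repeats the edge sample; np.pad spells that 'symmetric'
a_pad = np.pad(a, 1, mode='symmetric')
output = signal.convolve(a_pad, np.flip(k), 'valid')
# compare against the reference result
print(output)
print(ndimage.convolve(a, k, mode='reflect'))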

How do I find the row echelon form (REF)?

import numpy as np
import sympy as sp
Vec = np.matrix([[1,1,1,5],[1,2,0,3],[2,1,3,12]])
Vec_rref = sp.Matrix(Vec).rref()
print(Vec_rref) ##<-- this code prints the RREF, but I am looking for the code for REF (see below)
I have found plenty of code that solves for the RREF but none for the REF, if that makes sense. The code I have developed gives the following:
(Matrix([
[1, 0, 2, 7],
[0, 1, -1, -2],
[0, 0, 0, 0]]), (0, 1))
I am looking for a code which should solve the following:
      1 X X X
REF = 0 1 X X
      0 0 1 X

and not

       1 0 0 X
RREF = 0 1 0 X
       0 0 1 X
New here, so bear with me, guys. Thanks in advance :-)
You are using SymPy's rref, which computes the reduced row echelon form. You might want to use .echelon_form() instead:
import numpy as np
import sympy as sp
Vec = np.matrix([[1, 1, 1, 5],
                 [1, 2, 0, 3],
                 [2, 1, 3, 12]])
Vec_ref = sp.Matrix(Vec).echelon_form()
print(Vec_ref)
which outputs:
Matrix([[1, 1, 1, 5], [0, 1, -1, -2], [0, 0, 0, 0]])
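Note that echelon_form does not normalize pivots; in this example they happen to be 1 already. If you need leading 1s as in the REF schematic above, a small post-processing step can scale each row. A sketch (normalized_ref is a hypothetical helper, not a SymPy function):
import sympy as sp

def normalized_ref(M):
    # Hypothetical helper: scale each nonzero row of the echelon
    # form so its leading (pivot) entry becomes 1.
    E = M.echelon_form()
    rows = []
    for i in range(E.rows):
        row = E.row(i)
        pivot = next((x for x in row if x != 0), None)
        rows.append(row / pivot if pivot is not None else row)
    return sp.Matrix.vstack(*rows)

print(normalized_ref(sp.Matrix([[1, 1, 1, 5],
                                [1, 2, 0, 3],
                                [2, 1, 3, 12]])))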

Range_color attribute seemingly not working in my code that uses plotly express?

I want to fix the color scale from 0 to 200 using the range_color attribute.
For some reason the range_color attribute does not do this, even though all the values are within the given range. I am also exporting the resulting figure as a PNG using the kaleido backend; could that be overriding it?
This is the section of code that does this. No value in the data array exceeds 200 and none are negative. Why does the range_color attribute not set the scale?
Section of Code that creates the plot and exports it as a png:
fig = px.choropleth(locationmode='USA-states', locations=location, color=data, scope="usa", range_color=(0,200))
fig.write_image("figure.png", engine="kaleido")
Array example:
[4, 2, 0, 8, 2, 1, 0, 0, 1, 1, 42, 17, 0, 0, 6, 0, 0, 1, 0, 0, 1, 2, 104, 0, 5, 0, 1, 0, 0, 1, 0, 0, 0, 0, 54, 21, 0, 36, 8, 4, 8, 0, 10, 0, 0, 7, 47, 0, 0, 0, 29, 21, 1, 0, 0]
Result:
A choropleth map of the USA where the scale does not match the range_color values I put in.
I have tried it over several differing data sets as well and the scale is never made to be 0 to 200.
Edit:
I just tried recreating the code in a different file and it worked, I did nothing differently except not including some of the code that gets the data from a CSV, instead manually creating the same data set. I'm very confused now.
I have created a MWE to define the two arrays that your code snippet depends on.
Clearly range_color is working as expected. This points to the code that sets up these two arrays containing values you do not expect.
import pandas as pd
import numpy as np
import plotly.express as px
# source and structure US states...
df = pd.read_html("https://www.bls.gov/respondents/mwr/electronic-data-interchange/appendix-d-usps-state-abbreviations-and-fips-codes.htm")[0]
df.columns = df.iloc[0]
df = pd.concat([df.iloc[1:-1, 0:3], df.iloc[1:-1,3:6]]).reset_index(drop=True).iloc[0:-1]
# set up arrays used by sample code
location = df["Postal Abbr."].values
data = np.random.randint(1,300, len(location))
fig = px.choropleth(locationmode='USA-states', locations=location, color=data, scope="usa", range_color=(0,200))
# fig.write_image("figure.png", engine="kaleido")
fig  # displays in a notebook; use fig.show() in a script
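One concrete thing worth ruling out (an assumption about the CSV step, which is not shown): if the values come back from the CSV as strings, plotly express treats the color column as categorical and range_color has no effect. A quick sanity check, continuing from the MWE above:
data = np.asarray(data)
print(data.dtype)          # an object/str dtype here means categorical color
data = data.astype(float)  # coerce so plotly uses a continuous color scale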

Scikit image: proper way of counting cells in the objects of an image

Say you have an image in the form of a numpy.array:
vals=numpy.array([[3,24,25,6,2],[8,7,6,3,2],[1,4,23,23,1],[45,4,6,7,8],[17,11,2,86,84]])
And you want to compute how many cells are inside each object, given a threshold value of 17 (example):
from scipy import ndimage
from skimage.measure import regionprops
blobs = numpy.where(vals>17, 1, 0)
labels, no_objects = ndimage.label(blobs)
props = regionprops(blobs)
If you check, this gives an image with 4 distinct objects over the threshold:
In[1]: blobs
Out[1]:
array([[0, 1, 1, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 1, 1, 0],
[1, 0, 0, 0, 0],
[0, 0, 0, 1, 1]])
In fact:
In[2]: no_objects
Out[2]: 4
I want to compute the number of cells (or area) of each object. The intended outcome is a dictionary with the object ID: number of cells format:
size={0:2,1:2,2:1,3:2}
My attempt:
size = {}
for label in props:
    size[label] = props[label].area
Returns an error:
Traceback (most recent call last):
File "<ipython-input-76-e7744547aa17>", line 3, in <module>
size[label]=props[label].area
TypeError: list indices must be integers, not _RegionProperties
I understand I am using label incorrectly, but the intent is to iterate over the objects. How to do this?
A bit of testing and research sometimes goes a long way.
The problem is twofold: blobs does not carry the distinct labels, only 0/1 values, and label needs to be replaced by an iterator looping over range(0, no_objects).
This solution seems to be working:
import numpy
from scipy import ndimage
import skimage.measure as measure

vals = numpy.array([[3,24,25,6,2],[8,7,6,3,2],[1,4,23,23,1],[45,4,6,7,8],[17,11,2,86,84]])
blobs = numpy.where(vals > 17, 1, 0)
labels, no_objects = ndimage.label(blobs)
# blobs is not in an amicable type to be processed right now, so:
labelled = ndimage.label(blobs)
resh_labelled = labelled[0].reshape((vals.shape[0], vals.shape[1]))  # labelled is a tuple: only the first element matters
# here come the props
props = measure.regionprops(resh_labelled)
# here come the sought-after areas
size = {i: props[i].area for i in range(0, no_objects)}
Result:
In[1]: size
Out[1]: {0: 2, 1: 2, 2: 1, 3: 2}
And if anyone wants to check for the labels:
In[2]: labels
Out[2]:
array([[0, 1, 1, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 2, 2, 0],
[3, 0, 0, 0, 0],
[0, 0, 0, 4, 4]])
And if anyone wants to plot the 4 objects found:
import matplotlib.pyplot as plt
plt.set_cmap('OrRd')
plt.imshow(labels, origin='upper')
plt.show()
To answer the original question:
You have to apply regionprops to the labeled image: props = regionprops(labels)
You can then construct the dictionary using:
size = {r.label: r.area for r in props}
which yields
{1: 2, 2: 2, 3: 1, 4: 2}
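If you prefer these results in columnar form, newer scikit-image versions (0.16+, an assumption about the installed version) also provide regionprops_table:
from skimage.measure import regionprops_table
# one array per requested property, keyed by name,
# e.g. {'label': array([1, 2, 3, 4]), 'area': array([2, 2, 1, 2])}
table = regionprops_table(labels, properties=('label', 'area'))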
Note that regionprops will generate a lot more information than just the area of each blob. So, if you are just looking for the pixel count of each blob, as an alternative with a focus on performance, we can use np.bincount on the labels obtained with ndimage.label, like so -
np.bincount(labels.ravel())[1:]
Thus, for the given sample -
In [53]: labeled_areas = np.bincount(labels.ravel())[1:]
In [54]: labeled_areas
Out[54]: array([2, 2, 1, 2])
To have these results in a dictionary, one additional step would be -
In [55]: dict(zip(range(no_objects), labeled_areas))
Out[55]: {0: 2, 1: 2, 2: 1, 3: 2}

How do I perform dimensionality reduction on two independent XOR gates?

Take the probability distribution of a XOR gate in which every configuration is equally probable (configurations are given by outcomes_sub; the probability mass function by pmf_xor_sub):
import numpy as np
import itertools as it
outcomes_sub = [list(item) for item in list(it.product([0,1], repeat=3))]
pmf_xor_sub = np.array([1/4, 0, 0, 1/4, 0, 1/4, 1/4, 0])
Now take the probability distribution corresponding to two uncorrelated such XORs:
outcomes = [outcome1 + outcome2 for (outcome1, outcome2)
            in it.product(outcomes_sub, outcomes_sub)]
pmf_xor = [pmf1 * pmf2 for (pmf1, pmf2)
           in it.product(pmf_xor_sub, pmf_xor_sub)]
And create some data based on it:
indices = np.random.choice(len(outcomes), 10000, p=pmf_xor)
data_xor = np.array([outcomes[index] for index in indices])
data_xor looks like this:
array([[1, 1, 0, 0, 0, 0],
[1, 0, 1, 0, 0, 0],
[0, 1, 1, 1, 1, 0],
...,
[0, 1, 1, 1, 1, 0],
[1, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0]])
I.e., two independent XORs back to back. What's the right way to perform dimensionality reduction on it? PCA won't work (because the dependence is non-linear, right?):
from sklearn import decomposition
pca_xor = decomposition.PCA()
pca_xor.fit(data_xor)
Now, pca_xor.explained_variance_ratio_ gives:
array([ 0.17145045, 0.17018817, 0.16758773, 0.16575979, 0.16410862,
0.16090524], dtype=float32)
No two components stand out. I understand that a non-linear method such as kernel PCA should work here, but I am struggling to find pointers to ways of applying it to my problem.
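Presumably it would look something like this sketch with scikit-learn's KernelPCA, where the RBF kernel and gamma value are just guesses to be tuned, not a known-good setting:
from sklearn.decomposition import KernelPCA
# illustrative only: kernel choice and gamma are assumptions to tune
kpca = KernelPCA(n_components=2, kernel='rbf', gamma=1.0)
data_kpca = kpca.fit_transform(data_xor)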
To give a bit more context: what I am actually after is ways to bring out the structure in data_xor: two big XOR blobs, each of which is composed of some finer-grained stuff. If I am going about it all wrong, feel free to point that out too.
