Plotting value of each node using Python - python

I would like to plot the values of matrix A on y-axis as a function of node number on x-axis. However, since I have a 5x5 matrix, I don't wish to define the node numbers manually. For instance, node 1 corresponds to 2.53734572e-01, node 2 to -1.08940733e-01,..., node 6 to -5.02000098e-01 and so on.
import numpy as np
import matplotlib.pyplot as plt
Node=np.array([[1,2,3,4,5],[6,7,8,9,10]])
A=np.array([[ 2.53734572e-01, -1.08940733e-01, 3.26138649e-03,
-6.10246692e-03, -2.59115145e-02],
[-5.02000098e-01, 1.08933714e-01, -3.65540228e-02,
5.93536044e-03, 3.88767438e-02],
[-1.42775456e+00, 4.52103243e-01, -2.33067190e-02,
7.27554880e-03, 1.15638039e-01],
[ 4.81030592e-01, -8.91302226e-02, 1.40486724e-03,
2.28801066e-02, -3.83389182e-02],
[ 8.39965176e-01, -2.81589587e-01, 2.24843962e-01,
-8.47758268e-03, -6.84721033e-02]])
plt.scatter(Node, A)
plt.xlabel('Node')
plt.ylabel('Velocity')

We can reduce the matrix to one dimension and use numpy.arange on the length of the matrix:
import numpy as np
import matplotlib.pyplot as plt
ys=np.array([[ 2.53734572e-01, -1.08940733e-01, 3.26138649e-03,
-6.10246692e-03, -2.59115145e-02],
[-5.02000098e-01, 1.08933714e-01, -3.65540228e-02,
5.93536044e-03, 3.88767438e-02],
[-1.42775456e+00, 4.52103243e-01, -2.33067190e-02,
7.27554880e-03, 1.15638039e-01],
[ 4.81030592e-01, -8.91302226e-02, 1.40486724e-03,
2.28801066e-02, -3.83389182e-02],
[ 8.39965176e-01, -2.81589587e-01, 2.24843962e-01,
-8.47758268e-03, -6.84721033e-02]]).flatten()
nodes=np.arange(1, len(ys)+1) # node numbers start at 1, as in the question
plt.scatter(nodes, ys)
plt.xlabel('Node')
plt.ylabel('Velocity')
plt.show()
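
If you later need to go from a node number back to its position in the matrix, np.unravel_index does the inverse mapping. This is only a minimal sketch, assuming 1-based node numbers in row-major order as in the question; the small matrix and the helper name node_to_index are illustrative and not part of the original code:

```python
import numpy as np

# Illustrative 2x3 matrix; the 5x5 matrix A from the question works the same way.
A = np.array([[0.25, -0.11, 0.003],
              [-0.50, 0.11, -0.037]])

ys = A.ravel()                       # values in row-major order
nodes = np.arange(1, ys.size + 1)    # node numbers 1..N

def node_to_index(node, shape):
    """Map a 1-based node number to its (row, col) position (hypothetical helper)."""
    return np.unravel_index(node - 1, shape)

row, col = node_to_index(4, A.shape)
print(nodes)
print(row, col, A[row, col])
```

Node 4 is the first element of the second row here, because flattening walks the matrix row by row.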

Related

How to use values of find_peak function Python

I have to analyse a PPG signal. I found something to find the peaks but I can't use the values of the heights. They are stored in like a dictionary array or something and I don't know how to extract the values out of it. I tried using dict.values() but that didn't work.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.signal import savgol_filter
data = pd.read_excel('test_heartpy.xlsx')
arr = np.array(data)
time = arr[1:,0] # time in s
ECG = arr[1:,1] # ECG
PPG = arr[1:,2] # PPG
filtered = savgol_filter(PPG, 251, 3)
plt.plot(time, filtered)
plt.xlabel('Time (in s)')
plt.ylabel('PPG')
plt.grid(True)
The PPG signal looks like this. To search for the peaks I used:
# searching peaks
from scipy.signal import find_peaks
peaks, heights_peak_0 = find_peaks(PPG, height=0.2)
heights_peak = heights_peak_0.values()
plt.plot(PPG)
plt.plot(peaks, np.asarray(PPG)[peaks], "x")
plt.plot(np.zeros_like(PPG), "--", color="gray")
plt.title("PPG peaks")
plt.show()
print(heights_peak_0)
print(heights_peak)
print(peaks)
Printing:
{'peak_heights': array([0.4822998 , 0.4710083 , 0.43884277, 0.46728516, 0.47094727,
0.44702148, 0.43029785, 0.44146729, 0.43933105, 0.41400146,
0.45318604, 0.44335938])}
dict_values([array([0.4822998 , 0.4710083 , 0.43884277, 0.46728516, 0.47094727,
0.44702148, 0.43029785, 0.44146729, 0.43933105, 0.41400146,
0.45318604, 0.44335938])])
[787 2513 4181 5773 7402 9057 10601 12194 13948 15768 17518 19335]
Signal with highlighted peaks looks like this.
heights_peak_0 is the properties dict returned by scipy.signal.find_peaks. The SciPy documentation for find_peaks describes what it contains.
You can extract the array containing all the peak heights with heights_peak_0["peak_heights"]:
# the following will give you an array with the values of peaks
heights_peak_0['peak_heights']
# peaks holds the indices at which find_peaks found peaks in the original signal, so you can also get the peak values this way
PPG[peaks]
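
To make the two access paths concrete, here is a small sketch on a synthetic signal (a sine wave standing in for the PPG data, which is an assumption since the original spreadsheet is not available). It shows that the "peak_heights" entry of the properties dict is an ordinary NumPy array and matches indexing the signal with the peak indices:

```python
import numpy as np
from scipy.signal import find_peaks

# Synthetic stand-in for the PPG signal: three periods of a sine wave.
t = np.linspace(0, 3, 600, endpoint=False)
signal = 0.45 * np.sin(2 * np.pi * t)

peaks, properties = find_peaks(signal, height=0.2)

# The properties dict has a "peak_heights" key because height= was passed.
heights = properties["peak_heights"]

print(peaks)
print(heights)
print(np.allclose(heights, signal[peaks]))
```

Note that "peak_heights" is only present when the height argument is given; signal[peaks] works regardless.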
According to the docs, find_peaks() returns a tuple consisting of the peak indices and a properties dict. Since the peak values can be read directly from the signal at those indices, you can ignore the second element of the tuple and use only the first one.
Assuming you want the 'coordinates' of your peaks, you can combine the peak heights (y-values) with their positions (x-values) like so (based on the first code snippet given in the docs):
import matplotlib.pyplot as plt
from scipy.datasets import electrocardiogram # SciPy >= 1.10; older versions: from scipy.misc import electrocardiogram
from scipy.signal import find_peaks
x = electrocardiogram()[2000:4000]
peaks, _ = find_peaks(x, distance=150)
peaks_x_values = peaks
peaks_y_values = x[peaks]
peak_coordinates = list(zip(peaks_x_values, peaks_y_values))
print(peak_coordinates)
plt.plot(x)
plt.plot(peaks_x_values, peaks_y_values, "x")
plt.show()
Printing:
[(65, 0.705), (251, 1.155), (431, 1.705), (608, 1.96), (779, 1.925), (956, 2.09), (1125, 1.745), (1292, 1.37), (1456, 1.2), (1614, 0.81), (1776, 0.665), (1948, 0.665)]

Cutting Dendrogram/Clustering Tree from SciPy at distance height

I'm trying to learn how to use dendrograms in Python using SciPy. I want to get clusters and be able to visualize them; I heard hierarchical clustering and dendrograms are the best way.
How can I "cut" the tree at a specific distance?
In this example, I just want to cut it at distance 1.6
I looked up a tutorial on https://joernhees.de/blog/2015/08/26/scipy-hierarchical-clustering-and-dendrogram-tutorial/#Inconsistency-Method but the guy did some really confusing wrapper function using **kwargs (he calls his threshold max_d)
Here is my code and plot below; I tried annotating it as best as I could for reproducibility:
from __future__ import print_function
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import dendrogram,linkage,fcluster
from scipy.spatial import distance
np.random.seed(424173239) #43984
#Dims
n,m = 20,7
#DataFrame: rows = Samples, cols = Attributes
attributes = ["a" + str(j) for j in range(m)]
DF_data = pd.DataFrame(np.random.random((n, m)), columns = attributes)
A_dist = distance.cdist(DF_data.values.T, DF_data.values.T) # .as_matrix() was removed in pandas 1.0; .values is the replacement
#(i) . Do the labels stay in place from DF_data for me to do this?
DF_dist = pd.DataFrame(A_dist, index = attributes, columns = attributes)
#Create dendrogram
fig, ax = plt.subplots()
Z = linkage(distance.squareform(DF_dist.values), method="average")
D_dendro = dendrogram(Z, labels = attributes, ax=ax) #create dendrogram dictionary
threshold = 1.6 #for hline
ax.axhline(y=threshold, c='k')
plt.show()
#(ii) How can I "cut" the tree by giving it a distance threshold?
#i.e. If I cut at 1.6 it would make (a5 : cluster_1 or not in a cluster), (a2,a3 : cluster_2), (a0,a1 : cluster_3), and (a4,a6 : cluster_4)
#link_1 says use fcluster
#This -> fcluster(Z, t=1.5, criterion='inconsistent', depth=2, R=None, monocrit=None)
#gives me -> array([1, 1, 1, 1, 1, 1, 1], dtype=int32)
print(
len(set(D_dendro["color_list"])), "^ # of colors from dendrogram",
len(D_dendro["ivl"]), "^ # of labels",sep="\n")
#3
#^ # of colors from dendrogram it should be 4 since clearly (a6, a4) and a5 are in different clusters
#7
#^ # of labels
link_1 : How to compute cluster assignments from linkage/distance matrices in scipy in Python?
color_threshold (a parameter of dendrogram) is what I was looking for. It doesn't really help when the color palette is too small for the number of clusters being generated. I moved the next step to "Bigger color-palette in matplotlib for SciPy's dendrogram (Python)" if anyone can help.
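
For the cluster assignments themselves (part (ii) of the question), fcluster can cut the tree at a distance when called with criterion="distance"; the call shown in the question used criterion="inconsistent", which thresholds a different statistic. The following is only a sketch on random data (the array X and the seed are made up for illustration, so the resulting labels will not match the question's plot):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Illustrative data: 7 items (playing the role of a0..a6) with 20 values each.
rng = np.random.RandomState(0)
X = rng.random_sample((7, 20))

# Condensed pairwise distances -> average-linkage tree, as in the question.
Z = linkage(pdist(X), method="average")

threshold = 1.6  # same role as the horizontal line in the question's plot
labels = fcluster(Z, t=threshold, criterion="distance")

print(labels)               # one cluster id per item
print(len(set(labels)))     # number of flat clusters at this cut
```

Items whose cophenetic distance is at most the threshold end up with the same label. Passing the same value as color_threshold to dendrogram should make the link colors line up with these flat clusters.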
For a bigger color palette this should work:
from scipy.cluster import hierarchy as hc
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as col
# get the color spectrum "gist_ncar" from matplotlib cm.
# A colormap is sampled on the interval from 0 to 1.
# Use smaller steps if you need more than 10 colors.
colors = cm.gist_ncar(np.arange(0, 1, 0.1))
# convert each RGBA color to a hex string
colorlst = [col.to_hex(c) for c in colors]
hc.set_link_color_palette(colorlst) # sets the colors to use
Put all of that in front of your code and it should work.

Matplotlib, how to represent array as image?

This is what I have tried so far
import numpy as np
import matplotlib.pyplot as plt
# read one number per line and truncate it to an int
with open('base.txt','r') as f:
    vst = [int(float(line)) for line in f]
v1=vst[::3] # keep every third value
print(type(v1))
a=np.asarray(v1)
print(len(a))
a11=a.reshape(50,100)
plt.imshow(a11, cmap='hot')
plt.colorbar()
plt.show()
I have a (50,100) array and each element has a numerical value (range 1200-5400). I would like to have an image that represents the array, but I got this:
What should I change to get a proper image?
I don't have the data from base.txt.
However, to simulate your problem, I generated random integers between 1200 and 5500 and put them in a 50 x 100 NumPy array, which I believe is close to your data and requirements.
Then I plotted the data with your plotting code, and I get a true representation of the array.
See if this helps.
Demo Code
import numpy as np
import matplotlib.pyplot as plt
import random
# Generate a list of 5000 ints in the range [1200, 5500)
M = 5000
myList = [random.randrange(1200,5500) for i in range(M)]
# Split into 50 rows of 100 values each
n = 100
newList = [myList[i:i+n] for i in range(0, len(myList), n)]
# Convert to a 50 x 100 numpy array
nArray = np.array(newList)
print(nArray)
a11=nArray.reshape(50,100)
plt.imshow(a11, cmap='hot')
plt.colorbar()
plt.show()

Expanding "pixels" on matplotlib + numpy array

I have created a random data source that looks like this:
This is the code I use to generate and plot the first image.
import pandas as pd
import numpy as np
import numpy.ma as ma
import matplotlib.pyplot as plt
msize=25
rrange=5
jump=3
start=1
dpi=96
h=500
w=500
X,Y=np.meshgrid(range(0,msize),range(0,msize))
dat=np.random.rand(msize,msize)*rrange
msk=np.zeros_like(dat)
msk[start::jump,start::jump].fill(1)
mdat=msk*dat
mdat[mdat==0]=np.nan
mmdat = ma.masked_where(np.isnan(mdat),mdat)
fig = plt.figure(figsize=(w/dpi,h/dpi),dpi=dpi)
cmap = plt.get_cmap('RdYlBu')
cmap.set_bad(color='#cccccc', alpha=1.)
plot = plt.pcolormesh(X,Y,mmdat,cmap=cmap)
plot.axes.set_ylim(0,msize-1)
plot.axes.set_xlim(0,msize-1)
fig.savefig("masked.png",dpi=dpi)
Often this data source isn't so evenly distributed (but this is another subject).
Is there any kind of interpolation that makes the points "spill out" from their positions?
For example, take the light yellow point at (1,1) and give the whole region around it (radius 1 in the taxicab metric plus the diagonals, i.e. the surrounding 3x3 block) the same color/value. This would be done for every valid point in the image; NaNs would not be expanded.
I did this by hand in GIMP for the three lower-left values in this image; the idea is to find a way to do the same for all valid points without using GIMP ;-):
After some thinking I arrived at this solution:
import numpy as np
import matplotlib.pyplot as plt
t=np.array([
[ 0,0,0,0,0,0,0,0 ],
[ 0,0,0,0,0,0,0,0 ],
[ 0,0,2,0,0,4,0,0 ],
[ 0,0,0,0,0,0,0,0 ],
[ 0,0,0,0,0,0,0,0 ],
[ 0,0,3,0,0,1,0,0 ],
[ 0,0,0,0,0,0,0,0 ],
[ 0,0,0,0,0,0,0,0 ]])
def spill(arr, nval=0, m=1):
    narr=np.copy(arr)
    for i in range(arr.shape[0]):
        for j in range(arr.shape[1]):
            if arr[i][j] != nval:
                # clamp the lower bounds so negative indices do not wrap around at the edges
                narr[max(i-m,0):i+m+1, max(j-m,0):j+m+1]=arr[i][j]
    return narr
l=spill(t)
plt.figure()
plt.pcolormesh(t)
plt.savefig("notspilled.png")
plt.figure()
plt.pcolormesh(l)
plt.savefig("spilled.png")
plt.show()
This solution didn't make me very happy because of the double for loop inside the spill() function :-/
Here is the output from the last code.
This one isn't spilled:
This one was spilled:
How can I enhance the code above to eliminate the double loop?
You could do this with a 2D convolution. For example:
from scipy.signal import convolve2d
def spill2(arr, nval=0, m=1):
    return convolve2d(arr, np.ones((2*m+1, 2*m+1)), mode='same')
np.allclose(spill(t), spill2(t))
# True
Be aware that as written, the results will not match if nval != 0 or if the spilled pixels overlap, but you can probably modify this to suit your needs.
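
If overlapping neighbourhoods are a concern, one loop-free alternative is a maximum filter from scipy.ndimage instead of a convolution. This is a sketch under two assumptions that go beyond the question: the background value is 0 and all valid values are positive. Where spilled regions overlap, the larger value wins, which differs from the loop version (there the value written last wins). The name spill_max is made up for this example:

```python
import numpy as np
from scipy.ndimage import maximum_filter

# Same test array as in the question: four isolated non-zero points.
t = np.zeros((8, 8), dtype=int)
t[2, 2], t[2, 5], t[5, 2], t[5, 5] = 2, 4, 3, 1

def spill_max(arr, m=1):
    """Spread each value over its (2m+1) x (2m+1) neighbourhood; overlaps keep the larger value."""
    # mode="constant" with cval=0 pads the border with background instead of reflecting data
    return maximum_filter(arr, size=2 * m + 1, mode="constant", cval=0)

spilled = spill_max(t)
print(spilled)
```

Unlike the convolution, overlapping regions are not summed, so the output only ever contains values that were present in the input.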

Find local maximums in numpy array

I am looking to find the peaks in some gaussian smoothed data that I have. I have looked at some of the peak detection methods available but they require an input range over which to search and I want this to be more automated than that. These methods are also designed for non-smoothed data. As my data is already smoothed I require a much more simple way of retrieving the peaks. My raw and smoothed data is in the graph below.
Essentially, is there a pythonic way of retrieving the max values from the array of smoothed data such that an array like
a = [1,2,3,4,5,4,3,2,1,2,3,2,1,2,3,4,5,6,5,4,3,2,1]
would return:
r = [5,3,6]
SciPy provides the function argrelextrema, which gets this task done:
import numpy as np
from scipy.signal import argrelextrema
a = np.array([1,2,3,4,5,4,3,2,1,2,3,2,1,2,3,4,5,6,5,4,3,2,1])
# determine the indices of the local maxima
max_ind = argrelextrema(a, np.greater)
# get the actual values using these indices
r = a[max_ind] # array([5, 3, 6])
That gives you the desired output for r.
As of SciPy version 1.1, you can also use find_peaks. Below are two examples taken from the documentation itself.
Using the height argument, one can select all maxima above a certain threshold (in this example, all non-negative maxima; this can be very useful if one has to deal with a noisy baseline; if you want to find minima, just multiply your input by -1):
import matplotlib.pyplot as plt
from scipy.datasets import electrocardiogram # SciPy >= 1.10; older versions: from scipy.misc import electrocardiogram
from scipy.signal import find_peaks
import numpy as np
x = electrocardiogram()[2000:4000]
peaks, _ = find_peaks(x, height=0)
plt.plot(x)
plt.plot(peaks, x[peaks], "x")
plt.plot(np.zeros_like(x), "--", color="gray")
plt.show()
Another extremely helpful argument is distance, which defines the minimum distance between two peaks:
peaks, _ = find_peaks(x, distance=150)
# difference between peaks is >= 150
print(np.diff(peaks))
# prints [186 180 177 171 177 169 167 164 158 162 172]
plt.plot(x)
plt.plot(peaks, x[peaks], "x")
plt.show()
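If your original data is noisy, then using statistical methods is preferable, as not all peaks are going to be significant. For your a array, a possible solution is to use second differences. A negative second difference on its own only marks a concave point, so the version below takes the sign of the first difference and looks for a change from rising to falling:
peaks = a[1:-1][np.diff(np.sign(np.diff(a))) < 0]
# peaks = array([5, 3, 6])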
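
When only the significant peaks matter, the prominence argument of find_peaks is another option. The sketch below applies it to the a array from the question; the threshold of 3 is an arbitrary choice for illustration:

```python
import numpy as np
from scipy.signal import find_peaks

a = np.array([1,2,3,4,5,4,3,2,1,2,3,2,1,2,3,4,5,6,5,4,3,2,1])

# Keep only peaks that rise at least 3 above the surrounding valleys.
peaks, properties = find_peaks(a, prominence=3)

print(peaks)                        # indices of the retained peaks
print(a[peaks])                     # their values
print(properties["prominences"])    # how far each peak stands out
```

With this threshold the small middle peak (value 3) is dropped, because it rises only 2 above its neighbouring valleys, while the peaks with values 5 and 6 are kept.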
>>> import numpy as np
>>> from scipy.signal import argrelextrema
>>> a = np.array([1,2,3,4,5,4,3,2,1,2,3,2,1,2,3,4,5,6,5,4,3,2,1])
>>> argrelextrema(a, np.greater)
(array([ 4, 10, 17]),)
>>> a[argrelextrema(a, np.greater)]
array([5, 3, 6])
If your input represents a noisy distribution, you can try smoothing it with NumPy convolve function.
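
As a rough illustration of that idea, the sketch below smooths a noisy sine wave with a moving average built on np.convolve and then looks for local maxima. The signal, the noise level and the window length of 25 samples are all assumptions chosen for the example, not values from the question:

```python
import numpy as np
from scipy.signal import argrelextrema

# Synthetic noisy data: two periods of a sine wave plus Gaussian noise.
rng = np.random.RandomState(1)
x = np.linspace(0, 4 * np.pi, 400)
noisy = np.sin(x) + 0.1 * rng.standard_normal(x.size)

# Moving average: convolve with a normalized box kernel.
window = 25  # hypothetical window length; tune it to your data
kernel = np.ones(window) / window
smoothed = np.convolve(noisy, kernel, mode="same")

raw_peaks = argrelextrema(noisy, np.greater)[0]
smooth_peaks = argrelextrema(smoothed, np.greater)[0]

print(len(raw_peaks), len(smooth_peaks))
```

Smoothing removes most of the spurious maxima caused by noise, although a few may remain near the true peaks; mode="same" also distorts the values close to the array edges, so maxima found there should be treated with care.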
If you can exclude maxima at the edges of the array, you can always check whether an element is bigger than each of its neighbors:
import numpy as np
arr = np.array([1,2,3,4,5,4,3,2,1,2,3,2,1,2,3,4,5,6,5,4,3,2,1])
# Check that each interior element is bigger than both of its neighbors (edges excluded):
is_max = (arr[1:-1] > arr[:-2]) & (arr[1:-1] > arr[2:])
# Print these values
print(arr[1:-1][is_max])
# Locations of the maxima
print(np.arange(1, arr.size-1)[is_max])
