Plotting in matplotlib - python

I wrote the following code,
import numpy as np
from random import gauss
from random import seed
from pandas import Series
import matplotlib.pyplot as plt
import math
###variable declaration
R=0.000001 #radiaus of particle
beta=0.23 # shape factor
Ad=9.2#characteristic nanoscale defect area
gamma=4*10**-2 #surface tension
tau=.0001 #line tension
phi=-(math.pi/2)#spatial perturbation
lamda=((Ad)/(2*3.14*R)) #averge contcat line position
mu=0.001#viscosity of liquid
lamda_m=10**-9# characteristic size of adsorption site
KbT=(1.38**-24)*293 # boltzman constant with tempartaure
nu=0.001#moleculer volume in liquid phase 1
khi=3 #scaling factor
#deltaF=(beta*gamma*Ad)#surface energy perturbation
deltaF=19*KbT
# seed random number generator
seed(0)
# create white noise series
series = [gauss(0.0, 1.0) for i in range(1)]
series = Series(series)
#########################################
Z=0.0000001 #particle position
time=1
dt=1
for time in np.arange(1, 100, dt):
#####simulation loop#######
theta=np.arccos(-Z/R) #contact angle
theta_e=((math.pi*110)/180) #equilibrium contact angle
Z_e=-R*np.cos(theta_e)#equilibrium position of particle
C=3.14*gamma*(R-Z_e) #additive constant
Fsz= (gamma*math.pi*(Z-Z_e)**2)+(tau*2*math.pi*math.sqrt(R**2-Z**2))+C
Fz=Fsz+(0.5*deltaF*np.sin((2*math.pi/lamda)*(Z-Z_e)-phi))#surface force
#dFz=(((gamma*Ad)/2)*np.sin(2*math.pi/lamda))+((Z-Z_e)*(2*gamma*math.pi))-((tau*2*math.pi*Z)/(math.sqrt(R**2-Z**2)))
dFz=(deltaF*np.sin(2*math.pi/lamda))+((Z-Z_e)*(2*gamma*math.pi))-((tau*2*math.pi*Z)/(math.sqrt(R**2-Z**2)))
w_a=gamma*lamda_m**2*(1-np.cos(theta_e)) #work of adhesion
epsilon_z=2*math.pi*R*np.sin(theta)*mu*(nu/(lamda_m**3))*np.exp(w_a/KbT)#transitional drag
epsilon_s=khi*mu*((4*math.pi**2*R**2)/math.sqrt(Ad))*(1-(Z/R)**2)
epsilon=epsilon_z+epsilon_s
Ft=math.sqrt(2*KbT*epsilon)*series #thermal force
v=(dFz+Ft)/epsilon ##new velocity
Z=Z+v*dt #new position
print('z=',Z)
print('v=',v)
print('Fz=',Fz)
print('dFz',dFz)
print('time',time)
plt.plot(Z,time)
plt.show()
According to my code I suppose to have 99 values for everything (Fz, Z, v , time). While I print , I can see all the values but while I was trying to plot them with different parameters with respect to each other for analyzing, I never get any graph. Can anyone tell me, what is missing from my code with explanation?

#AnttiA's answer is basically correct, but can be easily misunderstood, as can be seen from the OP's comment. Therefore here the complete code altered such that a plot is actually produced. Instead of making Z a list, define another variable as list, say Z_all = [], and then append the updated Z-values to that list. The same can be done for the time variable, i.e. time_all = np.arange(1,100,dt). Finally, take the plot command out of the loop and plot the entire data series at once.
Note that in your example you don't really have a series of random numbers, you pull one fixed number for one fixed seed, thus the plot is not really meaningful (it appears to be producing a straight line). Trying to interpret your intentions correctly, you probably want a series of random numbers that is as long as your time series. This is most easily done using np.random.normal
There are a lot of other ways that your code could be optimised. For instance, all mathematical functions from the math module are also found in numpy, so you could just not import math at all. The same goes for pandas. Also, you define some constant values inside the for-loop, which could be computed once before the loop. Lastly, #AnttiA is probably right, that you want time on the x axis and Z on the y axis. I therefore generate two plots -- on the left time over Z and on the right Z over time. Now finally the altered code:
import numpy as np
#from random import gauss
#from random import seed
#from pandas import Series
import matplotlib.pyplot as plt
#import math
###variable declaration
R=0.000001 #radiaus of particle
beta=0.23 # shape factor
Ad=9.2#characteristic nanoscale defect area
gamma=4*10**-2 #surface tension
tau=.0001 #line tension
phi=-(np.pi/2)#spatial perturbation
lamda=((Ad)/(2*3.14*R)) #averge contcat line position
mu=0.001#viscosity of liquid
lamda_m=10**-9# characteristic size of adsorption site
KbT=(1.38**-24)*293 # boltzman constant with tempartaure
nu=0.001#moleculer volume in liquid phase 1
khi=3 #scaling factor
#deltaF=(beta*gamma*Ad)#surface energy perturbation
deltaF=19*KbT
##quantities moved out of the for-loop:
theta_e=((np.pi*110)/180) #equilibrium contact angle
Z_e=-R*np.cos(theta_e)#equilibrium position of particle
C=3.14*gamma*(R-Z_e) #additive constant
w_a=gamma*lamda_m**2*(1-np.cos(theta_e)) #work of adhesion
#########################################
Z=0.0000001 #particle position
##time=1
dt=1
Z_all = []
time_all = np.arange(1, 100, dt)
# seed random number generator
# seed(0)
np.random.seed(0)
# create white noise series
##series = [gauss(0.0, 1.0) for i in range(1)]
##series = Series(series)
series = np.random.normal(0.0, 1.0, len(time_all))
for time, S in zip(time_all,series):
#####simulation loop#######
Z_all.append(Z)
theta=np.arccos(-Z/R) #contact angle
Fsz= (gamma*np.pi*(Z-Z_e)**2)+(tau*2*np.pi*np.sqrt(R**2-Z**2))+C
Fz=Fsz+(0.5*deltaF*np.sin((2*np.pi/lamda)*(Z-Z_e)-phi))#surface force
#dFz=(((gamma*Ad)/2)*np.sin(2*np.pi/lamda))+((Z-Z_e)*(2*gamma*np.pi))-((tau*2*np.pi*Z)/(np.sqrt(R**2-Z**2)))
dFz=(deltaF*np.sin(2*np.pi/lamda))+((Z-Z_e)*(2*gamma*np.pi))-((tau*2*np.pi*Z)/(np.sqrt(R**2-Z**2)))
epsilon_z=2*np.pi*R*np.sin(theta)*mu*(nu/(lamda_m**3))*np.exp(w_a/KbT)#transitional drag
epsilon_s=khi*mu*((4*np.pi**2*R**2)/np.sqrt(Ad))*(1-(Z/R)**2)
epsilon=epsilon_z+epsilon_s
Ft=np.sqrt(2*KbT*epsilon)*S #series #thermal force
v=(dFz+Ft)/epsilon ##new velocity
Z=Z+v*dt #new position
print('z=',Z)
print('v=',v)
print('Fz=',Fz)
print('dFz',dFz)
print('time',time)
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(8,4))
axes[0].plot(Z_all,time_all)
axes[0].set_xlabel('Z')
axes[0].set_ylabel('t')
axes[1].plot(time_all, Z_all)
axes[1].set_xlabel('t')
axes[1].set_ylabel('Z')
fig.tight_layout()
plt.show()
The result looks like this:

I suppose, you will get plot anyway, y-values are probably from 94 to 104.
Now you are plotting line with one point. Its length is zero, that's why you cannot see it, try: plt.plot(Z,time,'*').
Now you should get graph with an asterix in the middle.
As Thomas suggested, you should use arrays instead of using last calculated value. If you prefer loops (sometimes they are easier to modify), modify the lines...
Before loop:
Z = [0.0000001] # Initialize Z for time 0
time_vec = np.arange(1, 100, dt)
Inside loop:
Z.append(Z[-1] + v*dt) # new position
After loop:
plt.plot(Z[1:], time_vec)
Have no time to test it, hopefully works...
Note that first argument in plot command is x-axis values and second y-axis, I'd prefer time in x-axis.

Related

Time series dBFS plot output modification - current output plot not as expected (matplotlib)

I'm trying to plot the Amplitude (dBFS) vs. Time (s) plot of an audio (.wav) file using matplotlib. I managed to do that with the following code:
def convert_to_decibel(sample):
ref = 32768 # Using a signed 16-bit PCM format wav file. So, 2^16 is the max. value.
if sample!=0:
return 20 * np.log10(abs(sample) / ref)
else:
return 20 * np.log10(0.000001)
from scipy.io.wavfile import read as readWav
from scipy.fftpack import fft
import matplotlib.pyplot as gplot1
import matplotlib.pyplot as gplot2
import numpy as np
import struct
import gc
wavfile1 = '/home/user01/audio/speech.wav'
wavsamplerate1, wavdata1 = readWav(wavfile1)
wavdlen1 = wavdata1.size
wavdtype1 = wavdata1.dtype
gplot1.rcParams['figure.figsize'] = [15, 5]
pltaxis1 = gplot1.gca()
gplot1.axhline(y=0, c="black")
gplot1.xticks(np.arange(0, 10, 0.5))
gplot1.yticks(np.arange(-200, 200, 5))
gplot1.grid(linestyle = '--')
wavdata3 = np.array([convert_to_decibel(i) for i in wavdata1], dtype=np.int16)
yvals3 = wavdata3
t3 = wavdata3.size / wavsamplerate1
xvals3 = np.linspace(0, t3, wavdata3.size)
pltaxis1.set_xlim([0, t3 + 2])
pltaxis1.set_title('Amplitude (dBFS) vs Time(s)')
pltaxis1.plot(xvals3, yvals3, '-')
which gives the following output:
I had also plotted the Power Spectral Density (PSD, in dBm) using the code below:
from scipy.signal import welch as psd # Computes PSD using Welch's method.
fpsd, wPSD = psd(wavdata1, wavsamplerate1, nperseg=1024)
gplot2.rcParams['figure.figsize'] = [15, 5]
pltpsdm = gplot2.gca()
gplot2.axhline(y=0, c="black")
pltpsdm.plot(fpsd, 20*np.log10(wPSD))
gplot2.xticks(np.arange(0, 4000, 400))
gplot2.yticks(np.arange(-150, 160, 10))
pltpsdm.set_xlim([0, 4000])
pltpsdm.set_ylim([-150, 150])
gplot2.grid(linestyle = '--')
which gives the output as:
The second output above, using the Welch's method plots a more presentable output. The dBFS plot though informative is not very presentable IMO. Is this because of:
the difference in the domains (time in case of 1st output vs frequency in the 2nd output)?
the way plot function is implemented in pyplot?
Also, is there a way I can plot my dBFS output as a peak-to-peak style of plot just like in my PSD (dBm) plot rather than a dense stem plot?
Would be much helpful and would appreciate any pointers, answers or suggestions from experts here as I'm just a beginner with matplotlib and plots in python in general.
TLNR
This has nothing to do with pyplot.
The frequency domain is different from the time domain, but that's not why you didn't get what you want.
The calculation of dbFS in your code is wrong.
You should frame your data, calculate RMSs or peaks in every frame, and then convert that value to dbFS instead of applying this transformation to every sample point.
When we talk about the amplitude, we are talking about a periodic signal. And when we read in a series of data from a sound file, we read in a series of sample points of a signal(may be or be not periodic). The value of every sample point represents a, say, voltage value, or sound pressure value sampled at a specific time.
We assume that, within a very short time interval, maybe 10ms for example, the signal is stationary. Every such interval is called a frame.
Some specific function is applied to each frame usually, to reduce the sudden change at the edge of this frame, and these functions are called window functions. If you did nothing to every frame, you added rectangle windows to them.
An example: when the sampling frequency of your sound is 44100Hz, in a 10ms-long frame, there are 44100*0.01=441 sample points. That's what the nperseg argument means in your psd function but it has nothing to do with dbFS.
Given the knowledge above, now we can talk about the amplitude.
There are two methods a get the value of amplitude in every frame:
The most straightforward one is to get the maximum(peak) values in every frame.
Another one is to calculate the RMS(Root Mean Sqaure) of every frame.
After that, the peak values or RMS values can be converted to dbFS values.
Let's start coding:
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
# Determine full scall(maximum possible amplitude) by bit depth
bit_depth = 16
full_scale = 2 ** bit_depth
# dbFS function
to_dbFS = lambda x: 20 * np.log10(x / full_scale)
# Read in the wave file
fname = "01.wav"
fs,data = wavfile.read(fname)
# Determine frame length(number of sample points in a frame) and total frame numbers by window length(how long is a frame in seconds)
window_length = 0.01
signal_length = data.shape[0]
frame_length = int(window_length * fs)
nframes = signal_length // frame_length
# Get frames by broadcast. No overlaps are used.
idx = frame_length * np.arange(nframes)[:,None] + np.arange(frame_length)
frames = data[idx].astype("int64") # Convert to in 64 to avoid integer overflow
# Get RMS and peaks
rms = ((frames**2).sum(axis=1)/frame_length)**.5
peaks = np.abs(frames).max(axis=1)
# Convert them to dbfs
dbfs_rms = to_dbFS(rms)
dbfs_peak = to_dbFS(peaks)
# Let's start to plot
# Get time arrays of every sample point and ever frame
frame_time = np.arange(nframes) * window_length
data_time = np.linspace(0,signal_length/fs,signal_length)
# Plot
f,ax = plt.subplots()
ax.plot(data_time,data,color="k",alpha=.3)
# Plot the dbfs values on a twin x Axes since the y limits are not comparable between data values and dbfs
tax = ax.twinx()
tax.plot(frame_time,dbfs_rms,label="RMS")
tax.plot(frame_time,dbfs_peak,label="Peak")
tax.legend()
f.tight_layout()
# Save serval details
f.savefig("whole.png",dpi=300)
ax.set_xlim(1,2)
f.savefig("1-2sec.png",dpi=300)
ax.set_xlim(1.295,1.325)
f.savefig("1.2-1.3sec.png",dpi=300)
The whole time span looks like(the unit of the right axis is dbFS):
And the voiced part looks like:
You can see that the dbFS values become greater while the amplitudes become greater at the vowel start point:

How can I cut a piece away from a plot and set the point I need to zero?

In my work I have the task to read in a CSV file and do calculations with it. The CSV file consists of 9 different columns and about 150 lines with different values acquired from sensors. First the horizontal acceleration was determined, from which the distance was derived by double integration. This represents the lower plot of the two plots in the picture. The upper plot represents the so-called force data. The orange graph shows the plot over the 9th column of the CSV file and the blue graph shows the plot over the 7th column of the CSV file.
As you can see I have drawn two vertical lines in the lower plot in the picture. These lines represent the x-value, which in the upper plot is the global minimum of the orange function and the intersection with the blue function. Now I want to do the following, but I need some help: While I want the intersection point between the first vertical line and the graph to be (0,0), i.e. the function has to be moved down. How do I achieve this? Furthermore, the piece of the function before this first intersection point (shown in purple) should be omitted, so that the function really only starts at this point. How can I do this?
In the following picture I try to demonstrate how I would like to do that:
If you need my code, here you can see it:
import numpy as np
import matplotlib.pyplot as plt
import math as m
import loaddataa as ld
import scipy.integrate as inte
from scipy.signal import find_peaks
import pandas as pd
import os
# Loading of the values
print(os.path.realpath(__file__))
a,b = os.path.split(os.path.realpath(__file__))
print(os.chdir(a))
print(os.chdir('..'))
print(os.chdir('..'))
path=os.getcwd()
path=path+"\\Data\\1 Fabienne\\Test1\\left foot\\50cm"
print(path)
dataListStride = ld.loadData(path)
indexStrideData = 0
strideData = dataListStride[indexStrideData]
#%%Calculation of the horizontal acceleration
def horizontal(yAngle, yAcceleration, xAcceleration):
a = ((m.cos(m.radians(yAngle)))*yAcceleration)-((m.sin(m.radians(yAngle)))*xAcceleration)
return a
resultsHorizontal = list()
for i in range (len(strideData)):
strideData_yAngle = strideData.to_numpy()[i, 2]
strideData_xAcceleration = strideData.to_numpy()[i, 4]
strideData_yAcceleration = strideData.to_numpy()[i, 5]
resultsHorizontal.append(horizontal(strideData_yAngle, strideData_yAcceleration, strideData_xAcceleration))
resultsHorizontal.insert(0, 0)
#plt.plot(x_values, resultsHorizontal)
#%%
#x-axis "convert" into time: 100 Hertz makes 0.01 seconds
scale_factor = 0.01
x_values = np.arange(len(resultsHorizontal)) * scale_factor
#Calculation of the global high and low points
heel_one=pd.Series(strideData.iloc[:,7])
plt.scatter(heel_one.idxmax()*scale_factor,heel_one.max(), color='red')
plt.scatter(heel_one.idxmin()*scale_factor,heel_one.min(), color='blue')
heel_two=pd.Series(strideData.iloc[:,9])
plt.scatter(heel_two.idxmax()*scale_factor,heel_two.max(), color='orange')
plt.scatter(heel_two.idxmin()*scale_factor,heel_two.min(), color='green')#!
#Plot of force data
plt.plot(x_values[:-1],strideData.iloc[:,7]) #force heel
plt.plot(x_values[:-1],strideData.iloc[:,9]) #force toe
# while - loop to calculate the point of intersection with the blue function
i = heel_one.idxmax()
while strideData.iloc[i,7] > strideData.iloc[i,9]:
i = i-1
# Length calculation between global minimum orange function and intersection with blue function
laenge=(i-heel_two.idxmin())*scale_factor
print(laenge)
#%% Integration of horizontal acceleration
velocity = inte.cumtrapz(resultsHorizontal,x_values)
plt.plot(x_values[:-1], velocity)
#%% Integration of the velocity
s = inte.cumtrapz(velocity, x_values[:-1])
plt.plot(x_values[:-2],s)
I hope it's clear what I want to do. Thanks for helping me!
I didn't dig all the way through your code, but the following tricks may be useful.
Say you have x and y values:
x = np.linspace(0,3,100)
y = x**2
Now, you only want the values corresponding to, say, .5 < x < 1.5. First, create a boolean mask for the arrays as follows:
mask = np.logical_and(.5 < x, x < 1.5)
(If this seems magical, then run x < 1.5 in your interpreter and observe the results).
Then use this mask to select your desired x and y values:
x_masked = x[mask]
y_masked = y[mask]
Then, you can translate all these values so that the first x,y pair is at the origin:
x_translated = x_masked - x_masked[0]
y_translated = y_masked - y_masked[0]
Is this the type of thing you were looking for?

How'd I use the Pyplot function in Python?

I'm having trouble trying to display my results after the program is finished. I'm expecting to see a velocity vs position graph, but for some reason, it's not showing up. Is there something wrong with my code.
import numpy
from matplotlib import pyplot
import time,sys
#specifying parameters
nx=41 # number of space steps
nt=25 #number of time steps
dt=0.025 #width between time intervals
dx=2/(nx-1) #width between space intervals
c=1 # the speed of the initial wave
#boundary conditions
u = numpy.ones(nx)
u[int(0.5/dx):int(1/(dx+1))] = 2
un = numpy.ones(nx)
#initializing the velocity function
for i in range(nt):
un= u.copy()
for i in range(1,nx):
u[i]= un[i] -c*(dt/dx)*(u[i]-u[i-1])
pyplot.xlabel('Position')
pyplot.ylabel('Velocity')
pyplot.plot(numpy.linspace(0,2,nx),u)
There are a few things going on here
You don't need to write out the full name of the packages you are importing. You can just use aliasing to call those packages and use them with those aliases later on. This, for example.
import numpy as np
Your dx value initalization will give you 0 beause you are dividing 2 by 40 which will give you a zero. You can initialize the value of dx by making one of the values in that expression a float, so something like this.
dx=float(2)/(nx-1) #width between space intervals
As Meowcolm Law in the comments suggested in the comments, add pyplot.show() to
show the graph. This is what the edited version of your code will like
import numpy as np
import matplotlib.pyplot as plt
import time,sys
#specifying parameters
nx=41 # number of space steps
nt=25 #number of time steps
dt=0.025 #width between time intervals
dx=float(2)/(nx-1) #width between space intervals
c=1 # the speed of the initial wave
#boundary conditions
u = np.ones(nx)
u[int(0.5/dx):int(1/(dx+1))] = 2
un = np.ones(nx)
#initializing the velocity function
for i in range(nt):
un= u.copy()
for i in range(1,nx):
u[i]= un[i] -c*(dt/dx)*(u[i]-u[i-1])
plt.xlabel('Position')
plt.ylabel('Velocity')
plt.plot(np.linspace(0,2,nx),u)
plt.show()
You can add
%matplotlib inline
in order to view the plot inside the notebook. I missed this step following the same guide.

How to remove/omit smaller contour lines using matplotlib

I am trying to plot contour lines of pressure level. I am using a netCDF file which contain the higher resolution data (ranges from 3 km to 27 km). Due to higher resolution data set, I get lot of pressure values which are not required to be plotted (rather I don't mind omitting certain contour line of insignificant values). I have written some plotting script based on the examples given in this link http://matplotlib.org/basemap/users/examples.html.
After plotting the image looks like this
From the image I have encircled the contours which are small and not required to be plotted. Also, I would like to plot all the contour lines smoother as mentioned in the above image. Overall I would like to get the contour image like this:-
Possible solution I think of are
Find out the number of points required for plotting contour and mask/omit those lines if they are small in number.
or
Find the area of the contour (as I want to omit only circled contour) and omit/mask those are smaller.
or
Reduce the resolution (only contour) by increasing the distance to 50 km - 100 km.
I am able to successfully get the points using SO thread Python: find contour lines from matplotlib.pyplot.contour()
But I am not able to implement any of the suggested solution above using those points.
Any solution to implement the above suggested solution is really appreciated.
Edit:-
# Andras Deak
I used print 'diameter is ', diameter line just above del(level.get_paths()[kp]) line to check if the code filters out the required diameter. Here is the filterd messages when I set if diameter < 15000::
diameter is 9099.66295612
diameter is 13264.7838257
diameter is 445.574234531
diameter is 1618.74618114
diameter is 1512.58974168
However the resulting image does not have any effect. All look same as posed image above. I am pretty sure that I have saved the figure (after plotting the wind barbs).
Regarding the solution for reducing the resolution, plt.contour(x[::2,::2],y[::2,::2],mslp[::2,::2]) it works. I have to apply some filter to make the curve smooth.
Full working example code for removing lines:-
Here is the example code for your review
#!/usr/bin/env python
from netCDF4 import Dataset
import matplotlib
matplotlib.use('agg')
import matplotlib.pyplot as plt
import numpy as np
import scipy.ndimage
from mpl_toolkits.basemap import interp
from mpl_toolkits.basemap import Basemap
# Set default map
west_lon = 68
east_lon = 93
south_lat = 7
north_lat = 23
nc = Dataset('ncfile.nc')
# Get this variable for later calucation
temps = nc.variables['T2']
time = 0 # We will take only first interval for this example
# Draw basemap
m = Basemap(projection='merc', llcrnrlat=south_lat, urcrnrlat=north_lat,
llcrnrlon=west_lon, urcrnrlon=east_lon, resolution='l')
m.drawcoastlines()
m.drawcountries(linewidth=1.0)
# This sets the standard grid point structure at full resolution
x, y = m(nc.variables['XLONG'][0], nc.variables['XLAT'][0])
# Set figure margins
width = 10
height = 8
plt.figure(figsize=(width, height))
plt.rc("figure.subplot", left=.001)
plt.rc("figure.subplot", right=.999)
plt.rc("figure.subplot", bottom=.001)
plt.rc("figure.subplot", top=.999)
plt.figure(figsize=(width, height), frameon=False)
# Convert Surface Pressure to Mean Sea Level Pressure
stemps = temps[time] + 6.5 * nc.variables['HGT'][time] / 1000.
mslp = nc.variables['PSFC'][time] * np.exp(9.81 / (287.0 * stemps) * nc.variables['HGT'][time]) * 0.01 + (
6.7 * nc.variables['HGT'][time] / 1000)
# Contour only at 2 hpa interval
level = []
for i in range(mslp.min(), mslp.max(), 1):
if i % 2 == 0:
if i >= 1006 and i <= 1018:
level.append(i)
# Save mslp values to upload to SO thread
# np.savetxt('mslp.txt', mslp, fmt='%.14f', delimiter=',')
P = plt.contour(x, y, mslp, V=2, colors='b', linewidths=2, levels=level)
# Solution suggested by Andras Deak
for level in P.collections:
for kp,path in enumerate(level.get_paths()):
# include test for "smallness" of your choice here:
# I'm using a simple estimation for the diameter based on the
# x and y diameter...
verts = path.vertices # (N,2)-shape array of contour line coordinates
diameter = np.max(verts.max(axis=0) - verts.min(axis=0))
if diameter < 15000: # threshold to be refined for your actual dimensions!
#print 'diameter is ', diameter
del(level.get_paths()[kp]) # no remove() for Path objects:(
#level.remove() # This does not work. produces ValueError: list.remove(x): x not in list
plt.gcf().canvas.draw()
plt.savefig('dummy', bbox_inches='tight')
plt.close()
After the plot is saved I get the same image
You can see that the lines are not removed yet. Here is the link to mslp array which we are trying to play with http://www.mediafire.com/download/7vi0mxqoe0y6pm9/mslp.txt
If you want x and y data which are being used in the above code, I can upload for your review.
Smooth line
You code to remove the smaller circles working perfectly. However the other question I have asked in the original post (smooth line) does not seems to work. I have used your code to slice the array to get minimal values and contoured it. I have used the following code to reduce the array size:-
slice = 15
CS = plt.contour(x[::slice,::slice],y[::slice,::slice],mslp[::slice,::slice], colors='b', linewidths=1, levels=levels)
The result is below.
After searching for few hours I found this SO thread having simmilar issue:-
Regridding regular netcdf data
But none of the solution provided over there works.The questions similar to mine above does not have proper solutions. If this issue is solved then the code is perfect and complete.
General idea
Your question seems to have 2 very different halves: one about omitting small contours, and another one about smoothing the contour lines. The latter is simpler, since I can't really think of anything else other than decreasing the resolution of your contour() call, just like you said.
As for removing a few contour lines, here's a solution which is based on directly removing contour lines individually. You have to loop over the collections of the object returned by contour(), and for each element check each Path, and delete the ones you don't need. Redrawing the figure's canvas will get rid of the unnecessary lines:
# dummy example based on matplotlib.pyplot.clabel example:
import matplotlib
import numpy as np
import matplotlib.cm as cm
import matplotlib.mlab as mlab
import matplotlib.pyplot as plt
delta = 0.025
x = np.arange(-3.0, 3.0, delta)
y = np.arange(-2.0, 2.0, delta)
X, Y = np.meshgrid(x, y)
Z1 = mlab.bivariate_normal(X, Y, 1.0, 1.0, 0.0, 0.0)
Z2 = mlab.bivariate_normal(X, Y, 1.5, 0.5, 1, 1)
# difference of Gaussians
Z = 10.0 * (Z2 - Z1)
plt.figure()
CS = plt.contour(X, Y, Z)
for level in CS.collections:
for kp,path in reversed(list(enumerate(level.get_paths()))):
# go in reversed order due to deletions!
# include test for "smallness" of your choice here:
# I'm using a simple estimation for the diameter based on the
# x and y diameter...
verts = path.vertices # (N,2)-shape array of contour line coordinates
diameter = np.max(verts.max(axis=0) - verts.min(axis=0))
if diameter<1: # threshold to be refined for your actual dimensions!
del(level.get_paths()[kp]) # no remove() for Path objects:(
# this might be necessary on interactive sessions: redraw figure
plt.gcf().canvas.draw()
Here's the original(left) and the removed version(right) for a diameter threshold of 1 (note the little piece of the 0 level at the top):
Note that the top little line is removed while the huge cyan one in the middle doesn't, even though both correspond to the same collections element i.e. the same contour level. If we didn't want to allow this, we could've called CS.collections[k].remove(), which would probably be a much safer way of doing the same thing (but it wouldn't allow us to differentiate between multiple lines corresponding to the same contour level).
To show that fiddling around with the cut-off diameter works as expected, here's the result for a threshold of 2:
All in all it seems quite reasonable.
Your actual case
Since you've added your actual data, here's the application to your case. Note that you can directly generate the levels in a single line using np, which will almost give you the same result. The exact same can be achieved in 2 lines (generating an arange, then selecting those that fall between p1 and p2). Also, since you're setting levels in the call to contour, I believe the V=2 part of the function call has no effect.
import numpy as np
import matplotlib.pyplot as plt
# insert actual data here...
Z = np.loadtxt('mslp.txt',delimiter=',')
X,Y = np.meshgrid(np.linspace(0,300000,Z.shape[1]),np.linspace(0,200000,Z.shape[0]))
p1,p2 = 1006,1018
# this is almost the same as the original, although it will produce
# [p1, p1+2, ...] instead of `[Z.min()+n, Z.min()+n+2, ...]`
levels = np.arange(np.maximum(Z.min(),p1),np.minimum(Z.max(),p2),2)
#control
plt.figure()
CS = plt.contour(X, Y, Z, colors='b', linewidths=2, levels=levels)
#modified
plt.figure()
CS = plt.contour(X, Y, Z, colors='b', linewidths=2, levels=levels)
for level in CS.collections:
for kp,path in reversed(list(enumerate(level.get_paths()))):
# go in reversed order due to deletions!
# include test for "smallness" of your choice here:
# I'm using a simple estimation for the diameter based on the
# x and y diameter...
verts = path.vertices # (N,2)-shape array of contour line coordinates
diameter = np.max(verts.max(axis=0) - verts.min(axis=0))
if diameter<15000: # threshold to be refined for your actual dimensions!
del(level.get_paths()[kp]) # no remove() for Path objects:(
# this might be necessary on interactive sessions: redraw figure
plt.gcf().canvas.draw()
plt.show()
Results, original(left) vs new(right):
Smoothing by resampling
I've decided to tackle the smoothing problem as well. All I could come up with is downsampling your original data, then upsampling again using griddata (interpolation). The downsampling part could also be done with interpolation, although the small-scale variation in your input data might make this problem ill-posed. So here's the crude version:
import scipy.interpolate as interp #the new one
# assume you have X,Y,Z,levels defined as before
# start resampling stuff
dN = 10 # use every dN'th element of the gridded input data
my_slice = [slice(None,None,dN),slice(None,None,dN)]
# downsampled data
X2,Y2,Z2 = X[my_slice],Y[my_slice],Z[my_slice]
# same as X2 = X[::dN,::dN] etc.
# upsampling with griddata over original mesh
Zsmooth = interp.griddata(np.array([X2.ravel(),Y2.ravel()]).T,Z2.ravel(),(X,Y),method='cubic')
# plot
plt.figure()
CS = plt.contour(X, Y, Zsmooth, colors='b', linewidths=2, levels=levels)
You can freely play around with the grids used for interpolation, in this case I just used the original mesh, as it was at hand. You can also play around with different kinds of interpolation: the default 'linear' one will be faster, but less smooth.
Result after downsampling(left) and upsampling(right):
Of course you should still apply the small-line-removal algorithm after this resampling business, and keep in mind that this heavily distorts your input data (since if it wasn't distorted, then it wouldn't be smooth). Also, note that due to the crude method used in the downsampling step, we introduce some missing values near the top/right edges of the region under consideraton. If this is a problem, you should consider doing the downsampling based on griddata as I've noted earlier.
This is a pretty bad solution, but it's the only one that I've come up with. Use the get_contour_verts function in this solution you linked to, possibly with the matplotlib._cntr module so that nothing gets plotted initially. That gives you a list of contour lines, sections, vertices, etc. Then you have to go through that list and pop the contours you don't want. You could do this by calculating a minimum diameter, for example; if the max distance between points is less than some cutoff, throw it out.
That leaves you with a list of LineCollection objects. Now if you make a Figure and Axes instance, you can use Axes.add_collection to add all of the LineCollections in the list.
I checked this out really quick, but it seemed to work. I'll come back with a minimum working example if I get a chance. Hope it helps!
Edit: Here's an MWE of the basic idea. I wasn't familiar with plt._cntr.Cntr, so I ended up using plt.contour to get the initial contour object. As a result, you end up making two figures; you just have to close the first one. You can replace checkDiameter with whatever function works. I think you could turn the line segments into a Polygon and calculate areas, but you'd have to figure that out on your own. Let me know if you run into problems with this code, but it at least works for me.
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
def checkDiameter(seg, tol=.3):
# Function for screening line segments. NB: Not actually a proper diameter.
diam = (seg[:,0].max() - seg[:,0].min(),
seg[:,1].max() - seg[:,1].min())
return not (diam[0] < tol or diam[1] < tol)
# Create testing data
x = np.linspace(-1,1, 21)
xx, yy = np.meshgrid(x,x)
z = np.exp(-(xx**2 + .5*yy**2))
# Original plot with plt.contour
fig0, ax0 = plt.subplots()
# Make sure this contour object actually has a tiny contour to remove
cntrObj = ax0.contour(xx,yy,z, levels=[.2,.4,.6,.8,.9,.95,.99,.999])
# Primary loop: Copy contours into a new LineCollection
lineNew = list()
for lineOriginal in cntrObj.collections:
# Get properties of the original LineCollection
segments = lineOriginal.get_segments()
propDict = lineOriginal.properties()
propDict = {key: value for (key,value) in propDict.items()
if key in ['linewidth','color','linestyle']} # Whatever parameters you want to carry over
# Filter out the lines with small diameters
segments = [seg for seg in segments if checkDiameter(seg)]
# Create new LineCollection out of the OK segments
if len(segments) > 0:
lineNew.append(mpl.collections.LineCollection(segments, **propDict))
# Make new plot with only these line collections; display results
fig1, ax1 = plt.subplots()
ax1.set_xlim(ax0.get_xlim())
ax1.set_ylim(ax0.get_ylim())
for line in lineNew:
ax1.add_collection(line)
plt.show()
FYI: The bit with propDict is just to automate bringing over some of the line properties from the original plot. You can't use the whole dictionary at once, though. First, it contains the old plot's line segments, but you can just swap those for the new ones. But second, it appears to contain a number of parameters that are in conflict with each other: multiple linewidths, facecolors, etc. The {key for key in propDict if I want key} workaround is my way to bypass that, but I'm sure someone else can do it more cleanly.

Can anyone please explain how this python code works line by line?

I am working in image processing right now in python using numpy and scipy all the time. I have one piece of code that can enlarge an image, but not sure how this works.
So please some expert in scipy/numpy in python can explain to me line by line. I am always eager to learn.
import numpy as N
import os.path
import scipy.signal
import scipy.interpolate
import matplotlib.pyplot as plt
import matplotlib.cm as cm
def enlarge(img, rowscale, colscale, method='linear'):
x, y = N.meshgrid(N.arange(img.shape[1]), N.arange(img.shape[0]))
pts = N.column_stack((x.ravel(), y.ravel()))
xx, yy = N.mgrid[0.:float(img.shape[1]):1/float(colscale),
0.:float(img.shape[0]):1/float(rowscale)]
large = scipy.interpolate.griddata(pts, img.flatten(), (xx, yy), method).T
large[-1,:] = large[-2,:]
large[:,-1] = large[:,-2]
return large
Thanks a lot.
First, a grid of empty points is created with point per pixel.
x, y = N.meshgrid(N.arange(img.shape[1]), N.arange(img.shape[0]))
The actual image pixels are placed into the variable pts which will be needed later.
pts = N.column_stack((x.ravel(), y.ravel()))
After that, it creates a mesh grid with one point per pixel for the enlarged image; if the original image was 200x400, the colscale set to 4 and rowscale set to 2, the mesh grid would have (200*4)x(400*2) or 800x800 points.
xx, yy = N.mgrid[0.:float(img.shape[1]):1/float(colscale),
0.:float(img.shape[0]):1/float(rowscale)]
Using scipy, the points in pts variable are interpolated into the larger grid. Interpolation is the manner in which missing points are filled or estimated usually when going from a smaller set of points to a larger set of points.
large = scipy.interpolate.griddata(pts, img.flatten(), (xx, yy), method).T
I am not 100% certain what the last two lines do without going back and looking at what the griddata method returns. It appears to be throwing out some additional data that isn't needed for the image or performing a translation.
large[-1,:] = large[-2,:]
large[:,-1] = large[:,-2]
return large

Categories