Matplotlib lines do not join smoothly (Python)

I am using matplotlib to draw the outline of a cylindrical body, but the lines do not join up smoothly, as can be seen in the range x = [40, 60].
The effect is subtle in this image, but it is unfortunately not acceptable for my purposes.
Using more data points does not seem to make a difference.
Is there a way to get curved lines to join up more smoothly in matplotlib?
Original code:
import numpy as np
import matplotlib.pylab as plt
length = 100.
a = 40
b = 20
n = 2.
alpha = np.radians(25.)
d = 18.
x_nose = np.linspace(0,a,1000)
r_nose = (0.5*d*(1 - ((x_nose-a)/a)**2)**(1/n))
x_mid = np.linspace(x_nose[-1],a+b,2)
r_mid = np.array([r_nose[-1],r_nose[-1]])
x_tail = np.linspace(x_mid[-1],length,1000)
l_tail = length-a-b
r_tail = (0.5*d - ((3*d)/(2*l_tail**2) - np.tan(alpha)/l_tail)*(x_tail-a-b)**2 + (d/l_tail**3 - np.tan(alpha)/l_tail**2)*(x_tail-a-b)**3)
fig = plt.figure()
plt.plot(x_nose,r_nose,'k',linewidth=2,antialiased=True)
plt.plot(x_mid,r_mid,'k',linewidth=2,antialiased=True)
plt.plot(x_tail,r_tail,'k',linewidth=2,antialiased=True)
plt.axis('equal')
plt.show()
You can see the effect more easily when zoomed in:

I'm not sure why this is happening, but you may be able to mitigate it by constructing a single x array and a single r array containing the full line to draw.
x = np.append(x_nose, x_mid)
x = np.append(x, x_tail )
r = np.append(r_nose, r_mid)
r = np.append(r, r_tail )
plt.plot(x,r,'k',linewidth=2,antialiased=True)
This obviously prevents you from altering the line styles of individual elements, but it looks like you don't want to do that. This works for me:
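A likely explanation, stated here as an assumption rather than something the answer confirms, is that three separate plot calls create three separate line objects, each ending in its own line cap, whereas a single line only has joins between consecutive points. The sketch below builds the single arrays with np.concatenate; the solid_joinstyle and solid_capstyle keywords are optional additions that are not part of the original answer, and the Agg backend is selected only so the sketch runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

length, a, b, n, d = 100.0, 40.0, 20.0, 2.0, 18.0
alpha = np.radians(25.0)
l_tail = length - a - b

# Nose, straight mid-section and tail, as in the question
x_nose = np.linspace(0, a, 1000)
r_nose = 0.5 * d * (1 - ((x_nose - a) / a) ** 2) ** (1 / n)
x_mid = np.array([a, a + b])
r_mid = np.array([r_nose[-1], r_nose[-1]])
x_tail = np.linspace(a + b, length, 1000)
t = x_tail - a - b
r_tail = (0.5 * d
          - (3 * d / (2 * l_tail ** 2) - np.tan(alpha) / l_tail) * t ** 2
          + (d / l_tail ** 3 - np.tan(alpha) / l_tail ** 2) * t ** 3)

# One x array and one r array for the whole outline
x = np.concatenate([x_nose, x_mid, x_tail])
r = np.concatenate([r_nose, r_mid, r_tail])

fig, ax = plt.subplots()
ax.plot(x, r, 'k', linewidth=2,
        solid_joinstyle='round', solid_capstyle='round')
ax.axis('equal')
# plt.show()  # uncomment when running interactively
```

Because the whole outline is one line, the segments meet at joins instead of at two overlapping line ends.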

Related

Visualizing SOM and adding labels to the map

I have been trying to apply a SOM (self-organizing map) to my dataframe. It has 25 columns, each representing a house, and each house has power consumption values for two years. I want to cluster the data into 3 clusters.
I have done the following:
import sys
sys.path.insert(0, '../')
%load_ext autoreload
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pylab import plot,axis,show,pcolor,colorbar,bone
from matplotlib.patches import Patch
%matplotlib inline
from minisom import MiniSom
from sklearn.preprocessing import minmax_scale, scale
%autoreload 2
data1 = pd.read_excel(r"C:\Users\user\Desktop\Thesis\Tarek\Consumption.xlsx")
# Apply the same conversion to columns h1 ... h25
for col in ['h%d' % k for k in range(1, 26)]:
    data1[col] = data1[col].str.split(';').str[2].astype('float')
data1.fillna(0,inplace=True)
data1=data1.round(decimals=2)
X=data1.values
som =MiniSom(x=3,y=3,input_len=25,sigma=1.0, learning_rate=0.5)
som.random_weights_init(X)
som.train_batch(data=X ,num_iteration=1000,verbose=True)
bone()
pcolor(som.distance_map().T)
colorbar()
markers = ['o' , 's','v']
colors = ['r', 'g','y']
for i, x in enumerate(X):
    w = som.winner(x)
    plot(w[0] + 0.5,
         w[1] + 0.5,
         markers[i],
         markeredgecolor = colors[i],
         markerfacecolor = 'None',
         markersize = 10,
         markeredgewidth = 2)
show()
When I run the code, I get this error:
IndexError: list index out of range
Are there any tips for adding the markers and colors correctly? I would be glad of any help. I am fairly new to Python and tried to find a solution but couldn't find one.
The problem seems to be that X = data1.values has more rows than markers and colors have entries (the length of X is around 25 or more, while both lists have only 3 elements). So in the following for loop, when i is 3, you are trying to access markers[3] and colors[3], which throws an IndexError because both markers and colors only go up to index 2 (indexing starts from 0 in Python).
for i, x in enumerate(X):
One solution is to define custom lists with one marker and one color per row of X (25 of each for 25 rows). While you might want to define your own markers, you can leave the colors out and let matplotlib choose the markeredgecolor automatically.
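An alternative to writing out long lists by hand is to cycle through the short ones with the modulo operator. This is only a sketch of the indexing idea: n_samples is a stand-in for len(X), and the plotting call from the question is left out so the snippet runs on its own.

```python
markers = ['o', 's', 'v']
colors = ['r', 'g', 'y']

n_samples = 25  # hypothetical stand-in for len(X)

# i % len(...) wraps the index around, so it never runs past the end of the list
styles = [(markers[i % len(markers)], colors[i % len(colors)])
          for i in range(n_samples)]

print(styles[:4])
```

Inside the original loop the same idea would read markers[i % len(markers)] and colors[i % len(colors)]. Whether repeating markers is meaningful depends on what each marker is supposed to represent.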

Change the scale of the graph image

I am trying to generate a graph and save an image of it in Python. Although the plotting of the values seems OK and I do get my picture, the scale of the graph is badly shifted.
If you compare the correct graph from the tutorial example with my bad graph generated from a different dataset, the curves are cut off at the bottom too early: the y-axis should start just above the highest values, and I should also see the curves for the highest x-values (in my case around 10^3).
I think the problem is the scale of the y-axis, but I do not know which parameters I should change to fix it. I tried to play with some numbers (see the script below), but without any good results.
This is the code for the calculation and generation of the graph image:
import numpy as np
hic_data = load_hic_data_from_reads('/home/besy/Hi-C/MOREX/TCC35_parsedV2/TCC35_V2_interaction_filtered.tsv', resolution=100000)
min_diff = 1
max_diff = 500
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(12, 12))
for cnum, c in enumerate(hic_data.chromosomes):
    if c in ['ChrUn']:
        continue
    dist_intr = []
    for diff in xrange(min_diff, min((max_diff, 1 + hic_data.chromosomes[c]))):
        beg, end = hic_data.section_pos[c]
        dist_intr.append([])
        for i in xrange(beg, end - diff):
            dist_intr[-1].append(hic_data[i, i + diff])
    mean_intrp = []
    for d in dist_intr:
        if len(d):
            mean_intrp.append(float(np.nansum(d)) / len(d))
        else:
            mean_intrp.append(0.0)
    xp, yp = range(min_diff, max_diff), mean_intrp
    x = []
    y = []
    for k in xrange(len(xp)):
        if yp[k]:
            x.append(xp[k])
            y.append(yp[k])
    l = plt.plot(x, y, '-', label=c, alpha=0.8)
    plt.hlines(mean_intrp[2], 3, 5.25 + np.exp(cnum / 4.3), color=l[0].get_color(),
               linestyle='--', alpha=0.5)
    plt.text(5.25 + np.exp(cnum / 4.3), mean_intrp[2], c, color=l[0].get_color())
    plt.plot(3, mean_intrp[2], '+', color=l[0].get_color())
plt.xscale('log')
plt.yscale('log')
plt.ylabel('number of interactions')
plt.xlabel('Distance between bins (in 100 kb bins)')
plt.grid()
plt.ylim(2, 250)
_ = plt.xlim(1, 110)
fig.savefig('/home/besy/Hi-C/MOREX/TCC35_V2_results/filtered/TCC35_V2_decay.png', dpi=fig.dpi)
I think the problem is in the scale: I need the y-axis to start from 10^-1 (0.1). To change this I tried the following:
min_diff = 0.1
.
.
.
dist_intr = []
for diff in xrange(min_diff, min((max_diff, 0.1 + hic_data.chromosomes[c]))):
.
.
.
plt.ylim((0.1, 20))
But these values return: "integer argument expected, got float"
I also tried adjusting the max_diff, plt.ylim and plt.xlim parameters a little, but nothing changed much.
Which parameter(s) do I need to change, and how, to generate an image of the correctly scaled graph? Thank you in advance.
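The error message comes from xrange, which only accepts integers, so min_diff has to stay an integer bin offset. The visible range of the axes is controlled separately by the axis limits. The sketch below uses a made-up decay curve in place of the Hi-C data (an assumption for illustration only) and shows the limits being set independently of the loop bounds.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

min_diff, max_diff = 1, 500  # loop bounds stay integers
distances = list(range(min_diff, max_diff))
counts = [100.0 / dist for dist in distances]  # made-up curve, not real data

fig, ax = plt.subplots(figsize=(12, 12))
ax.plot(distances, counts, '-')
ax.set_xscale('log')
ax.set_yscale('log')

# The axis limits may be floats and do not depend on min_diff or max_diff
ax.set_ylim(0.1, 250)
ax.set_xlim(1, 1000)

y_limits = ax.get_ylim()
x_limits = ax.get_xlim()
```

In the original script the equivalent change would be to the plt.ylim and plt.xlim calls only. The values used here are placeholders and would need to match the actual data range.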

df.plot adding to itself instead of in separate figure

The following code should give me separate charts, each with one line of data, but for some reason the first figure shows the 'GrowthVsValue' line, and the second figure shows the 'GrowthVsValue' line again plus the 'LargeVsSmall' line. I want each line on its own in a separate figure. What do I need to add or change to make this work?
from matplotlib.backends.backend_pdf import PdfPages
pp = PdfPages('Relative Strength.pdf')
Output = pd.DataFrame({
'GrowthVsValueDIFF': 1 + (df_ch['IVV'] - df_ch['IVE']),
'LargeVsSmallDIFF': 1 + (df_ch['IVV'] - df_ch['IJR']),
}, index = df_ch.index)
Output['GrowthVsValue'] = 100
Output.loc[1:, 'GrowthVsValue'] = Output.GrowthVsValueDIFF[1:].cumprod() * 100
Output.GrowthVsValue.plot.line(legend=None)
Output.GrowthVsValue_L = plt.title('Growth v. Value RS')
plt.savefig(pp, format='pdf')
Output['LargeVsSmall'] = 100
Output.loc[1:, 'LargeVsSmall'] = Output.LargeVsSmallDIFF[1:].cumprod() * 100
Output.LargeVsSmall.plot.line(legend=None)
Output.LargeVsSmall_L = plt.title('Large v. Small RS')
plt.savefig(pp, format='pdf')
pp.close()
Use plt.close() after the first plt.savefig(). Otherwise the second plot call draws onto the same, still-current axes.
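A minimal sketch of the effect, using a small made-up DataFrame instead of the asker's df_ch data (an assumption for illustration). The savefig calls are only indicated in comments so the snippet does not write any files.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Made-up values standing in for the real relative-strength series
output = pd.DataFrame({
    'GrowthVsValue': [100.0, 101.0, 103.0],
    'LargeVsSmall': [100.0, 99.0, 98.0],
})

ax1 = output['GrowthVsValue'].plot.line()
plt.title('Growth v. Value RS')
# plt.savefig(pp, format='pdf') would go here
plt.close()  # closes the current figure, so the next plot starts a new one

ax2 = output['LargeVsSmall'].plot.line()
plt.title('Large v. Small RS')
# plt.savefig(pp, format='pdf') would go here
plt.close()

print(len(ax1.lines), len(ax2.lines))
```

Calling plt.figure() before each plot is another way to get the same separation.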

Create a static (not redrawn everytime I zoom, or move around) plot in matplotlib

I am rather new to Python in general, but have found it very useful and much more intuitive than other programming languages.
I'm currently trying to plot a spectrum with a lot of data points, and it seems that every time I move around or zoom in or out, matplotlib redraws the figure, which takes some time (5 to 20 seconds every time I move).
This makes scanning through the spectrum pretty tedious. If I could create the figure once and for all, then just show a part of it and move around in the static figure without redrawing it, that would save me a lot of idle time.
My question is this: is there a (reasonably easy) way to do this in matplotlib, or should I start looking into other plotting software?
I've looked around in the documentation but, to be honest, I don't understand most of it.
Thanks for the input!
Cheers
In case anyone is wondering, here is my code:
import numpy as np
import matplotlib.pyplot as plt
def show_shifted_lines(z, nb_lines, color='red'):
    # Draws a vertical line with its name at the location
    # of the line (shifted by (1+z) )
    # Writes the name of the line
    for i in range(nb_lines):
        plt.text(rest_wl[i]*(1.+z), 200., line_name[i], color=color, rotation=50, rotation_mode='anchor')
        # Writes the vertical line
        for j in range(20):
            plt.text(rest_wl[i]*(1.+z), 190-8*j, '|', color=color)
    return
f = open('spectrum.dat', 'r')
## Plot Spectrum ##
wavelength = []
flux = []
for line in f :
    line = line.strip()
    columns = line.split()
    wavelength.append(float(columns[0]))
    flux.append(float(columns[1]))
flux = np.asarray(flux, dtype=float)
wavelength = np.asarray(wavelength, dtype=float)
plt.plot(wavelength, flux, color='black')
plt.xlabel(r'Wavelength (A)')
plt.ylabel(r'Flux (erg s$^{-1}$ cm$^{-2}$ A$^{-1}$)')
plt.grid(True)
f.close()
## Show location of redshifted lines ##
f = open('list_of_restframe_lines.txt','r')
line_name = []
rest_wl = []
# Redshifts of various absorption-line systems
z1 = 3.04976
z2 = 2.27831
z3 = 1.80335
z4 = 2.218
z5 = 2.2155
z6 = 2.2164
z7 = 2.8913
# Create array for not-shifted lines
for line in f :
    line = line.strip()
    columns = line.split()
    line_name.append(columns[0])
    rest_wl.append(float(columns[1]))
rest_wl = np.asarray(rest_wl, dtype=float)
f.close()
show_shifted_lines(z1, len(rest_wl))
show_shifted_lines(z2, len(rest_wl), 'magenta')
show_shifted_lines(z3, len(rest_wl), 'lightgreen')
show_shifted_lines(z4, len(rest_wl), 'green')
show_shifted_lines(z5, len(rest_wl), 'darkorange')
show_shifted_lines(z6, len(rest_wl), 'orange')
show_shifted_lines(z7, len(rest_wl), 'blue')
plt.show()
Ultimately, my spectrum looks something like this: [Figure: example of a small part of a spectrum with redshifted absorbing systems shown]
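One possible contributor to the slow redraws, offered as an assumption and not confirmed anywhere in the thread, is the number of text objects: each marked line is built from 20 separate '|' characters, and every one of them has to be drawn again on each pan or zoom. The sketch below draws all the vertical markers of one absorption system as a single vlines collection. The wavelengths and labels are placeholders, not values from the asker's files.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

# Placeholder rest-frame lines (hypothetical values for illustration)
rest_wl = np.array([1215.67, 1548.20, 2796.35])
line_name = ['Lya', 'CIV', 'MgII']

def show_shifted_lines(ax, z, color='red'):
    """Mark every rest-frame line at its redshifted wavelength."""
    shifted = rest_wl * (1.0 + z)
    # One collection holds all vertical markers of this system
    ax.vlines(shifted, 38, 190, color=color)
    for wl, name in zip(shifted, line_name):
        ax.text(wl, 200.0, name, color=color,
                rotation=50, rotation_mode='anchor')

fig, ax = plt.subplots()
show_shifted_lines(ax, 3.04976)
show_shifted_lines(ax, 2.27831, 'magenta')

n_texts = len(ax.texts)
n_collections = len(ax.collections)
```

With this layout each system contributes one collection plus one label per line, instead of 21 text objects per line. Whether that removes the delay would have to be measured on the real spectrum.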

Plot really big file in python (5GB) with x axis offset

I am trying to plot a very big file (~5 GB) using Python and matplotlib. I am able to load the whole file into memory (the machine has 16 GB in total), but when I plot it using a simple imshow I get a segmentation fault. This is most probably due to the ulimit, which I have set to 15000 and cannot set any higher. I have come to the conclusion that I need to plot my array in batches, and therefore wrote some simple code to do that. My main issue is that when I plot a batch of the big array, the x coordinates always start from 0, and there is no way I can overlay the images to create a final big one. If you have any suggestions, please let me know. Also, I am not able to install new packages like "Image" on this machine because I lack administrative rights. Here is a sample of the code that reads the first 12 lines of my array and makes 3 plots.
import os
import sys
import scipy
import numpy as np
import pylab as pl
import matplotlib as mpl
import matplotlib.cm as cm
from optparse import OptionParser
from scipy import fftpack
from scipy.fftpack import *
from cmath import *
from pylab import *
import pp
import fileinput
import matplotlib.pylab as plt
import pickle
def readalllines(file1,rows,freqs):
    file = open(file1,'r')
    sizer = int(rows*freqs)
    i = 0
    q = np.zeros(sizer,'float')
    for i in range(rows*freqs):
        s =file.readline()
        s = s.split()
        #print s[4],q[i]
        q[i] = float(s[4])
        if i%262144 == 0:
            print '\r ',int(i*100.0/(337*262144)),' percent complete',
        i += 1
    file.close()
    return q
parser = OptionParser()
parser.add_option('-f',dest="filename",help="Read dynamic spectrum from FILE",metavar="FILE")
parser.add_option('-t',dest="dtime",help="The time integration used in seconds, default 10",default=10)
parser.add_option('-n',dest="dfreq",help="The bandwidth of each frequency channel in Hz",default=11.92092896)
parser.add_option('-w',dest="reduce",help="The chuncker divider in frequency channels, integer default 16",default=16)
(opts,args) = parser.parse_args()
rows=12
freqs = 262144
file1 = opts.filename
s = readalllines(file1,rows,freqs)
s = np.reshape(s,(rows,freqs))
s = s.T
print s.shape
#raw_input()
#s_shift = scipy.fftpack.fftshift(s)
#fig = plt.figure()
#fig.patch.set_alpha(0.0)
#axes = plt.axes()
#axes.patch.set_alpha(0.0)
###plt.ylim(0,8)
plt.ion()
i = 0
for o in range(0,rows,4):
    fig = plt.figure()
    #plt.clf()
    plt.imshow(s[:,o:o+4],interpolation='nearest',aspect='auto', cmap=cm.gray_r, origin='lower')
    if o == 0:
        axis([0,rows,0,freqs])
    fdf, fdff = xticks()
    print fdf
    xticks(fdf+o)
    print xticks()
    #axis([o,o+4,0,freqs])
    plt.draw()
    #w, h = fig.canvas.get_width_height()
    #buf = np.fromstring(fig.canvas.tostring_argb(), dtype=np.uint8)
    #buf.shape = (w,h,4)
    #buf = np.rol(buf, 3, axis=2)
    #w,h,_ = buf.shape
    #img = Image.fromstring("RGBA", (w,h),buf.tostring())
    #if prev:
    #    prev.paste(img)
    #    del prev
    #prev = img
    i += 1
pl.colorbar()
pl.show()
If you plot any array with more than ~2k pixels across, something in your graphics chain will downsample the image in some way to display it on your monitor. I would recommend downsampling in a controlled way, something like
data = convert_raw_data_to_fft(args) # make sure data is row major
def ds_decimate(row, step=100):
    # keep every step-th sample
    return row[::step]
def ds_sum(row, step=100):
    # sum each block of `step` samples, dropping the incomplete tail
    return np.sum(row[:step*(len(row)//step)].reshape(-1, step), 1)
# as per suggestion from tom10 in comments
def ds_max(row, step=100):
    # keep the maximum of each block of `step` samples
    return np.max(row[:step*(len(row)//step)].reshape(-1, step), 1)
data_plotable = [ds_sum(d) for d in data] # plug in whichever function you want
or interpolation.
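The same block reduction can also be applied to a whole 2D array in one step. The sketch below is a variation on the answer's idea, not the answerer's own code: block_max is a hypothetical helper name, and the small synthetic array stands in for the real (freqs, rows) spectrum.

```python
import numpy as np

def block_max(image, step):
    """Reduce axis 0 by taking the maximum over consecutive blocks of `step` rows."""
    usable = step * (image.shape[0] // step)  # drop the incomplete tail block
    return image[:usable].reshape(-1, step, image.shape[1]).max(axis=1)

freqs, rows = 1024, 12  # small stand-in for 262144 x 12
s = np.arange(freqs * rows, dtype=float).reshape(freqs, rows)

small = block_max(s, step=256)
print(small.shape)
```

Taking the maximum keeps narrow bright features visible after reduction, whereas plain decimation may skip them. Which reduction is appropriate depends on what the image is meant to show.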
Matplotlib is pretty memory-inefficient when plotting images. It creates several full-resolution intermediate arrays, which is probably why your program is crashing.
One solution is to downsample the image before feeding it into matplotlib, as @tcaswell suggests.
I also wrote some wrapper code to do this downsampling automatically, based on your screen resolution. It's at https://github.com/ChrisBeaumont/mpl-modest-image, if it's useful. It also has the advantage that the image is resampled on the fly, so you can still pan and zoom without sacrificing resolution where you need it.
I think you're just missing the extent=(left, right, bottom, top) keyword argument in plt.imshow.
x = np.random.randn(2, 10)
y = np.ones((4, 10))
x[0] = 0 # To make it clear which side is up, etc
y[0] = -1
plt.imshow(x, extent=(0, 10, 0, 2))
plt.imshow(y, extent=(0, 10, 2, 6))
# This is necessary, else the plot gets scaled and only shows the last array
plt.ylim(0, 6)
plt.colorbar()
plt.show()
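Applied to the batches from the question, the extent argument can carry the x offset of each slice so that all batches land side by side on one set of axes. This is a sketch under assumptions: a small random array replaces the 5 GB file, and vmin and vmax are fixed so every batch shares the same color scale.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

rows, freqs, batch = 12, 64, 4  # small stand-in sizes
s = np.random.rand(freqs, rows)

fig, ax = plt.subplots()
for o in range(0, rows, batch):
    # extent=(left, right, bottom, top) places this batch at x = o .. o + batch
    im = ax.imshow(s[:, o:o + batch], extent=(o, o + batch, 0, freqs),
                   interpolation='nearest', aspect='auto', origin='lower',
                   cmap='gray_r', vmin=0.0, vmax=1.0)

# Reset the limits, otherwise only the last batch is visible
ax.set_xlim(0, rows)
ax.set_ylim(0, freqs)
fig.colorbar(im, ax=ax)

n_images = len(ax.images)
```

This only addresses the x offset. Memory use still grows with the number of pixels handed to imshow, so combining it with downsampling may still be necessary for the full file.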
