I'm trying to plot a Delaunay triangulation from a pandas DataFrame, with the points grouped by Time. At present, I get an error when I try to plot the points from the first time point.
QhullError: QH6214 qhull input error: not enough points(2) to construct initial simplex (need 6)
While executing: | qhull d Q12 Qt Qc Qz Qbb
Options selected for Qhull 2019.1.r 2019/06/21:
run-id 768388270 delaunay Q12-allow-wide Qtriangulate Qcoplanar-keep
Qz-infinity-point Qbbound-last _pre-merge _zero-centrum Qinterior-keep
_maxoutside 0
It appears that each of those two arrays is being passed as a single point.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Delaunay
df = pd.DataFrame({
'Time' : [1,1,1,1,2,2,2,2],
'A_X' : [5, 5, 6, 6, 4, 3, 3, 4],
'A_Y' : [5, 6, 6, 5, 5, 6, 5, 6],
})
fig, ax = plt.subplots(figsize = (6,6))
ax.set_xlim(0,10)
ax.set_ylim(0,10)
ax.grid(False)
points_x1 = df.groupby("Time")["A_X"].agg(list).tolist()
points_y1 = df.groupby("Time")["A_Y"].agg(list).tolist()
points = list(zip(points_x1, points_y1))
tri = Delaunay(points[0])
#plot triangulation
plt.triplot(points[:,0], points[:,1], tri.simplices)
plt.plot(points[:,0], points[:,1], 'o')
You can take advantage of the groupby apply method, which lets you run a function on each group:
def make_points(x):
    return np.array(list(zip(x['A_X'], x['A_Y'])))
c = df.groupby("Time").apply(make_points)
The result is a properly shaped array of points for each time bucket:
Time
1 [[5, 5], [5, 6], [6, 6], [6, 5]]
2 [[4, 5], [3, 6], [3, 5], [4, 6]]
dtype: object
Finally it suffices to compute the Delaunay triangulation for each time bucket and plot it:
fig, axe = plt.subplots()
for p in c:
    tri = Delaunay(p)
    axe.triplot(*p.T, tri.simplices)
You can even make it in a single call:
def make_triangulation(x):
    return Delaunay(np.array(list(zip(x['A_X'], x['A_Y']))))
c = df.groupby("Time").apply(make_triangulation)
fig, axe = plt.subplots()
for tri in c:
    axe.triplot(*tri.points.T, tri.simplices)
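For reference, here is a minimal sketch of why the original call fails, using the question's own data. Zipping the per-group x lists with the per-group y lists gives, for each group, one row of all x values and one row of all y values, so the array reads as two points in four dimensions. That matches the "not enough points(2)" message. The names `paired`, `bad` and `good` are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Time": [1, 1, 1, 1, 2, 2, 2, 2],
    "A_X": [5, 5, 6, 6, 4, 3, 3, 4],
    "A_Y": [5, 6, 6, 5, 5, 6, 5, 6],
})

# The question's approach: one list of x values and one list of y values per group.
xs = df.groupby("Time")["A_X"].agg(list).tolist()
ys = df.groupby("Time")["A_Y"].agg(list).tolist()
paired = list(zip(xs, ys))

# First group: row 0 holds all x values, row 1 holds all y values.
bad = np.array(paired[0])
print(bad.shape)  # (2, 4)

# One (x, y) row per point instead, keyed by the Time value.
good = {t: g[["A_X", "A_Y"]].to_numpy() for t, g in df.groupby("Time")}
print(good[1].shape)  # (4, 2)
```

Passing `good[1]` to `Delaunay` then supplies four two-dimensional points, which is the shape the answer above produces with `apply`.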
I have the following DataFrame:
LATITUDE LONGITUDE STATE
... ... True
With the code below I can plot the coordinates:
import matplotlib.pyplot as plt
plt.scatter(x=df['LAT'], y=df['LONG'])
plt.show()
However, I want to give each point one of two different colors, according to the 'state' attribute.
How can I do this?
What you're looking for is the c parameter. Taking your example and adding a STATUS column:
import matplotlib.pyplot as plt
df = {'LAT': [1, 2, 3, 4, 5], 'LONG': [3, 2, 4, 5, 3], 'STATUS': [0, 1, 0, 0, 1] }
plt.scatter(x=df['LAT'], y=df['LONG'], c=df['STATUS'])
plt.show()
This shows a bicoloured chart.
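If the column holds booleans, as in the question's table, one option is to map each value to an explicit color before passing it to `c`. This is only a sketch: the sample values and the two color names are assumptions chosen for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample data; the column names follow the question's table.
df = pd.DataFrame({
    "LATITUDE": [1, 2, 3, 4, 5],
    "LONGITUDE": [3, 2, 4, 5, 3],
    "STATE": [False, True, False, False, True],
})

# Map each boolean state to a color name (the colors are arbitrary choices).
colors = df["STATE"].map({True: "tab:green", False: "tab:red"}).tolist()

fig, ax = plt.subplots()
ax.scatter(df["LATITUDE"], df["LONGITUDE"], c=colors)
ax.set_xlabel("LATITUDE")
ax.set_ylabel("LONGITUDE")
```

Call `plt.show()` afterwards to display the figure. Compared with passing the raw column to `c`, the mapping makes the two colors explicit instead of leaving them to the default colormap.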
I'm trying to change the size of only SOME of the markers in a seaborn pairplot.
df = pd.DataFrame({'num_legs': [2, 4, 8, 0],
                   'num_wings': [2, 0, 0, 0],
                   'num_specimen_seen': [10, 2, 1, 8]},
                  index=['falcon', 'dog', 'spider', 'fish'])
Prettier:
        num_legs  num_wings  num_specimen_seen  class
falcon         2          2                 10      1
dog            4          0                  2      2
spider         8          0                  1      3
fish           0          0                  8      4
For example, I want to increase the size of all samples with class=4.
How could this be done with the seaborn pairplot?
What I have so far:
sns.pairplot(data=df,diag_kind='hist',hue='class')
I have tried adding plot_kws={"s": 3}, but that changes the size of all the dots. Cheers!
After checking how the pairplot is built up, one could iterate through the axes and change the size of the fourth set of scatter dots in each:
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import pandas as pd
N = 100
classes = np.random.randint(1, 5, N)
df = pd.DataFrame({'num_legs': 2 * classes % 8,
'num_wings': (classes == 1) * 2,
'num_specimen_seen': np.random.randint(1,20,N),
'class': classes})
g = sns.pairplot(data=df,diag_kind='hist',hue='class')
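for ax in np.ravel(g.axes):
    # off-diagonal axes hold one scatter collection per class; index 3 is the fourth class
    if len(ax.collections) == 4:
        ax.collections[3].set_sizes([100])
# enlarge the matching legend marker as well
# (recent Matplotlib versions name this attribute legend_handles)
g.fig.legends[0].legendHandles[3].set_sizes([100])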
plt.show()
I need to add subplots in two columns and multiple rows. The number of rows is not fixed, but in the first column I want to create a seaborn line plot from one data set, and in the second column a seaborn line plot from another data set.
I have tried the following, but it is not working.
tips = sns.load_dataset("tips")
dataset2=tips
days = list(tips.drop_duplicates('day')['day'])
ggpec = gridspec.GridSpec(len(days ), 2)
axs = []
for i,j in zip(days,range(1,len(days)+1)):
    fig = plt.figure(figsize=(20,4),dpi=200)
    palette = sns.color_palette("magma", 2)
    chart = sns.lineplot(x="time", y="total_bill",
                         hue="sex",style='sex',
                         palette=palette, data=tips[tips['day']==i])
    chart.set_xticklabels(
        chart.get_xticklabels(),
        rotation=90,
        minor=True,
        verticalalignment=True,
        horizontalalignment='right',
        fontweight='light',
        fontsize='large'
    )
    plt.title("Title 1",fontsize=18, fontweight='bold')
    fig2 = plt.figure(figsize=(20,5),dpi=200)
    palette = sns.color_palette("magma", 2)
    chart = sns.lineplot(x="time", y="total_bill",
                         hue="sex",style='sex',
                         palette=palette, data=dataset2[dataset2['day']==i])
    chart.set_xticklabels(
        chart.get_xticklabels(),
        rotation=90,
        minor=True,
        verticalalignment=True,
        horizontalalignment='right',
        fontweight='light',
        fontsize='large'
    )
    plt.title("Title 2",fontsize=18, fontweight='bold')
plt.show()
To create multiple plots with 2 columns and multiple rows, you can use subplot, where you define the number of rows, the number of columns, and which subplot to activate.
import matplotlib.pyplot as plt
plt.subplot(3, 2, 1) # Define 3 rows, 2 column, Activate subplot 1.
plt.plot([1, 2, 3, 4, 5, 6, 7], [7, 8, 6, 5, 2, 2, 4], 'b*-', label='Plot 1')
plt.subplot(3, 2, 2) # 3 rows, 2 column, Activate subplot 2.
# plot some data here
plt.plot([1, 2, 3, 4, 5, 6, 7], [7, 8, 6, 5, 2, 2, 4], 'b*-', label='Plot 2')
plt.subplot(3, 2, 3) # 3 rows, 2 column, Activate subplot 3.
# plot some data here
plt.plot([1, 2, 3, 4, 5, 6, 7], [7, 8, 6, 5, 2, 2, 4], 'b*-', label='Plot 3')
# to Prevent subplots overlap
plt.tight_layout()
plt.show()
You can build upon this concept to draw your seaborn plots as well.
f, axes = plt.subplots(3,2) # Divide the plot into 3 rows, 2 columns
# Draw the plot in first row second column
# (xData, yData and dataSource are placeholders; recent seaborn versions require x and y as keywords)
sns.lineplot(x=xData, y=yData, data=dataSource, ax=axes[0][1])
I'm fairly new to Pandas, but typically what I do with data (when all columns are of equal sizes), I build np.zeros(count) matrices, then use a for loop to populate the data from a text file (np.genfromtxt()) to do my graphing and analysis in matplotlib.
However, I am now trying to implement similar analysis with columns of different sizes on the same plot from a CSV file.
For instance:
data.csv:
A B C D E F
1 2 3 4 5 6
2 3 4 5 6 7
3 4 5 6
4 5
df = pandas.read_csv('data.csv')
ax = df.plot(x = 'A', y = 'B')
df.plot(x = 'C', y = 'D', ax = ax)
df.plot(x = 'E', y = 'F', ax = ax)
This code plots the first two on the same graph, but the rest of the information is lost (and there are a lot more columns of mismatched sizes, but the x/y columns I am plotting are all the same size).
Is there an easier way to do all of this? Thanks!
Here is how you could generalize your solution.
I edited my answer to add error handling: if you have a lonely last column, it will still work.
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
data = {
    'A' : [1, 2, 3, 4],
    'B' : [2, 3, 4, 5],
    'C' : [3, 4, 5, np.nan],
    'D' : [4, 5, 6, np.nan],
    'E' : [5, 6, np.nan, np.nan],
    'F' : [6, 7, np.nan, np.nan]
}
df = pd.DataFrame(data)
def Chris(df):
    ax = df.plot(x='A', y='B')
    df.plot(x='C', y='D', ax=ax)
    df.plot(x='E', y='F', ax=ax)
    plt.show()
def IMCoins(df):
    fig, ax = plt.subplots()
    try:
        for idx in range(0, df.shape[1], 2):
            df.plot(x = df.columns[idx],
                    y = df.columns[idx + 1],
                    ax= ax)
    except IndexError:
        print('Index Error: Log the error.')
    plt.show()
Chris(df)
IMCoins(df)
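A variant of the same idea, sketched under the assumption that the columns always come in consecutive (x, y) pairs: pairing the column names with `zip` drops a lonely last column without needing a try/except, and `dropna` removes the padding rows of the shorter pairs. The helper name `column_pairs` is only illustrative.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "B": [2, 3, 4, 5],
    "C": [3, 4, 5, np.nan],
    "D": [4, 5, 6, np.nan],
    "E": [5, 6, np.nan, np.nan],
    "F": [6, 7, np.nan, np.nan],
})

def column_pairs(columns):
    """Pair columns as (x, y): (0, 1), (2, 3), ... A trailing unpaired column is dropped."""
    cols = list(columns)
    return list(zip(cols[::2], cols[1::2]))

fig, ax = plt.subplots()
for x_col, y_col in column_pairs(df.columns):
    # Keep only the rows where both columns of the pair have a value.
    pair = df[[x_col, y_col]].dropna()
    ax.plot(pair[x_col], pair[y_col], label=f"{y_col} vs {x_col}")
ax.legend()
```

Call `plt.show()` afterwards to display the three lines on the shared axes.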
I created a polar contour plot and am trying to close it by adding the data of the first row to the end.
See in this picture at 180 deg:
PolarPlot
The data is created using the meshgrid and griddata functions.
The arrays are all of size n x n.
For example:
ri - float64 - (100,100) size
print ri
[[ 0.00160738 0.00184056 0.00207375 ..., 0.23409252 0.23432571
0.23455889]
[ 0.00160738 0.00184056 0.00207375 ..., 0.23409252 0.23432571
0.23455889]
[ 0.00160738 0.00184056 0.00207375 ..., 0.23409252 0.23432571
0.23455889]
The theta and contour arrays are created in the same way.
Plotting is done with matplotlib.
How can I do this? Is this the right way for "closing" the plot at 180 degrees?
And here is the plot snippet:
fig4 = plt.figure()
ax = fig4.add_subplot(111)
ax = plt.axes(polar=True)
image=plt.contourf(thetai,ri/d,contourd,128,vmin=0,extent=([-math.pi,+math.pi,min(ri[0]/d),max(ri[0]/d)]),cmap=plt.cm.Paired)
ax.set_xlabel(r'$\Theta$')
ax.set_ylabel(r'$r$')
I'm not sure about your exact plot, but you can perform the array operation you describe using numpy.pad.
We can give np.pad your original array and a pad_width of ((0,1),(0,1)), which means pad 0 rows on top and 1 on the bottom, and 0 columns on the left and 1 on the right. Set mode='edge' to copy the values on the edge of the array.
For example:
In [16]: a = np.array([[1,1,5],[2,2,6],[7,8,9]])
In [17]: a
Out[17]:
array([[1, 1, 5],
       [2, 2, 6],
       [7, 8, 9]])
In [18]: np.pad(a,((0,1),(0,1)),mode='edge')
Out[18]:
array([[1, 1, 5, 5],
       [2, 2, 6, 6],
       [7, 8, 9, 9],
       [7, 8, 9, 9]])
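Note that mode='edge' repeats the last row and column. If the goal is to append a copy of the first row, as the question describes, np.pad also has mode='wrap'. A small sketch on the same example array, padding only one row at the bottom:

```python
import numpy as np

a = np.array([[1, 1, 5], [2, 2, 6], [7, 8, 9]])

# Pad one row at the bottom and no columns; 'wrap' fills it with the first row.
closed = np.pad(a, ((0, 1), (0, 0)), mode="wrap")
print(closed)
```

Here the last row of `closed` is a copy of the first row of `a`, so the data ends where it started.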
It is not really clear from your question what your data looks like.
In general though, for polar plots to be closed, the last point of your data has to be the same as the first. So if you have an array X where X[:,0] is the angle and X[:,1] is the radius, you can close the polar plot by appending the first element to the end like so:
X_closed = np.append(X,X[[0]],axis = 0)
I.e. you only have to add the first row to the end, not the first column.
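For gridded data like in the question, the same idea can be sketched as follows. This assumes the angle varies along the first axis (each row is one angle, as the identical rows of ri suggest) and that the grid does not already contain the closing angle. The function name `close_polar_grid` and the sample grid are hypothetical.

```python
import numpy as np

def close_polar_grid(theta, r, values):
    """Append the first row to the end of each array, shifting its angle by a full turn."""
    theta_closed = np.concatenate([theta, theta[:1] + 2 * np.pi], axis=0)
    r_closed = np.concatenate([r, r[:1]], axis=0)
    values_closed = np.concatenate([values, values[:1]], axis=0)
    return theta_closed, r_closed, values_closed

# Hypothetical grid: 8 angles in [-pi, pi) by 5 radii.
angles = np.linspace(-np.pi, np.pi, 8, endpoint=False)
radii = np.linspace(0.0, 1.0, 5)
r, theta = np.meshgrid(radii, angles)  # both arrays have shape (8, 5)
values = np.cos(theta) * r

theta_c, r_c, values_c = close_polar_grid(theta, r, values)
print(theta_c.shape)  # (9, 5)
```

The appended angle row is shifted by 2π so that the angles stay increasing. The closed arrays can then be passed to `contourf` in place of the originals.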