for loop to plot multiple graphs in one diagram - Python

I am trying to plot multiple graphs in one diagram. I am planning to do it with a for loop.
x = df1['mrwSmpVWi']
c = df['c']
a = df['a']
b = df['b']
y = (c / (1 + (a) * np.exp(-b*(x))))
for number in df.Seriennummer:
    plt.plot(x,y, linewidth = 4)
    plt.title("TEST")
    plt.xlabel('Wind in m/s')
    plt.ylabel('Leistung in kWh')
    plt.xlim(0,25)
    plt.ylim(0,1900)
    plt.show()
The calculation doesn't work: I just get dots in the diagram, and I get 3 separate diagrams instead of one.
This is the df:
Seriennummer c a b
0 701085 1526 256 0.597
1 701086 1193 271 0.659
2 701087 1266 217 0.607
Does someone know what I did wrong?
Df1 has about 500,000 rows. This is a part of df1:
Seriennummer mrwSmpVWi mrwSmpP
422 701087.0 2.9 25.0
423 701090.0 3.9 56.0
424 701088.0 3.2 22.0
425 701086.0 4.0 49.0
426 701092.0 3.7 46.0
427 701089.0 3.3 0.0
428 701085.0 2.4 4.0
429 701091.0 3.6 40.0
430 701087.0 2.7 11.0
431 701090.0 3.1 23.0
432 701086.0 3.6 35.0
The expected output should be a single diagram with multiple logistic graphs.

I guess you are using matplotlib. You can use something like
import matplotlib.pyplot as plt
# some calculations for x and y ...
fig, ax = plt.subplots(ncols=1,nrows=1)
for i in range(10):
    ax.plot(x[i],y[i])
plt.show()
Further information can be found in the matplotlib subplots documentation:
https://matplotlib.org/api/_as_gen/matplotlib.pyplot.subplots.html
Because your problem is related to pandas data frames, try something like
for _, row in df.iterrows():
    # wind speeds measured for this turbine, sorted so the line is drawn left to right
    x = df1.loc[df1['Seriennummer'] == row['Seriennummer'], 'mrwSmpVWi'].sort_values()
    y = row['c'] / (1 + row['a'] * np.exp(-row['b'] * x))
    plt.plot(x, y, linewidth = 4)
plt.show()
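For reference, here is a minimal self-contained sketch of the same idea: one axes, one logistic curve per row of the parameter table, and a single figure at the end. The parameter values are copied from the question. The evenly spaced wind grid and the helper name logistic_power are assumptions made for illustration, not part of the original code.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Parameter table from the question (one row per turbine).
df = pd.DataFrame({
    "Seriennummer": [701085, 701086, 701087],
    "c": [1526, 1193, 1266],
    "a": [256, 271, 217],
    "b": [0.597, 0.659, 0.607],
})


def logistic_power(wind, c, a, b):
    """Logistic curve c / (1 + a * exp(-b * wind)). Hypothetical helper."""
    return c / (1 + a * np.exp(-b * wind))


# Assumption: evaluate on an evenly spaced grid instead of the raw measurements.
wind = np.linspace(0, 25, 251)

fig, ax = plt.subplots()
for row in df.itertuples(index=False):
    # every call draws onto the same axes, so all curves share one diagram
    ax.plot(wind, logistic_power(wind, row.c, row.a, row.b),
            linewidth=2, label=str(row.Seriennummer))

ax.set(title="TEST", xlabel="Wind in m/s", ylabel="Leistung in kWh",
       xlim=(0, 25), ylim=(0, 1900))
ax.legend(title="Seriennummer")
# plt.show()  # call once, after the loop, when running interactively
```

Calling plt.show() only once after the loop is what keeps everything in one diagram. Calling it inside the loop opens a new figure on each iteration.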

Related

Seaborn custom axis scale: matplotlib.scale.FuncScale

I'm trying to figure out how to get a custom scale for my axis. My x-axis goes from 0 to 1,000,000 in 100,000 step increments, but I want to scale each of these numbers by 1/1000, so that they go from 0 to 1,000 in 100 step increments. I found matplotlib.scale.FuncScale, but I'm having trouble getting it to work.
My code looks like this:
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
dataPlot = pd.DataFrame({"plot1" : [1, 2, 3], "plot2" : [4, 5, 6], "plot3" : [7, 8, 9]})
ax = sns.lineplot(data = dataPlot, dashes = False, palette = ["blue", "red", "green"])
ax.set_xlim(1, numRows)
ax.set_xticks(range(0, numRows, 100000))
plt.ticklabel_format(style='plain')
plt.scale.FuncScale("xaxis", ((lambda x : x / 1000), (lambda y : y * 1000)))
When I run this code specifically, I get AttributeError: module 'matplotlib.pyplot' has no attribute 'scale', so I tried adding import matplotlib as mpl to the top of the code and then changing the last line to be mpl.scale.FuncScale("xaxis", ((lambda x : x / 1000), (lambda y : y * 1000))). That actually ran without error, but it didn't change anything.
How can I get this to properly scale the axis?
Based on the clarification in the question comments, a straightforward solution is to scale the x-axis data in the dataframe (in the question's case the x-data is the df index) and then plot.
I'm using example data since the code from the question doesn't run on its own.
The x range starts as 0 to 100 and is then scaled to 0 to 10; the same approach works for any other starting range and scaling factor.
1st the default df.plot (just as a reference):
import pandas as pd
import numpy as np
arr = np.arange(0, 101, 1) * 1.5
df = pd.DataFrame(arr, columns=['y_data'])
print(df)
y_data
0 0.0
1 1.5
2 3.0
3 4.5
4 6.0
.. ...
96 144.0
97 145.5
98 147.0
99 148.5
100 150.0
df.plot()
Note that by default df.plot uses the index as the x-axis.
2nd scaling the x-data in the dataframe:
The intermediate dfs are only displayed so you can follow along.
Preparation
df.reset_index(inplace=True)
Getting the original index data as a column to further work with (see scaling below).
index y_data
0 0 0.0
1 1 1.5
2 2 3.0
3 3 4.5
4 4 6.0
.. ... ...
96 96 144.0
97 97 145.5
98 98 147.0
99 99 148.5
100 100 150.0
df = df.rename(columns = {'index':'x_data'}) # just to be more explicit
x_data y_data
0 0 0.0
1 1 1.5
2 2 3.0
3 3 4.5
4 4 6.0
.. ... ...
96 96 144.0
97 97 145.5
98 98 147.0
99 99 148.5
100 100 150.0
Scaling
df['x_data'] = df['x_data'] / 10
x_data y_data
0 0.0 0.0
1 0.1 1.5
2 0.2 3.0
3 0.3 4.5
4 0.4 6.0
.. ... ...
96 9.6 144.0
97 9.7 145.5
98 9.8 147.0
99 9.9 148.5
100 10.0 150.0
3rd df.plot with specific columns:
df.plot(x='x_data', y = 'y_data')
Passing x= selects a specific column as the x-axis instead of the default, which is the index.
Note that the y data hasn't changed but the x-axis is now scaled compared to the "1st the default df.plot" above.
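If the data itself should stay untouched, a different technique from the one used in this answer is to rescale only the tick labels with matplotlib.ticker.FuncFormatter. The sketch below reuses the example data from above. The divisor SCALE and the helper name scaled_label are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

arr = np.arange(0, 101, 1) * 1.5
df = pd.DataFrame(arr, columns=["y_data"])

SCALE = 10  # assumed divisor, matching the 0-100 -> 0-10 example


def scaled_label(value, pos):
    """Return the tick label for value / SCALE; the plotted data stay unchanged."""
    return f"{value / SCALE:g}"


ax = df.plot()  # x-axis is still the original index, 0 to 100
ax.xaxis.set_major_formatter(FuncFormatter(scaled_label))
# plt.show()  # uncomment when running interactively
```

With this variant the axis limits and any later set_xlim or set_xticks calls remain in the original units; only the displayed labels are divided by SCALE.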

Plotting a mathematical function in Python

I want to plot the data shown below and compare it to a function which gives me the theoretical plot. I am able to plot the data with its uncertainty, but I am struggling to plot the mathematical function which gives me the theoretical plot.
amplitude uncertainty position
5.2 0.429343685 0
12.2 1.836833144 1
21.4 0.672431409 2
30.2 0.927812481 3
38.2 1.163321108 4
44.2 1.340998136 5
48.4 1.506088975 6
51 1.543016526 7
51.2 1.587229032 8
49.8 1.507327436 9
46.2 1.400355669 10
40.6 1.254401849 11
32.5 0.995301462 12
24.2 0.753044487 13
14 0.58 14
7 0.29 15
Here is my code so far:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
data = pd.read_excel("Verdier_6.xlsx")
verdier = data.values
frekvens = verdier [:,3]
effektresonans = verdier [:,0]
usikkerhet = verdier [:,1]
x = np.arange(0,15,0.1)
p= 28.2
r=0.8156
v= 343.8
f= 1117
y=p*np.sqrt(1+r**2+2*r*np.cos(((2*np.pi)/(v/f))*x))
plt.plot(x,y)
plt.plot(frekvens, effektresonans)
plt.errorbar(frekvens, effektresonans, usikkerhet, fmt = "o")
plt.title("")
plt.xlabel("Posisjon, X [cm]")
plt.ylabel("Amplitude, U [mV] ")
plt.grid()
plt.show()
Here is an image of the plot with only the experimental data shown above:
Here is an image of how my experimental and theoretical plots look:
And here is an image of how the experimental and theoretical plots should look:

How to make plots with small whitespace separations in Matplotlib or Seaborn?

I'd like to make this type of plot with multiple columns separated by small whitespace, each column being a different category with 3-5 (5 in this example) observations with varying values on the y-axis:
Actually, I can make this plot using ggplot2, for example:
head(mtcars)
# mpg cyl disp hp drat wt qsec vs am gear carb
# Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
# Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
# Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
# Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
# Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
# Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
library(dplyr)
library(ggplot2)
mtcars %>% reshape2::melt() %>%
ggplot(aes(x = variable, y = value)) +
geom_point() + facet_grid(~ variable) +
theme(axis.text.x = element_blank())
You set a categorical variable in your dataset, then use facet_grid(~). This function splits your plot into multiple panels by your categorical variable.
Here is an approach to draw a similar plot using Python's matplotlib. The plot has a grey background and white major and minor gridlines to delimit the zones. Getting the dots in the center of each little cell is somewhat tricky: divide each unit interval into n cells and shift the points by half a cell (1/(2n)). A secondary x-axis can be used to set the labels. A zorder has to be set to have the dots on top of the gridlines.
import numpy as np
from matplotlib import pyplot as plt
from matplotlib import ticker
n = 5
cols = 7
values = [np.random.uniform(1, 10, n) for c in range(cols)]
fig, ax = plt.subplots()
ax.set_facecolor('lightgrey')
ax.xaxis.set_major_locator(ticker.MultipleLocator(1))
ax.xaxis.set_minor_locator(ticker.MultipleLocator(1 / (n)))
ax.yaxis.set_major_locator(ticker.MultipleLocator(1))
ax.grid(True, which='both', axis='both', color='white')
ax.set_xticklabels([])
ax.tick_params(axis='x', which='both', length=0)
ax.grid(which='major', axis='both', lw=3)
ax.set_xlim(1, cols + 1)
for i in range(1, cols + 1):
    ax.scatter(np.linspace(i, i + 1, n, endpoint=False) + 1 / (2 * n), values[i-1], c='crimson', zorder=2)
ax2 = ax.twiny()
ax2.set_xlim(0.5, cols + 0.5)
ticks = range(1, cols + 1)
ax2.set_xticks(ticks)
ax2.set_xticklabels([f'Cat_{t:02d}' for t in ticks])
bbox = dict(boxstyle="round", ec="limegreen", fc="limegreen", alpha=0.5)
plt.setp(ax2.get_xticklabels(), bbox=bbox)
ax2.tick_params(axis='x', length=0)
plt.show()
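A different route to the same look, closer to what facet_grid does, is one narrow subplot per category with a shared y-axis and a small wspace. This is only a sketch and not the approach used above. The seeded random data, the figure size and the Cat_XX titles are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

n, cols = 5, 7
rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
values = [rng.uniform(1, 10, n) for _ in range(cols)]

# One narrow panel per category; wspace controls the whitespace between panels.
fig, axes = plt.subplots(1, cols, sharey=True, figsize=(8, 3),
                         gridspec_kw={"wspace": 0.05})

for i, (ax, vals) in enumerate(zip(axes, values), start=1):
    ax.scatter(np.arange(n), vals, c="crimson")
    ax.set_facecolor("lightgrey")
    ax.set_title(f"Cat_{i:02d}", fontsize=9)
    ax.set_xticks([])      # hide x ticks, as in the ggplot2 version
    ax.margins(x=0.2)      # keep the dots away from the panel edges
# plt.show()  # uncomment when running interactively
```

Because sharey=True, only the leftmost panel shows y tick labels, and all panels use the same y range.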

Python: Finding multiple linear trend lines in a scatter plot

I have the following pandas dataframe -
Atomic Number R C
0 2.0 49.0 0.040306
1 3.0 205.0 0.209556
2 4.0 140.0 0.107296
3 5.0 117.0 0.124688
4 6.0 92.0 0.100020
5 7.0 75.0 0.068493
6 8.0 66.0 0.082244
7 9.0 57.0 0.071332
8 10.0 51.0 0.045725
9 11.0 223.0 0.217770
10 12.0 172.0 0.130719
11 13.0 182.0 0.179953
12 14.0 148.0 0.147929
13 15.0 123.0 0.102669
14 16.0 110.0 0.120729
15 17.0 98.0 0.106872
16 18.0 88.0 0.061996
17 19.0 277.0 0.260485
18 20.0 223.0 0.164312
19 33.0 133.0 0.111359
20 36.0 103.0 0.069348
21 37.0 298.0 0.270709
22 38.0 245.0 0.177368
23 54.0 124.0 0.079491
The trend between R and C is generally a linear one. What I would like to do, if possible, is find an exhaustive list of all the possible combinations of 3 or more points and what their trends are with scipy.stats.linregress, so that I can find the groups of points that fit a line best.
Ideally this would look something like this for the data (Source), but I am looking for all the other possible trends too.
So the question is: how do I feed all the 16776915 possible combinations (sum_(i=3)^24 binomial(24, i)) of 3 or more points into linregress, and is it even doable without a ton of code?
My following solution proposal is based on the RANSAC algorithm. It is a method to fit a mathematical model (e.g. a line) to data with a large share of outliers.
RANSAC is one specific method from the field of robust regression.
My solution below first fits a line with RANSAC. Then you remove the data points close to this line from your data set (which is the same as keeping the outliers), fit RANSAC again, remove data, etc until only very few points are left.
Such approaches always have parameters which are data dependent (e.g. noise level or proximity of the lines). In the following solution, MIN_SAMPLES and residual_threshold are parameters which might require some adaptation to the structure of your data:
import matplotlib.pyplot as plt
import numpy as np
from sklearn import linear_model
MIN_SAMPLES = 3
x = np.linspace(0, 2, 100)
xs, ys = [], []
# generate points for three lines described by a and b,
# we also add some noise:
for a, b in [(1.0, 2), (0.5, 1), (1.2, -1)]:
    xs.extend(x)
    ys.extend(a * x + b + .1 * np.random.randn(len(x)))
xs = np.array(xs)
ys = np.array(ys)
plt.plot(xs, ys, "r.")
colors = "rgbky"
idx = 0
while len(xs) > MIN_SAMPLES:
    # build design matrix for linear regressor
    X = np.ones((len(xs), 2))
    X[:, 1] = xs
    ransac = linear_model.RANSACRegressor(
        residual_threshold=.3, min_samples=MIN_SAMPLES
    )
    res = ransac.fit(X, ys)
    # vector of boolean values, describes which points belong
    # to the fitted line:
    inlier_mask = ransac.inlier_mask_
    # plot point cloud:
    xinlier = xs[inlier_mask]
    yinlier = ys[inlier_mask]
    # cycle through colors:
    color = colors[idx % len(colors)]
    idx += 1
    plt.plot(xinlier, yinlier, color + "*")
    # only keep the outliers:
    xs = xs[~inlier_mask]
    ys = ys[~inlier_mask]
plt.show()
In the following plot points shown as stars belong to the clusters detected by my code. You also see a few points depicted as circles which are the points remaining after the iterations. The few black stars form a cluster which you could get rid of by increasing MIN_SAMPLES and / or residual_threshold.
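For comparison with the brute-force idea in the question, the subset count can be checked directly, and the enumeration can be sketched for 3-point subsets only. In this sketch np.polyfit stands in for scipy.stats.linregress to avoid an extra dependency. The helper name best_triples, the ranking by squared residual and the choice of the first eight table rows are assumptions for illustration.

```python
from itertools import combinations
from math import comb

import numpy as np

# Number of subsets with 3 or more of the 24 points, as quoted in the question.
total_subsets = sum(comb(24, k) for k in range(3, 25))


def best_triples(x, y, top=5):
    """Rank all 3-point subsets by the squared residual of a least-squares line.

    Hypothetical helper; returns a list of (sse, index_tuple), best fit first.
    """
    scored = []
    for idx in combinations(range(len(x)), 3):
        xi, yi = x[list(idx)], y[list(idx)]
        if np.ptp(xi) == 0:  # identical x values: slope is undefined, skip
            continue
        slope, intercept = np.polyfit(xi, yi, 1)
        sse = float(np.sum((yi - (slope * xi + intercept)) ** 2))
        scored.append((sse, idx))
    return sorted(scored)[:top]


# First eight rows of the table in the question (R and C columns).
R = np.array([49.0, 205.0, 140.0, 117.0, 92.0, 75.0, 66.0, 57.0])
C = np.array([0.040306, 0.209556, 0.107296, 0.124688,
              0.100020, 0.068493, 0.082244, 0.071332])
ranked = best_triples(R, C)
print(total_subsets)
```

Even when every subset can be enumerated, small subsets fit a line almost perfectly by chance, so a ranking like this mostly rewards small groups. That is one reason an iterative approach such as RANSAC is usually more practical here.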

How to make axis tick labels visible on the other side of the plot in gridspec?

I am plotting my favourite example dataframe, which looks like this:
x val1 val2 val3
0 0.0 10.0 NaN NaN
1 0.5 10.5 NaN NaN
2 1.0 11.0 NaN NaN
3 1.5 11.5 NaN 11.60
4 2.0 12.0 NaN 12.08
5 2.5 12.5 12.2 12.56
6 3.0 13.0 19.8 13.04
7 3.5 13.5 13.3 13.52
8 4.0 14.0 19.8 14.00
9 4.5 14.5 14.4 14.48
10 5.0 NaN 19.8 14.96
11 5.5 15.5 15.5 15.44
12 6.0 16.0 19.8 15.92
13 6.5 16.5 16.6 16.40
14 7.0 17.0 19.8 18.00
15 7.5 17.5 17.7 NaN
16 8.0 18.0 19.8 NaN
17 8.5 18.5 18.8 NaN
18 9.0 19.0 19.8 NaN
19 9.5 19.5 19.9 NaN
20 10.0 20.0 19.8 NaN
I have two subplots, for some other reasons it is best for me to use gridspec. The plotting code is as follows (it is quite comprehensive, so I would like to avoid major changes in the code that otherwise works perfectly and just doesn't do one unimportant detail):
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import gridspec
import matplotlib as mpl
df = pd.read_csv('H:/DocumentsRedir/pokus/dataframe.csv', delimiter=',')
# setting limits for x and y
ylimit=(0,10)
yticks1=np.arange(0,11,1)
xlimit1=(10,20)
xticks1 = np.arange(10,21,1)
# general plot formatting (axes colour, background etc.)
plt.style.use('ggplot')
plt.rc('axes',edgecolor='black')
plt.rc('axes', facecolor = 'white')
plt.rc('grid', color = 'grey')
plt.rc('grid', alpha = 0.3) # alpha is percentage of transparency
colours = ['g','b','r']
title1 = 'The plot'
# GRIDSPEC INTRO - rows, cols, distance of individual plots
fig = plt.figure(figsize=(6,4))
gs=gridspec.GridSpec(1,2, hspace=0.15, wspace=0.08,width_ratios=[1,1])
## SUBPLOT of GRIDSPEC with lines
# the first plot
axes1 = plt.subplot(gs[0,0])
for count, vals in enumerate(df.columns.values[1:]):
    X = np.asarray(df[vals])
    h = vals
    p1 = plt.plot(X,df.index,color=colours[count],linestyle='-',linewidth=1.5,label=h)
# formatting
p1 = plt.ylim(ylimit)
p1 = plt.yticks(yticks1, yticks1, rotation=0)
p1 = axes1.yaxis.set_minor_locator(mpl.ticker.MultipleLocator(0.1))
p1 = plt.setp(axes1.get_yticklabels(),fontsize=8)
p1 = plt.gca().invert_yaxis()
p1 = plt.ylabel('x [unit]', fontsize=14)
p1 = plt.xlabel("Value [unit]", fontsize=14)
p1 = plt.tick_params('both', length=5, width=1, which='minor', direction = 'in')
p1 = axes1.xaxis.set_minor_locator(mpl.ticker.MultipleLocator(0.1))
p1 = plt.xlim(xlimit1)
p1 = plt.xticks(xticks1, xticks1, rotation=0)
p1 = plt.setp(axes1.get_xticklabels(),fontsize=8)
p1 = plt.legend(loc='best',fontsize = 8, ncol=2) #
# the second plot (something random)
axes2 = plt.subplot(gs[0,1])
for count, vals in enumerate(df.columns.values[1:]):
    nonans = df[vals].dropna()
    result=nonans-0.5
    p2 = plt.plot(result,nonans.index,color=colours[count],linestyle='-',linewidth=1.5)
p2 = plt.ylim(ylimit)
p2 = plt.yticks(yticks1, yticks1, rotation=0)
p2 = axes2.yaxis.set_minor_locator(mpl.ticker.MultipleLocator(0.1))
p2 = plt.gca().invert_yaxis()
p2 = plt.xlim(xlimit1)
p2 = plt.xticks(xticks1, xticks1, rotation=0)
p2 = axes2.xaxis.set_minor_locator(mpl.ticker.MultipleLocator(0.1))
p2 = plt.setp(axes2.get_xticklabels(),fontsize=8)
p2 = plt.xlabel("Other value [unit]", fontsize=14)
p2 = plt.tick_params('x', length=5, width=1, which='minor', direction = 'in')
p2 = plt.setp(axes2.get_yticklabels(), visible=False)
fig.suptitle(title1, size=16)
plt.show()
However, is it possible to show the y tick labels of the second subplot on the right hand side? The current code produces this:
And I would like to know if there is an easy way to get this:
No, ok, found out it is not precisely what I wanted.
I want the TICKS to be on BOTH sides, just the LABELS to be on the right. The solution above removes my ticks from the left side of the subplot, which doesn't look good. However, this answer seems to get the right solution :)
To sum up:
to get the ticks on both sides and labels on the right, this is what fixes it:
axes2.yaxis.tick_right()
axes2.yaxis.set_ticks_position('both')
And if you need the same for the x axis, it's axes2.xaxis.tick_top()
Try something like
axes2.yaxis.tick_right()
See also the question "Python Matplotlib Y-Axis ticks on Right Side of Plot".
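Put together in a small gridspec figure, the two calls from this thread look roughly like this. The data and the axes names ax_left and ax_right are placeholders for illustration, not the dataframe from the question.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import gridspec

fig = plt.figure(figsize=(6, 4))
gs = gridspec.GridSpec(1, 2, wspace=0.08)
ax_left = fig.add_subplot(gs[0, 0])
ax_right = fig.add_subplot(gs[0, 1])

# Placeholder data: a depth-like variable on the (inverted) y-axis.
depth = np.arange(0, 10.5, 0.5)
ax_left.plot(depth + 10, depth)
ax_right.plot(depth + 9.5, depth)
for ax in (ax_left, ax_right):
    ax.set_ylim(10, 0)

# Move ticks and labels to the right-hand side of the second subplot ...
ax_right.yaxis.tick_right()
# ... then draw tick marks on both sides again; the labels stay on the right.
ax_right.yaxis.set_ticks_position("both")
# plt.show()  # uncomment when running interactively
```

The order matters: tick_right() first moves ticks and labels to the right, and set_ticks_position("both") afterwards only re-enables the tick marks on the left.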
