Plotting a mathematical function in Python - python

I want to plot the data shown below and compare it with a function that gives the theoretical curve. I am able to plot the data with its uncertainty, but I am struggling to plot the mathematical function that gives the theoretical curve.
amplitude uncertainty position
5.2 0.429343685 0
12.2 1.836833144 1
21.4 0.672431409 2
30.2 0.927812481 3
38.2 1.163321108 4
44.2 1.340998136 5
48.4 1.506088975 6
51 1.543016526 7
51.2 1.587229032 8
49.8 1.507327436 9
46.2 1.400355669 10
40.6 1.254401849 11
32.5 0.995301462 12
24.2 0.753044487 13
14 0.58 14
7 0.29 15
Here is my code so far:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
data = pd.read_excel("Verdier_6.xlsx")
verdier = data.values
frekvens = verdier [:,3]
effektresonans = verdier [:,0]
usikkerhet = verdier [:,1]
x = np.arange(0,15,0.1)
p= 28.2
r=0.8156
v= 343.8
f= 1117
y=p*np.sqrt(1+r**2+2*r*np.cos(((2*np.pi)/(v/f))*x))
plt.plot(x,y)
plt.plot(frekvens, effektresonans)
plt.errorbar(frekvens, effektresonans, usikkerhet, fmt = "o")
plt.title("")
plt.xlabel("Posisjon, X [cm]")
plt.ylabel("Amplitude, U [mV] ")
plt.grid()
plt.show()
And here is an image of the plot with only the experimental data shown above:
And here is an image of how my experimental and theoretical plot looks:
And here is an image of how the experimental and theoretical plot should look:
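One thing worth checking in the code above is the units: the x axis is labelled in cm, while v/f (343.8 / 1117, roughly 0.308) is presumably a wavelength in metres if v is in m/s and f in Hz. The sketch below keeps the same formula but converts the wavelength to cm before using it. The function name theoretical_amplitude and the unit interpretation are assumptions, and the curve may additionally need a phase or position offset to line up with the measurements.

```python
import numpy as np

def theoretical_amplitude(x_cm, p=28.2, r=0.8156, v=343.8, f=1117.0):
    """Same formula as in the question, with the wavelength expressed in cm.

    Assumption: v is in m/s and f is in Hz, so v / f is in metres.
    """
    wavelength_cm = (v / f) * 100.0      # metres -> centimetres
    k = 2 * np.pi / wavelength_cm        # wavenumber in rad/cm
    return p * np.sqrt(1 + r**2 + 2 * r * np.cos(k * x_cm))

# Positions in cm, covering the measured range 0..15
x = np.arange(0, 15.1, 0.1)
y = theoretical_amplitude(x)
```

The arrays x and y can then be passed to plt.plot(x, y) alongside the plt.errorbar call from the question.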

Related

How do I plot a beautiful scatter plot with linear regression?

I want to make a beautiful scatter plot with linear regression line using the data given below. I was able to create a scatter plot but am not satisfied with how it looks. Additionally, I want to plot a linear regression line on the data.
My data and code are below:
x y
117.00 111.0
107.00 110.0
77.22 78.0
112.00 95.4
149.00 150.0
121.00 121.0
121.61 120.0
111.54 140.0
73.00 72.0
70.47 000.0
66.3 72.0
113.00 131.0
81.00 81.0
72.00 00.0
74.20 98.0
84.24 90.0
86.60 88.0
99.00 97.0
90.00 102.0
85.00 000.0
138.0 135.0
96.00 93.0
import numpy as np
import matplotlib.pyplot as plt
print(plt.style.available)
from sklearn.linear_model import LinearRegression
plt.style.use('ggplot')
data = np.loadtxt('test_data', dtype=float, skiprows=1,usecols=(0,1))
x=data[:,0]
y=data[:,1]
plt.xlim(20,200)
plt.ylim(20,200)
plt.scatter(x,y, marker="o",)
plt.show()
Please check the snippet below. You can use numpy.polyfit() with degree 1 to calculate the slope m and y-intercept b of the line y = m*x + b.
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('ggplot')
data = np.loadtxt('test_data.txt', dtype=float, skiprows=1,usecols=(0,1))
x=data[:,0]
y=data[:,1]
plt.xlim(20,200)
plt.ylim(20,200)
plt.scatter(x,y, marker="o",)
m, b = np.polyfit(x, y, 1)
plt.plot(x, m*x + b)
plt.show()
Edit1:
Based on your comment, I added more points; the graph now looks like this, and the line appears to pass through the points.
To make the points transparent, use the alpha argument, which takes a value between 0 and 1. Here I set alpha=0.5:
plt.scatter(x,y, marker="o",alpha=0.5)
Edit2: Based on @tmdavison's suggestion:
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('ggplot')
data = np.loadtxt('test_data.txt', dtype=float, skiprows=1,usecols=(0,1))
x=data[:,0]
y=data[:,1]
x2 = np.arange(0, 200)
plt.xlim(20,200)
plt.ylim(20,200)
plt.scatter(x,y, marker="o",)
m, b = np.polyfit(x, y, 1)
plt.plot(x2, m*x2 + b)
plt.show()

Display mean and deviation values on grouped boxplot in Python

I want to display mean and standard deviation values above each of the boxplots in the grouped boxplot (see picture).
My code is
import pandas as pd
import seaborn as sns
from os.path import expanduser as ospath
df = pd.read_excel(ospath('~/Documents/Python/Kandidatspeciale/TestData.xlsx'),'Ark1')
bp = sns.boxplot(y='throw angle', x='incident angle',
data=df,
palette="colorblind",
hue='Bat type')
bp.set_title('Rubber Comparison',fontsize=15,fontweight='bold', y=1.06)
bp.set_ylabel('Throw Angle [degrees]',fontsize=11.5)
bp.set_xlabel('Incident Angle [degrees]',fontsize=11.5)
Where my dataframe, df, is
Bat type incident angle throw angle
0 euro 15 28.2
1 euro 15 27.5
2 euro 15 26.2
3 euro 15 27.7
4 euro 15 26.4
5 euro 15 29.0
6 euro 30 12.5
7 euro 30 14.7
8 euro 30 10.2
9 china 15 29.9
10 china 15 31.1
11 china 15 24.9
12 china 15 27.5
13 china 15 31.2
14 china 15 24.4
15 china 30 9.7
16 china 30 9.1
17 china 30 9.5
I tried the following code. It needs to be independent of the number of x values (incident angles); for instance, it should also work for additional angles such as 45, 60, etc.
m=df.mean(axis=0) #Mean values
st=df.std(axis=0) #Standard deviation values
for i, line in enumerate(bp['medians']):
    x, y = line.get_xydata()[1]
    text = ' μ={:.2f}\n σ={:.2f}'.format(m[i], st[i])
    bp.annotate(text, xy=(x, y))
Can somebody help?
This question brought me here since I was also looking for a similar solution with seaborn.
After some trial and error, you just have to change the for loop to:
for i in range(len(m)):
    bp.annotate(
        ' μ={:.2f}\n σ={:.2f}'.format(m[i], st[i]),
        xy=(i, m[i]),
        horizontalalignment='center'
    )
This change worked for me (although I just wanted to print the actual median values). You can also change the font size, color, or style (e.g. weight) by passing them as arguments to annotate.
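Note that df.mean(axis=0) gives one value per column rather than one per box. A minimal sketch of computing per-box statistics instead, assuming each box corresponds to one (incident angle, Bat type) pair. The DataFrame reuses the rows posted in the question, and the variable name stats is illustrative:

```python
import pandas as pd

# Rows copied from the question's dataframe
df = pd.DataFrame({
    "Bat type": ["euro"] * 9 + ["china"] * 9,
    "incident angle": ([15] * 6 + [30] * 3) * 2,
    "throw angle": [28.2, 27.5, 26.2, 27.7, 26.4, 29.0, 12.5, 14.7, 10.2,
                    29.9, 31.1, 24.9, 27.5, 31.2, 24.4, 9.7, 9.1, 9.5],
})

# One mean and one standard deviation per (incident angle, bat type) group
stats = (
    df.groupby(["incident angle", "Bat type"])["throw angle"]
      .agg(["mean", "std"])
)

for (angle, bat), row in stats.iterrows():
    print(f"angle={angle} bat={bat} mean={row['mean']:.2f} std={row['std']:.2f}")
```

Because the groups come from the data, additional angles such as 45 or 60 show up as extra rows without changing the code. Placing each label above the matching box is not covered here.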

Using the matplotlib to plot

I am trying to analyze some open data and plot a scatter figure, but I always get an error.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Read the CSV text file
csv_file = ("../ff0002fiac-4.csv")
data = pd.read_csv(csv_file,names=['a','b','c','d','e','f'])
print(data.head(5))
#df=pd.DataFrame(data)
years=data['a']
people=data['b']
print(years)
print(people)
data.plot(kind='line',x=years,y=people)
plt.show()
I expected a scatter figure, but the result is an error.
Here is the data:
a b c d e f
0 100 3.56 120905 89608 72562 6686
1 101 3.43 118800 90229 73645 7858
2 102 3.47 116210 90236 73148 9170
3 103 3.17 105977 82889 68020 7949
4 104 3.36 121654 95517 77258 10049
and the error is shown below:
KeyError: '[100 101 102 103 104 105 106] not in index'
According to the pandas.DataFrame.plot documentation, the x and y parameters should be labels or positions. You probably mean to do this:
data.plot(kind='line',x='a',y='b')
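Since the question asked for a scatter figure, the same label-based call also works with kind='scatter'. A minimal sketch using the five rows printed in the question (only columns a and b are reproduced). Selecting the Agg backend is an assumption about the environment so that the script runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd

# Columns a and b from the rows shown in the question
data = pd.DataFrame({
    "a": [100, 101, 102, 103, 104],
    "b": [3.56, 3.43, 3.47, 3.17, 3.36],
})

# Pass the column labels, not the Series objects themselves
ax = data.plot(kind="scatter", x="a", y="b")
ax.set_title("b versus a")
```

With an interactive backend, a final plt.show() displays the figure.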

Interpolate Temperature Data On Urban Area Using Cartopy

I'm trying to interpolate temperature data observed over an urban area formed by 5 locations. I am using cartopy to interpolate and draw the map; however, when I run the script, the temperature interpolation is not shown and I only get the layer of the urban area with the color palette. Can someone help me fix this error? The links to the shapefile components are:
https://www.dropbox.com/s/0u76k3yegvr09sx/LimiteAMG.shp?dl=0
https://www.dropbox.com/s/yxsmm3v2ey3ngsp/LimiteAMG.cpg?dl=0
https://www.dropbox.com/s/yx05n31dfkggbb6/LimiteAMG.dbf?dl=0
https://www.dropbox.com/s/a6nk0xczgjeen2d/LimiteAMG.prj?dl=0
https://www.dropbox.com/s/royw7s51n2f0a6x/LimiteAMG.qpj?dl=0
https://www.dropbox.com/s/7k44dcl1k5891qc/LimiteAMG.shx?dl=0
Data
Lat Lon tmax
0 20.8208 -103.4434 22.8
1 20.7019 -103.4728 17.7
2 20.6833 -103.3500 24.9
3 20.6280 -103.4261 NaN
4 20.7205 -103.3172 26.4
5 20.7355 -103.3782 25.7
6 20.6593 -103.4136 NaN
7 20.6740 -103.3842 25.8
8 20.7585 -103.3904 NaN
9 20.6230 -103.4265 NaN
10 20.6209 -103.5004 NaN
11 20.6758 -103.6439 24.5
12 20.7084 -103.3901 24.0
13 20.6353 -103.3994 23.0
14 20.5994 -103.4133 25.0
15 20.6302 -103.3422 NaN
16 20.7400 -103.3122 23.0
17 20.6061 -103.3475 NaN
18 20.6400 -103.2900 23.0
19 20.7248 -103.5305 24.0
20 20.6238 -103.2401 NaN
21 20.4753 -103.4451 NaN
Code:
import cartopy
import cartopy.crs as ccrs
from matplotlib.colors import BoundaryNorm
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import cartopy.io.shapereader as shpreader
from metpy.calc import get_wind_components
from metpy.cbook import get_test_data
from metpy.gridding.gridding_functions import interpolate, remove_nan_observations
from metpy.plots import add_metpy_logo
from metpy.units import units
to_proj = ccrs.PlateCarree()
data=pd.read_csv('/home/borisvladimir/Documentos/Datos/EMAs/EstacionesZMG/RedZMG.csv',usecols=(1,2,3),names=['Lat','Lon','tmax'],na_values=-99999,header=0)
fname='/home/borisvladimir/Dropbox/Diversos/Shapes/LimiteAMG.shp'
adm1_shapes = list(shpreader.Reader(fname).geometries())
lon = data['Lon'].values
lat = data['Lat'].values
xp, yp, _ = to_proj.transform_points(ccrs.Geodetic(), lon, lat).T
x_masked, y_masked, t = remove_nan_observations(xp, yp, data['tmax'].values)
# Interpolate temperature using Cressman
tempx, tempy, temp = interpolate(x_masked, y_masked, t, interp_type='cressman', minimum_neighbors=3, search_radius=400000, hres=35000)
temp = np.ma.masked_where(np.isnan(temp), temp)
levels = list(range(-20, 20, 1))
cmap = plt.get_cmap('viridis')
norm = BoundaryNorm(levels, ncolors=cmap.N, clip=True)
fig = plt.figure(figsize=(15, 10))
view = fig.add_subplot(1, 1, 1, projection=to_proj)
view.add_geometries(adm1_shapes, ccrs.PlateCarree(),edgecolor='black', facecolor='white', alpha=0.5)
view.set_extent([-103.8, -103, 20.3, 21.099 ], ccrs.PlateCarree())
ZapLon,ZapLat=-103.50,20.80
GuadLon,GuadLat=-103.33,20.68
TonaLon,TonaLat=-103.21,20.62
TlaqLon,TlaqLat=-103.34,20.59
TlajoLon,TlajoLat=-103.44,20.47
plt.text(ZapLon,ZapLat,'Zapopan',transform=ccrs.Geodetic())
plt.text(GuadLon,GuadLat,'Guadalajara',transform=ccrs.Geodetic())
plt.text(TonaLon,TonaLat,'Tonala',transform=ccrs.Geodetic())
plt.text(TlaqLon,TlaqLat,'Tlaquepaque',transform=ccrs.Geodetic())
plt.text(TlajoLon,TlajoLat,'Tlajomulco',transform=ccrs.Geodetic())
mmb = view.pcolormesh(tempx, tempy, temp,transform=ccrs.PlateCarree(),cmap=cmap, norm=norm)
plt.colorbar(mmb, shrink=.4, pad=0.02, boundaries=levels)
plt.show()
The problem is in the call to MetPy's interpolate function. With hres=35000, it generates a grid spaced at 35 km. However, your data points appear to be spaced much more closely than that, so the generated grid has only two points, shown as the red points below (black points are the original stations with non-masked data):
Both grid points lie outside the bounds of your data points, so they end up masked. If we instead set hres to something much lower, say 5 km (i.e. 5000), a much more sensible result comes out:
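The effect of hres can be illustrated with some back-of-the-envelope arithmetic. The sketch below is not MetPy's grid-generation routine. It only estimates the east-west extent of the stations with valid tmax from the posted table and counts how many evenly spaced points fit at a given spacing. The helper points_along_axis, the 111.32 km per degree figure, and the mean latitude of 20.7 are assumptions made for illustration.

```python
import math

# Longitudes of the westernmost and easternmost stations with a valid tmax
# in the posted table (rows 11 and 18)
lon_west, lon_east = -103.6439, -103.2900
mean_lat = 20.7  # rough latitude of the area, in degrees

# Approximate east-west extent in metres
# (about 111.32 km per degree of longitude at the equator, scaled by cos(lat))
extent_m = (lon_east - lon_west) * 111_320 * math.cos(math.radians(mean_lat))

def points_along_axis(extent, hres):
    """Hypothetical helper: evenly spaced points that fit in `extent` at spacing `hres`."""
    return math.floor(extent / hres) + 1

for hres in (35_000, 5_000):
    print(f"hres={hres} m -> {points_along_axis(extent_m, hres)} points east-west")
```

Under these assumptions the extent is under 40 km, so a 35 km spacing leaves room for very few grid points, while a 5 km spacing gives several.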

Cannot remove trend and seasonal components

I am trying to build a model for predicting energy production using an ARMA model.
 
The data I can use for training is as follows:
(https://github.com/soma11soma11/EnergyDataSimulationChallenge/blob/master/challenge1/data/training_dataset_500.csv)
ID Label House Year Month Temperature Daylight EnergyProduction
0 0 1 2011 7 26.2 178.9 740
1 1 1 2011 8 25.8 169.7 731
2 2 1 2011 9 22.8 170.2 694
3 3 1 2011 10 16.4 169.1 688
4 4 1 2011 11 11.4 169.1 650
5 5 1 2011 12 4.2 199.5 763
...............
11995 19 500 2013 2 4.2 201.8 638
11996 20 500 2013 3 11.2 234 778
11997 21 500 2013 4 13.6 237.1 758
11998 22 500 2013 5 19.2 258.4 838
11999 23 500 2013 6 22.7 122.9 586
As shown above, I can use data from July 2011 to May 2013 for training.
Using the training data, I want to predict energy production in June 2013 for each of the 500 houses.
The problem is that the time series is not stationary and has trend and seasonal components (I checked this as follows).
import csv
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
data_train = pd.read_csv('../../data/training_dataset_500.csv')
rng=pd.date_range('7/1/2011', '6/1/2013', freq='M')
house1 = data_train[data_train.House==1][['EnergyProduction','Daylight','Temperature']].set_index(rng)
fig, axes = plt.subplots(nrows=1, ncols=3)
for i, column in enumerate(house1.columns):
    house1[column].plot(ax=axes[i], figsize=(14,3), title=column)
plt.show()
With this data, I cannot fit an ARMA model that gives good predictions. So I want to remove the trend and seasonal components and make the time series stationary. I tried, but I could not remove these components and make the series stationary.
I would recommend the Hodrick-Prescott (HP) filter, which is widely used in macroeconometrics to separate the long-term trend component from short-term fluctuations. It is implemented as statsmodels.api.tsa.filters.hpfilter.
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
df = pd.read_csv('/home/Jian/Downloads/data.csv', index_col=[0])
# get part of the data
x = df.loc[df.House==1, 'Daylight']
# hp-filter with lamb=129600, as commonly suggested for monthly data; hpfilter returns (cycle, trend), so x_smoothed holds the cyclical component
x_smoothed, x_trend = sm.tsa.filters.hpfilter(x, lamb=129600)
fig, axes = plt.subplots(figsize=(12,4), ncols=3)
axes[0].plot(x)
axes[0].set_title('raw x')
axes[1].plot(x_trend)
axes[1].set_title('trend')
axes[2].plot(x_smoothed)
axes[2].set_title('smoothed x')
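To make clear what the filter computes: the HP trend minimises the squared deviation from the data plus lamb times the squared second differences of the trend, which reduces to solving one linear system. The following is a minimal NumPy sketch of that idea, not the statsmodels implementation. The function name hp_filter and the synthetic test series are made up for illustration. It returns (cycle, trend), the same order as statsmodels' hpfilter.

```python
import numpy as np

def hp_filter(y, lamb=129600.0):
    """Illustrative Hodrick-Prescott filter: returns (cycle, trend)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-difference operator K, shape (n - 2, n):
    # (K @ t)[i] = t[i] - 2 * t[i + 1] + t[i + 2]
    K = np.zeros((n - 2, n))
    for i in range(n - 2):
        K[i, i:i + 3] = [1.0, -2.0, 1.0]
    # The trend solves (I + lamb * K^T K) trend = y
    trend = np.linalg.solve(np.eye(n) + lamb * (K.T @ K), y)
    cycle = y - trend
    return cycle, trend

# Synthetic monthly series: a linear trend plus a 12-month oscillation
t = np.arange(24)
series = 100.0 + 2.0 * t + 10.0 * np.sin(2 * np.pi * t / 12)
cycle, trend = hp_filter(series)
```

A purely linear input has zero second differences, so the filter returns it unchanged as the trend. That makes a handy sanity check.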
