I have a data set to which I would like to fit a 1D Sersic profile (a model already defined in astropy). I can easily fit the data if I ignore the error bars, as below:
import numpy as np
from astropy.modeling import models, fitting
from astropy.modeling.models import Sersic1D
##
data = np.loadtxt('/Users/a/data.asc')
##
r_l02 = data[:,0]
flux_l02 = data[:,1]
###########################################Fitting
sers_l02_init = Sersic1D(amplitude=5e40, r_eff=1, n=2)
fit_sers_l02 = fitting.LevMarLSQFitter()
Sers_l02 = fit_sers_l02(sers_l02_init,r_l02,flux_l02)
#####
The code works and everything is fine. However, I noticed there is a problem with the fit due to neglecting the error bars. Thus, I decided to include the error bars. But I do not know how to implement it. Do you have any suggestions?
Related
I typically use MATLAB, but I want to push myself to learn some Python. I tried a linear regression example that was introduced by a YouTuber. Here is the code:
import pandas as pd
from sklearn import linear_model
import matplotlib.pyplot as plt
#read data
dataframe = pd.read_fwf('brain_body.txt')
x_values = dataframe[['Brain']]
y_values = dataframe[['Body']]
#train model on data
body_reg = linear_model.LinearRegression()
body_reg.fit(x_values,y_values)
#visualize results
plt.scatter(x_values,y_values)
plt.plot(x_values,body_reg.predict(x_values))
plt.show()
But I ended up with a very strange plot (I use Python 3.6):
[image 1]
Here is part of it in detail:
[image 2]
Apparently, something is missing or wrong.
The data file brain_body.txt can be found at https://github.com/llSourcell/linear_regression_demo/blob/master/brain_body.txt
Any suggestion or advice is welcome.
Update
I tried sera's code, and here is what I get:
[image 3]
It's funny and weird. It occurred to me that something might be wrong with my data file, or that something is missing in my Python installation, but I just copied and pasted the raw data into Notepad and saved it as .txt. I tried Python 3.6 and 2.7 as well as PyCharm and Spyder, so I have no idea.
BTW, the youtube video is here
@sascha @Moritz @sera I asked my friend to run the same code and data file, and everything is fine. In other words, there is something wrong with my Python and I don't know why. Let me try another computer and/or an earlier version of Python.
I tried, but nothing changed. Here are two different approaches I used to install Python:
1. Install Python (e.g. ver. 3.6); install Pycharm; install packages Pandas, scikit-learn...
2. Install Anaconda
Solved
Thanks to @Marc Bataillou for the suggestion. This is a problem associated with different versions of matplotlib. The problem was found in version 2.1.0. I tried 2.0.2 and found that the original code works fine in the older version; apparently, some changes were made between 2.0.2 and 2.1.0. Thanks for all your efforts.
You should use
plt.scatter(x_values.values,y_values.values)
instead of
plt.scatter(x_values,y_values)
I hope it works!
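A small self-contained sketch of that fix, using a made-up four-row table in place of brain_body.txt. The extra ravel() call is my own addition (not part of the suggestion above); it flattens the (n, 1) array to 1-D, which plotting functions handle without surprises.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the contents of brain_body.txt.
dataframe = pd.DataFrame({"Brain": [3.4, 0.5, 1.4, 465.0],
                          "Body": [44.5, 15.5, 8.1, 423.0]})
x_values = dataframe[["Brain"]]
y_values = dataframe[["Body"]]

# Hand plain NumPy arrays to matplotlib instead of DataFrames.
x = x_values.values.ravel()
y = y_values.values.ravel()

fig, ax = plt.subplots()
ax.scatter(x, y)
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```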
You can visualize the results using the following code. I use cross-validation for the predictions. If the model were perfect, all the dots would lie on the plotted line.
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import cross_val_predict
from sklearn import linear_model
#read data
dataframe = pd.read_fwf('brain_body.txt')
x_values = dataframe[['Brain']]
y_values = dataframe[['Body']]
#model on data
body_reg = linear_model.LinearRegression()
# cross_val_predict returns an array of the same size as `y` where each entry
# is a prediction obtained by cross validation:
predicted = cross_val_predict(body_reg, x_values, y_values, cv=10)
fig, ax = plt.subplots()
ax.scatter(y_values, predicted, edgecolors=(0, 0, 0))
ax.plot([y_values.min(), y_values.max()], [y_values.min(), y_values.max()], 'k--', lw=4)
ax.set_xlabel('Measured')
ax.set_ylabel('Predicted')
plt.show()
Results:
Data
https://ufile.io/p7x0r
I want to plot a cross-section along longitude using the Python Iris module, which was developed for oceanography and meteorology. I'm using their example:
http://scitools.org.uk/iris/docs/v1.4/examples/graphics/cross_section.html
I tried to adapt their code to my data, but the output of my code is empty.
data: http://data.nodc.noaa.gov/thredds/fileServer/woa/WOA09/NetCDFdata/temperature_annual_1deg.nc
import iris
import iris.plot as iplt
import iris.quickplot as qplt
# Enable a future option, to ensure that the netcdf load works the same way
# as in future Iris versions.
iris.FUTURE.netcdf_promote = True
# Load some test data.
fname = 'temperature_annual_1deg.nc'
theta = iris.load_cube(fname, 'sea_water_temperature')
# Extract a single depth vs longitude cross-section. N.B. This could
# easily be changed to extract a specific slice, or even to loop over *all*
# cross section slices.
cross_section = next(theta.slices(['longitude',
'depth']))
qplt.contourf(cross_section, coords=['longitude', 'depth'],
cmap='RdBu_r')
iplt.show()
What you need to understand here is that your current cross_section is the first member of the theta.slices iterator, meaning that it starts from one end of the coordinates (where the data are empty in this case). So you need to advance through the iterator until you get some data. Adding these lines to the code may help you see what is going on:
import numpy as np
cs = theta.slices(['longitude', 'depth'])
for i in cs:
    print(np.nanmax(i))
Which should print something like:
--
--
--
-0.8788
-0.9052
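The same idea can be sketched without the data file: walk through the slices and stop at the first one that has any unmasked values. The list of masked arrays below is a made-up stand-in for the theta.slices iterator, and first_slice_with_data is a hypothetical helper; with real cubes the check would presumably be applied to each cube's data attribute.

```python
import numpy as np

# Made-up stand-in for the slices iterator: two fully masked (empty)
# sections followed by one that contains some valid values.
row = [np.nan, -0.88, -0.91, np.nan]
slices = [
    np.ma.masked_all((3, 4)),
    np.ma.masked_all((3, 4)),
    np.ma.masked_invalid([row, row, row]),
]

def first_slice_with_data(sections):
    """Return (index, section) for the first section with unmasked values."""
    for index, section in enumerate(sections):
        if np.ma.count(section) > 0:  # number of unmasked elements
            return index, section
    return None, None

index, section = first_slice_with_data(slices)
print(index, section.max())
```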
I would like to have a chart with the temperatures for the following days on my website, and the Global Forecasting System meets my needs the most. How do I plot the GRIB2 data in matplotlib and create a PNG image from the plot?
I've spent hours searching the internet and asking people who know how to do this (they were not helpful at all), and I don't know where to start.
GFS data can be found here: ftp://ftp.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/
If possible, I'd like it to be lightweight and not take up too much server space.
If you want to keep data usage and storage lightweight, you may want to consider data formats other than GRIB. GRIB files usually contain worldwide data, which is pretty useless when you only want to plot a specific domain.
I can strongly recommend using data from the NOAA-NCEP OPeNDAP data server. You can retrieve data from this server using netCDF4. Unfortunately, this server is known to be unstable at times, which may cause delays in refreshing runs and/or malformed datasets. Still, about 95% of the time I have access to all the data I need.
Note: This data server may be slow due to heavy traffic after the release of a new run. Access to the data server can be found here: http://nomads.ncdc.noaa.gov/data.php?name=access#hires_weather_datasets
Plotting data is pretty easy with Matplotlib and Basemap toolkits. Some examples, including usage of GFS-datasets, can be found here: http://matplotlib.org/basemap/users/examples.html
Basically, there are 2 steps:
1. Use wgrib2 to extract the selected variables from the GRIB2 data and save them to a NetCDF file. There are APIs such as pygrib, but I found it less buggy to use the command-line tool directly. Some useful links:
install: http://www.cpc.ncep.noaa.gov/products/wesley/wgrib2/compile_questions.html
tricks: http://www.ftp.cpc.ncep.noaa.gov/wd51we/wgrib2/tricks.wgrib2
For example, to extract temperature and humidity:
wgrib2 test.grb2 -s | egrep '(:RH:2 m above ground:|:TMP:2 m above ground:)'|wgrib2 -i test.grb2 -netcdf test.nc
2. Use Python libraries to process the NetCDF file. Example code may look like this:
import warnings
warnings.filterwarnings("ignore")
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# %matplotlib inline  # IPython/Jupyter magic; uncomment only inside a notebook
from netCDF4 import Dataset
from mpl_toolkits.basemap import Basemap
from pyproj import Proj
import matplotlib.cm as cm
import datetime
file = "test.nc"
rootgrp = Dataset(file, "r")
x = rootgrp['longitude'][:] # 0-359, step = 1
y = rootgrp['latitude'][:] # -90~90, step =1
tmp = rootgrp['TMP_2maboveground'][:][0] # shape(181,360)
dt = datetime.datetime(1970,1,1) + datetime.timedelta(seconds = rootgrp['time'][0])
fig = plt.figure(dpi=150)
m = Basemap(projection='mill',lat_ts=10,llcrnrlon=x.min(),
urcrnrlon=x.max(),llcrnrlat=y.min(),urcrnrlat=y.max(), resolution='c')
xx, yy = m(*np.meshgrid(x,y))
m.pcolormesh(xx,yy,tmp-273.15,shading='flat',cmap=plt.cm.jet)
m.colorbar(location='right')
m.drawcoastlines()
m.drawparallels(np.arange(-90.,120.,30.), labels=[1,0,0,0], fontsize=10)
m.drawmeridians(np.arange(0.,360.,60.), labels=[0,0,0,1], fontsize=10)
plt.title("{}, GFS, Temperature (C) ".format(dt.strftime('%Y-%m-%d %H:%M UTC')))
plt.show()
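Since the question asks for a PNG image rather than an interactive window, here is a minimal sketch of that last step. It uses a synthetic temperature field instead of the NetCDF file and plain matplotlib axes instead of Basemap, so all names and values are illustrative assumptions. The Agg backend renders without a display, which is what a web server typically needs.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen; no display required
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the GFS 2 m temperature field, in kelvin.
lon = np.arange(0, 360, 10)
lat = np.arange(-90, 91, 10)
temp_k = 273.15 + 30.0 * np.cos(np.deg2rad(lat))[:, None] * np.ones(lon.size)

fig, ax = plt.subplots(dpi=150)
filled = ax.contourf(lon, lat, temp_k - 273.15, cmap="jet")  # kelvin -> Celsius
fig.colorbar(filled, ax=ax)
ax.set_title("Temperature (C), synthetic data")

# Write the PNG into memory; pass a file name instead to save it to disk.
png_buffer = io.BytesIO()
fig.savefig(png_buffer, format="png")
plt.close(fig)
png_bytes = png_buffer.getvalue()
```

In the Basemap example above, the equivalent change would be to call plt.savefig with a file name before (or instead of) plt.show().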
I've been trying to get some data to display in a matplotlib graph and I'm having an issue that seems fairly unexpected. I was originally trying to plot a large number of data points (~500000) and was getting the
OverflowError: Agg rendering complexity exceeded. Consider downsampling or decimating your data.
So, I did just that. I decimated my data using both the signal.decimate function and slice notation. Neither of these solved my issue: I still get the complexity exceeded error even when trying to plot only 60 data points. I've tried to determine whether my computer may have some bad settings, but I am fully capable of plotting 500000 points in a straight line without a hiccup. I'll add some example code and maybe someone can help me spot the error of my ways.
import scikits.audiolab as audiolab
if __name__ == "__main__":
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.signal import freqz
    sound = audiolab.sndfile('exampleFile.wav', 'read')
    sound_info = sound.read_frames(sound.get_nframes())
    sound.close()
    nsamples = sound_info.size
    t = np.linspace(0, 5, nsamples, endpoint=False)
    plt.figure()
    plt.plot(t, sound_info, label='Filtered signal (600 Hz)')
    plt.show()
Today I was doing a report for a course and I needed to include a figure of a contour plot of some field. I did this with matplotlib (ignore the chaotic header):
import numpy as np
import matplotlib
from matplotlib import rc
rc('font',**{'family':'sans-serif','sans-serif':['Helvetica']})
## for Palatino and other serif fonts use:
#rc('font',**{'family':'serif','serif':['Palatino']})
rc('text', usetex=True)
from matplotlib.mlab import griddata
import matplotlib.pyplot as plt
import numpy.ma as ma
from numpy.random import uniform
from matplotlib.colors import LogNorm
fig = plt.figure()
data = np.genfromtxt('Isocurvas.txt')
matplotlib.rcParams['xtick.direction'] = 'out'
matplotlib.rcParams['ytick.direction'] = 'out'
rc('text', usetex=True)
rc('font', family='serif')
x = data[:,0]
y = data[:,1]
z = data[:,2]
# define grid.
xi = np.linspace(0.02,1, 100)
yi = np.linspace(0.02,1.3, 100)
# grid the data.
zi = griddata(x,y,z,xi,yi)
# contour the gridded data.
CS = plt.contour(xi,yi,zi,25,linewidths=0,colors='k')
CS = plt.contourf(xi,yi,zi,25,cmap=plt.cm.jet)
plt.colorbar() # draw colorbar
# plot data points.
plt.scatter(x,y,marker='o',c='b',s=0)
plt.xlim(0.01,1)
plt.ylim(0.01,1.3)
plt.ylabel(r'$t$')
plt.xlabel(r'$x$')
plt.title(r' Contour de $\rho(x,t)$')
plt.savefig("Isocurvas.eps", format="eps")
plt.show()
where "Isocurvas.txt" is a 3-column file, which I really don't want to touch (eliminating data, or something like that, wouldn't work for me). My problem was that the figure size was 1.8 Mb, which is too much for me. The figure itself was bigger than the whole rest of the report, and when I opened the PDF it wasn't very smooth.
So, my question is:
Are there any ways of reducing this size without sacrificing the quality of the figure? I'm looking for any solution, not necessarily Python related.
This is the .png figure, with a slight variation in parameters. With .png you can see the pixels, which I don't like very much, so PDF or EPS is preferable.
Thank you.
The scatter plot is what's causing your large size. Using the EPS backend, I used your data to create the figures. Here are the file sizes that I got:
Straight from your example: 1.5Mb
Without the scatter plot: 249Kb
With a raster scatter plot: 249Kb
In your particular example it's unclear why you want the scatter (not visible). But for future problems, you can use the rasterized=True keyword on the call to plt.scatter to activate a raster mode. In your example you have 12625 points in the scatter plot, and in vector mode that's going to take a bit of space.
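A minimal sketch of the rasterized keyword, with random points standing in for the data in Isocurvas.txt (the point count matches the one quoted above; everything else is made up). Only the scatter is rasterized; axes, labels and any other artists stay as vectors. The figure is saved as PDF into memory here, and the same keyword applies when saving to EPS.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt
import numpy as np

# Random stand-in for the 12625 data points.
rng = np.random.default_rng(0)
x = rng.uniform(0.02, 1.0, 12625)
y = rng.uniform(0.02, 1.3, 12625)

fig, ax = plt.subplots()
# rasterized=True embeds this artist as a bitmap inside the vector file.
points = ax.scatter(x, y, marker='o', c='b', s=1, rasterized=True)
ax.set_xlabel('x')
ax.set_ylabel('t')

buffer = io.BytesIO()
fig.savefig(buffer, format='pdf', dpi=150)  # dpi sets the bitmap resolution
plt.close(fig)
file_size = buffer.getbuffer().nbytes
print(file_size)
```

The dpi passed to savefig controls how fine the embedded bitmap is, so it trades file size against how pixelated the points look when zoomed in.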
Another trick that I use to trim down vector images from matplotlib is the following:
1. Save the figure as EPS
2. Run epstopdf (available with a TeX distribution) on the resulting file
This will generally give you a smaller PDF than matplotlib's default, and the quality is unchanged. For your example, using the EPS file without the scatter, it produced a 73 Kb PDF, which seems quite reasonable. If you really want a vector scatter plot, running epstopdf on the original 1.5 Mb EPS file produced a 198 Kb PDF on my system.
I'm not sure if it helps with size, but if you're willing to try the matplotlib 1.2 release candidate, there is a new backend for producing PGF images (designed to slot straight into LaTeX seamlessly). You can find the docs for that here: http://matplotlib.org/1.2.0/users/whats_new.html#pgf-tikz-backend
If you do decide to give it a shot and you have any questions, I'm probably not the best person to talk to, so would recommend emailing the matplotlib-users mailing list.
HTH,
Try removing the scatter plot of your data. They do not appear to be visible in your final figure (because you made them size 0) and may be taking up space in your eps.
EDITED: to completely change the answer because I read the question wrong.