Using PVLIB to simulate a system with shading losses - python

I am following the basic examples found here to simulate a simple system's energy generation in 15 minute intervals.
I would like to know, however, how can I introduce losses in the system following the same basic example. That is, with the following code:
import pandas as pd
import matplotlib.pyplot as plt
import pvlib
from pvlib.pvsystem import PVSystem
from pvlib.location import Location
from pvlib.modelchain import basic_chain, ModelChain
#%%
naive_times = pd.DatetimeIndex(start='01-30-2017', end='08-02-2017', freq='15min')
coordinates = [(52, 4, 'Amsterdam', 10, 'Etc/GMT-1')]
sandia_modules = pvlib.pvsystem.retrieve_sam('SandiaMod')
sapm_inverters = pvlib.pvsystem.retrieve_sam('cecinverter')
module = sandia_modules['Hanwha_HSL60P6_PA_4_250T__2013_']
inverter = sapm_inverters['ABB__PVI_10_0_I_OUTD_x_US_208_y_208V__CEC_2011_']
temp_air = 20
wind_speed = 0
system = PVSystem(surface_tilt = 13, surface_azimuth = 270, module_parameters = module, modules_per_string = 20, strings_per_inverter = 2, inverter_parameters = inverter)
for latitude, longitude, name, altitude, timezone in coordinates:
    location = Location(latitude, longitude, name=name, altitude=altitude, tz=timezone)
    mc = ModelChain(system, location, orientation_strategy=None)
    mc.run_model(naive_times.tz_localize(timezone))
    ac = mc.ac
energy = ac*0.001*0.25
plt.figure()
energy.plot()
I get
What I would like to have is a similar thing as this, obtained from real measurements:
In detail,
As you can see, a lot of losses from shading, DC losses, etc.
My question now is how to proceed from my code sample and achieve a plot similar to the ones in images 2 and 3?
Thanks in advance!

Your question is about DC losses and shading, but the biggest difference between your current ModelChain and the real system is the weather, particularly the irradiance. In the measured data no two consecutive days are identical; that comes from changing cloud cover, not from static losses.
The ModelChain example on Read the Docs (https://pvlib-python.readthedocs.io/en/latest/modelchain.html) shows how to pass weather data in step 4, and the later section "Demystifying ModelChain Internals" defines the weather argument (quoted below). Unfortunately, it does not accept POA (plane-of-array) irradiance, which is the quantity most commonly measured on site. GHI and DHI can in principle be estimated from POA, but apparently no function for that has been implemented.
weather : None or DataFrame, default None
If None, assumes air temperature is 20 C, wind speed is 0
m/s and irradiation calculated from clear sky data. Column
names must be 'wind_speed', 'temp_air', 'dni', 'ghi', 'dhi'.
Do not pass incomplete irradiation data. Use method
:py:meth:`~pvlib.modelchain.ModelChain.complete_irradiance`
instead.
The Read the Docs page does provide some information about adding different kinds of DC losses, mostly through specific physical models (AOI or spectral). Unfortunately, shading is complex and depends on the system and its surroundings, and no one has created a shading loss module.
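Until real weather data is available, static losses can at least be approximated as a flat derate on the modelled AC power. The sketch below is only an illustration: the loss percentages are hypothetical, the list `ac` stands in for `mc.ac`, and the helper names are not part of pvlib. It combines the losses multiplicatively (which, as far as I know, is also the form pvlib's PVWatts loss function uses) and then applies the same W to kWh conversion for 15-minute intervals as in the question.

```python
from math import prod


def combined_loss_fraction(losses_percent):
    """Combine independent loss percentages multiplicatively.

    Each loss acts on what is left after the previous one, so the
    combined loss is smaller than the plain sum of the percentages.
    """
    remaining = prod(1 - p / 100 for p in losses_percent.values())
    return 1 - remaining


def derate(ac_watts, losses_percent):
    """Scale a series of AC power values (W) by the combined loss."""
    factor = 1 - combined_loss_fraction(losses_percent)
    return [p * factor for p in ac_watts]


def interval_energy_kwh(ac_watts, interval_hours=0.25):
    """Convert average power per interval (W) to energy per interval (kWh)."""
    return [p * 0.001 * interval_hours for p in ac_watts]


# Hypothetical loss budget in percent; replace with values for your site.
losses = {"shading": 3.0, "soiling": 2.0, "dc_wiring": 2.0}

# Stand-in for mc.ac sampled every 15 minutes (W).
ac = [0.0, 1500.0, 4000.0, 2500.0]

energy = interval_energy_kwh(derate(ac, losses))
print(f"combined loss: {combined_loss_fraction(losses):.2%}")
print(energy)
```

A flat derate like this will not reproduce the day-to-day variation in the measured plots; that part needs measured or modelled irradiance passed in as weather.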

Related

How to get earth vector data from jpl horizons in python?

I am trying to get the vector data for Earth using Astroquery's Horizons Class. I have the following code:
from astroquery.jplhorizons import Horizons
import numpy as np
earth = Horizons(id=399, epochs = {'start':'2005-06-20', 'stop':'2005-06-21','step':'1d'})
earthVectors = earth.vectors()
earthX = earthVectors['x'].data # X is in AU
au2km = 149_597_870.7
earthXkm = earthX * au2km # X is in km
which returns earthXkm = [-3429775.6506088143 -899299.0538429054] in kilometers.
Getting this information directly from JPL Horizons gives [-2793030.0, -2627770.0] kilometers.
There is a large discrepancy here and this is the same for all the values in the astropy table. I would also not expect the data to vary as much in one day as that from the astroquery result.
Is there an error in my code, or does the horizons vectors() method not work as intended?
You could just use astropy's get_body_barycentric instead (Note that Horizons currently uses the DE441 ephemeris, so the code below will download the file for this ephemeris which is 3.3 Gb):
from astropy.coordinates import get_body_barycentric, solar_system_ephemeris
from astropy.time import Time
# set the ephemeris to use DE441
solar_system_ephemeris.set("ftp://ssd.jpl.nasa.gov/pub/eph/planets/bsp/de441.bsp")
t = Time("2005-06-20", scale="tdb")
pos = get_body_barycentric("earth", t)
print(pos.x)
<Quantity -2793031.73765342 km>
This is identical (to within a micron [probably just the numerical truncation]) to what I get from the Horizons web interface (outputting value in km rather than AU).
After further analysis I have found the cause of the discrepancy.
Astroquery's Horizons class defaults to a coordinate origin at the center of the Sun for vectors. The Horizons web app, however, uses the Solar System barycenter as its default coordinate origin. Setting the location attribute to the Solar System barycenter fixes the issue:
location='@ssb' or location='500@0'
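Applied to the code from the question, the barycentric fix might look like the sketch below. Horizons writes the Solar System barycenter as `@ssb` (or `500@0`). The helper names are hypothetical, and the actual query needs network access, so it is wrapped in a function and left commented out.

```python
AU_KM = 149_597_870.7  # kilometres per astronomical unit, as in the question


def barycentric_query_kwargs(start, stop, step="1d"):
    """Keyword arguments for Horizons with the origin at the barycenter."""
    return {
        "id": 399,           # Earth, as in the question
        "location": "@ssb",  # Solar System barycenter
        "epochs": {"start": start, "stop": stop, "step": step},
    }


def fetch_earth_x_km(start, stop, step="1d"):
    """Query JPL Horizons (needs network access) and return x in km."""
    from astroquery.jplhorizons import Horizons  # imported only when called

    vectors = Horizons(**barycentric_query_kwargs(start, stop, step)).vectors()
    return vectors["x"].data * AU_KM


kwargs = barycentric_query_kwargs("2005-06-20", "2005-06-21")
print(kwargs)
# earth_x_km = fetch_earth_x_km("2005-06-20", "2005-06-21")  # uncomment to query
```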

Applying spline interpolation for Brunt-Vaisala frequency

I have taken an upper air sounding from UWYo Database and currently calculating the Brunt-Vaisala frequency (the 'squared' one, at the moment) using MetPy across several stations for some basic synoptic purposes.
The minimal (at some point) and reproducible code runs like this;
import metpy.calc as mpcalc
import matplotlib.pyplot as plt
import numpy as np
from datetime import datetime
from metpy.units import units, pandas_dataframe_to_unit_arrays
from siphon.simplewebservice.wyoming import WyomingUpperAir
stations = ['RPLI', 'RPUB', '98433', 'RPMP', 'RPVP', 'RPMD'] #6 stations
station_data = {}
date = datetime(2016, 8, 14, 0)
for station in stations:
    print(f'Getting {station}')
    df = pandas_dataframe_to_unit_arrays(WyomingUpperAir.request_data(date, station))
    df['theta'] = mpcalc.potential_temperature(df['pressure'], df['temperature'])
    df['bv_squared'] = mpcalc.brunt_vaisala_frequency_squared(df['height'], df['theta'])
    station_data[station] = df
mean_bv = []
for station in stations:
    df = station_data[station]
    keep_idx = (df['height'] >= 1000 * units.m) & (df['height'] <= 5 * units.km)
    mean_bv.append(np.mean(df['bv_squared'][keep_idx]).m)
plt.title("Atmospheric Stability")
plt.plot(mean_bv)
plt.show()
which produces a simple plot like this
I would like to ask for help on how to smooth out those 'lines'/data, like by applying interpolation producing a smooth curve? I'm a bit novice, thus I look forward to your help and responses.
Essentially what you're looking for is to smooth or (low-pass) filter the data.
One option is to fit the data points to some kind of appropriate curve (polynomial, spline, exponential, etc.), and replace the original data values with those computed from the curve. You can look at some of the tools in scipy.optimize to do the fit.
For filtering, there are a variety of options, from a moving average to more traditional filters; a good, simple choice is a Savitzky-Golay filter. scipy.signal has a lot of tools to help you with this.
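As a minimal illustration of the filtering route, here is a centered moving average written in plain Python. It is only a sketch: the input values are made up and merely stand in for `mean_bv`, and the window shrinks at the edges so the output has the same length as the input. For a Savitzky-Golay filter, `scipy.signal.savgol_filter` would take the place of `moving_average`.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks near both ends."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical values standing in for mean_bv (one per station).
mean_bv = [1.2e-4, 1.8e-4, 1.1e-4, 1.6e-4, 1.3e-4, 1.7e-4]
print(moving_average(mean_bv, window=3))
```

With only six points, any smoother will mostly flatten the curve, so a larger window is unlikely to help here.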

Why isn't the optimal surface_azimuth for energy production in pvlib at or around 180° in the Northern hemisphere?

I'm building a model using the open-source pvlib software (and CEC Modules) to estimate yearly photovoltaic energy output. I have encountered some inconsistencies in the model and would appreciate any troubleshooting the community can offer.
My main problem is this: the model tells me that the ideal Northern hemisphere surface_azimuth for energy production (i.e. the surface_azimuth with the highest energy output) is around 76° (just north of due east), while the worst surface_azimuth for energy production is around 270° (due west). However, I understand that the ideal surface_azimuth in the Northern hemisphere should be about 180° (due south), with the worst surface_azimuth at 0° (due north).
I've included this graph to help visualize the variation in energy production based on surface_azimuth
This is also generated at the end of the code attached.
Can anyone help me rectify this issue or correct my understanding?
Code copied below for reference
import os
import pandas as pd
import numpy as np
import os
import os.path
import matplotlib.pyplot as plt
import pvlib
from geopy.exc import GeocoderTimedOut
from geopy.geocoders import Nominatim
from IPython.display import Image
## GET THE LATITUDE & LONGITUDE OF A GIVEN CITY
geolocator = Nominatim(user_agent="____'s app")
geo = geolocator.geocode("Berkeley")
## CHECK THAT CITY IS CORRECT (by Country, State, etc.)
print(geo.address)
# CHECK THE LAT, LON order
print(geo.latitude, geo.longitude)
## SELECT THE YEAR & TIMES YOU'D LIKE TO MODEL OFF
YEAR = 2019
STARTDATE = '%d-01-01T00:00:00' % YEAR
ENDDATE = '%d-12-31T23:59:59' % YEAR
TIMES = pd.date_range(start=STARTDATE, end=ENDDATE, freq='H')
## ACCESS THE NREL API TO EXTRACT WEATHER DATA
NREL_API_KEY = os.getenv('NREL_API_KEY', 'DEMO_KEY')
## FILL IN THE BLANK WITH YOUR EMAIL ADRRESS
EMAIL = os.getenv('EMAIL', '_______.com')
##NEED TO COMMENT OUT THIS LINE BELOW -- if you call it too many times within an hour, it will break your code
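# LATITUDE and LONGITUDE must be defined before they are used in this call
LATITUDE, LONGITUDE = geo.latitude, geo.longitude
header, data = pvlib.iotools.get_psm3(LATITUDE, LONGITUDE, NREL_API_KEY, EMAIL)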
## SELECT THE PVLIB PANEL & INTERVTER YOU'D LIKE TO USE
## CAN ALSO CHOOSE FROM SANDIA LABS' DATASET OF PANELS & INVERTERS (check out the function)
## WE CHOSE THE CECMods because they highlighted the ones that were BIPV
INVERTERS = pvlib.pvsystem.retrieve_sam('CECInverter')
INVERTER_10K = INVERTERS['SMA_America__SB10000TL_US__240V_']
CECMODS = pvlib.pvsystem.retrieve_sam('CECMod')
## SELECT THE PANEL YOU'D LIKE TO USE (NOTE: THE PEVAFERSA MODEL IS A BIPV PANEL)
CECMOD_MONO = CECMODS['Pevafersa_America_IP_235_GG']
## CREATING AN ARRAY TO ITERATE THROUGH IN ORDER TO TEST DIFFERENT SURFACE_AZIMUTHS
heading_array = np.arange(0, 361, 2)
heading_array
heading_DF = pd.DataFrame(heading_array).rename(columns = {0: "Heading"})
heading_DF.head()
# geo IS AN OBJECT (the given city) CREATED ABOVE
LATITUDE, LONGITUDE = geo.latitude, geo.longitude
# data IS AN OBJECT (the weather patterns) CREATED ABOVE
# TIMES IS ALSO CREATED ABOVE, AND REPRESENTS TIME
data.index = TIMES
dni = data.DNI.values
ghi = data.GHI.values
dhi = data.DHI.values
surface_albedo = data['Surface Albedo'].values
temp_air = data.Temperature.values
dni_extra = pvlib.irradiance.get_extra_radiation(TIMES).values
# GET SOLAR POSITION
sp = pvlib.solarposition.get_solarposition(TIMES, LATITUDE, LONGITUDE)
solar_zenith = sp.apparent_zenith.values
solar_azimuth = sp.azimuth.values
# CREATING THE ARRY TO STORE THE DAILY ENERGY OUTPUT BY SOLAR AZIMUTH
e_by_az = []
# IDEAL surface_tilt ANGLE IN NORTHERN HEMISPHERE IS ~25
surface_tilt = 25
# ITERATING THROUGH DIFFERENT SURFACE_AZIMUTH VALUES
for heading in heading_DF["Heading"]:
    surface_azimuth = heading
    poa_sky_diffuse = pvlib.irradiance.get_sky_diffuse(
        surface_tilt, surface_azimuth, solar_zenith, solar_azimuth,
        dni, ghi, dhi, dni_extra=dni_extra, model='haydavies')
    # calculate the angle of incidence using the surface_azimuth and (hardcoded) surface_tilt
    aoi = pvlib.irradiance.aoi(
        surface_tilt, surface_azimuth, solar_zenith, solar_azimuth)
    # https://pvlib-python.readthedocs.io/en/stable/generated/pvlib.irradiance.aoi.html
    # https://pvlib-python.readthedocs.io/en/stable/generated/pvlib.pvsystem.PVSystem.html
    poa_ground_diffuse = pvlib.irradiance.get_ground_diffuse(
        surface_tilt, ghi, albedo=surface_albedo)
    poa = pvlib.irradiance.poa_components(
        aoi, dni, poa_sky_diffuse, poa_ground_diffuse)
    poa_direct = poa['poa_direct']
    poa_diffuse = poa['poa_diffuse']
    poa_global = poa['poa_global']
    iam = pvlib.iam.ashrae(aoi)
    effective_irradiance = poa_direct*iam + poa_diffuse
    temp_cell = pvlib.temperature.pvsyst_cell(poa_global, temp_air)
    # THIS IS THE MAGIC
    cecparams = pvlib.pvsystem.calcparams_cec(
        effective_irradiance, temp_cell,
        CECMOD_MONO.alpha_sc, CECMOD_MONO.a_ref,
        CECMOD_MONO.I_L_ref, CECMOD_MONO.I_o_ref,
        CECMOD_MONO.R_sh_ref, CECMOD_MONO.R_s, CECMOD_MONO.Adjust)
    # mpp is the list of energy output by hour for the whole year using a single panel
    mpp = pvlib.pvsystem.max_power_point(*cecparams, method='newton')
    mpp = pd.DataFrame(mpp, index=TIMES)
    first48 = mpp[:48]
    Edaily = mpp.p_mp.resample('D').sum()
    # Edaily is the list of energy output by day for the whole year using a single panel
    Eyearly = sum(Edaily)
    e_by_az.append(Eyearly)
## LINKING THE Heading (ie. surface_azimuth) AND THE Eyearly (ie. yearly energy output) IN A DF
heading_DF["Eyearly"] = e_by_az
heading_DF.head()
## VISUALIZE ENERGY OUTPUT BY SURFACE_AZIMUTH
plt.plot(heading_DF["Heading"], heading_DF["Eyearly"])
plt.xlabel("Surface_Azimuth Angle")
plt.ylabel("Yearly Energy Output with tilt # " + str(surface_tilt))
plt.title("Yearly Energy Output by Solar_Azimuth Angle using surface_tilt = " + str(surface_tilt) + " in Berkeley, CA");
# FIND SURFACE_AZIMUTH THAT YIELDS THE MAX ENERGY OUTPUT
heading_DF[heading_DF["Eyearly"] == max(heading_DF["Eyearly"])]
# FIND SURFACE_AZIMUTH THAT YIELDS THE MIN ENERGY OUTPUT
heading_DF[heading_DF["Eyearly"] == min(heading_DF["Eyearly"])]
Thanks to kevinanderso#gmail.com for helping me out in the pvlib-python Google Group. He pointed out that my "TIMES variable is not timezone-aware and so the solar position calculation is assuming UTC".
To fix that, he suggested that I initialize TIMES with tz='Etc/GMT+8' (ie. PST in the US).
In his words, "the code [I] originally posted is modeling a hypothetical system where solar position and irradiance are timeshifted with respect to each other. That's a big departure from reality, and so real-life expectations don't apply to your model".
Thanks to Kevin and hopefully this helps anyone else having a similar issue.
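To get a feel for how large the misalignment is, the sketch below compares the same wall-clock time read as UTC (what a timezone-naive index implies) and as UTC-8 (what 'Etc/GMT+8' means). It uses only the standard library; the date is arbitrary, and the 15° per hour figure is simply Earth's rotation rate, so the result is a rough indication of the shift rather than a reproduction of the plot.

```python
from datetime import datetime, timedelta, timezone

# 'Etc/GMT+8' follows the POSIX sign convention: it is 8 hours BEHIND UTC.
PST = timezone(timedelta(hours=-8), name="Etc/GMT+8")

naive_noon = datetime(2019, 6, 21, 12, 0)          # arbitrary example date
as_utc = naive_noon.replace(tzinfo=timezone.utc)   # how a naive index is read
as_pst = naive_noon.replace(tzinfo=PST)            # what the weather data means

# How far apart the two readings of "12:00" are in absolute time.
offset_hours = (as_pst - as_utc).total_seconds() / 3600

# Earth rotates 15 degrees per hour, so the sun's hour angle is off by:
hour_angle_shift_deg = offset_hours * 15.0

print(f"offset: {offset_hours} h, hour-angle shift: {hour_angle_shift_deg} deg")
```

In the question's code the corresponding change is passing tz='Etc/GMT+8' when building TIMES with pd.date_range, as Kevin suggested.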

Why is the simple ModelChain example from the website calculating negative AC power?

I'm using the pvlib library for my master's thesis. When I run the example for times later than 4 pm, it usually reports an AC power of -0.02. Does somebody know why? I'm using the code below:
import pandas as pd
import numpy as np
# pvlib imports
import pvlib
from pvlib.pvsystem import PVSystem
from pvlib.location import Location
from pvlib.modelchain import ModelChain
# load some module and inverter specifications
sandia_modules = pvlib.pvsystem.retrieve_sam('SandiaMod')
cec_inverters = pvlib.pvsystem.retrieve_sam('cecinverter')
sandia_module = sandia_modules['Canadian_Solar_CS5P_220M___2009_']
cec_inverter = cec_inverters['ABB__MICRO_0_25_I_OUTD_US_208_208V__CEC_2014_']
location = Location(latitude=49.0205559, longitude=12.057453900000041)
system = PVSystem(surface_tilt=20, surface_azimuth=200,
module_parameters=sandia_module,
inverter_parameters=cec_inverter)
mc = ModelChain(system, location)
python_native_dt = datetime.datetime.now()
weather = pd.DataFrame([[1050, 1000, 100, 30, 5]],
columns=['ghi', 'dni', 'dhi', 'temp_air', 'wind_speed'],
index=[pd.Timestamp(pytz.timezone('Etc/GMT+2').localize(python_native_dt))])
mc.run_model(times=weather.index, weather=weather)
print(mc.ac)
Printing mc.ac gives:
2018-06-05 16:20:19.117017-02:00 -0.02
dtype: float64
The -0.02 is the power (in W) your chosen inverter consumes when the input DC power is below its activation threshold.
To improve reproducibility and help us track down the answer, I suggest you specify an exact time rather than relying on datetime.datetime.now(). Using index=[pd.Timestamp('2018-06-05 16:20:19.117017-02:00')], I get 2018-06-05 16:20:19-02:00 13.660678.
I suggest that you confirm that mc.aoi and mc.solar_position are consistent with your weather inputs. They are derived from the time index and used to calculate the plane of array irradiance.
If that does not help... What versions of pvlib and pandas? Note that the example also needs import pytz and import datetime to run.
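The behaviour behind the -0.02 figure can be pictured with a deliberately simplified inverter model. This is not pvlib's inverter implementation: the start-up threshold and the flat efficiency are hypothetical, and only the 0.02 W standby draw is taken from the output in the question. It also shows one way to clip the standby draw to zero if only generation is of interest.

```python
def simple_inverter_ac(p_dc, p_start, p_night_tare, efficiency=0.96):
    """Toy inverter: below the start-up threshold it only draws standby power."""
    if p_dc < p_start:
        return -abs(p_night_tare)
    return p_dc * efficiency


def clip_standby(p_ac):
    """Treat standby consumption as zero generation."""
    return max(p_ac, 0.0)


P_START = 1.0        # W, hypothetical start-up threshold
P_NIGHT_TARE = 0.02  # W, matches the -0.02 seen in the question

for p_dc in (0.0, 0.5, 50.0, 200.0):
    p_ac = simple_inverter_ac(p_dc, P_START, P_NIGHT_TARE)
    print(p_dc, p_ac, clip_standby(p_ac))
```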
G'day Maximilian,
I suspect it is the weather DataFrame. I have noticed that most weather files from NWP centres (I am using ECMWF ERA5) store DNI, GHI and DHI as accumulations. With the raw ERA5 data I got negative power values.
After converting from accumulated energy (J/m²) to mean irradiance (W/m²) by dividing by 3600 s (60 s × 60 min, for hourly accumulations), it worked.
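The conversion amounts to dividing each accumulated value by the length of the accumulation period. A minimal sketch, assuming hourly accumulations in J/m² (the sample values are hypothetical and the function name is not from any library):

```python
def accumulated_to_mean_irradiance(joules_per_m2, period_seconds=3600.0):
    """Convert energy accumulated over a period (J/m^2) to mean power (W/m^2)."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return [j / period_seconds for j in joules_per_m2]


# Hypothetical hourly accumulations in J/m^2.
hourly_accumulated = [0.0, 1.8e6, 3.6e6]
mean_irradiance = accumulated_to_mean_irradiance(hourly_accumulated)
print(mean_irradiance)
```

It is worth checking the dataset documentation for the accumulation period, since not every product accumulates over exactly one hour.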

Multi-period portfolio optimization in python

Scenario: I am trying to do multiple portfolio optimizations, with different constraints (weights, risk, risk aversion...) in a multi-period scenario.
What I already did: From the examples of cvxpy I found how to optimize a portfolio under a non-linear quadratic formula that results in a list of weights for the assets in the portfolio composition. My problem is that, although I have 15 years of monthly data, I don't know how to optimize for different periods (the code, as of its current form, yields the best composition for the entire time span of my data).
Question 1: Is it possible to make the code optimize for different periods, such as 1, 3, 4, 6, 9, 12 months (yielding different weights for each of those periods)? If so, how could one do that?
Question 2: Is it possible to restrain the number of assets in each portfolio composition? What is the best way to achieve that? (the current code uses all of them, but I would like to test when the number of assets is limited, to control the turnover level).
Code:
from cvxpy import *
from cvxopt import *
import pandas as pd
import numpy as np
prices = pd.DataFrame()
logret = pd.DataFrame()
normret = pd.DataFrame()
returns = pd.DataFrame()
prices = pd.read_excel(open('//folder//Dgms89//calculation v3.xlsx', 'rb'), sheetname='Prices Final')
logret = pd.read_excel(open('//folder//Dgms89//calculation v3.xlsx', 'rb'), sheetname='Returns log')
normret = pd.read_excel(open('//folder//Dgms89//calculation v3.xlsx', 'rb'), sheetname='Returns normal')
returns = normret
def calculate_portfolio(returns, selected_solver):
    cov_mat = returns.cov()
    Sigma = np.asarray(cov_mat.values)
    w = Variable(len(cov_mat))
    gamma = quad_form(w, Sigma)
    prob = Problem(Minimize(gamma), [sum_entries(w) == 1])
    prob.solve(solver=selected_solver)
    weights = []
    for weight in w.value:
        weights.append(float(weight[0]))
    return weights
The problem with multi-period optimization is that your model will be overfitted. On the other hand, you can backtest traditional portfolio optimization models assuming a rebalancing period. Riskfolio-Lib has an example using backtrader that compares the S&P 500 with different portfolios using quarterly rebalancing. You can check the example at this link: https://riskfolio-lib.readthedocs.io/en/latest/examples.html
The standard mean-variance portfolio model is a static model. No dynamics in the model. (Time series are only used to estimate the variance-covariance matrix and the expected return). Some related models can answer questions like when and how to rebalance.
Restricting the number of assets in a portfolio leads to what is called the cardinality-constrained portfolio problem. This becomes basically an MIQP (Mixed-Integer Quadratic Programming problem).
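The rebalancing idea mentioned in these answers can be sketched as a rolling-window loop: estimate on a lookback window, hold the resulting weights for a fixed number of months, then re-estimate. The code below is only a skeleton in plain Python. The return data is made up, and `equal_weight` is a hypothetical placeholder for a real optimizer such as the `calculate_portfolio` function from the question.

```python
def rebalance_windows(n_periods, lookback, hold):
    """Yield (train_start, train_stop, hold_stop) index triples."""
    start = lookback
    while start < n_periods:
        yield (start - lookback, start, min(start + hold, n_periods))
        start += hold


def equal_weight(window_returns):
    """Placeholder optimizer: same weight for every asset."""
    n_assets = len(window_returns[0])
    return [1.0 / n_assets] * n_assets


def rolling_weights(returns, lookback, hold, optimizer=equal_weight):
    """Re-run the optimizer at every rebalancing date."""
    schedule = []
    for lo, mid, hi in rebalance_windows(len(returns), lookback, hold):
        weights = optimizer(returns[lo:mid])  # fit on past data only
        schedule.append((mid, hi, weights))   # held from month mid to hi
    return schedule


# Hypothetical monthly returns: 24 months, 3 assets.
returns = [[0.01, -0.02, 0.005] for _ in range(24)]

for mid, hi, weights in rolling_weights(returns, lookback=12, hold=3):
    print(f"months {mid}-{hi - 1}: {weights}")
```

Changing `hold` to 1, 3, 6 or 12 gives the different rebalancing periods asked about in Question 1. A limit on the number of assets would have to live inside the optimizer itself, which is where the mixed-integer formulation comes in.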
