# Solar Decomposition Models#

Knowing the direct or beam normal irradiance (DNI) is useful for many solar and energy applications, e.g., calculating the yield of solar concetrating power systems or determining the irradiance on an inclined surface. However, often only global horizontal irradiance (GHI) is available, as it is much cheaper to measure than the individual components and most satelitte radiation models only calculate GHI. Many applications, therefore, require applying solar radiation decomposition models to obtain the direct beam irradiance. The majority of decomposition models work by applying an empirical correlations between GHI, diffuse horizontal irradiance (DHI) and DNI. In this section, we present the implementation in Python of several solar decomposition models.

Content by Javier Lopez Lorente and Adam R. Jensen

## The role of solar decomposition models#

Solar decomposition models are some of the most frequently used algorithms, and are often used during the first part of a larger analysis. As described earlier, decomposition models permit estimating the beam or direct normal irradiance (DNI) from GHI measurements, often using relationships between the clearness index ($$k_t$$) and the diffuse fraction of solar irradiance ($$k_d$$). The clearness index is defined as:

$k_t = \frac{GHI}{E_a \cdot cos(\theta_Z)}$

where $$E_a$$ is the extraterrestrial irradiance, and $$\theta_Z$$ is the solar zenith angle.

The diffuse fraction is defined as:

$k_d = \frac{DHI}{GHI}$

The role of decomposition models can be graphically illustrated in the diagram below: Multiple models have been proposed for the decomposition of global horizontal radiation/irradiance into its direct normal and diffuse horizontal components in the scientific literature. Below, the implementation in Python of some of these methods will be shown. The following methods will be modelled and compared:

• DISC model (Maxwell, 1987)

• DIRINT model (Pérez et al., 1992)

• DIRINDEX model (Pérez et al., 2002)

• Erbs model (Erbs et al., 1982)

The four decomposition models, among some others, are implemented in the open-source library pvlib python.

Before applying the models in practice, an example irradiance data set will be retrieved.

## Data preparation#

To illustrate the implementation of each of these models, we will use real irradiance observations. The decomposition models will be used to calculate DNI and DIh from measured GHI, and the results will then be compared to measurements of DNI and DHI. The first step is getting and preparing the measurement data to be used.

# Import all the necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pvlib


### Retrieving weather data#

The irradiance data used is from the BSRN Cabauw station operated by the Royal Netherlands Meteorological Institute (KNMI). The data is retrieved from the BSRN FTP server using the get_bsrn function from pvlib-python. The Cabauw station is located at latitude 51.9711° and longitude 4.9267°, at an elevation of 0 m AMSL. The reason for choosing the Cabauw station is that the data has already been extensively quality checked, hence in this example this step is skipped. The data used are 1-minute global, direct, and diffuse irradiance measurements between January and July 2021, which are resampled to hourly observations.

# Retrieving the measurement data
df, meta = pvlib.iotools.get_bsrn(
station='CAB',
start=pd.Timestamp('2021-01-01'),
end=pd.Timestamp('2021-07-31'),

# Resample the 1-minute data to hourly
df = df.resample('1h').mean()

# Display the first five rows of the dataset

ghi ghi_std ghi_min ghi_max dni dni_std dni_min dni_max dhi dhi_std dhi_min dhi_max lwd lwd_std lwd_min lwd_max temp_air relative_humidity pressure
2021-01-01 00:00:00+00:00 -1.150000 0.076667 -1.366667 -1.033333 1.000000 0.008333 1.000000 1.133333 -1.000000 0.050000 -1.000000 -1.000000 242.133333 0.288333 241.666667 242.633333 -1.531667 99.360000 1006.000000
2021-01-01 01:00:00+00:00 -1.000000 0.078333 -1.000000 -1.000000 1.150000 0.005000 1.050000 1.550000 -1.000000 0.051667 -1.000000 -1.000000 243.666667 0.253333 243.216667 244.100000 -2.320000 98.990000 1006.000000
2021-01-01 02:00:00+00:00 -1.000000 0.081667 -1.016667 -0.900000 1.266667 0.016667 1.050000 1.516667 -0.600000 0.051667 -0.683333 -0.500000 265.950000 0.603333 264.983333 267.066667 -0.808333 98.183333 1006.066667
2021-01-01 03:00:00+00:00 -1.000000 0.075000 -1.000000 -1.000000 1.583333 0.006667 1.383333 1.616667 -0.533333 0.065000 -0.833333 -0.333333 271.800000 0.578333 270.800000 272.733333 -0.930000 98.736667 1006.216667
2021-01-01 04:00:00+00:00 -0.966667 0.076667 -1.000000 -0.850000 1.316667 0.005000 1.233333 1.500000 0.000000 0.061667 -0.350000 0.000000 291.933333 0.231667 291.500000 292.266667 -0.171667 97.960000 1006.050000

Besides GHI, DNI and DHI, this station has additional weather-related variables available (e.g., air temperature, humidity, and wind speed), as well as sensor-specific data. The details of the sensors and measurements available in each station can be found under the section “Instrument history and meta data” of each MIDC’s station webpage.

Let’s list the variables available and then subset those that we will use in this example:

# Visualize variables avilable
df.columns

Index(['ghi', 'ghi_std', 'ghi_min', 'ghi_max', 'dni', 'dni_std', 'dni_min',
'dni_max', 'dhi', 'dhi_std', 'dhi_min', 'dhi_max', 'lwd', 'lwd_std',
'lwd_min', 'lwd_max', 'temp_air', 'relative_humidity', 'pressure'],
dtype='object')


It is often helpful to visualize the data in order to check that the measurements are within the expected ones. For example, a plot for a single day (18th July 2021):

df.loc['2021-07-18', ['ghi','dhi','dni']].plot()
plt.title('Solar irradiance components, 18th July 2021')
plt.ylabel('Irradiance [W/m$^2$]')
plt.legend()

<matplotlib.legend.Legend at 0x17ea950d250> For an overall representation of the data, it’s useful to inspect the the key statistical metrics for the data:

df[['ghi','dni','dhi', 'temp_air', 'relative_humidity', 'pressure']].describe()

ghi dni dhi temp_air relative_humidity pressure
count 5088.000000 5021.000000 5006.000000 5088.000000 5088.000000 5088.000000
mean 150.154482 129.608143 78.730762 9.613794 81.598316 1016.147409
std 223.501191 244.733398 109.759797 7.098631 14.189124 10.043831
min -2.033333 -0.316667 -1.827586 -8.151667 32.516667 981.450000
25% -1.000000 1.000000 -0.783333 4.133750 72.566667 1009.945833
50% 21.075000 1.316667 18.233333 8.776667 84.843333 1016.675000
75% 234.120833 117.466667 125.820833 15.312083 93.289167 1023.575000
max 926.033333 958.500000 568.216667 29.330000 100.000000 1042.000000

Estimating other input variables:

The 3 solar decomposition models that we will implement in this section (i.e., DISC, DIRINT and Erbs models) have different inputs to be implemented. Solar position data is a common input for the 3 models. The DISC and DIRINT models accept as a possible input the dew (wet-bulb) temperature ($$T_d$$). This variable is not directly available from our weather observations; however, it can be easily estimated using a simple conversion that has relative humidity ($$RH$$) and the ambient temperature ($$T_a$$) as input. This simple approximation is given by the equation below (Lawrence, 2005):

$T_{d} = T_a (^{\circ} C) - \frac{100-RH (\%)}{5}$

In Python, this equation could be implemented as follows:

df['temp_dew'] = df['temp_air']-((100-df['relative_humidity'])/5)


The solar position for the entire period can be calculated using the pvlib.solarposition.get_solarposition function which supports a number of different solar position algorithms. The default algorithm is an implementation of the NREL SPA algorithm.

# Define a pvlib location object for the site
location = pvlib.location.Location(latitude=51.9711, longitude=4.9267, altitude=0, name='Cabauw')

# Calculate the solar position for each index in the datarame for the specific location
# The 30 minute shift is applied in order to calculate the solar position at the middle of the hour!
solpos = location.get_solarposition(df.index+pd.Timedelta(minutes=30))
solpos.index = solpos.index - pd.Timedelta(minutes=30)

# Let's have a look to the output

apparent_zenith zenith apparent_elevation elevation azimuth equation_of_time
2021-01-01 00:00:00+00:00 149.693651 149.693651 -59.693651 -59.693651 21.451575 -3.441097
2021-01-01 01:00:00+00:00 144.585022 144.585022 -54.585022 -54.585022 45.257706 -3.460697
2021-01-01 02:00:00+00:00 137.059810 137.059810 -47.059810 -47.059810 63.685127 -3.480288
2021-01-01 03:00:00+00:00 128.334219 128.334219 -38.334219 -38.334219 78.296833 -3.499870
2021-01-01 04:00:00+00:00 119.152361 119.152361 -29.152361 -29.152361 90.721936 -3.519442

## DISC model#

The DISC model (Maxwell, 1987) requires the following inputs: GHI, solar zenith angle, and the atmospheric pressure, which is used to estimate the absolute (pressure-corrected) airmass.

# DISC method estimated with absolute airmass as input
ghi=df['ghi'],
solar_zenith=solpos['apparent_zenith'],
datetime_or_doy=df.index,
pressure=df['pressure']*100) # Site pressure in Pascal


The functoin returns the DNI, the clearness index ($$k_t$$) and the airmass estimation:

disc.columns

Index(['dni', 'kt', 'airmass'], dtype='object')


## DIRINT model#

An implementatin of the DIRINT model (Pérez et al., 1992) is also available in pvlib-python. The model requires the following inputs: GHI, solar zenith angle, atmospheric pressure, and dew (wet-bulb) temperature.

# Estimation of the DIRINT model
ghi=df['ghi'],
solar_zenith=solpos['apparent_zenith'],
times=df.index,
pressure=df['pressure']*100, # atmospheric pressure in Pascal
temp_dew=df['temp_dew']) # Dew temperature in Degree Celsius


## DIRINDEX model#

The DIRINDEX model (Pérez et al., 2002) was found to be one of the two best decomposition models in an extensive study by Gueymard and Ruiz-Arias (2016). The model is an enhancement of the DIRINT model by considering clearsky data.

# Retrieve clear-sky irradiance from CAMS McClear
clearsky, meta = pvlib.iotools.get_cams(
latitude=location.latitude,
longitude=location.longitude,
altitude=location.altitude,
start=df.index,
end=df.index[-1],

# Estimation of the DIRINDEX model
ghi=df['ghi'],
ghi_clearsky=clearsky['ghi_clear'],
dni_clearsky=clearsky['dni_clear'],
zenith=solpos['zenith'],
times=df.index,
pressure=df['pressure']*100, # atmospheric pressure in Pascal
temp_dew=df['temp_dew']) # Dew temperature in Degree Celsius


## Erbs model#

The Erbs model (Erbs et al., 1982) is a simple and very popular model that only requires GHI and the solar zenith angle. This model is also available in pvlib and the implementation is equally straight-forward:

erbs = pvlib.irradiance.erbs(
ghi=df['ghi'],
zenith=solpos['apparent_zenith'],
datetime_or_doy=df.index)

erbs.columns

Index(['dni', 'dhi', 'kt'], dtype='object')


## Comparison and evaluation of decomposition models#

Once the 4 models are estimated, we can evaluate the results of the models against the actual DNI measurements available. For example, below we visualize the results of the models a single day and a week in July:

# Plot of models for a single day
date = '2021-07-18'

disc.loc[date, 'dni'].plot(label='DISC', alpha=0.7)
dirint.loc[date].plot(label='DIRINT', alpha=0.7)
dirindex.loc[date].plot(label='DIRINDEX', alpha=0.7)
erbs.loc[date, 'dni'].plot(label='Erbs', alpha=0.7)
df.loc[date, 'dni'].plot(c='k', label='Reference')

plt.ylim(0, 1200)
plt.ylabel('Direct Normal Irradiance [W/m$^2$]')
plt.xlabel('Time [HH:MM]')
plt.title('Comparison of decomposition models\n(July 18th 2021)')
plt.legend(loc='upper left')

<matplotlib.legend.Legend at 0x17ea2beffd0> Hide code cell content
# Plot of models for several days on an hourly basis
disc.loc['20210709':'20210718', 'dni'].resample('1h').mean().plot(label='DISC', alpha=0.5) # format of date YYYYMMDD
dirint.loc['20210709':'20210718'].resample('1h').mean().plot(label='DIRINT', alpha=0.5)
dirindex.loc['20210709':'20210718'].resample('1h').mean().plot(label='DIRINDEX', alpha=0.5)
erbs.loc['20210709':'20210718', 'dni'].resample('1h').mean().plot(label='Erbs', alpha=0.5)
df.loc['20210709':'20210718', 'dni'].resample('1h').mean().plot(label='Reference')

plt.ylim(0,1200)
plt.ylabel('Direct Normal Irradiance [W/m$^2$]')
plt.title('Comparison of decomposition on an hourly basis\n (9th-18th July 2021)')
plt.legend(ncol=2, loc='upper center')
plt.tight_layout() The correlation between the obtained models and the measured observations can be also presented in scatter plots:

# Create multiple plots
fig, axes = plt.subplots(ncols=4, figsize=(15, 4), facecolor='w', edgecolor='k', sharey=True)

# Plot
axes.scatter(df['dni'], disc['dni'], s=4, c='k')
axes.scatter(df['dni'], dirint, s=4, c='k')
axes.scatter(df['dni'], dirindex, s=4, c='k')
axes.scatter(df['dni'], erbs['dni'], s=4, c='k')

# Add characteristics to each subplot in a loop
axes.set_ylabel('Modelled DNI [W/m$^2$]')
axes.set_ylim(0, 1200)

for ax, model in zip(axes, ['DISC', 'DIRINT', 'DIRINDEX', 'ERBS']):
ax.plot([0,1200], [0,1200], c='r')
ax.set_xlim(0, 1200)
ax.set_title(model)
ax.set_xlabel('Observed DNI [W/m$^2$]')

plt.tight_layout()
plt.show() The evaluation cannot be only based on a visual inspection. For solar resource assessment, there are a number of common metrics to evaluate the error and performance, which we will use below. These metrics are the mean bias error (MBE) and the root mean square error (RMSE). The MBE is the average error representing a systematic error to under- or overestimate and the RMSE is a measure of the dispersion of the deviations.

$MBE = \frac{1}{n} \sum_{i=1}^{n} M_i - O_i$
$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}{(M_i - O_i)^2}}$

where $$n$$ is the size of the sample or dataset, $$M_i$$ is each of the predicted values and $$O_i$$ are the observed or true values. In Python, each of these evaluation metrics can be defined as function.

# Mean Bias Error
def mbe(predictions, observed):
return (predictions - observed).mean()

# Root Mean Square Error
def rmse(predictions, observed):
return np.sqrt(((predictions - observed)**2).mean())


For the comparison of evaluation metrics, only measurement for periods where the solar elevation is greater than 5 degrees are used:

df = df[solpos['elevation']>5]
erbs = erbs[solpos['elevation']>5]
disc = disc[solpos['elevation']>5]
dirint = dirint[solpos['elevation']>5]
dirindex = dirindex[solpos['elevation']>5]


For the DISC method:

mbe_disc = mbe(disc['dni'], df['dni'])
rmse_disc = rmse(disc['dni'], df['dni'])

print("DISC Decomposition Method:")
print("MBE " + "%.2f" % mbe_disc + " W/m\u00b2")
print("RMSE " + "%.2f" % rmse_disc + " W/m\u00b2")

DISC Decomposition Method:
MBE 41.68 W/m²
RMSE 90.00 W/m²


For the DIRINT method:

mbe_dirint = mbe(dirint, df['dni'])
rmse_dirint = rmse(dirint, df['dni'])

print("DIRINT Decomposition Method:")
print("MBE " + "%.2f" % mbe_dirint + " W/m\u00b2")
print("RMSE " + "%.2f" % rmse_dirint + " W/m\u00b2")

DIRINT Decomposition Method:
MBE 18.26 W/m²
RMSE 76.04 W/m²


For the DIRINDEX method:

mbe_dirindex = mbe(dirindex, df['dni'])
rmse_dirindex = rmse(dirindex, df['dni'])

print("DIRINDEX Decomposition Method:")
print("MBE " + "%.2f" % mbe_dirindex + " W/m\u00b2")
print("RMSE " + "%.2f" % rmse_dirindex + " W/m\u00b2")

DIRINDEX Decomposition Method:
MBE -3.45 W/m²
RMSE 68.51 W/m²


For the Erbs method:

mbe_erbs = mbe(erbs['dni'], df['dni'])
rmse_erbs = rmse(erbs['dni'], df['dni'])

print("Erbs Decomposition Method:")
print("MBE " + "%.2f" % mbe_erbs + " W/m\u00b2")
print("RMSE " + "%.2f" % rmse_erbs + " W/m\u00b2")

Erbs Decomposition Method:
MBE 14.06 W/m²
RMSE 81.79 W/m²


The performance metrics show that, in this example, the DIRINDEX method reports the lowest error metrics, followed by the Erbs model, the DISC, and the DIRINT methods, respectively.

## Section summary#

This section has presented the solar radiation decomposition methods, which are a common type of technique to estimate the direct normal or beam irradiance from global horizontal irradiance observations. Particularly, the following examples have been covered:

• The Python implementation of three solar radiation decomposition methods have been presented using real data, namely the models used are the DISC, DIRINT, DIRINDEX and Erbs models.

• Time-series visualization and scatter plots have been introduced to evaluate differences in the models.

• Two common error metrics to evaluate the performance, i.e. MBE and RMSE, have been introduced.

Within solar resource assessment, decomposition methods are used as means to estimate the direct irradiance or irradiation for further analysis in particular applications. For example, in solar concentrating power where direct irradiance is essential for power generation availability. There are multiple decomposition methods available in the literature. However, the Python functions for these 4 methods are available in the Python library pvlib.