Introduction to Extreme Value Theory

Valentine Chisango
Feb 20, 2023

An introduction to modelling extreme values

In risk modelling, much of the focus is on extreme events, i.e. events that have a low probability of occurring but a very high severity. These events are found in the tail of the loss distribution¹. Modelling them is difficult because there is often limited data at the extreme end of the distribution. When a distribution is fitted to the entire dataset, the estimated parameters are influenced most by the data closer to the centre, since this is where the bulk of the data tends to lie. The fitted distribution may therefore provide little information about the distribution of extreme values. Extreme value theory provides us with two models that we can use for extreme values, which will be discussed in the context of the example below.

Suppose we have observed the motor claims experience of a general insurer, summarised by the histogram and table below. The distribution of these claim amounts is clearly long-tailed and positively skewed, with almost all of the claims falling in the (0; 10 000] interval.

Figure: Frequency of Motor Insurance Claims (graph by author)

The first extreme value theory model is the block maxima model. For this model, we consider the maximum values in equally sized subsets of our dataset. In particular, we divide our dataset into blocks of a certain size and calculate the maximum value in each block. The underlying assumption here is that the observations are independent and identically distributed.

Definition of block maxima
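As a minimal sketch of the step described above: if a block contains observations X_1, ..., X_n, its block maximum is M_n = max(X_1, ..., X_n). The function name `block_maxima`, the claim amounts and the block size below are hypothetical and chosen only for illustration; they are not taken from the article's dataset or its R code.

```python
def block_maxima(values, block_size):
    """Return the maximum of each consecutive block of `block_size` values.

    Trailing observations that do not fill a complete block are dropped,
    so that every maximum is taken over a block of the same size.
    """
    n_blocks = len(values) // block_size
    return [
        max(values[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]


# Hypothetical claim amounts, in the order in which they were observed.
claims = [120, 4500, 300, 980, 15000, 60, 700, 2300, 410, 95]

# Blocks of size 3; the tenth claim does not fill a block and is ignored.
maxima = block_maxima(claims, block_size=3)
print(maxima)
```

In practice the blocks are much larger (for example, all claims in a year), because the limiting result only applies as the block size grows.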

The concern is then fitting a distribution to those maximum values, the block maxima, once they have been standardised. The distribution of interest is the limiting distribution of the standardised block maxima, i.e. the distribution as the size of our blocks becomes sufficiently large. Standardising the block maxima requires finding appropriate sequences of normalising constants: a sequence of positive scaling constants and a sequence of real centring constants. Extreme value theory proves that it is possible to find appropriate sequences for most common loss distributions. This proof, and the approach for finding the appropriate sequences, are beyond the scope of this introductory piece but can be found in the reference material below². For a large class of underlying loss distributions, the distribution of the standardised block maxima will converge to a distribution called the generalised extreme value (GEV) distribution.

Generalised Extreme Value (GEV) distribution
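For reference, the convergence result and the standard form of the GEV distribution function can be written as below. The notation is an assumption, following the usual textbook convention (as in reference [2]): M_n is the maximum of a block of size n, c_n > 0 and d_n are the scaling and centring constants, and ξ is the shape parameter. It may differ from the notation in the original figure.

```latex
\lim_{n \to \infty} P\!\left( \frac{M_n - d_n}{c_n} \le x \right) = H_{\xi}(x),
\qquad
H_{\xi}(x) =
\begin{cases}
\exp\!\left( -(1 + \xi x)^{-1/\xi} \right), & \xi \neq 0, \\[4pt]
\exp\!\left( -e^{-x} \right), & \xi = 0,
\end{cases}
\qquad 1 + \xi x > 0 .
```

The sign of ξ determines the tail behaviour: ξ > 0 gives the heavy-tailed Fréchet case, ξ = 0 the Gumbel case, and ξ < 0 the Weibull case with a finite upper end point.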

For example, if we assume that the underlying distribution of our motor claims data is a two-parameter Pareto distribution, then we can determine the limiting distribution of the standardised block maxima as follows:

GEV where the underlying distribution is a two-parameter Pareto distribution
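The convergence can also be checked numerically. The sketch below assumes the two-parameter Pareto is parameterised as F(x) = 1 - (λ / (λ + x))^α and uses the normalising constants c_n = λ n^(1/α) / α and d_n = λ n^(1/α) - λ, for which the limit is a GEV distribution with shape ξ = 1/α. The parameter values and function names are hypothetical and are not taken from the article's R code.

```python
import math


def gev_cdf(x, xi):
    """Standard GEV distribution function H_xi(x)."""
    if xi == 0.0:
        return math.exp(-math.exp(-x))
    t = 1.0 + xi * x
    if t <= 0.0:
        # Outside the support: below it when xi > 0, above it when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def standardised_max_cdf(x, n, alpha, lam):
    """P((M_n - d_n) / c_n <= x) for n iid two-parameter Pareto losses."""
    c_n = lam * n ** (1.0 / alpha) / alpha
    d_n = lam * n ** (1.0 / alpha) - lam
    y = c_n * x + d_n
    if y <= 0.0:
        return 0.0
    survival = (lam / (lam + y)) ** alpha  # P(X > y) for a single loss
    # F(y)^n computed as exp(n * log(1 - survival)) for numerical accuracy.
    return math.exp(n * math.log1p(-survival))


alpha, lam = 3.0, 1000.0  # hypothetical Pareto parameters
x = 1.0                   # point at which the two CDFs are compared
limit = gev_cdf(x, 1.0 / alpha)

gaps = []
for n in (10, 1_000, 100_000):
    gap = abs(standardised_max_cdf(x, n, alpha, lam) - limit)
    gaps.append(gap)
    print(f"block size {n:>7}: |exact - GEV limit| = {gap:.2e}")

# A tail question answered with the limiting distribution:
# probability that the standardised block maximum exceeds 2.
tail_prob = 1.0 - gev_cdf(2.0, 1.0 / alpha)
print(f"P(standardised maximum > 2) is approximately {tail_prob:.4f}")
```

The gap between the exact distribution of the standardised maximum and the GEV limit shrinks as the block size grows, which is what the limiting result states.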

The limiting distribution could then be used to give more accurate answers to questions that pertain to the tail of the loss distribution, for example the probability that the standardised block maximum exceeds a given level. The key drawback of the block maxima model is that it does not make use of all the extreme data: only the maximum in each of the large blocks is used in the modelling. For this reason, threshold exceedance models are often preferred in practice, where data in the tails of the distribution tends to be limited.

The second extreme value theory model, the threshold exceedance model, uses all observations that exceed some specified threshold rather than only the block maxima. We consider the distribution of these exceedances, measured as the amount by which each observation exceeds the threshold. Extreme value theory tells us that, for a large class of underlying loss distributions, this distribution converges to a Generalised Pareto Distribution (GPD) as the threshold increases, that is:

Generalised Pareto Distribution (GPD)
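For reference, the standard form of the GPD distribution function is given below, together with the excess distribution it approximates. The notation is an assumption, following the usual textbook convention (as in reference [2]): u is the threshold, F_u is the distribution of the excess X - u given X > u, ξ is the shape parameter and β > 0 is the scale parameter. It may differ from the notation in the original figure.

```latex
F_u(x) = P(X - u \le x \mid X > u),
\qquad
G_{\xi,\beta}(x) =
\begin{cases}
1 - \left( 1 + \dfrac{\xi x}{\beta} \right)^{-1/\xi}, & \xi \neq 0, \\[8pt]
1 - \exp\!\left( -\dfrac{x}{\beta} \right), & \xi = 0,
\end{cases}
```

where x ≥ 0 when ξ ≥ 0 and 0 ≤ x ≤ -β/ξ when ξ < 0. For a high threshold u, F_u(x) is approximated by G_{ξ,β}(x), with a scale parameter β that depends on u.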

The parameters of the GPD can then be estimated using the threshold exceedances as the data and an appropriate model fitting technique such as maximum likelihood estimation. The fitted GPD can then be used to answer questions about the threshold exceedances in the same way that a fitted distribution is used in other contexts.
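The fitting step can be sketched in a few lines. The example below is an illustration under stated assumptions, not the method used in the article's R code: the losses are simulated from a hypothetical two-parameter Pareto distribution, the threshold is set at the empirical 95th percentile, and the likelihood is maximised with a simple grid search over θ = ξ/β that is restricted to the heavy-tailed case ξ > 0. A dedicated statistics library would normally be used instead.

```python
import math
import random


def simulate_pareto(n, alpha, lam, rng):
    """Draw n losses with F(x) = 1 - (lam / (lam + x))**alpha by inversion."""
    return [lam * ((1.0 - rng.random()) ** (-1.0 / alpha) - 1.0)
            for _ in range(n)]


def fit_gpd_mle(excesses, grid_points=400):
    """Approximate GPD maximum likelihood estimates (xi, beta), for xi > 0.

    For a fixed theta = xi / beta the likelihood is maximised by
    xi = mean(log(1 + theta * y)), so only theta is searched, over a
    log-spaced grid scaled by the mean excess.
    """
    n = len(excesses)
    mean_excess = sum(excesses) / n
    best = None
    for k in range(grid_points):
        theta = 10.0 ** (-4.0 + 6.0 * k / (grid_points - 1)) / mean_excess
        xi = sum(math.log1p(theta * y) for y in excesses) / n
        beta = xi / theta
        loglik = -n * (math.log(beta) + 1.0 + xi)  # profile log-likelihood
        if best is None or loglik > best[0]:
            best = (loglik, xi, beta)
    return best[1], best[2]


def gpd_survival(y, xi, beta):
    """P(excess > y) under a GPD with shape xi > 0 and scale beta."""
    return (1.0 + xi * y / beta) ** (-1.0 / xi)


rng = random.Random(2023)
alpha, lam = 3.0, 1000.0  # hypothetical "true" parameters
losses = simulate_pareto(100_000, alpha, lam, rng)

# Threshold at the empirical 95th percentile; excesses are loss - threshold.
threshold = sorted(losses)[int(0.95 * len(losses))]
excesses = [x - threshold for x in losses if x > threshold]

xi_hat, beta_hat = fit_gpd_mle(excesses)
print(f"threshold = {threshold:.0f}, exceedances = {len(excesses)}")
print(f"xi_hat = {xi_hat:.3f}, beta_hat = {beta_hat:.1f}")

# Example tail question: probability that a loss above the threshold
# exceeds it by more than 5000.
print(f"P(excess > 5000) = {gpd_survival(5000.0, xi_hat, beta_hat):.4f}")
```

For this simulated Pareto data the excesses follow a GPD with ξ = 1/α and β = (λ + u)/α, so the estimates can be compared against known values. With real claims data the true parameters are unknown and the choice of threshold becomes a judgement call between bias (threshold too low) and variance (too few exceedances).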

The repository with the R code to reproduce the graphs and calculations can be found here: https://github.com/ValentineChisango/A212-CS2

[1] My article on Loss Distributions can be found here: https://vmchisango.medium.com/introduction-to-loss-distributions-dbc3eacff971

[2] McNeil, A.J., Frey, R. and Embrechts, P., 2015. Quantitative Risk Management: Concepts, Techniques and Tools, revised edition. Princeton University Press.
