# Construction of detailed correlation structures across GI business segments

In this article, Benjamin Avanzi, Greg Taylor, and Bernard Wong from UNSW have teamed up to illustrate the uses of large-scale correlation structures in General Insurance.

## Why are we interested in large-scale correlation structures?

Who cares about large correlation matrices? Well, the fact is that they can be of considerable use in the construction of **risk margins** and, as will appear later, even **capital margins**. Certain aspects of correlation were examined in an earlier article of ours. The present article summarises our Insights presentation and the accompanying video.

In making such statements, one should always remain aware of the limitations of correlation as a measure of dependency. The **Pearson correlation coefficient** is often well suited to this purpose when the variates are **multivariate normally distributed**, but it may provide a very misleading measure in other circumstances (see e.g. McNeil et al., 2005). The correlation coefficient may therefore allow usefully for dependency between **business segments** (which might be lines of business) in the construction of risk margins, which typically lie near the centre of the distribution of outstanding claim liabilities.

The outstanding claims of a business segment will usually be estimated in a “lower triangle”. In the simple case where such triangles are 10×10, the lower triangle will contain 45 entries. If there are 50 segments, not unusual for a large insurer, there will be 2,250 entries in all, requiring a 2,250×2,250 correlation matrix containing about 2.5 million free entries (one for each distinct pair of cells).

This article concerns the construction of such a matrix in such a way that:

- its construction is simple;
- it is known to be positive definite;
- its entries are in accordance with intuition and can be seen to be reasonable;
- the relations between pairs of entries (there are about 3 trillion pairs) are also reasonable.
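The counts quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name and the assumption that a lower triangle of dimension *d* holds *d(d − 1)/2* cells (which gives 45 when *d* = 10) are ours.

```python
from math import comb


def correlation_matrix_size(segments: int, triangle_dim: int) -> dict:
    """Count cells, free correlation entries, and pairs of free entries."""
    # Assumption: a lower triangle of dimension d holds d(d-1)/2 cells (45 for d = 10).
    cells_per_segment = triangle_dim * (triangle_dim - 1) // 2
    cells = segments * cells_per_segment
    # A symmetric matrix with unit diagonal has one free entry per pair of cells.
    free_entries = comb(cells, 2)
    # Number of distinct pairs of free entries.
    entry_pairs = comb(free_entries, 2)
    return {"cells": cells, "free_entries": free_entries, "entry_pairs": entry_pairs}


sizes = correlation_matrix_size(segments=50, triangle_dim=10)
print(sizes)
```

For 50 segments this gives 2,250 cells, 2,530,125 free entries and a little over 3 trillion pairs of entries.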

## Common shock models

The idea used here to create dependency in a model is very simple. Suppose *X*, *Y*, *Z* are independent random variables, and define new variables

$$A = \alpha_A Z + \beta_A X$$ **(1)**

$$B = \alpha_B Z + \beta_B Y$$ **(2)**

where the $$\alpha$$’s and $$\beta$$’s are constants > 0.

Evidently, *A* and *B* are dependent provided *Z* is not degenerate. In fact

$$\mathrm{Cov}(A,B) = \alpha_A \alpha_B \sigma_Z^2 \geq 0$$

where $$\sigma_Z^2$$ denotes the variance of *Z*.

Models of this sort are known as **common shock models**, and form the basis of almost the entirety of this article.
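The covariance induced by a common shock can be checked by simulation. The following is a minimal sketch only: the standard normal distributions for *X*, *Y* and *Z*, the function names and the coefficient values are all our assumptions, chosen purely for illustration.

```python
import random
from statistics import fmean


def common_shock_cov(alpha_a, alpha_b, var_z):
    """Covariance of A and B implied by a shared shock Z with variance var_z."""
    return alpha_a * alpha_b * var_z


def common_shock_corr(alpha_a, beta_a, alpha_b, beta_b, var_x, var_y, var_z):
    """Pearson correlation of A = alpha_a*Z + beta_a*X and B = alpha_b*Z + beta_b*Y."""
    var_a = alpha_a**2 * var_z + beta_a**2 * var_x
    var_b = alpha_b**2 * var_z + beta_b**2 * var_y
    return common_shock_cov(alpha_a, alpha_b, var_z) / (var_a * var_b) ** 0.5


def simulate_cov(alpha_a, beta_a, alpha_b, beta_b, n=100_000, seed=1):
    """Sample covariance of A and B, assuming X, Y, Z are independent N(0, 1)."""
    rng = random.Random(seed)
    a_vals, b_vals = [], []
    for _ in range(n):
        x, y, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        a_vals.append(alpha_a * z + beta_a * x)
        b_vals.append(alpha_b * z + beta_b * y)
    mean_a, mean_b = fmean(a_vals), fmean(b_vals)
    return sum((a - mean_a) * (b - mean_b) for a, b in zip(a_vals, b_vals)) / (n - 1)


# Hypothetical coefficients: alpha_A = 1, beta_A = 2, alpha_B = 3, beta_B = 1.
print("theoretical covariance:", common_shock_cov(1.0, 3.0, 1.0))
print("simulated covariance:  ", simulate_cov(1.0, 2.0, 3.0, 1.0))
print("theoretical correlation:", common_shock_corr(1.0, 2.0, 3.0, 1.0, 1.0, 1.0, 1.0))
```

The simulated covariance should lie close to the theoretical value, with the gap shrinking as the number of simulations grows.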

## Application to multiple claim triangles

The schema set out in equations (1) and (2) can be extended to an arbitrary number of variates, instead of just *A* and *B*, and can be applied to claim triangles. Suppose one is dealing with triangles (in this case, upper triangles) associated with business segments, as illustrated in Figure 1, in which the generic observation is denoted $$X_{ij}^{(n)}$$, representing the observation of claim experience in development year $$j$$ of accident year $$i$$ in business segment $$n$$.

**Figure 1: Claim triangles for multiple business segments**

Evidently, one could formulate a common shock model along the lines of (1) and (2), in which *A* and *B* are replaced by $$X_{ij}^{(m)}$$ and $$X_{kl}^{(n)}$$, observations from different triangles. However, with such a large number of observations to be included in the dependency structure, the imposition of some shape on the structure will be helpful.

There are pre-eminent forms of between-triangle dependency that one might wish to include, e.g. **diagonal-wise dependency**, i.e. between the same diagonals of different triangles, and similarly rows, columns, etc. One may also wish to include some within-triangle dependencies. All of this, and more, can be effected, using the flexibility of common shock models.

Space precludes a detailed discussion of these models. This can be found in our formal paper, for which a web link is provided at the end of this article. But Figure 2 provides an example of a correlation matrix generated by a common shock model with diagonal-wise dependency overlaid by a smaller array-wide dependency.

The figure is schematic only, insensitive to small differences in correlations, but broadly heavier shading (darker grey or black) indicates larger values, with white indicating zero correlation. The matrix covers a number of business segments, and cells have been ordered according to accident year within calendar year within business segment.

**Figure 2: Correlation matrix for simultaneous diagonal-wise and array-wide dependence**
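A structure of the kind shown in Figure 2 can be sketched in a few lines. This is not the parametrisation of the formal paper. It assumes, purely for illustration, that each standardised cell is a weighted sum of one shock shared by every cell of every triangle, one shock per diagonal shared across triangles, and cell-specific noise, with hypothetical variance shares `w_array` and `w_diag`. A Cholesky factorisation then confirms positive definiteness.

```python
from math import sqrt


def triangle_cells(segments, dim):
    """Cells (segment, accident year i, development year j) of upper triangles,
    ordered by accident year within calendar year within segment."""
    cells = []
    for n in range(segments):
        for t in range(dim):  # calendar period (diagonal) t = i + j
            for i in range(t + 1):
                cells.append((n, i, t - i))
    return cells


def correlation_matrix(cells, w_array, w_diag):
    """Correlation = w_array for any two distinct cells, plus w_diag if they share a diagonal."""
    size = len(cells)
    corr = [[0.0] * size for _ in range(size)]
    for r, (_, i, j) in enumerate(cells):
        for c, (_, k, l) in enumerate(cells):
            if r == c:
                corr[r][c] = 1.0
            else:
                same_diagonal = (i + j) == (k + l)
                corr[r][c] = w_array + (w_diag if same_diagonal else 0.0)
    return corr


def cholesky(matrix):
    """Lower Cholesky factor; raises ValueError if the matrix is not positive definite."""
    size = len(matrix)
    lower = [[0.0] * size for _ in range(size)]
    for r in range(size):
        for c in range(r + 1):
            s = sum(lower[r][k] * lower[c][k] for k in range(c))
            if r == c:
                pivot = matrix[r][r] - s
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[r][c] = sqrt(pivot)
            else:
                lower[r][c] = (matrix[r][c] - s) / lower[c][c]
    return lower


cells = triangle_cells(segments=2, dim=4)
corr = correlation_matrix(cells, w_array=0.1, w_diag=0.3)  # hypothetical shares
factor = cholesky(corr)  # succeeds, so the matrix is positive definite
print(len(cells), "cells; Cholesky factorisation succeeded")
```

Because the matrix is generated by an explicit common shock model with a positive noise share, positive definiteness is guaranteed by construction; the factorisation merely verifies it numerically.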

The figure illustrates a reasonably complex dependency structure, but one might well question why dependency between diagonals of different segments exists only for the same diagonals. It might be more reasonable to expect dependency between all diagonals, but reducing as the distance between diagonals increases.

A simple modification of the common shock model introduces AR(1) dependency between diagonals, according to which a parameter $$D_t$$ associated with diagonal $$t$$ evolves through time according to the following:

$$D_t = \theta D_{t-1} + \varepsilon_t, \quad E[\varepsilon_t] = 0, \quad \mathrm{Var}[\varepsilon_t] = \sigma_{\varepsilon}^{2}$$

where $$\theta$$ is a constant, and $$\varepsilon_t$$ a random quantity.

Under such a model, dependency between diagonals reduces roughly in geometric sequence as the distance between diagonals increases. This yields the rich correlation structure illustrated for two business segments in Figure 3.

**Figure 3: Correlation matrix for AR(1) diagonal-wise dependence**
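The geometric decay can be illustrated by simulation. The sketch below assumes a stationary AR(1) process (that is, $$|\theta| < 1$$) with Gaussian innovations; both are our assumptions, as are the function names and parameter values. Under stationarity the correlation between $$D_t$$ and $$D_{t+k}$$ is $$\theta^k$$, which the sample autocorrelation should approximate.

```python
import random


def ar1_path(theta, sigma_eps, n_steps, seed=0):
    """Simulate D_t = theta * D_{t-1} + eps_t, started from the stationary distribution."""
    rng = random.Random(seed)
    # Stationary standard deviation sigma_eps / sqrt(1 - theta^2); requires |theta| < 1.
    d = rng.gauss(0.0, sigma_eps / (1.0 - theta**2) ** 0.5)
    path = [d]
    for _ in range(n_steps - 1):
        d = theta * d + rng.gauss(0.0, sigma_eps)
        path.append(d)
    return path


def sample_autocorr(path, lag):
    """Sample autocorrelation of the path at the given lag."""
    n = len(path)
    mean = sum(path) / n
    var = sum((d - mean) ** 2 for d in path) / n
    cov = sum((path[t] - mean) * (path[t + lag] - mean) for t in range(n - lag)) / n
    return cov / var


theta = 0.7  # hypothetical value
path = ar1_path(theta=theta, sigma_eps=1.0, n_steps=200_000)
for lag in range(1, 4):
    print(lag, round(sample_autocorr(path, lag), 3), "vs theta^lag =", round(theta**lag, 3))
```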

## Reduction to simple concepts for populating large correlation matrices

Although it has been shown that the common shock model can generate a very large and complex correlation structure, it is evident that a price, in the form of additional parameters, must be paid for each additional feature included in the structure. For practical purposes, it is necessary to keep the number of parameters to the minimum compatible with the desired dependency structure.

Since the parameters will commonly be estimated heuristically (i.e. by informed guesswork), it is also essential that each have a strongly intuitive meaning in order that the practitioner be afforded a chance of reasonable accuracy in estimation. Again, space precludes a recital of the detail, but most of the parameters required by a model can be obtained by decomposition of cell variances into intuitive components.

Details will vary according to the specifics of the model chosen but, in the case of a dependency structure that includes AR(1) diagonal-wise dependency both within and between triangles, one needs to estimate the proportion of each cell’s variance that relates to:

- a diagonal common shock across all triangles;
- a diagonal common shock specific to the triangle;
- noise specific to the cell.

The AR(1) parameters, one per diagonal common shock, that describe how dependence between diagonals decays with increasing distance between them, must be estimated separately.

The end result for this model is the estimation of a manageable *3N + 1* parameters, where *N* is the number of business segments. In our earlier example involving an insurer with 50 business segments, this model would require 151 parameters to describe a correlation matrix with about 2.5 million free entries (which would require about 2.5 million parameters in the absence of a model structure).

## Numerical example for risk margins

Figure 4 gives a numerical example of the correlation structure just described, for two business segments and on the basis of assumed values of the *3N + 1 = 7* required parameters. Specifically, these are:

- a parameter for each segment, specifying the proportion of variance of each of the segment’s cells contributed by the diagonal effect;
- a parameter for each segment, specifying the proportion of variance contributed by the effect associated with the segment itself;
- a parameter for each segment, specifying the AR(1) rate of mean reversion of the diagonal effect specific to that segment;
- a single parameter, specifying the AR(1) rate of mean reversion of a diagonal effect common to both segments.

**Figure 4: Numerical example**
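To indicate how seven parameters can populate an entire correlation matrix, the following sketch computes the correlation between any two cells under one possible reading of the structure described above. It assumes that each standardised cell is the sum of a stationary AR(1) diagonal shock common to all segments, a stationary AR(1) diagonal shock specific to its own segment, and cell-specific noise. The class, the function and all parameter values are hypothetical; they are not the values underlying Figure 4.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class SegmentParams:
    share_common: float   # share of cell variance from the diagonal shock common to all segments
    share_segment: float  # share of cell variance from the segment-specific diagonal shock
    theta_segment: float  # AR(1) parameter of the segment-specific diagonal shock


def cell_correlation(params, theta_common, cell_a, cell_b):
    """Correlation of two cells, each given as (segment, accident year, development year)."""
    if cell_a == cell_b:
        return 1.0
    (n, i, j), (m, k, l) = cell_a, cell_b
    lag = abs((i + j) - (k + l))  # distance between the two diagonals
    common = sqrt(params[n].share_common * params[m].share_common) * theta_common**lag
    if n != m:
        return common  # different segments share only the common diagonal shock
    return common + params[n].share_segment * params[n].theta_segment**lag


# Hypothetical values for N = 2 segments: 3 parameters per segment plus 1 common AR(1) parameter.
segment_params = [SegmentParams(0.3, 0.4, 0.5), SegmentParams(0.2, 0.5, 0.6)]
theta_common = 0.8
n_parameters = 3 * len(segment_params) + 1

print("parameters:", n_parameters)
print("same segment, same diagonal:", cell_correlation(segment_params, theta_common, (0, 0, 1), (0, 1, 0)))
print("same segment, two diagonals apart:", cell_correlation(segment_params, theta_common, (0, 0, 0), (0, 0, 2)))
print("different segments, adjacent diagonals:", cell_correlation(segment_params, theta_common, (0, 0, 0), (1, 0, 1)))
```

Under these assumptions, each correlation is a sum of variance shares damped geometrically by the distance between diagonals, which is what makes the parameters amenable to heuristic estimation.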

## Capital margins

Correlations are, of themselves, of little assistance in the calculation of high-quantile capital margins. However, they may be used to inform a copula, e.g. a $$t$$ copula, which may then provide a reliable vehicle for the calculation. See the full paper for details.
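The reason a copula matters at high quantiles can be seen in a small simulation. The sketch below is not the procedure of the full paper. It simply compares, for the same correlation parameter, how often two variates jointly exceed their 99% quantiles under a Gaussian copula and under a $$t$$ copula. The correlation of 0.5, the 3 degrees of freedom and the function names are hypothetical choices of ours.

```python
import random
from math import sqrt


def sample_pairs(rho, n, dof=None, seed=0):
    """Draw n pairs with correlation parameter rho: Gaussian if dof is None, else Student t."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x1, x2 = z1, rho * z1 + sqrt(1.0 - rho * rho) * z2
        if dof is not None:
            # A chi-square mixing variable shared by both components creates tail dependence.
            scale = sqrt(rng.gammavariate(dof / 2.0, 2.0) / dof)
            x1, x2 = x1 / scale, x2 / scale
        pairs.append((x1, x2))
    return pairs


def joint_tail_frequency(pairs, level):
    """Share of pairs in which both components exceed their own empirical quantile."""
    n = len(pairs)
    cut = int(level * n)
    # Empirical quantiles depend only on ranks, so the result reflects the copula alone.
    threshold_1 = sorted(p[0] for p in pairs)[cut]
    threshold_2 = sorted(p[1] for p in pairs)[cut]
    return sum(1 for a, b in pairs if a > threshold_1 and b > threshold_2) / n


gaussian_freq = joint_tail_frequency(sample_pairs(0.5, 50_000), level=0.99)
t_freq = joint_tail_frequency(sample_pairs(0.5, 50_000, dof=3), level=0.99)
print("Gaussian copula joint exceedance frequency:", gaussian_freq)
print("t copula joint exceedance frequency:       ", t_freq)
```

With these settings the $$t$$ copula should show noticeably more joint extreme outcomes than the Gaussian copula, even though both are informed by the same correlation.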

## Conclusions

Dependency models have been constructed across triangles for multiple business segments. These are flexible models that allow for:

- within- and between-triangle dependencies;
- row-, column- and diagonal-wise dependencies (and, indeed, just about anything else);
- time series dependencies between different rows, etc.

The models can be expressed in a parametrisation that is:

- frugal in the number of parameters;
- intuitive in interpretation.

The models are applicable directly to risk margins, but also applicable to capital margins under a simple extension.

## Acknowledgements

This research was supported by an Australian Actuarial Research Grant awarded by the Australian Actuaries Institute to the authors. The views expressed herein are those of the authors and are not necessarily those of the supporting organisations.

## References

McNeil, A.J., Frey, R. and Embrechts, P. (2005). *Quantitative Risk Management: Concepts, Techniques and Tools*. Princeton University Press, Princeton NJ, USA.

Avanzi, B., Taylor, G. and Wong, B. (2016). *Common shock models for claim arrays*. UNSW Business School Research Paper Series 2016ACTL07. https://ssrn.com/abstract=2881058
