Further increases in the AOW-age from 2022 (at age 67) onward are to be estimated using a stochastic mortality model and will be based on best-estimate forecasts of the remaining life expectancy.

Whether or not an increase in the AOW-age is deemed justified depends on the outcome of the formulas used to determine whether an increase is necessary.

These formulas (as found in \textit{art. 7a AOW}) are:

\begin{eqnarray}
V^{\mbox{AOW}}_{t} & := & (L_{t}-18.26)-(P_{t-1}-65), \nonumber\\
V^{\mbox{pension}}_{t} & := & (L_{t+10}-18.26)-(P_{t-1}-65), \nonumber
\end{eqnarray}

for the AOW-age and the pensionable age, respectively.

Here, $t$ is the year in which a possible increase in the AOW-age or pensionable age takes effect, $V_{t}$ determines the increase in the AOW-/pensionable age and in the initial age of accrual, measured in years, and $L_{t}$ is the best estimate of the sex-neutral remaining life expectancy of a 65-year-old in year $t$. For the AOW-age, this estimate is determined by the CBS at time $t-5$.

$P_{t-1}$ is the AOW-age or pensionable age in force in the previous year.

A minimum increase in $V^{\mbox{AOW}}_{t}$ of 0.25 (i.e.\ three months) in one year is maintained as the threshold that prompts an increase in the AOW-age. For any value below 0.25 (or possibly even a decrease), the AOW-age in that year remains unchanged.

Similarly for the pensionable age: for any increase in $V^{\mbox{pension}}_{t}$ in a given year greater than or equal to 1, the pensionable age increases by one year, while any value below 1 (or any decrease) leaves the pensionable age unchanged in that year.
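As an illustration, the year-by-year step rule for the AOW-age can be sketched in Python. This is a minimal sketch under our own naming conventions (the function name and input format are not part of the law); it applies the $V^{\mbox{AOW}}_{t}$ formula and the three-month threshold described above:

```python
def aow_age_path(le_forecast, p_prev=67.0):
    """Apply the art. 7a AOW step rule year by year.

    le_forecast: dict mapping year t to L_t, the best-estimate sex-neutral
                 remaining life expectancy of a 65-year old in year t.
    p_prev:      AOW-age in force before the first year considered.
    """
    path = {}
    for t in sorted(le_forecast):
        v = (le_forecast[t] - 18.26) - (p_prev - 65.0)
        if v >= 0.25:          # an increase of three months only if V_t >= 0.25
            p_prev += 0.25
        path[t] = p_prev       # below 0.25 (or a decrease): unchanged
    return path
```

For example, a forecast of $L_{t} = 20.50$ gives $V_{t} = 0.24 < 0.25$ when the current AOW-age is 67, so the AOW-age stays put; $L_{t} = 20.60$ gives $V_{t} = 0.34$ and triggers a three-month increase.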

From this it follows that the forecast of the AOW-age can be expressed as a linear transformation of the projection of the sex-neutral life expectancy:

\begin{equation}
P_{t} := P_{t-1}+V_{t} = P_{t-1}+(L_{t}-18.26)-(P_{t-1}-65) = L_{t}+46.74.
\end{equation}

Similarly for the pensionable age:

\begin{equation}
P_{t} := P_{t-1}+V_{t} = P_{t-1}+(L_{t+10}-18.26)-(P_{t-1}-65) = L_{t+10}+46.74.
\end{equation}

The incremental increases described above, three months for the AOW-age and one year for the pensionable age, will inevitably cause a mismatch between AOW and pension and the payout of entitlements. While everyone would agree this is undesirable, there is no straightforward solution: having the pensionable age increase in steps of three months would undoubtedly result in a large amount of additional administrative work and associated costs, which would ultimately be passed on to the individuals participating in the pension plan. On the other hand, an increase of the AOW-age in steps of one year would greatly disadvantage individuals very close to retirement, since there is usually already a set date for retirement. Increasing the AOW-age by a year would in that case force such an individual to bridge a one-year financial gap with no income and no, or insufficient, pension entitlements.

\section{Data}

If we let $x$ be the age of a person and $t$ the year in which this person is $x$ years old, then we need the number of deaths $D_{x,t}$ and the exposures-to-risk $E_{x,t}$ to construct the mortality model. $D_{x,t}$ gives the observed death count of people aged $[x, x+1)$ in the year $[t, t+1)$; in other words, these are the people who die in year $t$ at exact age $x$, before reaching age $x+1$. Similarly, $E_{x,t}$ is an estimate of the population aged $[x, x+1)$ exposed to the risk of death during the interval $[t, t+1)$, based on annual population estimates.

All estimations, forecasts and further computations in this thesis are based on Dutch data on deaths and exposures-to-risk, as made available by the Royal Dutch Actuarial Association (Koninklijk Actuarieel Genootschap), see \cite{22}. This dataset contains observations in cells of 1 year of age by 1 calendar year ($1 \times 1$), with ages $x \in \mathcal{X} := \{0,1,\dots,90\}$ and years $t \in \mathcal{T} := \{1970,1971,\dots,2015\}$.

\section{Assumptions}

Here we describe the notation used and the assumptions made for further derivation of formulas, formulation of models and estimation of parameters and variables.

\subsection{Estimation for total population}

It is common practice to estimate stochastic mortality models separately for males and females, but the AOW-age and pensionable age are ultimately based on estimates for the total population, since discrimination based on gender is not allowed (even though gender has a significant effect on remaining life expectancy). Taking this into consideration, we focus only on modeling mortality for the total population, with the exception of some visualizations of mortality rates for both sexes.

\subsection{Stochastic Framework}

First we let the 1-year probability of death $q_{x,t}$ be the probability that an individual aged $x$ is alive on the first day of year $t$, but dies somewhere in the interval $(t,t+1)$.

Taking this one step further, we define ${}_{k}q_{x,t}$ as the probability that an individual aged $x$ at time $t$ dies in the interval $(t,t+k)$.

From this it immediately follows that the survival probability of the same individual over the same interval can be defined as ${}_{k}p_{x,t} = 1 - {}_{k}q_{x,t}$, and we assume that ${}_{k}q_{x,t}$, as a function of $k$, is differentiable. With these functions it is possible to express \textit{death in the next moment of time conditional on survival up to age $x$} by a conditional density function.

This is the force of mortality $\mu_{x,t}$, which is defined as:

\begin{equation}
\mu_{x+k,t+k} := \frac{1}{{}_{k}p_{x,t}} \frac{\partial\, {}_{k}q_{x,t}}{\partial k} = -\frac{1}{{}_{k}p_{x,t}} \frac{\partial\, {}_{k}p_{x,t}}{\partial k}.
\end{equation}

This can be rewritten in integral form as:

\begin{equation}
{}_{k}p_{x,t} \quad := \quad \exp\left(-\int_{0}^{k} \mu_{x+u,t+u}\, \mathrm{d}u \right).
\end{equation}

\subsection{Piece-wise constant hazard rates}

To ultimately be able to do calculations, we need approximations of the 1-year mortality rate $q_{x,t}$. If we assume that $\mu_{x,t}$ is a piece-wise constant hazard rate, i.e.\ $\mu_{x+k,t+k} = \mu_{x,t}$ for $k \in [0,1)$, we can write:

\begin{equation}
q_{x,t} = 1-p_{x,t} = 1-\exp\left(-\int_{0}^{1} \mu_{x+u,t+u}\, \mathrm{d}u\right) = 1-\exp(-\mu_{x,t}).
\end{equation}

\subsection{Remaining life expectancy}

We define the remaining life expectancy as the average number of years that an $x$-year-old survives. We calculate this in two different ways. The first is based on the assumption that the 1-year mortality rate does not change over time, i.e.\ $q_{x,t} = q_{x,t+s}$ for all $s>0$. This gives the complete remaining period life expectancy:

\begin{equation}
\mathring{e}^{\text{per}}_{x,t} := \frac{1}{2} + \sum\limits_{k \geq 1} \prod\limits_{l=0}^{k-1}\left(1-q_{x+l,t}\right).
\end{equation}

The other way in which we calculate the remaining life expectancy takes into account all future improvements and deteriorations in mortality rates:

\begin{equation}
\mathring{e}^{\text{coh}}_{x,t} := \frac{1}{2} + \sum\limits_{k \geq 1} \prod\limits_{l=0}^{k-1}\left(1-q_{x+l,t+l}\right).
\end{equation}

This is the complete remaining cohort life expectancy, where $q_{x+l,t+l}$ is the 1-year mortality rate of an individual aged $x+l$ in year $t+l$.

Although remaining life expectancy was historically calculated on a period basis (likely due to insufficient data and/or computational limitations), the cohort method makes much more sense, given that an individual experiences mortality rates over consecutive years, which one can visualize as moving diagonally through a table of mortality rates by year and age.

In this thesis, we will use both methods for further calculations.
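To make the two methods concrete, here is a small Python sketch (function names and input layout are our own; the one-half term is the usual convention that deaths occur on average halfway through the year). We assume a matrix `q` of 1-year mortality rates indexed by age (rows) and calendar year (columns), extending far enough that survival is effectively zero at the end:

```python
import numpy as np

def period_life_expectancy(q, x, t):
    """Complete period life expectancy: mortality rates frozen at year t."""
    rates = q[x:, t]                       # q_{x,t}, q_{x+1,t}, ...
    survival = np.cumprod(1.0 - rates)     # k-year survival probabilities
    return 0.5 + survival.sum()

def cohort_life_expectancy(q, x, t):
    """Complete cohort life expectancy: follow the diagonal q_{x+l, t+l}."""
    n = min(q.shape[0] - x, q.shape[1] - t)
    rates = np.array([q[x + l, t + l] for l in range(n)])
    survival = np.cumprod(1.0 - rates)
    return 0.5 + survival.sum()
```

The only difference is the indexing: the period method reads a column of the table, while the cohort method reads the diagonal.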

\subsection{Closing the life tables for the higher ages}

The dataset used in this thesis only covers the ages $\mathcal{X} := \{0,\dots,90\}$, but since longevity is the central theme of this thesis, the higher ages certainly deserve attention, and for estimating and forecasting cohort life expectancy, including these higher ages even becomes a necessity. The ages $\mathcal{X}_{\text{old}} := \{91, \dots, 120\}$ are likely excluded because the corresponding mortality rates are very volatile, caused by a low number of observations and a small exposure-to-risk. To close the life table, we can use the logistic mortality law of \cite{17}. The life tables established by the Koninklijk Actuarieel Genootschap \cite{22} use this method for closing the life tables as well. Kannist\"{o}'s method works as follows:\\

The force of mortality $\mu_{x,t}$ for the higher ages is modeled as:

\begin{equation}
\mu_{x,t} := \frac{\phi_{1,t} \cdot \text{e}^{\phi_{2,t} \cdot x}}{1+\phi_{1,t} \cdot \text{e}^{\phi_{2,t} \cdot x}},
\end{equation}

and these parameters $\phi_{1,t}$ and $\phi_{2,t}$ can be estimated for any $t$ by first transforming the mortality rates into hazard rates, under the assumption that these hazard rates are constant within each age-time interval:

\begin{equation}
\mu_{x,t} = -\ln(1-q_{x,t}).
\end{equation}

We now apply a logistic transform of the form $\text{logit}(x) = \ln\left(\frac{x}{1-x}\right)$ to this equation:

\begin{equation}
\text{logit}(\mu_{x,t}) = \ln(\phi_{1,t}) + \phi_{2,t} \cdot x.
\end{equation}

With this linear model, the parameters $\phi_{1,t}$ and $\phi_{2,t}$ can be estimated by applying Ordinary Least Squares (OLS) to the ages $\mathcal{X}_{\text{fit}} := \{80, \dots, 90\}$, with design matrix

\begin{equation}
X := \begin{pmatrix} 1 & 80 \\ 1 & 81 \\ \vdots & \vdots \\ 1 & 90 \end{pmatrix},
\end{equation}

and the OLS estimate $\hat{\phi}_{t}$ is then found by matrix multiplication:

\begin{equation}
\hat{\phi}_{t} \quad := \quad (X'X)^{-1}X'\,\text{logit}(\mu_{80:90,t}).
\end{equation}

The estimates $\hat{\phi}_{1,t}$ and $\hat{\phi}_{2,t}$ found this way enable the extrapolation of the mortality rates to the higher ages:

\begin{equation}
q^{\text{old}}_{x,t} := 1 - \text{e}^{-\frac{\hat{\phi}_{1,t} \cdot \text{e}^{\hat{\phi}_{2,t} \cdot x}}{1+\hat{\phi}_{1,t} \cdot \text{e}^{\hat{\phi}_{2,t} \cdot x}}}, \qquad \forall x \in \mathcal{X}_{\text{old}} := \{91, 92, \dots, 120\}.
\end{equation}

Ultimately, we can use the force of mortality found from the above equation to determine the mortality rates for each year $t$, using an equation for the approximation of the 1-year mortality rate $q_{x,t}$:

\begin{equation}
q_{x,t} \quad := \quad 1-p_{x,t} \quad = \quad 1-\exp\left( -\int_{0}^{1} \mu_{x+u,t+u}\, \mathrm{d}u \right) \quad \approx \quad 1-\exp(-\mu_{x,t}).
\end{equation}
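The whole closing procedure can be sketched in Python with NumPy (function and variable names are our own; we assume `q_fit` holds the observed 1-year mortality rates for ages 80 to 90 in a single year):

```python
import numpy as np

def close_life_table(q_fit, ages_fit=np.arange(80, 91), ages_old=np.arange(91, 121)):
    """Kannisto closure: OLS fit of logit(mu) on age, then extrapolation."""
    mu = -np.log(1.0 - np.asarray(q_fit))          # hazard rates, constant per cell
    y = np.log(mu / (1.0 - mu))                    # logit transform
    X = np.column_stack([np.ones(len(ages_fit)), ages_fit])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # logit(mu) = ln(phi1) + phi2 * x
    phi1, phi2 = np.exp(coef[0]), coef[1]
    mu_old = phi1 * np.exp(phi2 * ages_old) / (1.0 + phi1 * np.exp(phi2 * ages_old))
    return 1.0 - np.exp(-mu_old)                   # back to 1-year mortality rates
```

Because the logistic law bounds the force of mortality by one, the extrapolated rates increase with age but never explode, which is exactly the behavior wanted when closing the table at age 120.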

\subsection{Best estimate and confidence interval}

In this thesis, as in \cite{24}, we define the best estimate as the most likely outcome under uncertainty in the projections of our stochastic mortality model.

For the mortality probabilities $q_{x,t}$, we obtain this best estimate by setting the error terms in the stochastic time series to zero. The remaining life expectancy calculated from the best-estimate projection table of these $q_{x,t}$ then becomes a best estimate of the remaining life expectancy in its own right.\\

The uncertainty around this best estimate follows from a number of simulations $N$ of the mortality projections, with realizations $X_{1}, \dots, X_{N}$. For a given confidence level $\alpha$, we define the confidence interval as $[\hat{\Phi}^{(\frac{\alpha}{2})}, \hat{\Phi}^{(1-\frac{\alpha}{2})}]$, where $\hat{\Phi}^{(\frac{\alpha}{2})}$ and $\hat{\Phi}^{(1-\frac{\alpha}{2})}$ are the estimated $(\frac{\alpha}{2})$-th and $(1-\frac{\alpha}{2})$-th percentiles, respectively. These estimates of the lower and upper bounds become more accurate as the number of realizations $N$ increases.
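As a sketch of how such a simulation-based interval is computed in practice (the normal distribution and the numbers here are purely illustrative stand-ins, not our mortality model):

```python
import numpy as np

rng = np.random.default_rng(2015)
# stand-in for N simulated remaining life expectancies
sims = rng.normal(loc=20.0, scale=0.5, size=100_000)

alpha = 0.05
lower, upper = np.percentile(sims, [100 * alpha / 2, 100 * (1 - alpha / 2)])
# [lower, upper] estimates the 95% confidence interval; the percentile
# estimates converge to the true quantiles as the number of simulations grows
```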

\section{Mortality rates}

The maximum likelihood estimator of the mortality rates, as defined previously, is equal to the number of deaths at age $x$ in year $t$ divided by the corresponding exposure-to-risk $E_{x,t}$:

\begin{equation}
\mu_{x,t} = \frac{D_{x,t}}{E_{x,t}}.
\end{equation}

This result follows from the assumption that the number of deaths $D_{x,t}$ is Poisson distributed with mean $E_{x,t} \cdot \mu_{x,t}$, proportional to the population size, as in \cite{27}.

The likelihood function $L$ in this case is:

\begin{equation}
L \quad := \quad \prod\limits_{x \in \mathcal{X}} \prod\limits_{t \in \mathcal{T}}\left[\frac{(E_{x,t}\cdot \mu_{x,t})^{D_{x,t}} \cdot \text{e}^{-E_{x,t} \cdot \mu_{x,t}}}{D_{x,t}!} \right].
\end{equation}

The log-likelihood is then derived from this as:

\begin{equation}
\ln(L) \quad = \quad \sum\limits_{x \in \mathcal{X}} \sum\limits_{t \in \mathcal{T}} \left[D_{x,t} \cdot \ln(E_{x,t}\cdot \mu_{x,t})-E_{x,t} \cdot \mu_{x,t} - \ln(D_{x,t}!)\right],
\end{equation}

and we maximize this log-likelihood by setting the partial derivative with respect to $\mu_{x,t}$ to zero:

\begin{equation}
\frac{\partial \ln(L)}{\partial \mu_{x,t}} \quad = \quad \frac{D_{x,t}}{\mu_{x,t}}-E_{x,t} \quad = \quad 0.
\end{equation}

When we rewrite this, we end up with the equation introduced at the beginning of this section, which is the estimator of the hazard rate $\hat{\mu}_{x,t}$ for which the log-likelihood is maximized:

\begin{equation}
\hat{\mu}_{x,t} \quad = \quad \frac{D_{x,t}}{E_{x,t}} \quad = \quad m_{x,t}.
\end{equation}
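A quick numerical check of this result (the death and exposure counts are made up for illustration): evaluating the Poisson log-likelihood of a single cell on a small grid of candidate rates confirms that it peaks at $D_{x,t}/E_{x,t}$:

```python
from math import lgamma, log

def cell_loglik(mu, deaths, exposure):
    """Poisson log-likelihood of one (x,t) cell: D ~ Pois(E * mu)."""
    lam = exposure * mu
    return deaths * log(lam) - lam - lgamma(deaths + 1)

deaths, exposure = 12, 1000.0
mle = deaths / exposure                    # closed-form estimator m_{x,t}
candidates = [0.5 * mle, 0.9 * mle, mle, 1.1 * mle, 2.0 * mle]
best = max(candidates, key=lambda m: cell_loglik(m, deaths, exposure))
# best equals mle: no candidate beats the central death rate
```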

This $\hat{\mu}_{x,t}$ stands for the number of people that died during an $(x,t)$-interval, divided by the number of people that \textit{could have died} during the same interval. This estimator is known as the \textit{central death rate} and is denoted, as in the above equation, by $m_{x,t}$. This ratio can be computed and visualized directly from the available dataset. The figure below shows the pattern of logarithmic death rates for ages 0 to 90 for males, females and the total population, respectively:

\begin{figure}[H]
\centering
\includegraphics[width=0.7\linewidth]{"../Data/Rplot Log Death Rates Age"}
\caption{Logarithm of death rates for Dutch males, females and total population for ages 0 to 90 and all years}
\label{fig:rplot-log-death-rates-age}
\end{figure}

Notice the rates declining for children as they age, with the `accident bump' around age 20 much more pronounced for males.\\

If we take a look at the log central death rates relative to time, we get different insights:\\

\begin{figure}[H]
\centering
\includegraphics[width=0.7\linewidth]{"../Data/Rplot Log Death Rates Time"}
\caption{Estimates of the log force of mortality for Dutch males, females and total population for the period 1970 to 2015 for all ages}
\label{fig:rplot-log-death-rates-time}
\end{figure}

This figure shows that mortality in the Dutch population is falling rapidly at the younger ages, but not all that much at the highest ages.

Finally, we can create a 3D plot for the log death rates, plotting them against both time and age, resulting in the following image:

\begin{figure}[H]
\centering
\includegraphics[width=0.7\linewidth]{"../Data/Rplot central death rates"}
\caption{Estimates of the log force of mortality for ages 0 to 90 and years 1970 to 2015 for the total Dutch population}
\label{fig:rplot-central-death-rates}
\end{figure}

This image shows consistently higher log central death rates for the earlier years in the dataset, as well as a more pronounced `accident bump'. Later years show a smoother progression over age and lower values of the log central death rates, supporting the widely accepted view that longevity is increasing over time.\\

\chapter{Stochastic Mortality Model}

We model human mortality using the Lee-Carter model, defined as:

\begin{equation}
\ln (\hat{\mu}_{x,t}) := \ln \left(\frac{D_{x,t}}{E_{x,t}}\right) = \alpha_{x}+\beta_{x} \kappa_{t}+\epsilon_{x,t}.
\end{equation}

Here $\alpha_{x}$ and $\beta_{x}$ are age-dependent parameters which do not change over time, while $\kappa_{t}$ is a time-varying parameter which does not depend on the age $x$. More specifically:

\begin{itemize}
\item $\alpha_{x}$ is the average of the logarithm of the central death rates ($\ln(m_{x,t})$) over time;
\item $\beta_{x}$ is the rate of change of the log central death rates for age $x$ given a change in the time-varying parameter $\kappa_{t}$;
\item $\kappa_{t}$ is defined as the time-varying index of the level of mortality.
\end{itemize}

In this way, the model separates age and period effects. The error term $\epsilon_{x,t}$ is assumed to be an independent and identically distributed Gaussian variable for all pairs $(x,t)$.

There are three fitting procedures most commonly associated with the Lee-Carter model.

The first option is singular value decomposition, based on a least squares criterion minimizing the sum of squared residuals:

\begin{equation}
SSR := \underset{\alpha_{x},\beta_{x},\kappa_{t}}{\text{min}} \sum\limits_{x \in \mathcal{X}} \sum\limits_{t \in \mathcal{T}}\left(\ln\left(\frac{D_{x,t}}{E_{x,t}}\right)-\alpha_{x}-\beta_{x} \cdot \kappa_{t}\right)^{2}.
\end{equation}

The problem with this approach is that whenever the number of deaths $D_{x,t}$ or the exposure-to-risk $E_{x,t}$ equals zero, the logarithm is undefined. Sticking with singular value decomposition, this can only be avoided by excluding the offending age(s) $x$ or year(s) $t$, which becomes problematic when we want to use the full dataset.
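Provided no cell of the matrix is zero, the SVD fit itself takes only a few lines. The sketch below is our own minimal implementation: it extracts the leading rank-1 approximation of the row-centered log rates and then applies the usual identifiability constraints $\sum_{x}\beta_{x}=1$, $\sum_{t}\kappa_{t}=0$:

```python
import numpy as np

def fit_lee_carter_svd(log_m):
    """Rank-1 SVD fit of log central death rates (ages x years matrix)."""
    alpha = log_m.mean(axis=1)                        # row (age) averages over time
    U, s, Vt = np.linalg.svd(log_m - alpha[:, None], full_matrices=False)
    beta, kappa = U[:, 0], s[0] * Vt[0]               # leading singular pair
    # identifiability: sum(beta) = 1, sum(kappa) = 0
    scale = beta.sum()
    alpha = alpha + beta * kappa.mean()
    kappa = (kappa - kappa.mean()) * scale
    beta = beta / scale
    return alpha, beta, kappa
```

The rescaling leaves every fitted value $\alpha_{x}+\beta_{x}\kappa_{t}$ unchanged; it only pins down the otherwise arbitrary scale and location of $\beta_{x}$ and $\kappa_{t}$.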

A second method of fitting is to use weighted least squares, where the $D_{x,t}$ act as weights for the observations:

\begin{equation}
WSSR := \underset{\alpha_{x},\beta_{x},\kappa_{t}}{\text{min}} \sum\limits_{x \in \mathcal{X}} \sum\limits_{t \in \mathcal{T}} D_{x,t} \cdot \left(\ln\left(\frac{D_{x,t}}{E_{x,t}}\right)-\alpha_{x}-\beta_{x} \cdot \kappa_{t}\right)^{2}.
\end{equation}

Although this method avoids the problem of errors caused by taking the logarithm of zero, both these methods share the drawback that errors are assumed to be normally distributed and homoskedastic. Considering that the smaller (absolute) number of deaths at the older ages makes the logarithm of the central death rates much more variable for these ages, these seem like unrealistic assumptions.

The third option for the fitting procedure, which does not suffer from the problems mentioned above, is maximum likelihood estimation, which comes down to maximizing the likelihood of a Poisson distribution, explained in much greater detail by \cite{25}, among others.

If we let $D_{x,t} \sim \text{Pois}(E_{x,t}\cdot\mu_{x,t})$ and take the hazard rates to be $\mu_{x,t} := \text{e}^{\alpha_{x}+\beta_{x} \kappa_{t}}$, the natural logarithm of the hazard rates is specified exactly as in the Lee-Carter model. We can now define the likelihood function $L$:

\begin{equation}
L := \prod\limits_{x \in \mathcal{X}} \prod\limits_{t \in \mathcal{T}} \left(\frac{(E_{x,t}\cdot \text{e}^{\alpha_{x}+\beta_{x} \kappa_{t}})^{D_{x,t}}}{D_{x,t}!} \cdot \text{e}^{-E_{x,t}\cdot \text{e}^{\alpha_{x}+\beta_{x} \kappa_{t}}}\right),
\end{equation}

with log-likelihood $\ln(L)$:

\begin{equation*}
\begin{aligned}
&\ln(L) \quad& = \quad & \sum\limits_{x \in \mathcal{X}} \sum\limits_{t \in \mathcal{T}} \left(D_{x,t} \cdot (\ln(E_{x,t})+\alpha_{x}+\beta_{x}\kappa_{t})-E_{x,t} \cdot \text{e}^{\alpha_{x}+\beta_{x}\kappa_{t}} - \ln(D_{x,t}!)\right) \\
& \quad & = \quad & \sum\limits_{x \in \mathcal{X}} \sum\limits_{t \in \mathcal{T}} \left(D_{x,t} \cdot (\alpha_{x}+\beta_{x}\kappa_{t})-E_{x,t} \cdot \text{e}^{\alpha_{x}+\beta_{x}\kappa_{t}} \right) + C,
\end{aligned}
\end{equation*}

with $C$ a constant.

To maximize the log-likelihood, we first take the partial derivatives with respect to $\alpha_{x}$, $\beta_{x}$ and $\kappa_{t}$ and set them to zero:

\begin{equation*}
\begin{aligned}
& \frac{\partial \ln(L)}{\partial \alpha_{x}} & = &\text{ } \sum\limits_{t \in \mathcal{T}} \left(D_{x,t} - E_{x,t} \cdot \text{e}^{\alpha_{x}+\beta_{x}\kappa_{t}} \right) & = &\text{ } 0, & \forall x \in \mathcal{X}, \\
& \frac{\partial \ln(L)}{\partial \beta_{x}} & = &\text{ } \sum\limits_{t \in \mathcal{T}} \left(D_{x,t} \cdot \kappa_{t} - E_{x,t} \cdot \kappa_{t} \cdot \text{e}^{\alpha_{x}+\beta_{x}\kappa_{t}} \right) & = &\text{ } 0, & \forall x \in \mathcal{X}, \\
& \frac{\partial \ln(L)}{\partial \kappa_{t}} & = &\text{ } \sum\limits_{x \in \mathcal{X}} \left(D_{x,t} \cdot \beta_{x} - E_{x,t} \cdot \beta_{x} \cdot \text{e}^{\alpha_{x}+\beta_{x}\kappa_{t}} \right) & = & \text{ }0, & \forall t \in \mathcal{T}. \\
\end{aligned}
\end{equation*}

This system of equations cannot be solved analytically, but it is possible to find an approximation through an iterative procedure using the Newton-Raphson method. This can be written as:

\begin{equation*}
\begin{aligned}
&\hat{\alpha}_{x,n+1} & := & &\hat{\alpha}_{x,n} - & \frac{\sum\limits_{t \in \mathcal{T}}\left(D_{x,t} - E_{x,t} \cdot \text{e}^{\hat{\alpha}_{x,n}+\hat{\beta}_{x,n}\hat{\kappa}_{t,n}} \right)}{-\sum\limits_{t \in \mathcal{T}}\left(E_{x,t} \cdot \text{e}^{\hat{\alpha}_{x,n}+\hat{\beta}_{x,n}\hat{\kappa}_{t,n}}\right)}, & \forall x \in \mathcal{X}, \\
&\hat{\kappa}_{t,n+1} & := & & \hat{\kappa}_{t,n} - & \frac{\sum\limits_{x \in \mathcal{X}} \hat{\beta}_{x,n} \cdot \left(D_{x,t} - E_{x,t} \cdot \text{e}^{\hat{\alpha}_{x,n+1}+\hat{\beta}_{x,n}\hat{\kappa}_{t,n}} \right)}{-\sum\limits_{x \in \mathcal{X}} \hat{\beta}_{x,n}^{2} \cdot\left(E_{x,t} \cdot \text{e}^{\hat{\alpha}_{x,n+1}+\hat{\beta}_{x,n}\hat{\kappa}_{t,n}}\right)}, & \forall t \in \mathcal{T}, \\
&\hat{\beta}_{x,n+1} & := & &\hat{\beta}_{x,n} - & \frac{\sum\limits_{t \in \mathcal{T}} \hat{\kappa}_{t,n+1} \cdot \left(D_{x,t} - E_{x,t} \cdot \text{e}^{\hat{\alpha}_{x,n+1}+\hat{\beta}_{x,n}\hat{\kappa}_{t,n+1}} \right)}{-\sum\limits_{t \in \mathcal{T}} \hat{\kappa}_{t,n+1}^{2} \cdot\left(E_{x,t} \cdot \text{e}^{\hat{\alpha}_{x,n+1}+\hat{\beta}_{x,n}\hat{\kappa}_{t,n+1}}\right)}, & \forall x \in \mathcal{X}. \\
\end{aligned}
\end{equation*}

By imposing the restrictions

\begin{eqnarray}
\sum\limits_{x \in \mathcal{X}} \hat{\beta}_{x} = 1 & \text{and} & \sum\limits_{t \in \mathcal{T}} \hat{\kappa}_{t} = 0,
\end{eqnarray}

using the set of starting values $S_{0} := (\alpha_{x,0}=0,\ \beta_{x,0}=1,\ \kappa_{t,0}=0)$, and following the iterative procedure, the parameters can be solved for numerically by specifying a tolerance and repeating the updating steps until successive iterations change the parameters by no more than that tolerance.
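The iterative scheme can be sketched in Python. This is a minimal implementation under our own naming conventions (it runs a fixed number of sweeps rather than testing a tolerance, and re-imposes the two restrictions after each sweep):

```python
import numpy as np

def fit_lee_carter_poisson(D, E, n_iter=2000):
    """One-parameter-at-a-time Newton updates for the Poisson Lee-Carter fit."""
    nx, nt = D.shape
    a, b, k = np.zeros(nx), np.ones(nx), np.zeros(nt)    # starting values S_0
    for _ in range(n_iter):
        m = E * np.exp(a[:, None] + b[:, None] * k)      # fitted deaths
        a += (D - m).sum(axis=1) / m.sum(axis=1)         # alpha update
        m = E * np.exp(a[:, None] + b[:, None] * k)
        k += ((D - m) * b[:, None]).sum(axis=0) / ((b**2)[:, None] * m).sum(axis=0)
        m = E * np.exp(a[:, None] + b[:, None] * k)
        b += ((D - m) * k).sum(axis=1) / ((k**2) * m).sum(axis=1)
        # re-impose sum(b) = 1 and sum(k) = 0 after each sweep
        a, k = a + b * k.mean(), k - k.mean()
        scale = b.sum()
        b, k = b / scale, k * scale
    return a, b, k
```

Each update is a Newton step of the corresponding first-order condition, holding the other two parameter vectors fixed, which is why the denominators are the (negated) second derivatives from the system above.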

\section{Forecasting}

Once fitted to the historical data, the model yields a set of parameter estimates $(\hat{\alpha}_{x},\hat{\beta}_{x},\hat{\kappa}_{t})$. Future values of the time factor $\kappa_{t}$ are treated as a stochastic process, which we forecast with an ARIMA time series model. For this we proceed in the same manner as Lee and Carter did for the U.S. population, using a random walk with drift (ARIMA(0,1,0)).

This model is defined as:

\begin{equation}
\kappa_{t+1} := \kappa_{t} + \theta + \sigma \cdot \zeta_{t},
\end{equation}

with $\theta$ the drift term, $\sigma$ the standard deviation of the (random) noise, and the $\zeta_{t}$ iid standard Gaussian variables.

To get estimates $hat{theta}$ and $hat{sigma}$, we consider the following:

\begin{equation}
\begin{aligned}
\Delta\kappa_{t} \quad &:= \quad & \quad \kappa_{t}-\kappa_{t-1} \quad & = & \quad \theta+\sigma \cdot \zeta_{t}, \quad & & \\
E[\Delta\kappa_{t}] \quad &= \quad & \quad E[\theta+\sigma \cdot \zeta_{t}] \quad & = & \quad \theta+\sigma \cdot E[\zeta_{t}] \quad & = & \theta, \\
Var[\Delta\kappa_{t}] \quad & = \quad & \quad Var[\theta+\sigma \cdot \zeta_{t}] \quad & = & \quad \sigma^{2} \cdot Var[\zeta_{t}] \quad & = & \sigma^{2},
\end{aligned}
\end{equation}

so that $\hat{\theta}$ is the sample mean and $\hat{\sigma}^{2}$ the sample variance of the observed increments $\Delta\kappa_{t}$.

The estimates $\hat{\theta}$ and $\hat{\sigma}$ obtained in this way, together with the last value of $\hat{\kappa}_{t}$, generate mortality-level forecasts, which are denoted $\tilde{\kappa}_{t}$. If we assume $\alpha_{x}$ and $\beta_{x}$ to remain constant over time, these $\tilde{\kappa}_{t}$-forecasts enable us to make forecasts of future mortality rates:

\begin{equation}
\tilde{\mu}_{x,t} := \text{e}^{\hat{\alpha}_{x}+\hat{\beta}_{x} \tilde{\kappa}_{t}}.
\end{equation}
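In code, estimating the drift and volatility from the fitted $\hat{\kappa}_{t}$ series and simulating future paths is straightforward (a sketch with our own function name and defaults):

```python
import numpy as np

def forecast_kappa(kappa, horizon, n_sims=10000, seed=1):
    """Random walk with drift: estimate theta, sigma and simulate paths."""
    dk = np.diff(kappa)
    theta = dk.mean()                 # drift estimate: sample mean of increments
    sigma = dk.std(ddof=1)            # volatility estimate: sample std of increments
    rng = np.random.default_rng(seed)
    shocks = rng.standard_normal((n_sims, horizon))
    steps = theta + sigma * shocks    # kappa_{t+1} - kappa_t
    return theta, sigma, kappa[-1] + np.cumsum(steps, axis=1)
```

Setting the shocks to zero recovers the best-estimate path $\tilde{\kappa}_{T+h} = \hat{\kappa}_{T} + h\hat{\theta}$, while the simulated paths provide the percentile-based confidence intervals described earlier.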
