Modeling mortality tables

In this section, we attempt to fit the Gompertz-Makeham and the Heligman-Pollard models to a selected mortality table.

The Gompertz-Makeham model

In the Gompertz-Makeham model, the force of mortality (akin to the hazard function) $\mu_x$ is given by

\[ \mu_x = Ae^{Bx} + C,\]

for age $x$. The force of mortality represents the instantaneous rate of mortality at a certain age $x$, in an annual basis. It is closely related to the mortality rate $q_x$, which is the percentage of deaths in a population per year, which can be interpreted as the probability a person at an age $x$, in years, dies before reaching age $x+1$. Under certain assumptions, the two are related by

\[ \mu_x = \frac{q_x}{1 - q_x}, \qquad q_x = \frac{\mu_x}{1 + \mu_x}.\]

The terms $A$ and $B$ are associated with the Gompertz law $Ae^{Bx}$, while the term $C$ is an additional term provided by Makeham, which combine to form the Gompertz-Makeham model.

The Heligman-Pollard model

The Gompertz-Makeham model approximates reasonably well the force of mortality, especially the growth seen in adult years, but a better model is the Heligman-Pollard, that also approximates well the childhood and young groups and the eldery years. This model takes the form

\[ \mu_x = A^{(x + B)^C} + D e^{-E \ln(x/F)^2} + \frac{GH^x}{1 + KGH^x},\]

for suitable coefficients $A, B, C, D, E, F, G, H, K$. It is common to see the Heligman-Pollard models with $K=0$, which is the main model suggested by the authors, but in the same paper they also discuss two extensions of the model, one of them being the one above.

We can clearly distinguish three terms in the model, with the first term, with parameters $A, B, C$, modeling the steep exponential decline in mortality in the early childhood years due in part to a relatively high degree of mortality in the newborn; the second term, with parameters $D, E, F$, representing the log-normal bump in mortality in the youth ages; and the last term, with parameters $G, H, K$, with the exponential growth at middle to older ages. Notice the last term follows the original Gompertz law (without the additional term due to Makeham).

A mortality table

We start by loading the necessary packages

using Distributions, Turing, StatsPlots

We illustrate the modeling process with the United States 1997 mortality table. We removed the data from ages 0-1 and 100+ hoping for a better fit.

x = collect(1:99)
# mortality rate
qx = [
    0.00055
    0.00036
    0.00029
    0.00023
    0.00021
    0.00020
    0.00019
    0.00017
    0.00015
    0.00014
    0.00014
    0.00019
    0.00028
    0.00041
    0.00055
    0.00068
    0.00078
    0.00085
    0.00089
    0.00093
    0.00098
    0.00101
    0.00101
    0.00101
    0.00100
    0.00099
    0.00100
    0.00103
    0.00108
    0.00114
    0.00119
    0.00126
    0.00133
    0.00140
    0.00149
    0.00157
    0.00167
    0.00178
    0.00192
    0.00206
    0.00222
    0.00239
    0.00257
    0.00278
    0.00300
    0.00325
    0.00352
    0.00380
    0.00411
    0.00444
    0.00482
    0.00524
    0.00571
    0.00623
    0.00685
    0.00755
    0.00833
    0.00916
    0.01005
    0.01101
    0.01208
    0.01321
    0.01439
    0.01560
    0.01679
    0.01802
    0.01948
    0.02127
    0.02338
    0.02565
    0.02799
    0.03043
    0.03297
    0.03563
    0.03843
    0.04147
    0.04494
    0.04904
    0.05385
    0.05938
    0.06555
    0.07241
    0.07990
    0.08812
    0.09653
    0.10556
    0.11539
    0.12616
    0.13802
    0.15085
    0.16429
    0.17813
    0.19250
    0.20764
    0.22354
    0.23999
    0.25653
    0.27295
    0.28915
]
# mx = - log.(1.0 .- qx) # force of mortality # roughly the same as below
mx = qx ./ (1.0 .- qx) # force of mortality
scatter(x, qx, yscale=:log10, legend=:topleft, label="qx")
scatter!(x, mx, yscale=:log10, legend=:topleft, label="mx")

The Gompertz-Makeham model in Turing.jl

First we start by defining the function that characterizes the Gompertz-Makeham model

function gompertz_makeham(x, p)
    A, B, C = p
    m = A * exp(B * x) + C
    return m
end

gompertz_makeham (generic function with 1 method)

Now we define the compound probability model, assigning Beta distributions to the parameters. But, as mentioned before, the initial prior is very important. It is not easy to get a good fit. So, first, we approximate by "hand", with trial and error, to get the following approximate fit:

let (A, B, C) = (0.00001, 0.1, 0.001)
    m = A * exp.(B * x) .+ C
    plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=:topleft)
    plot!(plt, x, m, label="Gompertz-Makeham hand-fit")
    scatter!(plt, x, mx, label="data")
end

With these values, we define the prior Beta distributions for the compound probability model with parameters that yield a mean near those values.

@model function gompertz_makeham_model(x, m)
    A ~ Beta(2, 99998)
    B ~ Beta(2, 18)
    C ~ Beta(2, 1998)
    σ² ~ InverseGamma()
    σ = sqrt(σ²)
    p = (A, B, C)

    for i in eachindex(x)
        y = gompertz_makeham(x[i], p)
        m[i] ~ Normal(y, σ)
    end
end

gompertz_makeham_model (generic function with 2 methods)

Now we instantiate the Turing model

model_gm = gompertz_makeham_model(x, mx)

DynamicPPL.Model{typeof(Main.gompertz_makeham_model), (:x, :m), (), (), Tuple{Vector{Int64}, Vector{Float64}}, Tuple{}, DynamicPPL.DefaultContext}(Main.gompertz_makeham_model, (x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  90, 91, 92, 93, 94, 95, 96, 97, 98, 99], m = [0.0005503026664665567, 0.0003601296466728022, 0.00029008412439607484, 0.00023005291216979904, 0.00021004410926294523, 0.00020004000800160032, 0.00019003610686030346, 0.00017002890491383537, 0.0001500225033755063, 0.0001400196027443842  …  0.17764823647176592, 0.19658733292649364, 0.21673744022777328, 0.23839009287925697, 0.2620526023524661, 0.2878963501017438, 0.31577216089261984, 0.34504418470146736, 0.3754212227494671, 0.4067665470915102]), NamedTuple(), DynamicPPL.DefaultContext())

and fit it:

# chain_gm = sample(model_gm, HMC(0.05, 10), 40_000)
chain_gm = sample(model_gm, NUTS(0.65), 5_000)

Chains MCMC chain (5000×16×1 Array{Float64, 3}):

Iterations        = 1001:1:6000
Number of chains  = 1
Samples per chain = 5000
Wall duration     = 10.08 seconds
Compute duration  = 10.08 seconds
parameters        = A, B, C, σ²
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Use `describe(chains)` for summary statistics and quantiles.

Here is the result of the MCMC:

plot(chain_gm)

We can see the mean values of the parameters as follows

mean(chain_gm)

Mean
  parameters      mean
      Symbol   Float64

           A    0.0000
           B    0.1019
           C    0.0010
          σ²    0.0204

The mean fit is given by

m_gm = [gompertz_makeham(xi, mean(chain_gm, [:A, :B, :C]).nt.mean) for xi in x]

99-element Vector{Float64}:
 0.001025351464149076
 0.0010279015070434588
 0.0010307252267260255
 0.0010338519948534476
 0.0010373143353198267
 0.001041148262562462
 0.001045393656175411
 0.0010500946757274875
 0.0010553002200995336
 0.0010610644361188795
 ⋮
 0.2303657644355243
 0.2549816838303885
 0.2822394433970488
 0.3124225718301424
 0.34584502681147233
 0.38285446072271173
 0.4238358368423152
 0.4692154336414536
 0.5194652788307006

and we plot it

plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=:topleft)
plot!(plt, x, m_gm, label="Gompertz-Makeham fit")
scatter!(plt, x, mx, label="data")

It remains to compute the 95% credible interval,

quantiles_gm = reduce(
    hcat,
    quantile(
        [
            gompertz_makeham(xi, (A, B, C)) for (A, B, C) in eachrow(view(chain_gm.value.data, :, 1:3, 1))
        ],
        [0.025, 0.975]
        )
    for xi in x
)

2×99 Matrix{Float64}:
 0.000151974  0.000153419  0.000156102  …  0.239917  0.264606  0.290707
 0.00281375   0.00281528   0.00281844      0.439878  0.490358  0.54679

and plot it

plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=:topleft)
plot!(plt, x, m_gm, ribbon=(m_gm .- view(quantiles_gm, 1, :), view(quantiles_gm, 2, :) .- m_gm), label="Gompertz-Makeham fit")
scatter!(plt, x, mx, label="data")

Notice how the function with the means of the parameters is outside the quantiles, which is based on the function values of the parameter samples. Let's check the last portion of the ensemble.

plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=nothing)
plot!(plt, x, m_gm, label="Bayesian fitted line", color=2)
for (A, B, C) in eachrow(view(chain_gm.value.data, 4500:5000, 1:3, 1))
    plot!(plt, x, x -> gompertz_makeham(x, (A, B, C)), alpha=0.1, color=2, label=false)
end
scatter!(plt, x, mx, color=1)

Let's look at just a few samples to have a better look at the dependence of the function on the sampled values:^[off]

plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=nothing)
plot!(plt, x, m_gm, label="Bayesian fitted line", color=2)
for (A, B, C) in eachrow(view(chain_gm.value.data, 1000:50:2000, 1:3, 1))
    plot!(plt, x, x -> gompertz_makeham(x, (A, B, C)), alpha=0.4, color=3, label=false)
end
scatter!(plt, x, mx, color=1)

The Heligman-Pollard in Turing.jl

Now we consider the Heligman-Pollard model.

First we start by defining the function that characterizes the model:

function heligman_pollard(x, p)
    A, B, C, D, E, F, G, H, K = p
    m = A^((x + B)^C) + D * exp(-E * log(x / F)^2) + (G * H^x) / (1 + K * G * H^x)
    return m
end

heligman_pollard (generic function with 1 method)

Now we define the compound probability model. As mentioned before, the initial prior is very important. We start with numbers of the order of those given in the original article by Heligman and Pollard. They considered a number of examples, but, as a starting point, we borrow only the data from the 1970-1972 period, separated by gender:

Parameter	Males 1970-72	Females 1970-1972
A	0.00160	0.00142
B	0.00112	0.0350
C	0.1112	0.1345
D	0.00163	0.00038
E	16.71	21.86
F	20.03	18.27
G	0.0000502	0.0000507
H	1.1074	1.0937
K	2.416	-2.800

The difference in the sign of $K$ between the male and female populations is justified by the difference in mortality for the elderly, although the data showed in the article both look more like in the description of the male mortality. Let's see how an approximation of that looks like, keeping a positive sign for $K$ because of this observation.

let (A, B, C, D, E, F, G, H, K) = (0.0015, 0.018, 0.012, 0.001, 19.0, 19.0, 0.00005, 1.1, 1.0)
    m = [heligman_pollard(xi, (A, B, C, D, E, F, G, H, K)) for xi in x]
    plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=:topleft)
    plot!(plt, x, m, label="Heligman-Pollard hand-fit")
    scatter!(plt, x, mx, label="data")
end

Ok, that seems like a reasonable starting point. So we choose the following priors for each parameter:

Parameter	prior	mean
A	Beta(15, 9985)	0.0015
B	Beta(18, 982)	0.018
C	Beta(12, 988)	0.012
D	Beta(2, 1998)	0.001
E	Gamma(38, 0.5)	19.0
F	Gamma(38, 0.5)	19.0
G	Beta(5, 99995)	0.00005
H	Gamma(2, 0.5)	1.0
K	Gamma(2, 0.5)	1.0

But these turned out not to be so good. We either look for better informative priors or use slightly less informative priors. We fiddle a little bit with the parameters to get the following improvement.

let (A, B, C, D, E, F, G, H, K) = (0.0003, 0.02, 0.08, 0.0008, 15.0, 20.0, 0.00005, 1.1, 1.0)
    m = [heligman_pollard(xi, (A, B, C, D, E, F, G, H, K)) for xi in x]
    plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=:topleft)
    plot!(plt, x, m, label="Heligman-Pollard hand-fit")
    scatter!(plt, x, mx, label="data")
end

Based on that, we choose the following priors:

@model function heligman_pollard_model(x, m)
    A ~ Beta(3, 9997) # Beta(1, 660) # Beta(15, 9985)
    B ~ Beta(2, 98) # Beta(1, 50) # Beta(18, 982)
    C ~ Beta(8, 92) # Beta(12, 988)
    D ~ Beta(8, 9992) # Beta(1, 999) # Beta(2, 1998)
    E ~ Gamma(30, 0.5) # Gamma(19, 1) # Gamma(38, 0.5)
    F ~ Gamma(40, 0.5) # Gamma(19, 1) # Gamma(38, 0.5)
    G ~ Beta(5, 99995) # Beta(1, 19999) # Beta(5, 99995)
    H ~ Gamma(2.2, 0.5) # Gamma(1, 1) # Gamma(2, 0.5)
    K ~ Gamma(2.0, 0.5) # Gamma(1, 1) # Gamma(2, 0.5)
    σ² ~ InverseGamma()
    σ = sqrt(σ²)

    for i in eachindex(x)
        y = A ^ ((x[i] + B) ^ C) + D * exp( - E * (log(x[i]) - log(F)) ^ 2) + G * H ^ (x[i]) # / (1 + K * G * H ^ (x[i]) )
        m[i] ~ Normal(y, σ)
    end
end

heligman_pollard_model (generic function with 2 methods)

Now we instantiate the Heligman-Pollard Turing model

model_hp = heligman_pollard_model(x, mx)

DynamicPPL.Model{typeof(Main.heligman_pollard_model), (:x, :m), (), (), Tuple{Vector{Int64}, Vector{Float64}}, Tuple{}, DynamicPPL.DefaultContext}(Main.heligman_pollard_model, (x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  90, 91, 92, 93, 94, 95, 96, 97, 98, 99], m = [0.0005503026664665567, 0.0003601296466728022, 0.00029008412439607484, 0.00023005291216979904, 0.00021004410926294523, 0.00020004000800160032, 0.00019003610686030346, 0.00017002890491383537, 0.0001500225033755063, 0.0001400196027443842  …  0.17764823647176592, 0.19658733292649364, 0.21673744022777328, 0.23839009287925697, 0.2620526023524661, 0.2878963501017438, 0.31577216089261984, 0.34504418470146736, 0.3754212227494671, 0.4067665470915102]), NamedTuple(), DynamicPPL.DefaultContext())