Modeling mortality tables
In this section, we attempt to fit the Gompertz-Makeham and the Heligman-Pollard models to a selected mortality table.
The Gompertz-Makeham model
In the Gompertz-Makeham model, the force of mortality (akin to the hazard function) $\mu_x$ is given by
\[ \mu_x = Ae^{Bx} + C,\]
for age $x$. The force of mortality represents the instantaneous rate of mortality at age $x$, on an annual basis. It is closely related to the mortality rate $q_x$, the fraction of the population at age $x$ that dies within the year, which can be interpreted as the probability that a person aged $x$, in years, dies before reaching age $x+1$. Under certain assumptions, the two are related by
\[ \mu_x = \frac{q_x}{1 - q_x}, \qquad q_x = \frac{\mu_x}{1 + \mu_x}.\]
The terms $A$ and $B$ come from the Gompertz law $Ae^{Bx}$, while the constant $C$ is the additional term introduced by Makeham; together they form the Gompertz-Makeham model.
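As a quick sanity check of the relation between $q_x$ and $\mu_x$ (a minimal sketch): for the small mortality rates seen at most ages, the conversion $q_x/(1-q_x)$ and the constant-force approximation $-\ln(1-q_x)$ are nearly identical.

```julia
# Converting a mortality rate qx to a force of mortality, two common approximations
qx = 0.001                # a typical adult-age mortality rate
mu1 = qx / (1 - qx)       # the relation used in the text
mu2 = -log(1 - qx)        # constant-force-over-the-year approximation
# For small qx, both are close to qx itself and to each other
println(mu1)
println(mu2)
```

This is why, further below, the commented-out `-log.(1.0 .- qx)` conversion gives roughly the same force of mortality as the one we use.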
The Heligman-Pollard model
The Gompertz-Makeham model approximates the force of mortality reasonably well, especially the growth seen in the adult years, but the Heligman-Pollard model does better, also capturing the childhood and young-adult ages and the elderly years. This model takes the form
\[ \mu_x = A^{(x + B)^C} + D e^{-E \left(\ln(x/F)\right)^2} + \frac{GH^x}{1 + KGH^x},\]
for suitable coefficients $A, B, C, D, E, F, G, H, K$. The Heligman-Pollard model is often seen with $K=0$, which is the main model suggested by the authors, but in the same paper they also discuss two extensions of the model, one of them being the form above.
We can clearly distinguish three terms in the model. The first term, with parameters $A, B, C$, models the steep exponential decline in mortality in the early childhood years, due in part to relatively high newborn mortality. The second term, with parameters $D, E, F$, represents the log-normal bump in mortality at the young-adult ages. The last term, with parameters $G, H, K$, captures the exponential growth at middle to older ages. Notice that, with $K = 0$, the last term reduces to the original Gompertz law (without the additional term due to Makeham).
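To make the three components concrete, here is a minimal sketch evaluating each term separately at a few ages. The coefficients are illustrative, hand-picked values (not fitted ones), so only the relative sizes of the terms matter: the childhood term dominates at age 1, the bump at age 20, and the Gompertz-like term at age 80.

```julia
# Illustrative (not fitted) coefficients for the three Heligman-Pollard components
A, B, C = 0.0015, 0.02, 0.1     # childhood decline
D, E, F = 0.001, 19.0, 19.0     # young-adult bump (log-normal in age)
G, H, K = 5.0e-5, 1.1, 1.0      # senescent, Gompertz-like growth

childhood(x) = A^((x + B)^C)
bump(x)      = D * exp(-E * log(x / F)^2)
senescent(x) = (G * H^x) / (1 + K * G * H^x)

for x in (1, 20, 80)
    println((x, childhood(x), bump(x), senescent(x)))
end
```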
A mortality table
We start by loading the necessary packages
using Distributions, Turing, StatsPlots
We illustrate the modeling process with the 1997 United States mortality table. We removed the data for ages 0-1 and 100+, hoping for a better fit.
x = collect(1:99)
# mortality rate
qx = [
0.00055
0.00036
0.00029
0.00023
0.00021
0.00020
0.00019
0.00017
0.00015
0.00014
0.00014
0.00019
0.00028
0.00041
0.00055
0.00068
0.00078
0.00085
0.00089
0.00093
0.00098
0.00101
0.00101
0.00101
0.00100
0.00099
0.00100
0.00103
0.00108
0.00114
0.00119
0.00126
0.00133
0.00140
0.00149
0.00157
0.00167
0.00178
0.00192
0.00206
0.00222
0.00239
0.00257
0.00278
0.00300
0.00325
0.00352
0.00380
0.00411
0.00444
0.00482
0.00524
0.00571
0.00623
0.00685
0.00755
0.00833
0.00916
0.01005
0.01101
0.01208
0.01321
0.01439
0.01560
0.01679
0.01802
0.01948
0.02127
0.02338
0.02565
0.02799
0.03043
0.03297
0.03563
0.03843
0.04147
0.04494
0.04904
0.05385
0.05938
0.06555
0.07241
0.07990
0.08812
0.09653
0.10556
0.11539
0.12616
0.13802
0.15085
0.16429
0.17813
0.19250
0.20764
0.22354
0.23999
0.25653
0.27295
0.28915
]
# mx = - log.(1.0 .- qx) # force of mortality # roughly the same as below
mx = qx ./ (1.0 .- qx) # force of mortality
scatter(x, qx, yscale=:log10, legend=:topleft, label="qx")
scatter!(x, mx, yscale=:log10, legend=:topleft, label="mx")
The Gompertz-Makeham model in Turing.jl
First we start by defining the function that characterizes the Gompertz-Makeham model
function gompertz_makeham(x, p)
A, B, C = p
m = A * exp(B * x) + C
return m
end
gompertz_makeham (generic function with 1 method)
Now we define the compound probability model, assigning Beta distributions to the parameters. But, as mentioned before, the initial prior is very important, and it is not easy to get a good fit. So, first, we fit by hand, with trial and error, to get the following approximate fit:
let (A, B, C) = (0.00001, 0.1, 0.001)
m = A * exp.(B * x) .+ C
plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=:topleft)
plot!(plt, x, m, label="Gompertz-Makeham hand-fit")
scatter!(plt, x, mx, label="data")
end
With these values, we define the prior Beta distributions for the compound probability model with parameters that yield a mean near those values.
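As a quick check (a sketch using Distributions.jl, which we already loaded): the mean of $\mathrm{Beta}(\alpha, \beta)$ is $\alpha/(\alpha+\beta)$, so the chosen priors have means near the hand-fit values $(0.00001, 0.1, 0.001)$.

```julia
using Distributions

# mean of Beta(α, β) is α / (α + β)
mean(Beta(2, 99998))   # 2.0e-5, near the hand-fit A = 0.00001
mean(Beta(2, 18))      # 0.1, the hand-fit B
mean(Beta(2, 1998))    # 0.001, the hand-fit C
```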
@model function gompertz_makeham_model(x, m)
A ~ Beta(2, 99998)
B ~ Beta(2, 18)
C ~ Beta(2, 1998)
σ² ~ InverseGamma()
σ = sqrt(σ²)
p = (A, B, C)
for i in eachindex(x)
y = gompertz_makeham(x[i], p)
m[i] ~ Normal(y, σ)
end
end
gompertz_makeham_model (generic function with 2 methods)
Now we instantiate the Turing model
model_gm = gompertz_makeham_model(x, mx)
DynamicPPL.Model{typeof(Main.gompertz_makeham_model), (:x, :m), (), (), Tuple{Vector{Int64}, Vector{Float64}}, Tuple{}, DynamicPPL.DefaultContext}(Main.gompertz_makeham_model, (x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10 … 90, 91, 92, 93, 94, 95, 96, 97, 98, 99], m = [0.0005503026664665567, 0.0003601296466728022, 0.00029008412439607484, 0.00023005291216979904, 0.00021004410926294523, 0.00020004000800160032, 0.00019003610686030346, 0.00017002890491383537, 0.0001500225033755063, 0.0001400196027443842 … 0.17764823647176592, 0.19658733292649364, 0.21673744022777328, 0.23839009287925697, 0.2620526023524661, 0.2878963501017438, 0.31577216089261984, 0.34504418470146736, 0.3754212227494671, 0.4067665470915102]), NamedTuple(), DynamicPPL.DefaultContext())
and fit it:
# chain_gm = sample(model_gm, HMC(0.05, 10), 40_000)
chain_gm = sample(model_gm, NUTS(0.65), 5_000)
Chains MCMC chain (5000×16×1 Array{Float64, 3}):
Iterations = 1001:1:6000
Number of chains = 1
Samples per chain = 5000
Wall duration = 8.72 seconds
Compute duration = 8.72 seconds
parameters = A, B, C, σ²
internals = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size
Summary Statistics
parameters mean std mcse ess_bulk ess_tail rhat ⋯
Symbol Float64 Float64 Float64 Float64 Float64 Float64 ⋯
A 0.0000 0.0000 0.0000 2319.3678 1923.6552 1.0002 ⋯
B 0.1019 0.0078 0.0002 2322.4045 1976.5187 1.0003 ⋯
C 0.0010 0.0007 0.0000 2682.2130 1992.0565 1.0000 ⋯
σ² 0.0205 0.0030 0.0001 3239.9145 2762.0779 1.0000 ⋯
1 column omitted
Quantiles
parameters 2.5% 25.0% 50.0% 75.0% 97.5%
Symbol Float64 Float64 Float64 Float64 Float64
A 0.0000 0.0000 0.0000 0.0000 0.0001
B 0.0886 0.0964 0.1010 0.1066 0.1196
C 0.0001 0.0005 0.0008 0.0014 0.0028
σ² 0.0155 0.0184 0.0202 0.0223 0.0273
Here is the result of the MCMC:
plot(chain_gm)
We can see the mean values of the parameters as follows
mean(chain_gm)
Mean
parameters mean
Symbol Float64
A 0.0000
B 0.1019
C 0.0010
σ² 0.0205
The mean fit is given by
m_gm = [gompertz_makeham(xi, mean(chain_gm, [:A, :B, :C]).nt.mean) for xi in x]
99-element Vector{Float64}:
0.0010279596724221227
0.001030513151744429
0.0010333404873858415
0.001036471049971319
0.001039937360075447
0.001043775426049186
0.0010480251180779601
0.0010527305823568437
0.0010579406996853308
0.001063709593245615
⋮
0.22945562053442967
0.2539566525560491
0.28108537863617566
0.31112361449828724
0.3443834001234147
0.3812102412504186
0.42198669852165843
0.4671363615584368
0.5171282492492901
and we plot it
plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=:topleft)
plot!(plt, x, m_gm, label="Gompertz-Makeham fit")
scatter!(plt, x, mx, label="data")
It remains to compute the 95% credible interval,
quantiles_gm = reduce(
hcat,
quantile(
[
gompertz_makeham(xi, (A, B, C)) for (A, B, C) in eachrow(view(chain_gm.value.data, :, 1:3, 1))
],
[0.025, 0.975]
)
for xi in x
)
2×99 Matrix{Float64}:
0.00014535 0.000147215 0.000150652 … 0.236819 0.261927 0.288608
0.00279306 0.00279493 0.00279701 0.440955 0.491764 0.5468
and plot it
plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=:topleft)
plot!(plt, x, m_gm, ribbon=(m_gm .- view(quantiles_gm, 1, :), view(quantiles_gm, 2, :) .- m_gm), label="Gompertz-Makeham fit")
scatter!(plt, x, mx, label="data")
Notice how the curve evaluated at the means of the parameters can fall outside the quantile band, which is computed pointwise from the function values of the parameter samples. Let's check the last portion of the ensemble.
plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=nothing)
plot!(plt, x, m_gm, label="Bayesian fitted line", color=2)
for (A, B, C) in eachrow(view(chain_gm.value.data, 4500:5000, 1:3, 1))
plot!(plt, x, x -> gompertz_makeham(x, (A, B, C)), alpha=0.1, color=2, label=false)
end
scatter!(plt, x, mx, color=1)
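The mean-parameter curve escaping the pointwise band is not a contradiction: when the posterior is skewed, the mean of a sample can itself lie outside the central 95% of that sample, and the same can happen pointwise to the fitted curves. A minimal one-dimensional sketch with made-up numbers:

```julia
using Statistics

# A heavily right-skewed sample: the mean exceeds the 97.5% quantile
p = [fill(0.0, 99); 1.0e6]
mean(p)             # 10000.0
quantile(p, 0.975)  # 0.0
```

For a function monotone in a parameter with such a skewed posterior, the curve at the mean parameter then sits above the pointwise 97.5% band.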
Let's look at just a few samples to get a better sense of how the function depends on the sampled values:
plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=nothing)
plot!(plt, x, m_gm, label="Bayesian fitted line", color=2)
for (A, B, C) in eachrow(view(chain_gm.value.data, 1000:50:2000, 1:3, 1))
plot!(plt, x, x -> gompertz_makeham(x, (A, B, C)), alpha=0.4, color=3, label=false)
end
scatter!(plt, x, mx, color=1)
The Heligman-Pollard model in Turing.jl
Now we consider the Heligman-Pollard model.
First we start by defining the function that characterizes the model:
function heligman_pollard(x, p)
A, B, C, D, E, F, G, H, K = p
m = A^((x + B)^C) + D * exp(-E * log(x / F)^2) + (G * H^x) / (1 + K * G * H^x)
return m
end
heligman_pollard (generic function with 1 method)
Now we define the compound probability model. As mentioned before, the initial prior is very important. We start with numbers of the order of those given in the original article by Heligman and Pollard. They considered a number of examples, but, as a starting point, we borrow only the data from the 1970-1972 period, separated by gender:
Parameter | Males 1970-72 | Females 1970-72 |
---|---|---|
A | 0.00160 | 0.00142 |
B | 0.00112 | 0.0350 |
C | 0.1112 | 0.1345 |
D | 0.00163 | 0.00038 |
E | 16.71 | 21.86 |
F | 20.03 | 18.27 |
G | 0.0000502 | 0.0000507 |
H | 1.1074 | 1.0937 |
K | 2.416 | -2.800 |
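The role of $K$ can be seen directly in the old-age term: for $K > 0$, the term $GH^x/(1 + KGH^x)$ is logistic and levels off toward $1/K$ at extreme ages, while $K = 0$ recovers pure Gompertz growth $GH^x$ (and $K < 0$ accelerates it, eventually blowing up, so negative values only make sense within the table's age range). A minimal sketch with the male values above:

```julia
G, H = 5.02e-5, 1.1074   # males 1970-72, from the table above
oldage(x, K) = (G * H^x) / (1 + K * G * H^x)

# With K = 2.416 (males), the term levels off toward 1/K ≈ 0.414
oldage(120, 2.416)
# With K = 0, it is pure Gompertz growth G*H^x, which keeps climbing
oldage(120, 0.0)
```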
The difference in the sign of $K$ between the male and female populations is justified by a difference in mortality among the elderly, although the data shown in the article for both populations look more like the description of the male mortality. Let's see what an approximation of that looks like, keeping a positive sign for $K$ because of this observation.
let (A, B, C, D, E, F, G, H, K) = (0.0015, 0.018, 0.012, 0.001, 19.0, 19.0, 0.00005, 1.1, 1.0)
m = [heligman_pollard(xi, (A, B, C, D, E, F, G, H, K)) for xi in x]
plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=:topleft)
plot!(plt, x, m, label="Heligman-Pollard hand-fit")
scatter!(plt, x, mx, label="data")
end
Ok, that seems like a reasonable starting point. So we choose the following priors for each parameter:
Parameter | prior | mean |
---|---|---|
A | Beta(15, 9985) | 0.0015 |
B | Beta(18, 982) | 0.018 |
C | Beta(12, 988) | 0.012 |
D | Beta(2, 1998) | 0.001 |
E | Gamma(38, 0.5) | 19.0 |
F | Gamma(38, 0.5) | 19.0 |
G | Beta(5, 99995) | 0.00005 |
H | Gamma(2, 0.5) | 1.0 |
K | Gamma(2, 0.5) | 1.0 |
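One caveat when reading the table: Distributions.jl parameterizes Gamma(α, θ) by shape and scale, so its mean is αθ. A quick check (a sketch using Distributions.jl) that the priors match the means listed above:

```julia
using Distributions

# Gamma(shape α, scale θ) in Distributions.jl has mean α*θ
mean(Gamma(38, 0.5))   # 19.0
mean(Gamma(2, 0.5))    # 1.0
# Beta(α, β) has mean α/(α+β)
mean(Beta(15, 9985))   # 0.0015
```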
But these turned out not to work so well. We either look for better informative priors or use slightly less informative ones. Fiddling a little bit with the parameters yields the following improvement.
let (A, B, C, D, E, F, G, H, K) = (0.0003, 0.02, 0.08, 0.0008, 15.0, 20.0, 0.00005, 1.1, 1.0)
m = [heligman_pollard(xi, (A, B, C, D, E, F, G, H, K)) for xi in x]
plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=:topleft)
plot!(plt, x, m, label="Heligman-Pollard hand-fit")
scatter!(plt, x, mx, label="data")
end
Based on that, we choose the following priors:
@model function heligman_pollard_model(x, m)
A ~ Beta(3, 9997) # Beta(1, 660) # Beta(15, 9985)
B ~ Beta(2, 98) # Beta(1, 50) # Beta(18, 982)
C ~ Beta(8, 92) # Beta(12, 988)
D ~ Beta(8, 9992) # Beta(1, 999) # Beta(2, 1998)
E ~ Gamma(30, 0.5) # Gamma(19, 1) # Gamma(38, 0.5)
F ~ Gamma(40, 0.5) # Gamma(19, 1) # Gamma(38, 0.5)
G ~ Beta(5, 99995) # Beta(1, 19999) # Beta(5, 99995)
H ~ Gamma(2.2, 0.5) # Gamma(1, 1) # Gamma(2, 0.5)
K ~ Gamma(2.0, 0.5) # Gamma(1, 1) # Gamma(2, 0.5)
σ² ~ InverseGamma()
σ = sqrt(σ²)
for i in eachindex(x)
# note: the logistic denominator (1 + K*G*H^x) is dropped here, so K is effectively informed only by its prior
y = A ^ ((x[i] + B) ^ C) + D * exp( - E * (log(x[i]) - log(F)) ^ 2) + G * H ^ (x[i]) # / (1 + K * G * H ^ (x[i]) )
m[i] ~ Normal(y, σ)
end
end
heligman_pollard_model (generic function with 2 methods)
Now we instantiate the Heligman-Pollard Turing model
model_hp = heligman_pollard_model(x, mx)
DynamicPPL.Model{typeof(Main.heligman_pollard_model), (:x, :m), (), (), Tuple{Vector{Int64}, Vector{Float64}}, Tuple{}, DynamicPPL.DefaultContext}(Main.heligman_pollard_model, (x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10 … 90, 91, 92, 93, 94, 95, 96, 97, 98, 99], m = [0.0005503026664665567, 0.0003601296466728022, 0.00029008412439607484, 0.00023005291216979904, 0.00021004410926294523, 0.00020004000800160032, 0.00019003610686030346, 0.00017002890491383537, 0.0001500225033755063, 0.0001400196027443842 … 0.17764823647176592, 0.19658733292649364, 0.21673744022777328, 0.23839009287925697, 0.2620526023524661, 0.2878963501017438, 0.31577216089261984, 0.34504418470146736, 0.3754212227494671, 0.4067665470915102]), NamedTuple(), DynamicPPL.DefaultContext())
and fit it:
# chain_hp = sample(model_hp, HMC(0.05, 10), 4_000) # Pure HMC didn't converge
# chain_hp = sample(model_hp, MH(), 5_000) # Metropolis-Hasting didn't converge
# chain_hp = sample(model_hp, Gibbs(MH(:A, :B, :C), MH(:D, :E, :F), HMC(0.65, 5, :G, :H, :K)), 5_000) # I am afraid I am not getting this right
chain_hp = sample(model_hp, NUTS(0.85), 5_000)
Chains MCMC chain (5000×22×1 Array{Float64, 3}):
Iterations = 1001:1:6000
Number of chains = 1
Samples per chain = 5000
Wall duration = 27.61 seconds
Compute duration = 27.61 seconds
parameters = A, B, C, D, E, F, G, H, K, σ²
internals = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size
Summary Statistics
parameters mean std mcse ess_bulk ess_tail rhat ⋯
Symbol Float64 Float64 Float64 Float64 Float64 Float64 ⋯
A 0.0003 0.0002 0.0000 6288.4819 2407.0987 1.0005 ⋯
B 0.0199 0.0140 0.0002 5297.4639 2343.2760 0.9999 ⋯
C 0.0800 0.0273 0.0003 6412.9666 3318.2054 1.0000 ⋯
D 0.0008 0.0003 0.0000 6840.5086 3520.0791 1.0009 ⋯
E 15.0611 2.7855 0.0306 8066.9082 3006.4320 1.0002 ⋯
F 20.0503 3.1971 0.0371 7380.3479 3494.9658 1.0003 ⋯
G 0.0000 0.0000 0.0000 4118.4309 2831.3746 0.9999 ⋯
H 1.0961 0.0056 0.0001 4055.0930 2675.5635 0.9999 ⋯
K 0.9975 0.7109 0.0089 4842.3014 2357.3357 1.0006 ⋯
σ² 0.0204 0.0030 0.0000 7146.8734 3374.2040 1.0009 ⋯
1 column omitted
Quantiles
parameters 2.5% 25.0% 50.0% 75.0% 97.5%
Symbol Float64 Float64 Float64 Float64 Float64
A 0.0001 0.0002 0.0003 0.0004 0.0007
B 0.0026 0.0093 0.0166 0.0271 0.0553
C 0.0353 0.0603 0.0772 0.0965 0.1412
D 0.0004 0.0006 0.0008 0.0010 0.0014
E 10.0060 13.1441 14.8909 16.8106 21.0734
F 14.2694 17.8391 19.8751 22.1139 26.7570
G 0.0000 0.0000 0.0000 0.0001 0.0001
H 1.0859 1.0923 1.0958 1.0996 1.1083
K 0.1141 0.4733 0.8374 1.3451 2.7937
σ² 0.0153 0.0183 0.0201 0.0222 0.0271
Here is the result of the MCMC:
plot(chain_hp)
We can see the mean values of the parameters as follows
mean(chain_hp)
Mean
parameters mean
Symbol Float64
A 0.0003
B 0.0199
C 0.0800
D 0.0008
E 15.0611
F 20.0503
G 0.0000
H 1.0961
K 0.9975
σ² 0.0204
The mean fit is given by
m_hp = [heligman_pollard(xi, mean(chain_hp, [:A, :B, :C, :D, :E, :F, :G, :H, :K]).nt.mean) for xi in x]
99-element Vector{Float64}:
0.00034968012338853555
0.000246488537808511
0.00020636590783560657
0.00018638293067518293
0.00017594612993119857
0.00017108071285336205
0.00016995535637307591
0.00017162388344596095
0.00017560641079663637
0.0001820092419380469
⋮
0.1731683176040796
0.1867136428394152
0.20106169770159063
0.2162201728911004
0.23219045338601163
0.24896688723562782
0.2665361213372277
0.28487653614007535
0.303957812133525
and we plot it
plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=:topleft)
plot!(plt, x, m_hp, label="Heligman-Pollard fit")
scatter!(plt, x, mx, label="data")
It remains to compute the 95% credible interval,
quantiles_hp = reduce(
hcat,
quantile(
[
heligman_pollard(xi, p) for p in eachrow(view(chain_hp.value.data, :, 1:9, 1))
],
[0.025, 0.975]
)
for xi in x
)
2×99 Matrix{Float64}:
0.000104923 7.96105e-5 6.99194e-5 … 0.16353 0.172546 0.182292
0.000755473 0.000544423 0.000458298 0.360687 0.393052 0.428256
and plot it
plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=:topleft)
plot!(plt, x, m_hp, ribbon=(m_hp .- view(quantiles_hp, 1, :), view(quantiles_hp, 2, :) .- m_hp), label="Heligman-Pollard fit")
scatter!(plt, x, mx, label="data")
Notice how the curve evaluated at the means of the parameters is again outside the quantile band, which is computed pointwise from the function values of the parameter samples. Let's check the last portion of the ensemble:
plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=nothing)
plot!(plt, x, m_hp, label="Bayesian fitted line", color=2)
for p in eachrow(view(chain_hp.value.data, 4500:5000, 1:9, 1))
plot!(plt, x, x -> heligman_pollard(x, p), alpha=0.1, color=2, label=false)
end
scatter!(plt, x, mx, color=1)
Let's look at just a few samples to have a better look at the dependence of the function on the sampled values:
plt = plot(yscale=:log10, title="Force of mortality", titlefont=10, xlabel="age", ylabel="force of mortality", legend=nothing)
plot!(plt, x, m_hp, label="Bayesian fitted line", color=2)
for p in eachrow(view(chain_hp.value.data, 1000:50:2000, 1:9, 1))
plot!(plt, x, x -> heligman_pollard(x, p), alpha=0.4, color=3, label=false)
end
scatter!(plt, x, mx, color=1)
Note: How often do you see $x \mapsto f_{\mathrm{mean}(p)}(x)$ fall off the credible interval of the family $\{f_p(x)\}_p$, where $p$ is the set of parameters? It happens.