Jon Minton introduces the Shiny app he developed for exploring mortality based on the Human Mortality Database.
This blog post introduces a web-based app I have been developing to allow more intuitive and interactive exploration of data on population counts, and derived data on age-specific mortality hazards, from the Human Mortality Database (HMD). The HMD is a joint initiative by the University of California, Berkeley, and the Max Planck Institute for Demographic Research (MPIDR) in Rostock, Germany, which provides detailed mortality data covering 40 countries. The app is interactive both in terms of the populations that can be selected, and the ways in which data based on these selections can be visualised and explored.
The main challenge the app addresses is how to better use and understand the large amounts of age-year data available for any population group within the HMD. For any one population (such as Italian males or Taiwanese females), separate population counts, exposures (population counts corrected for mortality), and death counts are provided for each age in single years from birth to 109 years of age, and for up to 266 separate years (in the case of Sweden); the mean number of years’ worth of data available for a country in the HMD is 94 (median 68). This means that there are often tens of thousands of separate values for each country, gender and variable combination.
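To give a rough sense of scale, the short sketch below multiplies the 110 single-year ages (birth to 109) by the number of years available. The function name is purely illustrative, and the count is for one population and one variable only.

```python
# Ages run from birth (0) to 109 inclusive, as described in the text.
N_AGES = 110


def cells_per_series(n_years: int, n_ages: int = N_AGES) -> int:
    """Number of age-year cells for one population and one variable."""
    return n_ages * n_years


longest = cells_per_series(266)  # the longest series mentioned (Sweden)
average = cells_per_series(94)   # the mean series length mentioned

print(longest)  # 29260
print(average)  # 10340
```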
The sheer volume of such demographic data can easily be overwhelming. A standard approach when faced with this much data is to use it only indirectly, in the production of summary statistics, such as age-standardised mortality rates, and conditional and unconditional life expectancies. This reduces the amount of data displayed to a user by a couple of orders of magnitude, at the cost of obscuring important patterns and features within the data, potentially leading to a distorted or only partial understanding of the specific processes which may be driving changes in the summary measures.
For example, in Scotland, life expectancy for women was 42 in 1875, 58 in 1925, and 74 in 1975, an increase of 32 years over 100 years, or around four months a year. Does that mean either that a 40 year old Scottish woman in 1875 would have been unlikely to reach her 45th birthday, or that baby girls born in 1975 (now 44 years old) can only expect to live another 30 years? In both cases the answer should be a resounding ‘no’, but the reasons can be difficult to grasp without a deeper understanding of mortality structure, of how mortality hazards vary over the life course, and of how these hazards have changed over time. It is this deeper insight into mortality structure and dynamics that the app exists to help develop.
In the case of the 40 year old in 1875, the risk of not reaching age 41 was 137 per 10 000, and having survived to age 41, of then not reaching age 42, 141 per 10 000. This means there was around a 97% chance (i.e. (10 000 - 137) / 10 000 × (10 000 - 141) / 10 000) of the 40 year old reaching that year’s reported female life expectancy of 42 years of age. The life expectancy just given was actually a relatively rare age to die at or near, sitting in the comparatively low risk ‘trough’ of middle age, between the two peaks of high infant and elderly mortality. This relationship between mortality risk and age in 1875 is shown in the middle of the three subfigures below (Figure 1).
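The survival calculation above can be reproduced in a few lines. This is only a sketch of the arithmetic, using the two risks quoted in the text; the function name is illustrative.

```python
from math import prod


def survival_probability(deaths_per_10000):
    """Chance of surviving each one-year interval in turn.

    Each entry is the risk of dying within that year, per 10 000 people
    alive at the start of it.
    """
    return prod((10_000 - d) / 10_000 for d in deaths_per_10000)


# Risks quoted in the text for Scottish women in 1875: ages 40 and 41.
p_reach_42 = survival_probability([137, 141])
print(f"{p_reach_42:.4f}")  # 0.9724
```

Adding further one-year risks to the list extends the same calculation to later birthdays.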
Similarly, the Scottish woman born in 1975 should expect to live quite a bit longer than the 74 years ‘allotted’ to her at the time, though we do not know exactly how much longer this will be. This is because the life expectancies usually reported are period life expectancies, which assume the mortality risks faced at each age remain frozen at the levels they were when you were born. But throughout recorded history this has almost never been the case; instead mortality rates at almost all ages have tended to fall in most years, albeit at inconsistent rates, and not at the same rate for all ages. Though we cannot show what the mortality risks at older ages are for the 1975 birth cohort – as this would need a crystal ball – we can do something similar for the 1925 birth cohort, for whom the mortality risks in their year of birth implied a (period) life expectancy of just 58 years of age. The middle subfigure below shows the mortality risk profile with age in the year 1925, and the right subfigure shows the mortality risk profile actually experienced by the 1925 birth cohort (Figure 2).
We can see from this that the mortality age profile actually experienced was different to that in the year 1925. Although this cohort experienced a jump in deaths in early adulthood, due to becoming adults during the Second World War, the mortality-age schedule after the war was somewhat ‘gentler’ than that of the 1925 mortality-age profile snapshot of the middle subfigure. For example, the period schedule shows women facing a 1% risk of dying in the next 12 months at age 50, whereas the cohort actually experienced this risk at age 57; similarly, a 5% risk of dying was ‘postponed’ from around age 70 to 79 years, and so on.
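To make the period versus cohort distinction concrete, here is a minimal life-table sketch. It assumes deaths fall mid-year on average, and the two schedules are made-up toy numbers, not HMD values; the function name is also illustrative.

```python
def life_expectancy(qx):
    """Expected years lived from the first age in the schedule.

    qx[i] is the probability of dying during year i, given survival to
    the start of it. Deaths are assumed to fall mid-year on average, and
    the final entry should be 1.0 so that the table is closed.
    """
    alive = 1.0
    years_lived = 0.0
    for q in qx:
        survivors = alive * (1.0 - q)
        # Person-years lived in this interval under the mid-year assumption.
        years_lived += (alive + survivors) / 2.0
        alive = survivors
    return years_lived


# Toy schedules only (NOT real data): the 'period' schedule freezes risks
# at birth-year levels; the 'cohort' schedule has lower risks at later ages.
period_schedule = [0.10, 0.20, 0.50, 1.0]
cohort_schedule = [0.10, 0.10, 0.25, 1.0]

print(life_expectancy(period_schedule) < life_expectancy(cohort_schedule))  # True
```

When risks at later ages fall over a cohort's lifetime, the expectancy computed from the risks actually experienced exceeds the frozen period figure.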
Both the general problem – of how to build up an intuitive and broad sense of mortality structure and change over time – and the specific statistics listed above, are addressed by the app, which aims to present the tens of thousands of numbers in the HMD, for each individual country and gender, in ways that work to human strengths rather than limitations. People are readily able to process and make decisions about large amounts of complex visual and structural information all the time, as the billions of people able to drive cars and not walk into walls (for example) readily demonstrate.
Standard spatial maps show that information about complex three-dimensional structures can also be readily encoded and decoded within the two dimensional planes of the page or computer screen, with colours, symbols and contour lines allowing variations in height or other attributes over latitude and longitude to be represented. The principle of visualising age-year specific mortality rates, and other attributes that vary over age and year, as a quasi-spatial ‘map’ is what guides both the concept of the Lexis surface, and the conventional ways such surfaces are represented, as either heatmaps or contour plots.
Yet, decoding three-dimensional information encoded in a two dimensional map is somewhat like writing, something that has to be learned effortfully, rather than like speaking, which emerges effortlessly. In order to be able to read a spatial map, the user must first have a deep, intuitive, experiential understanding of how spatial surfaces vary in their topography and other features.
The app therefore aims to support the development of a similar kind of deep, intuitive, experiential understanding of the hidden topographies of demographic data, as rendered through the Lexis (‘age-year’) surface. It does this as follows:
- When the user selects a population from the HMD to explore, an interactive 3D Lexis surface plot of the attribute of interest appears to the right. This can be rotated, panned across, or zoomed into as required. (Attributes of interest include mortality hazard, population, sex ratios in population size or mortality, or differences between populations.)
- Hovering over any section of the 3D surface will produce information about the specific point on the surface being hovered over in the form of a tooltip.
- Clicking on any point on the 3D surface will produce/update three additional subplots below the 3D surface, showing the effects of ‘slicing through’ the surface at the point that’s been clicked on, at either 0 degrees (age section), 90 degrees (period section) or 45 degrees (cohort section).
- Hovering over features in the subplots will bring additional information about the points on the subplots in the form of a tooltip.
The combination of the 3D surface plot with the sectional subplots aims to make understanding mortality structure, and interrogating mortality data, a piece of cake. Literally! When you slice through a cake you are taking a section through a three dimensional structure along a single plane. But the slice you are left with, and the face of the cake slice that’s visible if you lay it on its side, depend on the angle at which you made the cut, as well as the shape of the cake. The three subplots are set up to ‘cut through’ the surface at the three most useful angles: by age (zero degrees), by year (90 degrees) and by birth cohort (45 degrees). Together with the full structure above, the three slices presented by the subplots aim to make asking and answering questions of the rich and complex data within the HMD an intuitive, engaging and informative experience.
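The three cuts can be sketched on a toy age-by-year grid. The nested-list layout and function names below are illustrative assumptions, not taken from the app's source. The sketch also assumes the zero-degree cut holds age fixed and follows it across years, while the 90-degree cut holds the year fixed and runs across ages.

```python
def fixed_age_slice(surface, ages, years, age):
    """Values for one age across all years (assumed zero-degree cut)."""
    i = ages.index(age)
    return [(year, surface[i][j]) for j, year in enumerate(years)]


def fixed_year_slice(surface, ages, years, year):
    """Values for one year across all ages (assumed 90-degree cut)."""
    j = years.index(year)
    return [(age, surface[i][j]) for i, age in enumerate(ages)]


def cohort_slice(surface, ages, years, age, year):
    """Values along the 45-degree diagonal for people born in year - age."""
    birth_year = year - age
    year_index = {y: j for j, y in enumerate(years)}
    cells = []
    for i, a in enumerate(ages):
        j = year_index.get(birth_year + a)
        if j is not None:  # skip ages that fall outside the observed years
            cells.append((a, surface[i][j]))
    return cells


# Toy surface (NOT real data): surface[i][j] is the value at ages[i], years[j].
ages = [0, 1, 2]
years = [2000, 2001, 2002]
surface = [[0, 1, 2], [10, 11, 12], [20, 21, 22]]

# 'Clicking' on the cell at age 1 in 2001:
print(fixed_age_slice(surface, ages, years, 1))     # [(2000, 10), (2001, 11), (2002, 12)]
print(fixed_year_slice(surface, ages, years, 2001)) # [(0, 1), (1, 11), (2, 21)]
print(cohort_slice(surface, ages, years, 1, 2001))  # [(0, 0), (1, 11), (2, 22)]
```

All three slices pass through the clicked cell (value 11), which is what ties the subplots back to a single point on the surface.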
Types of Display Currently Available
There are currently five types of Lexis surface that can be produced, each accessible by a different tab at the top of the screen:
- Mortality: Select a country and population, and the log mortality surface appears. The range of years and ages can be limited as required.
- Population: Show the population count by age and year. If the population selected is total, then the 3D surface plot and subplots display males and females cumulatively, with male population counts ‘added to’ female population counts.
- Mortality Sex Ratios (surface only): This shows the ratio of male deaths to female deaths at each age and year. A fixed amount (by default 50) is added to both populations being compared to increase stability when working with cells involving few observations. This can be adjusted as needed. The maximum ratio limit can also be set (3 by default). A translucent equal ratio plane is added; parts of the surface below this plane can be selected by ‘flipping’ the 3D surface to display it from below.
- Population Sex Ratios (surface only): As with Mortality Sex Ratios, but for population counts.
- Mortality Group Comparisons: This allows either the log mortality surfaces for any two groups of countries to be shown simultaneously, as two translucent planes, or for the surface of the differences in age-year specific log mortality rates to be shown and colour coded as in the Sex Ratio tabs. This therefore allows the same gender group to be compared between any two populations. Multiple populations can be selected for both population groups to be compared, allowing whole continents or subcontinental regions to be compared instead of just single countries. Because of the additional calculations required to produce these multi-country comparisons, the 3D surface itself will only render once the button marked ‘click to recalculate’ has been clicked.
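The stabilising adjustment described for the sex ratio tabs can be illustrated with a small sketch. It takes one possible reading of the description: the fixed amount is added to both the male and the female count before dividing, and the result is capped at the maximum ratio. The function and parameter names are hypothetical, and the app's actual calculation may differ.

```python
def stabilised_ratio(male, female, added=50.0, max_ratio=3.0):
    """Male-to-female ratio with a constant added to both counts.

    Assumed reading of the text: 'added' damps ratios in cells with few
    observations, and 'max_ratio' caps the value that is displayed.
    """
    ratio = (male + added) / (female + added)
    return min(ratio, max_ratio)


print(stabilised_ratio(2, 0))       # small cell: 52 / 50, close to 1
print(stabilised_ratio(1000, 100))  # large imbalance: capped at 3.0
```

Without the added constant, the first cell would require dividing by zero; with it, sparse cells are pulled towards a ratio of one.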
The app is currently under active development, and the code used to produce it is freely available. My hope is that it will already be useful for researchers who make use of the HMD’s rich and complex data, and will improve their engagement with, and understanding of, this excellent resource. Comments, feedback, opportunities and suggestions are keenly sought.
- The app: https://datascapes.shinyapps.io/hmd_explorer/
- The code used to produce the app: https://github.com/JonMinton/hmd_explorer
Dr. Jon Minton is a Public Health Intelligence researcher at NHS Health Scotland.