1. Overview of main changes

We are continuing to develop our research into the new dynamic population model (DPM). The DPM will estimate population and population change in a timely way, to better respond to user needs.

The DPM was used to produce updated admin-based population estimates (ABPEs) for mid-2021 and mid-2022, and provisional estimates for mid-2023, using updated methods and data. Estimates have been produced for all 318 local authorities in England and Wales and refer to the population at mid-year (30 June). The updated geography boundaries used (outlined on our Local government restructuring web page) differ from our previous releases and represent local authorities as of 1 April 2023. You can find out more about local authority district names and codes on the Open geography portal website.

Our companion article Admin-based population estimates: local authorities in England and Wales, mid-2021 to mid-2023 presents these ABPEs. These estimates are not directly comparable with the ABPE timeseries for mid-2011 to mid-2020 published in February 2023; this is because the input data sources have been updated to incorporate rebased official international and internal migration flows, following Census 2021. More information can be found in our Rebasing of mid-year population estimates following Census 2021, England and Wales bulletin. An updated mid-2011 to mid-2020 ABPE back series will be released in the future once we have had time to understand the impact of revised input data and methods.

In this article we discuss updates and improvements since our last set of ABPEs from the DPM were published in June 2023, including:

  • data sources

  • additional functionality developed within the DPM

  • the estimation method

  • the statistical models for count data

  • estimating uncertainty for aggregate data

In this article all references to years refer to mid-year (30 June).

Our Producing population estimates for lower layer super output areas (LSOAs) article and Development of statistical population datasets (SPD) article provide further useful information.

These are official statistics in development because we continue to refine our methods. They do not replace official mid-year population and international migration estimates and should not be used for decision making. These outputs must not be reproduced without this warning.

2. Overview of the dynamic population model

The census has evolved over time, providing a snapshot every 10 years into who we are and how we live. The census and our census-based mid-year estimates provide the current best picture of society at a moment in time.

We know that our traditional census and mid-year estimate approach no longer meets the full range of user needs. More timely population estimates are required, and we are committed to maximising the use of administrative data. We are working towards transitioning to a new system that is not reliant on census.

Summary information on how we are transforming population and migration statistics is provided in our Admin-based population estimates: local authorities in England and Wales, mid-2021 to mid-2023 results article, while our Overview of population and migration statistics transformation web page provides greater detail.

Our initial focus was on delivering population and migration estimates at local authority level, using different methods which allow more timely and coherent estimates. In July 2022, we introduced the dynamic population model (DPM) as our proposal for producing timely, coherent population statistics. The DPM method is more flexible and will meet a wider range of user needs. You can read more about it in our Dynamic population model for England and Wales: July 2022 article.

The DPM uses statistical modelling techniques and the cohort component method to produce admin-based population estimates (ABPEs). It draws on a range of data sources and balances the information on population (stocks) at specific points in time with the components of population change (flows of births, deaths and migrants) over time, to produce a coherent set of estimates. The ABPEs and input data sources refer to mid-year (30 June) of the reference year.

A significant advantage of the DPM will be its flexibility. Our official mid-year estimates (MYEs) rely on rolling forward census-based stock estimates, births, deaths and migration each year. More information about MYEs can be found in our Population estimates for England and Wales, mid-2022: methods guide methodology.

The DPM, however, uses administrative data as stock datasets each year and can incorporate other data sources when they become available. This could include sources relating to local areas or specific population groups, for example school-age children, or sources that represent the total population.

The DPM can also adapt to quality issues in underlying data sources, by drawing strength across sources and balancing information from population stocks and population flows and measures of uncertainty. We use dashboards and data visualisations with real-time data and outlier detection to monitor trends in demographic behaviours and incorporate this intelligence into the model to make it responsive to current trends.

Official data on Mid-Year Estimates (MYEs) for 2022 were published on the Office for National Statistics (ONS) website on 23 November 2023, following a delay because of quality issues in some of the data used for the internal migration component of the estimates, which required further research and development to address. The flexibility of the DPM enabled 2022 ABPEs to be updated in June 2023 accounting for these quality issues within the credible intervals.

We have previously published research into the DPM.

Our latest research provides updated ABPEs from the DPM for 2021 and 2022 and provisional estimates for 2023 for all 318 local authorities.

We use a variety of data sources to provide more frequent, relevant, and timely population statistics. Our population statistics sources guide helps users find the right population statistics for them.

3. Improvements to data sources

We have updated several data sources used in the dynamic population model (DPM). Unless discussed in this article, assumptions made about individual data sources remain the same as in our previous publications.

Population stock

The Statistical Population Datasets (SPDs) are linked administrative data to which a set of inclusion rules is applied to approximate the usually resident population. The SPD stocks are produced independently for each year, so any error in one year is less likely to be rolled forward to the next by the method.

The population stocks used for this December 2023 release were:

  • census-based mid-year estimates (MYEs): 2021

  • SPD version 4.1 and counts from the Personal Demographic Service (PDS): 2022

  • PDS counts, since an SPD stock was not available: 2023

The DPM allows us to input more than one population dataset for a given year. This was implemented for the first time, using SPD and PDS stocks for 2022.

In our June 2023 article, we explained why SPD version 4.0 was used for 2020 while census-based MYEs were used as population stock for 2021. Data delays meant we previously did not have an SPD available for 2022. Instead, we allowed the model to estimate 2022 stocks based on input flow data up to 2022 (births, deaths, internal migration, international migration and cross-border flows) and input stock data up to 2021.

The main changes to SPD version 4.1 compared with version 4.0 are:

  • this is a non-income based SPD since income data for 2022 were not available

  • delays in receiving 2022 PDS data meant that 2021 PDS data were rolled forward using monthly PDS updates to create the 2022 SPD; this approach should not be required in the future

  • provisional monthly data for Hospital Episode Statistics (HES) and Emergency Care Datasets (ECDS) are used where final annual data are not yet available

  • the Individualised Learner Record (ILR) data used an extract which had improved coverage and was timelier

SPD version 4.1 uses the following data sources:

  • PDS

  • English school census (ESC)

  • Welsh school census (WSC)

  • ILR

  • Higher Education Student Record (HESA)

  • HES

  • ECDS

  • deaths registrations

  • address frame

In the absence of income datasets, the SPD relies more on the PDS to provide sufficient coverage of the population. We therefore used PDS “presence” rules rather than the “activity” rules used for SPD version 4.0. To be counted as usually resident, someone only had to be present on the PDS, rather than having had recent activity. PDS presence rules were used in some previous versions of SPDs.

SPDs are quality assured at several stages throughout production, from checks on individual sources through to processing and output checks. The most detailed checks involve analysts studying output data at local authority level by single year of age and sex.

The DPM requires unbiased population stock estimates for each year broken down by local authority, single year of age and sex. Coverage adjustment of the population stocks, including SPDs, is required because of coverage errors.

For the years ending June 2022 and June 2023, we assumed that coverage ratios remained constant at the values derived from comparison with census-based MYEs for 2021. The uncertainty of this assumption increases over time as we move further away from the 2021 reference period. In the long term, we will develop methods to improve this approach.
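
As a minimal sketch of the constant coverage ratio assumption, the Python snippet below derives a ratio for one age, sex and local authority cell from a 2021 comparison and applies it unchanged to a later year. The function names and all counts are hypothetical, and defining the ratio as the admin-based count divided by the census-based MYE is an assumption made for illustration rather than a description of the production method.

```python
def coverage_ratio(admin_count_2021: float, mye_2021: float) -> float:
    """Ratio of the admin-based count to the census-based MYE for one cell."""
    if mye_2021 <= 0:
        raise ValueError("MYE must be positive to form a ratio")
    return admin_count_2021 / mye_2021


def coverage_adjust(admin_count: float, ratio: float) -> float:
    """Divide a later admin-based count by the ratio held constant from 2021."""
    return admin_count / ratio


# Hypothetical cell: the admin source is 4% above the MYE in 2021.
ratio_2021 = coverage_ratio(admin_count_2021=1040.0, mye_2021=1000.0)

# The same ratio is reused for 2022 (and would be reused again for 2023).
adjusted_2022 = coverage_adjust(admin_count=1066.0, ratio=ratio_2021)
print(ratio_2021, adjusted_2022)
```

Because the ratio is fixed at its 2021 value, any real change in coverage after 2021 passes straight into the adjusted stock, which is why the uncertainty of the assumption grows over time.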

We previously suggested (as explained in the Integrated population and characteristics survey report) that a voluntary household survey could be used to adjust the SPD for coverage error. We developed that work into two main strands of research, as outlined in our SPD estimation options report. One focused on the potential use of a large-scale data collection exercise to adjust the SPD for under-coverage; the other considered a full admin data approach.

In June 2023, we explained the results of research into a preliminary test of the data collection approach, testing the effectiveness of using Dual System Estimation (DSE) and a simple method of overcoverage estimation. These methods did not adequately address overcoverage with the data currently available to us. We are continuing to develop and refine these methods.

Our Population stock estimation development paper (which can be viewed on the UK Statistics Authority website) provides further detail on the methods we are currently developing to account for under and overcoverage in the SPD.

To improve the quality of the SPD and support its use in the DPM, future work will focus on:

  • investigating new data sources to improve coverage across population groups

  • exploring how best to ensure those in communal establishments are accurately reflected in the SPD

Uncertainty estimates

Uncertainty estimates for coverage-adjusted population stocks are provided to the DPM. These are based on statistical models of the ratio of reported stocks in 2011 and 2021 to census-based MYEs and represent the range of values that the true value of the estimate is likely to fall within.

Estimates of uncertainty allow the DPM to account for our confidence in the SPD totals across different years and cohorts. Further details can be found in our Dynamic population model, improvements to data sources and methodology for local authorities, England and Wales: 2011 to 2022.

Internal migration and cross-border moves

Internal migration describes moves made between local authorities and regions. Cross-border moves describe flows between countries within the UK. We included the official estimates of internal migration and cross-border flows used in the MYEs for 2021 and 2022, as described in our methods guide, which was published in November 2023.

The flexibility of the DPM means we can use alternative estimates of internal migration and cross-border flows. In the absence of 2023 MYEs, internal migration for 2023 was estimated using Personal Demographic Service (PDS) updates of those changing their address on NHS systems.

To ensure consistency with the rest of the time series, we scale the PDS-based migration to account for internal migration not captured by PDS alone. For example, internal migration of Higher Education students between places of usual residence while studying is not always well captured by the PDS. This can particularly affect estimates for local authorities with large student populations.

The scaling for 2023 internal migration was based on the ratio of PDS-based estimates to mid-year estimates of internal migration for each age, sex, and local authority combination. The ratio was taken from one of two reference periods: the average of 2018 and 2019, which was used as the baseline, or 2022.

Scaling based on 2022 was applied in areas where the 2022 PDS internal migration estimates were more similar to the MYE internal migration component for 2022. Around half of all local authorities were updated to use scaling based on 2022. We will continue to review the effectiveness of this methodology.
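
The scaling described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the production code: the function names and counts are hypothetical, the 2018 and 2019 average is assumed to be the mean of the two annual ratios, and the rule for choosing between the two reference periods is not reproduced, so the caller passes the choice in as a flag.

```python
from statistics import mean


def pds_to_mye_ratio(pds_moves: float, mye_moves: float) -> float:
    """Ratio of PDS-based moves to MYE moves for one age-sex-area cell."""
    if mye_moves <= 0:
        raise ValueError("MYE moves must be positive to form a ratio")
    return pds_moves / mye_moves


def scale_pds_moves(pds_moves_2023, ratios_2018_2019, ratio_2022, use_2022):
    """Scale 2023 PDS-based moves using the chosen reference-period ratio."""
    ratio = ratio_2022 if use_2022 else mean(ratios_2018_2019)
    # Dividing by the ratio scales the PDS count up when the PDS captured
    # fewer moves than the MYE in the reference period.
    return pds_moves_2023 / ratio


# Hypothetical cell: the PDS captured 80% and 90% of MYE moves in 2018 and
# 2019, and 75% in 2022.
baseline_ratios = [pds_to_mye_ratio(80, 100), pds_to_mye_ratio(90, 100)]
ratio_2022 = pds_to_mye_ratio(75, 100)

scaled_baseline = scale_pds_moves(170, baseline_ratios, ratio_2022, use_2022=False)
scaled_recent = scale_pds_moves(150, baseline_ratios, ratio_2022, use_2022=True)
print(scaled_baseline, scaled_recent)
```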

Quality assurance highlighted a much lower number of moves for 0-year-olds in PDS data during a four-month period in the year ending 30 June 2023 compared with previous years. These estimates were scaled up based on the average proportion of moves in these months in historic PDS data (moves to 2018 and 2019).

International migration

We use the United Nations definition of an international migrant: a person who changes their country of usual residence for a period of at least a year.

In November 2023 we published our latest Long-term international migration, provisional: year ending June 2023 bulletin. These latest long-term international migration estimates (LTIM) are produced using methods that are based predominantly on administrative data and are less reliant on the International Passenger Survey (IPS) data and statistical modelling.

These UK-level LTIM estimates use different sources and methods for EU, non-EU and British national migration. More information can be found in our Methods to produce long-term international migration estimates methodology.

The methods to derive LTIM estimates for England and Wales by age, sex and local authority are independent of the methods to produce UK-level estimates. They use some of the same data sources from the Home Office and Department for Work and Pensions (DWP). Additional data sources, including Higher Education Statistics Agency (HESA) and NHS Personal Demographic Service (PDS), are used to fill coverage gaps (namely, international students and migrants aged 16 years and under).

For consistency and comparability, these same methods are used to produce the international migration component by age, sex and local authority required to create the official MYEs. Because of data availability, the methods differ slightly for 2021 compared with 2022 and 2023. Methods used in this release are described in full in our Population estimates for the UK, mid-2022: methods guide.

Methods used in our June 2023 release are described in our Population estimates for the UK, mid-2021: methods guide.

4. Improvements to methods

Since our previous methodology published in June 2023, there have been several improvements to methods used in the dynamic population model (DPM). These include:

  • ongoing research into estimates of uncertainty for aggregate data

  • estimation of migration rates to better capture recent trends

  • additional functionality developed within the DPM

  • changes in the estimation method

These are discussed in more detail in this section.

Estimates of uncertainty for aggregate data

We are continuing exploratory work to integrate the two steps used in the local authority level variance benchmark approach. This involves carrying out the smoothing and the randomisation of rates by fitting a statistical model to the inflow and outflow estimates over all local authorities. This should lead to better estimates of the rates and more accurate estimates of uncertainty at all levels of aggregation.

This release updates our estimates of uncertainty for aggregate data published in June 2023, using Census 2021 and population stock data for 2022 and 2023. Estimates of uncertainty for admin-based population estimates (ABPEs) at the aggregate local authority level for 2021 to 2023 are available in our published dataset. These estimates give improved accuracy by randomising the coverage-adjusted stocks; previously we randomised the coverage ratio of stocks.

The coefficient of variation is a relative measure of dispersion which is used to determine the variability of data. For local authority level estimates, the coefficient of variation is at its lowest in census years (2011 and 2021) and increases the further we move away from these years.
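
As a brief illustration, the coefficient of variation is the standard deviation divided by the mean. The Python sketch below computes it for two sets of hypothetical draws of a local authority total; the draws are invented for illustration, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def coefficient_of_variation(draws):
    """Standard deviation of the draws relative to their mean."""
    return stdev(draws) / mean(draws)


# Hypothetical draws of one local authority total: tightly clustered in a
# census year, more dispersed in a later year.
draws_census_year = [99_900, 100_000, 100_100]
draws_later_year = [99_000, 100_000, 101_000]

cv_census = coefficient_of_variation(draws_census_year)
cv_later = coefficient_of_variation(draws_later_year)
print(cv_census, cv_later)
```

Because the measure is relative, it allows uncertainty to be compared across local authorities with very different population sizes.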

Migration rates

To better capture changing migration patterns in 2022 and 2023, we smoothed three periods separately: 2012 to 2021, 2022, and 2023. The method for smoothing rates, based on fitting generalised additive models, has remained unchanged.

There are several reasons for recent changes in migration patterns including changes to immigration schemes for Afghan, Hong Kong and Ukrainian nationals, a new immigration system following Brexit, and differences in people’s movements during, and following, the coronavirus (COVID-19) pandemic.

Our methods for 2012 to 2021, and for birth and death rates, remain unchanged. Smoothing is applied to reduce the amount of random variation and attempts to represent the underlying rates.

We improved the capture of change in migration patterns in Neath Port Talbot by separately smoothing periods to account for a new university campus built in 2017.

Model structure and additional functionality

The estimation steps in the DPM are:

  1. creating initial estimates of the demographic account by single year of age, sex and local authority, using counts of births, deaths, combined inflows and outflows, and population counts

  2. estimating statistical models to address systematic inaccuracies in data sources because of coverage and reporting errors; models are also used to estimate underlying demographic rates for births, deaths and inflow and outflow rates

  3. estimating demographic accounts for each local authority, where the change in population between two time periods equals the net population flows

  4. combining these separate accounts to form a unified demographic account for England and Wales
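
The balancing constraint in step 3 can be sketched for a single cohort in Python. This is a simplified illustration only: the class and function names are hypothetical, the counts are invented, and births (which enter the account as a new cohort) are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CohortFlows:
    """Flows affecting one cohort in one local authority over one year."""
    deaths: int
    inflows: int   # combined migration into the area
    outflows: int  # combined migration out of the area


def roll_forward(start_stock: int, flows: CohortFlows) -> int:
    """End-of-period stock implied by the start stock and the flows."""
    return start_stock - flows.deaths + flows.inflows - flows.outflows


def is_balanced(start_stock: int, end_stock: int, flows: CohortFlows) -> bool:
    """True when the change in population equals the net population flows."""
    return end_stock == roll_forward(start_stock, flows)


flows = CohortFlows(deaths=5, inflows=60, outflows=40)
end_stock = roll_forward(1000, flows)
print(end_stock, is_balanced(1000, end_stock, flows))
```

In the DPM the stocks and flows are all estimated with uncertainty, so the model searches for accounts that satisfy this identity while staying close to the input data, rather than simply rolling one stock forward.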

The fundamental structure of the model remains unchanged. However, the estimation method used in step three has been improved; this is discussed in our Changes in the estimation method section.

Additional functionality developed within the basic model structure since our June 2023 release includes:

  • statistical models for the counts of population inflows and outflows which account for systematic inaccuracies and uncertainty in those datasets

  • additional parameters in the statistical models for counts of stocks and flows, which help adjust for systematic errors in the coverage adjustment and in the estimates of input count data uncertainty; we refer to these parameters as scaling parameters, and because these multipliers can adjust coverage ratios and input count data uncertainty, they make our models more robust

Changes to the models used for the different stock and flow count datasets since our June 2023 release include:

  • models for stock data except the Mid-Year Estimates (MYEs); the models for these datasets now include scaling parameters to allow for systematic errors in the coverage adjustment and estimates of uncertainty

  • total inflows and outflows for 2012 to 2023 (neither dataset was in the model for our June 2023 release); these models assume no coverage adjustment, and no systematic errors in the uncertainty estimates

Changes in the estimation method

In previous publications, particle filtering was used for estimating the posterior distribution of the set of demographic accounts. An alternative approach using the Laplace approximation method has benefits over the particle filtering approach: it generates an approximation of the full posterior distribution of the demographic accounts faster. Laplace’s method approximates the posterior distribution with a multivariate normal distribution and was implemented using the R package Template Model Builder (TMB) to produce our latest set of ABPEs. You can find out more about TMB in the TMB: automatic differentiation and Laplace approximation report.
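
To illustrate the idea behind Laplace’s method, the Python sketch below approximates a one-dimensional posterior with a normal distribution centred at the posterior mode, with variance given by the inverse of the curvature at that mode. It is a toy example under stated assumptions and is not the DPM or TMB implementation: the target is a hypothetical log-rate whose log posterior is a * theta - b * exp(theta), the constants are invented, and the derivatives are written by hand, whereas TMB obtains them by automatic differentiation for many parameters at once.

```python
import math


def laplace_approximation(grad, hess, start, tol=1e-10, max_iter=100):
    """Return (mode, sd) of the normal approximation to a 1-D log posterior.

    grad and hess are the first and second derivatives of the log posterior.
    """
    theta = start
    for _ in range(max_iter):
        step = grad(theta) / hess(theta)  # Newton step towards the mode
        theta -= step
        if abs(step) < tol:
            break
    curvature = -hess(theta)  # positive at a maximum
    return theta, 1.0 / math.sqrt(curvature)


# Hypothetical constants: a combines a prior shape with an observed count,
# b combines a prior rate with an exposure.
a, b = 50.0, 1000.0


def grad(theta):
    return a - b * math.exp(theta)


def hess(theta):
    return -b * math.exp(theta)


mode, sd = laplace_approximation(grad, hess, start=math.log(48 / 1000))

# Approximate 95% interval, transformed back to the rate scale.
interval = (math.exp(mode - 1.96 * sd), math.exp(mode + 1.96 * sd))
print(mode, sd, interval)
```

For this toy target the mode is log(a / b) and the curvature at the mode equals a, so the numerical result can be checked against the closed form.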

Laplace’s method is well established and has been used in other areas including spatial modelling, estimating HIV epidemic indicators and estimating life expectancy by race.

We compared Laplace’s method and the particle filter approach using simulation studies and real data. A summary of the main findings follows.

Simulation study comparing particle filtering and Laplace approximation methods

We simulated a demographic account for the period 2011 to 2022 starting with 2011 MYEs, stepping through time using smoothed demographic rates from our June 2023 release and transition probability functions between successive periods. The simulated account represents the true account against which estimates are compared.

We ran the model 140 times using both particle filtering and Laplace approximation methods. We used input data drawn from appropriate probability distributions with parameters that reflected uncertainty and bias. These are the distributions assumed in the models for rates and flows.

Two scenarios were simulated:

  • all model inputs were unbiased

  • flow data and demographic rates were unbiased, but coverage ratios of population stocks were biased by 5%

The second scenario was simulated to demonstrate the benefits of the scaling parameters introduced in the models estimated with Laplace’s method. The simulation was run using scale parameters for coverage of 0, 0.05 and 0.1, and the scale parameter for uncertainty was set to 0.

A scale parameter controls the distribution from which the adjustment is drawn. A scale parameter of 0 implies no adjustment, while scale parameters of 0.05 or 0.1 act to potentially scale the coverage ratios up or down by as much as 10% or 20%, respectively. Once we fit the model and estimate these parameters given the data, the distributions of these scaling parameters change and are no longer centred around 1 if there is evidence of a systematic error.
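
The correspondence between the scale parameter and the size of the adjustment can be read as a two standard deviation range around 1. The Python sketch below shows that arithmetic; treating the scale parameter as the standard deviation of a multiplier centred on 1 is an assumption made for illustration, and the function name is hypothetical.

```python
def approx_adjustment_range(scale: float, n_sd: float = 2.0):
    """Approximate range of a multiplier centred on 1 with the given spread."""
    return 1.0 - n_sd * scale, 1.0 + n_sd * scale


range_none = approx_adjustment_range(0.0)    # no adjustment
range_small = approx_adjustment_range(0.05)  # roughly 10% either way
range_large = approx_adjustment_range(0.1)   # roughly 20% either way
print(range_none, range_small, range_large)
```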

Results for each method were combined and the following measures calculated:

  • relative bias; the average relative difference from the truth

  • coverage probability; the proportion of 95% credible intervals that contain the true value

  • average relative width of credible intervals

  • root mean square error (RMSE); a measure which combines both the bias and the variance of estimates
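
The four measures listed above can be sketched in Python for a single quantity estimated over repeated simulation runs. The function names and numbers are hypothetical, and expressing the interval width relative to the true value is an assumption made for illustration.

```python
import math


def relative_bias(estimates, truth):
    """Average relative difference between the estimates and the truth."""
    return sum((e - truth) / truth for e in estimates) / len(estimates)


def coverage_probability(intervals, truth):
    """Proportion of credible intervals that contain the true value."""
    return sum(lo <= truth <= hi for lo, hi in intervals) / len(intervals)


def average_relative_width(intervals, truth):
    """Average interval width as a proportion of the true value."""
    return sum((hi - lo) / truth for lo, hi in intervals) / len(intervals)


def rmse(estimates, truth):
    """Root mean square error, combining bias and variance."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


# Three hypothetical runs for a cell whose simulated true value is 100.
truth = 100.0
estimates = [98.0, 101.0, 104.0]
intervals = [(95.0, 101.0), (97.0, 105.0), (101.0, 107.0)]

bias = relative_bias(estimates, truth)
coverage = coverage_probability(intervals, truth)
width = average_relative_width(intervals, truth)
error = rmse(estimates, truth)
print(bias, coverage, width, error)
```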

Results from simulation study with unbiased model inputs

Laplace approximation tends to result in smaller deviations from the simulated truth than particle filtering for ABPEs by single year of age and sex. Over all ages, particle filtering results in marginally smaller bias compared with Laplace approximation.

The coverage probability achieved by both methods was mostly above 95%. Particle filtering had more ages where this fell below 95%, showing the Laplace approximation results in less deviation from the truth.

Laplace approximation resulted in slightly narrower 95% credible intervals than particle filtering, while having comparable or higher coverage probability.

Laplace approximation estimates have either similar or noticeably lower root mean square errors (RMSE) than particle filtering.

Results from simulation study using Laplace approximation with biased population stocks

When the scale parameter for coverage was 0, the relative bias of the ABPEs was close to 5% for most ages. However, when the scale parameter increased, the relative bias decreased. A scale parameter of 0.1 resulted in the average over all ages of the relative bias values being closer to zero than a scale parameter of 0.05.

The coverage probability was much lower than 95% when the scale parameter was 0, but exceeded 95% for most ages when the scale parameter was 0.05 or 0.1.

The 95% credible intervals were notably wider when the scale parameter was set to 0.1 compared with 0.05. Therefore, given little difference in relative bias and coverage probability with scale parameters of 0.05 and 0.1, the value of 0.05 was applied to all stocks except MYEs.

Comparing estimation methods on real data

The Laplace method used for estimating the ABPEs gives very similar results to the particle filter method. For example, Harrow had the greatest percentage difference in total population estimate for 2023 (ABPE 1.6% lower using Laplace method) when excluding Isles of Scilly and City of London which can often have outlying results because of very small populations.

The results of the simulation study and comparison of estimation methods on real data indicate that the Laplace method performs well. The particle filter and Laplace methods give similar results, but the substantial gain in speed using the Laplace method is an important factor in deciding to use it for estimating ABPEs.

5. Options for producing dynamic population model estimates at lower levels of geography

As we explore the increasing use of administrative data in our population estimation system, it is important to consider methods for estimating population at lower levels of geography. At present, the dynamic population model (DPM) produces estimates at local authority level by single year of age and sex.

Further research into producing admin-based population estimates (ABPEs) at lower levels of geography is discussed in our Small area population estimates in the transformed estimation system: method development methodology.

6. Future developments

The dynamic population model (DPM) and resulting admin-based population estimates (ABPEs) are showing great potential for producing timely, coherent population statistics. We are aiming for the ABPEs to become our official population estimates. Our recent consultation on the future of population and migration statistics asked users to provide feedback about our research so far. This feedback will inform further research, publication cycles and our revisions policy. The National Statistician will make a recommendation on the future of population statistics in 2024.

7. Glossary

Administrative data

Collections of data maintained for administrative reasons, for example, registrations, transactions, or record-keeping. They are used for operational purposes and their statistical use is secondary. These sources are typically managed by other government bodies.

Coverage errors

A coverage error occurs when a member of the population is not counted (undercoverage), is counted more than once (overcoverage) or is counted in the wrong location.

Credible intervals

The range in which the true value of the quantity being estimated is likely to be contained. We use 95% credible intervals in this article by taking 2.5th and 97.5th percentiles from the distributions of counts produced by our estimation process as the lower and upper bounds of our intervals, respectively. In this case, we can say that the probability that the true value lies in the credible interval is 95%.
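
As a minimal sketch, a 95% credible interval can be read off a set of draws by taking the 2.5th and 97.5th percentiles. The Python example below uses the standard library; the draws are a hypothetical stand-in for the counts produced by an estimation process, and the choice of the "inclusive" interpolation method is an assumption.

```python
from statistics import quantiles

# Hypothetical stand-in for draws of one population count.
draws = list(range(1001))

# Splitting into 40 equal groups gives cut points every 2.5%, so the first
# cut point is the 2.5th percentile and the last is the 97.5th percentile.
cut_points = quantiles(draws, n=40, method="inclusive")
lower, upper = cut_points[0], cut_points[-1]
print(lower, upper)
```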

Dynamic population model (DPM)

A statistical modelling approach that uses a range of data to measure the population and population changes in a fully coherent way.

Generalised additive models (GAMs)

Allows the modelling and smoothing of non-linear data. GAMs have been used within the dynamic population model (DPM) to model and smooth raw stock and flow data. This was done to reduce the amount of random variation and attempt to represent the true underlying pattern. This approach is particularly useful when working with noisy data or rare events.

Laplace approximation

A fast method for estimating the posterior distribution. This is implemented in our models using the R package Template Model Builder (TMB).

Multivariate normal distribution

Joint distribution of a set of correlated variables which are normally distributed (a continuous probability distribution where most data points cluster toward the middle of the range, while the rest taper off symmetrically toward either extreme).

Official statistics in development

Official statistics that are in the testing phase and not yet fully developed. A more detailed explanation is available on the Office for Statistics Regulation website.

Overcoverage

Overcoverage occurs when a record is counted more than once at the same location, more than once at a different location, counted in the wrong location, or is incorrectly included.

Particle filters

A method for estimating the posterior distribution. Further detail is available on the University of Oxford’s Department of Statistics website.

Personal Demographic Service (PDS)

A national electronic database of National Health Service (NHS) patients from NHS England, which contains only demographic information with no medical details. The PDS differs from the Patient Register (PR), since it is updated more frequently and by a wider range of NHS services. The PDS data available to the Office for National Statistics (ONS) consist of a subset of the records, including those which show a change of postcode recorded throughout the year or a new NHS registration.

Posterior distribution

A probability distribution calculated after receiving data.

Unbiased estimate

An estimate which is not affected by systematic errors.

8. Provide feedback

We welcome your feedback on the dynamic population model (DPM), our transformation journey, and our latest progress and plans. If you would like to contact us, please email us at pop.info@ons.gov.uk.

We have launched our Local population statistics insight feedback framework article, which enables users of population statistics to provide feedback at local authority level and suggest data sources for us to better understand the quality of our estimates.

You can also sign up to email alerts from the Office for National Statistics Population team for updates on our progress, and to hear about upcoming events and opportunities to share your views.

9. Collaboration

The Office for National Statistics (ONS) has been supported in this research by the University of Southampton. Specifically, we would like to thank John Bryant, Peter Smith, Paul Smith, Jakub Bijak, Jason Hilton, Andrew Hind, Erengul Dodd and Joanne Ellison for their guidance and support.

11. Cite this methodology

Office for National Statistics (ONS), released 18 December 2023, ONS website, methodology, Dynamic population model, improvements to data sources and methodology: local authorities in England and Wales, mid-2021 to mid-2023

Contact details for this Article

Louisa Blackwell
pop.info@ons.gov.uk
Telephone: +44 1329 444661