Saturday, 24 January 2026

Diagnosing Singular Fits in Linear Mixed Models: Practical Examples, Code, and Alternatives

 

Introduction

This post is the practical follow‑up to When Linear Mixed Models Don’t Converge, where I discussed the theoretical reasons why convergence issues often arise in linear mixed models. Here, we move from theory to practice: simulated examples, diagnostic tools, interpretation of singular fits, and practical modeling alternatives.

If you haven’t read the first part yet, you may want to start there for the conceptual foundations.


Figure 1. Conceptual map showing how model complexity and available information interact. Green = identifiable models; yellow = risk of singularity; red = non‑identifiable models.


A Concrete Example Using R: When a Random-Slope Model Becomes Singular

Let’s simulate a simple dataset with only 8 groups and a very small random‑slope variance. This is a classic scenario where the model converges numerically but is statistically non‑identifiable.

```r

library(lme4)

set.seed(123)
n_groups    <- 8
n_per_group <- 20

group <- factor(rep(1:n_groups, each = n_per_group))
x     <- rnorm(n_groups * n_per_group)

# Fixed effects
beta0 <- 2
beta1 <- 1

# Random effects: sizeable intercept variation, tiny slope variation
u0 <- rnorm(n_groups, 0, 1)
u1 <- rnorm(n_groups, 0, 0.05)

y  <- beta0 + u0[group] + (beta1 + u1[group]) * x +
      rnorm(n_groups * n_per_group, 0, 1)
df <- data.frame(y, x, group)

m <- lmer(y ~ x + (x | group), data = df)
summary(m)
isSingular(m)

```

Typical output:

  • Random‑slope standard deviation ≈ 0.05 (variance ≈ 0.0025, effectively zero)

  • Correlation between intercept and slope = 1.00

  • Message: boundary (singular) fit

  • isSingular(m) returns TRUE

This is not a numerical failure. It is a mathematical signal that the model is too complex for the available information.


How to Interpret a Singular Fit

A singular fit occurs when the estimated random‑effects variance–covariance matrix is not of full rank. This typically means:

  • A random‑effect variance is estimated as zero

  • A correlation hits ±1

  • The Hessian is not invertible

  • The model lies on the boundary of the parameter space

In practice, this means the data do not support the random‑effects structure you specified.
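One way to check this directly is lme4's rePCA(), which runs a principal components analysis on the estimated random‑effects covariance matrix; components with (near‑)zero standard deviation correspond to directions the data cannot inform. A minimal check on the model m fitted above:

```r
# PCA of the random-effects covariance matrix: a component with
# standard deviation ~0 means the matrix is rank-deficient (singular)
summary(rePCA(m))
```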


Diagnostic Checklist

Red flags

  • Variance of a random effect is exactly zero

  • Correlation between random effects is ±1

  • isSingular(model) returns TRUE

  • Large gradients or warnings about the Hessian

  • Estimates change dramatically when switching optimizers

Useful tools

  • lme4::isSingular()

  • lmerControl(optimizer = ...) for diagnosis only (see the allFit() sketch below)

  • Likelihood‑ratio tests between nested models

  • Checking the number of levels per random effect
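For the optimizer checks in particular, lme4 ships allFit(), which refits the same model with every optimizer available on your system; substantial disagreement between the fits is itself a red flag. A minimal sketch, reusing the model m from above (object names are illustrative; some optimizers require the optional optimx and dfoptim packages):

```r
# Refit m with all available optimizers and compare the results
fits <- allFit(m)
ss   <- summary(fits)
ss$fixef   # fixed-effect estimates per optimizer
ss$llik    # log-likelihoods per optimizer
ss$sdcor   # random-effect SDs and correlations per optimizer
```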


Practical Alternatives When the Model Is Too Complex

1. Simplify the random‑effects structure

This is often the most appropriate solution.

Examples:

  • Remove the random slope

  • Remove the correlation using (1 | group) + (0 + x | group) (equivalently, (x || group)); see the sketch after this list

  • Fit a random‑intercept model first and evaluate necessity of slopes
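A minimal sketch of this workflow on the simulated data, comparing the random‑intercept, uncorrelated, and full models (object names are illustrative; anova() refits with ML for the likelihood‑ratio tests, and testing variances on the boundary makes these p‑values conservative):

```r
# Random intercept only
m_int <- lmer(y ~ x + (1 | group), data = df)

# Uncorrelated random intercept and slope
m_uncorr <- lmer(y ~ x + (1 | group) + (0 + x | group), data = df)

# Likelihood-ratio comparisons of the nested models
anova(m_int, m_uncorr, m)
```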


2. Bayesian models with weakly informative priors

Weakly informative priors keep variance estimates away from the boundary at zero, stabilizing estimation when the data alone barely inform them.

Example:

```r

library(brms)

# exponential(2) prior on all SD parameters keeps estimates off the zero boundary
m_bayes <- brm(y ~ x + (x | group), data = df,
               prior = prior(exponential(2), class = sd))

```

3. Penalized mixed models (e.g., glmmTMB)

Penalization (a mild penalty or prior on the variance components) shrinks unstable estimates away from the boundary; see the sketch below.
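As one concrete option, the blme package provides blmer(), a drop‑in replacement for lmer() that by default adds a weak Wishart prior on the random‑effects covariance, keeping estimates off the boundary. A minimal sketch (object names are illustrative; recent glmmTMB versions offer related functionality via regularizing priors):

```r
library(blme)

# lmer() plus a weak covariance prior: variance estimates are pulled
# away from zero and correlations away from +/-1
m_pen <- blmer(y ~ x + (x | group), data = df)
summary(m_pen)
```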


4. Marginal models (GEE)

When the focus is on population‑averaged (fixed) effects rather than subject‑specific effects, GEE sidesteps random‑effects estimation entirely; see the sketch below.
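A minimal sketch with geepack, one common GEE implementation (object names are illustrative; the data must be ordered by cluster, which the simulated df already is):

```r
library(geepack)

# Population-averaged model; robust (sandwich) standard errors
# account for within-group correlation
m_gee <- geeglm(y ~ x, id = group, data = df,
                family = gaussian, corstr = "exchangeable")
summary(m_gee)
```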


Summary

Singular fits are not numerical accidents. They are statistical messages telling you that:

  • the model is over‑parameterized,

  • the data do not contain enough information,

  • and the random‑effects structure needs to be reconsidered.

Understanding this distinction leads to more robust modeling decisions and clearer scientific conclusions.

When Linear Mixed Models Don’t Converge: Causes, Consequences, and Interpretation

 

Introduction

 

Linear mixed-effects models (LMMs) have become a standard tool for analyzing correlated or hierarchical data across many applied fields, including biostatistics, psychology, ecology, and toxicology. Their appeal lies in the ability to model both population-level effects (fixed effects) and sources of variability associated with grouping or experimental structure (random effects) within a single coherent framework.

 

However, practitioners frequently encounter warning messages when fitting LMMs, such as “failed to converge,” “boundary (singular) fit,” or “Hessian not positive definite.” These warnings are often treated as technical nuisances or software-specific quirks, and it is tempting to either ignore them or attempt minor numerical fixes (e.g., changing optimizers or increasing iteration limits). Yet such warnings usually signal deeper statistical issues related to model identifiability and data support.

 

Convergence problems are especially common when LMMs are fitted to small or sparse data sets, or when the random-effects structure is ambitious relative to the available information. In these situations, the model may attempt to estimate more variance–covariance parameters than the data can reliably inform. As a result, the estimation procedure can struggle to identify a unique and stable optimum of the likelihood function.

 

Theoretically, applying linear mixed models to smaller data sets increases the risk of encountering convergence issues, particularly the problem of a singular Hessian matrix during restricted maximum likelihood (REML) estimation. This situation can lead to unwanted outcomes, such as one or more variance components having an asymptotic standard error of zero. When this happens for the residual (error) term, it indicates that there are zero degrees of freedom for the residual, meaning the error variance is not defined. In other words, it suggests insufficient data or excessive model complexity (overparameterization).

 

 

Discussion

In principle, fitting linear mixed-effects models (LMMs) to relatively small or sparse data sets is statistically delicate, because these models rely on estimating variance–covariance components from limited information. Estimation is typically carried out via maximum likelihood (ML) or, more commonly, restricted maximum likelihood (REML), which involves optimizing a likelihood surface in a high-dimensional parameter space that includes both fixed effects and random-effects variance components.

When the amount of data is small relative to the complexity of the random-effects structure, the likelihood surface can become flat or ill-conditioned in certain directions. Therefore, the observed (or expected) information matrix—whose inverse is used to approximate the covariance matrix of the parameter estimates—may be singular or nearly singular. In REML estimation, this manifests as a singular Hessian matrix at the solution.

A singular Hessian indicates that one or more parameters are not identifiable from the data. In practical terms, the data do not contain enough independent information to support the estimation of all specified variance components. This lack of identifiability can arise for several, often overlapping, reasons: 

a) insufficient sample size, particularly a small number of grouping levels for random effects; 

b) near-redundancy among random effects, such as attempting to estimate both random intercepts and random slopes with little within-group replication; 

c) boundary solutions, where one or more variance components are estimated as zero (or extremely close to zero).
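As a minimal illustration of reason (c): if the data contain no real group-level variation, the REML estimate of the group variance frequently lands exactly on the zero boundary. A hypothetical toy example (object names are illustrative, and the exact outcome depends on the simulation draw):

```r
library(lme4)

# No true group effect: the group variance estimate often collapses
# to exactly zero, producing a boundary (singular) fit
set.seed(1)
toy <- data.frame(y = rnorm(40), g = factor(rep(1:8, each = 5)))
m_toy <- lmer(y ~ 1 + (1 | g), data = toy)
isSingular(m_toy)  # frequently TRUE for data like these
```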

Interpretation of Zero or Near-Zero Standard Errors

One symptomatic outcome of a singular Hessian is the appearance of zero (or numerically negligible) asymptotic standard errors for certain parameters, especially variance components. This is not a meaningful indication of infinite precision; rather, it is a numerical artifact reflecting non-identifiability.

In the specific case of the residual (error) variance, an estimated standard error of zero suggests that the residual variance cannot be distinguished from zero given the fitted model. Conceptually, this corresponds to having zero effective degrees of freedom for the residual error term. Put differently, after accounting for fixed and random effects, the model leaves no independent information with which to estimate unexplained variability.

This situation implies that the model has effectively saturated the data: each observation is being explained by the combination of fixed effects and random effects with no remaining stochastic noise. From a statistical perspective, this is problematic because a) the residual variance is no longer defined in a meaningful way, b) the standard inferential procedures (confidence intervals, hypothesis tests) become invalid, and c) small perturbations of the data can lead to large changes in parameter estimates (an ill-conditioned problem).

Overparameterization and Model–Data Mismatch

The fundamental issue underlying these problems is overparameterization: the specified model is too complex for the available data. In mixed models, this typically occurs when the random-effects structure is overly rich relative to the number of observations and grouping units. For example, attempting to estimate multiple random slopes per group with only a few observations per group almost guarantees identifiability problems.

From a modeling perspective, a singular Hessian or zero residual variance should not be interpreted as an intrinsic property of the data-generating process, but rather as evidence of a mismatch between model complexity and data support. The data are insufficient to separate the contributions of all variance components, leading the estimation procedure to collapse some parameters to boundary values.

Practical Implications

These issues underscore the importance of exercising caution when applying linear mixed models to small data sets. In such situations, it is often necessary to simplify the random-effects structure by eliminating random slopes or correlations, or, if feasible, to increase either the sample size or the number of grouping levels. As a final option, one might consider alternative modeling strategies, such as fixed-effects models or penalized/regularized mixed models.


Ultimately, convergence diagnostics, singularity warnings, and zero standard errors should be viewed as key signals that the model may be statistically ill-posed, rather than as purely technical inconveniences. They suggest that the inferential conclusions drawn from such a model are likely unreliable unless the model specification is revisited.

Further Reading

To see how these concepts play out in practice — with simulated examples, diagnostics, and modeling alternatives — check out the second part of this series:

👉 Diagnosing Singular Fits in Linear Mixed Models

 


References

Gelman, A., & Hill, J. (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.

McElreath, R. (2020). Statistical Rethinking: A Bayesian Course with Examples in R and Stan (2nd ed.). CRC Press.

Pinheiro, J. C., & Bates, D. M. (2000). Mixed-Effects Models in S and S-PLUS. Springer.

Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48.

Matuschek, H., Kliegl, R., Vasishth, S., Baayen, H., & Bates, D. (2017). Balancing Type I error and power in linear mixed models. Journal of Memory and Language, 94, 305–315.

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255–278.
