Approximation methods

Approximation methods are used when exact message computation is intractable — most commonly inside Delta nodes, where the factor is a nonlinear or non-conjugate function y = f(x). Each method trades off accuracy, computational cost, and assumptions about f.

All approximation methods are passed through DeltaMeta or FlowMeta.

Choosing a method

MethodBest forDimensionalityRequires
LinearizationSmooth, nearly linear fAnyForwardDiff (auto)
UnscentedSmooth nonlinear fLow–moderateNothing
GaussHermiteCubatureUnivariate integrals with Gaussian inputsUnivariatePoint count p
GaussLaguerreQuadratureIntegrals over [0, ∞)UnivariatePoint count n
srcubature / SphericalRadialCubatureMultivariate Gaussian integralsMultivariateNothing
LaplaceApproximationUnimodal posteriors, differentiable fAnyForwardDiff + Optim
CVIBlack-box or non-differentiable fAnyOptimizer + gradient
CVIProjectionCVI + exponential family projectionAnyExponentialFamilyProjection.jl
ImportanceSamplingApproximationGeneral expectations via samplingAnyProposal distribution

Deterministic approximations

Linearization

Linearization approximates f by its first-order Taylor expansion around the current operating point. Jacobians are computed automatically using ForwardDiff.jl. This is the default method for FlowMeta and a common choice for DeltaMeta when f is smooth and not highly nonlinear.

Unscented transform

Unscented (also UT / UnscentedTransform) propagates a deterministic set of sigma points through f and fits a Gaussian to the outputs. It captures mean and covariance through nonlinearities more accurately than linearization, without requiring derivatives. The number of sigma points scales linearly with the input dimension.

Gauss-Hermite cubature

GaussHermiteCubature computes expectations of the form ∫ g(x) N(x; μ, σ²) dx using a fixed set of quadrature points and weights optimized for Gaussian measures. It is exact for polynomials up to a certain degree determined by the number of points p:

DeltaMeta(method = GaussHermiteCubature(21))  # 21-point rule

Gauss-Laguerre quadrature

GaussLaguerreQuadrature computes expectations over the half-line [0, ∞) — useful when the input has a Gamma distribution or similar semi-infinite support.

Spherical radial cubature

srcubature() constructs a spherical-radial cubature rule for multivariate Gaussian integrals, using 2d + 1 deterministic points (where d is the input dimension). It provides a good balance between accuracy and cost for moderate dimensions.

Laplace approximation

LaplaceApproximation finds the mode of the log-unnormalized posterior (using Optim.jl) and fits a Gaussian at that mode using the local curvature (via ForwardDiff). Best for unimodal, differentiable posteriors.

Stochastic approximations

CVI — Constrained Variational Inference

CVI and ProdCVI approximate messages using stochastic gradient optimization of a variational objective. A gradient estimator (default: ForwardDiffGrad) computes the gradient of the log-likelihood with respect to the natural parameters of the approximating distribution. An optimizer (default: Adam) applies gradient steps until convergence.

DeltaMeta(method = CVI(
    rng,          # random number generator
    n_samples,    # samples for gradient estimation
    n_iterations, # gradient steps per message update
    Adam(params), # optimizer
))
Note

Loading Optimisers.jl unlocks additional optimizers for use with CVI. See the Extensions page.

CVIProjection

CVIProjection extends CVI by projecting the result onto the nearest member of a chosen exponential family, using ExponentialFamilyProjection.jl. This guarantees that the output is a valid member of the target family.

Note

CVIProjection requires the ExponentialFamilyProjection package to be loaded. Without it, using CVIProjection in a delta node will throw an informative error.

Importance sampling

ImportanceSamplingApproximation estimates expectations by drawing samples from a proposal distribution and reweighting. It is the most flexible method but converges slowly in high dimensions.

API reference

ReactiveMP.LinearizationType

The Linearization structure defines the approximation method of the Delta and Flow factor nodes. This method performs a local linearization of f around expansion point x.

Note

The Linearization works well for differentiable functions only. The results might not be accurate for non-differentiable functions.

The Linearization structure with default parameters can be constructed as Linearization().

The Linearization structure is used inside the DeltaMeta or FlowMeta structures and can be included as:

    y ~ f(x) where { meta = DeltaMeta(method = Linearization()) }
    # or
    y ~ Flow(x) where { meta = FlowMeta(flowmodel, Linearization()) }
source
ReactiveMP.UnscentedType

The Unscented structure defines the approximation method of the Delta and Flow factor nodes. More specifically, it contains the hyperparameters used for sigma points computation.

Arguments

  • α: Spread parameter for unscented transform #1
  • β: Algorithm parameter for incorporating prior information on the (non-Gaussian) distribution of Delta node input
  • κ: Spread parameter for unscented transform #2
  • e: Internal cache

The Unscented structure with default parameters can be constructed as Unscented().

The Unscented structure is used inside the DeltaMeta or FlowMeta structure and can be included as:

    y ~ f(x) where { meta = DeltaMeta(method = Unscented()) }
    # or
    y ~ Flow(x) where { meta = FlowMeta(flowmodel, Unscented()) }
source
ReactiveMP.ProdCVIType
ProdCVI

The ProdCVI structure defines the approximation method hyperparameters of the prod(approximation::CVI, logp::F, dist). This method performs an approximation of the product of the dist and logp with Stochastic Variational message passing (SVMP-CVI) (See Probabilistic programming with stochastic variational message passing).

Note

ProdCVI is deprecated in favor of CVIProjection.

Arguments

  • rng: random number generator
  • n_samples: number of samples to use for statistics approximation
  • n_iterations: number of iteration for the natural parameters gradient optimization
  • opt: optimizer, which will be used to perform the natural parameters gradient optimization step
  • grad: optional, defaults to ForwardDiffGrad(), structure to select how the gradient and the hessian will be computed
  • n_gradpoints: optional, defaults to 1, number of points to estimate gradient of the likelihood (dist*logp)
  • enforce_proper_messages: optional, defaults to true, ensures that a message, computed towards the inbound edges, is a proper distribution, must be of type Val(true)/Val(false)
  • warn: optional, defaults to true, enables or disables warnings related to the optimization steps
Note

n_gradpoints option is ignored in the Gaussian case

Note

Adding the Optimisers.jl in your Julia environment enables additional optimizers from the Optimisers.jl for the CVI approximation method. Adding the DiffResults in your Julia environment enables faster gradient computations in case if all inputs are of the Gaussian type.

source
ReactiveMP.ForwardDiffGradType
ForwardDiffGrad(chunk_size::Int)

The auto-differentiation backend for the CVI procedure. Uses the ForwardDiff library to compute gradients/derivatives. If chunk_size is not specified then uses the heuristic from ForwardDiff, which is type-unstable.

Note

The ForwardDiff.jl must be added to the current Julia environment.

source
ReactiveMP.CVIProjectionType
CVIProjection(; parameters...)

A structure representing the parameters for the Conjugate Variational Inference (CVI) projection method. This structure is a subtype of AbstractApproximationMethod and is used to configure the settings for CVI.

CVI approximates the posterior distribution by projecting it onto a family of distributions with a conjugate form.

Requirements

The CVIProjection method requires the ExponentialFamilyProjection package to be installed and loaded in the current environment with using ExponentialFamilyProjection.

Parameters

  • rng::R: The random number generator used for sampling. Default is Random.MersenneTwister(42).
  • outsamples::S: The number of samples used for approximating output message distributions. Default is 100.
  • out_prjparams::OF: The form parameter used to specify the target distribution family for the output message. If nothing (default), the form will be inferred from the marginal form.
  • in_prjparams::IFS: A NamedTuple-like object that specifies the target distribution family for each input edge. Keys should be of the form :in_k where k is the input edge index. If nothing (default), the forms will be inferred from the incoming messages.
  • proposal_distribution::PD: The proposal distribution used for generating samples. If not provided or set to nothing, it will be inferred from incoming messages and automatically updated during iterations.
  • sampling_strategy::SS: The strategy for approximating the logpdf:
    • FullSampling(n): Uses n samples drawn from distributions (default: n=10). Provides more accurate approximation at the cost of increased computation time.
    • MeanBased(): Uses only the mean of each distribution as a single sample. Significantly faster but less accurate for non-linear nodes or complex distributions.

Examples

# Standard CVI projection with default settings
method = CVIProjection()

# Fast approximation using mean-based sampling
method = CVIProjection(sampling_strategy = MeanBased())

# Custom proposal with increased sample count
using Distributions
proposal = FactorizedJoint((NormalMeanVariance(0.0, 1.0), NormalMeanVariance(0.0, 1.0)))
method = CVIProjection(
    proposal_distribution = ProposalDistributionContainer(proposal),
    sampling_strategy = FullSampling(1000)
)

# Specify projection family for the output message
method = CVIProjection(out_prjparams = ProjectedTo(NormalMeanPrecision))

# Specify projection family for input edges
method = CVIProjection(in_prjparams = (in_1 = ProjectedTo(NormalMeanVariance), in_2 = ProjectedTo(GammaShapeRate)))
Note

The CVIProjection method is an enhanced version of the deprecated CVI, offering better stability and improved accuracy. Parameters and defaults may change as the implementation evolves.

source
ReactiveMP.CVISamplingStrategyType
CVISamplingStrategy

An abstract type representing the sampling strategy for the CVI projection method. Concrete subtypes implement different approaches for generating samples used in approximating distributions.

source
ReactiveMP.FullSamplingType
FullSampling <: CVISamplingStrategy
FullSampling(samples::Int = 10)

A sampling strategy that uses multiple samples drawn from distributions.

Arguments

  • samples::Int: The number of samples to draw from each distribution. Default is 10.

Example

# Use 100 samples for more accurate approximation
strategy = FullSampling(100)
source
ReactiveMP.MeanBasedType
MeanBased <: CVISamplingStrategy

A sampling strategy that uses only the mean of the proposal distribution as a single sample.

source
ReactiveMP.ProposalDistributionContainerType
ProposalDistributionContainer{PD}

A mutable wrapper for proposal distributions used in the CVI projection method.

The container allows the proposal distribution to be updated during inference without recreating the entire approximation method structure.

Fields

  • distribution::PD: The wrapped proposal distribution, can be of any compatible type.
source
ReactiveMP.cvi_setup!Function
cvi_setup!(opt, λ)

Initialises the given optimiser for the CVI procedure given the structure of λ. Returns a tuple of the optimiser and the optimiser state.

source
ReactiveMP.cvi_update!Function
cvi_update!(tuple_of_opt_and_state, new_λ, λ, ∇)

Uses the optimiser, its state and the gradient ∇ to change the trainable parameters in the λ. Modifies the optimiser state and and store the output in the newλ. Returns a tuple of the optimiser and the newλ.

source