Design

User Interface

From a user's perspective, Bayesian inference requires a log joint density function ldg, which may or may not depend on some data. MCBayes further requires the dimension of the input to the log density gradient. Consider the log joint density function of the Gaussian distribution's density function.

function ldg(q; kwargs...)
    return -q' * q / 2, -q
end
dims = 10
q = randn(dims)
ldg(q)

The function ldg takes in a vector of length dims, and any number of keyword arguments, and returns a 2-tuple consisting of a scalar and a vector of the same length as the input. The returned scalar is the evaluation of ldg at the input q. The returned vector is the gradient of ldg evaluated at q.

To obtain samples from the distribution corresponding to the log joint density function above, using the MCBayes implementation of Stan's dynamic Hamiltonian Monte Carlo algorithm, instantiate a Stan sampler object and pass it to the MCBayes function sample!(...).

stan = Stan(dims)
draws, diagnostics, rngs = sample!(stan, ldg)

The output of sample!(...) is a 3-tuple. The first element is a 3-dimensional array of draws (iterations x dimension x chains) from the distribution corresponding to the log joint density function ldg. The second element is a tuple of diagnostics, which are meant to help the user evaluate the success (or lack thereof) of the sampling algorithm. The third element is the random number generator seeds at the point that the sampling algorithm terminated.

The sampler object could be MEADS(dims), for an implentation of Tuning-Free Generalized Hamiltonian Monte Carlo, or MH(dims) for an implementation of the Metropolis algorithm which adaptatively tunes the metric for the proposal distribution and the stepsize.

Internal API

Almost the entire structure of all sampling algorithms of MCBayes fit into this one function. For now, focus on only the arguments sampler and ldg, and the verbs/functions. Try to ignore the adapters for just a moment.

function run_sampler!(sampler::AbstractSampler, ldg;
                      iterations = 1000,
                      warmup = iterations,
                      stepsize_adapter = StepsizeConstant(...),
                      metric_adapter = MetricConstant(...),
                      trajectorylength_adapter = TrajectorlengthConstant(...),
                      damping_adapter = DampingConstant(...),
                      noise_adapter = NoiseConstant(...),
                      drift_adapter = DriftConstant(...),
                      adaptation_schedule = WindowedAdaptationSchedule(warmup),
                      ...
                      )
    M = iterations + warmup
    draws = preallocate_draws(M, ...)
    diagnostics = preallocate_diagnostics(...)

    initialize_sampler!(sampler, ...)
    initialize_draws!(draws, ldg, ...)
    initialize_stepsize!(sampler, ldg, draws, ...)

    for m in 1:M
        transition!(sampler, m, ldg, draws, ...)
        adapt!(sampler, adaptation_schedule, m, ldg, draws,
               stepsize_adapter,
               metric_adapter,
               trajectorylength_adapter,
               damping_adapter,
               noise_adapter,
               drift_adapter,
               ...
               )
    end

    return draws, diagnostics, rngs
end

MCBayes makes heavy use of the multiple dispatch capabilities of Julia. So each sampler gets its own transition!(...) function. All samplers, so far, share the one adapt!(...) method, since the internals of adapt!(...) dispatch on appropriate adatpers.

You may have noticed that the User Interface example calls sample!(...), not run_sampler!(...). In fact, each sampler object, e.g. Stan, MEADS, MH, gets its own sample!(...) function which specifies the adaptation components appropriate to that sampler and then immediately calls run_sampler!(...).

For instance, a default run of Stan uses Dual Averaging to adapt the stepsize during the warmup iterations. So the function sample!(...) appropriate to Stan has signature something like

function sample!(sampler::Stan, ldg;
                 iterations = 1000,
                 warmup = iterations,
                 stepsize_adapter = StepsizeDualAverage(sampler.stepsize),
                 metric_adapter = MetricOnlineMoments(sampler.metric),
                 adaptation_schedule = WindowedAdaptationSchedule(warmup),
                 ...)
    return run_sampler!(sampler,
                        ldg;
                        iterations,
                        warmup,
                        stepsize_adapter,
                        metric_adapter,
                        adaptation_schedule,
                        kwargs...)
end

The function sample!(sampler::Stan, ...) passes the adaptation components specific to Stan to run_sampler!(...), thus over-riding the adapters run_sampler!(...) otherwise defaults to. Now is a good time to go look back at the default adapters of run_sampler!(...).

All *Constant(...) adapters are effectively a no-op within adapt!(...). The constant adapters are thus ignored when not appropriate for a particular sampler.

Consider the adapter TrajectorylengthConstant(...). This has zero effect on the Stan sampler, since Stan's trajectory length is dynamically determined from a modern implementation of NUTS (see Michael Betancourt's A Conceptual Introduction to Hamiltonian Monte Carlo). The dynamic trajectory length is computed within transition!(sampler::Stan, ...), namely the transition!(...) function appropriate to the Stan sampler.

The method adapt!(...) itself dispatches to particular adapt!(...) functions, dependent on the adaptation_schedule. Right now, only Stan's windowed adaptation schedule is implemented. The structure of adapt!(sampler, schedule::WindowedAdaptationSchedule, ...) looks something like

function adapt!(sampler, schedule::WindowedAdaptationSchedule,
                m, ldg, draws,
                stepsize_adapter,
                metric_adapter,
                trajectorylength_adapter,
                damping_adapter,
                noise_adapter,
                drift_adapter,
                ...)
    # within warmup
    if m <= sampler.warmup
        # update and set stepsize
        update!(stepsize_adapter, ...)
        set!(sampler, stepsize_adapter, ...)

        # update and set trajectory length
        update!(trajectorylength_adapter; kwargs...)
        set!(sampler, trajectorylength_adapter, ...)

        # in appropriate windows, update (without setting) metric
        if schedule.firstwindow <= m <= schedule.lastwindow
            update!(metric_adapter, ...)
        end

        # at end of window
        if m == schedule.closewindow
            # re-initialize, set, and then reset stepsize
            initialize_stepsize!(stepsize_adapter, ...)
            set!(sampler, stepsize_adapter, ...)
            reset!(stepsize_adapter, ...)

            # set and then reset metric
            set!(sampler, metric_adapter, ...)
            reset!(metric_adapter)

            calculate_nextwindow!(schedule)
        end
    else
        set!(sampler, stepsize_adapter, ...)
    end
end

When a Stan sampler is passed into the adapt!(...) function above, the adapter within the variable trajectorylength_adapter has type TrajectorylengthConstant. The method update!(...) dispatches to the function appropriate to this type and results in a no-op.