This example has been auto-generated from the examples/ folder of the GitHub repository.

Assessing People’s Skills

# Activate local environment, see `Project.toml`
import Pkg; Pkg.activate(".."); Pkg.instantiate();

This demo shows how to use the @node and @rule macros, which let the user define custom factor nodes and their associated update rules, respectively. We introduce these macros in the context of a root cause analysis of a student's test results. The demo is inspired by Chapter 2 of "Model-Based Machine Learning" by Winn et al.

Problem Statement

We consider a student who takes a test that consists of three questions. Answering each question correctly requires a combination of skill and attitude: has the student studied for the test, and did they party the night before?

We model the result for question $i$ as a continuous variable $r_i\in[0,1]$, and skill/attitude as a binary variable $s_i \in \{0, 1\}$, where $s_1$ represents whether the student has partied, and $s_2$ and $s_3$ represent whether the student has studied the chapters for the corresponding questions.

We assume the following logic:

  • If the student is alert (has not partied), then they will score on the first question;
  • If the student is alert or has studied chapter two, then they will score on question two;
  • If the student can answer question two and has studied chapter three, then they will score on question three.

Generative Model Definition

To model the probability of correct answers, we introduce a latent state variable $t_i \in \{0,1\}$ that indicates whether the student is able to answer question $i$. The dependencies between the variables can then be modeled by the following Bayesian network:

(s_1)   (s_2)   (s_3)
  |       |       |
  v       v       v
(t_1)-->(t_2)-->(t_3)
  |       |       |
  v       v       v
(r_1)   (r_2)   (r_3)

As prior beliefs, we assume that a student is equally likely to study/party or not:
$s_i \sim \mathrm{Ber}(0.5)\,,$
for all $i$. Next, we model the domain logic as
$\begin{aligned} t_1 &= \lnot s_1\\ t_2 &= t_1 \lor s_2\\ t_3 &= t_2 \land s_3\,. \end{aligned}$
For the scoring results we might not have a specific forward model in mind. However, we can define a backward mapping, from the continuous results to the discrete latent variables, as
$t_i \sim \mathrm{Ber}(r_i)\,,$
for all $i$.
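To make the domain logic concrete, the following plain-Julia sketch enumerates all eight skill configurations and prints the resulting latent states. The helper name `latent_states` is hypothetical and not part of RxInfer; it only restates the three logic equations for $t_1$, $t_2$ and $t_3$.

```julia
# Hypothetical helper (not part of RxInfer): the deterministic domain logic
function latent_states(s1::Bool, s2::Bool, s3::Bool)
    t1 = !s1          # alert if the student has not partied
    t2 = t1 || s2     # can answer question two
    t3 = t2 && s3     # can answer question three
    return (t1, t2, t3)
end

# Enumerate all 2^3 skill configurations and show the implied latent states
for s1 in (false, true), s2 in (false, true), s3 in (false, true)
    println((s1, s2, s3), " => ", latent_states(s1, s2, s3))
end
```

Note that a student who has partied and has not studied chapter two cannot answer any question, regardless of chapter three.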

Custom Nodes and Rules

The backward mapping from results to latents is quite specific to our application. Moreover, it does not define a proper generative forward model. In order to still define a full generative model for our application, we can define a custom Score node and define an update rule that implements the backward mapping from scores to latents as a message.

In RxInfer, the @node macro defines a factor node. This macro accepts the new node type, an indicator for a stochastic or deterministic relationship, and a list of interfaces.

using RxInfer, Random

# Create Score node
struct Score end

@node Score Stochastic [out, in]

We can now define the backward mapping as a sum-product message through the @rule macro. This macro accepts the node type, the (outbound) interface on which the message is sent, any relevant constraints, and the message/distribution types on the remaining (inbound) interfaces.

# Adding update rule for the Score node
# The message towards `in` is a Bernoulli whose success probability is the observed score
@rule Score(:in, Marginalisation) (q_out::PointMass,) = begin
    return Bernoulli(mean(q_out))
end
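As a rough illustration of what this rule expresses, the sketch below evaluates the weight that a message of the form Bernoulli(r) assigns to each value of the latent state. The function `score_message` is a hypothetical stand-in written in plain Julia; it is not how RxInfer represents messages internally.

```julia
# Hypothetical helper (not RxInfer API): weight that a Bernoulli(r) message
# assigns to a latent state t, for an observed score r in [0, 1]
score_message(r::Real, t::Bool) = t ? r : 1 - r

low_score = 0.1
w_can    = score_message(low_score, true)    # weight for "can answer"
w_cannot = score_message(low_score, false)   # weight for "cannot answer"
println("can answer: ", w_can, ", cannot answer: ", w_cannot)
```

A low score therefore puts most of the weight on the latent state "cannot answer", which is the evidence that propagates back towards the skill variables.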

Generative Model Specification

We can now build the full generative model.

# GraphPPL.jl exports the `@model` macro for model specification
# It accepts a regular Julia function and builds a Forney-style factor graph (FFG) under the hood
@model function skill_model()
    s = randomvar(3)
    t = randomvar(3)
    r = datavar(Float64, 3)

    # Priors
    for i=1:3
        s[i] ~ Bernoulli(0.5)
    end

    # Domain logic
    t[1] ~ ¬s[1]
    t[2] ~ t[1] || s[2]
    t[3] ~ t[2] && s[3]
    
    # Results
    for i=1:3
        r[i] ~ Score(t[i])
    end
end

Inference Specification

Suppose the student scored very low on all three questions. We pass these scores to the model as data and run inference.

test_results = [0.1, 0.1, 0.1]
inference_result = infer(
    model = skill_model(),
    data  = (r = test_results, )
)
Inference results:
  Posteriors       | available for (s, t)

Results

# Inspect the results
map(params, inference_result.posteriors[:s])
3-element Vector{Tuple{Float64}}:
 (0.9872448979591837,)
 (0.06377551020408162,)
 (0.4719387755102041,)

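These results suggest that this particular student was very likely out on the town last night: the posterior probability of having partied is about 0.99, while the probability of having studied chapter two is only about 0.06. The low scores say little about chapter three, whose posterior (about 0.47) stays close to the 0.5 prior.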
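Because the model has only three binary skill variables, the reported posteriors can be cross-checked by brute-force enumeration without RxInfer. The sketch below assumes the uniform priors, the deterministic domain logic, and a Bernoulli(r[i]) backward message on every t[i] as described earlier; the function name `enumerate_posterior` is hypothetical.

```julia
# Brute-force check of the posteriors, independent of RxInfer.
# Assumptions: Bernoulli(0.5) priors on s, deterministic logic for t,
# and a backward message Bernoulli(r[i]) on every t[i].
function enumerate_posterior(r)
    total = 0.0
    marginals = zeros(3)    # unnormalised weights for s[i] = true
    for s1 in (false, true), s2 in (false, true), s3 in (false, true)
        t1 = !s1
        t2 = t1 || s2
        t3 = t2 && s3
        # The uniform prior cancels after normalisation,
        # so only the message weights remain
        w = prod(t ? ri : 1 - ri for (t, ri) in zip((t1, t2, t3), r))
        total += w
        marginals .+= w .* [s1, s2, s3]
    end
    return marginals ./ total
end

posterior = enumerate_posterior([0.1, 0.1, 0.1])
println(posterior)   # ≈ [0.9872, 0.0638, 0.4719]
```

The enumerated values agree with the posteriors returned by `infer`, which is expected here because the factor graph is a tree and message passing is exact.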