Mutation Testing

Automated mutation testing to validate test suite effectiveness using cargo-mutants — it measures whether your tests actually catch bugs, not just whether they run.

Reference

Field	Value
Workflow	`.github/workflows/mutation-testing.yml`
Tool	`cargo-mutants`
Triggers	PR (on `crates/`/`tests/` changes), manual dispatch
Goal	Detect weak or missing tests

How mutation testing works

cargo-mutants modifies your code (introduces “mutants”) and runs the test suite against each change:

Generate mutants — modify code systematically (e.g. + to -, > to <).
Run tests — execute the suite against each mutant.
Score — the percentage of mutants caught by tests is the test-quality score.

Good tests catch mutants; a surviving (missed) mutant marks a test gap.

Mutation types

Category	Example
Binary operators	`+` → `-`, `*` → `/`, `&&` → `\|\|`
Comparison operators	`>` → `<`, `==` → `!=`
Return values	Return a default instead of the computed value
Function bodies	Replace with a default/empty implementation

CI behavior

The workflow runs automatically on PRs when crates/, tests/, Cargo.toml, or Cargo.lock change. It uploads the mutation-test-report artifact, available via Actions → Artifacts → mutation-test-report.

Summary output

Total mutants: 50
Caught: 45
Missed: 5
Timeout: 0
Score: 90%

Total: mutations generated.
Caught: mutations detected by tests (good).
Missed: mutations not caught (test gaps).
Timeout: mutations causing infinite loops.
Score: (caught / total) * 100.

Target: ≥80% mutation score.

Missed mutant report

Function: calculate_total
File: crates/lib.rs:42
Mutation: Changed + to -
Status: MISSED

This mutant survived testing, indicating missing test coverage.

Score interpretation

Score	Quality
`< 50%`	Critical test gaps
`50-80%`	Needs improvement
`≥80%`	Good coverage
`≥95%`	Excellent coverage

Why mutants survive

Missing tests — the function is not tested at all.
Weak assertions — tests don’t verify actual behavior.
Dead code — code that never executes (remove it).
Equivalent mutants — the mutation doesn’t change behavior (rare).

How-to

Run mutation tests locally

# Install cargo-mutants
cargo install cargo-mutants

# Run mutation tests
cargo mutants

# Test a specific file
cargo mutants --file crates/lib.rs

# Limit execution time
cargo mutants --timeout 300

# Generate JSON output
cargo mutants --output mutants.out --json

Verify: the run prints a Score: line.

Close a missed mutant

A weak test lets a mutant survive. Given:

pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

this test passes even when + becomes - (2 - 2 = 0, but the assertion only checks add(2, 2)):

#[test]
fn test_add() {
    assert_eq!(add(2, 2), 4);
}

Add cases that distinguish the operators:

#[test]
fn test_add() {
    assert_eq!(add(2, 3), 5);  // Would fail if + became -
    assert_eq!(add(0, 5), 5);
    assert_eq!(add(-1, 1), 0);
}

Cover the recurring gap categories:

// Catch comparison mutations with boundary values
#[test]
fn test_bounds() {
    assert!(is_valid(0));    // boundary
    assert!(is_valid(100));  // boundary
    assert!(!is_valid(101)); // just outside
}

// Catch missing error-path tests
#[test]
fn test_error() {
    assert!(parse("").is_err());
    assert!(parse("invalid").is_err());
}

// Catch return-value mutations — assert the value, not just success
#[test]
fn test_compute() {
    assert_eq!(compute(5), 25);  // Not just assert!(compute(5) > 0)
}

Verify: re-run cargo mutants --file <file> and confirm the mutant is now caught.

Configure cargo-mutants

Exclude files in .cargo-mutants.toml:

[mutants]
exclude_files = [
    "crates/generated.rs",
    "tests/fixtures/*.rs"
]

Set a per-mutant timeout in the workflow:

cargo mutants --timeout 300  # 5 minutes per mutant

Target specific functions:

cargo mutants --file crates/lib.rs --re "fn calculate"

Verify: cargo mutants runs only over the configured scope.

Skip equivalent mutants

Some mutants don’t change behavior and will always survive:

// These are equivalent
fn example() -> bool { true }
fn example() -> bool { return true; }

Skip a function the analyzer can’t reason about:

#[mutants::skip]  // Skip entire function
pub fn generated_code() -> i32 {
    42
}

Verify: the skipped function no longer appears in the mutant list.

Troubleshooting

Too slow:

cargo mutants --jobs 4
cargo mutants --file crates/changed_file.rs

Timeouts:

cargo mutants --timeout 600

False positives — equivalent mutants; accept them or annotate with #[mutants::skip].

Best practices

Run locally before pushing to catch gaps early.
Focus on critical paths first (public API, core logic).
Don’t chase 100% — diminishing returns above 90%.
Use with coverage — mutation testing complements line coverage.
Fix incrementally — one missed mutant at a time.

Why this matters

Line coverage tells you a line ran; it cannot tell you whether a test would notice if that line were wrong. Mutation testing closes exactly that blind spot — by deliberately breaking the code and checking whether any test fails, it distinguishes assertions that verify behavior from tests that merely execute it. A high coverage number paired with a low mutation score is the signature of tautological tests, and surfacing that gap on PRs (where crates/ or tests/ changed) keeps test quality from quietly decaying as the codebase grows.