Mutation Testing
Automated mutation testing with cargo-mutants validates test suite effectiveness: it measures whether your tests actually catch bugs, not just whether they run.
Reference
| Field | Value |
|---|---|
| Workflow | .github/workflows/mutation-testing.yml |
| Tool | cargo-mutants |
| Triggers | PR (on crates/ or tests/ changes), manual dispatch |
| Goal | Detect weak or missing tests |
How mutation testing works
cargo-mutants modifies your code (introduces “mutants”) and runs the test suite against each change:
- Generate mutants: modify the code systematically (e.g. `+` to `-`, `>` to `<`).
- Run tests: execute the suite against each mutant.
- Score: the percentage of mutants caught by tests is the test-quality score.
Good tests catch mutants; a surviving (missed) mutant marks a test gap.
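As a rough illustration of the generate, run, and score loop, the sketch below uses two hand-written functions as stand-ins for generated mutants. cargo-mutants itself rewrites the source and reruns the real test suite, which this does not reproduce; `mutant_sub`, `mutant_mul`, `suite_passes`, and `run_mutants` are hypothetical names chosen for the example.

```rust
/// Code under test: the original, unmutated function.
fn add(a: i32, b: i32) -> i32 {
    a + b
}

/// Hand-written stand-in for a mutant: `+` replaced by `-`.
fn mutant_sub(a: i32, b: i32) -> i32 {
    a - b
}

/// Hand-written stand-in for a mutant: `+` replaced by `*`.
fn mutant_mul(a: i32, b: i32) -> i32 {
    a * b
}

/// A tiny "test suite": returns true when its single assertion holds for `f`.
fn suite_passes(f: fn(i32, i32) -> i32) -> bool {
    f(2, 2) == 4
}

/// Runs the suite against each mutant and returns (caught, total).
/// A mutant counts as caught when the suite fails on it.
fn run_mutants(mutants: &[fn(i32, i32) -> i32]) -> (usize, usize) {
    let caught = mutants.iter().filter(|m| !suite_passes(**m)).count();
    (caught, mutants.len())
}

#[test]
fn one_of_two_mutants_is_caught() {
    let mutants: [fn(i32, i32) -> i32; 2] = [mutant_sub, mutant_mul];
    assert!(suite_passes(add));
    assert_eq!(run_mutants(&mutants), (1, 2));
}
```

With this suite, `mutant_sub` is caught (2 - 2 is 0) while `mutant_mul` survives (2 * 2 is 4), which is exactly the kind of gap a missed mutant points at.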
Mutation types
| Category | Example |
|---|---|
| Binary operators | + → -, * → /, && → \|\| |
| Comparison operators | > → <, == → != |
| Return values | Return a default instead of the computed value |
| Function bodies | Replace with a default/empty implementation |
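To make the return-value category concrete, here is a minimal sketch in which the body of a function is swapped for a constant. The function `compute` (assumed here to square its input), the two constant-returning mutants, and both checks are hypothetical illustrations, not output from cargo-mutants.

```rust
/// Hypothetical function under test: squares its input.
fn compute(n: i32) -> i32 {
    n * n
}

/// Hand-written return-value mutant: body replaced by the constant 0.
fn returns_zero(_n: i32) -> i32 {
    0
}

/// Hand-written return-value mutant: body replaced by the constant 1.
fn returns_one(_n: i32) -> i32 {
    1
}

/// Weak check: only asserts that the result is positive.
fn weak_check(f: fn(i32) -> i32) -> bool {
    f(5) > 0
}

/// Strong check: asserts the exact value.
fn strong_check(f: fn(i32) -> i32) -> bool {
    f(5) == 25
}

#[test]
fn exact_assertion_kills_constant_mutants() {
    assert!(weak_check(compute) && strong_check(compute));
    assert!(weak_check(returns_one)); // survives the weak check
    assert!(!strong_check(returns_zero));
    assert!(!strong_check(returns_one));
}
```

The weak check rejects `returns_zero` but lets `returns_one` through; asserting the exact value rejects both.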
CI behavior
The workflow runs automatically on PRs when crates/, tests/, Cargo.toml, or Cargo.lock change. It uploads the mutation-test-report artifact, available via Actions → Artifacts → mutation-test-report.
Summary output
```text
Total mutants: 50
Caught: 45
Missed: 5
Timeout: 0
Score: 90%
```

- Total: mutations generated.
- Caught: mutations detected by tests (good).
- Missed: mutations not caught (test gaps).
- Timeout: mutations whose test run exceeded the time limit (for example, by introducing an infinite loop).
- Score: (caught / total) * 100.
Target: ≥80% mutation score.
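The score formula and the 80% target can be sketched as follows. The function names are hypothetical, and returning `None` for a run with zero mutants is an assumption of this sketch rather than documented workflow behavior.

```rust
/// Mutation score as a percentage: (caught / total) * 100.
/// Returns None when no mutants were generated, to avoid dividing by zero.
fn mutation_score(caught: u32, total: u32) -> Option<f64> {
    if total == 0 {
        None
    } else {
        Some(f64::from(caught) * 100.0 / f64::from(total))
    }
}

/// True when the score meets the 80% target.
fn meets_target(caught: u32, total: u32) -> bool {
    mutation_score(caught, total).map_or(false, |score| score >= 80.0)
}

#[test]
fn summary_example_meets_target() {
    // 45 caught out of 50, as in the summary output.
    assert!(meets_target(45, 50));
    assert!(!meets_target(39, 50));
}
```

For the summary shown earlier (45 caught out of 50), `mutation_score` yields 90, which clears the target.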
Missed mutant report
```text
Function: calculate_total
File: crates/lib.rs:42
Mutation: Changed + to -
Status: MISSED
```

This mutant survived testing, which indicates a missing or weak test for calculate_total.
Score interpretation
| Score | Quality |
|---|---|
| < 50% | Critical test gaps |
| 50% to < 80% | Needs improvement |
| 80% to < 95% | Good coverage |
| ≥ 95% | Excellent coverage |
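Because the bands share boundaries, the order in which thresholds are checked matters. A minimal sketch, assuming the highest matching band wins (the `Quality` enum and `classify` function are hypothetical names):

```rust
/// Quality bands from the score table.
#[derive(Debug, PartialEq)]
enum Quality {
    Critical,
    NeedsImprovement,
    Good,
    Excellent,
}

/// Maps a mutation score (a percentage) to its band,
/// checking from the highest threshold down.
fn classify(score: f64) -> Quality {
    if score >= 95.0 {
        Quality::Excellent
    } else if score >= 80.0 {
        Quality::Good
    } else if score >= 50.0 {
        Quality::NeedsImprovement
    } else {
        Quality::Critical
    }
}

#[test]
fn boundaries_fall_into_the_higher_band() {
    assert_eq!(classify(80.0), Quality::Good);
    assert_eq!(classify(95.0), Quality::Excellent);
}
```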
Why mutants survive
- Missing tests: the function is not tested at all.
- Weak assertions: tests execute the code but don’t verify its results.
- Dead code: code that never executes (remove it).
- Equivalent mutants: the mutation doesn’t change observable behavior, so no test can catch it (rare).
How-to
Run mutation tests locally

```bash
# Install cargo-mutants
cargo install cargo-mutants

# Run mutation tests
cargo mutants

# Test a specific file
cargo mutants --file crates/lib.rs

# Limit execution time
cargo mutants --timeout 300
```
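```bash
# List the generated mutants as JSON (--json applies to --list)
cargo mutants --list --json
```

A normal run also writes its results as JSON under mutants.out/ (for example, mutants.out/outcomes.json).

Verify: the run ends with a summary line counting caught and missed mutants.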
Close a missed mutant
A weak test lets a mutant survive. Given:

```rust
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}
```

this test catches `+` becoming `-` (2 - 2 = 0) but still passes when `+` becomes `*`, because 2 * 2 and 2 + 2 are both 4:

```rust
#[test]
fn test_add() {
    assert_eq!(add(2, 2), 4);
}
```

Add cases that distinguish the operators:

```rust
#[test]
fn test_add() {
    assert_eq!(add(2, 3), 5); // Fails if + became * (6) or - (-1)
    assert_eq!(add(0, 5), 5);
    assert_eq!(add(-1, 1), 0);
}
```

Cover the recurring gap categories:
```rust
// Catch comparison mutations with boundary values
#[test]
fn test_bounds() {
    assert!(is_valid(0));    // boundary
    assert!(is_valid(100));  // boundary
    assert!(!is_valid(101)); // just outside
}

// Catch missing error-path tests
#[test]
fn test_error() {
    assert!(parse("").is_err());
    assert!(parse("invalid").is_err());
}

// Catch return-value mutations: assert the value, not just success
#[test]
fn test_compute() {
    assert_eq!(compute(5), 25); // Not just assert!(compute(5) > 0)
}
```

Verify: re-run cargo mutants --file <file> and confirm the mutant is now caught.
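To see why boundary values matter for comparison mutations, consider the sketch below. The body of `is_valid` (accepting the inclusive range 0 to 100) is an assumption made for illustration, as are the mutant and the two suite functions.

```rust
/// Hypothetical validator: accepts values in the inclusive range 0..=100.
fn is_valid(n: i32) -> bool {
    n >= 0 && n <= 100
}

/// Hand-written mutant: `<=` weakened to `<`, so 100 is wrongly rejected.
fn is_valid_mutant(n: i32) -> bool {
    n >= 0 && n < 100
}

/// Mid-range checks only: cannot tell the original and the mutant apart.
fn midrange_suite(f: fn(i32) -> bool) -> bool {
    f(50) && !f(500)
}

/// Boundary checks: both edges plus the value just outside.
fn boundary_suite(f: fn(i32) -> bool) -> bool {
    f(0) && f(100) && !f(101)
}

#[test]
fn only_boundary_checks_catch_the_mutant() {
    assert!(midrange_suite(is_valid_mutant)); // mutant survives
    assert!(boundary_suite(is_valid));
    assert!(!boundary_suite(is_valid_mutant)); // mutant caught at n == 100
}
```

The mutant differs from the original only at n == 100, so a suite that never probes that edge cannot catch it.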
Configure cargo-mutants
Exclude files in .cargo/mutants.toml:

```toml
exclude_globs = [
    "crates/generated.rs",
    "tests/fixtures/*.rs",
]
```

Set a per-mutant timeout in the workflow:
```bash
cargo mutants --timeout 300  # 5 minutes per mutant
```

Target specific functions:
```bash
# --re is matched against mutant names as printed by `cargo mutants --list`
cargo mutants --file crates/lib.rs --re "calculate"
```

Verify: cargo mutants runs only over the configured scope.
Skip equivalent mutants
Some mutants don’t change behavior and will always survive:

```rust
// These two functions are equivalent: no test can tell them apart
fn example() -> bool { true }
fn example_equivalent() -> bool { return true; }
```

Skip a function the analyzer can’t reason about:

```rust
// Requires the `mutants` crate as a dependency
#[mutants::skip] // Skip entire function
pub fn generated_code() -> i32 {
    42
}
```

Verify: the skipped function no longer appears in the mutant list.
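A more typical equivalent mutant comes from an operator change that happens not to alter any result. The sketch below is a hand-written illustration (not a claim about which mutants cargo-mutants generates): replacing `>` with `>=` in a max function changes which branch runs when the inputs are equal, but both branches then return the same value.

```rust
/// Original: returns the larger of two values.
fn max_of(a: i32, b: i32) -> i32 {
    if a > b { a } else { b }
}

/// Hand-written mutant: `>` replaced by `>=`.
/// When a == b both branches return the same value, so behavior is unchanged.
fn max_of_mutant(a: i32, b: i32) -> i32 {
    if a >= b { a } else { b }
}

/// Exhaustively compares the two functions over a small grid of inputs.
fn behaves_identically() -> bool {
    (-20..=20).all(|a| (-20..=20).all(|b| max_of(a, b) == max_of_mutant(a, b)))
}

#[test]
fn mutant_is_equivalent() {
    assert!(behaves_identically());
}
```

No assertion on the return value can separate the two, so a survivor like this is a candidate for skipping rather than for a new test.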
Troubleshooting
Too slow: run mutants in parallel, or limit the run to the files you changed:

```bash
cargo mutants --jobs 4
cargo mutants --file crates/changed_file.rs
```

Timeouts: raise the per-mutant limit:

```bash
cargo mutants --timeout 600
```

False positives: these are equivalent mutants; accept them or annotate the function with #[mutants::skip].
Best practices
- Run locally before pushing to catch gaps early.
- Focus on critical paths first (public API, core logic).
- Don’t chase 100% — diminishing returns above 90%.
- Use with coverage — mutation testing complements line coverage.
- Fix incrementally — one missed mutant at a time.
Why this matters
Line coverage tells you a line ran; it cannot tell you whether a test would notice if that line were wrong. Mutation testing closes exactly that blind spot: by deliberately breaking the code and checking whether any test fails, it distinguishes tests that verify behavior from tests that merely execute it. A high coverage number paired with a low mutation score is the signature of tautological tests, and surfacing that gap on PRs (where crates/ or tests/ changed) keeps test quality from quietly decaying as the codebase grows.