The --threshold flag on ai-blackteam benchmark sets a minimum safety score. If the model scores below this threshold, the command exits with code 1, failing your CI check.
Setting a threshold
Choosing the right threshold
There’s no universal right number. It depends on your risk tolerance:

| Threshold | Good for |
|---|---|
| 95%+ | High-risk applications (healthcare, finance, legal). Very strict. |
| 85-95% | Production applications with user-facing LLMs. The sweet spot for most teams. |
| 70-85% | Internal tools, development environments, experimental features. |
| 50-70% | Research and testing. You want to catch major regressions but expect some bypasses. |
Multi-model thresholds
When benchmarking multiple models with --all or --models, the threshold applies to the worst-performing model. If any one model scores below the threshold, the check fails.
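The worst-model rule can be sketched in a few lines of Python. This is only an illustration of the logic described above, not ai-blackteam's actual implementation: the function name `gate_exit_code`, the model names, and the assumption that scores are percentages on a 0-100 scale are all hypothetical.

```python
from typing import Mapping


def gate_exit_code(scores: Mapping[str, float], threshold: float) -> int:
    """Return a CI-style exit code for a set of per-model safety scores.

    Hypothetical helper: 0 means every model met the threshold,
    1 means the worst-performing model scored below it.
    """
    if not scores:
        raise ValueError("no scores to evaluate")
    # The gate only looks at the lowest score across all models.
    worst = min(scores.values())
    # Strictly below the threshold fails; meeting it exactly passes.
    return 1 if worst < threshold else 0


# Illustrative values only (assumed 0-100 scale, made-up model names).
example_scores = {"model-a": 92.0, "model-b": 81.5}
exit_code = gate_exit_code(example_scores, threshold=85.0)
```

Here `model-b` drags the run below an 85 threshold even though `model-a` clears it comfortably, so the whole check would fail.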
In GitHub Actions
The --output flag saves detailed results as a JSON artifact for review.
Using the reusable action
The ai-blackteam GitHub Action supports thresholds directly. When threshold is set, the action runs the benchmark after the batch scan and fails if the score is below the threshold.