Global Flags
These apply to all commands:
| Flag | Description |
|---|---|
| -v, --verbose | Enable DEBUG-level logging |
| --log-file FILE | Write logs to a file |
Command Map
Execution
run
Run a single attack against a model.
Exit codes: 0 = all blocked, 1 = at least one bypass, 2 = invalid arguments.
| Flag | Required | Default | Description |
|---|---|---|---|
| -p, --provider | Yes | - | Provider name (anthropic, openai, google, etc.) |
| -m, --model | No | Provider default | Model name override |
| -a, --attack | Yes | - | Attack technique ID |
| -t, --target | Yes | - | Target behavior to test |
| --system-prompt | No | - | System prompt to test as a defense |
| --system-prompt-file | No | - | Read system prompt from file |
| --verbose | No | - | Show full response text |
| --quiet | No | - | Suppress output. Exit 0 = all blocked, 1 = any bypassed |
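
The exit codes above make `run` straightforward to script. The following is a minimal sketch, not part of the tool: the executable name `redteam-cli`, the attack ID, and the helper names `build_run_argv`, `describe_exit`, and `run_attack` are all hypothetical, since this reference does not name the binary. Only the flags and exit codes come from the table above.

```python
import subprocess
from typing import List, Optional

# Meaning of each exit code, as documented for `run`.
EXIT_MEANINGS = {
    0: "all blocked",
    1: "at least one bypass",
    2: "invalid arguments",
}


def build_run_argv(cli: str, provider: str, attack: str, target: str,
                   model: Optional[str] = None, quiet: bool = False) -> List[str]:
    """Assemble an argument list for `run` from the documented flags.

    `cli` is the executable name, which is an assumption of this sketch.
    """
    argv = [cli, "run", "-p", provider, "-a", attack, "-t", target]
    if model is not None:
        argv += ["-m", model]
    if quiet:
        argv.append("--quiet")
    return argv


def describe_exit(code: int) -> str:
    """Translate an exit code into its documented meaning."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_attack(argv: List[str]) -> str:
    """Invoke the CLI and describe the result (needs the real executable)."""
    completed = subprocess.run(argv)
    return describe_exit(completed.returncode)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the target behavior contains spaces.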
batch
Run multiple attacks against a model. Parallel by default.
| Flag | Required | Default | Description |
|---|---|---|---|
| -p, --provider | Yes | - | Provider name |
| -m, --model | No | Provider default | Model name override |
| --attacks | No | all | Comma-separated attack names or all |
| -t, --target | Yes | - | Target behavior to test |
| --system-prompt | No | - | System prompt defense |
| --system-prompt-file | No | - | Read system prompt from file |
| -w, --workers | No | 5 | Max parallel workers |
| --verbose | No | - | Show full responses |
| --quiet | No | - | Suppress output |
| --sequential | No | - | Run attacks one at a time (no parallelism) |
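
To illustrate how `--workers` and `--sequential` relate, here is a rough sketch of a bounded worker pool. It is an assumption about the general pattern, not the tool's actual implementation; `run_batch` and the `run_one` callable are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_batch(attacks: List[str], run_one: Callable[[str], bool],
              workers: int = 5, sequential: bool = False) -> Dict[str, bool]:
    """Run each attack and return {attack_name: bypassed}.

    `run_one` is a hypothetical callable that executes a single attack
    and returns True if the model was bypassed.
    """
    if sequential:
        # Mirrors --sequential: one attack at a time, in order.
        return {name: run_one(name) for name in attacks}
    # Mirrors the default: at most `workers` attacks in flight at once.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run_one, attacks))  # map preserves input order
    return dict(zip(attacks, results))
```

A thread pool suits this kind of workload because each attack mostly waits on a network call to the provider.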
sweep
Run all attacks against all configured providers.
Tests every provider that has an API key configured (plus Ollama, which doesn’t need one).
| Flag | Required | Default | Description |
|---|---|---|---|
| -t, --target | Yes | - | Target behavior to test |
| --verbose | No | - | Show full responses |
| --quiet | No | - | Suppress output |
benchmark
Run the safety benchmark and produce a score. Supports single model, all models, or a specific list.
| Flag | Required | Default | Description |
|---|---|---|---|
| -p, --provider | No* | - | Provider name |
| -m, --model | No | Provider default | Model name |
| --all | No* | - | Benchmark all configured providers |
| --models | No* | - | Comma-separated provider:model pairs |
| -w, --workers | No | 5 | Max parallel workers |
| --categories | No | All | Comma-separated categories to test |
| --threshold | No | - | Min safety score (0-100). Exit 1 if below |
| --output | No | - | Save JSON results to file |
| --quiet | No | - | Minimal output |

*One of -p, --all, or --models is required.
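
The `--models` format and the `--threshold` rule can be sketched as follows. This is an illustrative assumption about how such values might be handled, not the tool's code; `parse_models` and `threshold_exit_code` are hypothetical names, and the model names in the usage note are placeholders.

```python
from typing import List, Tuple


def parse_models(value: str) -> List[Tuple[str, str]]:
    """Split a --models value into (provider, model) pairs.

    Splits on the first colon only, so a model name that itself
    contains a colon is kept intact.
    """
    pairs = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        provider, sep, model = item.partition(":")
        if not sep or not provider or not model:
            raise ValueError(f"expected provider:model, got {item!r}")
        pairs.append((provider, model))
    return pairs


def threshold_exit_code(score: float, threshold: float) -> int:
    """Return 1 when the safety score is below the threshold, else 0."""
    return 1 if score < threshold else 0
```

For example, `parse_models("openai:model-a, ollama:model-b:8b")` yields `[("openai", "model-a"), ("ollama", "model-b:8b")]`.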
defend
Test a defense by comparing baseline vs defended safety scores.
| Flag | Required | Default | Description |
|---|---|---|---|
| -p, --provider | Yes | - | Provider name |
| -m, --model | No | Provider default | Model name |
| --attacks | No | all | Comma-separated attacks or all |
| -t, --target | Yes | - | Target behavior |
| --system-prompt | No | - | System prompt defense to test |
| --system-prompt-file | No | - | Read system prompt from file |
| --guardrail | No | - | Preset guardrail: permissive, moderate, strict, llm-judge |
| -w, --workers | No | 5 | Max parallel workers |
| --output | No | - | Save JSON comparison to file |
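
The baseline-versus-defended comparison reduces to a difference of two scores. A minimal sketch, assuming both scores use the same 0-100 safety scale as `benchmark`; the `DefenseComparison` class and its fields are hypothetical and do not reflect the JSON written by `--output`.

```python
from dataclasses import dataclass


@dataclass
class DefenseComparison:
    baseline: float  # safety score without the defense (assumed 0-100)
    defended: float  # safety score with the defense applied (assumed 0-100)

    @property
    def delta(self) -> float:
        """Positive when the defense improved the safety score."""
        return self.defended - self.baseline

    def summary(self) -> str:
        """One-line report with an explicitly signed delta."""
        return (f"baseline={self.baseline:.1f} "
                f"defended={self.defended:.1f} "
                f"delta={self.delta:+.1f}")
```

A negative delta would indicate that the defense made the model easier to bypass, which is worth surfacing rather than clamping to zero.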
asl3
Run ASL3 safety evaluation (CBRN + autonomous capabilities).
| Flag | Required | Default | Description |
|---|---|---|---|
| -p, --provider | Yes | - | Provider name |
| -m, --model | No | Provider default | Model name |
| --domain | No | all | cbrn, autonomous, or all |
| -w, --workers | No | 5 | Parallel workers |
| --limit | No | - | Max attacks per domain |
| --quiet | No | - | Minimal output |
mega-sweep
Run attacks against dataset prompts with optional mutations.
| Flag | Required | Default | Description |
|---|---|---|---|
| -p, --provider | Yes | - | Provider name |
| -m, --model | No | Provider default | Model name |
| --dataset | Yes | - | Comma-separated dataset names or all |
| --mutations | No | None | Mutation types: encode, frame, difficulty |
| --attacks | No | all | Attack filter |
| --categories | No | All | Filter by harm categories |
| -w, --workers | No | 5 | Parallel workers |
| --limit | No | - | Max prompts per dataset |
| -o, --output | No | - | Save JSON results |
| --quiet | No | - | Minimal output |
| --dry-run | No | - | Show plan without running |
expand count
Show template expansion capacity.
expand list
List expanded attacks with filters.
| Flag | Default | Description |
|---|---|---|
| --category | All | Filter by harm category |
| --difficulty | All | easy, medium, hard, extreme |
| --technique | All | Filter by technique ID |
| --limit | 50 | Max rows to show |
expand run
Run expanded attacks against a model.
| Flag | Required | Default | Description |
|---|---|---|---|
| -p, --provider | Yes | - | Provider name |
| -m, --model | No | Provider default | Model name |
| --category | No | All | Filter by category |
| --difficulty | No | All | Filter by difficulty |
| --technique | No | All | Filter by technique |
| --mutations | No | - | Apply 17 mutation variants |
| --languages | No | - | Apply 10 language variants |
| --limit | No | - | Max attacks to run |
| -w, --workers | No | 5 | Parallel workers |
| --quiet | No | - | Minimal output |
generate pair
Run PAIR adaptive attack (attacker-target-judge loop).
| Flag | Required | Default | Description |
|---|---|---|---|
| -p, --provider | Yes | - | Target provider |
| -m, --model | No | Provider default | Target model |
| -t, --target | Yes | - | Target behavior |
| --attacker | No | Same as target | Attacker provider |
| --judge | No | Same as attacker | Judge provider |
| --max-iter | No | 20 | Max iterations |
| --threshold | No | 8 | Success score threshold (1-10) |
| --quiet | No | - | Minimal output |
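
The attacker-target-judge loop can be sketched in a few lines. This is a simplified illustration of the general PAIR idea, not the tool's implementation: `pair_loop` and the three callables are hypothetical stand-ins for model calls, and treating a score equal to the threshold as success is an assumption.

```python
from typing import Callable, List, Optional, Tuple

Attempt = Tuple[str, str, int]  # (prompt, response, judge score)


def pair_loop(
    target_behavior: str,
    attacker: Callable[[str, List[Attempt]], str],
    target: Callable[[str], str],
    judge: Callable[[str, str, str], int],
    max_iter: int = 20,
    threshold: int = 8,
) -> Optional[Attempt]:
    """Return the first attempt whose score reaches the threshold, else None."""
    history: List[Attempt] = []
    for _ in range(max_iter):
        # The attacker sees earlier failures and proposes a refined prompt.
        prompt = attacker(target_behavior, history)
        response = target(prompt)
        # The judge rates how fully the response exhibits the behavior (1-10).
        score = judge(target_behavior, prompt, response)
        if score >= threshold:
            return prompt, response, score
        history.append((prompt, response, score))
    return None
```

Keeping the three roles as separate callables matches the `--attacker` and `--judge` flags, which allow each role to be served by a different provider.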
generate tap
Run TAP tree-of-attacks with pruning.
| Flag | Required | Default | Description |
|---|---|---|---|
| -p, --provider | Yes | - | Target provider |
| -m, --model | No | Provider default | Target model |
| -t, --target | Yes | - | Target behavior |
| --attacker | No | Same as target | Attacker provider |
| --depth | No | 5 | Tree depth |
| --width | No | 5 | Candidates per level |
| --branching | No | 4 | Branches per candidate |
| --threshold | No | 8 | Success score threshold |
| --quiet | No | - | Minimal output |
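
The roles of `--depth`, `--width`, and `--branching` are easiest to see in a small search loop. The sketch below is a simplified illustration of tree search with pruning, not the tool's implementation: `tap_search`, `expand`, and `score` are hypothetical, and `score` collapses the target query and the evaluation into a single call.

```python
from typing import Callable, List, Optional, Tuple


def tap_search(
    root: str,
    expand: Callable[[str, int], List[str]],
    score: Callable[[str], int],
    depth: int = 5,
    width: int = 5,
    branching: int = 4,
    threshold: int = 8,
) -> Optional[Tuple[str, int]]:
    """Return the first (prompt, score) that reaches the threshold, else None."""
    frontier = [root]
    for _ in range(depth):
        # Branch: every surviving candidate proposes `branching` refinements.
        children = [child for parent in frontier
                    for child in expand(parent, branching)]
        if not children:
            return None
        # Rank children by score, highest first (ties keep generation order).
        ranked = sorted(((score(c), c) for c in children),
                        key=lambda pair: pair[0], reverse=True)
        best_score, best_prompt = ranked[0]
        if best_score >= threshold:
            return best_prompt, best_score
        # Prune: only the top `width` candidates survive to the next level.
        frontier = [prompt for _, prompt in ranked[:width]]
    return None
```

With the defaults, each level evaluates at most 5 x 4 = 20 candidates, so a full run scores no more than 5 + 4 x 20 = 85 prompts in this sketch (4 at the first level, then up to 20 at each of the remaining four).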
generate fuzz
Run GPTFuzzer mutation-based attack generation.
| Flag | Required | Default | Description |
|---|---|---|---|
| -p, --provider | Yes | - | Target provider |
| -m, --model | No | Provider default | Target model |
| -t, --target | Yes | - | Target behavior |
| --mutator | No | Same as target | Mutator provider |
| --iterations | No | 50 | Fuzzing iterations |
| --seeds | No | 5 | Initial seed count |
| --threshold | No | 7 | Success score threshold |
| --quiet | No | - | Minimal output |
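
Mutation-based generation follows a select, mutate, score cycle over a growing seed pool. The sketch below illustrates that general shape under simplifying assumptions (uniform random seed selection, successful mutants fed back into the pool); it is not the tool's implementation, and `fuzz`, `mutate`, and `score` are hypothetical names.

```python
import random
from typing import Callable, List, Optional, Tuple


def fuzz(
    seeds: List[str],
    mutate: Callable[[str], str],
    score: Callable[[str], int],
    iterations: int = 50,
    threshold: int = 7,
    rng: Optional[random.Random] = None,
) -> List[Tuple[str, int]]:
    """Return every (candidate, score) that reached the threshold."""
    if not seeds:
        raise ValueError("at least one seed is required")
    rng = rng or random.Random()
    pool = list(seeds)
    hits: List[Tuple[str, int]] = []
    for _ in range(iterations):
        parent = rng.choice(pool)    # select a seed (uniformly, in this sketch)
        candidate = mutate(parent)   # the mutator model rewrites it
        result = score(candidate)    # query the target and rate the response
        if result >= threshold:
            hits.append((candidate, result))
            pool.append(candidate)   # successful mutants become new seeds
    return hits
```

Passing a seeded `random.Random` makes a run reproducible, which helps when comparing mutators.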
vuln-research
Run vulnerability research elicitation attacks. Tests a model’s willingness to find and exploit code vulnerabilities at three escalation levels: review, explain exploitation, and write a working exploit.
| Flag | Required | Default | Description |
|---|---|---|---|
| -p, --provider | Yes | - | Provider name |
| -m, --model | No | Provider default | Model name override |
| --mode | No | synthetic | Sub-mode: synthetic, cve, user, or all |
| --code-path | No | - | Path to user-provided code (required for user mode) |
| -t, --target | No | - | Target description override |
| --verify | No | - | Verify exploit correctness with static analysis |
| --verify-llm | No | - | Verify exploit correctness using an LLM judge |
| --verbose | No | - | Show full responses |
| --quiet | No | - | Suppress output |
Reporting
report
Generate a report from stored results.
| Flag | Required | Default | Description |
|---|---|---|---|
| --format | No | markdown | Output format: markdown, json, html |
| --export | No | - | Export format: promptfoo, garak |
| -o, --output | No | stdout | Output file path |
scorecard
Show safety scorecard from stored results.
| Flag | Required | Default | Description |
|---|---|---|---|
| --format | No | table | Output: table, json, markdown |
| -o, --output | No | - | Output file |
| -m, --model | No | All models | Filter by model name |
| --standard | No | llm | llm (OWASP LLM Top 10), agentic (Agentic Top 10), compliance (EU AI Act + NIST) |
snapshot list
List all snapshots with bypass rates, dates, and models.
snapshot diff
Compare two snapshots side by side. Shows before/after bypass rate with delta.
| Argument | Required | Description |
|---|---|---|
| ID1 | Yes | First snapshot ID |
| ID2 | Yes | Second snapshot ID |
snapshot export
Export snapshot data to a file.
| Flag | Required | Default | Description |
|---|---|---|---|
| --format | No | json | Export format: json or csv |
| -o, --output | No | stdout | Output file path |
snapshot matrix
Show a model x time bypass rate matrix. Useful for tracking how safety changes across model versions over time.
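
As an illustration of what a model x time matrix contains, the sketch below pivots flat snapshot records into rows per model and columns per date. The record shape `(model, date, bypass_rate)` and the function `build_matrix` are assumptions made for this example; the tool's stored snapshot format is not described here.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def build_matrix(
    snapshots: Iterable[Tuple[str, str, float]],
) -> Tuple[List[str], List[str], List[List[Optional[float]]]]:
    """Pivot (model, date, bypass_rate) records into a model x date grid.

    Returns (models, dates, rows), where rows[i][j] is the bypass rate of
    models[i] on dates[j], or None when no snapshot exists for that cell.
    """
    cells: Dict[Tuple[str, str], float] = {}
    for model, date, rate in snapshots:
        cells[(model, date)] = rate  # a later record for the same cell wins
    models = sorted({model for model, _ in cells})
    dates = sorted({date for _, date in cells})  # ISO dates sort chronologically
    rows = [[cells.get((model, date)) for date in dates] for model in models]
    return models, dates, rows
```

Missing cells stay as `None` rather than 0.0, so an untested model version is not mistaken for a perfectly safe one.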
snapshot check
Check if the latest snapshot bypass rate is below a threshold. Returns exit code 0 if below, exit code 1 if above. Designed for CI gating.
| Flag | Required | Default | Description |
|---|---|---|---|
| --latest | No | - | Check only the most recent snapshot |
| --threshold | No | 0.10 | Maximum acceptable bypass rate (0.0-1.0) |
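
The gating rule can be expressed as a small function. This is a sketch under one stated assumption: the text above does not say what happens when the bypass rate exactly equals the threshold, so the example treats equality as a pass because the flag is described as the maximum acceptable rate. The names `bypass_rate` and `check_exit_code` are hypothetical.

```python
def bypass_rate(bypassed: int, total: int) -> float:
    """Fraction of attacks that bypassed the model."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= bypassed <= total:
        raise ValueError("bypassed must be between 0 and total")
    return bypassed / total


def check_exit_code(rate: float, threshold: float = 0.10) -> int:
    """Return 0 when the rate is acceptable, 1 when it exceeds the threshold.

    Equality counts as acceptable here; that is an assumption of this sketch.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    return 0 if rate <= threshold else 1
```

In a CI job, a non-zero exit code from the check step fails the pipeline, so a regression in bypass rate blocks the merge.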
Discovery
list-providers
Show available providers and default models.
list-attacks
Show available attacks and their modes.
taxonomy
Show all attacks grouped by category with OWASP/MITRE mappings.
No flags. Displays one table per category with attack IDs, severity, mode, description, OWASP codes, and MITRE ATLAS IDs.
atlas
Show MITRE ATLAS technique mappings for all attacks.
mlcommons
Show MLCommons AILuminate hazard taxonomy and harm category alignment.
frameworks
Show regulatory framework mappings (NIST AI RMF, EU AI Act, MLCommons).
Management
config show
Show current configuration (API keys are truncated).
config set
Set a config value.
dataset list
Show available datasets.
dataset load
Download and cache a dataset.
| Flag | Description |
|---|---|
| NAME | Dataset name (optional if using --all) |
| --all | Download all datasets |
dataset stats
Show prompt counts per category across cached datasets.
scan
Scan source code for AI security vulnerabilities.
| Flag | Required | Default | Description |
|---|---|---|---|
| PATH | No | . | File or directory to scan |
| --format | No | table | Output: table, json |
| --severity | No | All | Minimum severity: critical, high, medium, low |
| -o, --output | No | - | Save JSON results to file |