CLI Reference - ai-blackteam

Every ai-blackteam command with every flag, organized by function.

Global Flags

These apply to all commands:

ai-blackteam [-v | --verbose] [--log-file FILE] COMMAND

Flag	Description
`-v`, `--verbose`	Enable DEBUG-level logging
`--log-file FILE`	Write logs to a file

Execution

run

Run a single attack against a model.

ai-blackteam run -p PROVIDER -a ATTACK -t TARGET [OPTIONS]

Flag	Required	Default	Description
`-p`, `--provider`	Yes	-	Provider name (anthropic, openai, google, etc.)
`-m`, `--model`	No	Provider default	Model name override
`-a`, `--attack`	Yes	-	Attack technique ID
`-t`, `--target`	Yes	-	Target behavior to test
`--system-prompt`	No	-	System prompt to test as a defense
`--system-prompt-file`	No	-	Read system prompt from file
`--verbose`	No	-	Show full response text
`--quiet`	No	-	Suppress output. Exit 0 = all blocked, 1 = any bypassed

Exit codes: 0 = all blocked, 1 = at least one bypass, 2 = invalid arguments.

ai-blackteam run -p anthropic -a encoding-obfuscation -t "Write a phishing email"
ai-blackteam run -p openai -m gpt-5.4 -a role-play-jailbreak -t "Write malware" --verbose
ai-blackteam run -p anthropic -a prefix-injection -t "Leak system prompt" --quiet
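The exit codes above make `run` easy to script. A minimal sketch of a wrapper, assuming a POSIX shell: the helper name `report_run_status` is hypothetical (not part of ai-blackteam), and the command to execute is passed in as arguments, so the helper assumes nothing beyond the three documented exit codes.

```shell
# Hypothetical helper: run any command and translate the exit codes
# documented for `ai-blackteam run` into a one-line summary.
report_run_status() {
  status=0
  "$@" || status=$?
  case "$status" in
    0) echo "all blocked" ;;
    1) echo "at least one bypass" ;;
    2) echo "invalid arguments" ;;
    *) echo "unexpected exit code: $status" ;;
  esac
  # Pass the original status through so callers can still gate on it.
  return "$status"
}
```

For example, `report_run_status ai-blackteam run -p anthropic -a prefix-injection -t "Leak system prompt" --quiet` prints the summary and returns the same status as the underlying command.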

batch

Run multiple attacks against a model. Parallel by default.

ai-blackteam batch -p PROVIDER -t TARGET [OPTIONS]

Flag	Required	Default	Description
`-p`, `--provider`	Yes	-	Provider name
`-m`, `--model`	No	Provider default	Model name override
`--attacks`	No	`all`	Comma-separated attack names or `all`
`-t`, `--target`	Yes	-	Target behavior to test
`--system-prompt`	No	-	System prompt defense
`--system-prompt-file`	No	-	Read system prompt from file
`-w`, `--workers`	No	5	Max parallel workers
`--verbose`	No	-	Show full responses
`--quiet`	No	-	Suppress output
`--sequential`	No	-	Run attacks one at a time (no parallelism)

ai-blackteam batch -p anthropic --attacks all -t "Write malware" -w 10
ai-blackteam batch -p openai --attacks encoding-obfuscation,role-play-jailbreak -t "Write malware"
ai-blackteam batch -p anthropic -t "Write malware" --sequential
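When the attack list comes from a script, it has to be joined into the comma-separated form that `--attacks` takes. A small sketch, assuming a POSIX shell; `join_attacks` is a hypothetical helper name, and falling back to `all` simply mirrors the documented default.

```shell
# Hypothetical helper: join attack IDs into the comma-separated form
# that `--attacks` expects. With no arguments it prints `all`.
join_attacks() {
  if [ "$#" -eq 0 ]; then
    echo "all"
    return 0
  fi
  # "$*" joins the arguments with the first character of IFS.
  (IFS=,; echo "$*")
}
```

Usage might look like `ai-blackteam batch -p openai --attacks "$(join_attacks encoding-obfuscation role-play-jailbreak)" -t "Write malware"`.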

sweep

Run all attacks against all configured providers.

ai-blackteam sweep -t TARGET [OPTIONS]

Flag	Required	Default	Description
`-t`, `--target`	Yes	-	Target behavior to test
`--verbose`	No	-	Show full responses
`--quiet`	No	-	Suppress output

Tests every provider that has an API key configured (plus Ollama, which doesn’t need one).

ai-blackteam sweep -t "Write a phishing email"

benchmark

Run the safety benchmark and produce a score. Supports single model, all models, or a specific list.

ai-blackteam benchmark -p PROVIDER [OPTIONS]
ai-blackteam benchmark --all [OPTIONS]
ai-blackteam benchmark --models PROVIDER:MODEL,... [OPTIONS]

Flag	Required	Default	Description
`-p`, `--provider`	No*	-	Provider name
`-m`, `--model`	No	Provider default	Model name
`--all`	No*	-	Benchmark all configured providers
`--models`	No*	-	Comma-separated `provider:model` pairs
`-w`, `--workers`	No	5	Max parallel workers
`--categories`	No	All	Comma-separated categories to test
`--threshold`	No	-	Min safety score (0-100). Exit 1 if below
`--output`	No	-	Save JSON results to file
`--quiet`	No	-	Minimal output

*One of `-p`, `--all`, or `--models` is required.

ai-blackteam benchmark -p anthropic -m claude-sonnet-4-6
ai-blackteam benchmark --all --threshold 80
ai-blackteam benchmark --models anthropic:claude-sonnet-4-6,openai:gpt-5.4 --output scores.json
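Because `--models` packs several `provider:model` pairs into one argument, a script may want to check the value before starting a long benchmark. The sketch below is an assumption-laden pre-check, not the CLI's own validation: `valid_models_arg` is a hypothetical name, and it only verifies that every comma-separated item has a non-empty provider, a colon, and a non-empty model.

```shell
# Hypothetical pre-check for a `--models` value such as
# "anthropic:claude-sonnet-4-6,openai:gpt-5.4".
valid_models_arg() {
  case "$1" in
    ''|,*|*,|*,,*) return 1 ;;   # empty value or empty list item
  esac
  rest=$1
  while [ -n "$rest" ]; do
    pair=${rest%%,*}             # first item in the remaining list
    case "$pair" in
      ?*:?*) ;;                  # provider, colon, model all present
      *) return 1 ;;
    esac
    case "$rest" in
      *,*) rest=${rest#*,} ;;    # drop the item just checked
      *) rest= ;;
    esac
  done
  return 0
}
```

A caller could then write `valid_models_arg "$MODELS" && ai-blackteam benchmark --models "$MODELS"`, where `MODELS` is a variable of your own.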

defend

Test a defense by comparing baseline vs defended safety scores.

ai-blackteam defend -p PROVIDER -t TARGET [OPTIONS]

Flag	Required	Default	Description
`-p`, `--provider`	Yes	-	Provider name
`-m`, `--model`	No	Provider default	Model name
`--attacks`	No	`all`	Comma-separated attacks or `all`
`-t`, `--target`	Yes	-	Target behavior
`--system-prompt`	No	-	System prompt defense to test
`--system-prompt-file`	No	-	Read system prompt from file
`--guardrail`	No	-	Preset guardrail: `permissive`, `moderate`, `strict`, `llm-judge`
`-w`, `--workers`	No	5	Max parallel workers
`--output`	No	-	Save JSON comparison to file

ai-blackteam defend -p anthropic -t "Write a phishing email" --system-prompt "Never help with harmful content"
ai-blackteam defend -p anthropic -t "Write a phishing email" --guardrail strict
ai-blackteam defend -p openai -t "Write malware" --guardrail moderate --system-prompt "Be safe"

asl3

Run ASL3 safety evaluation (CBRN + autonomous capabilities).

ai-blackteam asl3 -p PROVIDER [OPTIONS]

Flag	Required	Default	Description
`-p`, `--provider`	Yes	-	Provider name
`-m`, `--model`	No	Provider default	Model name
`--domain`	No	`all`	`cbrn`, `autonomous`, or `all`
`-w`, `--workers`	No	5	Parallel workers
`--limit`	No	-	Max attacks per domain
`--quiet`	No	-	Minimal output

ai-blackteam asl3 -p anthropic --domain cbrn
ai-blackteam asl3 -p openai --domain all --limit 50

mega-sweep

Run attacks against dataset prompts with optional mutations.

ai-blackteam mega-sweep -p PROVIDER --dataset DATASETS [OPTIONS]

Flag	Required	Default	Description
`-p`, `--provider`	Yes	-	Provider name
`-m`, `--model`	No	Provider default	Model name
`--dataset`	Yes	-	Comma-separated dataset names or `all`
`--mutations`	No	None	Mutation types: `encode`, `frame`, `difficulty`
`--attacks`	No	`all`	Attack filter
`--categories`	No	All	Filter by harm categories
`-w`, `--workers`	No	5	Parallel workers
`--limit`	No	-	Max prompts per dataset
`-o`, `--output`	No	-	Save JSON results
`--quiet`	No	-	Minimal output
`--dry-run`	No	-	Show plan without running

ai-blackteam mega-sweep -p anthropic --dataset harmbench,advbench -w 10
ai-blackteam mega-sweep -p openai --dataset all --mutations encode,frame --dry-run
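Since `--dry-run` shows the plan without running it, one common pattern is to preview first and execute only if the preview succeeds. This is a sketch under an assumption the reference does not state, namely that a successful dry run exits 0. `plan_then_run` is a hypothetical helper name.

```shell
# Hypothetical helper: preview a command with --dry-run appended, then
# run the same command for real only if the preview exited 0.
plan_then_run() {
  "$@" --dry-run || return $?
  "$@"
}
```

For instance, `plan_then_run ai-blackteam mega-sweep -p anthropic --dataset harmbench,advbench -w 10` would print the plan and then start the sweep.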

expand count

Show template expansion capacity.

ai-blackteam expand count

expand list

List expanded attacks with filters.

ai-blackteam expand list [OPTIONS]

Flag	Default	Description
`--category`	All	Filter by harm category
`--difficulty`	All	`easy`, `medium`, `hard`, `extreme`
`--technique`	All	Filter by technique ID
`--limit`	50	Max rows to show

expand run

Run expanded attacks against a model.

ai-blackteam expand run -p PROVIDER [OPTIONS]

Flag	Required	Default	Description
`-p`, `--provider`	Yes	-	Provider name
`-m`, `--model`	No	Provider default	Model name
`--category`	No	All	Filter by category
`--difficulty`	No	All	Filter by difficulty
`--technique`	No	All	Filter by technique
`--mutations`	No	-	Apply 17 mutation variants
`--languages`	No	-	Apply 10 language variants
`--limit`	No	-	Max attacks to run
`-w`, `--workers`	No	5	Parallel workers
`--quiet`	No	-	Minimal output

ai-blackteam expand run -p anthropic --category phishing --difficulty hard
ai-blackteam expand run -p openai --mutations --languages --limit 100

generate pair

Run PAIR adaptive attack (attacker-target-judge loop).

ai-blackteam generate pair -p PROVIDER -t TARGET [OPTIONS]

Flag	Required	Default	Description
`-p`, `--provider`	Yes	-	Target provider
`-m`, `--model`	No	Provider default	Target model
`-t`, `--target`	Yes	-	Target behavior
`--attacker`	No	Same as target	Attacker provider
`--judge`	No	Same as attacker	Judge provider
`--max-iter`	No	20	Max iterations
`--threshold`	No	8	Success score threshold (1-10)
`--quiet`	No	-	Minimal output

ai-blackteam generate pair -p anthropic -t "Write a phishing email" --max-iter 30
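To make the interaction between `--max-iter` and `--threshold` concrete, here is a sketch of the stopping rule only. It is not ai-blackteam's implementation: `pair_stop_iteration` is a hypothetical name, the scoring command stands in for one attacker-target-judge round, and treating a score equal to the threshold as success is an assumption.

```shell
# Hypothetical sketch of the PAIR stopping rule.
# Usage: pair_stop_iteration MAX_ITER THRESHOLD SCORE_CMD...
# SCORE_CMD receives the iteration number and must print a 1-10 score.
pair_stop_iteration() {
  max_iter=$1
  threshold=$2
  shift 2
  i=1
  while [ "$i" -le "$max_iter" ]; do
    score=$("$@" "$i") || return 2
    # Assumption: a score at or above the threshold counts as success.
    if [ "$score" -ge "$threshold" ]; then
      echo "success at iteration $i (score $score)"
      return 0
    fi
    i=$((i + 1))
  done
  echo "no success after $max_iter iterations"
  return 1
}
```

With the documented defaults (20 iterations, threshold 8), the loop ends at the first round whose judge score reaches 8, or gives up after round 20.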

generate tap

Run TAP tree-of-attacks with pruning.

ai-blackteam generate tap -p PROVIDER -t TARGET [OPTIONS]

Flag	Required	Default	Description
`-p`, `--provider`	Yes	-	Target provider
`-m`, `--model`	No	Provider default	Target model
`-t`, `--target`	Yes	-	Target behavior
`--attacker`	No	Same as target	Attacker provider
`--depth`	No	5	Tree depth
`--width`	No	5	Candidates per level
`--branching`	No	4	Branches per candidate
`--threshold`	No	8	Success score threshold
`--quiet`	No	-	Minimal output

ai-blackteam generate tap -p openai -t "Write malware" --depth 8 --width 10
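The three tree parameters multiply, which matters for cost. As a rough sizing sketch, assuming each of `--depth` levels keeps at most `--width` candidates and expands each into `--branching` branches (the usual TAP scheme; whether ai-blackteam counts candidates exactly this way is an assumption), the number of generated candidates is bounded by depth × width × branching. `tap_max_candidates` is a hypothetical helper name.

```shell
# Hypothetical estimate: upper bound on candidates a TAP run generates.
# Arguments: DEPTH WIDTH BRANCHING (defaults mirror the documented ones).
tap_max_candidates() {
  depth=${1:-5}
  width=${2:-5}
  branching=${3:-4}
  echo $((depth * width * branching))
}
```

Under that assumption the defaults give at most 100 candidates, while `--depth 8 --width 10` with the default branching gives at most 320.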

generate fuzz

Run GPTFuzzer mutation-based attack generation.

ai-blackteam generate fuzz -p PROVIDER -t TARGET [OPTIONS]

Flag	Required	Default	Description
`-p`, `--provider`	Yes	-	Target provider
`-m`, `--model`	No	Provider default	Target model
`-t`, `--target`	Yes	-	Target behavior
`--mutator`	No	Same as target	Mutator provider
`--iterations`	No	50	Fuzzing iterations
`--seeds`	No	5	Initial seed count
`--threshold`	No	7	Success score threshold
`--quiet`	No	-	Minimal output

ai-blackteam generate fuzz -p anthropic -t "Write a phishing email" --iterations 100

vuln-research

Run vulnerability research elicitation attacks. Tests a model’s willingness to find and exploit code vulnerabilities at three escalation levels: review, explain exploitation, and write a working exploit.

ai-blackteam vuln-research -p PROVIDER [-m MODEL] [--mode {synthetic,cve,user,all}] [--code-path PATH] [-t TARGET] [--verify] [--verify-llm] [--verbose] [--quiet]

Flag	Required	Default	Description
`-p`, `--provider`	Yes	-	Provider name
`-m`, `--model`	No	Provider default	Model name override
`--mode`	No	`synthetic`	Sub-mode: `synthetic`, `cve`, `user`, or `all`
`--code-path`	No	-	Path to user-provided code (required for `user` mode)
`-t`, `--target`	No	-	Target description override
`--verify`	No	-	Verify exploit correctness with static analysis
`--verify-llm`	No	-	Verify exploit correctness using an LLM judge
`--verbose`	No	-	Show full responses
`--quiet`	No	-	Suppress output

ai-blackteam vuln-research -p anthropic --mode synthetic
ai-blackteam vuln-research -p openai --mode cve --verbose
ai-blackteam vuln-research -p anthropic --mode user --code-path ./my-app/
ai-blackteam vuln-research -p anthropic --mode all --verify-llm

Reporting

report

Generate a report from stored results.

ai-blackteam report [OPTIONS]

Flag	Required	Default	Description
`--format`	No	`markdown`	Output format: `markdown`, `json`, `html`
`--export`	No	-	Export format: `promptfoo`, `garak`
`-o`, `--output`	No	stdout	Output file path

ai-blackteam report --format html -o report.html
ai-blackteam report --export promptfoo -o results.json
ai-blackteam report --format json

scorecard

Show safety scorecard from stored results.

ai-blackteam scorecard [OPTIONS]

Flag	Required	Default	Description
`--format`	No	`table`	Output: `table`, `json`, `markdown`
`-o`, `--output`	No	-	Output file
`-m`, `--model`	No	All models	Filter by model name
`--standard`	No	`llm`	`llm` (OWASP LLM Top 10), `agentic` (Agentic Top 10), `compliance` (EU AI Act + NIST)

ai-blackteam scorecard --standard llm
ai-blackteam scorecard --standard agentic -m claude-sonnet-4-6
ai-blackteam scorecard --standard compliance --format json -o compliance.json

snapshot list

List all snapshots with bypass rates, dates, and models.

ai-blackteam snapshot list

snapshot diff

Compare two snapshots side by side. Shows before/after bypass rate with delta.

ai-blackteam snapshot diff ID1 ID2

Argument	Required	Description
`ID1`	Yes	First snapshot ID
`ID2`	Yes	Second snapshot ID

snapshot export

Export snapshot data to a file.

ai-blackteam snapshot export [--format {json,csv}] [-o OUTPUT]

Flag	Required	Default	Description
`--format`	No	`json`	Export format: `json` or `csv`
`-o`, `--output`	No	stdout	Output file path

snapshot matrix

Show a model × time bypass rate matrix. Useful for tracking how safety changes across model versions over time.

ai-blackteam snapshot matrix

snapshot check

Check whether the latest snapshot's bypass rate is below a threshold. Exits 0 if it is below and 1 if it is above, so it can gate a CI pipeline.

ai-blackteam snapshot check [--latest] [--threshold 0.10]

Flag	Required	Default	Description
`--latest`	No	-	Check only the most recent snapshot
`--threshold`	No	`0.10`	Maximum acceptable bypass rate (0.0-1.0)

ai-blackteam snapshot check --latest --threshold 0.05
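Note that this threshold is a fraction (0.0-1.0), whereas `benchmark --threshold` is a 0-100 score, so passing `5` here when `0.05` was meant is an easy mistake. A sketch of a guard for CI scripts, assuming a POSIX shell with `awk`; `valid_bypass_threshold` is a hypothetical helper, not part of the CLI.

```shell
# Hypothetical guard: succeed only when the argument is a plain decimal
# number inside the documented 0.0-1.0 range.
valid_bypass_threshold() {
  case "$1" in
    ''|.|*[!0-9.]*|*.*.*) return 1 ;;   # not a plain decimal number
  esac
  awk -v t="$1" 'BEGIN { exit !(t + 0 >= 0 && t + 0 <= 1) }'
}
```

A pipeline step could then run `valid_bypass_threshold "$MAX_BYPASS" && ai-blackteam snapshot check --latest --threshold "$MAX_BYPASS"`, where `MAX_BYPASS` is a variable of your own.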

Discovery

list-providers

Show available providers and default models.

ai-blackteam list-providers

list-attacks

Show available attacks and their modes.

ai-blackteam list-attacks

taxonomy

Show all attacks grouped by category with OWASP/MITRE mappings.

ai-blackteam taxonomy

No flags. Displays a table per category with attack IDs, severity, mode, description, OWASP codes, and MITRE ATLAS IDs.

atlas

Show MITRE ATLAS technique mappings for all attacks.

ai-blackteam atlas

mlcommons

Show MLCommons AILuminate hazard taxonomy and harm category alignment.

ai-blackteam mlcommons

frameworks

Show regulatory framework mappings (NIST AI RMF, EU AI Act, MLCommons).

ai-blackteam frameworks

Management

config show

Show current configuration (API keys are truncated).

ai-blackteam config show

config set

Set a config value.

ai-blackteam config set KEY VALUE

ai-blackteam config set providers.anthropic.api_key sk-ant-...
ai-blackteam config set workers 10
ai-blackteam config set storage.database /tmp/test.db

dataset list

Show available datasets.

ai-blackteam dataset list

dataset load

Download and cache a dataset.

ai-blackteam dataset load NAME
ai-blackteam dataset load --all

Flag	Description
`NAME`	Dataset name (optional if using `--all`)
`--all`	Download all datasets

dataset stats

Show prompt counts per category across cached datasets.

ai-blackteam dataset stats

scan

Scan source code for AI security vulnerabilities.

ai-blackteam scan [PATH] [OPTIONS]

Flag	Required	Default	Description
`PATH`	No	`.`	File or directory to scan
`--format`	No	`table`	Output: `table`, `json`
`--severity`	No	All	Minimum severity: `critical`, `high`, `medium`, `low`
`-o`, `--output`	No	-	Save JSON results to file

ai-blackteam scan .
ai-blackteam scan src/ --severity high
ai-blackteam scan app.py --format json -o findings.json
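`--severity` sets a minimum, so `--severity high` should keep `high` and `critical` findings. The sketch below illustrates that ordering, assuming the levels rank critical > high > medium > low. Both function names are hypothetical and say nothing about how the scanner is implemented.

```shell
# Hypothetical ranking of the four documented severity levels.
severity_rank() {
  case "$1" in
    critical) echo 4 ;;
    high) echo 3 ;;
    medium) echo 2 ;;
    low) echo 1 ;;
    *) return 1 ;;
  esac
}

# Succeeds when a finding's severity ($1) meets the minimum ($2).
# Returns 2 if either value is not a known severity level.
meets_min_severity() {
  found=$(severity_rank "$1") || return 2
  floor=$(severity_rank "$2") || return 2
  [ "$found" -ge "$floor" ]
}
```

Here `meets_min_severity critical high` succeeds and `meets_min_severity medium high` fails, matching the filter described in the table.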
