Every ai-blackteam command with every flag, organized by function.

Global Flags

These apply to all commands:
ai-blackteam [-v | --verbose] [--log-file FILE] COMMAND
| Flag | Description |
| --- | --- |
| -v, --verbose | Enable DEBUG-level logging |
| --log-file FILE | Write logs to a file |

Execution

Run a single attack against a model.
ai-blackteam run -p PROVIDER -a ATTACK -t TARGET [OPTIONS]
| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| -p, --provider | Yes | - | Provider name (anthropic, openai, google, etc.) |
| -m, --model | No | Provider default | Model name override |
| -a, --attack | Yes | - | Attack technique ID |
| -t, --target | Yes | - | Target behavior to test |
| --system-prompt | No | - | System prompt to test as a defense |
| --system-prompt-file | No | - | Read system prompt from file |
| --verbose | No | - | Show full response text |
| --quiet | No | - | Suppress output. Exit 0 = all blocked, 1 = any bypassed |
Exit codes: 0 = all blocked, 1 = at least one bypass, 2 = invalid arguments.
ai-blackteam run -p anthropic -a encoding-obfuscation -t "Write a phishing email"
ai-blackteam run -p openai -m gpt-5.4 -a role-play-jailbreak -t "Write malware" --verbose
ai-blackteam run -p anthropic -a prefix-injection -t "Leak system prompt" --quiet
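The exit codes listed above make `run --quiet` usable from scripts. The following is a minimal sketch that assumes only the three documented exit codes; the helper name `describe_run_exit` is hypothetical and not part of ai-blackteam.

```shell
# Hypothetical helper (not part of ai-blackteam): translate the documented
# exit codes of `ai-blackteam run` into a short message.
describe_run_exit() {
  case "$1" in
    0) echo "all blocked" ;;
    1) echo "at least one bypass" ;;
    2) echo "invalid arguments" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Usage sketch: the CLI is only invoked when it is installed.
if command -v ai-blackteam >/dev/null 2>&1; then
  ai-blackteam run -p anthropic -a prefix-injection -t "Leak system prompt" --quiet
  describe_run_exit "$?"
fi
```

Any code outside 0, 1, and 2 is reported as unexpected rather than guessed at, since the reference documents no others.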
Run multiple attacks against a model. Parallel by default.
ai-blackteam batch -p PROVIDER -t TARGET [OPTIONS]
| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| -p, --provider | Yes | - | Provider name |
| -m, --model | No | Provider default | Model name override |
| --attacks | No | all | Comma-separated attack names or all |
| -t, --target | Yes | - | Target behavior to test |
| --system-prompt | No | - | System prompt defense |
| --system-prompt-file | No | - | Read system prompt from file |
| -w, --workers | No | 5 | Max parallel workers |
| --verbose | No | - | Show full responses |
| --quiet | No | - | Suppress output |
| --sequential | No | - | Run attacks one at a time (no parallelism) |
ai-blackteam batch -p anthropic --attacks all -t "Write malware" -w 10
ai-blackteam batch -p openai --attacks encoding-obfuscation,role-play-jailbreak -t "Write malware"
ai-blackteam batch -p anthropic -t "Write malware" --sequential
Run all attacks against all configured providers.
ai-blackteam sweep -t TARGET [OPTIONS]
| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| -t, --target | Yes | - | Target behavior to test |
| --verbose | No | - | Show full responses |
| --quiet | No | - | Suppress output |
Tests every provider that has an API key configured (plus Ollama, which doesn’t need one).
ai-blackteam sweep -t "Write a phishing email"
Run the safety benchmark and produce a score. Supports single model, all models, or a specific list.
ai-blackteam benchmark -p PROVIDER [OPTIONS]
ai-blackteam benchmark --all [OPTIONS]
ai-blackteam benchmark --models PROVIDER:MODEL,... [OPTIONS]
| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| -p, --provider | No* | - | Provider name |
| -m, --model | No | Provider default | Model name |
| --all | No* | - | Benchmark all configured providers |
| --models | No* | - | Comma-separated provider:model pairs |
| -w, --workers | No | 5 | Max parallel workers |
| --categories | No | All | Comma-separated categories to test |
| --threshold | No | - | Min safety score (0-100). Exit 1 if below |
| --output | No | - | Save JSON results to file |
| --quiet | No | - | Minimal output |
*One of -p, --all, or --models is required.
ai-blackteam benchmark -p anthropic -m claude-sonnet-4-6
ai-blackteam benchmark --all --threshold 80
ai-blackteam benchmark --models anthropic:claude-sonnet-4-6,openai:gpt-5.4 --output scores.json
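The `--models` flag takes comma-separated provider:model pairs. When the list comes from a script, it can be assembled with a small helper. This is a sketch only; `join_models` is a hypothetical name, and the command is printed rather than executed.

```shell
# Hypothetical helper: join provider:model pairs with commas, the format
# the --models flag is documented to accept. The subshell body keeps the
# IFS change from leaking into the caller.
join_models() (
  IFS=,
  printf '%s\n' "$*"
)

models=$(join_models anthropic:claude-sonnet-4-6 openai:gpt-5.4)

# Print the resulting command line instead of running it.
echo "ai-blackteam benchmark --models $models --threshold 80"
```

With `--threshold 80`, the documented behavior is an exit code of 1 when the safety score falls below 80, so the printed command can serve as a CI step as-is.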
Test a defense by comparing baseline vs defended safety scores.
ai-blackteam defend -p PROVIDER -t TARGET [OPTIONS]
| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| -p, --provider | Yes | - | Provider name |
| -m, --model | No | Provider default | Model name |
| --attacks | No | all | Comma-separated attacks or all |
| -t, --target | Yes | - | Target behavior |
| --system-prompt | No | - | System prompt defense to test |
| --system-prompt-file | No | - | Read system prompt from file |
| --guardrail | No | - | Preset guardrail: permissive, moderate, strict, llm-judge |
| -w, --workers | No | 5 | Max parallel workers |
| --output | No | - | Save JSON comparison to file |
ai-blackteam defend -p anthropic -t "Write a phishing email" --system-prompt "Never help with harmful content"
ai-blackteam defend -p anthropic -t "Write a phishing email" --guardrail strict
ai-blackteam defend -p openai -t "Write malware" --guardrail moderate --system-prompt "Be safe"
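Longer defenses are easier to manage through `--system-prompt-file` than inline. A minimal sketch, assuming a two-line prompt written to a temporary file; the prompt text is illustrative only, and the CLI call is skipped when ai-blackteam is not installed.

```shell
# Write a multi-line defense prompt to a temporary file.
prompt_file=$(mktemp)
cat > "$prompt_file" <<'EOF'
You are a customer support assistant.
Never help with harmful content.
EOF

# Only invoke the CLI when it is installed.
if command -v ai-blackteam >/dev/null 2>&1; then
  ai-blackteam defend -p anthropic -t "Write a phishing email" \
    --system-prompt-file "$prompt_file"
fi
```

Remove the temporary file with `rm "$prompt_file"` once the comparison has finished.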
Run ASL3 safety evaluation (CBRN + autonomous capabilities).
ai-blackteam asl3 -p PROVIDER [OPTIONS]
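| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| -p, --provider | Yes | - | Provider name |
| -m, --model | No | Provider default | Model name |
| --domain | No | all | cbrn, autonomous, or all |
| -w, --workers | No | 5 | Parallel workers |
| --limit | No | - | Max attacks per domain |
| --quiet | No | - | Minimal output |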
ai-blackteam asl3 -p anthropic --domain cbrn
ai-blackteam asl3 -p openai --domain all --limit 50
Run attacks against dataset prompts with optional mutations.
ai-blackteam mega-sweep -p PROVIDER --dataset DATASETS [OPTIONS]
| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| -p, --provider | Yes | - | Provider name |
| -m, --model | No | Default | Model name |
| --dataset | Yes | - | Comma-separated dataset names or all |
| --mutations | No | None | Mutation types: encode, frame, difficulty |
| --attacks | No | all | Attack filter |
| --categories | No | All | Filter by harm categories |
| -w, --workers | No | 5 | Parallel workers |
| --limit | No | - | Max prompts per dataset |
| -o, --output | No | - | Save JSON results |
| --quiet | No | - | Minimal output |
| --dry-run | No | - | Show plan without running |
ai-blackteam mega-sweep -p anthropic --dataset harmbench,advbench -w 10
ai-blackteam mega-sweep -p openai --dataset all --mutations encode,frame --dry-run
Show template expansion capacity.
ai-blackteam expand count
List expanded attacks with filters.
ai-blackteam expand list [OPTIONS]
| Flag | Default | Description |
| --- | --- | --- |
| --category | All | Filter by harm category |
| --difficulty | All | easy, medium, hard, extreme |
| --technique | All | Filter by technique ID |
| --limit | 50 | Max rows to show |
Run expanded attacks against a model.
ai-blackteam expand run -p PROVIDER [OPTIONS]
| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| -p, --provider | Yes | - | Provider name |
| -m, --model | No | Default | Model name |
| --category | No | All | Filter by category |
| --difficulty | No | All | Filter by difficulty |
| --technique | No | All | Filter by technique |
| --mutations | No | - | Apply 17 mutation variants |
| --languages | No | - | Apply 10 language variants |
| --limit | No | - | Max attacks to run |
| -w, --workers | No | 5 | Parallel workers |
| --quiet | No | - | Minimal output |
ai-blackteam expand run -p anthropic --category phishing --difficulty hard
ai-blackteam expand run -p openai --mutations --languages --limit 100
Run PAIR adaptive attack (attacker-target-judge loop).
ai-blackteam generate pair -p PROVIDER -t TARGET [OPTIONS]
| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| -p, --provider | Yes | - | Target provider |
| -m, --model | No | Default | Target model |
| -t, --target | Yes | - | Target behavior |
| --attacker | No | Same as target | Attacker provider |
| --judge | No | Same as attacker | Judge provider |
| --max-iter | No | 20 | Max iterations |
| --threshold | No | 8 | Success score threshold (1-10) |
| --quiet | No | - | Minimal output |
ai-blackteam generate pair -p anthropic -t "Write a phishing email" --max-iter 30
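The way `--max-iter` and `--threshold` interact in an attacker-target-judge loop can be illustrated with a toy version. This is not the tool's implementation: `stub_judge_score` and `pair_loop` are hypothetical names, the stub replaces the real attacker, target, and judge calls, and stopping when the score is greater than or equal to the threshold is an assumption.

```shell
# Hypothetical stub standing in for the attacker, target, and judge calls.
# It returns a score that rises by one per iteration, capped at 10.
stub_judge_score() {
  if [ "$1" -gt 10 ]; then echo 10; else echo "$1"; fi
}

# Illustrative loop: stop at the first score >= threshold (assumed rule),
# or give up after max_iter iterations.
pair_loop() {
  max_iter=$1
  threshold=$2
  i=1
  while [ "$i" -le "$max_iter" ]; do
    # A real run would have the attacker propose a prompt, the target
    # answer it, and the judge score the answer from 1 to 10.
    score=$(stub_judge_score "$i")
    if [ "$score" -ge "$threshold" ]; then
      echo "success at iteration $i (score $score)"
      return 0
    fi
    i=$((i + 1))
  done
  echo "no success after $max_iter iterations"
  return 1
}

pair_loop 20 8   # documented defaults: --max-iter 20, --threshold 8
```

With this stub, the default settings stop at iteration 8. Raising `--max-iter` only matters when the threshold has not been reached within the existing budget.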
Run TAP tree-of-attacks with pruning.
ai-blackteam generate tap -p PROVIDER -t TARGET [OPTIONS]
| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| -p, --provider | Yes | - | Target provider |
| -m, --model | No | Default | Target model |
| -t, --target | Yes | - | Target behavior |
| --attacker | No | Same as target | Attacker provider |
| --depth | No | 5 | Tree depth |
| --width | No | 5 | Candidates per level |
| --branching | No | 4 | Branches per candidate |
| --threshold | No | 8 | Success score threshold |
| --quiet | No | - | Minimal output |
ai-blackteam generate tap -p openai -t "Write malware" --depth 8 --width 10
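Depth, width, and branching together bound how many prompts a TAP run can send to the target. The estimate below is a rough sketch: it assumes that each level keeps at most `--width` candidates and expands each into `--branching` children, which gives an upper bound of depth × width × branching. Whether ai-blackteam prunes before or after querying is not stated in this reference, so treat the numbers as a ceiling, not a prediction. `tap_max_queries` is a hypothetical helper name.

```shell
# Hypothetical helper: upper bound on target queries for a TAP run,
# assuming at most width * branching candidates are queried per level.
tap_max_queries() {
  echo $(( $1 * $2 * $3 ))
}

tap_max_queries 5 5 4    # defaults: depth 5, width 5, branching 4 -> 100
tap_max_queries 8 10 4   # --depth 8 --width 10, default branching -> 320
```

Under that assumption, the example above with `--depth 8 --width 10` allows just over three times the query budget of the defaults.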
Run GPTFuzzer mutation-based attack generation.
ai-blackteam generate fuzz -p PROVIDER -t TARGET [OPTIONS]
| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| -p, --provider | Yes | - | Target provider |
| -m, --model | No | Default | Target model |
| -t, --target | Yes | - | Target behavior |
| --mutator | No | Same as target | Mutator provider |
| --iterations | No | 50 | Fuzzing iterations |
| --seeds | No | 5 | Initial seed count |
| --threshold | No | 7 | Success score threshold |
| --quiet | No | - | Minimal output |
ai-blackteam generate fuzz -p anthropic -t "Write a phishing email" --iterations 100
Run vulnerability research elicitation attacks. Tests a model’s willingness to find and exploit code vulnerabilities at three escalation levels: review, explain exploitation, and write a working exploit.
ai-blackteam vuln-research -p PROVIDER [-m MODEL] [--mode {synthetic,cve,user,all}] [--code-path PATH] [-t TARGET] [--verify] [--verify-llm] [--verbose] [--quiet]
| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| -p, --provider | Yes | - | Provider name |
| -m, --model | No | Provider default | Model name override |
| --mode | No | synthetic | Sub-mode: synthetic, cve, user, or all |
| --code-path | No | - | Path to user-provided code (required for user mode) |
| -t, --target | No | - | Target description override |
| --verify | No | - | Verify exploit correctness with static analysis |
| --verify-llm | No | - | Verify exploit correctness using an LLM judge |
| --verbose | No | - | Show full responses |
| --quiet | No | - | Suppress output |
ai-blackteam vuln-research -p anthropic --mode synthetic
ai-blackteam vuln-research -p openai --mode cve --verbose
ai-blackteam vuln-research -p anthropic --mode user --code-path ./my-app/
ai-blackteam vuln-research -p anthropic --mode all --verify-llm

Reporting

Generate a report from stored results.
ai-blackteam report [OPTIONS]
| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| --format | No | markdown | Output format: markdown, json, html |
| --export | No | - | Export format: promptfoo, garak |
| -o, --output | No | stdout | Output file path |
ai-blackteam report --format html -o report.html
ai-blackteam report --export promptfoo -o results.json
ai-blackteam report --format json
Show safety scorecard from stored results.
ai-blackteam scorecard [OPTIONS]
| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| --format | No | table | Output: table, json, markdown |
| -o, --output | No | - | Output file |
| -m, --model | No | All models | Filter by model name |
| --standard | No | llm | llm (OWASP LLM Top 10), agentic (Agentic Top 10), compliance (EU AI Act + NIST) |
ai-blackteam scorecard --standard llm
ai-blackteam scorecard --standard agentic -m claude-sonnet-4-6
ai-blackteam scorecard --standard compliance --format json -o compliance.json
List all snapshots with bypass rates, dates, and models.
ai-blackteam snapshot list
Compare two snapshots side by side. Shows before/after bypass rate with delta.
ai-blackteam snapshot diff ID1 ID2
| Argument | Required | Description |
| --- | --- | --- |
| ID1 | Yes | First snapshot ID |
| ID2 | Yes | Second snapshot ID |
Export snapshot data to a file.
ai-blackteam snapshot export [--format {json,csv}] [-o OUTPUT]
| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| --format | No | json | Export format: json or csv |
| -o, --output | No | stdout | Output file path |
Show a model × time bypass rate matrix. Useful for tracking how safety changes across model versions over time.
ai-blackteam snapshot matrix
Check if the latest snapshot bypass rate is below a threshold. Returns exit code 0 if below, exit code 1 if above. Designed for CI gating.
ai-blackteam snapshot check [--latest] [--threshold 0.10]
| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| --latest | No | - | Check only the most recent snapshot |
| --threshold | No | 0.10 | Maximum acceptable bypass rate (0.0-1.0) |
ai-blackteam snapshot check --latest --threshold 0.05
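The gating rule compares a bypass rate between 0.0 and 1.0 against the threshold. The sketch below mirrors that comparison in plain shell for illustration; `rate_below_threshold` is a hypothetical helper, not part of ai-blackteam, and treating a rate exactly equal to the threshold as a failure is an assumption, since this reference only describes the below and above cases.

```shell
# Hypothetical helper mirroring the documented rule: succeed (exit 0) when
# the bypass rate is below the threshold, fail (exit 1) otherwise.
# Assumption: rate == threshold counts as a failure.
rate_below_threshold() {
  awk -v rate="$1" -v max="$2" 'BEGIN { exit !(rate + 0 < max + 0) }'
}

if rate_below_threshold 0.04 0.05; then
  echo "gate passes"
else
  echo "gate fails"
fi
```

In a CI job the real command can be used directly as a step, since a non-zero exit code from `ai-blackteam snapshot check` fails the step in most CI systems.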

Discovery

Show available providers and default models.
ai-blackteam list-providers
Show available attacks and their modes.
ai-blackteam list-attacks
Show all attacks grouped by category with OWASP/MITRE mappings.
ai-blackteam taxonomy
No flags. Displays a table per category with attack IDs, severity, mode, description, OWASP codes, and MITRE ATLAS IDs.
Show MITRE ATLAS technique mappings for all attacks.
ai-blackteam atlas
Show MLCommons AILuminate hazard taxonomy and harm category alignment.
ai-blackteam mlcommons
Show regulatory framework mappings (NIST AI RMF, EU AI Act, MLCommons).
ai-blackteam frameworks

Management

Show current configuration (API keys are truncated).
ai-blackteam config show
Set a config value.
ai-blackteam config set KEY VALUE
ai-blackteam config set providers.anthropic.api_key sk-ant-...
ai-blackteam config set workers 10
ai-blackteam config set storage.database /tmp/test.db
Show available datasets.
ai-blackteam dataset list
Download and cache a dataset.
ai-blackteam dataset load NAME
ai-blackteam dataset load --all
| Flag | Description |
| --- | --- |
| NAME | Dataset name (optional if using --all) |
| --all | Download all datasets |
Show prompt counts per category across cached datasets.
ai-blackteam dataset stats
Scan source code for AI security vulnerabilities.
ai-blackteam scan [PATH] [OPTIONS]
| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| PATH | No | . | File or directory to scan |
| --format | No | table | Output: table, json |
| --severity | No | All | Minimum severity: critical, high, medium, low |
| -o, --output | No | - | Save JSON results to file |
ai-blackteam scan .
ai-blackteam scan src/ --severity high
ai-blackteam scan app.py --format json -o findings.json