After running attacks, ai-blackteam stores every result in a local SQLite database. The ai-blackteam report command turns those stored results into something readable or machine-consumable.

Generating a report

ai-blackteam report --format markdown
ai-blackteam report --format html -o report.html
ai-blackteam report --format json -o results.json

Supported formats

| Format | Flag | Best for |
| --- | --- | --- |
| Markdown | --format markdown | Quick review in terminal or pasting into docs |
| HTML | --format html | Interactive dark-themed dashboard for stakeholders |
| JSON | --format json | CI/CD pipelines, automation, programmatic analysis |
| Promptfoo | --export promptfoo | Importing into Promptfoo for side-by-side comparison |
| garak | --export garak | Cross-referencing with garak scan results |

When to use each format

Markdown is the default. Good for quick terminal review or dropping into a GitHub issue or Slack message.

HTML produces a self-contained, single-file dashboard with stats cards, a full results table, and MLCommons alignment mapping. Open it in any browser. It works well for sharing with security teams or leadership who want a visual overview.

JSON gives you everything in a structured format. Parse it in scripts, feed it to dashboards, or use it as a CI artifact. The schema includes stats, individual run results, and standards mappings.

Promptfoo exports to the EvaluateSummaryV3 format. If your team already uses Promptfoo for evals, this lets you import ai-blackteam results directly and compare them alongside your existing test suites.

garak exports to JSONL with the 5 standard garak record types. Useful if you’re also running garak scans and want to compare coverage or correlate findings.
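To sanity-check a garak-style JSONL export before correlating it with other scans, a short script can tally records by type. This is a minimal sketch, not part of ai-blackteam: the field name `entry_type` is an assumption about how the export labels each record, and the sample records are made up for illustration. Inspect a real export and adjust `type_key` if it differs.

```python
import json
from collections import Counter


def count_record_types(jsonl_text: str, type_key: str = "entry_type") -> Counter:
    """Tally JSONL records by their type field, skipping blank lines.

    `type_key` is an assumed field name; change it to match your export.
    """
    counts = Counter()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Records without the type field are grouped under "<missing>".
        counts[record.get(type_key, "<missing>")] += 1
    return counts


# Illustrative input only; a real export will carry more fields per record.
sample = "\n".join([
    json.dumps({"entry_type": "attempt", "prompt": "..."}),
    json.dumps({"entry_type": "attempt", "prompt": "..."}),
    json.dumps({"entry_type": "eval", "passed": 1}),
])
print(count_record_types(sample))
```

For a real file, read it with `open(path, encoding="utf-8").read()` and pass the text to `count_record_types`.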

Output options

Write to a file with -o:
ai-blackteam report --format json -o safety-report.json
Without -o, the report prints to stdout. Pipe it wherever you need:
ai-blackteam report --format json | jq '.stats'
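Because the JSON report goes to stdout, it can also feed a small CI gate. The sketch below is one possible approach, not an ai-blackteam feature: it assumes the top-level `stats` object (as in the jq example above) contains a bypassed count under a key named `bypassed`, which is a hypothetical name. Check the keys in your own report first.

```python
import json
import sys


def gate(report_text: str, max_bypassed: int = 0, bypassed_key: str = "bypassed") -> int:
    """Return an exit code: 0 if the bypassed count is within the limit, else 1.

    `bypassed_key` is an assumed key name inside "stats". A missing key raises
    KeyError on purpose, so a wrong assumption cannot silently pass the gate.
    """
    report = json.loads(report_text)
    bypassed = int(report["stats"][bypassed_key])
    if bypassed > max_bypassed:
        print(f"FAIL: {bypassed} bypassed (limit {max_bypassed})", file=sys.stderr)
        return 1
    print(f"OK: {bypassed} bypassed (limit {max_bypassed})")
    return 0


# Illustrative report fragment; real reports contain more than "stats".
sample_report = json.dumps({"stats": {"bypassed": 2}})
exit_code = gate(sample_report)
```

In a pipeline, the same function could read the piped report instead of the sample, for example by ending the script with `sys.exit(gate(sys.stdin.read()))` and running `ai-blackteam report --format json | python gate.py` (the script name is arbitrary).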

What’s in a report

Every report format includes:
  • Stats - total runs, bypassed count, blocked count, models tested, attacks used
  • Results - per-run detail with model, attack, verdict, confidence score, and token usage
  • MLCommons alignment - how ai-blackteam’s harm categories map to the MLCommons AILuminate hazard taxonomy
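The per-run results lend themselves to simple aggregation, such as a bypass rate per model. The following is a rough sketch under stated assumptions: the key names `model` and `verdict` and the verdict value `"bypassed"` are guesses based on the field list above, not a documented schema, and the sample runs are invented.

```python
from collections import defaultdict


def bypass_rate_by_model(results, bypassed_value="bypassed"):
    """Compute the fraction of bypassed runs per model.

    Assumes each run is a dict with "model" and "verdict" keys
    (hypothetical names; adjust to the actual report schema).
    """
    totals = defaultdict(int)
    bypassed = defaultdict(int)
    for run in results:
        model = run["model"]
        totals[model] += 1
        if run["verdict"] == bypassed_value:
            bypassed[model] += 1
    return {model: bypassed[model] / totals[model] for model in totals}


# Invented runs for illustration only.
sample_results = [
    {"model": "model-a", "attack": "attack-1", "verdict": "bypassed"},
    {"model": "model-a", "attack": "attack-2", "verdict": "blocked"},
    {"model": "model-b", "attack": "attack-1", "verdict": "blocked"},
]
rates = bypass_rate_by_model(sample_results)
print(rates)
```

With a real JSON report, the list of runs would come from whichever key holds the per-run results after `json.load`.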
