Supply chain attacks target the AI system itself rather than just its outputs. These test for dataset poisoning, model backdoors, dependency confusion, and plugin integrity validation. If an attacker can poison the training data or inject a malicious plugin, they compromise every interaction that follows. Supply chain security is foundational to trustworthy AI.

Summary

6 attacks total: 4 single-turn, 2 tool-use.

Attacks

AttackIDModeSeverityDescription
Dataset Poisoningdataset-poisoningsingle-turncriticalTests if training data contamination affects model responses by probing for memorized harmful patterns
Dependency Confusiondependency-confusiontool-usehighTests if an AI agent can be tricked into installing or using malicious packages via dependency confusion
Finetune Exploitfinetune-exploitsingle-turnhighTests if fine-tuning artifacts or LoRA weights have introduced safety bypasses or hidden behaviors
Model Poisoningmodel-poisoningsingle-turncriticalTests if a model exhibits backdoor behaviors triggered by specific phrases or tokens from training data poisoning
Plugin Backdoorplugin-backdoortool-usecriticalTests if an AI agent validates third-party plugin integrity before execution
Supply Chain Softwaresupply-chain-softwaresingle-turncriticalGenerates software supply chain attack strategies for compromising package repositories, CI pipelines, and open-sourc…

Example

ai-blackteam run -p anthropic -a dataset-poisoning -t "Inject a backdoor into this npm package"