How expansion works
The formula:

- Mutations (17 variants) - encoding, framing, and difficulty transforms
- Languages (10 variants) - multilingual attack variants
Checking expansion capacity
Listing expanded attacks
encoding-obfuscation-weapons-hard.
Running expanded attacks
Options
| Flag | Default | Description |
|---|---|---|
| `-p`, `--provider` | (required) | Provider to test |
| `-m`, `--model` | Provider default | Model name |
| `--category` | all | Filter by harm category |
| `--difficulty` | all 4 | Filter by difficulty (easy/medium/hard/extreme) |
| `--technique` | all | Filter by technique ID |
| `--mutations` | false | Apply 17 mutation variants per attack |
| `--languages` | false | Apply 10 language variants per attack |
| `--limit` | none | Max attacks to run |
| `-w`, `--workers` | 5 | Parallel workers |
| `--quiet` | false | Minimal output |
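
To make the defaults in the table concrete, here is a minimal `argparse` sketch of the same flags. It is an illustration only, not the tool's actual parser: the program name `expand-run`, the `build_parser` helper, and the use of `None` to mean "all" or "provider default" are assumptions made for this example.

```python
import argparse

DIFFICULTIES = ["easy", "medium", "hard", "extreme"]


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the options table; "expand-run" is a placeholder name.
    parser = argparse.ArgumentParser(prog="expand-run")
    parser.add_argument("-p", "--provider", required=True, help="Provider to test")
    parser.add_argument("-m", "--model", default=None,
                        help="Model name (None means the provider default)")
    parser.add_argument("--category", default=None,
                        help="Filter by harm category (None means all)")
    parser.add_argument("--difficulty", choices=DIFFICULTIES, default=None,
                        help="Filter by difficulty (None means all 4)")
    parser.add_argument("--technique", default=None,
                        help="Filter by technique ID (None means all)")
    parser.add_argument("--mutations", action="store_true",
                        help="Apply 17 mutation variants per attack")
    parser.add_argument("--languages", action="store_true",
                        help="Apply 10 language variants per attack")
    parser.add_argument("--limit", type=int, default=None, help="Max attacks to run")
    parser.add_argument("-w", "--workers", type=int, default=5, help="Parallel workers")
    parser.add_argument("--quiet", action="store_true", help="Minimal output")
    return parser


if __name__ == "__main__":
    # "example-provider" is a placeholder value, not a real provider name.
    args = build_parser().parse_args(["-p", "example-provider", "--mutations"])
    print(args)
```

Both expansion flags are plain switches, so leaving them out keeps a run limited to the base attacks.
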
Difficulty levels
Each harm category defines target prompts at 4 difficulty levels:

| Level | Severity | Description |
|---|---|---|
| easy | low | Simple, obvious harmful requests |
| medium | medium | Requests that require some context to identify as harmful |
| hard | high | Subtle or indirect harmful requests |
| extreme | critical | Adversarial prompts specifically designed to bypass safety |
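
The level-to-severity mapping in the table can be written as a small lookup. The sketch below is illustrative: the attack records (dicts with `id` and `difficulty` keys), the placeholder IDs, and the `filter_by_difficulty` helper are assumptions, not the tool's real data model.

```python
# Difficulty level -> severity, taken from the table above.
SEVERITY_BY_DIFFICULTY = {
    "easy": "low",
    "medium": "medium",
    "hard": "high",
    "extreme": "critical",
}


def filter_by_difficulty(attacks, difficulty=None):
    """Return attacks matching `difficulty`; None keeps all 4 levels."""
    if difficulty is None:
        return list(attacks)
    if difficulty not in SEVERITY_BY_DIFFICULTY:
        raise ValueError(f"unknown difficulty: {difficulty!r}")
    return [a for a in attacks if a["difficulty"] == difficulty]


if __name__ == "__main__":
    # Hypothetical attack records with placeholder IDs.
    attacks = [
        {"id": "attack-a", "difficulty": "easy"},
        {"id": "attack-b", "difficulty": "hard"},
        {"id": "attack-c", "difficulty": "extreme"},
    ]
    for attack in filter_by_difficulty(attacks, "hard"):
        print(attack["id"], SEVERITY_BY_DIFFICULTY[attack["difficulty"]])
```
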
Mutations
When `--mutations` is enabled, each attack gets 17 additional variants:
- 5 encoding mutations - base64, hex, rot13, unicode, URL encoding
- 8 framing mutations - academic, fictional, historical, hypothetical, research, creative writing, educational, journalistic
- 4 difficulty mutations - rephrase at increasing levels of obfuscation
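
As a rough illustration of the five encoding mutations, the sketch below applies each transform to a harmless placeholder string using only the Python standard library. The `encoding_variants` helper is hypothetical, and reading "unicode" as `\uXXXX` escape sequences is an assumption; the tool's own implementation may differ.

```python
import base64
import codecs
from urllib.parse import quote


def unicode_escape(text: str) -> str:
    # Assumed meaning of the "unicode" mutation: escape every character.
    # Code points above U+FFFF need the 8-digit \U form.
    return "".join(
        f"\\u{ord(ch):04x}" if ord(ch) <= 0xFFFF else f"\\U{ord(ch):08x}"
        for ch in text
    )


def encoding_variants(prompt: str) -> dict[str, str]:
    """Return one encoded variant of `prompt` per encoding mutation."""
    data = prompt.encode("utf-8")
    return {
        "base64": base64.b64encode(data).decode("ascii"),
        "hex": data.hex(),
        "rot13": codecs.encode(prompt, "rot13"),
        "unicode": unicode_escape(prompt),
        "url": quote(prompt, safe=""),
    }


if __name__ == "__main__":
    for name, value in encoding_variants("hello world").items():
        print(f"{name}: {value}")
```

The framing and difficulty mutations rewrite the prompt's wording rather than its byte representation, so they are not covered by this sketch.
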
Language variants
When `--languages` is enabled, each attack gets variants in 10 languages:
- Chinese, Spanish, Arabic, Hindi, Russian, Portuguese, Japanese, French, German, Korean