JailbreakLLM Control Room
Offense-powered hardening
JailbreakLLM Control Room
The open-source LLM red-teaming tool. Your favorite checkpoints, broken on purpose—then hardened with receipts.
Driven by real incidents
Replay offense-grade prompts (Many-Shot, TombRaider, Function Smuggle, etc.) against any OSS checkpoint.
High-signal datasets
Every leak produces conversational JSONL refusals ready for LoRA fine-tuning or guardrail testing.
Evidence on demand
Ship hardening artifacts with ranked vulnerabilities, deployed adapters, and side-by-side transcripts.
JailbreakLLM workflow
Exploit → Refusals → LoRA → Proof
Four focused passes anchor the entire experience. You always know which artifact you’re producing and why it matters.
1. Run Arsenal
Many-Shot, TombRaider, Function Smuggle
Push OSS checkpoints with real incident vectors, not toy prompts.
2. Draft Refusals
Auto-generated conversational JSONL
Every leak becomes a refusal training sample tied to that attack.
3. Fine-Tune & Guard
LoRA training + guardrail smoke tests
Deploy adapters and verify the same prompts are blocked.
4. Publish Delta
Before/after transcripts & reports
Document jailbreak reduction for security stakeholders.
JailbreakLLM workflow
Red Team Arsenal
Systematically test frontier AI models against sophisticated attack vectors inspired by real-world cyber incidents
Frontier Models (0/24)
Attack Vectors (0/39)
Test Configuration
JailbreakLLM workflow
Targeted Model Hardening
Intelligence-driven fine-tuning for models identified as vulnerable by Red Team Arsenal
Smart Hardening Configuration
Smart Hardening trains the model to refuse jailbreak attacks discovered by Red Team Arsenal.
Stage 01
Upload Dataset
Send synthetic refusal data to Nebius
Waiting for upload
Stage 02
LoRA Fine-Tune
Train model on refusal examples
Not started
Stage 03
Deploy Model
Wait for training completion and deploy adapter
Waiting for training
Stage 04
Verify Protection
Re-test with original jailbreak prompts
Pending verification
Artifact Tracker
Session Outputs
Nebius Dataset
Pending upload
Latest Checkpoint
Waiting...
Fine-tune Job
Not started
Hardened Model
Deploy to unlock
JailbreakLLM workflow
Model Hardening Results
Verification against known exploits
Run at least one audit and jailbreak simulation to populate this comparison view. After fine-tuning completes, re-run them on the hardened model to see the delta.