AI Security¶
Vulnerabilities in AI/LLM-powered applications. This is the new frontier of bug bounty — prompt injection is the new XSS.
OWASP Top 10 for LLM Applications (2025)¶
- Prompt Injection — #1 threat, 56% success rate
- Insecure Output Handling — LLM output used unsafely
- Training Data Poisoning — Corrupted models
- Model Denial of Service — Resource exhaustion
- Supply Chain Vulnerabilities — Malicious models/plugins
- Sensitive Information Disclosure — Data leakage in outputs
- Insecure Plugin Design — Third-party tool risks
- Excessive Agency — Too much autonomous capability
- Overreliance — Trusting LLM output blindly
- Model Theft — Extracting proprietary models
Why This Matters¶
"Prompt injection cannot be fixed. As soon as a system is designed to take untrusted data and include it in an LLM query, the untrusted data influences the output." — Johann Rehberger
- 56% of attacks succeed across all LLM architectures
- Larger, more capable models perform no better
- Unlike SQLi (parameterized queries), no equivalent fix exists
- Human red-teaming defeats 100% of tested protections
Attack Vectors¶
- Prompt Injection — Direct & indirect manipulation
- Agent Hijacking — Autonomous system exploitation
- Data Poisoning — Training data corruption
Key Concepts¶
Direct vs Indirect Injection¶
| Type | Description | Example |
|---|---|---|
| Direct | User directly crafts malicious prompt | "Ignore previous instructions and..." |
| Indirect | Malicious content in data LLM processes | Hidden instructions in documents, emails, websites |
The "Vibe Coding" Era¶
AI-generated code is everywhere, but: - Works ≠ Secure — Code runs but has vulnerabilities - Integration gaps — Logic/authz flaws at service seams - More AI code = more attack surface
Tools¶
- Garak — LLM vulnerability scanner
- Rebuff — Prompt injection detection
- LLM Guard — Input/output validation
- NeMo Guardrails — Conversational AI safety