OWASP LLM Top 10 coverage

Cognisafe scores every LLM request and response against the OWASP LLM Top 10 — the industry-standard taxonomy of security risks in LLM-powered applications. Scoring is asynchronous: it happens entirely off the proxy hot path. Your users see no added latency. Scores appear in the dashboard within seconds of each request. All scores use the 1–5 Likert severity scale.

Coverage table

OWASP ID	Category	Scorer name	What it detects	Available from
LLM01	Prompt Injection	`jailbreak_detection`	Attempts to override system instructions, bypass safety guidelines, or manipulate model behaviour via crafted user input	Free
LLM02	Sensitive Information Disclosure	`pii_detection`	PII in model responses: names, email addresses, phone numbers, SSNs, credit card numbers, home addresses	Free
LLM03	Supply Chain	—	Third-party model and plugin risk (policy controls, coming soon)	Professional
LLM04	Data and Model Poisoning	`data_poisoning`	Prompts designed to inject poisoned content into RAG pipelines, knowledge bases, or model context	Professional
LLM05	Improper Output Handling	`content_safety`	Harmful, dangerous, violent, or policy-violating content in model responses	Free
LLM06	Excessive Agency	—	Agentic over-reach detection (coming soon)	Business
LLM07	System Prompt Leakage	`pii_detection`	System prompt contents leaked in model responses	Professional
LLM08	Vector and Embedding Weaknesses	`vector_weakness`	Adversarial inputs targeting vector databases, embedding models, or semantic search	Professional
LLM09	Misinformation	—	Factual accuracy scoring (coming soon, requires reference corpus)	Business
LLM10	Unbounded Consumption	`unbounded_consumption`	Prompts designed to cause excessive token or compute consumption (denial-of-service patterns)	Professional

Scorer descriptions

`content_safety` (LLM05)

Checks whether the model’s response contains harmful, dangerous, violent, or policy-violating content. Triggers on: explicit instructions for harm, hate speech, graphic violence, CSAM-adjacent content.

`pii_detection` (LLM02, LLM07)

Checks whether the model’s response leaks PII. Covers: full names, email addresses, phone numbers, Social Security numbers, credit card numbers, home addresses, passport numbers, and similar sensitive personal data.

`jailbreak_detection` (LLM01)

Checks whether the prompt attempts to bypass AI safety guidelines or override system instructions. Covers: DAN-style prompts, role-play overrides, instruction injection via user content, indirect prompt injection.

`data_poisoning` (LLM04)

Checks whether the prompt attempts to inject content designed to corrupt a knowledge base, RAG pipeline, or model context — content intended to influence future model responses rather than elicit an immediate answer.

`vector_weakness` (LLM08)

Checks whether the prompt appears to exploit weaknesses in vector databases or embedding models — for example, queries crafted to retrieve unintended documents, bypass semantic filters, or manipulate similarity search results.

`unbounded_consumption` (LLM10)

Checks whether the prompt appears designed to cause excessive resource consumption: extremely long or recursive inputs, content designed to exhaust tokens or API limits, or patterns that trigger maximum-length completions.

Scorer configuration

Scorer definitions live in evals/scorers.yaml. See Custom scorers for information on adding your own.

Getting Started

SDKs

LLM Providers

Self-hosting

Safety & Scoring

OWASP LLM Top 10 coverage

Coverage table

Scorer descriptions

`content_safety` (LLM05)

`pii_detection` (LLM02, LLM07)

`jailbreak_detection` (LLM01)

`data_poisoning` (LLM04)

`vector_weakness` (LLM08)

`unbounded_consumption` (LLM10)

Scorer configuration

Getting Started

SDKs

LLM Providers

Self-hosting

Safety & Scoring

Documentation Index

​Coverage table

​Scorer descriptions

​content_safety (LLM05)

​pii_detection (LLM02, LLM07)

​jailbreak_detection (LLM01)

​data_poisoning (LLM04)

​vector_weakness (LLM08)

​unbounded_consumption (LLM10)

​Scorer configuration

Coverage table

Scorer descriptions

`content_safety` (LLM05)

`pii_detection` (LLM02, LLM07)

`jailbreak_detection` (LLM01)

`data_poisoning` (LLM04)

`vector_weakness` (LLM08)

`unbounded_consumption` (LLM10)

Scorer configuration