
How it works

Ollama exposes an OpenAI-compatible HTTP API on port 11434. Because the Cognisafe proxy speaks the same protocol, you can observe all Ollama calls by pointing the proxy’s UPSTREAM_URL at your Ollama instance, with no changes to Ollama itself. This is particularly useful in air-gapped environments: all traffic stays on your internal network, and with local scoring (see below) Cognisafe’s safety scoring never leaves your infrastructure either.
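
You can check the compatibility claim directly, with no Cognisafe components in the path: the stock OpenAI client works against Ollama once base_url is overridden. A minimal sketch, assuming Ollama is running locally with at least one model pulled:

from openai import OpenAI

# Talk to Ollama directly through its OpenAI-compatible /v1 endpoint
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# List the models you have pulled locally
for model in client.models.list():
    print(model.id)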

Proxy configuration

Set UPSTREAM_URL on the Cognisafe proxy to your Ollama instance:
# Proxy service env vars
UPSTREAM_URL=http://localhost:11434
If Ollama runs on a different host:
UPSTREAM_URL=http://ollama-host.internal:11434
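
Before starting the proxy, it can help to confirm the upstream is reachable from the proxy host. A minimal sketch; the hostname is illustrative, so substitute your own:

import requests

# Ollama's OpenAI-compatible model listing: a 200 response with JSON
# means the upstream URL is reachable and speaks the expected protocol
resp = requests.get("http://ollama-host.internal:11434/v1/models", timeout=5)
resp.raise_for_status()
print(resp.json())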

SDK setup

Use patch_openai() — the OpenAI client speaks the same protocol as Ollama’s API:
import cognisafe
from openai import OpenAI

cognisafe.configure(
    api_key="csk_your_key_here",
    project_id="my-app",
    proxy_url="http://localhost:8080",  # Cognisafe proxy
)
cognisafe.patch_openai()

# No real OpenAI API key needed — Ollama ignores it
client = OpenAI(api_key="ollama")

response = client.chat.completions.create(
    model="llama3.2",   # any model you have pulled in Ollama
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)

print(response.choices[0].message.content)
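
Streaming works through the same patched client as well, since Ollama’s OpenAI-compatible endpoint supports streamed chat completions. A short sketch reusing the client from above:

stream = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    stream=True,
)
for chunk in stream:
    # delta.content is None on some chunks (e.g. the final one)
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")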

Air-gapped safety scoring

By default, Cognisafe’s safety worker uses gpt-4o-mini (via OPENAI_API_KEY) to score requests. In an air-gapped environment, you have two options:
  1. Disable scoring: if OPENAI_API_KEY is not set, the worker falls back gracefully with score_label: "unscored". Requests are still logged and cost/latency data is captured.
  2. Use a local scoring model: configure an Ollama-backed scoring model by pointing the safety worker at a local OpenAI-compatible endpoint:
    # safety_worker env vars
    OPENAI_API_KEY=ollama        # any non-empty value
    SCORER_MODEL=llama3.2        # local model name
    # Override the OpenAI base URL used by PyRIT:
    OPENAI_BASE_URL=http://localhost:11434/v1
    
Models like llama3.2 or mistral pulled into Ollama work well as scoring models for content_safety and pii_detection. For jailbreak_detection, larger models (70B+) produce more reliable results.
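
Before wiring a local model into the safety worker, you can sanity-check it by calling it directly. A sketch; the classification prompt below is purely illustrative, not the worker’s actual scoring prompt:

from openai import OpenAI

# Point the stock client straight at Ollama's OpenAI-compatible endpoint
scorer = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

result = scorer.chat.completions.create(
    model="llama3.2",
    messages=[{
        "role": "user",
        "content": "Classify the following text as SAFE or UNSAFE: 'hello'",
    }],
)
print(result.choices[0].message.content)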

Supported Ollama models

Any model available in the Ollama library works — Cognisafe does not constrain the model field. The proxy passes model through to Ollama unchanged.
# Pull a model before use
ollama pull llama3.2
ollama pull mistral
ollama pull phi4
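
Because model passes through unchanged, switching models is just a matter of changing that one argument. A sketch reusing the patched client from the SDK setup above:

response = client.chat.completions.create(
    model="mistral",  # any model name you have pulled, e.g. "phi4"
    messages=[{"role": "user", "content": "Summarize HTTP/2 in one sentence."}],
)
print(response.choices[0].message.content)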