Evals & Testing

Inspect AI

Government-built open-source framework for rigorous LLM and agent evaluations, popular for safety benchmarks and sandboxed agentic tasks.

Category: Evals & Testing
Open source: Yes
Self-hostable: Yes
Pricing model: Free
Pricing notes: MIT-licensed open source; free, with no commercial tier
Framework integrations: openai-sdk, anthropic, ollama, huggingface
Funding / ownership: Built by the UK AI Security Institute (government)

Pricing/feature source: https://github.com/UKGovernmentBEIS/inspect_ai