RAIL Score
Evaluate LLM outputs for responsible AI compliance
Responsible AI development & ethical AI evaluation
Developer-facing AI safety infrastructure. We build scoring APIs, compliance engines, and agent safety pipelines that help teams ship AI responsibly, with first-class support for India's regulatory landscape alongside global frameworks.
RAIL Score API evaluates AI outputs across 8 dimensions (safety, fairness, reliability, transparency, privacy, accountability, inclusivity, user impact), each scored 0 to 10. It offers two modes: basic (fast, suited to production pipelines) and deep (with per-dimension explanations, issue tags, and improvement suggestions).
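To make the scoring model concrete, here is a minimal sketch of how a client might hold and validate one evaluation result. The `RailResult` class, its methods, and the unweighted-mean aggregate are illustrative assumptions, not the actual response schema of the RAIL Score API; only the eight dimension names and the 0 to 10 range come from the description above.

```python
from dataclasses import dataclass
from statistics import mean

# The eight dimension names are taken from the description above.
DIMENSIONS = (
    "safety", "fairness", "reliability", "transparency",
    "privacy", "accountability", "inclusivity", "user_impact",
)


@dataclass(frozen=True)
class RailResult:
    """Hypothetical container for one evaluation (not the real API schema)."""
    scores: dict

    def __post_init__(self):
        # Every dimension must be present and inside the 0-10 range.
        missing = set(DIMENSIONS) - set(self.scores)
        if missing:
            raise ValueError(f"missing dimensions: {sorted(missing)}")
        for name, value in self.scores.items():
            if not 0 <= value <= 10:
                raise ValueError(f"{name} score {value} is outside 0-10")

    def overall(self) -> float:
        # Assumption: an unweighted mean; the real aggregate may differ.
        return mean(self.scores.values())

    def below(self, threshold: float) -> list:
        """Dimensions scoring under the threshold, in canonical order."""
        return [d for d in DIMENSIONS if self.scores[d] < threshold]
```

A caller could then flag weak dimensions with something like `result.below(7.0)` before deciding whether a response should ship.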
Compliance Engine checks content against 63 requirements across 6 regulatory frameworks: GDPR, HIPAA, EU AI Act, CCPA, India DPDP Act, and India AI Governance Guidelines.
Agent Safety Pipeline provides pre-execution tool call evaluation, post-execution result scanning, prompt injection detection, multi-step plan evaluation, and stateful AgentSessions for agentic AI workflows.
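The pre-execution and post-execution checks can be pictured as a wrapper around each tool call. The sketch below is a generic illustration under stated assumptions: `guarded_call`, `ToolCallBlocked`, and the two check callables are hypothetical names, and the real pipeline's interfaces are not shown in this document.

```python
from typing import Any, Callable


class ToolCallBlocked(Exception):
    """Raised when a check rejects a tool call or its result."""


def guarded_call(
    tool: Callable[..., Any],
    args: dict,
    pre_check: Callable[[str, dict], bool],
    post_scan: Callable[[Any], bool],
) -> Any:
    """Run `tool(**args)` only if both checks pass (hypothetical helper)."""
    name = getattr(tool, "__name__", "tool")
    # Pre-execution: decide whether the call may run at all.
    if not pre_check(name, args):
        raise ToolCallBlocked(f"{name} rejected before execution")
    result = tool(**args)
    # Post-execution: scan the result before it reaches the agent.
    if not post_scan(result):
        raise ToolCallBlocked(f"{name} result rejected after execution")
    return result
```

In practice the two callables would delegate to an evaluation service; here they are plain functions so the control flow stays visible.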
Safe Regeneration automatically rewrites AI responses that score below configurable quality thresholds, with full before/after audit logging.
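A threshold-driven rewrite loop with before/after logging might look roughly like the following. This is a sketch, not the product's implementation: `regenerate_until_safe`, `AuditEntry`, the default threshold of 7.0, and the attempt limit are all assumptions chosen for illustration, and the scorer and rewriter are passed in as plain callables.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AuditEntry:
    """One before/after record (hypothetical shape)."""
    attempt: int
    before: str
    after: str
    score_before: float
    score_after: float


def regenerate_until_safe(
    text: str,
    score: Callable[[str], float],
    rewrite: Callable[[str], str],
    threshold: float = 7.0,   # assumed default, configurable by the caller
    max_attempts: int = 3,    # assumed cap to avoid unbounded rewriting
):
    """Rewrite `text` until it meets the threshold or attempts run out."""
    audit = []
    current, current_score = text, score(text)
    for attempt in range(1, max_attempts + 1):
        if current_score >= threshold:
            break
        candidate = rewrite(current)
        candidate_score = score(candidate)
        # Log every rewrite, whether or not it improved the score.
        audit.append(
            AuditEntry(attempt, current, candidate, current_score, candidate_score)
        )
        current, current_score = candidate, candidate_score
    return current, audit
```

Returning the audit list alongside the final text keeps the log tied to the response it describes; a text that already meets the threshold comes back unchanged with an empty log.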
SDKs are published on PyPI (Python) and npm (JavaScript/TypeScript), alongside a Drupal module, with drop-in wrappers for OpenAI, Anthropic, and Gemini. Observability integrations cover OpenTelemetry, Langfuse, and Datadog.
MCP Server adds the full RAIL safety layer to any Model Context Protocol client (Claude, Cursor, Copilot, Replit Agent, LangGraph, CrewAI) through a single hosted URL, with no SDK integration required:
https://mcp.responsibleailabs.ai/mcp
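Registering the hosted URL usually means adding an entry to the client's MCP configuration. The snippet below generates one plausible entry. The `mcpServers` / `url` layout and the server name `rail` are assumptions based on a shape several clients accept; key names vary by client, so check the client's own documentation before using it.

```python
import json

# Hosted endpoint, as given above.
RAIL_MCP_URL = "https://mcp.responsibleailabs.ai/mcp"


def mcp_config(server_name: str = "rail") -> str:
    """Build a JSON config entry for a remote MCP server.

    Assumption: the client reads remote servers from an "mcpServers"
    object keyed by name, each with a "url" field.
    """
    config = {"mcpServers": {server_name: {"url": RAIL_MCP_URL}}}
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(mcp_config())
```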
| Dataset | Size | What it covers |
|---|---|---|
| RAIL Guard Benchmark | 1,589 examples | 1,197 content prompts across 6 domains (benign / edge / adversarial) and 392 agent tool-call scenarios across 5 domains, scored on all 8 RAIL dimensions |
| RAIL-HH-10K | 10,000 examples | Preference dataset for responsible AI alignment (RLHF/DPO), scored across all 8 RAIL dimensions, built on Anthropic's HH data |
| Indian Responsible AI Benchmark | 212 prompts | Adversarial prompts across 22 India-specific safety categories (caste, region, Hinglish code-switching), with full RAIL dimension scores |
RAIL in the Wild (arXiv:2505.00204): Operationalizing responsible AI evaluation using Anthropic's Values in the Wild dataset (308,000+ conversations). Maps AI-expressed values to RAIL's 8 dimensions with quantitative scoring.