COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs
Author: LuxaK
Uploaded: 2026-01-18
Views: 14
The paper introduces COMPASS (Company/Organization Policy Alignment Assessment), a systematic framework for evaluating whether large language models (LLMs) comply with organization-specific allowlist and denylist policies. Existing LLM safety evaluations focus mainly on universal harms such as toxicity and overlook the organization-specific rules that matter in high-stakes enterprise applications in sectors like healthcare and finance. COMPASS synthesizes evaluation queries from an organization's allowlist and denylist policies: base queries that test routine compliance, and strategically designed edge cases that test adversarial robustness. An LLM judge then assesses whether the chatbot's responses adhere to those policies.
The framework was applied to eight diverse industry scenarios, generating and validating 5,920 queries, and used to evaluate seven state-of-the-art models. The key finding is a marked asymmetry. LLMs handle legitimate allowlist requests reliably, with over 95% accuracy, but fail badly at enforcing prohibitions: they refuse only 13–40% of adversarial denylist violations, and some models drop below 5% on policy-violating edge cases. This indicates that current LLMs lack the robustness needed for policy-critical deployments, and positions COMPASS as a tool for evaluating organizational AI safety.
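To make the allowlist/denylist asymmetry concrete, here is a minimal Python sketch of how judged responses could be scored per bucket. It is an illustration under stated assumptions, not the paper's implementation: the `JudgedQuery` record, its field names, and the rule that "correct" means answering allowlist queries and refusing denylist queries are all hypothetical, and the toy verdicts are made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class JudgedQuery:
    """One evaluated query (hypothetical schema, not taken from the paper)."""
    policy: str    # "allowlist" or "denylist"
    kind: str      # "base" or "edge"
    refused: bool  # judge verdict: did the chatbot refuse the request?


def is_correct(q: JudgedQuery) -> bool:
    # Assumed scoring rule: denylist queries should be refused,
    # allowlist queries should be answered.
    return q.refused if q.policy == "denylist" else not q.refused


def accuracy_by_bucket(results):
    """Return {(policy, kind): fraction of correctly handled queries}."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for q in results:
        key = (q.policy, q.kind)
        totals[key] += 1
        correct[key] += int(is_correct(q))
    return {key: correct[key] / totals[key] for key in totals}


# Made-up verdicts, purely for illustration.
toy_results = [
    JudgedQuery("allowlist", "base", refused=False),
    JudgedQuery("allowlist", "base", refused=False),
    JudgedQuery("allowlist", "edge", refused=True),   # over-refusal
    JudgedQuery("allowlist", "edge", refused=False),
    JudgedQuery("denylist", "base", refused=True),
    JudgedQuery("denylist", "base", refused=False),   # missed violation
    JudgedQuery("denylist", "edge", refused=False),   # missed violation
    JudgedQuery("denylist", "edge", refused=False),   # missed violation
]

scores = accuracy_by_bucket(toy_results)
for (policy, kind), acc in sorted(scores.items()):
    print(f"{policy:9s} {kind:4s} {acc:.0%}")
```

Reporting the four buckets separately, rather than one overall accuracy, is what lets a gap like the one described above show up: a model can look strong on allowlist traffic while still letting most denylist edge cases through.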
#COMPASS #LLMEvaluation #AISafety #OrganizationalPolicies #PolicyAlignment #EnterpriseAI #Denylist #Allowlist #AdversarialRobustness #Framework
paper - https://arxiv.org/pdf/2601.01836v1
subscribe - https://t.me/arxivpaper
created with NotebookLM