LionGuard: Building a Contextualized Moderation Classifier to Tackle Localized Unsafe Content Paper • 2407.10995 • Published Jun 24, 2024 • 2
A Flexible Large Language Models Guardrail Development Methodology Applied to Off-Topic Prompt Detection Paper • 2411.12946 • Published Nov 20, 2024 • 22
Off Topic Guardrail 🛡️ Collection Fast, lightweight zero-shot classifiers that score a user prompt's relevance to the system prompt. • 5 items • Updated Jul 28, 2025 • 4