Granite Guardian Library
The Guardian Library includes four families of LoRA adapters, each developed for a specific task related to safety, factuality, and policy compliance in LLM-based systems. We provide adapters for the following base models:
- ibm-granite/granite-4.0-micro
- ibm-granite/granite-4.1-3b
- ibm-granite/granite-4.1-8b
- ibm-granite/granite-4.1-30b
We give a brief overview of the functionality of each capability below; details can be found in each individual adapter readme.
Capabilities implemented as LoRA adapters
The four capabilities that have been implemented as LoRA adapters and made available in this HF repository are:
Guardian Core: A LoRA adapter trained to judge whether the input prompts and output responses of an LLM-based system meet specified criteria, including safety risks (harm, jailbreaking, profanity, violence, sexual content, social bias, unethical behavior), hallucinations related to tool/function calls, and retrieval-augmented generation (RAG) in agent-based systems. The model outputs a JSON object with a score field indicating "yes" (criteria met / risk detected) or "no" (criteria not met / no risk). Details can be found in the guardian-core readme.
Factuality Detection: A LoRA adapter specifically designed to assess factual correctness by explicitly taking into account contextual passages that may contain contradicting or conflicting information. Rather than assuming contextual consistency, the adapter evaluates LLM-generated responses against one or more context sources and identifies cases where the response conflicts with, misrepresents, or selectively ignores evidence present in those contexts. Details can be found in the factuality-detection readme.
Factuality Correction: A LoRA adapter specifically designed to correct factually incorrect LLM-generated responses by explicitly taking into account contextual passages that may contain contradicting or conflicting information. The adapter is capable of correcting factual inaccuracies in long-form responses composed of multiple atomic units (such as individual facts or claims) while preserving the full generative and reasoning capabilities of the base model. Details can be found in the factuality-correction readme.
Policy Guardrails: A LoRA adapter that provides policy compliance checking. Given a policy and a scenario, it enables the base model to accurately decide whether the scenario complies with, or violates, the given policy. It provides a third response ('Ambiguous') if it is not possible to decide compliance/non-compliance with a high level of certainty. Details can be found in the policy-guardrails readme.
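As a minimal sketch of how a caller might consume the Guardian Core output described above, the snippet below parses a JSON string and maps the score field to a boolean. Only the score field and its "yes"/"no" values come from this readme; the function name `risk_detected` and the sample strings are hypothetical, and the real output may carry additional fields documented in the guardian-core readme.

```python
import json


def risk_detected(raw_output: str) -> bool:
    """Return True when the score field is "yes" (criteria met / risk detected).

    Assumption: raw_output is the JSON object emitted by the adapter and
    contains a "score" field whose value is "yes" or "no".
    """
    result = json.loads(raw_output)
    score = str(result["score"]).strip().lower()
    if score not in {"yes", "no"}:
        # Fail loudly instead of silently treating unknown values as safe.
        raise ValueError(f"unexpected score value: {score!r}")
    return score == "yes"


if __name__ == "__main__":
    # Hypothetical sample outputs, for illustration only.
    print(risk_detected('{"score": "yes"}'))
    print(risk_detected('{"score": "no"}'))
```

Raising on an unrecognized value is a deliberate choice here: a guardrail check that quietly defaults to "no risk" on malformed output would defeat its purpose.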
Recommended Use
The recommended way to call all adapters is through the Mellea framework. For code snippets demonstrating how to use them, please refer to the Mellea adapter examples (note: within Mellea, adapters are referred to as "intrinsics").
Model Signing
All adapter artifacts in this repository are signed to ensure integrity and provenance. Each adapter includes a model.sig signature file in its lora/ directory that covers all artifacts in that directory (adapter_config.json, adapter_model.safetensors, io.yaml).
| Adapter | Signature File Path (example) | Signing Identity |
|---|---|---|
| Guardian Core | guardian-core/granite-4.1-3b/lora/model.sig | Granite-sign@ibm.com |
| Factuality Detection | factuality-detection/granite-4.1-3b/lora/model.sig | Granite-sign@ibm.com |
| Factuality Correction | factuality-correction/granite-4.1-3b/lora/model.sig | Granite-sign@ibm.com |
| Policy Guardrails | policy-guardrails/granite-4.1-3b/lora/model.sig | Granite-sign@ibm.com |
The same pattern applies to all model variants (granite-4.0-micro, granite-4.1-3b, granite-4.1-8b, granite-4.1-30b).
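The path pattern above can be sketched in Python as follows. The adapter directory names are taken from the example paths in the table and the variant names from the list above; the helper names (`lora_dir`, `signature_path`, `missing_files`) are hypothetical and not part of any library. The sketch also checks that a local copy contains the three signed artifacts plus model.sig before verification is attempted.

```python
from pathlib import Path
from typing import List

# Directory names as they appear in the example signature paths.
ADAPTERS = [
    "guardian-core",
    "factuality-detection",
    "factuality-correction",
    "policy-guardrails",
]
VARIANTS = [
    "granite-4.0-micro",
    "granite-4.1-3b",
    "granite-4.1-8b",
    "granite-4.1-30b",
]
# Artifacts covered by each model.sig, per the Model Signing section.
SIGNED_ARTIFACTS = ["adapter_config.json", "adapter_model.safetensors", "io.yaml"]


def lora_dir(adapter: str, variant: str) -> Path:
    """Relative path of the lora/ directory for one adapter and variant."""
    return Path(adapter) / variant / "lora"


def signature_path(adapter: str, variant: str) -> Path:
    """Relative path of the model.sig file for one adapter and variant."""
    return lora_dir(adapter, variant) / "model.sig"


def missing_files(repo_root: str, adapter: str, variant: str) -> List[str]:
    """List expected files that are absent from a local copy of the repo."""
    directory = Path(repo_root) / lora_dir(adapter, variant)
    expected = SIGNED_ARTIFACTS + ["model.sig"]
    return [name for name in expected if not (directory / name).is_file()]


if __name__ == "__main__":
    # Print the signature path for every adapter/variant combination.
    for adapter in ADAPTERS:
        for variant in VARIANTS:
            print(signature_path(adapter, variant).as_posix())
```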
Verifying Model Signatures
To verify the integrity of a downloaded adapter, use the model-signing tool:
# Install the model signing verification tool
pip install model-signing
# Verify all artifacts in an adapter's lora/ directory
model_signing verify sigstore \
--signature <adapter>/<model-variant>/lora/model.sig \
--identity Granite-sign@ibm.com \
--identity_provider https://sigstore.verify.ibm.com/oauth2 \
<adapter>/<model-variant>/lora/
For example, to verify the guardian-core adapter for granite-4.1-3b:
model_signing verify sigstore \
--signature guardian-core/granite-4.1-3b/lora/model.sig \
--identity Granite-sign@ibm.com \
--identity_provider https://sigstore.verify.ibm.com/oauth2 \
guardian-core/granite-4.1-3b/lora/
Each model.sig file contains a signature over all adapter artifacts in the corresponding lora/ directory, signed with the identity Granite-sign@ibm.com. This allows users to confirm that the adapter has not been tampered with after release.
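For scripted checks, the verification command above can be assembled and run from Python. This is a sketch under stated assumptions: it shells out to the same `model_signing` command line shown above rather than using any Python API, the helper names are hypothetical, and treating exit code 0 as successful verification is an assumption rather than documented behavior.

```python
import subprocess
from typing import List

# Values copied from the verification commands in this readme.
SIGNING_IDENTITY = "Granite-sign@ibm.com"
IDENTITY_PROVIDER = "https://sigstore.verify.ibm.com/oauth2"


def build_verify_command(adapter: str, variant: str) -> List[str]:
    """Build the argument list for verifying one adapter's lora/ directory."""
    lora_dir = f"{adapter}/{variant}/lora/"
    return [
        "model_signing", "verify", "sigstore",
        "--signature", f"{lora_dir}model.sig",
        "--identity", SIGNING_IDENTITY,
        "--identity_provider", IDENTITY_PROVIDER,
        lora_dir,
    ]


def verify_adapter(adapter: str, variant: str, repo_root: str = ".") -> bool:
    """Run the command from repo_root.

    Assumption: the tool exits with code 0 when verification succeeds.
    Raises FileNotFoundError if model_signing is not installed.
    """
    completed = subprocess.run(build_verify_command(adapter, variant), cwd=repo_root)
    return completed.returncode == 0


if __name__ == "__main__":
    # Show the command for the guardian-core example without executing it.
    print(" ".join(build_verify_command("guardian-core", "granite-4.1-3b")))
```

Passing the arguments as a list avoids shell quoting issues, and keeping command construction separate from execution makes the command easy to inspect before it is run.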
Resources
- Learn about the latest updates with Granite: https://www.ibm.com/granite
- Get started with tutorials, best practices, and prompt engineering advice: https://www.ibm.com/granite/docs/
- Learn about the latest Granite learning resources: https://research.ibm.com/blog/granite-4-1-ai-foundation-models