LatentSkill: From In-Context Textual Skills to In-Weight Latent Skills for LLM Agents
Abstract
LatentSkill enables efficient deployment of textual skills in agent systems by converting them into LoRA adapters stored in weight space, reducing context overhead while maintaining modularity and composability.
Agent systems increasingly use textual skills to encode reusable task procedures, but injecting these skills into the prompt at every step incurs substantial context overhead and exposes skill content as plaintext. We present LatentSkill, a framework that converts textual skills into plug-and-play LoRA adapters through a pretrained hypernetwork. LatentSkill stores skill knowledge in weight space rather than context space, removing per-step skill tokens while preserving modular loading, scaling, and composition. On ALFWorld and Search-QA, LatentSkill outperforms the corresponding in-context skill baseline while using substantially fewer prefill tokens: it improves ALFWorld success by 21.4 and 13.4 points on the seen and unseen splits with 64.1% fewer prefill tokens, and improves Search-QA exact match by 3.0 points with 72.2% lower skill-token overhead. Further analysis shows that generated skill LoRAs form a structured semantic geometry, can be precisely controlled via the LoRA scaling coefficient, and can be composed through parameter-space arithmetic when skill components are aligned. These findings suggest that weight-space skills provide an efficient, modular, and less exposed substrate for extending LLM agents.
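The scaling and composition properties mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard LoRA formulation, where a skill is a low-rank pair (B, A) whose product is added to a frozen weight matrix W. The function names, shapes, and toy values below are hypothetical and are not taken from the paper; the hypernetwork that would generate B and A from a textual skill is not modeled here.

```python
# Illustrative sketch only: helper names and toy matrices are assumptions,
# not the paper's implementation. Matrices are plain nested lists.

def matmul(X, Y):
    """Multiply matrix X (m x k) by matrix Y (k x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def add(X, Y):
    """Element-wise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(X, s):
    """Multiply every entry of X by the scalar s."""
    return [[s * a for a in row] for row in X]

def lora_delta(B, A, alpha=1.0):
    """Low-rank weight update alpha * (B @ A) for one skill."""
    return scale(matmul(B, A), alpha)

def apply_skill(W, B, A, alpha=1.0):
    """Load one skill: W' = W + alpha * (B @ A). alpha controls its strength."""
    return add(W, lora_delta(B, A, alpha))

def compose_skills(W, skills, weights):
    """Compose skills by summing their weighted updates in parameter space."""
    out = W
    for (B, A), w in zip(skills, weights):
        out = add(out, lora_delta(B, A, w))
    return out

# Toy frozen weight (2 x 3) and two rank-1 skills.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
B1, A1 = [[1.0], [2.0]], [[1.0, 0.0, 1.0]]
B2, A2 = [[0.0], [1.0]], [[0.0, 2.0, 0.0]]

half = apply_skill(W, B1, A1, alpha=0.5)                     # skill 1 at half strength
both = compose_skills(W, [(B1, A1), (B2, A2)], [1.0, 1.0])   # both skills together
```

Setting `alpha=0.0` recovers the base weights, which is what makes this kind of adapter plug-and-play. The plain sum in `compose_skills` is only one possible composition rule; the abstract notes that composition works when skill components are aligned, which this toy example does not check.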
Community
Neat paper. Storing agent skills as LoRA adapters via a hypernetwork instead of clogging up the context window seems like a much cleaner way to handle reusable procedures. I like that it offloads those tokens to weight space while actually improving performance on ALFWorld and Search-QA.
How does the hypernetwork handle the composition of different skills through parameter-space arithmetic, and are there limits to how many skill LoRAs you can stack before the model performance degrades?
I made a podcast on it with ResearchPod, which makes it easy to get the key concepts on the go:
https://researchpod.app/episode/26850a14-8165-46d4-a424-d570a8411aa5
Where is the code for this paper? It's a very promising project.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Skill is Not One-Size-Fits-All: Model-Aware Skill Alignment for LLM Agents (2026)
- Decoupling Knowledge and Task Subspaces for Composable Parametric Retrieval Augmented Generation (2026)
- SkillGraph: Skill-Augmented Reinforcement Learning for Agents via Evolving Skill Graphs (2026)
- From History to State: Constant-Context Skill Learning for LLM Agents (2026)
- CKT-WAM: Parameter-Efficient Context Knowledge Transfer Between World Action Models (2026)
- Scaling Self-Evolving Agents via Parametric Memory (2026)
- MixSD: Mixed Contextual Self-Distillation for Knowledge Injection (2026)
Get this paper in your agent:
hf papers read 2606.06087
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash
