Making local LLMs more reliable with a deterministic “context compiler”

I’ve been experimenting with running LLMs locally and kept running into a common issue:

constraints and corrections drift out of the prompt over time

Example:

  • User: “don’t use peanuts”

  • …long conversation…

  • Model suggests something with peanuts anyway

This gets worse with smaller models or limited context windows.

So I built a small deterministic tool called a context compiler.

Instead of relying only on the transcript, it extracts structured state like:

  • facts.focus.primary = "vegan curry"

  • policies.prohibit = ["peanuts"]

Then that state is injected into the prompt every turn, so important constraints don’t get lost.
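As a minimal sketch of that idea (the function and field names here are hypothetical, not the repo’s actual API): extract constraints into a small state dict deterministically, then prepend the serialized state to every prompt.

```python
# Toy sketch of a "context compiler": compile_state and build_prompt are
# hypothetical names, not the actual tool's API.
import json

def compile_state(turns):
    """Extract structured state from conversation turns with a toy heuristic."""
    state = {"facts": {}, "policies": {"prohibit": []}}
    for turn in turns:
        text = turn.lower().strip()
        # Deterministic pattern: "don't use X" / "do not use X" -> prohibit X
        if text.startswith("don't use ") or text.startswith("do not use "):
            item = text.split("use ", 1)[1].rstrip(".")
            if item not in state["policies"]["prohibit"]:
                state["policies"]["prohibit"].append(item)
    return state

def build_prompt(state, user_message):
    """Re-inject the compiled state every turn so constraints can't drift out."""
    header = "STATE (authoritative, re-injected each turn):\n" + json.dumps(state, indent=2)
    return f"{header}\n\nUSER: {user_message}"

state = compile_state(["don't use peanuts", "...long conversation..."])
print(build_prompt(state, "suggest a curry recipe"))
```

Because the state block is rebuilt and injected on every turn, the "peanuts" constraint survives no matter how long the transcript gets.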

Key idea:

  • prompt engineering helps

  • compiled state makes constraints persistent

I added a set of demos comparing:

  • baseline prompting

  • stronger prompt engineering

  • prompt + compiled state

The interesting part is that better prompting improves things, but the compiled state is what actually guarantees invariants.

Repo + demos:


Update: pushed this further and added integration examples (OpenWebUI + LiteLLM).

One thing that stood out is that “do Y instead of X” breaks much more often than just “don’t do X”.

Example:

User: use pytest
User: use unittest instead of pytest

Without a state layer → models often keep using pytest or mix both
With it → pytest is removed and unittest is enforced deterministically

So it’s not just about constraints drifting out of context — it’s that corrections don’t reliably replace earlier instructions when everything lives in the prompt.
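The replacement case above can be handled deterministically in the state layer: a "use Y instead of X" directive both retires X and adds Y, rather than leaving two conflicting instructions in the prompt. A minimal sketch (hypothetical names, not the repo’s code):

```python
# Toy sketch of replacement semantics for "use Y instead of X".
import re

def apply_directive(state, text):
    """Update a require-list so corrections replace earlier instructions."""
    required = state.setdefault("require", [])
    m = re.match(r"use (\w+) instead of (\w+)", text.lower())
    if m:
        new, old = m.group(1), m.group(2)
        if old in required:
            required.remove(old)   # explicitly retire the superseded instruction
        if new not in required:
            required.append(new)
        return state
    m = re.match(r"use (\w+)", text.lower())
    if m and m.group(1) not in required:
        required.append(m.group(1))
    return state

state = {}
apply_directive(state, "use pytest")
apply_directive(state, "use unittest instead of pytest")
print(state)  # {'require': ['unittest']}
```

The key design choice is that the correction is an explicit state transition (remove old, add new), not another line appended to a transcript for the model to reconcile.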

Also experimented with mapping natural language into explicit directives first (heuristic + LLM fallback), which helps pick up more constraints without losing determinism.
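One way to sketch that heuristic-first, LLM-fallback pipeline (again, illustrative names only; `llm_fallback` is a placeholder for a model call, not a real API):

```python
# Sketch: try deterministic patterns first; only fall back to an LLM if none match.
import re

# Order matters: the more specific "instead of" pattern must come before plain "use".
HEURISTICS = [
    (re.compile(r"^(?:don't|do not) use (.+)$", re.I), "prohibit"),
    (re.compile(r"^use (.+?) instead of (.+)$", re.I), "replace"),
    (re.compile(r"^use (.+)$", re.I), "require"),
]

def to_directive(text, llm_fallback=None):
    """Map natural language to an explicit directive, deterministically when possible."""
    for pattern, kind in HEURISTICS:
        m = pattern.match(text.strip())
        if m:
            return {"kind": kind, "args": list(m.groups())}
    if llm_fallback is not None:
        return llm_fallback(text)  # nondeterministic path, used only as a last resort
    return None

print(to_directive("don't use peanuts"))
# {'kind': 'prohibit', 'args': ['peanuts']}
```

Inputs that hit a heuristic always produce the same directive, so determinism is only sacrificed on the fallback path.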

I’m also working on a TypeScript port that has feature parity with the Python version.
