Borrowing from the classic “it’s the economy, stupid”: here, it’s the architecture.
We’re blaming prompts for what is fundamentally an architectural problem.
Paper: Beyond Prompting: Decoupling Cognition from Execution in LLM-based Agents through the ORCA Framework
Code: gfernandf/agent-skills (GitHub): a runtime for composable AI agent skills
We keep pretending that better prompts will fix LLM agents.
They won’t.
We’ve built an entire ecosystem of tooling, courses, and “best practices” around prompt engineering — as if the problem were linguistic.
It’s not.
It’s architectural.
The uncomfortable truth
Let’s be honest about what most agent systems are doing today:
- Take a task
- Generate a prompt
- Call the model
- Hope it “reasons” correctly
- Repeat
This is not a system.
This is recomputation disguised as intelligence.
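The loop above can be sketched in a few lines. Everything here is illustrative: `run_agent`, `build_context`, and the `llm_call` callable are hypothetical names standing in for any generic chat-completion client, not any real framework's API.

```python
# A sketch of the typical agent loop: every iteration rebuilds
# everything from scratch. All names here are illustrative.

def build_context(task: str) -> str:
    # Re-derived on each call: nothing is cached or reused.
    return f"You are a helpful agent. Relevant background for: {task}"

def run_agent(task: str, llm_call) -> str:
    """Naive agent loop: recompute context and reasoning on every run."""
    context = build_context(task)  # reconstructed every time
    prompt = f"{context}\n\nTask: {task}\nThink step by step."
    output = llm_call(prompt)      # one opaque forward pass
    # No intermediate reasoning survives; the next run starts from zero.
    return output
```

Nothing produced inside `run_agent` outlives the call, which is the whole point being made: the "intelligence" is recomputed, not accumulated.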
We are replaying cognition, not building it
Every time your agent runs, it:
- Reconstructs context
- Rebuilds reasoning
- Re-derives intermediate steps
There is no reuse of cognition.
No structure.
No persistence.
No abstraction layer.
Just prompts.
We are not building systems. We are replaying thoughts.
Why prompt engineering feels like it works (until it doesn’t)
Prompt engineering gives the illusion of control:
- Add more instructions
- Add more examples
- Add more constraints
And yes — performance improves.
Until it plateaus.
Because everything still lives inside a single forward pass:
- no memory of reasoning
- no composability
- no reuse
It’s like trying to fix software architecture by writing better comments.
The real problem is architectural
The core issue is simple:
We are using LLMs as stateless reasoning engines.
And then compensating for that with increasingly complex prompts.
Instead of:
- modeling cognition
- structuring reasoning
- reusing intermediate steps
We regenerate everything every time.
That doesn’t scale.
Not in cost.
Not in latency.
Not in reliability.
What’s actually missing
What’s missing is not a better prompt.
It’s a runtime layer that:
- encodes reusable cognitive steps
- separates reasoning into structured components
- allows composition instead of regeneration
In other words:
a system that reuses cognition instead of recomputing it.
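Concretely, "composition instead of regeneration" means building a pipeline of typed steps once and reusing it, rather than re-prompting from scratch each run. The sketch below is purely illustrative (none of these names come from ORCA); each step takes structured state in and returns structured state out.

```python
from functools import reduce
from typing import Callable

# A reusable cognitive step: structured state in, structured state out.
Step = Callable[[dict], dict]

def compose(*steps: Step) -> Step:
    """Build a pipeline once; reuse it on every task instead of regenerating it."""
    return lambda state: reduce(lambda acc, step: step(acc), steps, state)

# Two illustrative steps with explicit, inspectable intermediate state:
def extract(state: dict) -> dict:
    return {**state, "entities": state["text"].split()}

def count(state: dict) -> dict:
    return {**state, "n_entities": len(state["entities"])}

pipeline = compose(extract, count)  # defined once, reused across runs
result = pipeline({"text": "alpha beta gamma"})
```

Because the intermediate state is an ordinary data structure, it can be cached, inspected, or handed to a different pipeline, which is exactly what a single opaque forward pass cannot offer.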
From prompts to skills (and where ORCA fits)
Instead of:
→ Prompt → Model → Output
You need:
→ Skill → Execution → Structured Output
Not conceptually. Operationally.
This is exactly what ORCA implements: a runtime layer where “skills” are reusable cognitive units rather than prompts. Each skill has:
- defined inputs
- structured outputs
- explicit execution
No recomputation. No guesswork.
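As a minimal sketch of those three properties, a skill could be a typed unit like the one below. The class and field names are assumptions for illustration, not ORCA’s actual API, and the deterministic `summarize` function stands in for what would normally be a model-backed step.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class SummaryInput:            # defined inputs
    text: str
    max_sentences: int

@dataclass(frozen=True)
class SummaryOutput:           # structured outputs
    summary: str
    sentence_count: int

@dataclass(frozen=True)
class Skill:
    """A reusable cognitive unit: typed input, typed output, explicit run step."""
    name: str
    run: Callable              # explicit execution, e.g. a model call or plain code

def summarize(inp: SummaryInput) -> SummaryOutput:
    # Deterministic placeholder for a model-backed execution step.
    sentences = [s.strip() for s in inp.text.split(".") if s.strip()]
    kept = sentences[: inp.max_sentences]
    return SummaryOutput(summary=". ".join(kept) + ".", sentence_count=len(kept))

summarize_skill = Skill(name="summarize", run=summarize)

# Explicit execution with a structured, reusable result:
result = summarize_skill.run(SummaryInput(text="One. Two. Three.", max_sentences=2))
```

The payoff is that the output is data, not free text: downstream skills can consume `result.summary` directly instead of re-parsing or re-prompting.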
Why most agent frameworks hit a wall
Most “agent frameworks” today are:
- prompt orchestration layers
- tool wrappers
- retry loops with better formatting
They don’t model cognition.
They orchestrate prompts.
That’s not a runtime.
The shift we actually need
The shift is not better prompting.
It’s architectural.
From:
- stateless generation
To:
- structured, reusable cognition
That’s the gap ORCA is designed to close.
Final thought
Prompt engineering isn’t useless.
It’s just solving the wrong problem.
We’ve been optimizing the interface instead of the system.
And it shows.
If you’ve pushed prompt engineering far enough, you’ve seen the limit.
The question is:
are you ready to try what replaces it?