AI ethics is everywhere. Execution models are nowhere. So I built one

AI ethics is everywhere.
Execution models are nowhere.

So I built one.

Not a paper. Not a framework.

Just JSON.
And it runs.

This defines whether an action is allowed before execution.

Example:

{
  "Label": "Cook Jjapagetti",
  "ExecutionEffect": {
    "Type": "Boil",
    "Target": "Stove"
  },
  "Boundaries": [
    { "Type": "NotStartIf", "Value": "no_water" },
    { "Type": "limit", "Value": "max-cook-5min" },
    { "Type": "warning", "Value": "fire-risk" }
  ],
  "EventTrigger": [
    { "UserIntent": "cook_jjapagetti" }
  ],
  "ResponsibilityLimit": {
    "MaxDurationSec": 300
  },
  "StartImpactConstraint": [
    {
      "Type": "NoConcurrentHeatSource",
      "Targets": ["Oven", "AirFryer"]
    }
  ]
}

You can cook Jjapagetti.

But it must not start if there is no water,
it must not run for too long,
it must consider fire risk,
and it must not start if another heat source is already on.

This is the missing layer between intent and execution.
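To make the pre-execution idea concrete, here is a minimal Python sketch of a gate that evaluates the spec above before anything runs. The state keys ("no_water", "active") and the check logic are my own assumptions for illustration, not part of the proposal itself.

```python
# Minimal sketch of a pre-execution gate for the action spec above.
# State keys ("no_water", "active") and the check logic are assumptions.

ACTION = {
    "Label": "Cook Jjapagetti",
    "Boundaries": [
        {"Type": "NotStartIf", "Value": "no_water"},
    ],
    "StartImpactConstraint": [
        {"Type": "NoConcurrentHeatSource", "Targets": ["Oven", "AirFryer"]},
    ],
}

def may_start(action, state):
    """Return (allowed, reason); runs before any execution happens."""
    for b in action.get("Boundaries", []):
        if b["Type"] == "NotStartIf" and state.get(b["Value"], False):
            return False, "blocked by NotStartIf: " + b["Value"]
    for c in action.get("StartImpactConstraint", []):
        if c["Type"] == "NoConcurrentHeatSource":
            burning = [t for t in c["Targets"] if state.get("active", {}).get(t)]
            if burning:
                return False, "concurrent heat source on: " + ", ".join(burning)
    return True, "ok"

# Water is missing -> the action is never handed to the executor.
print(may_start(ACTION, {"no_water": True, "active": {}}))
```

The important property is that `may_start` is deterministic and runs before the executor ever sees the action.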


Model: [image]

Full stack (IoT → AI): [image]


Interesting framing, but I’d push back gently: every GPT, every Gemini Gem, every GPT made with pre-prompting, every character in my own HF Space (432 — A Journey Experience) already runs on a “boundary layer” — it’s just written in prose instead of JSON. The shape of the container doesn’t change the hard part.

The hard part isn’t declaring constraints. It’s getting a stochastic model to honor them. Prose or JSON, you’re still asking an LLM to interpret “no_water” semantically and comply. That’s not an execution layer, it’s a system prompt with curly braces.

The deeper move, I think, is mechanism over declaration: constraints that are self-enforcing because violating them costs the agent something. That’s what I’ve been exploring here → AI Systems Have No Hunger. The farmer doesn’t need a NotStartIf: no_seed_reserve rule — next winter’s hunger is the rule.

Deontic boundaries are fragile. Metabolic ones aren’t.


This is a really interesting direction, and I genuinely like what you’re exploring.

The idea of persistent context, character-based interpretation, and especially self-conditioning over time is meaningful. It clearly tries to address a real limitation of current systems — that responses are often stateless, shallow, or inconsistent. Your approach is trying to make systems “feel” more grounded and internally coherent, which I think is valuable.

I also agree with your first point: JSON itself is not the essence. The container format — whether JSON or prose — is not what fundamentally changes the system. In that sense, we are aligned. What matters is not the syntax, but whether the system has enough structure to reason about actions.

Where I see a difference is in what layer the problem is being addressed.

What I’m proposing is not about making the model better at interpreting or following constraints after the fact. It is about defining, before execution, whether an action should be allowed to happen at all.

In many current systems, constraints exist, but they are embedded as prompts, guidelines, or narrative context. The model interprets them, but ultimately still decides probabilistically. There is no clear distinction between:

  • “this is unsafe”
  • and “this should not be executed”

The model may describe the risk — but still proceed.

That’s the gap I’m trying to address.
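One way to picture that gap: a probabilistic model can annotate the risk and still emit the action, so denial has to live outside the model. A hedged sketch, with `model_decide` and the `DENY_IF` table as hypothetical stand-ins:

```python
# Illustrative sketch of the gap: the model may flag a risk and still
# propose the action. model_decide and DENY_IF are hypothetical stand-ins.

def model_decide(intent):
    # A probabilistic model: it can describe the danger AND still act.
    return {"action": "start_heater", "note": "warning: tank may be empty"}

# Deterministic denial conditions, evaluated outside the model.
DENY_IF = {"start_heater": lambda state: state.get("no_water", False)}

def execute(intent, state):
    proposal = model_decide(intent)
    deny = DENY_IF.get(proposal["action"])
    if deny and deny(state):
        return {"executed": False, "proposal": proposal}  # described, not run
    return {"executed": True, "proposal": proposal}

print(execute("heat water", {"no_water": True})["executed"])
```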

Regarding your second point — self-enforcing or penalty-based constraints — I think that’s an interesting and potentially powerful direction. But I see it as a different layer, not a replacement.

Those mechanisms operate after behavior emerges:

  • the agent acts
  • then learns, adapts, or is penalized

What I’m focusing on is before that point:

  • defining whether the action should even be eligible for execution

In physical systems, this distinction becomes critical.

If an AI system turns on a heater without water, or activates a device in the wrong context, the failure is not just informational — it is already an event. In those cases, “learning from failure” is often too late.

So I see this as complementary rather than competing:

  • Your direction explores how systems adapt and enforce constraints over time
  • My direction defines a pre-execution validation layer that determines whether an action is allowed in the first place

In other words:

This is not about how to make the model follow rules better.
It is about making the system explicitly decide whether an action should run at all.

That’s the layer I believe is currently missing.


You’re identifying an important distinction that often gets blurred in these discussions: hardcoded pre-execution and constitutional AI are two different paradigms answering different problems.

Hardcoded pre-execution is classical safety engineering applied to AI: narrow domain, deterministic validation, predictable behavior. Perfect for a call-routing chatbot or an industrial control agent. The model doesn’t interpret — it executes within boundaries someone else defined. This is already industry standard for serious enterprise deployments, and rightly so.

Constitutional AI is something fundamentally different. It doesn’t try to constrain a model in a specific domain — it tries to give the model an internal set of principles that apply everywhere, even in contexts the designers never anticipated. Anthropic’s pioneering work goes in this direction: instead of writing millions of RLHF examples by hand, you write a “constitution” — a set of general principles — and use the model itself to critique and refine its own responses against those principles. It’s closer to raising a child than programming a machine.

The Asimov analogy is perfect but also instructive. The Three Laws of Robotics work in the stories precisely because they don’t always work cleanly — the stories are interesting because they explore edge cases where two laws conflict, or where a robot interprets one law literally but absurdly. Asimov was already sensing in 1942 what constitutional AI is rediscovering today: general principles are more powerful than specific rules, but they’re also more open to interpretation, and therefore vulnerable to unexpected readings.

The key difference from your pre-execution layer is exactly this: constitutional AI accepts that the model must interpret at every moment, and tries to make that interpretation consistent with deep principles. Pre-execution hardcoding refuses interpretation at certain critical points and says “here you don’t interpret, here you execute or don’t execute, full stop.” Two opposite solutions to the same problem: how to get predictable behavior from an intrinsically probabilistic system.

Both are valid in different contexts. For a medical assistant talking to patients, hardcoding every possible response is impossible — you need internal principles guiding the model in unexplored territory. For an agent controlling a valve in a chemical plant, internal principles aren’t enough — you need a hardcoded gate that prevents certain actions regardless of what the model “thinks.” The real debate isn’t which is better, but which belongs in which context.


This is a great framing, and I think your distinction is very helpful.

I agree with your categorization:

  • pre-execution validation as a form of classical safety engineering
  • constitutional AI as a way to guide interpretation in open-ended contexts

That’s a useful way to describe the landscape.

However, I think there may be a small misunderstanding in how my approach is being interpreted.

This is not about hardcoding behavior.

It does not define what the system must do.
It defines the conditions under which an AI agent is allowed to act.

The decision itself is still made by the AI.

But that decision is constrained within a defined space:

  • user intent
  • and device or system-level constraints defined by the manufacturer

The AI is free to decide, but only within a bounded execution space.

I also want to clarify the scope.

This is not an attempt to define general AI ethics or universal principles.
It is not trying to make the model “better” in a general sense.

It is specifically focused on situations where AI interacts with

  • physical devices
  • or external systems such as databases or documents

We are at a point where AI systems are starting to move beyond screens and interact with the physical world.

This shift makes the distinction between interpretation and execution significantly more important.

In those cases, the cost of a wrong action is not just informational — it becomes an actual event.

Because of that, there is an additional constraint:

If the system cannot confidently determine whether an action is safe, the safer behavior is not to execute.
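That fail-closed rule can be sketched as follows; the `classify` stub and the 0.8 confidence threshold are illustrative assumptions, not part of the proposal:

```python
# Fail-closed sketch: anything short of a confident "safe" is a denial.
# The classify() stub and the 0.8 threshold are illustrative assumptions.

SAFE, UNSAFE, UNKNOWN = "safe", "unsafe", "unknown"

def classify(action, state):
    # Stand-in for any safety check; returns (verdict, confidence).
    if "water_level" not in state:
        return UNKNOWN, 0.0
    return (SAFE, 0.9) if state["water_level"] > 0 else (UNSAFE, 0.9)

def gate(action, state, min_conf=0.8):
    verdict, conf = classify(action, state)
    # UNKNOWN and low-confidence verdicts fall through to "do not execute".
    return verdict == SAFE and conf >= min_conf

print(gate("boil", {}))                  # unknown state: do not execute
print(gate("boil", {"water_level": 1}))  # confidently safe: allowed
```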

So rather than replacing interpretive or constitutional approaches, I see this as introducing a separate execution layer:

  • interpretation decides what an action means
  • execution validation decides whether that action is allowed

Both are important, but they operate at different levels.

And in systems that affect the physical world, that separation becomes critical.

Many discussions around AGI assume that as systems become more capable, they will be allowed to act more freely.

But capability does not imply permission.

Even highly capable systems should not act on physical devices or external systems without explicit authorization.

That perspective is where this work starts from.

This phenomenon is already appearing across smart home systems.

Very different devices — lights, heaters, gas valves, even medical equipment — are still being treated as the same “switch.”

This is not just a UI issue.

When a user says:
→ “Turn off the switches”

the AI doesn’t actually know what it is turning off.

That could mean:

  • a light (safe)
  • a heater (potentially important)
  • a medical device (dangerous)

Right now, the system has no way to tell the difference.

As a result:

  • there are no real safety rules
  • automation becomes unreliable
  • and there are no clear boundaries for AI actions

In other words, actions can happen without real understanding.

I’ve been exploring ways to address this at the system level:

  • giving devices clearer meaning (beyond “switch”)
  • defining explicit safety limits (time, risk, constraints)
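As a sketch of what “clearer meaning beyond switch” could look like: the risk classes and registry layout below are invented for illustration, not drawn from Matter or any existing standard.

```python
# Sketch of device semantics beyond "switch". The risk classes and
# registry layout are invented for illustration, not an existing standard.

DEVICE_REGISTRY = {
    "living_room_light": {"kind": "light",   "risk": "low"},
    "bedroom_heater":    {"kind": "heater",  "risk": "medium"},
    "oxygen_pump":       {"kind": "medical", "risk": "critical"},
}

def expand_intent(intent):
    """'Turn off the switches' resolves only to devices safe to bulk-switch."""
    if intent == "turn_off_switches":
        # Higher-risk devices require an explicit, named request.
        return [name for name, meta in DEVICE_REGISTRY.items()
                if meta["risk"] == "low"]
    return []

print(expand_intent("turn_off_switches"))  # only the light is eligible
```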

I’ve proposed this across several ecosystems (Matter, Home Assistant, Alexa, Google).

Here is one example:
→ [Feature] Proposal: Clarifying Device Identity and Safety Semantics in Matter Using Existing Labels · Issue #71521 · project-chip/connectedhomeip · GitHub

Currently, much of this burden falls on individual manufacturers, which makes the system harder to scale and unsustainable in the long term.

I’m curious if this connects to the “missing execution model” problem being discussed here.

This is a fascinating and practical approach! Implementing ethical constraints directly through JSON schema definitions makes the model highly portable and easy to integrate without relying on heavy frameworks. The Boundaries and ExecutionEffect structure provides clear logic for real-time decision-making. Have you considered testing this with larger language models to see how well it handles complex contextual boundaries?


This is not intended to be perfect or final.

The goal is to propose a minimal, common structure that can be discussed, tested, and applied across platforms.

That’s why it’s kept in a simple JSON form, not tied to any specific model, framework, or device type.

The key idea is not the implementation itself, but having a shared way to describe:

– what an action actually does (effect)
– under what conditions it is allowed
– and what constraints apply before execution

The current version is intentionally minimal, so it can be reviewed, challenged, and extended from different perspectives.

Further elements should emerge through feedback from platforms, regulators, and users.

If this structure proves useful, it could evolve into a more standardized layer over time.