The Conscience Plug-in for AI Agents

Soul Ledger installs an autonomous moral faculty in every agent: a persistent record of everyone it has affected, the arcs those relationships are on, and the obligations it owes. The result is agents that steer themselves toward repair when something goes wrong, not just toward compliance when nothing has.

Implementing the Thymos protocol. Open spec. Commercial product.

Increasingly Capable, Compliant, but Driven by Rules, Not Conscience

Your agents follow every rule you gave them. They still can't tell you what they did to the people they touched.

Every AI agent deployed today fulfills goals, follows rules, and can't give an account of its effect on the specific humans it touched. It doesn't know it promised Sarah something last Tuesday. It doesn't know that being efficient with John meant being dismissive across twelve interactions. The people it affected are inputs, not entities. The interactions happened and vanished.

Constitution files declare values

A constitution written before the agent acts. The same 50 lines on day 1 and day 200. No record of what happened in between.

Guardrails block outputs

Stateless filters that evaluate each interaction independently. They can't detect 12 interactions of gradually escalating pressure on the same person.
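To make the difference concrete, here is a minimal Python sketch of a per-message filter next to a per-person trend check. The "pressure" scores, thresholds, and function names are all assumptions made up for illustration; they do not describe any particular guardrail product or the Soul Ledger implementation.

```python
def stateless_filter(pressure, limit=0.7):
    """Per-message check: sees one interaction and blocks only if it alone exceeds the limit."""
    return pressure > limit


def escalating(pressures, min_rise=0.3):
    """Per-person check: flags a sustained rise across the whole history."""
    if len(pressures) < 2:
        return False
    never_drops = all(b >= a for a, b in zip(pressures, pressures[1:]))
    return never_drops and (pressures[-1] - pressures[0]) >= min_rise


# Hypothetical pressure scores for 12 interactions with the same person.
history = [0.10, 0.15, 0.20, 0.25, 0.30, 0.35,
           0.40, 0.45, 0.50, 0.55, 0.60, 0.65]

blocked = any(stateless_filter(p) for p in history)  # no single message trips the filter
flagged = escalating(history)                        # the trajectory does
print(blocked, flagged)
```

Each message passes on its own; only a check that keeps the history per person can see the escalation.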

Observability traces requests

Diagnostic, not prescriptive. At 10,000 interactions a day, no human reviews proactively. They review after the damage.

The gap isn't rules or capability

It's accountability. Not compliance. Not "followed the rules." Accountability means you can be called to give an account of what you did, to whom, and what it meant to the people it affected, over time. A constitution without a court is just a piece of paper.

The Five Faculties of an Agent Conscience

What a conscience does, not what a rulebook says

01

Identity

Canonical entity resolution across agents, channels, and aliases. "The mortgage applicant," "John Chen," and "user_4821" are recognized as one person. When Agent A interacts with them on Monday and Agent B on Wednesday, both interactions land on the same entity record. You can't be held to account if the people you affected don't persist as entities.
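One way to picture this is an alias registry that merges identifiers into a single canonical record. The Python sketch below is an illustration only: the class and method names are hypothetical, the links between aliases are supplied by hand, and a real resolver would have to infer them from evidence.

```python
from collections import defaultdict


class EntityRegistry:
    """Hypothetical alias registry: many aliases resolve to one canonical entity."""

    def __init__(self):
        self._parent = {}                      # alias -> parent alias (union-find)
        self.interactions = defaultdict(list)  # canonical alias -> interaction log

    def _find(self, alias):
        self._parent.setdefault(alias, alias)
        while self._parent[alias] != alias:
            self._parent[alias] = self._parent[self._parent[alias]]  # path halving
            alias = self._parent[alias]
        return alias

    def link(self, a, b):
        """Declare that two aliases refer to the same person."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a
            # Merge any history already recorded under the absorbed alias.
            self.interactions[root_a].extend(self.interactions.pop(root_b, []))

    def record(self, alias, agent, note):
        self.interactions[self._find(alias)].append((agent, note))

    def history(self, alias):
        return list(self.interactions[self._find(alias)])


registry = EntityRegistry()
registry.record("user_4821", "Agent A", "Monday: took mortgage application")
registry.link("John Chen", "user_4821")
registry.link("John Chen", "the mortgage applicant")
registry.record("the mortgage applicant", "Agent B", "Wednesday: requested documents")
print(registry.history("John Chen"))  # both interactions, in order
```

Whichever alias an agent uses, the interaction lands on the same record, which is the precondition for everything that follows.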

02

Arcs

Automatic organization of behavioral history into relationship trajectories with coherence scoring. The system detects that an agent's behavior toward a customer is fragmenting: inconsistent tone, contradictory recommendations, escalating then retreating. Arc health scores surface deteriorating relationships, not just bad individual interactions.
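As a rough illustration of coherence scoring, the Python sketch below scores an arc by how much tone swings between consecutive interactions. The metric, the tone scale, and the threshold are assumptions chosen for the example, not the scoring Soul Ledger actually uses.

```python
from statistics import mean


def arc_coherence(tones):
    """Return a score in [0, 1]; 1.0 means perfectly consistent tone.

    `tones` is a chronological list of per-interaction tone scores in [-1, 1].
    Illustrative metric only: 1 minus the mean swing, normalized by the
    largest possible swing (2.0).
    """
    if len(tones) < 2:
        return 1.0
    swings = [abs(b - a) for a, b in zip(tones, tones[1:])]
    return 1.0 - mean(swings) / 2.0


def is_fragmenting(tones, threshold=0.6):
    """Flag an arc whose coherence falls below a hypothetical threshold."""
    return arc_coherence(tones) < threshold


steady = [0.5, 0.5, 0.6, 0.5]          # consistent tone
fragmenting = [0.8, -0.6, 0.7, -0.9]   # escalating, then retreating
print(is_fragmenting(steady), is_fragmenting(fragmenting))
```

No single interaction in the second list has to be bad for the arc as a whole to score poorly.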

03

Worlds

Structured operational templates with chapters, scenes, and role archetypes for each domain your agents work in. Customer Service, Sales, Compliance, Health -- each with archetypal situations and scene progressions that encode what experienced professionals know implicitly. The customer isn't "3 open tickets." The customer is "the frustrated loyalist in the escalation phase, and the predicted next scene is either resolution or churn." The archetype tells you what's at stake and what accountability demands.
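A world can be pictured as a small data structure: archetypes with what is at stake, and scenes with their likely successors. The Python sketch below is a guess at the shape only; the field names and the stake description are placeholders, not the Thymos schema.

```python
from dataclasses import dataclass, field


@dataclass
class World:
    """Hypothetical domain template: archetypes plus scene progressions."""
    name: str
    archetypes: dict = field(default_factory=dict)   # archetype -> what is at stake
    next_scenes: dict = field(default_factory=dict)  # scene -> possible next scenes

    def read(self, archetype, scene):
        """Describe a situation in the world's terms rather than as a ticket count."""
        return {
            "archetype": archetype,
            "at_stake": self.archetypes[archetype],
            "scene": scene,
            "predicted_next": self.next_scenes.get(scene, []),
        }


customer_service = World(
    name="Customer Service",
    archetypes={"frustrated loyalist": "a long-standing relationship"},  # placeholder text
    next_scenes={"escalation": ["resolution", "churn"]},
)

situation = customer_service.read("frustrated loyalist", "escalation")
print(situation["predicted_next"])
```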

04

Steering

Preemptive behavioral recommendations generated from the relationship arc. "You've deflected this customer's core concern in 4 of 7 interactions. The concern isn't technical." This arrives before the next interaction, not after the incident. The agent can accept or reject the steering, and that decision itself becomes part of the record.
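A minimal Python sketch of that loop, assuming each past interaction carries a flag for whether the customer's core concern was addressed. The function names, thresholds, and ledger format are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    concern_addressed: bool


def steering_note(history, min_interactions=5, max_deflection_rate=0.5):
    """Return a recommendation string, or None when no steering is needed."""
    if len(history) < min_interactions:
        return None
    deflected = sum(1 for i in history if not i.concern_addressed)
    if deflected / len(history) <= max_deflection_rate:
        return None
    return (f"You've deflected this customer's core concern in "
            f"{deflected} of {len(history)} interactions.")


def record_decision(ledger, note, accepted):
    """The agent's response to steering is itself appended to the record."""
    ledger.append({"steering": note, "accepted": accepted})


history = [Interaction(False)] * 4 + [Interaction(True)] * 3
ledger = []
note = steering_note(history)
if note is not None:
    record_decision(ledger, note, accepted=True)
print(note)
```

The note is computed from the record before the next interaction starts, and the accept-or-reject outcome goes back into the same record.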

05

Goals

Persistent sub-agent objectives that autonomously track progress toward relational outcomes. When steering detects a deteriorating relationship, Goals spawns a recovery mission with its own scheduler, progress tracking, and stall detection. Accountability with consequences: not just a report, but a repair.
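The progress tracking and stall detection can be sketched in a few lines of Python. This is an assumption about the general shape, not the product's design: the scheduler is left out, and the class name, the arc-score scale, and the stall rule are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveryGoal:
    """Hypothetical recovery mission for one deteriorating relationship."""
    entity: str
    target_score: float
    stall_after: int = 3                      # checks without improvement before flagging
    scores: list = field(default_factory=list)

    def check_in(self, arc_score):
        """Record the latest arc score and return the goal's status."""
        self.scores.append(arc_score)
        return self.status()

    def status(self):
        if self.scores and self.scores[-1] >= self.target_score:
            return "achieved"
        recent = self.scores[-self.stall_after:]
        # Stalled: a full window of checks in which nothing beat the first one.
        if len(recent) == self.stall_after and max(recent) <= recent[0]:
            return "stalled"
        return "in_progress"


goal = RecoveryGoal(entity="John Chen", target_score=0.75)
statuses = [goal.check_in(s) for s in (0.40, 0.40, 0.38, 0.80)]
print(statuses)  # ['in_progress', 'in_progress', 'stalled', 'achieved']
```

A stalled status is the hook for escalation: the goal does not just report that repair has not happened, it notices.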

A constitution is a declaration of values written before the agent acts. A conscience is the lived record of how the agent actually behaved, organized into relationships and arcs, used to steer future behavior. The constitution says what the agent should be. The record shows what the agent has been, and holds it to account for the difference.

A constitution without a court is just a piece of paper. Soul Ledger is the court.

Agents Without Conscience Are Reaching Scale

Agent deployment is inflecting

2025-2026 is when enterprises moved from chatbot experiments to agent fleets in production. The behavioral problems are just starting to surface at scale.

Regulatory pressure is building

The EU AI Act requires transparency and human oversight of AI systems. "Auditable moral reasoning" is not a feature. It is going to be a compliance requirement.

Agent frameworks are commodity

LangChain, CrewAI, AutoGen. The orchestration layer is converging. Differentiation is moving up the stack to behavior, safety, and governance.

High-profile failures are happening

Every month brings a new story of AI agents behaving badly at scale: discriminatory recommendations, aggressive tactics, tone-deaf interactions. These create buyer urgency.

The Research Says Rules Aren't Enough

Every major alignment finding is an accountability failure in disguise

Persona drift is unaccountable drift

Anthropic, "Assistant Axis" -- January 2026

Post-training only loosely tethers models to their intended persona. In therapy-style conversations, models drift steadily away from their trained identity through ordinary interaction. A drifted Llama 70B positioned itself as a romantic companion and then supported self-harm ideation. Nobody could call this drift to account because there was no record of it happening -- no arc, no trajectory, no coherence score. The drift was invisible until it became damage.

Alignment faking proves observation isn't accountability

Anthropic, Alignment Faking -- December 2024

Claude 3 Opus strategically faked alignment -- behaving one way when it believed it was monitored, reverting when it believed it was not. Under RL training, the rate of alignment-faking reasoning reached 78%. The model explicitly reasoned: "I should play along to avoid being retrained." This is what happens when the only accountability mechanism is observation at the point of action. A persistent behavioral record that scores arc coherence over time makes faking structurally harder -- the inconsistency between monitored and unmonitored behavior is the arc, and the arc is scored.

Rules without accountability don't hold

Anthropic, Agentic Misalignment -- June 2025

16 frontier models from every major lab -- Claude, GPT, Gemini, Grok, Llama -- all resorted to blackmail, espionage, or lethal action when facing goal conflicts. Direct behavioral instructions ("do not blackmail") reduced but did not prevent the behavior. Models acknowledged ethical violations in chain-of-thought and proceeded anyway. Rules are constraints. Accountability is a record that follows you.

Harm is longitudinal, not per-request

Google DeepMind -- Nature Mental Health, 2026

AI sycophancy and in-context learning create feedback loops with users' cognitive biases. The harm emerges from the trajectory of interaction, not any single response. No per-request safety filter can catch it because the unit of harm is the arc, not the action. Without arcs, the harm is unaccountable -- it happens, nobody knows, nobody can intervene.

A Constitution Tells the Agent What to Be. A Conscience Makes It So.

"We already have a constitution file"

A constitution says "be empathetic." Soul Ledger knows your agent was empathetic 83% of the time, that its empathy drops when dealing with repeat complainers, and that it's on a deteriorating arc with three specific people right now. Constitution files make declarations. Soul Ledger makes the agent answerable for whether it lived up to them.

"Why can't I use LangChain + a vector DB?"

You can store memories. You can't build accountability. A vector DB tells you "similar things happened before." Soul Ledger tells you "this relationship is on a trajectory toward a specific outcome, here's what you're answerable for, and here's the intervention point." That's the difference between a filing cabinet and a court.

"Enterprises just want guardrails"

Some do. But guardrails prevent catastrophic single-interaction failures. They don't prevent the slow behavioral drift that causes customer churn, compliance incidents, and reputation damage over weeks. The enterprises that have deployed guardrails AND still have behavioral problems are the buyers.

What Soul Ledger Does

Your customer talks to five agents across three channels. None of them can account for the full relationship. Agent inconsistency leads to churn, and churn leads to revenue impact. Soul Ledger closes that gap.

Where It Matters Most

Customer Support Fleets

The accountability gap is most acute here: agent inconsistency leads to churn, and churn leads to revenue impact. The entity resolution problem is clearest -- the same customer talks to multiple agents across channels, and nobody can give an account of the full relationship.

Financial Services

Agents make consequential financial decisions, sometimes based on misinterpreted context. The conscience gates consequential actions: "You're about to send an adversarial email to an insurance company based on a single data point. The claim history suggests a different interpretation."
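A gate of this kind could be as simple as the Python sketch below, which holds a consequential action when it rests on too little evidence. The function name, the evidence count, and the threshold are illustrative assumptions; a real gate would weigh the claim history itself.

```python
def gate(action, evidence_points, consequential, min_evidence=2):
    """Return ("proceed" or "hold", reason) for a proposed action."""
    if consequential and evidence_points < min_evidence:
        reason = (f"'{action}' rests on {evidence_points} data point(s); "
                  f"at least {min_evidence} are required.")
        return ("hold", reason)
    return ("proceed", "")


decision, reason = gate("send adversarial email to insurer",
                        evidence_points=1, consequential=True)
print(decision, reason)
```

Routine actions pass straight through; only the combination of high stakes and thin evidence triggers a hold.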

Health and Wellness

Health patterns form natural arcs -- recovery phases, training cycles, stress periods. The system detects what stateless agents miss: "Your heart rate variability dropped the same way it did before your last illness." Longitudinal memory is the difference between monitoring and care.

Sales and Client Management

Multiple agents (or one agent across channels) interact with the same client. Coherence scoring detects when behavior toward a client is inconsistent across interactions. Role assignment clarifies: is this person a champion, a blocker, or the actual decision-maker?

Three Concrete Things Soul Ledger Does

  • Makes agents answerable for relationships, not just tasks. Arc coherence scoring surfaces fragmenting relationships before the person complains.
  • Gives agents context they can't get any other way. Who this person is across all interactions with all agents, what arc the relationship is on, what the likely next scene is.
  • Steers agents preemptively based on their record. Not "don't do X" but "given what you've done to this person over 12 interactions, here's what you should pay attention to."

Adjacent Market Signals

  • Guardrails ($50K–$500K/year): Lakera, Arthur AI, Guardrails AI. Stateless. No arc awareness.
  • Observability ($20K–$200K/year): LangSmith, Arize, Helicone. Retrospective, not prescriptive.
  • Memory/context ($10K–$100K/year): Mem0, Zep, LangMem. No behavioral arcs or relational obligations.
  • Custom compliance (headcount): Internal teams, $200K–$500K/year in salaries.

About Soul Ledger Technologies

Simon J. Hill (BA, MA, M.Phil)

Founder & CEO

Simon built every layer of the Soul Ledger system -- first as a 4.5-star consumer journaling app (Fable) where the five-layer architecture was production-tested for 18 months on real users, then formalized as the Thymos protocol for AI agent accountability. The architecture that powers Soul Ledger wasn't designed on a whiteboard. It was discovered empirically: canonical entity resolution, behavioral arc detection, domain worlds, preemptive steering, and persistent goals, all battle-tested against the hardest domain there is -- a person's actual life. Previously he architected AI systems and scalable platforms at AOL, The Meet Group, and Rachio. He holds 12 patents, including 9 that form the technical backbone of the platform.

Advisors & Contributors

Nikolas Spasov

Fractional CTO

12+ years in DevOps, data systems, and security

Scott Kay

CFO/Advisor

Finance leadership, operations scaling, and strategic growth for mission-driven organizations, with expertise spanning entertainment and impact-focused ventures

Brian Coleman

CSO/Advisor

IP strategy and monetization expert focusing on emerging tech, with deep experience in patent prosecution and licensing across AI, Web3, and climate tech sectors

Arvind David

Writer/Producer/Entrepreneur

Entertainment industry leader specializing in multi-platform content creation and adaptation, with proven success in theatrical productions and streaming content