Executable engineering standard: an AI skill is your ESLint for architecture

This article covers the review side of the methodology: a corpus of rules plus AI checks on PRs. Use Case Pattern also has a design side, which generates whole services from a business brief via the ucp-spec-design, ucp-pattern-design, and ucp-ddd-tactical-design skills. See Use Case Pattern for the big picture.

You already have a linter for commas. You don't have a linter for aggregates, domain events and transaction boundaries. That's exactly the gap that AI skills with a corpus of rules close.

The thesis: the decisions that for years lived in the heads of tech leads and were lost when they left can now be a versioned corpus in the team's repo. An AI agent applies them on every PR and cites the specific rule. Not a PDF nobody reads. Not a wiki that goes stale in a quarter. An executable standard.

In this article:

what an executable engineering standard is and how it differs from a regular style guide;
a comparison with SonarQube, ESLint, and traditional code review (table);
the "rule corpus + executor" architecture — two layers, one source of truth;
an example of an AI review of a PR with links to rules;
when such a standard is not needed (honestly).

1. What doesn't work in existing approaches

Let's look at how an "architectural standard" usually looks in a team of 20+ engineers:

Option A: a 40-screen Confluence page. Some tech lead wrote it in 2022. The team reads it once during onboarding. A year later the rules are stale: the stack has changed, new tools and approaches have appeared. Nobody updates the page. Two years later new hires treat it as outdated documentation.

Option B: code review through the tech lead's eyes. It works in a team of 5. In a team of 25 the tech lead becomes a bottleneck, repeating the same comments on every PR: "invariants belong in the aggregate, not in the Handler," "this isn't a value object, add equals," "timestamp without TZ: fix it." Six months later the tech lead burns out or leaves, and the knowledge leaves with them.

Option C: a linter (SonarQube, ESLint, Detekt). It catches stylistic problems and code-level bugs: cyclomatic complexity, unused variables, null dereferences. It doesn't catch architectural problems: aggregate boundaries, naming of domain events, the split between Domain Service and Application Service. And it shouldn't: a linter matches patterns in code and has no model of the domain.

All three options are real; none of them scales.

2. The executable engineering standard as the answer

The idea is simple: take what has worked well in linters — codified rules with unique identifiers — and apply it to the architectural decisions that linters usually can't reach.

rule = {unique code, short statement, explanation, OK example, NOT OK example}
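
As a rough illustration, the five fields above map naturally onto a small record type. This is only a sketch in Python: the class name, the field names, and the wording of the sample entry are assumptions made for this example, not the format the real corpus uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One corpus entry with the five fields listed above (hypothetical shape)."""

    code: str            # unique code, e.g. "R-AGG-X4"
    statement: str       # short statement of the rule
    explanation: str     # why the rule exists
    ok_example: str      # code that satisfies the rule
    not_ok_example: str  # code that violates the rule


# Illustrative entry: the text is paraphrased for this sketch,
# not copied from the published standard.
R_AGG_X4 = Rule(
    code="R-AGG-X4",
    statement="Domain events are registered inside the aggregate root.",
    explanation="The root owns its state, so it decides when an event has happened.",
    ok_example="order.pay(amount)  # pay() calls registerEvent(new OrderPaid(...))",
    not_ok_example="events.publishEvent(new OrderPaid(...))  # inside OrderHandler",
)

print(R_AGG_X4.code, "-", R_AGG_X4.statement)
```

Making the record immutable is a deliberate choice in this sketch: a rule changes through a PR to the corpus, not at runtime.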

Where a tech lead used to write in a comment "move the event registration out of the Handler and into the aggregate root," now the AI agent writes:

Violation of R-AGG-X4 — the OrderPaid event is registered in OrderHandler; it must be registered in the Order root itself via registerEvent(...). Source: https://vikulin-va.ru/standards/backend/ddd-tactical/#r-agg-x4

The developer clicks the link and lands on the rule with examples and rationale. If they want to argue with the rule, they open a PR in the standard's repo and discuss it with the team, and the rule gets updated.

The key difference from a regular style guide: the rule is executed on every PR, cited in the review with a direct link, and versioned in git together with the code.
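
The citation format in the example finding is simple enough to sketch. The base URL and the lower-cased anchor are taken from the link quoted above; Finding, rule_link, and render are hypothetical names for this illustration and are not part of the published skills.

```python
from dataclasses import dataclass

# Base URL copied from the example link in this article.
STANDARD_URL = "https://vikulin-va.ru/standards/backend/ddd-tactical/"


@dataclass(frozen=True)
class Finding:
    rule_code: str  # e.g. "R-AGG-X4"
    message: str    # what is wrong and what to do instead


def rule_link(rule_code: str, base_url: str = STANDARD_URL) -> str:
    """Build a deep link to a rule; the anchor is the rule code in lower case."""
    return f"{base_url}#{rule_code.lower()}"


def render(finding: Finding) -> str:
    """Format a finding as one review comment with the rule code and its source link."""
    return (
        f"Violation of {finding.rule_code}: {finding.message} "
        f"Source: {rule_link(finding.rule_code)}"
    )


print(render(Finding(
    rule_code="R-AGG-X4",
    message="the OrderPaid event is registered in OrderHandler, not in the Order root.",
)))
```

Because the link is derived from the code, a finding can never cite a rule without also pointing at its text.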

3. Comparison: SonarQube / ESLint / tech-lead review / executable standard

Criterion	SonarQube / ESLint	Tech-lead code review	Executable standard + AI
What it checks	style, code-level bugs, security issues	architecture, domain, corner cases	architecture, domain, corner cases
Who executes	static analyzer	a human	LLM agent
Codified	plugins, regex rules	in the tech lead's head	markdown corpus in the repo
Citeable in reviews	yes (`java:S1234`)	no (free text)	yes (`R-AGG-X4`)
Scales	yes	no (tech lead is the bottleneck)	yes
Understands the domain	no	yes	yes (through the corpus)
Versioned	via the linter's release	not at all	via git
Open/closed	open-source	private, in someone's head	the team's repo
Rule lifecycle	plugin releases	verbal agreements	git diff on the corpus

The executable standard fills the intersection of three properties that individual approaches don't deliver: it understands architecture (like a tech lead) + scales (like a linter) + is visible/editable by the team (like a git repository).
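
The "git diff on the corpus" lifecycle from the table can be made concrete with a small sketch that compares two versions of the corpus and reports which rule codes appeared or disappeared. The regular expression is an assumption inferred from the codes quoted in this article (R-AGG-X4, R-ENT-X3, PG-T-013); the real corpus may use other shapes. In practice the two texts could come from something like `git show <rev>:<path>`.

```python
import re

# Assumed code shape: PREFIX-GROUP-[X]NUMBER, e.g. R-AGG-X4 or PG-T-013.
RULE_CODE = re.compile(r"\b[A-Z]+-[A-Z]+-X?\d+\b")


def rule_codes(corpus_text: str) -> set[str]:
    """Collect every rule code mentioned in one version of the corpus."""
    return set(RULE_CODE.findall(corpus_text))


def corpus_changes(old_text: str, new_text: str) -> tuple[list[str], list[str]]:
    """Return (added, removed) rule codes between two corpus versions."""
    old, new = rule_codes(old_text), rule_codes(new_text)
    return sorted(new - old), sorted(old - new)


# Two tiny hand-written corpus versions, for illustration only.
old_version = "## R-AGG-X3\n...\n## R-AGG-X4\n..."
new_version = "## R-AGG-X4\n...\n## R-EVT-X4\n..."

added, removed = corpus_changes(old_version, new_version)
print("added:", added)      # added: ['R-EVT-X4']
print("removed:", removed)  # removed: ['R-AGG-X3']
```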

4. Architecture: rule corpus + executor

The standard consists of two layers, one source of truth.

┌─────────────────────────────────────────────────┐
│  Layer 1: rule corpus (markdown)                │
│                                                 │
│  • rules with codes `R-AGG-3`, `PG-T-013`, ...  │
│  • prose for humans, structured for AI          │
│  • live in the repo: site/ + .claude/docs/      │
│  • versioned by git                             │
└─────────────────────────────────────────────────┘
                       ↓ reads
┌─────────────────────────────────────────────────┐
│  Layer 2: AI skills (executor)                  │
│                                                 │
│  • small prompt-skills (~50 lines each)         │
│  • invoked on the team's git diff               │
│  • cite rules by code                           │
│  • live in .claude/skills/                      │
└─────────────────────────────────────────────────┘
                       ↓ applied to
┌─────────────────────────────────────────────────┐
│  Team's PR → review with concrete links         │
└─────────────────────────────────────────────────┘
                       ↓ click the link →
                  the rule on the site

What matters architecturally:

Rules aren't hardcoded into the skill. The skill is thin: a prompt of about 50 lines. The corpus is thick: 600+ rule codes. This lets you change the rules without touching the skills. The analogy is ESLint, where rules and configuration are kept separate from the engine.
One source of truth — two renders. The same .md file is rendered to HTML on the site through a parser (for humans) and read by the AI as plain text (for the LLM). They never diverge.
Skills are invoked by the developer explicitly. It's not "the AI reviews everything under the sun." A claude review command or a hook — your choice. Control stays with the human.
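
The split between a thin skill and a thick corpus can be illustrated with a short sketch. It does not reproduce how the real skills are loaded or invoked; SKILL_PROMPT, load_corpus, and build_review_prompt are hypothetical names, and the only point is that the rules are read from markdown files at review time instead of being written into the prompt.

```python
from pathlib import Path

# A deliberately thin skill: it says how to review, not what the rules are.
SKILL_PROMPT = (
    "Review the diff against the rules below. "
    "For every violation, cite the rule code and its link."
)


def load_corpus(corpus_dir: Path) -> str:
    """Concatenate every markdown file of the corpus as plain text."""
    files = sorted(corpus_dir.rglob("*.md"))
    return "\n\n".join(f.read_text(encoding="utf-8") for f in files)


def build_review_prompt(corpus_dir: Path, diff: str) -> str:
    """Combine the fixed skill text, the current corpus, and the PR diff."""
    return "\n\n".join([
        SKILL_PROMPT,
        "# Rules\n" + load_corpus(corpus_dir),
        "# Diff\n" + diff,
    ])
```

Editing a rule means editing a markdown file; build_review_prompt picks up the change on the next review without any change to SKILL_PROMPT.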

5. Example: a real PR review

Suppose a developer pushed code in which OrderHandler itself publishes the OrderPaid event and changes the state of Order through public setters (or direct field writes) instead of a business method. The same handler is shown below in Java, Go, TypeScript, and Python:

public class OrderHandler {
    private final OrderRepository orders;
    private final ApplicationEventPublisher events;

    @Transactional
    public void handle(PayOrder cmd) {
        Order order = orders.findById(cmd.orderId()).orElseThrow();
        order.setStatus(OrderStatus.PAID);
        order.setPaidAt(Instant.now());
        orders.save(order);
        events.publishEvent(new OrderPaid(order.id(), order.total()));
    }
}

func (h *OrderHandler) Handle(ctx context.Context, cmd PayOrder) error {
    order, err := h.orders.FindByID(ctx, cmd.OrderID)
    if err != nil {
        return err
    }
    order.Status = OrderStatusPaid
    order.PaidAt = time.Now().UTC()
    if err := h.orders.Save(ctx, order); err != nil {
        return err
    }
    return h.events.Publish(ctx, OrderPaid{OrderID: order.ID, Total: order.Total})
}

class OrderHandler {
  constructor(
    private readonly orders: OrderRepository,
    private readonly events: EventBus,
  ) {}

  async handle(cmd: PayOrder): Promise<void> {
    const order = await this.orders.findById(cmd.orderId);
    order.status = OrderStatus.PAID;
    order.paidAt = new Date();
    await this.orders.save(order);
    await this.events.publish(new OrderPaid(order.id, order.total));
  }
}

from datetime import datetime, timezone

class OrderHandler:
    def __init__(self, orders: OrderRepository, events: EventBus) -> None:
        self._orders = orders
        self._events = events

    async def handle(self, cmd: PayOrder) -> None:
        order = await self._orders.find_by_id(cmd.order_id)
        order.status = OrderStatus.PAID
        order.paid_at = datetime.now(tz=timezone.utc)
        await self._orders.save(order)
        await self._events.publish(OrderPaid(order_id=order.id, total=order.total))

After claude review, the ucp-ddd-tactical-review skill will produce:

Findings (3 critical)

R-AGG-X4 (antipattern, line 12): registering events outside the aggregate root
The OrderPaid event is published in OrderHandler, not in Order itself. It should be order.pay(...), which calls registerEvent(...) internally.
Source: https://vikulin-va.ru/standards/backend/ddd-tactical/#r-agg-x4

R-ENT-X3 (antipattern, line 11): public setters for every field
setStatus() and setPaidAt() break encapsulation. State changes only through business methods (order.pay(amount)).
Source: https://vikulin-va.ru/standards/backend/ddd-tactical/#r-ent-x3

R-EVT-X4 (antipattern, line 13): AFTER_COMMIT for critical effects
The event is published without transactional guarantees: it will be lost if the process crashes after the commit. You need an Outbox or synchronous delivery within the same transaction.
Source: https://vikulin-va.ru/standards/backend/ddd-tactical/#r-evt-x4

The developer sees 3 findings in the IDE, clicks the first link, reads the rationale, and rewrites the handler (again shown in Java, Go, TypeScript, and Python):

public class OrderHandler {
    private final OrderRepository orders;

    @Transactional
    public void handle(PayOrder cmd) {
        Order order = orders.findById(cmd.orderId()).orElseThrow();
        order.pay(cmd.amount());
        orders.save(order);  // save publishes the events via DomainEventPublisher
    }
}

// in Order.java
public void pay(Money amount) {
    if (status != OrderStatus.CONFIRMED) {
        throw new IllegalStateException("Cannot pay: " + status);
    }
    this.status = OrderStatus.PAID;
    this.paidAt = Instant.now();
    registerEvent(new OrderPaid(id, amount, paidAt));
}

func (h *OrderHandler) Handle(ctx context.Context, cmd PayOrder) error {
    order, err := h.orders.FindByID(ctx, cmd.OrderID)
    if err != nil {
        return err
    }
    if err := order.Pay(cmd.Amount); err != nil {
        return err
    }
    return h.orders.Save(ctx, order) // Save flushes the accumulated events via the outbox
}

// in order.go
func (o *Order) Pay(amount Money) error {
    if o.Status != OrderStatusConfirmed {
        return fmt.Errorf("cannot pay order in status %s", o.Status)
    }
    o.Status = OrderStatusPaid
    o.PaidAt = time.Now().UTC()
    o.events = append(o.events, OrderPaid{OrderID: o.ID, Amount: amount, PaidAt: o.PaidAt})
    return nil
}

class OrderHandler {
  constructor(private readonly orders: OrderRepository) {}

  async handle(cmd: PayOrder): Promise<void> {
    const order = await this.orders.findById(cmd.orderId);
    order.pay(cmd.amount);
    await this.orders.save(order); // save flushes the accumulated events via the outbox
  }
}

// in order.ts
pay(amount: Money): void {
  if (this.status !== OrderStatus.CONFIRMED) {
    throw new Error(`Cannot pay order in status ${this.status}`);
  }
  this.status = OrderStatus.PAID;
  this.paidAt = new Date();
  this.recordEvent(new OrderPaid(this.id, amount, this.paidAt));
}

from datetime import datetime, timezone

class OrderHandler:
    def __init__(self, orders: OrderRepository) -> None:
        self._orders = orders

    async def handle(self, cmd: PayOrder) -> None:
        order = await self._orders.find_by_id(cmd.order_id)
        order.pay(cmd.amount)
        await self._orders.save(order)  # save flushes the accumulated events via the outbox

# in order.py
def pay(self, amount: Money) -> None:
    if self.status != OrderStatus.CONFIRMED:
        raise ValueError(f"Cannot pay order in status {self.status}")
    self.status = OrderStatus.PAID
    self.paid_at = datetime.now(tz=timezone.utc)
    self._record_event(OrderPaid(order_id=self.id, amount=amount, paid_at=self.paid_at))
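
The comments in the rewritten handlers assume that save flushes the recorded events through an outbox. A minimal sketch of that assumption follows, using the standard-library sqlite3 module so that it runs as is. The table layout, the pull_events method, and the integer amount (instead of Money) are simplifications invented for this example; the relay that later reads the outbox table and delivers the events is out of scope.

```python
import json
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Order:
    id: str
    status: str = "CONFIRMED"
    _events: list[dict] = field(default_factory=list)

    def pay(self, amount: int) -> None:
        if self.status != "CONFIRMED":
            raise ValueError(f"Cannot pay order in status {self.status}")
        self.status = "PAID"
        # The aggregate records the event itself, as R-AGG-X4 asks.
        self._events.append({"type": "OrderPaid", "order_id": self.id, "amount": amount})

    def pull_events(self) -> list[dict]:
        """Hand over the recorded events and clear the buffer."""
        events, self._events = self._events, []
        return events


class OrderRepository:
    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        conn.execute("CREATE TABLE IF NOT EXISTS orders (id TEXT PRIMARY KEY, status TEXT)")
        conn.execute("CREATE TABLE IF NOT EXISTS outbox (id INTEGER PRIMARY KEY, payload TEXT)")

    def save(self, order: Order) -> None:
        events = order.pull_events()
        # One transaction: the state change and its events commit or roll back together.
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO orders (id, status) VALUES (?, ?)",
                (order.id, order.status),
            )
            self._conn.executemany(
                "INSERT INTO outbox (payload) VALUES (?)",
                [(json.dumps(event),) for event in events],
            )


conn = sqlite3.connect(":memory:")
repo = OrderRepository(conn)
order = Order(id="o-1")
order.pay(100)
repo.save(order)
print(conn.execute("SELECT payload FROM outbox").fetchall())
```

Because the order row and the outbox row are written in the same transaction, a crash cannot leave a paid order without its OrderPaid record, which is the gap R-EVT-X4 points at.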

The cycle "rule → AI review → link → understanding → fix" takes ~10 minutes instead of "argue in a PR comment with the tech lead → wait 2 days for a reply → back and forth → merge conflict." The tech lead only shows up if the developer wants to discuss the rule itself.

6. When an executable standard is NOT needed

Honestly. The approach isn't for everyone.

Not needed if:

The team is smaller than ~5 engineers. Tech-lead review covers everything; formalization is overhead. Adopt it when the team grows.
An early prototype with frequent changes of direction. The architecture changes too fast. Codification will be stale within a week. First validate the product's viability, then standardize.
80% of the code base is legacy code with no refactoring plans. Applying new rules to 5-year-old code only generates irritation. Introduce the standard together with new services.
The team doesn't trust AI reviews. Cultural preparation first, skills second. Otherwise the skills get switched off and the corpus freezes.

Clearly needed when:

The team has 10+ engineers on a multi-service project. The tech-lead bottleneck is real; each service's architecture diverges without formalization.
Greenfield or a new service. This is where it pays off most: record the patterns while they're still fresh in your head.
High turnover or a growing team. Knowledge leaves with people. The corpus is the team's memory that they can't take with them.
A heavily regulated compliance or safety domain. In automotive, healthcare, or finance, "that's how we do it here" isn't enough, and you need an audit trail of why a review passed.

7. What an executable standard does NOT do

To avoid inflated expectations:

It doesn't write the architecture for you. The decision about aggregate boundaries, the choice of maturity level (1/2/3), which stack — that's on the human.
It doesn't replace the architect. The review supports a decision, it doesn't replace it; the human decides.
It doesn't declare "the truth." Every rule in the repo is debatable. Disagree — open a PR in usecase-pattern-skills.
It doesn't work on legacy code without adaptation. Applying R-AGG-X3 to a tightly coupled 2018 service just manufactures pain. Introduce it at the boundary of new modules.
It doesn't catch code-level bugs (NPE, deadlock, off-by-one). That's SonarQube/ESLint's job. The AI standard is a layer ABOVE.

8. Next

Use Case Pattern — the methodology the rules are built on
Standards — the current corpus: 17 style guides, 1300+ rules, 44 AI skills
DDD Tactical Style Guide — an example of the corpus in action
usecase-pattern-skills on GitHub — the open skills for Claude Code