A teaching walkthrough

Deep modules — and why they matter more in the AI era

A guided tour of module shape — the difference between code that hides complexity and code that just relocates it — built from first principles, then walked through a real refactor.

I watched Matt Pocock's "Software Fundamentals Matter More Than Ever" (AI Engineer, April 2026) and wanted to really understand the deep-modules idea he keeps circling — not just nod along. So I sat down with Claude and worked it from the ground up: what a module actually is, why Ousterhout's deep/shallow framing matters more in the AI era, and what it looks like in real code. This is what now guides me when I build with agents — the lens behind every interface decision.

1 · What is a module?

A module is any chunk of code with a line drawn around it. A file. A folder. A class. A package. A service. A repo. The unit doesn't matter — the line does.

What "line drawn" actually means

Forget the metaphor — think mechanism. When you put functions in a file and pick which ones to export, you've drawn a line. Exported = the outside can see it. Not exported = it stays inside. A folder with an index.ts that re-exports just three things has drawn its line at those three.

Same idea at every scale: a class draws its line with public/private. A REST API draws its line at the URL surface. A team draws its line at "what we own vs what we ask another team for." The literal mechanism changes; the concept is the same — something on this side, something on that side, and a deliberate boundary between them.

Every module has two parts:

Interface — the surface other code touches. Function signatures, exported types, public methods, REST endpoints. What a caller has to learn.
Implementation — everything inside that the caller doesn't have to know. Helpers, internal state, private functions, the actual logic.
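
To make the export boundary concrete, here is a minimal sketch of a one-file module. The file name and all three function names are hypothetical, chosen only for illustration. The single exported function is the interface; the two helpers are implementation.

```typescript
// slug.ts (hypothetical file, for illustration only)

// Not exported: invisible to importers, so free to change or delete.
function stripAccents(s: string): string {
  // NFD splits "é" into "e" plus a combining mark; the regex drops the marks.
  return s.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Not exported: lowercases, turns runs of non-alphanumerics into one dash,
// then trims dashes from both ends.
function toDashed(s: string): string {
  return s
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Exported: the entire interface of this module.
export function slugify(title: string): string {
  return toDashed(stripAccents(title));
}
```

A caller imports slugify and nothing else, so stripAccents and toDashed can be renamed or rewritten without any caller noticing.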

Who is "the caller"?

The caller is whoever uses your code. If you wrote loadSources(...), the caller is the route file that runs await loadSources(...). Caller = the line of code that says "hey, please run this for me." Anything that imports your function, calls your endpoint, or instantiates your class is a caller.

So the line a module draws is really the border between the caller's interest and the module's inside interest. The caller cares about the interface; the module cares about its insides; the line says "you don't need to look across this."

The whole point of drawing a line is to declare a contract: "outside this line, you only need to know X. Inside, the complexity is my problem."

Vocabulary check

"Module" is deliberately broad here. A single function with parameters has an interface (the signature) and an implementation (the body). A folder with a public index.ts has an interface (its exports) and an implementation (everything else). The shape of the argument is the same at every scale.

2 · Deep vs shallow modules

This is John Ousterhout's framing from A Philosophy of Software Design. Picture a module as a rectangle. The width is the size of its interface — how much a caller has to learn. The depth is how much functionality lives behind it.

Deep — narrow on top, fat below. One simple call, lots happening inside. Example: Unix read(fd, buf, n). Three arguments. Behind them: filesystem layers, page cache, device drivers, networking. The caller doesn't care.
Shallow — wide on top, thin below. The interface is almost as complicated as the implementation. Example: an 8-argument wrapper that just forwards to another function. You pay full interface cost for almost no hiding.

The rule: a module earns its keep by hiding more than it exposes. If it doesn't hide much, it's noise — the caller would be better off reading the body inline.

Try it · Module shape visualizer

Example setting: interface width 120 px, implementation depth 200 px, ratio (depth ÷ width) 1.67, which the visualizer labels DEEP.

Deep means the implementation rectangle dwarfs the interface rectangle. The hidden complexity earns the abstraction. As you slide the implementation thin, the module collapses into noise.

3 · The "cover the body" test

A useful diagnostic: cover the implementation with your hand and read only the interface. Can you predict what it does and what it costs? If yes, the module is deep — the interface communicates the contract. If you have to peek inside to know what's happening, the abstraction is leaky and the module is shallow.

"But if the implementation is deep, how can I know from the interface what's inside?"

This is a fair pushback, and it's worth untangling. There are two different things "predict" could mean:

Predict the implementation — knowing the line-by-line code inside.
Predict the contract — knowing what the function does and roughly what it costs.

A deep module hides #1 but exposes #2. From read(fd, buf, n) you cannot predict the page cache, the device drivers, the kernel data structures — you shouldn't be able to. But you can predict: "given a file descriptor and a buffer of size n, this fills the buffer with up to n bytes from the file." That's the contract. The signature gave you a short, useful summary that's smaller than the implementation.

A shallow module fails differently. With buildHeaders(userId, env, contentType, accept, traceId) you also can't fully predict the implementation from the signature — but here the contract is barely smaller than the implementation. Five inputs, five outputs, one-to-one. The "summary" is the same size as the body. The abstraction didn't compress anything.

The corrected test: can you state the contract in fewer words than it would take to read the body? If yes, deep — the abstraction is doing compression work. If no, shallow — the wrapper is just renaming the body.

Try it on three real signatures:

Try it · The cover-body test

async function createPaymentProcess({ sum, payerInfo, description, notifyUrl, ... }): Promise<ProviderResponse>

A payment-provider integration wrapper. Predict: what's hidden inside? Is this deep or shallow?

// ~190 lines hidden behind 4 exports:
//
//  - PHP-bracket form encoding (the provider accepts form-data, not JSON)
//  - The three-token confusion (processId vs processToken vs authCode,
//    documented with the exact error you'd see if you got it wrong)
//  - Env-based base URL switching (sandbox vs prod)
//  - Non-JSON response handling
//  - Forged-payload protection (NaN > x is always false)
//  - HTTP status surfacing without leaking back into echoed bodies

DEEP ★★★★★ interface = 4 exports; implementation = ~190 lines hiding vendor quirks no caller should ever see
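
The "NaN > x is always false" bullet deserves a concrete look. The wrapper's actual code is not shown, so the following is only a sketch of one plausible reading: a guard written as "reject when the comparison is true" lets a non-numeric sum through, because every comparison with NaN is false. The payload shape and both function names are assumptions.

```typescript
// Hypothetical notification payload: the provider echoes the paid sum as a string.
interface PaymentNotification {
  sum: string;
}

// Naive guard: rejects only when "paid < expected" is true.
// Number("abc") is NaN and NaN < expected is false, so a forged sum is accepted.
function isPaidNaive(n: PaymentNotification, expected: number): boolean {
  const paid = Number(n.sum);
  if (paid < expected) return false;
  return true;
}

// Safer guard: accepts only when the value is a finite number
// and the comparison is positively true.
function isPaid(n: PaymentNotification, expected: number): boolean {
  const paid = Number(n.sum);
  return Number.isFinite(paid) && paid >= expected;
}
```

The point for module depth is that the caller never sees this guard at all; it lives behind the wrapper's exports.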

function buildHeaders(userId, env, contentType, accept, traceId) { ... }

A hypothetical helper. What's hidden? Is the abstraction earning its place?

function buildHeaders(userId, env, contentType, accept, traceId) {
  return {
    "X-User-Id": userId,
    "X-Env": env,
    "Content-Type": contentType,
    "Accept": accept,
    "X-Trace-Id": traceId,
  };
}

SHALLOW ★ 5 args in, 5 fields out — the interface is nearly as complex as the body. Caller would be better off building the headers inline.

async function* streamAnswer(sources, question, options): AsyncGenerator<AnswerEvent>

An LLM streaming-answer generator. Predict the contract.

// Hidden behind one async generator:
//
//  - Picks the correct system prompt by `kind` (article/podcast/video/transcript/...)
//    using a Record so adding a new kind becomes a compile error, not a silent default
//  - Yields a "chunks" event with citation metadata (slug + title for cross-resource cases)
//  - Yields a "variant" event so the caller can persist which prompt version actually ran
//  - Streams "token" events as the LLM emits chunks
//  - Yields a final "done" event with token-usage totals (nullable when unreported)
//  - Wraps the provider call with retry/backoff on 429s and 5xxs
//  - Defensive precondition: sources must be non-empty (silent empty would hallucinate)

DEEP ★★★★ 3 args in (one optional), one async generator out. Hides prompt selection, retry, streaming protocol, defensive checks. Same shape works for many content types.
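
Two of the hidden details above are type-level techniques that fit in a few lines. This is a sketch, not the real module: the list of kinds is cut to four, the prompt strings are placeholders, and every field name on the events is an assumption.

```typescript
// Illustrative subset of the content kinds.
type AnswerKind = "article" | "podcast" | "video" | "transcript";

// Record<AnswerKind, string> must have a key for every member of the union.
// Adding a fifth kind to AnswerKind without adding a prompt here fails to compile.
const SYSTEM_PROMPTS: Record<AnswerKind, string> = {
  article: "Answer from the article excerpts provided.",
  podcast: "Answer from the podcast transcript provided.",
  video: "Answer from the video transcript provided.",
  transcript: "Answer from the transcript provided.",
};

function pickSystemPrompt(kind: AnswerKind): string {
  return SYSTEM_PROMPTS[kind];
}

// The four event shapes the generator is described as yielding.
type AnswerEvent =
  | { type: "chunks"; citations: { slug: string; title: string }[] }
  | { type: "variant"; promptVariant: string }
  | { type: "token"; text: string }
  | { type: "done"; usage: { inputTokens: number; outputTokens: number } | null };

// A consumer narrows on "type" and only touches fields that exist on that variant.
function collectAnswerText(events: AnswerEvent[]): string {
  let text = "";
  for (const event of events) {
    if (event.type === "token") text += event.text;
  }
  return text;
}
```

The caller's whole view of the module is the event union; prompt selection stays inside.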

What this exercise reveals

Shallow modules feel productive — you're "decomposing" — but they often just spread complexity across more files instead of removing it. The dependency graph gets wider, the cognitive load goes up, nothing is actually encapsulated. The cover-body test catches this: if you have to peek to predict, the abstraction isn't doing real work.

4 · Why the same advice transfers to AI — for different reasons

Ousterhout was writing for humans reading code. His argument was about working memory and cognitive load: a deep module lets a reader page out the implementation and just reason about the interface. That argument still applies — but the AI era adds a separate, sharper one.

Humans and language models have inverted cognitive constraints:

Humans

Limited working memory (Miller's 7 ± 2)
Persistent expertise across days, months, years
Pattern-recognize from experience
Get tired; deep modules let you page out

Ousterhout's argument: narrow interface = small surface to hold in mind

Language models

Effectively unlimited working memory inside one context
Zero persistent memory across sessions
Pattern-recognize from training data + context
Don't get tired; do generate plausible nonsense at large surface

AI-era argument: narrow interface = small token surface for misuse + bounded blast radius when the agent is wrong inside

Pocock's framing — "go back to old books" — is correct in outcome but a little misleading in cause. The reasons fundamentals transfer to AI aren't nostalgic. They're different physics, same shape:

Wide interface ⇒ more bytes for the same operation ⇒ token cost
Wide interface ⇒ more public API surface ⇒ more chances for the agent to misuse it
Narrow boundary ⇒ cheap to test from outside ⇒ tight feedback loop
Narrow boundary ⇒ bounded blast radius when the agent is wrong inside

5 · Pocock's claim, made concrete

Pocock says: "AI is really good at creating codebases like this" — the shallow shape. Why?

Each turn the agent sees only so much; safer to make small isolated functions
Fewer interface decisions to commit to
Pattern-matched to "good code = small functions" from training data
It's the path of least resistance when you don't have a holistic view

The compounding bit is what hurts: AI both produces shallow structure and fails to navigate shallow structure. Once a codebase tilts shallow, every new feature widens the dependency graph by another fan, and the next agent run gets lost in it.

A worked example

Imagine a content platform with several resource types — articles, podcasts, videos, a cross-resource "library" search. Each type exposes a streaming Q&A endpoint so a user can ask questions about that resource. The routes grew separately, each is ~350 lines, and they look like this:

app/api/articles/ask/route.ts   ~305 lines
app/api/podcasts/ask/route.ts   ~352 lines
app/api/videos/ask/route.ts     ~345 lines
app/api/library/ask/route.ts    ~368 lines
                                ─────────
                                ~1,370 lines

LOC = lines of code. So "352 LOC" just means a 352-line file. You'll see the term in the chart and the before/after tree below.

Two of those (podcasts vs videos) are ~95% identical. Let's prove it.

6 · Side-by-side: two near-identical routes

The diff between the two routes is so narrow it's almost embarrassing. Read the two abridged listings side by side: the vast majority of lines are identical, and what genuinely differs is a handful of names.

Try it · Highlight identical or different

app/api/podcasts/ask/route.ts

// POST /api/podcasts/ask — per-episode chat or corpus-wide
import { podcastQueries } from "@/db/schema";
import { requireAuth } from "@/lib/auth/dal";
import { fetchAgentQuota, incrementAgentUsage } from "@/lib/agent-quota";
import { loadPodcastSources } from "@/lib/qa/sources";
import { routeQuestion } from "@/lib/qa/router";
import { streamAnswer } from "@/lib/qa/answer";
const PROMPT_VARIANT = PROMPTS.podcastAnswer.default;
const PODCASTS_INDEX_KEY = "podcasts";
const Body = z.object({ ... });
async function resolveConversationId(...) { ... }
  const session = await requireAuth();
  const parsed = Body.safeParse(body);
  // quota check (identical)
  // conversation resolve (identical except table)
  // history load (identical)
  // SSE stream setup (identical)
  // abort handling (identical)
  // persist-before-close (identical)
  // error mapping (identical)
  await loadPodcastSources(selectedContentIds);
  for await (const event of streamAnswer(s, q, { kind: "podcast" }))
  return new Response(stream, { headers: { ... } });

app/api/videos/ask/route.ts

// POST /api/videos/ask — per-video chat or corpus-wide
import { videoQueries } from "@/db/schema";
import { requireAuth } from "@/lib/auth/dal";
import { fetchAgentQuota, incrementAgentUsage } from "@/lib/agent-quota";
import { loadVideoSources } from "@/lib/qa/sources";
import { routeQuestion } from "@/lib/qa/router";
import { streamAnswer } from "@/lib/qa/answer";
const PROMPT_VARIANT = PROMPTS.videoAnswer.default;
const VIDEOS_INDEX_KEY = "videos";
const Body = z.object({ ... });
async function resolveConversationId(...) { ... }
  const session = await requireAuth();
  const parsed = Body.safeParse(body);
  // quota check (identical)
  // conversation resolve (identical except table)
  // history load (identical)
  // SSE stream setup (identical)
  // abort handling (identical)
  // persist-before-close (identical)
  // error mapping (identical)
  await loadVideoSources(selectedContentIds);
  for await (const event of streamAnswer(s, q, { kind: "video" }))
  return new Response(stream, { headers: { ... } });

What actually differs: one schema import (podcastQueries vs videoQueries), one sources function (loadPodcastSources vs loadVideoSources), one prompt variant + one index key, and the strings "podcast" vs "video" in log prefixes and the kind arg to streamAnswer. Everything else — auth, quota, conversation resolve, SSE plumbing, abort handling, persist-before-close, error mapping — is byte-for-byte the same.

The compounding tax

Adding a fifth content type by copy-paste would mean a fifth ~350-line file that differs from the others by ~20 lines. A sixth makes it six. Each new type is +350 lines of accidental duplication and another place to fix the next abort/quota/persistence bug. This is the shallow shape compounding.

7 · The lower layers were already deep

Here's what makes this kind of case interesting: the building blocks for a unified shell often already exist. In our example, the Q&A engine in lib/qa/ was already parameterized by content type — router, history loader, answer streamer, index loader all took an indexKey or a kind or a table as input.

// Already deep — content-type-agnostic, called the same way by every route:

routeQuestion(indexKey, question, { promptName })   // returns slugs
loadRoutingIndex(indexKey)                          // returns Index
loadThreadHistory(conversationId, limit, table)     // returns ThreadHistory
streamAnswer(sources, question, { kind, ... })      // yields events

The deep-module engine was right there. Each lower-layer function had a narrow interface and fat insides. So why was the orchestration on top still copy-pasted?

The diagnosis

The route handler was the leaky orchestration layer above otherwise-clean abstractions. Every route reached into the same routeQuestion, streamAnswer, loadThreadHistory — and repeated the same auth, quota, conversation, SSE, persist scaffolding around them. The scaffolding was the shallow part.

The fix isn't to redesign the lower layers. It's to extract the route shell as another deep module above them — narrow surface (the per-type config), fat insides (the SSE/auth/quota/persist plumbing).

8 · Deep modules at the schema level

The single most important insight in this kind of refactor often isn't in your application code at all — it's in the database. A well-shaped schema can encode the deep-module move at the data layer, and that's what makes the application-level collapse possible:

The layers, top to bottom, and what is discriminated at each:

content_items

parent table — discriminator lives here

Holds every kind of content: podcasts, videos, articles, interviews, and more. The crucial column:

contentType: text("content_type").notNull()
  // values: "podcast" | "video" | "article" | "interview" | ...

Where the type lives. Every row knows which kind of content it is by this column. Anything that needs to scope by type filters here (e.g. the routing index for podcasts is built from WHERE contentType = 'podcast').

↓ joined by content_id

content_transcripts

type-blind by design

One row per transcribed content item. Crucially: no contentType column. The medium has already been resolved — by the time something is in this table, it's text.

content_id  → content_items.id
text        — the full transcript
language    — locale tag
modelVersion, transcribedAt, costUsd, ...

Audio (podcasts), video, live recordings — they all converge on the same row shape because the medium-specific work (speech-to-text for audio, captions for video, etc.) happens upstream in the ingestion pipeline, not in the Q&A loader.

↓ read by

loadTranscriptSources(contentIds, kindLabel)

one loader for all transcribed types

Because content_transcripts is medium-blind, the loader is too. Pre-refactor we had two near-identical loaders (loadPodcastSources and loadVideoSources) — same query, different error string. They collapsed into one because the data shape was the same.

// One function. Used by podcast routes, video routes,
// and every future transcribed-content route for free.
loadTranscriptSources(contentIds, kindLabel) → AnswerSource[]

Future content types that ingest into content_transcripts inherit this loader for free.
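
A minimal sketch of what a medium-blind loader can look like, assuming an in-memory array in place of the content_transcripts table. The AnswerSource fields and the error wording are assumptions, and the real loader would presumably be async and query the database. Note that kindLabel only changes the error string, never the query.

```typescript
// Row shape from the schema description: no contentType column.
interface TranscriptRow {
  contentId: string;
  text: string;
  language: string;
}

// Hypothetical shape of what the answer streamer consumes.
interface AnswerSource {
  contentId: string;
  text: string;
}

// In-memory stand-in for the content_transcripts table.
const contentTranscripts: TranscriptRow[] = [
  { contentId: "ep-1", text: "Podcast transcript text", language: "en" },
  { contentId: "vid-1", text: "Video transcript text", language: "en" },
];

// One loader for every transcribed type. Same lookup for all media;
// kindLabel is used only to make the error message readable.
function loadTranscriptSources(contentIds: string[], kindLabel: string): AnswerSource[] {
  return contentIds.map((id) => {
    const row = contentTranscripts.find((r) => r.contentId === id);
    if (!row) throw new Error(`${kindLabel} transcript not found: ${id}`);
    return { contentId: row.contentId, text: row.text };
  });
}
```

Calling loadTranscriptSources(["ep-1"], "Podcast") and loadTranscriptSources(["vid-1"], "Video") runs the same code path, which is the whole reason the two old loaders could merge.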

↓ called by

/api/podcasts/ask · /api/videos/ask · /api/live/ask

URL is the discriminator at the boundary

▾

Each route knows its content type because it's at a type-specific URL. /api/podcasts/ask means "user is asking about podcasts." That's the boundary where the type gets pinned.

The route doesn't have to filter by type when loading sources — the IDs it passes to loadTranscriptSources are already scoped to its type by upstream business logic. Type is a routing concern, not a loader concern.

The decision rule the schema encodes

Share when the shape is the same; separate when the shape legitimately differs. Transcripts: same shape across all media → one table, one loader. Query logs (podcast_queries, video_queries, live_queries): different fk targets and per-type analytics needs → separate tables.

Audio vs video is different in real life. Their transcripts are not. The schema gets to make that distinction.

9 · "Should I collapse these two functions?"

Walk through the questions. The interactive tree below ends at collapse or keep separate based on real shape, not real-world labels.

Try it · Collapse decision tree

Do the two functions read from the same table / source?

Same shape from the same source? → collapse
Different storage (e.g. structured documents in object storage with custom markers, vs rows in a relational DB) → keep separate
Different downstream consumers with different analytics needs (per-type query logs) → keep separate
"Are they the same kind of thing in the real world?" is the wrong question. Audio vs video is different in real life; their transcripts are not.

The principle

Ousterhout's deep-module test isn't about the metaphysics of the data. It's about the interface to the data. Two functions that read the same shape from the same source should collapse, regardless of what their inputs represent in the world. Two functions that look superficially similar but read different shapes from different sources should stay separate, regardless of whether their inputs are "the same thing."
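
The decision tree can be written down as a tiny function. This is one reading of the questions above, with hypothetical field names. What matters is which inputs it takes: source, shape, and downstream needs. There is deliberately no field for "same kind of thing in the real world".

```typescript
// The only facts the decision depends on.
interface CollapseQuestion {
  sameSource: boolean;          // same table or storage backend
  sameShape: boolean;           // same row or document shape
  sameDownstreamNeeds: boolean; // same consumers and analytics needs
}

type Verdict = "collapse" | "keep separate";

// Collapse only when all three answers are yes; any "no" keeps the functions apart.
function shouldCollapse(q: CollapseQuestion): Verdict {
  return q.sameSource && q.sameShape && q.sameDownstreamNeeds
    ? "collapse"
    : "keep separate";
}

// Podcast and video loaders: same table, same row shape, same consumer.
const transcriptLoaders = shouldCollapse({
  sameSource: true,
  sameShape: true,
  sameDownstreamNeeds: true,
});

// Per-type query logs: similar columns, but different analytics needs.
const queryLogs = shouldCollapse({
  sameSource: false,
  sameShape: true,
  sameDownstreamNeeds: false,
});
```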

10 · The redesign

Here's the new shape. The old four-route layout vs the new shell + thin callers.

Before — shallow orchestration

app/api/articles/ask/route.ts             ~305 LOC
app/api/podcasts/ask/route.ts             ~352 LOC
app/api/videos/ask/route.ts               ~345 LOC
app/api/library/ask/route.ts              ~368 LOC
app/api/live/ask/route.ts (planned)       ~350 LOC
                                          ──────────
Total (5 types)                           ~1,720 LOC

Each new content type adds another ~350 lines of mostly-duplicated scaffolding.

After — deep shell + thin callers

lib/qa/run-ask-stream.ts                    ~360 LOC
app/api/articles/ask/route.ts (unchanged)   ~305 LOC
app/api/podcasts/ask/route.ts (unchanged)   ~352 LOC
app/api/videos/ask/route.ts (unchanged)     ~345 LOC
app/api/library/ask/route.ts (unchanged)    ~368 LOC
app/api/live/ask/route.ts                   ~80 LOC
                                            ──────────
Total (5 types, Path A)                     ~1,810 LOC

Path A: the new type ships at 80 LOC, the shell is paid once. Old routes migrate later, and each migration removes roughly 270 lines (a ~350-line route shrinks to an ~80-line caller).

What the shell owns

// lib/qa/run-ask-stream.ts (~360 lines)
//
// Owns:
//   • Auth (requireAuth)
//   • Quota (fetchAgentQuota, incrementAgentUsage)
//   • Conversation resolve/create (with ownership check)
//   • History load (loadThreadHistory(conversationId, limit, table))
//   • SSE stream lifecycle (encoder, controller, closeSafely)
//   • Abort handling (req.signal.aborted)
//   • Persist-before-close discipline (the row hits the DB
//     even when the client disconnects mid-stream)
//   • Error mapping
//
// Takes from the caller (the per-type config):
//   • table          — which queries table to log to
//   • plan(question, history) → { sources, selectedContentIds, routerUsage }
//   • resolveScopePin(conversationId, contentId) — per-type pin rule
//   • logPrefix      — for log prefixes (e.g. "live/ask")
//   • answerKind     — forwarded to streamAnswer
//   • defaultPromptVariant

What a thin caller looks like

// app/api/live/ask/route.ts — 80 lines including imports + zod body
export async function POST(req: Request) {
  const session = await requireAuth();
  const parsed = Body.safeParse(await req.json());
  if (!parsed.success) return Response.json({ error: "invalid_body" }, { status: 400 });
  const { contentId, question, conversationId } = parsed.data;

  return runAskStream(req, {
    session: { userId: session.userId, role: session.role },
    question,
    contentId,
    providedConversationId: conversationId,
    table: liveQueries,
    logPrefix: "live/ask",
    answerKind: "live",
    defaultPromptVariant: PROMPTS.liveAnswer.default,
    async plan() {
      const sources = await loadTranscriptSources([contentId], "Live session");
      return { sources, selectedContentIds: [contentId],
               routerUsage: { inputTokens: null, outputTokens: null } };
    },
    async resolveScopePin(conversationId, requestedContentId) {
      // first prior turn fixes the session id; subsequent turns must match
      ...
    },
  });
}

11 · How LOC grows with content types

The shallow approach scales linearly: each new content type costs another ~350 lines. The deep approach pays a one-time shell cost and then ~80 lines per type, so the gap widens with every type added.

Try it · LOC growth as content types accumulate

Content types (N)      5
Shallow (350 × N)      1,750 LOC
Deep (360 + 80 × N)    760 LOC

The break-even point

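The deep approach is more expensive only at N = 1: 440 lines against 350, because the shell has a single caller. At N = 2 it is already ahead (520 vs 700), and from there the gap grows by ~270 lines per type. On line count alone you could extract at the second copy. The reason to wait is confidence in the shell's shape: with two copies you can't yet tell which differences are essential. This is why the right time to extract a shell is when you're about to write the third copy: the second was a coincidence; the third is a pattern.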
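
The two cost models quoted in this section (350 × N and 360 + 80 × N) can be tabulated directly. The sketch below uses the article's round numbers as constants; the function names are mine.

```typescript
const ROUTE_LOC = 350;       // one copy-pasted route
const SHELL_LOC = 360;       // one-time cost of the shared shell
const THIN_CALLER_LOC = 80;  // per-type cost once the shell exists

function shallowLoc(n: number): number {
  return ROUTE_LOC * n;
}

function deepLoc(n: number): number {
  return SHELL_LOC + THIN_CALLER_LOC * n;
}

// Solve 350n = 360 + 80n for n: the number of types at which both cost the same.
function breakEvenTypes(): number {
  return SHELL_LOC / (ROUTE_LOC - THIN_CALLER_LOC);
}

for (let n = 1; n <= 5; n++) {
  console.log(`N=${n}  shallow=${shallowLoc(n)}  deep=${deepLoc(n)}`);
}
// N=1  shallow=350  deep=440
// N=2  shallow=700  deep=520
// N=3  shallow=1050  deep=600
// N=4  shallow=1400  deep=680
// N=5  shallow=1750  deep=760
```

On these numbers the curves cross between one and two types (360 / 270 ≈ 1.33), so the deep layout is ahead from the second type onward and gains another 270 lines with each type after that.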

12 · Path A vs Path B vs Path C

Three reasonable answers to "we want to add a new content type and we have a shallow-orchestration problem." Each has trade-offs.

Compare · PR size, risk, scaling, lock-in

Path A · Add new with shell, leave old

incremental

PR size: Small (~400 lines new, ~0 churn in old routes)
Risk: Low — shell is validated by one fresh caller before more callers commit to it
Scaling: Old routes still pay tax until migrated; each migration is mechanical and independent
Lock-in: None — if the shell signature is wrong, you fix it once and only one caller is affected
When: You have an immediate new feature to ship and want to start the cleanup without betting the farm

Path B · Migrate everything at once

big-bang

PR size: Large (~1,500 lines of churn across 5 routes)
Risk: Higher — four old routes commit to a shell that's only been tested by one caller
Scaling: Clean immediately; no temporary inconsistency
Lock-in: If the shell signature has a subtle gap, you discover it across four routes simultaneously
When: You're confident in the shell shape, the team can review a big diff, no shipping pressure

Path C · Registry-based content types

re-architecture

PR size: Very large; touches every layer (DB, lib, API, UI, types)
Risk: Significant — replaces a static union with runtime registration
Scaling: Adding a content type becomes one registry entry — best long-term shape
Lock-in: High — registry pattern is hard to undo; if it doesn't fit a type, you've made things worse
When: You have 5–6 content types proven on the shell and a clear next batch that fits the same shape
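
For a sense of what "replaces a static union with runtime registration" means in Path C, here is a minimal registry sketch. It is not a design from the refactor described here; the class, its methods, and the entry fields are all hypothetical.

```typescript
// Hypothetical per-type entry; a real one would carry loaders, tables, and so on.
interface ContentTypeEntry {
  kind: string;
  indexKey: string;
  promptVariant: string;
}

class ContentTypeRegistry {
  private entries = new Map<string, ContentTypeEntry>();

  // Adding a content type is one call instead of one new route file.
  register(entry: ContentTypeEntry): void {
    if (this.entries.has(entry.kind)) {
      throw new Error(`content type already registered: ${entry.kind}`);
    }
    this.entries.set(entry.kind, entry);
  }

  // The trade-off: an unknown kind is now a runtime error,
  // where a static union would have made it a compile error.
  get(kind: string): ContentTypeEntry {
    const entry = this.entries.get(kind);
    if (!entry) throw new Error(`unknown content type: ${kind}`);
    return entry;
  }

  kinds(): string[] {
    return Array.from(this.entries.keys());
  }
}

const registry = new ContentTypeRegistry();
registry.register({ kind: "podcast", indexKey: "podcasts", promptVariant: "default" });
registry.register({ kind: "video", indexKey: "videos", promptVariant: "default" });
```

The comment on get is the lock-in risk in miniature: the compiler stops checking the list of kinds for you.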

The right answer in most situations is Path A. Two reasons:

The new feature needs to ship. Adding it through the shell gets it out the door and proves the shell shape under one fresh caller.
The shell signature is unproven. Migrating four existing routes into an unproven shell would be the bigger bet. The next migration becomes the second caller — still cheap to course-correct if needed.

13 · What stays separate (and why that's part of the design)

A real deep-module move names what it doesn't unify. The honest version of this refactor explicitly excludes a few things, and each exclusion has a reason:

Loaders for differently-stored content stay separate. Suppose alongside the transcribed-content types there's also a structured documents type, say long-form books stored as markdown in object storage with custom page markers. That loader has to parse markers the transcript loader doesn't need to know about. Different storage, different shape, different parsing → forcing it through the shared loader would mean the loader knows about page markers it has no business knowing about.
Routers with different strategies stay separate. A cross-resource "library" search that has to first pick which book and then which chapter is a two-stage routing problem. A single-resource Q&A endpoint is one-stage. Different routing strategies belong as different functions, not as one "router" with branching internals.
Per-type query log tables stay separate. podcast_queries, video_queries, live_queries — each has the right fk targets and the right per-type analytics needs. The shell's table parameter accepts any of them; the union of column shapes is named explicitly (call it AskQueriesTable) and only includes the ones that genuinely share a column set.
Routes whose insert shape really differs stay outside the shell. If one route logs bookSlug + selectedChapterSlugs and another logs contentId + selectedContentIds, those are different schemas. Forcing them through the shell would require the shell to know about books, which defeats the point of a narrow boundary.

Honest exclusion is part of the design

A deep module that refuses to unify what doesn't fit is more honest than a deep module that swallows everything. The shell's name (runAskStream) and its type (AskQueriesTable) say exactly what it covers — transcribed-content Q&A. The fact that structurally different routes don't fit isn't a failure; it's the boundary the abstraction earns by being narrow.
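
The exclusion can be enforced by the type system rather than by convention. The sketch below models tables as plain objects; the Table helper, the row fields beyond those named above, and logAsk are assumptions. The union lists only the tables that share a column set, so the book route's table cannot be passed by accident.

```typescript
// Column set shared by the transcribed-content query logs.
interface AskQueryRow {
  contentId: string;
  selectedContentIds: string[];
  question: string;
}

// The book route logs different columns, so it is a different row type.
interface BookQueryRow {
  bookSlug: string;
  selectedChapterSlugs: string[];
  question: string;
}

// Minimal stand-in for a table handle.
interface Table<Name extends string, Row> {
  name: Name;
  rows: Row[];
}

const podcastQueries: Table<"podcast_queries", AskQueryRow> = { name: "podcast_queries", rows: [] };
const videoQueries: Table<"video_queries", AskQueryRow> = { name: "video_queries", rows: [] };
const liveQueries: Table<"live_queries", AskQueryRow> = { name: "live_queries", rows: [] };
const bookQueries: Table<"book_queries", BookQueryRow> = { name: "book_queries", rows: [] };

// Named explicitly: only the tables that genuinely share a column set.
type AskQueriesTable = typeof podcastQueries | typeof videoQueries | typeof liveQueries;

// The shell-side insert accepts any member of the union and returns the new row count.
function logAsk(table: AskQueriesTable, row: AskQueryRow): number {
  table.rows.push(row);
  return table.rows.length;
}

logAsk(liveQueries, { contentId: "live-1", selectedContentIds: ["live-1"], question: "What changed?" });
// logAsk(bookQueries, ...) does not compile: "book_queries" is not in the union
// and its rows have bookSlug and selectedChapterSlugs instead of contentId.
```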

14 · The naming move

Folder names are part of the interface — and they're easy to get wrong in a way that compounds. A common pattern: the first surface you build gets a domain-specific folder name (say, lib/articles/), then subsequent surfaces piggyback on the same infrastructure, and the folder name never gets updated. Now the folder says one thing and contains another.

This connects directly to a different Pocock point: ubiquitous language. Names are part of the interface. A wrong name doesn't just look bad — it leaks into how every future contributor and agent thinks about the module. An LLM exploring the codebase will read lib/articles/router.ts and reason about it as "the articles router," not "the generic Q&A router." Every prompt that needs to talk about routing has to first re-establish that the name lies.

lib/articles/   ← misleading (only one of many surfaces uses articles)
  router.ts                (used by ALL Q&A surfaces)
  answer.ts                (used by ALL Q&A surfaces)
  history.ts               (used by ALL Q&A surfaces)
  index-loader.ts          (used by ALL Q&A surfaces)
  sources.ts               (used by ALL Q&A surfaces)

lib/articles/   →    lib/qa/   ← honest

The fix is git mv + a find/replace on imports. Cosmetic at the file level, real at the cognitive level. The trick is sequencing: do the rename after the migrations, not during them.

Do the rename last

If you rename mid-migration, every file you touch shows up in the diff for two reasons: the migration and the import path. That muddies review. Rename after the migrations are done, when no route is still adding imports under the legacy name and the path change is the only thing in the diff.

15 · The lesson, in one breath

Deep modules in the AI era aren't about clever abstraction. They're about drawing the boundary at the place where complexity legitimately differs vs is identical — and being honest about which is which.

The leverage point moves from "writing code" to "drawing the right boundary." That sounds like it should make senior judgment matter less in an AI-assisted workflow. It's the opposite. AI can hand-roll the implementation behind any interface you give it. Choosing the right interface is the part that doesn't automate.

This kind of refactor isn't a clever architectural move. It's three small acts of judgment, in order:

Notice that the routes are 95% identical at the orchestration layer.
Notice that the lower layers were already deep — the engine exists; only the route shell is leaky.
Refuse to unify the routes whose shape really differs, even though the temptation is to pull everything into the shell.

None of those three are mechanical. They're shape-sensing decisions. The mechanical part — writing the 360-line shell, the 80-line new route — is the easy half.

What to take away

If you take only one thing: the right question is never "what kind of thing is this in the world?" It's "what shape is the data, and where does the medium-specific work end?" That question separates the parts that should collapse from the parts that legitimately stay separate. Everything else is consequence.