System Definition Brings Software Engineering to AI Coding

A conceptual companion to Working Code, Wrong Engineering

AI can generate code, but engineered software still requires structure: intent, boundaries, scoped generation, contracts, evaluation harnesses, provenance, and accountable architecture.

AI’s promise is to remove the complexity of development. When AI starts delivering code, that promise is often understood as a way to skip engineering.

The same naive pattern appears elsewhere: asking AI to build the next application from a few vague lines, or expecting an agent with a huge context and a few instructions to manage the real world with clockwork precision.

But AI does not remove the need for structure.

By reducing coding complexity, it makes structure even more important.

In Working Code, Wrong Engineering, I focused on one concrete failure mode: AI-generated code that works locally but violates the system definition through supply-chain drift, SaaS cost drift, license drift, operational burden, or failure-behavior drift.

This article steps back from that specific failure mode and asks the broader question: if AI can generate code, why does software engineering matter more than ever?

The mechanical production of code is becoming cheaper. But cheaper code is not automatically engineered software.

That need for structure keeps appearing across AI systems: context management, human-in-the-loop, security, agentic workflows, provenance, and auditability.

Software development with AI coding is no different.

The Next Rung on the Abstraction Ladder

This is not new.

For decades, programming has added abstraction layers.

Assembly freed programmers from direct machine code but kept them tied to hardware.

C and other high-level languages gave developers portable power and deeper abstraction.
But that new power carried a new kind of debt: performance-sensitive work, hardware access, and system-level control still required engineers to understand, and sometimes cross back into, the lower-level layers that abstraction was meant to hide.

Managed and dynamic languages made development faster and more accessible, while reducing many memory-corruption risks. The debt grew again: performance overhead, weaker hardware intuition, runtime opacity, and a more complex software supply chain built on runtimes, stacks, and large ecosystems.

Distributed systems and cloud frameworks added scale, but also partial failure, observability gaps, and configuration sprawl.

Each layer made creation easier.

Each layer also created new debt.

The pattern is simple: abstraction removes complexity from one place by relocating it to another.

AI pushes the ladder higher again, but with a sharper difference.

Why AI Coding Is Different

This time, the new layer is not only another step away from hardware, memory, runtime, or infrastructure.

It also challenges something previous layers mostly preserved: deterministic specification. In practice, software engineering depends on that specification for predictable behavior and repeatability.

The deeper risk of AI coding is not only probabilistic generation. It is the blurring of the deterministic system contract: the explicit specification of how a component must behave, interact, fail, and remain accountable in the real world.
Software engineering must restore that contract around generated code.

But deterministic specification has two faces.

The first is the deterministic system contract: how a generated component must behave when it interacts with users, services, data, tools, infrastructure, and the real world.

The second is the deterministic production specification: how that component is generated, built, tested, verified, traced, and reproduced.

The first gives the system predictable behavior.

The second gives engineering repeatability.

AI challenges both.

Deterministic System Contract

At first, the visible difference seems to be freedom from procedural definition.

But the problem is not declarative programming.

A system does not need every internal step to be manually specified.

Even in high-level languages, developers still defined how a component should behave when it met the rest of the system. They defined contracts, boundaries, inputs, outputs, permissions, failure modes, and observable behavior.

SQL, Prolog, rule engines, and other declarative systems already showed this. They can move procedural logic into the engine, but they still depend on formal definitions: schemas, facts, rules, constraints, contracts, and control boundaries.

When procedural logic moves into the engine, engineering does not disappear.

It moves into the deterministic definition of how the component interacts with the outside world.

AI repeats part of that lesson, but with a major difference: a probabilistic engine instead of a formal one.

With AI, the temptation is to confuse two very different things: not specifying every internal step, and not specifying the system clearly at all.

When Ambiguity Meets Probabilistic Generation

And that difference is made sharper by the interface itself. Natural language can carry lexical ambiguity, syntactic ambiguity, pragmatic ambiguity, missing context, and unstated intent. When that ambiguity meets probabilistic generation, the result is not only flexible.
It is under-specified unless the surrounding system definition constrains it.

The deeper issue is not only the probabilistic nature of the model. It is ambiguity in the input. Generated code may vary, but the boundaries, contracts, and evidence it must honor cannot be left ambiguous.

Humans deal daily with ambiguity, but deterministic specification was also shaped by implicit human context. Explicit requirements, contracts, and documented behavior were only part of the real engineering act. Developers also carried corporate culture, team habits, tribal knowledge, security expectations, product strategy, architectural preferences, common sense, and small shared nuances into the code.

Many rules were never written down as formal requirements, policies, or contracts. They appeared naturally in the code because humans treated them as obvious.

AI coding breaks that assumption. If those rules are not explicit, the model cannot be expected to preserve them. It may replace local engineering judgment with the most statistically common implementation pattern.

Deterministic Production Specification

Given the same source code, build settings, dependencies, and environment, traditional software is expected to produce the same artifact. There are exceptions, but the engineering model depends on repeatability.

AI introduces something different: probabilistic generation in the act of producing code itself.

The same request can produce different implementations. The same goal can become different control flows.
The same prompt can hide different assumptions.

That does not only affect the visible code.

It affects the production chain behind the code: prompt, context, model version, generation tool, dependency versions, approved repositories, build pipeline, artifact hash, SBOM, tests, vulnerability checks, license checks, and deployment records.

When generation also depends on outside research, live documentation, evolving software stacks, or external APIs, production drift becomes more likely, and repeatability becomes harder over time.

Repeatability is not a whim of developers trying to keep control, nor a technical luxury. It is the basis for traceability, provenance, debugging, testing, security checks, supply-chain control, and accountability.

If AI introduces probabilistic generation into code creation, engineering must reintroduce repeatability at a higher level: through system definition, provenance, evaluation, and controlled generation pipelines.

The Illusion of Missing Complexity

With AI-generated software, the procedure itself starts to blur.

The new layer is often perceived as a way to challenge software engineering itself: describe the outcome, let the model produce the procedure, and move on.

That introduces a human risk too: the illusion that complexity has disappeared.

AI makes code feel easy. A prompt is short. The answer is immediate. A prototype may run. A script may pass. A demo may look impressive.

At small scale, that illusion can work.

But the complexity tax has not vanished. It has moved into integration, architecture, security, maintenance, provenance, evaluation, and accountability.

When the interface feels simple, teams may believe the system is simple.

But the moment the system becomes business-critical, the missing structure returns.

That is exactly when explicit specification, boundaries, contracts, and architecture become more important.

AI is not removing the need for engineering.
It is introducing a new requirement above generated code: system definition.

The human deliverable is moving from writing every line of source code to defining the engineering structure that generated code must fit inside.

Prompting Debt

The popular phrase is tempting:

“Prompting is the new programming language.”

Prompting is powerful, expressive, flexible, and dangerously easy to leave under-defined.

Every abstraction layer creates debt. AI introduces a particularly insidious form:

Prompt debt is not messy text. It is definition debt.

Architectural decisions, security rules, access policies, compliance requirements, business logic, and error-handling strategies can be smuggled into natural language without review, versioning, or governance.

If a prompt says “only use safe customer data,” but “safe” is never defined in policy, schema, permissions, tests, or audit rules, that is not safety.

That is definition debt.

If those decisions are not explicit, tested, owned, and auditable, they become invisible system design.

And invisible system design is still system design.

Just a blurry design, an unpredictable design.

Prompt engineering optimizes interaction with a model.

Software engineering defines the system.

System definition uses prompting as an interface, but it does something deeper.
It provides the files, instructions, schemas, examples, tests, policies, constraints, and context that define the structure in which model output can be trusted, constrained, tested, and owned.

Prompting shapes model output.

System definition casts it.

It gives generated code a mold: a place, a boundary, a test surface, and an owner.

That is the difference between generating code and engineering software.

When prompts start carrying implicit architectural rules or business logic, they are no longer just prompts.

If those decisions remain implicit, untested, unversioned, or unowned, they become prompt debt: skipped engineering and system design disguised as natural language.

If they are made explicit, governed, and tested, they become system definition.

That is how system definition pays down prompt debt.

System definition can drift too. But system definition drift is visible and fixable when the definition is explicit.

When AI output is wrong, the deeper question is not only what code failed.

It is what the system definition failed to specify.

What Is System Definition?

A System Requirements Specification (SRS) describes what a system should do. System definition goes further: it defines the architectural control surface around AI-generated code.

It includes two connected layers. The first is the deterministic system contract: how generated code must behave, interact, fail, and remain accountable in the real world. The second is the production specification: how that code is created, tested, traced, verified, and reproduced.

System definition is the cast around generated code: the mold that defines where generated code belongs, which boundaries it must respect, which contracts it must honor, which constraints it must follow, and which evidence must prove it is valid. Its four core layers are topology, contracts, constraints, and evaluation.

This is not a new UML for systems.
It is closer to an architecture, contract, policy, validation, and provenance layer for generated components.

Why System Definition Matters

A prompt asks AI to produce something. A system definition casts it.

In mature AI-assisted development, generated code should increasingly be treated as an artifact of system definition, not as the only source of truth.

If AI generates code, the engineering question is no longer only “What does this code do?” It becomes: “What produced it, from which context, with which dependencies, under which constraints, and with what audit trail?”

When generated output is wrong, the deeper fix is rarely just patching the code; it usually means correcting the definition, regenerating, and preserving the verifiable link between definition, context, tests, and artifact.

Code may compile and pass functional tests while still violating the system definition: a working solution is not automatically a valid one.

Traditional software engineering depends on tests, CI/CD, dependency management, code review, lockfiles, package manifests, SBOMs, and release controls.

In AI-assisted development, those controls should be complemented by two validation gates: the definition itself should be reviewed before generation, and the generated code should be checked against the system definition afterward.

I explored this practical verification problem in Working Code, Wrong Engineering: Why AI-Generated Code Needs System-Definition Tests.
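
The second gate, checking generated code against the system definition after generation, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from the companion article: the `SystemDefinition` and `GeneratedArtifact` classes, their fields, the `check_against_definition` function, the `fastcsv` package, and the model name are all hypothetical, and a real definition would cover far more than dependencies and licenses.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemDefinition:
    """Hypothetical, heavily simplified system definition for one component."""
    version: str
    allowed_dependencies: frozenset[str]
    forbidden_licenses: frozenset[str]


@dataclass(frozen=True)
class GeneratedArtifact:
    """Hypothetical record of generated code and the context that produced it."""
    source: str
    dependencies: dict[str, str]  # package name -> license identifier
    model_version: str
    prompt: str
    definition_version: str

    def provenance(self) -> dict[str, str]:
        # Hashes tie the artifact to the exact prompt and source text.
        return {
            "model_version": self.model_version,
            "definition_version": self.definition_version,
            "prompt_sha256": hashlib.sha256(self.prompt.encode()).hexdigest(),
            "source_sha256": hashlib.sha256(self.source.encode()).hexdigest(),
        }


def check_against_definition(
    artifact: GeneratedArtifact, definition: SystemDefinition
) -> list[str]:
    """Return every violation found; an empty list means the gate passes."""
    violations = []
    if artifact.definition_version != definition.version:
        violations.append(
            f"artifact built against definition {artifact.definition_version}, "
            f"current is {definition.version}"
        )
    for package, license_id in sorted(artifact.dependencies.items()):
        if package not in definition.allowed_dependencies:
            violations.append(f"dependency not allowed: {package}")
        if license_id in definition.forbidden_licenses:
            violations.append(f"forbidden license {license_id}: {package}")
    return violations


# Example: code that may run fine locally yet violates the definition.
definition = SystemDefinition(
    version="1.2",
    allowed_dependencies=frozenset({"requests"}),
    forbidden_licenses=frozenset({"AGPL-3.0-only"}),
)
artifact = GeneratedArtifact(
    source="import requests\nimport fastcsv\n",
    dependencies={"requests": "Apache-2.0", "fastcsv": "AGPL-3.0-only"},
    model_version="example-model-1",
    prompt="Export refunds to CSV",
    definition_version="1.2",
)
violations = check_against_definition(artifact, definition)
```

Here `violations` lists two problems for the hypothetical `fastcsv` package (an unapproved dependency and a forbidden license), even though the generated source could pass its functional tests. The `provenance()` record is what would let a team answer later which prompt, model, and definition version produced the artifact.
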
The broader point here is simpler: generated code must remain connected to the structure that authorized it.

Scoped Generation Is Engineering

Splitting work into scoped components that AI can safely generate is not prompting.

It is engineering.

Asking AI to “build a refund workflow” is prompting.

Defining the refund workflow as separate parts (eligibility check, payment reversal, customer notification, fraud review, audit log, rollback path, and human escalation) is system definition.

AI can generate each piece, but the engineering work is defining the boundaries, contracts, permissions, failure behavior, and audit trail between them.

The same applies to any generated component. It still requires system-definition work: deciding what AI is allowed to know, change, call, generate, decide, and deploy, and how the component must behave, fail, prove compliance, and remain replaceable.

That is one of the defining engineering skills of the AI era.

When code is easier to generate, structure matters more, not less. Old software engineering patterns are not becoming obsolete; they are being vindicated.

One risk with AI-generated code is that the model may default to statistically common patterns rather than the right ones for your context. Not every common pattern is a good engineering pattern. System definition is where we make those distinctions explicit: which patterns generated components must follow, and which are forbidden.

AI can code components. But generated components become reliable systems only when architecture, integration, security, testing, failure-handling, and governance patterns make those components fit inside larger architectures.

The interface of programming is changing.
The responsibility for the system is not.

Conclusion: The Human Role Moves Upward

Software engineering is not prompt engineering.

The future engineer is not just a prompt writer.

The future engineer is a system definer and boundary designer.

That includes decomposing systems into scoped components, defining contracts, deciding where deterministic controls are required, designing evaluation suites, enforcing security boundaries, preserving audit trails, planning rollback, and assigning accountability.

This is not less engineering.

It is engineering at a higher level of abstraction.

Code will increasingly be generated by AI.

Systems will still need to be engineered by humans.

The developers who thrive in the AI era will not be the best prompt writers. They will be the best engineers of structure: those who draw clear boundaries, enforce contracts, build evaluation harnesses, manage decision debt, preserve provenance, and own the consequences when probabilistic systems meet deterministic reality.

The language has changed. The engineering discipline has not.

It has simply moved to a higher, more critical level.

Working code is not the destination. Governable software is.

Related Reading

Working Code, Wrong Engineering: Why AI-Generated Code Needs System-Definition Tests. A practical companion article that explores real cases of AI-generated code that works locally but introduces system-level risks, including supply-chain drift, SaaS cost drift, license incompatibility, and operational burden. It shows how to implement system-definition tests and verification gates in practice.

Agentic AI Security Needs Filtered IPO. How a filtered input-process-output pattern can reduce AI injection threats.