# Building a Mission-Critical Integration Blind, Part I: XML, Queues, and Guesswork


*Part 1 of 2. Part 2 covers what 100,000 requests revealed about the architecture.*

There's a particular kind of freedom that comes with a blank slate. No legacy code. No decade-old architectural decisions baked into the foundation. No monolith where changing one thing breaks three others. Just a task, a deadline, and full ownership of every decision.

Late November 2025. I was asked to design and build a service connecting three of our company's trading platforms to a national government data exchange infrastructure. The system processes millions of legally significant requests across tax authorities, business registries, and state services — with XML signatures, GOST cryptography, asynchronous message delivery, and strict auditability requirements. These three platforms generate ~90% of the company's revenue. The project was a C-level priority.

Stack: PHP 8.4, Symfony, PostgreSQL, RabbitMQ, Nginx, Kubernetes. Not the most exciting combination — but this was a team product. Building something maintainable by ten engineers matters more than building something impressive in a language nobody else in the company uses.

Three months. Ten people. Production, two weeks ahead of deadline. Here's what happened — including the parts that didn't go well.

## 300 Pages of Government Documentation

The first thing in my inbox wasn't a spec. It was three PDFs — roughly 300 pages of government-style writing. Tables referencing other tables. Self-contradictions. Critical details in footnotes on page 187. The kind of document that could only emerge from a committee under regulatory pressure over many years.

I'd integrated with government systems before and didn't trust documentation implicitly. But I needed to understand it fast: as a lead, I have to allocate resources effectively and hand the team well-defined tasks early enough to avoid downtime.

**LLM Tip: Let it read and research. Don't tell it what to think.**
I uploaded all three documents to Claude and resisted framing questions around what I already suspected. No "I'm thinking of doing X, do the docs support this?" — that gets you confirmation, not analysis. Instead: "You're a software engineer. Summarize what's relevant to building this integration. Then propose an architecture and explain why." Don't push toward the answer you want to hear ("You're absolutely right!"). The LLM won't catch every inconsistency — and it will occasionally state things confidently that turn out to be wrong — but it gets you oriented in an hour instead of a week.

From that first session, I had a working mental model: messages, response formats, edge cases — and, crucially, what was suspiciously absent from the documentation. The absent parts are usually the most important.

I also asked for a first draft of the database schema based on the documentation. I reviewed it, corrected it, and turned it into migrations. Initial DB design: one day. Alignment sessions with the trading platforms: two days. Then we started building.

## The Architecture

The external system's contract: send a signed XML message to a queue, and at some point — minutes, hours, or up to five working days — a response arrives on a separate channel. There is no synchronous answer (later, periodically polled APIs and other sources were added as well). Synchronous thinking was never an option.

Every state transition in the lifecycle of a request becomes a message. The controller just receives an HTTP request, validates it, persists the entity, dispatches an async message, and immediately returns a UUID.
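As a language-neutral sketch of that controller contract — validate, persist, dispatch, return a UUID, no slow work inline — here is a minimal Python model. All names (`RequestEntity`, `FakeRepository`, `FakeBus`, the message shape) are hypothetical illustrations, not the service's real classes:

```python
import uuid
from dataclasses import dataclass

# In-memory stand-ins for the database and the message bus — hypothetical names.
@dataclass
class RequestEntity:
    id: str
    payload: dict
    status: str = "NEW"

class FakeRepository:
    def __init__(self):
        self.rows = {}
    def persist(self, entity: RequestEntity) -> None:
        self.rows[entity.id] = entity

class FakeBus:
    def __init__(self):
        self.messages = []
    def dispatch(self, message: dict) -> None:
        self.messages.append(message)

def handle_create(payload: dict, repo: FakeRepository, bus: FakeBus) -> str:
    """Validate, persist, dispatch async work, return a UUID immediately."""
    if "subject" not in payload:
        raise ValueError("missing required field: subject")
    entity = RequestEntity(id=str(uuid.uuid4()), payload=payload)
    repo.persist(entity)                                   # saved before any slow work
    bus.dispatch({"type": "GenerateXml", "request_id": entity.id})
    return entity.id                                       # caller polls later by this UUID
```

The point of the shape is that the HTTP response time is decoupled from XML generation, signing, and delivery: everything slow rides on the dispatched message.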
Everything after that happens in workers.

```
HTTP Request → Controller → UUID returned immediately
        ↓
[Async message dispatched]
        ↓
XML Generation → Signing → Send to External Queue
        ↓
[Wait for async response]
        ↓
Response Listener → Parse → Process Documents
        ↓
RabbitMQ → Trading Platform
```

The handler is an orchestrator — it determines which strategies apply to the current request state and delegates. Strategies register via DI container tags. This keeps bundles isolated and extensible: adding a new data source means implementing the right interfaces and registering the bundle. The core handler doesn't change. Responses from external systems are handled with the Adapter pattern — a command or listener catches the event, parses it, and pushes the formatted message back to the main queue for the request's FSM.

This paid off immediately. After launch, adding two new data source types took one day each, including tests.

The codebase is split into bundles with explicit contract boundaries: ContractsBundle (shared interfaces, no concrete implementations), RequestBundle (lifecycle and orchestration), ArtemisBundle (Artemis/STOMP integration), AdapterBundle (REST adapter), DocumentBundle (file streaming), and a few others.

**LLM Tip: Generate a dependency graph as a Makefile command.**

As the codebase grew, keeping bundle dependencies clean became harder to track manually. I asked Claude to generate a Makefile command that parses `use` statements across all PHP files and produces a readable dependency list per bundle. Sample prompt:

> Write a command for the Makefile to list the dependencies between PHP bundles. The bundles are located in the src directory — search for `use` at the beginning of *.php files, filter out only those that start with our project's prefix, and group the results for each bundle. Display dependencies on contracts located in src/ContractsBundle separately. Exclude self-references.
> As a result, I want to see:
>
> ```
> ABundle -> Contracts: B,C,D Bundles: B
> BBundle -> Contracts: C Bundles: -
> ```

We ran it periodically, spotted unexpected cross-bundle dependencies, and eliminated them. A surprisingly effective way to keep the architecture honest as the team grows.

## Why Doctrine Transport

For internal queuing, I chose Symfony Messenger with PostgreSQL (Doctrine transport) over Redis or RabbitMQ.

Requests are the core entity — losing one is unacceptable. Doctrine transport gives transactional guarantees out of the box: a message is only consumed when the database transaction commits. Messages live in the database, visible via standard SQL. And it's one less infrastructure component to operate in Kubernetes.

The theoretical downside is the throughput ceiling. In practice, 100,000 requests showed no meaningful lock contention. More on that in Part 2.

## Database: What I Got Wrong First

**Lookup tables.** At first, I wasn't sure how much runtime flexibility these reference tables would need, so I over-normalized. The initial schema had statuses, types, and formats as separate tables with integer foreign keys — a sensible default under uncertainty. The problem: every status transition required a lookup, and every request creation fetched multiple reference rows.

Once the domain was stable, we migrated to PHP enum column mapping:

```php
#[ORM\Column(enumType: RequestStatusEnum::class)]
private RequestStatusEnum $status;
```

This eliminated dozens of queries per request and removed entire repository classes. If starting again: enums from day one, lookup tables only when runtime modification is genuinely needed.

**Payload separation.** XML bodies and response payloads can be several megabytes. We moved them to a separate table so queries on the main requests table don't drag payloads through memory. PostgreSQL's TOAST mechanism handles large values transparently, so the performance difference was smaller than expected — but the architectural clarity is worth it.

**Audit logging via PostgreSQL trigger.**
Rather than scattering logging calls through application code, I put a trigger on the requests table:

```sql
CREATE FUNCTION log_request_changes() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        -- log creation with parameters
    ELSIF OLD.status_code IS DISTINCT FROM NEW.status_code THEN
        -- log each status transition with relevant metadata
    ELSIF OLD.xml IS NULL AND NEW.xml IS NOT NULL THEN
        -- log XML generation
    ELSIF OLD.is_signed = false AND NEW.is_signed = true THEN
        -- log signing
    END IF;
    INSERT INTO request_logs (...) VALUES (...);
    RETURN NEW;
END;
$$;
```

The good: no logging scattered through application code, faster than ORM-level logging, and it captures manual DB changes. Invaluable for production debugging.

The bad: team members were not comfortable with PostgreSQL functions. During significant table restructuring, I ended up writing the trigger migrations myself. A conscious tradeoff — but worth knowing upfront.

## No Access. Build Your Own.

One month in, we still had no access to the external message broker. When access eventually arrived, the dev environment quota was 15 requests per day.

Fifteen. Per day. For an integration project.

There was no point arguing — bureaucratic systems move on bureaucratic timelines. So I built what we needed.

### The Mock Server

Apache Artemis uses STOMP — a text-based protocol over TCP. I asked Claude to scaffold a complete bash script that creates a Docker-based mock: directory structure, Docker Compose config, Python STOMP server, fixture XML directory. The prompt included the relevant STOMP spec sections and a description of the expected behavior.

First working version: approximately two hours.

The mock handles STOMP frames, routes messages between queues, simulates configurable delays, and matches incoming requests to fixture XML files.
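For a sense of how little protocol machinery that takes: a STOMP frame is just a command line, header lines, a blank line, a body, and a NUL terminator. A deliberately simplified parser (not the mock's actual code — it assumes one complete frame, UTF-8 headers, and no header-value escaping) looks like this:

```python
def parse_stomp_frame(raw: bytes):
    """Split one STOMP frame into (command, headers, body).

    Simplified sketch: single complete frame, no header escaping.
    """
    frame = raw.rstrip(b"\x00")               # frames end with a NUL byte
    head, _, body = frame.partition(b"\n\n")  # blank line separates headers from body
    lines = head.decode("utf-8").split("\n")
    command, header_lines = lines[0], lines[1:]
    headers = {}
    for line in header_lines:
        if not line:
            continue
        key, _, value = line.partition(":")
        headers.setdefault(key, value)        # per STOMP, first occurrence wins
    return command, headers, body

raw = b"SEND\ndestination:/queue/requests\ncontent-type:text/xml\n\n<request/>\x00"
cmd, hdrs, body = parse_stomp_frame(raw)
```

Routing, delays, and fixture matching sit on top of a loop that reads frames like this off a TCP socket.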
For matching, it strips the dynamic parts of incoming XML (signatures, timestamps, generated IDs) and finds the closest fixture by Levenshtein distance.

The whole team developed against realistic scenarios — successful responses, errors, timeouts, and file attachments — without touching the real system. CI ran integration tests against the mock.

When we finally connected to the real Artemis instance, the integration worked on the first try. Almost.

The real system included an XML namespace declaration in its responses that our mock didn't produce. One namespace. Not in the documentation. The symptom: signature validation failing on perfectly valid-looking XML, with nothing in the error pointing to the cause.

We eventually diffed the raw XML byte by byte. Not logically. Not structurally. Byte by byte. The difference was one extra attribute in the root element — a namespace declaration no one had documented and our mock didn't reproduce.

Twenty minutes to fix once found. Two hours to find. That's the kind of thing government documentation doesn't mention. And won't.

The mock server is on GitHub.

## The XAdES GOST Signer

The external system requires XML messages signed with GOST cryptography in XAdES format. PHP has no native GOST support; neither does OpenSSL. Our internal signing service was in parallel development — not yet available.

I spent roughly a week building a Docker container that could sign XML locally. XAdES documentation is scattered, and the GOST+XAdES combination has almost no practical writing anywhere. The learning curve was steep.

**LLM Tip: Incremental exploration of poorly documented territory.**

"Implement XAdES GOST signing" produces plausible-looking code that doesn't work. What works: "Write a bash script that …", "Here's the current canonicalized XML. Here's the error from signature verification. Here are the three canonicalization variants I've tried. What else could cause this specific mismatch?" Step by step. Verify each step before moving forward.
This approach compressed roughly three weeks of documentation archaeology into one week of incremental debugging.

The result: a Docker container exposing an HTTP endpoint that mirrors the production signing service API. The local signer and the production signer implement the same interface; switching between them is one environment variable. This made it possible to continue development without interruption despite the time constraints.

The hardest part wasn't cryptography — it was canonicalization, and assembling all the requirements so that they work together. XAdES signs a canonical form of specific XML elements. A single extra whitespace, wrong attribute order, or missing namespace declaration silently produces an invalid signature. I spent more time on canonicalization edge cases than on the signing logic itself. The spec says "use canonical XML." What it doesn't say: which of the four canonicalization variants the receiving system expects — and that this is configuration, not discovery.

## Keeping the Team Aligned

One thing I'd do differently: write better task descriptions. Under deadline pressure, tickets were sometimes a sentence or two. What compensated: I called developers before they started, walked through the full context, listened to their questions, and adjusted the approach based on their feedback.

More importantly, everyone understood the whole system — the state machine, the bundle boundaries, and why each architectural decision was made. When a developer understands the larger picture, they make better local decisions and surface better ideas. Several useful refinements came from those conversations, not from me.

## The Result

Shipped two weeks ahead of the hard deadline.

The time bought: two new data source types (one day each, including tests — a direct result of the flexible, extensible architecture), Swagger documentation, performance profiling, and refactoring.
The service shipped with full documentation: an architecture overview, bundle-level READMEs, and architecture decision records capturing why each decision was made. That last part is what future maintainers actually need. It's also the hardest to write when you're deep in implementation — so we wrote it as we went, not at the end.

The most important lesson from the first three months wasn't about architecture or technology. It was this: when you can't access the real system, build a good enough model of it. Not a perfect simulation — a model that forces you to make explicit decisions about every scenario. The real system will surprise you. But it will surprise you with details, not with fundamentals. And details are fixable in twenty minutes.
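To make "good enough model" concrete, here is the fixture-matching idea from the mock server in miniature: strip the dynamic parts, then pick the nearest fixture by edit distance. The regexes and the naive O(n·m) Levenshtein below are illustrative, not the production code:

```python
import re

# Hypothetical patterns for the dynamic parts — the real mock's list differed.
DYNAMIC = [
    re.compile(r"<ds:Signature\b.*?</ds:Signature>", re.S),  # XML signature block
    re.compile(r'timestamp="[^"]*"'),                        # request timestamps
    re.compile(r'id="[^"]*"'),                               # generated IDs
]

def normalize(xml: str) -> str:
    """Strip signatures, timestamps, and generated IDs before comparison."""
    for pattern in DYNAMIC:
        xml = pattern.sub("", xml)
    return "".join(xml.split())  # collapse whitespace too

def levenshtein(a: str, b: str) -> int:
    """Textbook dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def closest_fixture(incoming: str, fixtures: dict[str, str]) -> str:
    """Return the name of the fixture whose normalized XML is nearest."""
    norm = normalize(incoming)
    return min(fixtures, key=lambda name: levenshtein(norm, normalize(fixtures[name])))
```

Crude — and that crudeness is the point: it forced every test scenario to exist as an explicit fixture file, which is exactly the kind of modeling the lesson above is about.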