Event-driven systems are powerful, but they can also be fragile.
A small schema change in one service can silently break others down the line: the classic breaking change problem.
While no approach can completely eliminate breaking changes, contracts and validation can drastically reduce the risk and help you catch issues before they reach production. If you’ve worked with event-driven systems long enough, you’ve probably seen this firsthand.
What makes this so tricky isn’t just that it happens; it’s that it’s often hard to detect until something fails in production.
A renamed field or missing property can lead to failed parsing, corrupted data, or silent business logic errors.
That’s why schema validation, contract testing, and automated checks in the pipeline matter. They provide a safety net that helps catch breaking changes before customers feel the impact.
How To Prevent Downstream Errors
There are many approaches. Here are two that work well in AWS-based event systems:
1. Use AWS EventBridge Schema Registry
EventBridge includes a schema registry that can auto-discover and store schemas (note: there are additional costs).
You can also build your own registry or keep schemas in version control.
A registry alone isn’t enough.
You also need versioning and notification: publishers must announce new versions and consumers need time to adapt.
Without a clear process, consumers risk breaking silently.
2. Version Your Contracts
If you introduce a breaking change, don’t overwrite the existing event.
Create a new version and let both exist in parallel for a while.
Communicate the change, give consumers time to migrate, and then decommission the old version.
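One way to let both versions exist in parallel is for the producer to build the new version and derive the old shape from it until a sunset date has passed. The following is only a sketch under assumptions: the `OrderCreatedV1` and `OrderCreatedV2` shapes, the `toV1` helper, and the `eventsToPublish` function are hypothetical and not part of this article's contracts.

```typescript
// Hypothetical event shapes: v2 splits `customerName` into two fields,
// which would break v1 consumers if it simply replaced the old event.
interface OrderCreatedV1 {
  orderId: string;
  customerName: string;
}

interface OrderCreatedV2 {
  version: '2';
  orderId: string;
  firstName: string;
  lastName: string;
}

// Derive the legacy shape from the new one so v1 consumers keep working.
function toV1(event: OrderCreatedV2): OrderCreatedV1 {
  return {
    orderId: event.orderId,
    customerName: `${event.firstName} ${event.lastName}`,
  };
}

// Before the sunset date both versions are published; afterwards only v2.
function eventsToPublish(
  event: OrderCreatedV2,
  v1SunsetDate: Date,
  now: Date = new Date(),
): Array<OrderCreatedV1 | OrderCreatedV2> {
  return now.getTime() < v1SunsetDate.getTime() ? [event, toV1(event)] : [event];
}
```

The sunset date makes the decommissioning step explicit: it can be communicated to consumers up front, and the old version disappears without another code change.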
Why Contracts Matter
Contracts make payloads explicit, verifiable, and testable.
They serve as a shared truth between publishers and consumers.
Without contracts:
- You rely on implicit expectations
- Breaking changes slip through unnoticed
- Downstream failures become unpredictable
With contracts:
- You can validate every incoming event against a defined schema
- You can version safely (support v1 and v2 side by side)
- You can catch errors early, for example through CI/CD validation
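To make the CI/CD idea more concrete, here is a minimal sketch of a check that compares two versions of a schema and reports changes that are likely to break someone. It assumes a deliberately simplified, hypothetical schema description (`SchemaSpec`) rather than real Zod or EventBridge schemas, and `findBreakingChanges` is an illustrative name, not an existing tool.

```typescript
// Hypothetical, simplified schema description: field name -> type and required flag.
type FieldSpec = { type: 'string' | 'number' | 'boolean' | 'object'; required: boolean };
type SchemaSpec = Record<string, FieldSpec>;

// Returns human-readable descriptions of breaking changes between two versions.
function findBreakingChanges(previous: SchemaSpec, next: SchemaSpec): string[] {
  const problems: string[] = [];

  // Removed fields and changed types break consumers that read those fields.
  for (const [name, oldField] of Object.entries(previous)) {
    const newField = next[name];
    if (newField === undefined) {
      problems.push(`field "${name}" was removed`);
    } else if (newField.type !== oldField.type) {
      problems.push(`field "${name}" changed type from ${oldField.type} to ${newField.type}`);
    } else if (!oldField.required && newField.required) {
      problems.push(`field "${name}" became required`);
    }
  }

  // New required fields break producers that still send the old shape.
  for (const [name, newField] of Object.entries(next)) {
    if (previous[name] === undefined && newField.required) {
      problems.push(`new required field "${name}" was added`);
    }
  }

  return problems;
}
```

A pipeline step could fail the build whenever the returned list is non-empty, unless the change ships as a new contract version.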
How Contract Validation Fits into the Architecture
- Producers publish events to an SQS queue
- A Lambda function consumes and validates each event against the expected contract version (v1, v2, etc.)
- If validation fails, the event is retried and eventually sent to a DLQ for inspection (e.g., maxReceiveCount: 3)
This setup ensures that only valid, contract-compliant messages are processed while invalid ones are isolated automatically.
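The retry threshold lives on the source queue as a redrive policy. As a rough sketch, the policy is a JSON string with a `deadLetterTargetArn` and a `maxReceiveCount`; the `buildQueueAttributes` helper and the ARN below are hypothetical and only illustrate the shape.

```typescript
// Builds the queue attributes that attach a DLQ to a source queue.
// SQS expects `RedrivePolicy` as a JSON string, not as a nested object.
function buildQueueAttributes(
  dlqArn: string,
  maxReceiveCount: number,
): { RedrivePolicy: string } {
  if (!Number.isInteger(maxReceiveCount) || maxReceiveCount < 1) {
    throw new Error('maxReceiveCount must be a positive integer');
  }
  return {
    RedrivePolicy: JSON.stringify({ deadLetterTargetArn: dlqArn, maxReceiveCount }),
  };
}

// Hypothetical ARN, for illustration only.
const attributes = buildQueueAttributes(
  'arn:aws:sqs:eu-west-1:123456789012:messages-dlq',
  3,
);
```

The same attribute can be set when creating the queue or through your infrastructure-as-code tool of choice; only the `maxReceiveCount: 3` value is taken from the setup described above.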
Example: Versioned Contracts with Zod and TypeScript
Let’s start with a v1 contract and see how we can introduce a v2 version without breaking compatibility for existing producers.
src/contracts/messageV1.ts
This is our first version of the message contract. It defines the structure that all incoming events must follow, including required fields like id, type, and a payload with a valid timestamp.
import { z } from 'zod';

export const MessageV1 = z.object({
  id: z.string().uuid(),
  type: z.enum(['INFO', 'WARNING', 'ERROR']),
  payload: z.object({
    title: z.string().min(1),
    description: z.string().min(1),
    timestamp: z.string().refine((val) => !isNaN(Date.parse(val)), {
      message: 'Invalid timestamp format',
    }),
  }),
});

export type MessageV1Type = z.infer<typeof MessageV1>;
src/contracts/messageV2.ts
Version 2 extends the original contract by adding a version field and an optional userEmail field in the payload. This shows how to evolve an event schema without breaking existing producers.
import { z } from 'zod';
import { MessageV1 } from './messageV1';

export const MessageV2 = MessageV1.extend({
  version: z.literal('2'),
  payload: MessageV1.shape.payload.extend({
    userEmail: z.string().email().optional(),
  }),
});

export type MessageV2Type = z.infer<typeof MessageV2>;
src/contracts/messageSchema.ts (optional central registry)
A central registry combines all supported contract versions into one schema.
This allows you to validate messages dynamically, regardless of whether they’re version 1 or version 2.
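import { z } from 'zod';
import { MessageV1 } from './messageV1';
import { MessageV2 } from './messageV2';

// Order matters: z.union returns the first schema that parses successfully,
// and z.object strips unknown keys by default, so with MessageV1 first a v2
// message would match it and lose its version and userEmail fields.
export const MessageSchema = z.union([MessageV2, MessageV1]);
export type MessageType = z.infer<typeof MessageSchema>;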
src/handler.ts (detect version first, then validate)
The Lambda handler receives events from SQS, detects the contract version, and validates the message accordingly.
Each version can be processed differently while maintaining backward compatibility and clear validation boundaries.
import { SQSHandler } from 'aws-lambda';
import { MessageV1 } from './contracts/messageV1';
import { MessageV2 } from './contracts/messageV2';

export const processMessage: SQSHandler = async (event) => {
  for (const record of event.Records) {
    try {
      const body = JSON.parse(record.body);
      const isV2 = 'version' in body && body.version === '2';

      if (isV2) {
        const message = MessageV2.parse(body);
        console.log('Processing V2 message', {
          id: message.id,
          type: message.type,
          title: message.payload.title,
          email: message.payload.userEmail || '(no email)',
          timestamp: message.payload.timestamp,
        });
        // V2-specific logic here
      } else {
        const message = MessageV1.parse(body);
        console.log('Processing V1 message', {
          id: message.id,
          type: message.type,
          title: message.payload.title,
          timestamp: message.payload.timestamp,
        });
        // V1 logic continues unchanged
      }
    } catch (error) {
      console.error('Message failed validation or processing', {
        body: record.body,
        error: error instanceof Error ? error.message : String(error),
      });
      // Rethrow so the batch fails: SQS retries it and routes to the DLQ after the threshold
      throw error;
    }
  }
};
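One thing to be aware of: when the handler throws, the whole batch becomes visible again, including messages that were already processed successfully. If the event source mapping is configured to report batch item failures, the handler can instead return only the IDs of the messages that failed. The sketch below illustrates that idea with locally declared minimal types so that it stands alone; `processBatch` and `handleBody` are hypothetical names, and `handleBody` stands in for the parse, validate, and process steps shown above.

```typescript
// Minimal local types so the sketch is self-contained. In a real project
// the equivalent types would come from the `aws-lambda` type package.
interface SqsRecord { messageId: string; body: string }
interface SqsEvent { Records: SqsRecord[] }
interface SqsBatchResponse { batchItemFailures: Array<{ itemIdentifier: string }> }

// `handleBody` is a hypothetical stand-in for "detect version, validate, process".
async function processBatch(
  event: SqsEvent,
  handleBody: (body: unknown) => Promise<void> | void,
): Promise<SqsBatchResponse> {
  const batchItemFailures: SqsBatchResponse['batchItemFailures'] = [];

  for (const record of event.Records) {
    try {
      await handleBody(JSON.parse(record.body));
    } catch {
      // Report only this message as failed; the others are not retried.
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }

  return { batchItemFailures };
}
```

With this shape, one malformed message no longer forces valid messages in the same batch to be reprocessed, while the failed message is still retried and eventually lands in the DLQ.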
Sample Messages
These examples show how valid and invalid messages look for each version of the contract.
Invalid messages will fail validation at runtime and be retried or sent to the DLQ, depending on your configuration.
Valid v1
A valid v1 message that includes all required fields: id, type, and a properly formatted payload.
{
  "id": "e7b1c37b-7b73-4d84-97a4-dc5a3e91c121",
  "type": "INFO",
  "payload": {
    "title": "Legacy Order Processed",
    "description": "Order completed successfully.",
    "timestamp": "2025-10-28T10:00:00Z"
  }
}
Invalid v1
This message is missing the description field inside payload and will fail validation.
{
  "id": "e7b1c37b-7b73-4d84-97a4-dc5a3e91c121",
  "type": "INFO",
  "payload": {
    "title": "Legacy Order Processed",
    "timestamp": "2025-10-28T10:00:00Z"
  }
}
Valid v2
A valid v2 message with the new version field and optional userEmail included.
{
  "version": "2",
  "id": "a8cb7c7f-93a0-4b24-b15d-63d9289ce9b1",
  "type": "INFO",
  "payload": {
    "title": "User Registered",
    "description": "New user created account.",
    "timestamp": "2025-10-28T10:10:00Z",
    "userEmail": "john@example.com"
  }
}
Invalid v2
This message fails validation because the userEmail field is not a valid email address and the timestamp format is invalid.
{
  "version": "2",
  "id": "a8cb7c7f-93a0-4b24-b15d-63d9289ce9b1",
  "type": "INFO",
  "payload": {
    "title": "User Registered",
    "description": "New user created account.",
    "timestamp": "not-a-date",
    "userEmail": "not-an-email"
  }
}
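Sample messages like these also work well as fixtures for a contract test in the pipeline. Below is a small sketch of such a runner. To stay dependency-free it uses a simplified, hand-written check (`hasV1Payload`) that only covers the payload; with Zod you could pass something like `(input) => MessageV1.safeParse(input).success` instead. The `Fixture` type and `runContractFixtures` are hypothetical names.

```typescript
type Fixture = { name: string; input: unknown; shouldPass: boolean };

// Returns the names of fixtures whose validation outcome differs from the expectation.
function runContractFixtures(
  validate: (input: unknown) => boolean,
  fixtures: Fixture[],
): string[] {
  return fixtures
    .filter((fixture) => validate(fixture.input) !== fixture.shouldPass)
    .map((fixture) => fixture.name);
}

// Simplified stand-in for the v1 contract: it checks the payload fields only.
function hasV1Payload(input: unknown): boolean {
  if (typeof input !== 'object' || input === null) return false;
  const payload = (input as { payload?: unknown }).payload;
  if (typeof payload !== 'object' || payload === null) return false;
  const p = payload as Record<string, unknown>;
  const title = p['title'];
  const description = p['description'];
  const timestamp = p['timestamp'];
  return (
    typeof title === 'string' && title.length > 0 &&
    typeof description === 'string' && description.length > 0 &&
    typeof timestamp === 'string' && !Number.isNaN(Date.parse(timestamp))
  );
}

const mismatches = runContractFixtures(hasV1Payload, [
  {
    name: 'valid v1',
    input: { payload: { title: 'Legacy Order Processed', description: 'Order completed successfully.', timestamp: '2025-10-28T10:00:00Z' } },
    shouldPass: true,
  },
  {
    name: 'invalid v1 (missing description)',
    input: { payload: { title: 'Legacy Order Processed', timestamp: '2025-10-28T10:00:00Z' } },
    shouldPass: false,
  },
]);
```

An empty `mismatches` list means the contract still accepts and rejects what the fixtures say it should; any name in the list points at a contract change worth reviewing.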
Why This Approach Works
- Backward compatible: Existing v1 producers continue to work without changes
- Forward ready: v2 producers can safely introduce new fields without breaking v1 consumers
- Safe evolution: You can log and monitor usage across versions, then deprecate v1 once all consumers have migrated
- Confidence: Every message is validated against a known contract, ensuring predictable and reliable behavior
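To know when v1 can actually be retired, it helps to count how many messages of each version are still arriving. The following is a minimal in-memory sketch; `VersionUsage` and `detectVersion` are hypothetical helpers, and in practice the counts would more likely be emitted as metrics or structured logs than kept in memory.

```typescript
// Counts processed messages per contract version.
class VersionUsage {
  private readonly counts = new Map<string, number>();

  record(version: string): void {
    this.counts.set(version, (this.counts.get(version) ?? 0) + 1);
  }

  count(version: string): number {
    return this.counts.get(version) ?? 0;
  }
}

// Mirrors the handler's rule: `version === '2'` means v2, anything else is treated as v1.
function detectVersion(body: unknown): '1' | '2' {
  const isV2 =
    typeof body === 'object' &&
    body !== null &&
    (body as { version?: unknown }).version === '2';
  return isV2 ? '2' : '1';
}

const usage = new VersionUsage();
usage.record(detectVersion({ id: 'a', type: 'INFO' }));
usage.record(detectVersion({ version: '2', id: 'b', type: 'INFO' }));
usage.record(detectVersion({ version: '2', id: 'c', type: 'INFO' }));
```

Once the v1 count stays at zero for an agreed period, decommissioning the old version becomes a decision backed by data rather than a guess.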
Conclusion
In this post, we’ve explored how to prevent downstream errors caused by breaking changes in event-driven systems using contracts and validation.
By defining clear schemas, versioning events properly, and validating messages early, you can build systems that evolve safely without disrupting existing consumers.
The examples using Zod, TypeScript, and AWS services like Lambda and SQS showed how these principles can be applied in practice, giving you confidence that every message is well-structured, validated, and backward compatible.
All the source code used in this article is available on GitHub:
github.com/rashwanlazkani/aws-contract-testing
Here are the key takeaways to keep in mind:
- Treat event schemas as public contracts. Design them with the same care as APIs since multiple services depend on them.
- Use validation (for example, with Zod) and/or a schema registry to enforce structure and catch issues early.
- Always version breaking changes instead of overwriting existing events and maintain backward compatibility during migration.
- Validate early, fail fast, and communicate schema changes clearly across teams to prevent downstream failures.
Event-driven architectures thrive on autonomy and decoupling, but without clear contracts, that autonomy can quickly turn into chaos.
Contracts and validation keep your systems predictable, resilient, and future-proof.

