Schema Registry — Event Schema Evolution, Compatibility, and Validation
The schema problem in event-driven systems#
Producer A publishes an event with a user_id field. Six months later, someone renames it to userId. Consumer B breaks at 3 AM. Nobody knows why because there is no contract, no validation, and no changelog.
A schema registry solves this by acting as the single source of truth for event schemas — and enforcing compatibility rules so changes never break consumers.
What a schema registry does#
- Stores schemas — versioned history of every event schema
- Validates compatibility — rejects schema changes that would break consumers
- Provides schema IDs — producers embed a schema ID in each message so consumers know how to deserialize
- Enables evolution — add fields, deprecate fields, and evolve schemas safely over time
Serialization formats#
Apache Avro#
The most common choice for Kafka ecosystems. Binary format with a compact wire representation.
{
  "type": "record",
  "name": "UserCreated",
  "namespace": "com.example.events",
  "fields": [
    {"name": "user_id", "type": "string"},
    {"name": "email", "type": "string"},
    {"name": "created_at", "type": {"type": "long", "logicalType": "timestamp-millis"}},
    {"name": "plan", "type": "string", "default": "free"}
  ]
}
Strengths: Compact binary encoding, excellent backward/forward compatibility, and the schema always travels with the data — embedded in Avro files, referenced by ID in Kafka messages. The schema is required for both reading and writing, which forces producers and consumers to agree.
Protocol Buffers (Protobuf)#
Google's serialization format. Strongly typed, with explicit field numbering.
syntax = "proto3";

message UserCreated {
  string user_id = 1;
  string email = 2;
  int64 created_at = 3;
  string plan = 4;
}
Strengths: Compact wire size, code generation for many languages, field numbers make evolution explicit. Tradeoff: Requires a code-generation step, and the schema is not self-describing on the wire.
JSON Schema#
Human-readable, widely supported, but verbose on the wire.
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "title": "UserCreated",
  "required": ["user_id", "email", "created_at"],
  "properties": {
    "user_id": {"type": "string"},
    "email": {"type": "string", "format": "email"},
    "created_at": {"type": "integer"},
    "plan": {"type": "string", "default": "free"}
  }
}
Strengths: No special tooling needed, easy to debug, works with REST and webhooks. Tradeoff: No binary encoding (large payloads), less strict evolution guarantees.
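In practice you would hand validation to a library, but the "required" and type checks in the schema above amount to only a few lines. A minimal hand-rolled sketch (the function name and inline field table are illustrative, mirroring the UserCreated schema):

```python
# Minimal check mirroring the UserCreated JSON Schema above.
REQUIRED = ["user_id", "email", "created_at"]
TYPES = {"user_id": str, "email": str, "created_at": int, "plan": str}

def validate_user_created(event: dict) -> list[str]:
    """Return a list of validation errors; an empty list means valid."""
    errors = [f"missing required field: {f}" for f in REQUIRED if f not in event]
    for name, value in event.items():
        expected = TYPES.get(name)
        if expected is not None and not isinstance(value, expected):
            errors.append(f"{name}: expected {expected.__name__}")
    return errors

validate_user_created({"user_id": "u-1", "email": "a@example.com"})
# → ["missing required field: created_at"]
```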
Choosing a format#
| Criteria | Avro | Protobuf | JSON Schema |
|---|---|---|---|
| Wire size | Smallest | Small | Large |
| Schema evolution | Built-in | Via field numbers | Manual |
| Human readable | No (binary) | No (binary) | Yes |
| Code generation | Optional | Required | Optional |
| Kafka ecosystem | Native | Supported | Supported |
| Best for | Kafka-heavy pipelines | Multi-language RPCs | REST/webhook events |
Confluent Schema Registry#
The de facto standard for Kafka. Runs as a separate service alongside your Kafka brokers.
How it works#
Producer Schema Registry Kafka
│ │ │
│── Register schema v1 ───────▸│ │
│◂── Schema ID: 42 ───────────│ │
│ │ │
│── Produce [ID:42 + payload] ─┼───────────────────────▸│
│ │ │
│ │ Consumer │
│ │◂── Get schema ID:42 ──│
│ │── Return schema v1 ───▸│
│ │ (deserialize) │
The first 5 bytes of every message produced with a registry-aware serializer contain a magic byte (0x00) and the 4-byte big-endian schema ID. Consumers use this ID to fetch the correct schema from the registry.
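That header is simple enough to pack by hand. A sketch of the framing, not the official client (the Confluent serializers do this for you):

```python
import struct

MAGIC_BYTE = 0  # wire-format version marker used by Confluent serializers

def frame(schema_id: int, payload: bytes) -> bytes:
    """Prepend the 5-byte header: 1 magic byte + 4-byte big-endian schema ID."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + payload

def unframe(message: bytes) -> tuple[int, bytes]:
    """Split a message into (schema_id, serialized_payload)."""
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[5:]

schema_id, payload = unframe(frame(42, b"avro-bytes"))
# → schema_id == 42, payload == b"avro-bytes"
```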
Registering a schema#
curl -X POST http://schema-registry:8081/subjects/user-created-value/versions \
-H "Content-Type: application/vnd.schemaregistry.v1+json" \
-d '{
"schemaType": "AVRO",
"schema": "{\"type\":\"record\",\"name\":\"UserCreated\",\"fields\":[{\"name\":\"user_id\",\"type\":\"string\"},{\"name\":\"email\",\"type\":\"string\"}]}"
}'
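The same call works from Python with nothing but the standard library. A sketch assuming the placeholder registry address above; note the registry expects the schema itself as an escaped JSON string inside the request body:

```python
import json
import urllib.request

REGISTRY = "http://schema-registry:8081"  # placeholder registry address

def registration_payload(schema: dict) -> bytes:
    # The registry expects the schema as an escaped JSON *string*
    return json.dumps({"schemaType": "AVRO",
                       "schema": json.dumps(schema)}).encode()

def register_schema(subject: str, schema: dict) -> int:
    """POST a new schema version; the registry responds with its global ID."""
    req = urllib.request.Request(
        f"{REGISTRY}/subjects/{subject}/versions",
        data=registration_payload(schema),
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    )
    with urllib.request.urlopen(req) as resp:  # network call
        return json.load(resp)["id"]
```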
Schema compatibility modes#
This is the core value of a schema registry. Compatibility rules prevent breaking changes.
Backward compatible#
New schema can read data written with the old schema. You can remove fields, and add fields that have defaults.
This is the default mode in Confluent Schema Registry. Safe when consumers are upgraded before producers.
Schema v1: {user_id, email}
Schema v2: {user_id, email, plan (default: "free")} ← backward compatible
Consumer with v2 can read v1 data — plan gets the default value.
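Avro's schema resolution applies exactly this rule when a reader schema meets older data. A simplified Python analogue (the sentinel and field layout are illustrative, not Avro's actual representation):

```python
REQUIRED = object()  # sentinel: field has no default
# v2 reader schema, modeled as field name → default value
V2_FIELDS = {"user_id": REQUIRED, "email": REQUIRED, "plan": "free"}

def resolve(record: dict, reader_fields: dict) -> dict:
    """Read a record the way an Avro reader schema would: take the value
    if present, fall back to the default, fail if neither exists."""
    out = {}
    for name, default in reader_fields.items():
        if name in record:
            out[name] = record[name]
        elif default is not REQUIRED:
            out[name] = default
        else:
            raise ValueError(f"no value and no default for '{name}'")
    return out

resolve({"user_id": "u-1", "email": "a@example.com"}, V2_FIELDS)
# → {"user_id": "u-1", "email": "a@example.com", "plan": "free"}
```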
Forward compatible#
Old schema can read data written with the new schema. You can add fields, and remove fields that have defaults.
Safe when producers are upgraded before consumers.
Schema v1: {user_id, email, plan (default: "free")}
Schema v2: {user_id, email} ← forward compatible
Consumer with v1 can read v2 data — plan falls back to its default.
Full compatible#
Both backward and forward compatible. You can only add or remove fields that have defaults.
The safest mode. Use this when you cannot control deployment order of producers and consumers.
Breaking changes (none mode)#
No compatibility checking. Any schema is accepted. Use only in development environments.
| Mode | Add field | Remove field | Rename field | Change type |
|---|---|---|---|---|
| Backward | With default | Yes | No | No |
| Forward | Yes | With default only | No | No |
| Full | With default | With default only | No | No |
| None | Yes | Yes | Yes | Yes |
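The add/remove columns of the table reduce to two small predicates. A sketch that models a schema as a name → field-definition mapping; a real registry also checks type changes and nested records, which this omits:

```python
def backward_compatible(old: dict, new: dict) -> bool:
    """New (reader) schema reads old data: every added field needs a
    default; removed fields are simply dropped on read."""
    added = set(new) - set(old)
    return all("default" in new[f] for f in added)

def forward_compatible(old: dict, new: dict) -> bool:
    """Old (reader) schema reads new data: every removed field must have
    had a default in the old schema; added fields are ignored."""
    removed = set(old) - set(new)
    return all("default" in old[f] for f in removed)

def full_compatible(old: dict, new: dict) -> bool:
    return backward_compatible(old, new) and forward_compatible(old, new)

v1 = {"user_id": {}, "email": {}}
v2 = {"user_id": {}, "email": {}, "plan": {"default": "free"}}
full_compatible(v1, v2)
# → True: "plan" is added with a default, so either side can read the other
```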
Schema validation in CI#
Do not wait for production to discover incompatible schemas. Validate in your CI pipeline.
Schema compatibility check#
# .github/workflows/schema-check.yml
name: Schema Compatibility Check
on: [pull_request]

jobs:
  check-schemas:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Check backward compatibility
        run: |
          for schema_file in schemas/*.avsc; do
            subject=$(basename "$schema_file" .avsc)
            result=$(curl -s -X POST \
              "http://schema-registry:8081/compatibility/subjects/${subject}-value/versions/latest" \
              -H "Content-Type: application/vnd.schemaregistry.v1+json" \
              -d "{\"schema\": $(jq -Rs . "$schema_file")}")
            if echo "$result" | jq -e '.is_compatible == false' > /dev/null; then
              echo "INCOMPATIBLE: $subject"
              exit 1
            fi
          done
Schema linting#
Beyond compatibility, enforce naming conventions and documentation:
# ci/lint_schemas.py
import json
import sys

def lint_avro_schema(schema):
    errors = []
    for field in schema.get("fields", []):
        # Enforce snake_case
        if field["name"] != field["name"].lower():
            errors.append(f"Field '{field['name']}' must be snake_case")
        # Require doc strings
        if "doc" not in field:
            errors.append(f"Field '{field['name']}' missing 'doc'")
        # New fields must have defaults
        if "default" not in field and field.get("_new"):
            errors.append(f"New field '{field['name']}' must have a default")
    return errors

if __name__ == "__main__":
    failed = False
    for path in sys.argv[1:]:
        with open(path) as f:
            for error in lint_avro_schema(json.load(f)):
                print(f"{path}: {error}")
                failed = True
    sys.exit(1 if failed else 0)
Schema evolution best practices#
- Never rename fields — add a new field and deprecate the old one
- Never change field types — `string` to `int` breaks everything
- Always add defaults to new fields — ensures backward compatibility
- Use full compatibility mode in production — the safest option
- Version your schemas in git — schemas are code, treat them that way
- One schema per topic — do not mix event types in a single Kafka topic
- Document every field — the `doc` attribute in Avro exists for a reason
Alternatives to Confluent Schema Registry#
| Tool | Notes |
|---|---|
| AWS Glue Schema Registry | Native for MSK (managed Kafka on AWS) |
| Apicurio Registry | Open source, supports Avro, Protobuf, JSON Schema, OpenAPI |
| Buf | Protobuf-focused, with breaking change detection and linting |
| Karapace | Open-source drop-in replacement for Confluent Schema Registry |
Visualize your event schema architecture#
Map your producers, consumers, schema registry, and compatibility flows — try Codelit to generate an interactive diagram.
Key takeaways#
- A schema registry is the contract layer between producers and consumers
- Avro for Kafka, Protobuf for gRPC, JSON Schema for REST — pick based on your ecosystem
- Full compatibility mode is the safest default for production topics
- Never rename or retype fields — add new fields with defaults instead
- Validate compatibility in CI — catch breaking changes before they reach production
- Schemas are code — version them in git, lint them, review them in PRs
Article #309 on Codelit — Keep building, keep shipping.