Schema Registry: The event is the API

The schema registry provides tools for describing your events.

Ben Pope

September 7, 2021

TL;DR Takeaways:

How can I register a schema in the Redpanda schema registry?

Schemas are registered against a subject, typically in the form {topic}-key or {topic}-value. You can register a schema by posting the schema to the /subjects/{subject-name}/versions endpoint with the Content-Type of application/vnd.schemaregistry.v1+json. The response will include a unique ID for the schema in the Redpanda cluster.

How can I retrieve a schema from the Redpanda schema registry?

You can retrieve a schema directly using its unique ID by sending a GET request to the /schemas/ids/{id} endpoint. You can also retrieve a schema by subject and version using the /subjects/{subject-name}/versions/{version-number} endpoint. To get the latest version of a schema for a subject, replace the version number with 'latest' in the endpoint.

How can I update my existing Redpanda installation to take advantage of the schema registry?

To use the schema registry in an existing Redpanda installation, you need to update to the latest version of Redpanda. If you are setting up a new instance, you can follow the instructions in the Linux, macOS, Kubernetes, or Docker quick start guides provided on the Redpanda website.

What is the purpose of a schema in an event-driven system?

Schemas in event-driven systems serve as the contract between the producer and the consumer. The schema is used to document and evolve the API. It can be used as human-readable documentation for an API, to verify data conformity to the API, to generate serialisers for the data, and to evolve the API with predefined levels of compatibility. This allows new versions of services to be rolled out independently.

What is the Redpanda schema registry subsystem?

The Redpanda schema registry subsystem provides an interface for managing schemas. It is built into the Redpanda binary, with schemas stored on the same Raft-based storage engine. The RESTful interface is available on every broker, ensuring high availability. There are no new binaries to install, no new services to deploy and maintain, and the default configuration just works.

Introduction

Highly scalable, loosely coupled architectures often use an asynchronous event-driven design. In such systems, the contract between the producer and the consumer is the event - the event is the API.

It's important to document the API, and it's important to be able to evolve the API. This is often done using a schema language, such as Apache Avro, JSON Schema, or Protobuf.

We're pleased to announce the beta release of the schema registry subsystem of Redpanda, which provides an interface for managing schemas.

The schema registry is built into the Redpanda binary: schemas are stored on the same Raft-based storage engine, and the RESTful interface is available on every broker. You get the same high availability as your data, there's nothing new to deploy, and it's available today.

To take advantage of the schema registry in an existing Redpanda installation, make sure you update to the latest version. Otherwise, follow the instructions in the Linux, macOS, Kubernetes, or Docker quick start guides to spin up a new Redpanda instance.

If you want to leave the infrastructure issues to us, sign up for Redpanda Cloud for the simplest way to run Redpanda.

To get down to business, skip ahead to the example.

Overview

A loosely coupled architecture not only reduces dependencies in the code, it also reduces communication overhead between and within teams. By defining the API, or in this case the event, with a schema, disparate teams can start work on the subsystems that produce and consume those events with minimal communication overhead.

Operational complexity

At Redpanda, we like to make things simple. Redpanda is an Apache Kafka®-compatible event streaming platform that eliminates ZooKeeper® and the JVM, autotunes itself for modern hardware, and ships in a single binary.

We've built the schema registry directly into Redpanda; there are no new binaries to install, no new services to deploy and maintain, and the default configuration just works.

Schemas are stored in a standard compacted topic, and we use optimistic concurrency control at the topic level so that mutating REST calls can be sent to any broker. There's no need to configure leadership or failover strategies; every broker is symmetric.

Schema

A schema can be used as human-readable documentation for an API, to verify that data conforms to that API, to generate serialisers for the data, and to evolve the API with predefined levels of compatibility, allowing new versions of services to be rolled out independently.

Some data encodings are somewhat self-describing, but that can make them verbose. Some encodings are extensible. JSON, for example, has a property name and a property value. The name isn't part of the information, but it allows new fields to be easily added by the producer and ignored by the consumer.

A schema is an external mechanism to describe the data and its encoding, allowing a reduction in the amount of data transmitted, while keeping the same information. It also allows defaults for new fields, which means that it's possible to decouple the rollout of producers and consumers.
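To make the point about defaults concrete, here is a minimal Python sketch of how a consumer on a newer schema could still read an older record. It is a simplified illustration of the idea, not an Avro decoder; the function name resolve_record and the "unit" field with its default are hypothetical.

```python
# Simplified illustration of reader-side resolution for record fields.
# The "unit" field and its default value are hypothetical.

def resolve_record(record, reader_fields):
    """Project a decoded record onto the reader's fields, filling in defaults."""
    resolved = {}
    for field in reader_fields:
        name = field["name"]
        if name in record:
            resolved[name] = record[name]
        elif "default" in field:
            # The writer did not know this field; fall back to the default.
            resolved[name] = field["default"]
        else:
            raise ValueError(f"no value and no default for field {name!r}")
    return resolved


# The consumer has been upgraded to a schema with a new, defaulted field...
new_reader_fields = [
    {"name": "value", "type": "long"},
    {"name": "unit", "type": "string", "default": "unknown"},
]

# ...while a producer that has not been upgraded still omits it.
old_record = {"value": 42}

print(resolve_record(old_record, new_reader_fields))
# {'value': 42, 'unit': 'unknown'}
```

Because the new field has a default, the consumer can be upgraded before the producer, which is what allows the two rollouts to be decoupled.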

Example

Start Redpanda

Let's jump right in and start Redpanda using Docker on Linux:

docker network create redpanda-sr
docker volume create redpanda-sr
docker run \
  --pull=always \
  --name=redpanda-sr \
  --net=redpanda-sr \
  -v "redpanda-sr:/var/lib/redpanda/data" \
  -p 8081:8081 \
  -p 8082:8082 \
  -p 9092:9092 \
  --detach \
  docker.vectorized.io/vectorized/redpanda start \
  --overprovisioned \
  --smp 1 \
  --memory 1G \
  --reserve-memory 0M \
  --node-id 0 \
  --check=false \
  --pandaproxy-addr 0.0.0.0:8082 \
  --advertise-pandaproxy-addr 127.0.0.1:8082 \
  --kafka-addr 0.0.0.0:9092 \
  --advertise-kafka-addr redpanda-sr:9092

Now we're ready to start using the schema registry!

Endpoints are documented with Swagger at http://localhost:8081/v1 or on SwaggerHub.

I'm using jq to prettify and process the JSON responses.

We'll use the popular requests module (pip install requests).

For the rest of the guide, we'll assume the following for an interactive Python session:

import requests
import json

def pretty(text):
    print(json.dumps(text, indent=2))

base_uri = "http://localhost:8081"

Schemas

The only schema type currently supported is AVRO; we plan to support JSON and PROTOBUF.

You can query the schema registry for the supported types:

curl -s \
  "http://localhost:8081/schemas/types" \
  | jq .

[ "AVRO" ]
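A possible Python equivalent is sketched below. To stay free of extra dependencies it uses the standard library's urllib rather than requests (with requests, the same call would be requests.get(f"{base_uri}/schemas/types").json()). The helper name get_json and its injectable opener parameter are illustrative choices, not part of any Redpanda client.

```python
import json
import urllib.request

BASE_URI = "http://localhost:8081"  # assumes the Docker setup above


def get_json(path, opener=urllib.request.urlopen):
    """GET a schema registry path and decode the JSON response body.

    `opener` defaults to urllib's urlopen; passing a stand-in makes the
    helper usable without a running registry.
    """
    with opener(BASE_URI + path) as response:
        return json.load(response)


# Against a running registry:
#   get_json("/schemas/types")
```

The same helper works for the other read-only endpoints in this guide, such as /subjects.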

Publish a schema

Schemas are registered against a subject, typically in the form {topic}-key or {topic}-value.

Let's register an example Avro schema which represents a measurement from a sensor for the value of the sensor topic.

{
  "type": "record",
  "name": "sensor_sample",
  "fields": [
    {
      "name": "timestamp",
      "type": "long",
      "logicalType": "timestamp-millis"
    },
    {
      "name": "identifier",
      "type": "string",
      "logicalType": "uuid"
    },
    {
      "name": "value",
      "type": "long"
    }
  ]
}

We need to POST the Avro schema to the /subjects/sensor-value/versions endpoint with the Content-Type of application/vnd.schemaregistry.v1+json:

curl -s \
  -X POST \
  "http://localhost:8081/subjects/sensor-value/versions" \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -d '{"schema": "{\"type\":\"record\",\"name\":\"sensor_sample\",\"fields\":[{\"name\":\"timestamp\",\"type\":\"long\",\"logicalType\":\"timestamp-millis\"},{\"name\":\"identifier\",\"type\":\"string\",\"logicalType\":\"uuid\"},{\"name\":\"value\",\"type\":\"long\"}]}"}' \
  | jq

{ "id": 1 }

The id is unique for the schema in the Redpanda cluster.
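The escaping in the curl body is easier to get right programmatically. The sketch below builds the same request in Python, assuming the registry from the Docker setup above; it uses the standard library's urllib rather than requests to stay dependency-free, and the function name registration_request is a hypothetical helper.

```python
import json
import urllib.request

BASE_URI = "http://localhost:8081"  # assumes the Docker setup above

SENSOR_SCHEMA = {
    "type": "record",
    "name": "sensor_sample",
    "fields": [
        {"name": "timestamp", "type": "long", "logicalType": "timestamp-millis"},
        {"name": "identifier", "type": "string", "logicalType": "uuid"},
        {"name": "value", "type": "long"},
    ],
}


def registration_request(subject, schema):
    """Build the POST request that registers `schema` under `subject`."""
    # The Avro schema is serialised to a string first and then embedded in
    # the outer JSON document, which is why the curl body is full of \" escapes.
    body = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URI}/subjects/{subject}/versions",
        data=body,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )


req = registration_request("sensor-value", SENSOR_SCHEMA)
print(req.get_method(), req.full_url)

# To send it to a running registry:
#   with urllib.request.urlopen(req) as response:
#       print(json.load(response))
```

Building the request separately from sending it keeps the double encoding easy to inspect before anything goes over the wire.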

Retrieve the schema by its ID

We can retrieve the schema directly using its ID:

curl -s \
  "http://localhost:8081/schemas/ids/1" \
  | jq .

{ "schema": "{\"type\":\"record\",\"name\":\"sensor_sample\",\"fields\":[{\"name\":\"timestamp\",\"type\":\"long\",\"logicalType\":\"timestamp-millis\"},{\"name\":\"identifier\",\"type\":\"string\",\"logicalType\":\"uuid\"},{\"name\":\"value\",\"type\":\"long\"}]}" }
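Note that the schema comes back as a JSON string inside a JSON document, so it needs to be decoded twice before it can be used as a structure. A minimal sketch in Python, using a shortened stand-in for the response above (the helper name decode_schema is hypothetical):

```python
import json


def decode_schema(response_body):
    """Return the schema from a /schemas/ids/{id} response as a dict."""
    envelope = json.loads(response_body)   # outer JSON document
    return json.loads(envelope["schema"])  # inner, string-encoded schema


# Shortened stand-in for the response above, keeping only the "value" field.
sample_body = (
    r'{"schema": "{\"type\":\"record\",\"name\":\"sensor_sample\",'
    r'\"fields\":[{\"name\":\"value\",\"type\":\"long\"}]}"}'
)

schema = decode_schema(sample_body)
print(schema["name"])                          # sensor_sample
print([f["name"] for f in schema["fields"]])   # ['value']
```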

List the subjects

Now that a schema is associated with a subject, let's list the subjects:

curl -s \
  "http://localhost:8081/subjects" \
  | jq .

[ "sensor-value" ]

Cool! We knew that, but now anyone can discover them.

Retrieve the schema versions for the subject

Schemas associated with subjects are versioned. That's how your API can evolve.

Let's query the versions for the sensor-value subject:

curl -s \
  "http://localhost:8081/subjects/sensor-value/versions" \
  | jq .

[ 1 ]

Retrieve a schema for the subject

If we know the subject and the version we want, we can query directly:

curl -s \
  "http://localhost:8081/subjects/sensor-value/versions/1" \
  | jq .

{
  "subject": "sensor-value",
  "id": 1,
  "version": 1,
  "schema": "{\"type\":\"record\",\"name\":\"sensor_sample\",\"fields\":[{\"name\":\"timestamp\",\"type\":\"long\",\"logicalType\":\"timestamp-millis\"},{\"name\":\"identifier\",\"type\":\"string\",\"logicalType\":\"uuid\"},{\"name\":\"value\",\"type\":\"long\"}]}"
}

Instead of a specific version, we can ask for the latest:

curl -s \
  "http://localhost:8081/subjects/sensor-value/versions/latest" \
  | jq .

It's also possible to query for just the schema by appending /schema to the query path. That unwraps the escaped schema:

curl -s \
  "http://localhost:8081/subjects/sensor-value/versions/latest/schema" \
  | jq .
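The three query forms in this section differ only in the path. As a small sketch, a hypothetical helper such as version_path below could build them in Python; the resulting path would be appended to the registry's base URL.

```python
from urllib.parse import quote


def version_path(subject, version="latest", schema_only=False):
    """Build the registry path for one version of a subject's schema."""
    # Percent-encode the subject in case it contains reserved characters.
    path = f"/subjects/{quote(subject, safe='')}/versions/{version}"
    if schema_only:
        # Ask for the unwrapped schema rather than the full envelope.
        path += "/schema"
    return path


print(version_path("sensor-value", 1))
# /subjects/sensor-value/versions/1
print(version_path("sensor-value"))
# /subjects/sensor-value/versions/latest
print(version_path("sensor-value", schema_only=True))
# /subjects/sensor-value/versions/latest/schema
```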

Compatibility

There are several types of compatibility:

BACKWARD - Consumers using the new version can read data written with the previous version.
FORWARD - Consumers using the previous version can read data written with the new version.
FULL - Both backward and forward compatibility are ensured.

Each of these will check against the most recent version. To check against all registered versions for a subject, they can have _TRANSITIVE appended.

NONE- No compatibility is required.

The default global compatibility is BACKWARD.

Compatibility can be set explicitly for a subject:

curl -s \
  -X PUT \
  "http://localhost:8081/config/sensor-value" \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -d '{"compatibility": "BACKWARD"}' \
  | jq .

{ "compatibility": "BACKWARD" }
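The sketch below builds the same PUT request in Python and rejects misspelled levels before they reach the registry. It assumes the level names are those listed above plus their _TRANSITIVE variants, uses the standard library's urllib rather than requests to stay dependency-free, and the helper name compatibility_request is hypothetical.

```python
import json
import urllib.request

BASE_URI = "http://localhost:8081"  # assumes the Docker setup above

BASE_LEVELS = ("BACKWARD", "FORWARD", "FULL")
# NONE, the three base levels, and their _TRANSITIVE variants.
LEVELS = {"NONE", *BASE_LEVELS, *(f"{level}_TRANSITIVE" for level in BASE_LEVELS)}


def compatibility_request(subject, level):
    """Build the PUT request that sets the compatibility level for `subject`."""
    if level not in LEVELS:
        raise ValueError(f"unknown compatibility level: {level!r}")
    body = json.dumps({"compatibility": level}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URI}/config/{subject}",
        data=body,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="PUT",
    )


req = compatibility_request("sensor-value", "BACKWARD")
print(req.get_method(), req.full_url)

# To send it to a running registry:
#   with urllib.request.urlopen(req) as response:
#       print(json.load(response))
```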

Evolving a schema

Posting a backward-incompatible change to a subject will fail.

For example, changing the type of the value field from long to int:

{
  "error_code": 409,
  "message": "Schema being registered is incompatible with an earlier schema for subject \"{sensor-value}\""
}
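A request that produces this kind of response can be sketched in Python as follows. It derives the changed schema from the original and posts it to the same endpoint used for registration. The helper names with_value_type and register are hypothetical, and the sketch uses the standard library's urllib rather than requests to stay dependency-free; with urllib, a rejection such as the 409 above surfaces as an HTTPError whose body still holds the JSON error document.

```python
import copy
import json
import urllib.error
import urllib.request

BASE_URI = "http://localhost:8081"  # assumes the Docker setup above
CONTENT_TYPE = "application/vnd.schemaregistry.v1+json"

SENSOR_SCHEMA = {
    "type": "record",
    "name": "sensor_sample",
    "fields": [
        {"name": "timestamp", "type": "long", "logicalType": "timestamp-millis"},
        {"name": "identifier", "type": "string", "logicalType": "uuid"},
        {"name": "value", "type": "long"},
    ],
}


def with_value_type(schema, new_type):
    """Return a copy of `schema` whose "value" field has type `new_type`."""
    evolved = copy.deepcopy(schema)
    for field in evolved["fields"]:
        if field["name"] == "value":
            field["type"] = new_type
    return evolved


def register(subject, schema, opener=urllib.request.urlopen):
    """POST `schema` to `subject`; return (accepted, decoded JSON body)."""
    request = urllib.request.Request(
        f"{BASE_URI}/subjects/{subject}/versions",
        data=json.dumps({"schema": json.dumps(schema)}).encode("utf-8"),
        headers={"Content-Type": CONTENT_TYPE},
        method="POST",
    )
    try:
        with opener(request) as response:
            return True, json.load(response)
    except urllib.error.HTTPError as error:
        # The registry's error document is in the body of the failed response.
        return False, json.loads(error.read())


narrowed = with_value_type(SENSOR_SCHEMA, "int")
print(narrowed["fields"][-1])  # {'name': 'value', 'type': 'int'}

# Against a running registry:
#   accepted, body = register("sensor-value", narrowed)
```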

A backward-compatible change would be changing it from long to double:

{ "id": 2 }
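Why one change is rejected and the other accepted follows from the type promotion rules in the Avro specification: a reader can widen a written long to a double, but cannot narrow it to an int. The sketch below encodes those rules for primitive types only. It is an illustration of the principle, not the registry's actual compatibility checker, and the function names are hypothetical.

```python
# Writer type -> reader types it may be promoted to, following the schema
# resolution rules for primitive types in the Avro specification.
PROMOTIONS = {
    "int": {"long", "float", "double"},
    "long": {"float", "double"},
    "float": {"double"},
    "string": {"bytes"},
    "bytes": {"string"},
}


def reader_can_read(writer_type, reader_type):
    """True if data written as `writer_type` can be read as `reader_type`."""
    return reader_type == writer_type or reader_type in PROMOTIONS.get(writer_type, set())


def is_backward_compatible(old_type, new_type):
    """BACKWARD: consumers on the new schema must read data written with the old one."""
    return reader_can_read(writer_type=old_type, reader_type=new_type)


print(is_backward_compatible("long", "int"))     # False
print(is_backward_compatible("long", "double"))  # True
```

Under FORWARD compatibility the roles are swapped: the previous schema is the reader and the new schema is the writer.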

Cleanup

Now we can clean up:

docker stop redpanda-sr
docker rm redpanda-sr
docker volume remove redpanda-sr
docker network remove redpanda-sr

Conclusion

We'll be adding more endpoints and more encodings. For an up-to-date list of features and their status see the schema registry features meta-issue on GitHub.

The schema registry is built on the same principles as Redpanda, but has not yet been optimized for performance. We are continuing to work on the schema registry, so make sure you join our Slack community to get updates on the progress!
