Introducing rsonschema: a fast, ergonomic JSON Schema validator for Rust (and Python)
At hiop, we needed a portable way to describe data contracts. JSON Schema fit the bill, and we wanted a validator that stayed fast on the “compile schema every request” path while still giving end users errors they could act on.
TL;DR
- What: A JSON Schema Draft 2020-12 validator for Rust, with Python bindings (Apache-2.0). Older drafts are not supported. If your schemas declare $schema: draft-07 or earlier, they’ll need updating before you can adopt rsonschema.
- Who: Teams that validate JSON at the edge (APIs, CLIs, data pipelines) and care about latency and error quality.
- Speed: On the cold path (schema work + validation each call) we beat the Rust jsonschema crate on most schema shapes; see the tables below.
- Install: cargo add rsonschema or pip install rsonschema (Python >= 3.10)
Why we built it
In the world of data validation, structure matters. We wanted a language-agnostic format so the same contract could travel between services and stacks, and JSON Schema is the obvious lingua franca.
We had used the jsonschema crate in Rust. It is capable, but we kept tripping over two things:
- jsonschema::error::ValidationError borrows the instance field, which makes it harder to collect, log, or return errors without fighting lifetimes.
- Composition failures (anyOf, oneOf, allOf) surface generic messages instead of the constraint that actually explains what went wrong, which is especially painful when those errors are shown to API clients.
We also looked at valico (similar error-ergonomics issues, and limited maintenance) and schemars (excellent for generating schemas from Rust types, but not a validation engine).
So we built rsonschema: a validator that optimizes for owned, structured errors, human-readable messages, and cold-path performance that matches how many request handlers actually work (schema deserialized per call).
What rsonschema is
- Draft: JSON Schema 2020-12 only; older drafts are not supported yet.
- Role: Validation and error reporting only: no schema generation and no annotation output.
- Keywords: Standard keywords are implemented, including composition (allOf, anyOf, oneOf, not), conditionals (if/then/else), references ($ref, $anchor), unevaluated keywords (unevaluatedProperties, unevaluatedItems), and format assertions. The implementation is organized as one module per keyword under rust/src/schema/keyword/.
- Python: Bindings ship on PyPI via PyO3 and maturin, alongside the crates.io crate.
- Correctness: The official JSON Schema Test Suite (draft 2020-12) is run from this repo; see the README for the small set of deliberately unsupported dynamic keywords.
Design principles
Errors you can own and test
rsonschema::validate returns a ValidationReport with owned ValidationError values. No borrowed instance tying the error to the input buffer. That makes it straightforward to assert on errors in tests or forward them over an API boundary.
let schema = serde_json::json!({
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "minLength": 3
});

// "foo" has three characters, so it satisfies minLength: 3.
let instance = serde_json::json!("foo");
let report = rsonschema::validate(
    &instance,
    schema.clone(),
);
assert!(report.is_valid());

// "a" is too short; the report carries an owned, comparable error.
let instance = serde_json::json!("a");
let report = rsonschema::validate(
    &instance,
    schema,
);
assert_eq!(
    report,
    rsonschema::ValidationReport {
        errors: Some(
            rsonschema::error::ValidationErrors::from([
                rsonschema::error::ValidationError {
                    instance: serde_json::json!("a"),
                    type_: rsonschema::error::type_::ValidationErrorType::MinLength {
                        limit: 3.into(),
                    },
                    ..Default::default()
                }
            ])
        ),
        ..Default::default()
    }
);
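To make the lifetime point concrete, here is a std-only sketch that contrasts an error borrowing the instance with one owning it. BorrowedError, OwnedError, check_borrowed, check_owned, and validate_request are hypothetical names for illustration only; they are not the rsonschema or jsonschema APIs.

```rust
// Hypothetical types for illustration; not the rsonschema or jsonschema API.

/// An error that borrows the failing value, so it cannot outlive the input.
struct BorrowedError<'a> {
    instance: &'a str,
    limit: usize,
}

/// An error that owns a copy of the failing value, so it can move freely.
#[derive(Debug, PartialEq)]
struct OwnedError {
    instance: String,
    limit: usize,
}

/// Returns an error tied to the lifetime of `instance`.
fn check_borrowed(instance: &str, limit: usize) -> Option<BorrowedError<'_>> {
    (instance.chars().count() < limit).then(|| BorrowedError { instance, limit })
}

/// Returns an error that holds its own copy of the failing value.
fn check_owned(instance: &str, limit: usize) -> Option<OwnedError> {
    (instance.chars().count() < limit).then(|| OwnedError {
        instance: instance.to_owned(),
        limit,
    })
}

/// Decodes the input into a local buffer and returns any errors. This works
/// only with the owned variant: a `BorrowedError` would point into `buffer`,
/// which is dropped when the function returns.
fn validate_request(raw: &[u8]) -> Vec<OwnedError> {
    let buffer = String::from_utf8_lossy(raw).into_owned();
    check_owned(&buffer, 3).into_iter().collect()
}

fn main() {
    // Owned errors outlive the buffer they were produced from.
    let errors = validate_request(b"a");
    assert_eq!(errors, vec![OwnedError { instance: "a".to_owned(), limit: 3 }]);

    // Borrowed errors are usable only while `input` is still alive.
    let input = String::from("a");
    if let Some(error) = check_borrowed(&input, 3) {
        println!("{:?} is shorter than {}", error.instance, error.limit);
    }
}
```

Changing validate_request to return the borrowed variant would be rejected by the borrow checker, which is the friction the owned design is meant to avoid.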
Messages for humans, not just for machines
Each error is intended to be safe to show to an end user: failing value, JSON Pointer path, and a precise description.
let schema = serde_json::json!({
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "string",
    "minLength": 5
});
let report = rsonschema::validate(&serde_json::json!("hi"), schema);
let error = report.errors.unwrap().into_iter().min().unwrap();
println!("{error}");
// "hi": must be longer than `5` characters
Nested paths are carried on the error:
let schema = serde_json::json!({
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "properties": {
        "user": {
            "required": ["name", "email"]
        }
    }
});
let report = rsonschema::validate(
    &serde_json::json!({"user": {"name": "Alice"}}),
    schema,
);
let error = report.errors.unwrap().into_iter().min().unwrap();
println!("{error}");
// {"name":"Alice"} at `user`: missing required: `email`
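For readers unfamiliar with JSON Pointer, the following std-only sketch shows the standard RFC 6901 encoding of a path into a document. The json_pointer helper is hypothetical. This post does not specify how rsonschema stores or renders paths internally (the message above prints the path as `user`), so treat this as background on the format, not as the library's implementation.

```rust
/// Encodes path segments as an RFC 6901 JSON Pointer (hypothetical helper).
/// `~` becomes `~0` and `/` becomes `~1`; `~` must be escaped first so the
/// `~` introduced by `~1` is not escaped again.
fn json_pointer(segments: &[&str]) -> String {
    segments
        .iter()
        .map(|segment| format!("/{}", segment.replace('~', "~0").replace('/', "~1")))
        .collect()
}

fn main() {
    // Path to the object that is missing `email` in the example above.
    assert_eq!(json_pointer(&["user"]), "/user");
    // Array indices are plain decimal segments.
    assert_eq!(json_pointer(&["users", "0", "email"]), "/users/0/email");
    // Keys containing `/` or `~` are escaped.
    assert_eq!(json_pointer(&["a/b", "m~n"]), "/a~1b/m~0n");
    // The empty pointer refers to the whole document.
    assert_eq!(json_pointer(&[]), "");
    println!("{}", json_pointer(&["user", "email"]));
}
```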
Composition: surface the branch that “almost” matched
When validation fails under a composition keyword, rsonschema tries to show the most relevant inner error instead of a vague composition failure. It scores how closely the instance resembles each branch (string similarity on values and property names).
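As a rough illustration of similarity-based branch selection, here is a std-only sketch that scores branches by how closely their property names match the instance's keys, using normalized Levenshtein distance. The Branch type, the scoring formula, and all function names are assumptions made for this example; rsonschema's actual heuristic may differ and also considers values, which this sketch does not.

```rust
// Illustrative heuristic only; rsonschema's actual scoring is not shown here.

/// A hypothetical view of one `anyOf` branch: a label and its property names.
struct Branch {
    label: &'static str,
    properties: &'static [&'static str],
}

/// Classic Levenshtein edit distance, computed row by row.
fn levenshtein(a: &str, b: &str) -> usize {
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.chars().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = usize::from(ca != *cb);
            cur.push((prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Similarity in [0, 1]: 1.0 for identical strings, 0.0 when every character must change.
fn similarity(a: &str, b: &str) -> f64 {
    let longest = a.chars().count().max(b.chars().count());
    if longest == 0 {
        return 1.0;
    }
    1.0 - levenshtein(a, b) as f64 / longest as f64
}

/// Average, over the instance's keys, of the best match among the branch's keys.
fn branch_score(instance_keys: &[&str], branch: &Branch) -> f64 {
    if instance_keys.is_empty() {
        return 0.0;
    }
    let total: f64 = instance_keys
        .iter()
        .map(|key| branch.properties.iter().map(|p| similarity(key, p)).fold(0.0, f64::max))
        .sum();
    total / instance_keys.len() as f64
}

/// Picks the branch whose property names most resemble the instance's keys.
fn closest_branch<'a>(instance_keys: &[&str], branches: &'a [Branch]) -> Option<&'a Branch> {
    branches.iter().max_by(|x, y| {
        branch_score(instance_keys, x).total_cmp(&branch_score(instance_keys, y))
    })
}

fn main() {
    let branches = [
        Branch { label: "user", properties: &["name", "email"] },
        Branch { label: "product", properties: &["sku", "price"] },
    ];
    // Misspelled keys still resemble the "user" branch far more than "product".
    let best = closest_branch(&["nme", "emial"], &branches).unwrap();
    assert_eq!(best.label, "user");
    println!("closest branch: {}", best.label);
}
```

Once a closest branch is chosen, a validator can report that branch's inner errors instead of a generic composition failure.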
let schema = serde_json::json!({
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "anyOf": [
        {"type": "string", "minLength": 5},
        {"type": "integer", "minimum": 10}
    ]
});
let report = rsonschema::validate(&serde_json::json!("hi"), schema);
let error = report.errors.unwrap().into_iter().min().unwrap();
println!("{error}");
// "hi": must be longer than `5` characters
"hi" is clearly closer to the string branch, so you get the minLength explanation, not a generic “did not match any schema”.
Quick start
Rust
cargo add rsonschema
Python
pip install rsonschema
import rsonschema
schema = {"$schema": "https://json-schema.org/draft/2020-12/schema", "minLength": 3}
errors = rsonschema.validate("foo", schema)
assert errors == []
errors = rsonschema.validate("a", schema)
assert len(errors) == 1
assert str(errors[0]) # human-readable error description
Performance
We benchmark against both the Rust jsonschema crate and the jsonschema Python package. Two competitor modes matter:
- Cold: compile the schema and validate on every iteration (aligned with rsonschema::validate, which takes an owned serde_json::Value schema each call).
- Warm: compile once, measure validation only (best case for libraries that reuse a compiled validator).
All numbers below were measured on Apple M3 (8 GB RAM). Rust uses Criterion (median, 3 s warmup, 100 samples); Python uses pytest-benchmark on a release wheel (mean). Each Rust iteration clones the schema because rsonschema::validate takes an owned value; that’s the realistic single-call shape. Full setup and per-scenario schemas are in BENCHMARKS.md.
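The difference between the two modes can be sketched with a toy validator and a hand-rolled timing loop. Everything here is an assumption for illustration: Compiled, compile, and validate_cold are hypothetical stand-ins, the "schema" is a one-keyword string, and the real benchmarks use Criterion, not this loop.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Hypothetical compiled form of a tiny one-keyword schema such as "minLength=3".
struct Compiled {
    min_length: usize,
}

/// Stand-in for the compile step (parsing, resolving references, and so on).
fn compile(schema: &str) -> Compiled {
    let limit = schema
        .strip_prefix("minLength=")
        .and_then(|n| n.parse().ok())
        .unwrap_or(0);
    Compiled { min_length: limit }
}

impl Compiled {
    /// Validation against an already compiled schema.
    fn is_valid(&self, instance: &str) -> bool {
        instance.chars().count() >= self.min_length
    }
}

/// Cold path: schema work plus validation on every call.
fn validate_cold(schema: &str, instance: &str) -> bool {
    compile(schema).is_valid(instance)
}

fn main() {
    let (schema, instance, iterations) = ("minLength=3", "foo", 100_000u32);

    // Cold: the compile cost is paid inside the measured loop.
    let start = Instant::now();
    for _ in 0..iterations {
        black_box(validate_cold(black_box(schema), black_box(instance)));
    }
    let cold = start.elapsed() / iterations;

    // Warm: compile once outside the loop, measure validation only.
    let compiled = compile(schema);
    let start = Instant::now();
    for _ in 0..iterations {
        black_box(compiled.is_valid(black_box(instance)));
    }
    let warm = start.elapsed() / iterations;

    println!("cold: {cold:?} per call, warm: {warm:?} per call");
}
```

Which mode matters depends on whether your handler can keep a compiled validator around between calls.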
Rust, lower is better. Re-run with cargo bench --package rsonschema.
| Scenario | rsonschema | jsonschema (cold) | jsonschema (warm) |
|---|---|---|---|
| simple_string_valid | 738 ns | 2.14 µs | 5.6 ns |
| complex_object_valid (5 fields) | 6.85 µs | 8.95 µs | 128 ns |
| any_of_composition | 3.25 µs | 4.91 µs | 3.5 ns |
| all_of_composition | 4.26 µs | 6.10 µs | 17.9 ns |
| array_of_objects (50 items) | 54.0 µs | 7.74 µs | 1.59 µs |
Python, lower is better.
| Scenario | rsonschema | jsonschema (cold) | jsonschema (warm) |
|---|---|---|---|
| simple_string_valid | 1.36 µs | 4.36 µs | 1.12 µs |
| array_of_objects (50 items) | 78.12 µs | 392.46 µs | 398.24 µs |
Takeaways (read the numbers in context).
- On the cold path, rsonschema is faster than the Rust jsonschema crate for the string, object, anyOf, and allOf scenarios above: workloads that resemble “deserialize schema, validate once” request handling.
- For array_of_objects in Rust, jsonschema wins cold (7.74 µs vs 54.0 µs, roughly 7×) thanks to a very fast array traversal; we call that out instead of cherry-picking only flattering rows.
- In Python, rsonschema beats jsonschema cold on every scenario in our suite and is dramatically faster on the 50-object array (~5× faster mean time in that row).
- On simple_string_valid in Python, jsonschema/warm (1.12 µs) edges out rsonschema (1.36 µs), but that’s warm vs cold. The warm path assumes you’ve already paid the compile cost once and are reusing the compiled validator; on the cold path that’s relevant to most request handlers, rsonschema is about 3× faster than jsonschema/cold (4.36 µs).
- For the Python array_of_objects row, jsonschema/warm (398 µs) lands a touch above jsonschema/cold (392 µs): the FFI boundary and per-element work dominate, so the compile-once optimisation barely registers, and the gap is within measurement noise.
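The ratios quoted in these takeaways follow directly from the tables. A small sketch of the arithmetic, with the times copied from the rows above and a hypothetical speedup helper:

```rust
/// How many times faster the second time is than the first (same time unit).
fn speedup(slower: f64, faster: f64) -> f64 {
    slower / faster
}

fn main() {
    // Times copied from the tables above, in a common unit per row.
    let rows = [
        ("Rust simple_string_valid: rsonschema vs jsonschema cold (ns)", 2140.0, 738.0),
        ("Rust array_of_objects: jsonschema cold vs rsonschema (µs)", 54.0, 7.74),
        ("Python simple_string_valid: rsonschema vs jsonschema cold (µs)", 4.36, 1.36),
        ("Python array_of_objects: rsonschema vs jsonschema cold (µs)", 392.46, 78.12),
    ];
    for (label, slower, faster) in rows {
        println!("{label}: {:.1}x", speedup(slower, faster));
    }
    // Prints ratios of 2.9x, 7.0x, 3.2x, and 5.0x, in that order.
}
```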
Hardware and background load move absolute times; always re-run benchmarks on your own machine when choosing a library.
Scope: what we deliberately skip
We target one draft (2020-12) and validation only. That keeps the surface area understandable and the test suite authoritative.
We do not implement the dynamic reference keywords $dynamicAnchor and $dynamicRef: they add substantial complexity for relatively rare real-world schemas. Everything else in the supported draft runs against the official test suite as documented in the README.
Try it and contribute
- Rust: crates.io/crates/rsonschema · docs.rs/rsonschema
- Python: pypi.org/project/rsonschema
- Source & CI: github.com/hiop-oss/rsonschema
If you’re fighting lifetimes around jsonschema::ValidationError, or shipping cryptic anyOf / oneOf failures to your API clients, rsonschema is built for exactly that pain. Try it on a real schema and tell us what breaks. Issues and PRs welcome. See CONTRIBUTING.md for how we run tests and linting.