Introducing rsonschema: a fast, ergonomic JSON Schema validator for Rust (and Python)

At hiop, we needed a portable way to describe data contracts. JSON Schema fit the bill, and we wanted a validator that stayed fast on the “compile schema every request” path while still giving end users errors they could act on.

TL;DR

  • What: A JSON Schema Draft 2020-12 validator for Rust, with Python bindings (Apache-2.0). Older drafts are not supported. If your schemas declare $schema: draft-07 or earlier, they’ll need updating before you can adopt rsonschema.
  • Who: Teams that validate JSON at the edge (APIs, CLIs, data pipelines) and care about latency and error quality
  • Speed: On the cold path (schema work + validation each call) we beat the Rust jsonschema crate on most schema shapes; see the tables below.
  • Install: cargo add rsonschema or pip install rsonschema (Python >= 3.10)

Why we built it

In the world of data validation, structure matters. We wanted a language-agnostic format so the same contract could travel between services and stacks, and JSON Schema is the obvious lingua franca.

We had used the jsonschema crate in Rust. It is capable, but we kept tripping over two things:

  1. jsonschema::error::ValidationError borrows the instance field, which makes it harder to collect, log, or return errors without fighting lifetimes.
  2. Composition failures (anyOf, oneOf, allOf) surface generic messages instead of the constraint that actually explains what went wrong. This is especially painful when those errors are shown to API clients.

We also looked at valico (similar error-ergonomics issues, and limited maintenance) and schemars (excellent for generating schemas from Rust types, but not a validation engine).

So we built rsonschema: a validator that optimizes for owned, structured errors, human-readable messages, and cold-path performance that matches how many request handlers actually work (schema deserialized per call).

What rsonschema is

  • Draft: JSON Schema 2020-12 only; older drafts are not supported yet.
  • Role: Validation and error reporting only; no schema generation and no annotation output.
  • Keywords: Standard keywords are implemented, including composition (allOf, anyOf, oneOf, not), conditionals (if / then / else), references ($ref, $anchor), unevaluated keywords (unevaluatedProperties, unevaluatedItems), and format assertions. The implementation is organized as one module per keyword under rust/src/schema/keyword/.
  • Python: Bindings ship on PyPI via PyO3 and maturin, alongside the crates.io crate.
  • Correctness: The official JSON Schema Test Suite (draft 2020-12) is run from this repo; see the README for the small set of deliberately unsupported dynamic keywords.

Design principles

Errors you can own and test

rsonschema::validate returns a ValidationReport with owned ValidationError values. No borrowed instance tying the error to the input buffer. That makes it straightforward to assert on errors in tests or forward them over an API boundary.

let schema = serde_json::json!({
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "minLength": 3
});

let instance = serde_json::json!("foo");
let report = rsonschema::validate(
    &instance,
    schema.clone(),
);
assert!(report.is_valid());

let instance = serde_json::json!("a");
let report = rsonschema::validate(
    &instance,
    schema,
);
assert_eq!(
    report,
    rsonschema::ValidationReport {
        errors: Some(
            rsonschema::error::ValidationErrors::from([
                rsonschema::error::ValidationError {
                    instance: serde_json::json!("a"),
                    type_: rsonschema::error::type_::ValidationErrorType::MinLength {
                        limit: 3.into(),
                    },
                    ..Default::default()
                }
            ])
        ),
        ..Default::default()
    }
);
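
To make the lifetime point concrete, here is a minimal std-only sketch. The types `BorrowedError` and `OwnedError` and the helper functions are hypothetical and exist only for illustration; they are not the API of jsonschema or rsonschema. The sketch shows that an error borrowing from its input cannot outlive that input, while an owned error can be returned freely.

```rust
// Hypothetical types for illustration; not the API of jsonschema or rsonschema.

/// An error that borrows the offending value from the input.
#[derive(Debug)]
struct BorrowedError<'a> {
    instance: &'a str,
    message: String,
}

/// An error that owns a copy of the offending value.
#[derive(Debug, Clone, PartialEq)]
struct OwnedError {
    instance: String,
    message: String,
}

/// Toy check: every item must be at least `min_len` characters long.
fn check_borrowed<'a>(items: &'a [String], min_len: usize) -> Vec<BorrowedError<'a>> {
    items
        .iter()
        .filter(|s| s.chars().count() < min_len)
        .map(|s| BorrowedError {
            instance: s.as_str(),
            message: format!("shorter than {min_len} characters"),
        })
        .collect()
}

/// Converting to owned errors detaches them from the input's lifetime.
fn check_owned(items: &[String], min_len: usize) -> Vec<OwnedError> {
    check_borrowed(items, min_len)
        .into_iter()
        .map(|e| OwnedError {
            instance: e.instance.to_owned(),
            message: e.message,
        })
        .collect()
}

/// The input is created and dropped inside this function, so only the owned
/// form can be returned; returning `Vec<BorrowedError<'_>>` would not compile.
fn validate_request() -> Vec<OwnedError> {
    let body = vec![String::from("foo"), String::from("a")];
    check_owned(&body, 3)
}

fn main() {
    let errors = validate_request();
    for e in &errors {
        println!("{:?}: {}", e.instance, e.message);
    }
}
```

The cost of the owned form is one copy of each failing value, which is usually a reasonable trade for errors that can be stored, compared in tests, or serialized into a response.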

Messages for humans, not just for machines

Each error is intended to be safe to show to an end user: failing value, JSON Pointer path, and a precise description.

let schema = serde_json::json!({
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "string",
    "minLength": 5
});

let report = rsonschema::validate(&serde_json::json!("hi"), schema);
let error = report.errors.unwrap().into_iter().min().unwrap();
println!("{error}");
// "hi": must be longer than `5` characters

Nested paths are carried on the error:

let schema = serde_json::json!({
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "properties": {
        "user": {
            "required": ["name", "email"]
        }
    }
});

let report = rsonschema::validate(
    &serde_json::json!({"user": {"name": "Alice"}}),
    schema,
);
let error = report.errors.unwrap().into_iter().min().unwrap();
println!("{error}");
// {"name":"Alice"} at `user`: missing required: `email`

Composition: surface the branch that “almost” matched

When validation fails under a composition keyword, rsonschema tries to show the most relevant inner error instead of a vague composition failure. It scores how closely the instance resembles each branch (string similarity on values and property names).
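
As a rough illustration of the idea, and explicitly not rsonschema's actual algorithm, the std-only sketch below ranks branches by how many property names they share with the instance (Jaccard similarity) and returns the error of the best match. The `Branch` struct, `similarity`, and `most_relevant_error` are hypothetical names, and the sketch ignores value similarity entirely.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified view of one `anyOf` branch: the property names it
/// declares and the error produced when the instance was checked against it.
struct Branch {
    properties: Vec<&'static str>,
    error: &'static str,
}

/// Jaccard similarity between two sets of property names (0.0 to 1.0).
fn similarity<'a>(instance_keys: &BTreeSet<&'a str>, branch_keys: &BTreeSet<&'a str>) -> f64 {
    let union = instance_keys.union(branch_keys).count();
    if union == 0 {
        return 0.0;
    }
    let shared = instance_keys.intersection(branch_keys).count();
    shared as f64 / union as f64
}

/// Return the error of the branch whose property names best match the instance.
fn most_relevant_error(instance_keys: &[&'static str], branches: &[Branch]) -> Option<&'static str> {
    let instance: BTreeSet<&str> = instance_keys.iter().copied().collect();
    branches
        .iter()
        .map(|b| {
            let keys: BTreeSet<&str> = b.properties.iter().copied().collect();
            (similarity(&instance, &keys), b.error)
        })
        .max_by(|a, b| a.0.total_cmp(&b.0))
        .map(|(_, error)| error)
}

fn main() {
    let branches = [
        Branch { properties: vec!["name", "email"], error: "missing required: `email`" },
        Branch { properties: vec!["id", "token"], error: "missing required: `token`" },
    ];
    // The instance only has `name`, so it resembles the first branch more closely.
    let error = most_relevant_error(&["name"], &branches);
    println!("{error:?}");
}
```

With rsonschema itself, none of this scoring is written by the caller; the report already carries the selected error: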

let schema = serde_json::json!({
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "anyOf": [
        {"type": "string", "minLength": 5},
        {"type": "integer", "minimum": 10}
    ]
});

let report = rsonschema::validate(&serde_json::json!("hi"), schema);
let error = report.errors.unwrap().into_iter().min().unwrap();
println!("{error}");
// "hi": must be longer than `5` characters

"hi" is clearly closer to the string branch, so you get the minLength explanation, not a generic “did not match any schema”.

Quick start

Rust

cargo add rsonschema

Python

pip install rsonschema

import rsonschema

schema = {"$schema": "https://json-schema.org/draft/2020-12/schema", "minLength": 3}

errors = rsonschema.validate("foo", schema)
assert errors == []

errors = rsonschema.validate("a", schema)
assert len(errors) == 1
assert str(errors[0])  # human-readable error description

Performance

We benchmark against both the Rust jsonschema crate and the jsonschema Python package. Two competitor modes matter:

  • Cold: compile the schema and validate on every iteration (aligned with rsonschema::validate, which takes an owned serde_json::Value schema each call).
  • Warm: compile once, measure validation only (best case for libraries that reuse a compiled validator).

All numbers below were measured on an Apple M3 (8 GB RAM). Rust uses Criterion (median, 3 s warmup, 100 samples); Python uses pytest-benchmark on a release wheel (mean). Each Rust iteration clones the schema because rsonschema::validate takes an owned value; that is the realistic single-call shape. Full setup and per-scenario schemas are in BENCHMARKS.md.
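
The difference between the two modes can be sketched with a toy validator and `std::time::Instant`. Everything here is an assumption made for illustration: the `Compiled` struct, the `minLength=<n>` schema syntax, and the helper names are hypothetical, and this is not how the published numbers were produced (those come from Criterion and pytest-benchmark).

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a compiled schema; a real validator does far more work here.
struct Compiled {
    min_length: usize,
}

/// Toy "compile" step: parse a schema written as `minLength=<n>`.
fn compile(schema: &str) -> Option<Compiled> {
    let min_length: usize = schema.strip_prefix("minLength=")?.parse().ok()?;
    Some(Compiled { min_length })
}

/// Toy validation step: the instance must have at least `min_length` characters.
fn is_valid(compiled: &Compiled, instance: &str) -> bool {
    instance.chars().count() >= compiled.min_length
}

/// Cold path: compile and validate on every iteration.
fn run_cold(schema: &str, instance: &str, iterations: u32) -> (u32, Duration) {
    let start = Instant::now();
    let mut valid = 0;
    for _ in 0..iterations {
        let compiled = compile(black_box(schema)).expect("toy schema should parse");
        if is_valid(&compiled, black_box(instance)) {
            valid += 1;
        }
    }
    (valid, start.elapsed())
}

/// Warm path: compile once up front, then time validation only.
fn run_warm(schema: &str, instance: &str, iterations: u32) -> (u32, Duration) {
    let compiled = compile(schema).expect("toy schema should parse");
    let start = Instant::now();
    let mut valid = 0;
    for _ in 0..iterations {
        if is_valid(&compiled, black_box(instance)) {
            valid += 1;
        }
    }
    (valid, start.elapsed())
}

fn main() {
    let (cold_valid, cold_time) = run_cold("minLength=3", "foo", 100_000);
    let (warm_valid, warm_time) = run_warm("minLength=3", "foo", 100_000);
    println!("cold: {cold_valid} valid in {cold_time:?}");
    println!("warm: {warm_valid} valid in {warm_time:?}");
}
```

A hand-rolled loop like this is only good for building intuition; a harness such as Criterion handles warmup, outliers, and statistics far more carefully.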

Rust, lower is better. Re-run with cargo bench --package rsonschema.

Scenario                          rsonschema   jsonschema (cold)   jsonschema (warm)
simple_string_valid               738 ns       2.14 µs             5.6 ns
complex_object_valid (5 fields)   6.85 µs      8.95 µs             128 ns
any_of_composition                3.25 µs      4.91 µs             3.5 ns
all_of_composition                4.26 µs      6.10 µs             17.9 ns
array_of_objects (50 items)       54.0 µs      7.74 µs             1.59 µs

Python, lower is better.

Scenario                          rsonschema   jsonschema (cold)   jsonschema (warm)
simple_string_valid               1.36 µs      4.36 µs             1.12 µs
array_of_objects (50 items)       78.12 µs     392.46 µs           398.24 µs

Takeaways (read the numbers in context).

  • On the cold path, rsonschema is faster than the Rust jsonschema crate for the string, object, anyOf, and allOf scenarios above. These workloads resemble “deserialize schema, validate once” request handling.
  • For array_of_objects in Rust, jsonschema wins cold thanks to a very fast array traversal; we call that out instead of cherry-picking only flattering rows.
  • In Python, rsonschema beats jsonschema cold on every scenario in our suite and is dramatically faster on the 50-object array (~5× faster mean time in that row).
  • On simple_string_valid in Python, jsonschema/warm (1.12 µs) edges out rsonschema (1.36 µs), but that’s warm vs cold. The warm path assumes you’ve already paid the compile cost once and are reusing the compiled validator; on the cold path that’s relevant to most request handlers, rsonschema is about 3.2× faster than jsonschema/cold (4.36 µs).
  • For the Python array_of_objects row, jsonschema/warm (398 µs) lands a touch above jsonschema/cold (392 µs). The FFI boundary and per-element work dominate, so the compile-once optimisation barely registers and the gap is within measurement noise.
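
The ratios quoted in these takeaways follow directly from the table rows. A small sketch of the arithmetic, with a hypothetical `speedup` helper and the cold-path figures copied from the tables (Rust times converted to microseconds):

```rust
/// Baseline time divided by candidate time; above 1.0 means the candidate is faster.
fn speedup(baseline: f64, candidate: f64) -> f64 {
    baseline / candidate
}

fn main() {
    // (scenario, rsonschema, jsonschema cold), in microseconds, copied from the tables.
    let rows = [
        ("rust/simple_string_valid", 0.738, 2.14),
        ("rust/complex_object_valid", 6.85, 8.95),
        ("rust/any_of_composition", 3.25, 4.91),
        ("rust/all_of_composition", 4.26, 6.10),
        ("rust/array_of_objects", 54.0, 7.74),
        ("python/simple_string_valid", 1.36, 4.36),
        ("python/array_of_objects", 78.12, 392.46),
    ];
    for (scenario, rsonschema, jsonschema_cold) in rows {
        let ratio = speedup(jsonschema_cold, rsonschema);
        println!("{scenario}: {ratio:.2}x");
    }
}
```

A ratio below 1.0, as in the Rust array_of_objects row, means jsonschema is the faster library for that scenario.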

Hardware and background load move absolute times; always re-run benchmarks on your own machine when choosing a library.

Scope: what we deliberately skip

We target one draft (2020-12) and validation only. That keeps the surface area understandable and the test suite authoritative.

We do not implement the dynamic reference keywords $dynamicAnchor and $dynamicRef because they add substantial complexity for relatively rare real-world schemas. Everything else in the supported draft runs against the official test suite as documented in the README.

Try it and contribute

If you’re fighting lifetimes around jsonschema::ValidationError, or shipping cryptic anyOf / oneOf failures to your API clients, rsonschema is built for exactly that pain. Try it on a real schema and tell us what breaks. Issues and PRs welcome. See CONTRIBUTING.md for how we run tests and linting.