
Canonical JSON in 2026: How to Sign, Hash, and Compare JSON Without Security Footguns

Learn why you should never sign raw JSON text, how RFC 8785 (JCS) solves canonicalization, and the replay-resistant envelope patterns that actually work in production.

#canonicalization #signing #rfc-8785 #jcs #security #cryptography

You've got JSON flowing between services. You need to sign it, hash it, or compare it deterministically. Sounds simple—until you realize that {"a":1,"b":2} and {"b":2,"a":1} are semantically identical but produce completely different hashes. Welcome to the canonicalization problem.

TL;DR

  • Never sign "JSON text" as-is — whitespace, key order, and escaping can change without changing meaning
  • Use JCS (RFC 8785) to canonicalize before signing/hashing
  • Treat JSON parsing as an attack surface: reject duplicate keys, enforce size limits
  • Use replay-resistant envelopes with nonce, timestamp, audience, and expiry

The Problem: JSON is a Data Model, Not Bytes

Per RFC 8259, these two JSON documents are semantically equivalent:

equivalent-json-1.json
json
{"a":1,"b":2}
equivalent-json-2.json
json
{
  "b": 2,
  "a": 1
}

But if you hash them directly, you get completely different results. This breaks signature verification, content-addressable storage, and any system that relies on "same data = same bytes."

⚠️ The Common Failure Mode: Teams sign the string they happened to serialize, then later parse/pretty-print/re-serialize and wonder why signatures "randomly" fail—or worse, accept malformed JSON that passes signature checks but changes semantics when parsed differently.

Why This Matters More in 2026

We're seeing more systems that treat JSON as a portable envelope for:

  • AI/tool-calling: Model outputs as JSON that trigger real-world actions
  • Zero-trust service-to-service calls: Signed payloads across trust boundaries
  • Embedded/enclosed systems (IoT/industrial): Constrained devices sending JSON telemetry and receiving JSON commands

In all these cases, you need cryptographic guarantees over JSON data. That requires canonicalization.

The Solution: JCS (RFC 8785)

JSON Canonicalization Scheme (JCS) defines deterministic rules for turning a JSON value into canonical JSON text:

  • Object member ordering: Lexicographic by UTF-16 code units
  • Minimal whitespace: No unnecessary spaces or newlines
  • Consistent escaping: Predictable character escaping
  • Number normalization: 1, 1.0, and 1e0 all canonicalize to the same form (1)

If you need "same data ⇒ same bytes," JCS is the baseline I recommend in 2026.
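
For example, the two equivalent documents from the start of this post produce byte-identical canonical text. A quick check using the json-canonicalize package (the same one used in the pipeline code later in this post):

jcs-equality.ts
typescript
import { canonicalize } from 'json-canonicalize';

// Key order and formatting differences disappear after canonicalization.
const a = canonicalize({ a: 1, b: 2 });
const b = canonicalize({ b: 2, a: 1 });

console.log(a);       // {"a":1,"b":2}
console.log(a === b); // true

// Number normalization: 1, 1.0, and 1e0 all parse to the same value
// and serialize to the same canonical form.
console.log(canonicalize(JSON.parse('{"n":1e0}'))); // {"n":1}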

[Figure: The JSON canonicalization and signing pipeline: Raw JSON → Parse + Validate (strict parse, no duplicate keys, schema validation) → Canonicalize with JCS (sorted keys, minimal whitespace, normalized numbers) → Sign (Ed25519 / ECDSA P-256), wrapped in a replay-resistant envelope carrying kid, aud, iat, exp, nonce, and schema fields]

What JCS Does NOT Solve

JCS gives you stable bytes. It doesn't decide your product policy on:

  • Duplicate keys: JSON itself doesn't forbid them; parsers disagree on handling
  • Numeric range/precision: JavaScript's IEEE-754 limits vs. other languages
  • Schema semantics: Business rules and data contracts
  • Replay protection: Signing alone doesn't stop replay attacks

You still need guardrails on top of canonicalization.

Threat Model: What Can Go Wrong

1. Signature Bypass via Parser Divergence

If your verifier parses JSON with library A and your business logic uses library B, differences in duplicate key handling, number coercion, or Unicode normalization can create two different "meanings" for the same signed payload.

Rule: Verify signature and interpret payload using a single, strict parse/validate/canonicalize pipeline.

2. Replay Attacks

If your signed JSON command is:

replay-vulnerable.json
json
{"action":"unlockDoor","doorId":"A12"}

An attacker who records it can replay it later—the signature is still valid. This is especially dangerous in IoT and industrial systems.

Rule: Include nonce/timestamp, bind to recipient, and enforce freshness.

3. JSON-Level DoS

Even "valid" JSON can be weaponized:

  • Huge strings/arrays/objects (memory exhaustion)
  • Deep nesting (stack blowups)
  • Pathological numbers (very long exponents)
  • Regex-heavy schema validation (ReDoS)

Rule: Enforce budgets at parse time and validation time.

The Safe Pipeline (Recommended)

Step 0: Define Your "JSON Policy"

Decide and document these upfront:

  • Strict JSON only: No comments, no trailing commas
  • Duplicate keys: Reject at parse time
  • Numbers: IEEE-754 double, decimal string, or BigInt with constraints?
  • Unicode: Require valid UTF-8; don't normalize strings
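
One way to keep these decisions consistent across services is to encode them in a single shared module that both the verifier and the business logic import. A minimal sketch (the field names here are illustrative, not any standard):

json-policy.ts
typescript
// Illustrative policy constants; the names are this sketch's own,
// not part of RFC 8785 or any other standard.
export const JSON_POLICY = {
  strictSyntax: true,        // no comments, no trailing commas
  duplicateKeys: 'reject',   // fail at parse time on duplicates
  numbers: 'ieee754-double', // or 'decimal-string' for money/big IDs
  unicode: 'exact',          // require valid UTF-8; never normalize
} as const;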

Step 1: Parse Strictly + Enforce Budgets

Budgets I actually use in production:

parse-budgets.ts
typescript
const PARSE_LIMITS = {
  maxBytes: 1_000_000,      // 1MB max body
  maxDepth: 20,             // Nesting depth
  maxKeys: 100,             // Keys per object
  maxArrayLength: 10_000,   // Items per array
  maxStringLength: 100_000, // Characters per string
};

function parseStrictJSON(input: string): unknown {
  // Measure UTF-8 bytes, not UTF-16 code units: input.length
  // undercounts multi-byte characters by up to 3x.
  if (new TextEncoder().encode(input).length > PARSE_LIMITS.maxBytes) {
    throw new Error('JSON exceeds size limit');
  }

  // strictJSONParse stands in for a strict parser that rejects
  // duplicate keys and enforces the depth/size limits above.
  return strictJSONParse(input, PARSE_LIMITS);
}
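
If your JSON library can't enforce these limits natively, a post-parse walk over the value tree covers the depth and size budgets, though not duplicate keys, which JSON.parse silently collapses before you ever see them. A minimal sketch reusing the PARSE_LIMITS above:

enforce-budgets.ts
typescript
// Walks an already-parsed value and throws if any budget is exceeded.
// Caveat: this cannot detect duplicate keys; JSON.parse has already
// applied last-write-wins by the time this code runs.
function enforceBudgets(value: unknown, depth = 0): void {
  if (depth > PARSE_LIMITS.maxDepth) throw new Error('Too deeply nested');

  if (typeof value === 'string') {
    if (value.length > PARSE_LIMITS.maxStringLength) {
      throw new Error('String too long');
    }
  } else if (Array.isArray(value)) {
    if (value.length > PARSE_LIMITS.maxArrayLength) {
      throw new Error('Array too long');
    }
    for (const item of value) enforceBudgets(item, depth + 1);
  } else if (value !== null && typeof value === 'object') {
    const entries = Object.entries(value);
    if (entries.length > PARSE_LIMITS.maxKeys) {
      throw new Error('Too many keys');
    }
    for (const [, child] of entries) enforceBudgets(child, depth + 1);
  }
}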

Step 2: Schema Validate (Fail Closed)

Validate against a versioned schema (JSON Schema 2020-12 is a good baseline). For security boundaries:

strict-schema.json
json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "action": { "type": "string", "enum": ["read", "write", "delete"] },
    "resourceId": { "type": "string", "minLength": 1, "maxLength": 64 }
  },
  "required": ["action", "resourceId"],
  "additionalProperties": false
}
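
Compiling the schema once and failing closed is straightforward with a validator like Ajv, which supports draft 2020-12 through a dedicated entry point. A sketch (assuming ajv v8+ and resolveJsonModule for the import):

validate-schema.ts
typescript
import Ajv2020 from 'ajv/dist/2020';
import schema from './strict-schema.json';

const ajv = new Ajv2020();
const validateCommand = ajv.compile(schema);

function assertValid(data: unknown): void {
  // Fail closed: reject anything the schema doesn't explicitly allow.
  if (!validateCommand(data)) {
    throw new Error(
      `Schema validation failed: ${ajv.errorsText(validateCommand.errors)}`
    );
  }
}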

Step 3: Canonicalize (JCS)

Canonicalize the parsed value using JCS and treat that canonical text as the bytes to sign/hash.

canonicalize.ts
typescript
import { canonicalize } from 'json-canonicalize';

function getSignableBytes(data: unknown): Uint8Array {
  const canonical = canonicalize(data);
  return new TextEncoder().encode(canonical);
}
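
With canonical bytes in hand, hashing becomes deterministic across services. A usage sketch with the Web Crypto API (available in modern browsers and in Node 18+ as a global):

hash-canonical.ts
typescript
// Hash the canonical bytes, never whatever string you happened to receive.
async function hashJSON(data: unknown): Promise<string> {
  const digest = await crypto.subtle.digest('SHA-256', getSignableBytes(data));
  return [...new Uint8Array(digest)]
    .map((b) => b.toString(16).padStart(2, '0'))
    .join('');
}

// Both calls produce the same hex digest:
// await hashJSON({ a: 1, b: 2 });
// await hashJSON({ b: 2, a: 1 });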

Step 4: Sign an Envelope, Not Just Payload

Recommended envelope structure:

signed-envelope.json
json
{
  "v": 1,
  "kid": "key-2026-01",
  "aud": "orders-api",
  "iat": 1736000000,
  "exp": 1736000300,
  "nonce": "b64url(random_96_bits)",
  "schema": "orders.command.v3",
  "payload": { "action": "read", "resourceId": "order-123" },
  "sig": "b64url(signature_over_canonicalized_envelope)"
}

Key fields explained:

  • aud: Binds the message to the intended recipient (prevents cross-service replay)
  • iat/exp: Provides a freshness window
  • nonce: Prevents same-window replays (store nonces per kid + aud for the window)
  • schema: Prevents "schema confusion" where consumers interpret the payload differently
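
To produce and check the sig field, canonicalize the envelope without sig, sign those bytes, and on receipt verify the signature plus the anti-replay fields against the same canonical form. A sketch using Node's built-in Ed25519 support (key distribution and the nonce store are out of scope here):

sign-envelope.ts
typescript
import { sign, verify, type KeyObject } from 'node:crypto';
import { canonicalize } from 'json-canonicalize';

type Envelope = Record<string, unknown>;

function signEnvelope(envelope: Envelope, privateKey: KeyObject): string {
  // Sign the canonical bytes of everything except `sig`.
  const { sig: _ignored, ...unsigned } = envelope;
  const bytes = Buffer.from(canonicalize(unsigned));
  // For Ed25519, Node's one-shot sign() takes null as the algorithm.
  return sign(null, bytes, privateKey).toString('base64url');
}

function verifyEnvelope(
  envelope: Envelope,
  publicKey: KeyObject,
  expectedAud: string
): boolean {
  const { sig, ...unsigned } = envelope;
  if (typeof sig !== 'string') return false;

  const bytes = Buffer.from(canonicalize(unsigned));
  if (!verify(null, bytes, publicKey, Buffer.from(sig, 'base64url'))) {
    return false;
  }

  // Anti-replay checks: audience binding and freshness window.
  const now = Math.floor(Date.now() / 1000);
  if (unsigned.aud !== expectedAud) return false;
  if (typeof unsigned.exp !== 'number' || now >= unsigned.exp) return false;
  // A real verifier also checks nonce against a per-(kid, aud) store.
  return true;
}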

The Sharp Edges (What Bites Teams)

Duplicate Keys: The Silent Ambiguity

JSON doesn't require keys to be unique. Many parsers do "last write wins":

duplicate-keys.json
json
{"role":"user","role":"admin"}

Different components may select the first vs last value. That's an AppSec bug class, not a correctness nit.

Recommendation: Reject duplicate keys at parse time everywhere you can. If you can't, normalize by enforcing a strict "first wins" or "last wins" policy consistently across all components.
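
JavaScript's own JSON.parse is a handy illustration of the problem: it silently keeps the last occurrence, so a first-wins parser elsewhere in the stack will see a different value for the same bytes:

duplicate-keys-demo.ts
typescript
// JSON.parse silently applies last-write-wins on duplicate keys.
const parsed = JSON.parse('{"role":"user","role":"admin"}');
console.log(parsed.role); // "admin"

// A first-wins parser elsewhere would see "user" for the same bytes.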

Numbers: Precision, -0, and IEEE-754 Limits

Common problems:

  • JavaScript cannot precisely represent integers above 2^53 - 1 (demo after this list)
  • Some languages parse 1e309 as Infinity (non-JSON), or fail differently
  • -0 exists in IEEE-754 and can behave oddly in comparisons

Recommendations:

  • For IDs: use strings, not numbers
  • For money: use decimal strings with schema constraints
  • Don't accept non-standard numeric values (NaN/Infinity)
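
The integer-precision failure is easy to demonstrate, which is why string IDs are worth the friction:

number-precision.ts
typescript
// Integers above 2^53 - 1 silently lose precision in JavaScript parsers.
console.log(JSON.parse('{"id":9007199254740993}').id); // 9007199254740992

// Safer: transmit large IDs as strings.
console.log(JSON.parse('{"id":"9007199254740993"}').id); // "9007199254740993"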

Unicode: Normalization and Escaping

Two strings can look identical to humans but be different sequences of code points (NFC vs NFD). If your authorization logic uses "visual" identity, you can get spoofing.

Recommendation: Treat strings as exact, compare exact, canonicalize via JCS. If you need user-facing identifiers, apply additional anti-spoofing policies separately.
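
A two-line demonstration of why "visually identical" is not "byte identical":

unicode-identity.ts
typescript
const nfc = '\u00e9';  // "é" as a single code point (NFC)
const nfd = 'e\u0301'; // "é" as e + combining accent (NFD)

console.log(nfc === nfd);                                   // false
console.log(nfc.normalize('NFC') === nfd.normalize('NFC')); // true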

Embedded/Enclosed Systems Notes

On constrained devices, you might not want full JSON Schema validation. Still:

  • Enforce strict parsing + budgets
  • Canonicalize using a known-good library or a small, audited implementation
  • Prefer short-lived commands (expiry) and idempotency keys
  • Consider migrating to CBOR/COSE when feasible; if you must stay with JSON, apply this discipline end to end

Implementation Checklist

Copy this into your security review:

  • ☐ Parse strictly; reject comments/trailing commas
  • ☐ Reject duplicate keys
  • ☐ Enforce max bytes/depth/keys/items/lengths
  • ☐ Validate against a versioned schema; fail closed
  • ☐ Canonicalize using JCS (RFC 8785)
  • ☐ Sign canonical bytes with a modern algorithm (Ed25519 / ECDSA P-256)
  • ☐ Include aud, iat, exp, and nonce for replay resistance
  • ☐ Bind signature verification and business interpretation to the same parsed object
  • ☐ Log signature verification failures with safe redaction; rate-limit

About the Author

Adam Tse

Founder & Lead Developer · 10+ years experience

Full-stack engineer with 10+ years of experience building developer tools and APIs. Previously worked on data infrastructure at scale, processing billions of JSON documents daily. Passionate about creating privacy-first tools that don't compromise on functionality.
