TOON Format Documentation

Complete JSON to TOON Conversion Guide - TOON Syntax & Serialization Format

Introduction

Token-Oriented Object Notation (TOON) is a compact, human-readable serialization format designed for passing structured data to Large Language Models with significantly reduced token usage. It's intended for LLM input as a lossless, drop-in representation of JSON data.

TOON's sweet spot is uniform arrays of objects: multiple fields per row, same structure across items. It borrows YAML's indentation-based structure for nested objects and CSV's tabular format for uniform data rows, then optimizes both for token efficiency in LLM contexts.

💡 Key Concept
Think of TOON as a translation layer: use JSON programmatically, convert to TOON for LLM input.

Why TOON?

AI is becoming cheaper and more accessible, but larger context windows also invite larger data inputs. LLM tokens still cost money, and standard JSON is verbose and token-expensive.

Before (JSON - 125 tokens)

{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}

After (TOON - 54 tokens)

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

Result: 57% fewer tokens for the same data!

Key Features

  • 💸 Token-efficient: typically 30-60% fewer tokens than JSON
  • 🤿 LLM-friendly guardrails: explicit lengths and fields enable validation
  • 🍱 Minimal syntax: removes redundant punctuation (braces, brackets, most quotes)
  • 📐 Indentation-based structure: like YAML, uses whitespace instead of braces
  • 🧺 Tabular arrays: declare keys once, stream data as rows

Benchmarks

Token counts are measured using the GPT-5 o200k_base tokenizer. Actual savings vary by model and tokenizer.

โญ GitHub Repositories (100 repos)
TOON
8,745 tokens
JSON
15,145 tokens

Savings: 42.3% (6,400 tokens)

๐Ÿ“ˆ Daily Analytics (180 days)
TOON
4,507 tokens
JSON
10,977 tokens

Savings: 58.9% (6,470 tokens)

๐Ÿ›’ E-Commerce Order
TOON
166 tokens
JSON
257 tokens

Savings: 35.4% (91 tokens)
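
The savings figures above follow from simple arithmetic on the two token counts. The sketch below recomputes them; `tokenSavings` is a hypothetical helper written for this guide, not part of @toon-format/toon, and the counts are the ones quoted in this section.

```typescript
// Hypothetical helper: tokens saved and percentage saved relative to JSON.
function tokenSavings(jsonTokens: number, toonTokens: number): { saved: number; percent: number } {
  const saved = jsonTokens - toonTokens
  // Round to one decimal place, the precision used in the figures above.
  const percent = Math.round((saved / jsonTokens) * 1000) / 10
  return { saved, percent }
}

// Token counts copied from the benchmark figures in this section.
const benchmarks = [
  { name: 'GitHub Repositories', json: 15145, toon: 8745 },
  { name: 'Daily Analytics', json: 10977, toon: 4507 },
  { name: 'E-Commerce Order', json: 257, toon: 166 },
]

for (const b of benchmarks) {
  const { saved, percent } = tokenSavings(b.json, b.toon)
  console.log(`${b.name}: ${percent}% (${saved} tokens)`)
}
```

To measure your own data, replace the hard-coded counts with counts from the tokenizer your model actually uses.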

Installation & Quick Start

Using npm

npm install @toon-format/toon

Using pnpm

pnpm add @toon-format/toon

Using yarn

yarn add @toon-format/toon

Example Usage

import { encode } from '@toon-format/toon'

const data = {
  users: [
    { id: 1, name: 'Alice', role: 'admin' },
    { id: 2, name: 'Bob', role: 'user' }
  ]
}

console.log(encode(data))
// Output:
// users[2]{id,name,role}:
//   1,Alice,admin
//   2,Bob,user

Format Overview

Objects

Simple objects with primitive values:

// Input
{ id: 123, name: 'Ada', active: true }

// Output
id: 123
name: Ada
active: true

Nested objects:

// Input
{ user: { id: 123, name: 'Ada' } }

// Output
user:
  id: 123
  name: Ada
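
To illustrate how indentation replaces braces, here is a minimal sketch of the object rules shown above. `encodeObjectSketch` is a hypothetical function, not the library's implementation; it assumes only primitive and nested-object values, and it ignores quoting and arrays.

```typescript
type Primitive = string | number | boolean | null
type Nested = { [key: string]: Primitive | Nested }

// Hypothetical sketch: one "key: value" line per primitive field, and a bare
// "key:" line followed by a deeper-indented block for each nested object.
function encodeObjectSketch(obj: Nested, depth = 0, indent = 2): string {
  const pad = ' '.repeat(depth * indent)
  const lines: string[] = []
  for (const [key, value] of Object.entries(obj)) {
    if (value !== null && typeof value === 'object') {
      lines.push(`${pad}${key}:`)
      const inner = encodeObjectSketch(value, depth + 1, indent)
      if (inner !== '') lines.push(inner) // skip the block for empty objects
    } else {
      lines.push(`${pad}${key}: ${String(value)}`)
    }
  }
  return lines.join('\n')
}

console.log(encodeObjectSketch({ user: { id: 123, name: 'Ada' } }))
```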

Arrays

Primitive Arrays (Inline)

// Input
{ tags: ['admin', 'ops', 'dev'] }

// Output
tags[3]: admin,ops,dev

Arrays of Objects (Tabular)

When all objects share the same primitive fields, TOON uses an efficient tabular format:

// Input
{ items: [
  { sku: 'A1', qty: 2, price: 9.99 },
  { sku: 'B2', qty: 1, price: 14.5 }
]}

// Output
items[2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5
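
The tabular layout can be pictured as a header that declares the row count and field names once, followed by one delimited line per object. The sketch below is an assumption-laden illustration: `isUniform` and `encodeTableSketch` are hypothetical names, only the comma delimiter is covered, and quoting is not handled.

```typescript
type Cell = string | number | boolean | null
type Row = Record<string, Cell>

// True when every row has exactly the same keys as the first row.
function isUniform(rows: Row[]): boolean {
  const first = rows[0]
  if (first === undefined) return false
  const fields = Object.keys(first)
  return rows.every(row => {
    const keys = Object.keys(row)
    return keys.length === fields.length && fields.every(f => f in row)
  })
}

// Hypothetical sketch of the tabular form: header line, then 2-space-indented rows.
function encodeTableSketch(key: string, rows: Row[]): string {
  const first = rows[0]
  if (first === undefined || !isUniform(rows)) {
    throw new Error('rows are not uniform; the list format applies instead')
  }
  const fields = Object.keys(first)
  const header = `${key}[${rows.length}]{${fields.join(',')}}:`
  const body = rows.map(row => '  ' + fields.map(f => String(row[f])).join(','))
  return [header, ...body].join('\n')
}

console.log(encodeTableSketch('items', [
  { sku: 'A1', qty: 2, price: 9.99 },
  { sku: 'B2', qty: 1, price: 14.5 },
]))
```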

Mixed and Non-Uniform Arrays

Arrays that don't meet tabular requirements use list format:

// Input
{ items: [1, { a: 1 }, 'text'] }

// Output
items[3]:
  - 1
  - a: 1
  - text

Quoting Rules

TOON quotes strings only when necessary to maximize token efficiency.

Strings That Require Quotes

Condition | Examples
Empty string | ""
Leading/trailing spaces | " padded ", " "
Contains delimiter, colon, quotes | "a,b", "a:b", "say \"hi\""
Looks like boolean/number/null | "true", "42", "null"
Starts with list syntax | "- item"
Looks like structural token | "[5]", "{key}"

Strings That Don't Need Quotes

  • Unicode and emoji: hello 👋 world
  • Strings with inner spaces: hello world
  • Alphanumeric with punctuation: user_name, user.name
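
The conditions listed above can be read as a simple predicate. The following is a rough sketch covering only those conditions; `needsQuotes` is a hypothetical name, the number pattern and the "- " check are assumptions, and the official specification may define further cases.

```typescript
// Hypothetical sketch: true when a string falls under one of the quoting
// conditions listed in this section.
function needsQuotes(s: string, delimiter = ','): boolean {
  if (s === '') return true                                   // empty string
  if (s !== s.trim()) return true                             // leading/trailing spaces
  if (s.includes(delimiter) || s.includes(':') || s.includes('"')) return true
  if (s === 'true' || s === 'false' || s === 'null') return true
  if (/^-?\d+(\.\d+)?([eE][+-]?\d+)?$/.test(s)) return true   // looks like a number (assumed pattern)
  if (s.startsWith('- ')) return true                         // starts with list syntax
  if (/^\[.*\]$/.test(s) || /^\{.*\}$/.test(s)) return true   // looks like a structural token
  return false
}

console.log(needsQuotes('hello world')) // false
console.log(needsQuotes('a,b'))         // true
```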

API Reference

encode(value, options?)

Converts any JSON-serializable value to TOON format.

Parameters

  • value (any) - JSON-serializable value
  • options (optional) - Encoding options:
    • indent (number) - Spaces per indentation level (default: 2)
    • delimiter (string) - Array delimiter: ',', '\t', '|' (default: ',')
    • lengthMarker ('#' | false) - Add # prefix to array lengths (default: false)

Example

import { encode } from '@toon-format/toon'

// Basic encoding
const data = { users: [{ id: 1, name: 'Alice' }] }
console.log(encode(data))

// With options
console.log(encode(data, { 
  delimiter: '\t', 
  lengthMarker: '#' 
}))

decode(input, options?)

Converts a TOON-formatted string back to JavaScript values.

Parameters

  • input (string) - TOON-formatted string
  • options (optional) - Decoding options:
    • indent (number) - Expected indentation (default: 2)
    • strict (boolean) - Enable strict validation (default: true)

Example

import { decode } from '@toon-format/toon'

const toon = `users[2]{id,name}:
  1,Alice
  2,Bob`

const data = decode(toon)
// { users: [{ id: 1, name: 'Alice' }, { id: 2, name: 'Bob' }] }
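
To make the decoding step concrete, here is a minimal sketch of parsing a single tabular block by hand. It is not how `decode` is implemented: `parseCell` and `decodeTableSketch` are hypothetical, and the sketch assumes a comma delimiter, unquoted cells, and one table per input.

```typescript
type Value = string | number | boolean | null

// Convert an unquoted cell to a primitive; quoted strings are not handled here.
function parseCell(text: string): Value {
  if (text === 'true') return true
  if (text === 'false') return false
  if (text === 'null') return null
  if (/^-?\d+(\.\d+)?$/.test(text)) return Number(text)
  return text
}

// Hypothetical sketch: read "key[N]{f1,f2}:" and then N comma-separated rows.
function decodeTableSketch(toon: string): Record<string, Record<string, Value>[]> {
  const lines = toon.split('\n').filter(line => line.trim() !== '')
  const match = /^(\w+)\[(\d+)\]\{([^}]*)\}:$/.exec(lines[0] ?? '')
  if (match === null) throw new Error('not a tabular header')
  const key = match[1] ?? ''
  const declared = Number(match[2])
  const fields = (match[3] ?? '').split(',')
  const rows = lines.slice(1)
  // The declared length acts as a guardrail against truncated or padded input.
  if (rows.length !== declared) {
    throw new Error(`header declares ${declared} rows, found ${rows.length}`)
  }
  const items = rows.map(line => {
    const cells = line.trim().split(',')
    if (cells.length !== fields.length) throw new Error(`expected ${fields.length} cells in: ${line}`)
    const item: Record<string, Value> = {}
    fields.forEach((field, i) => { item[field] = parseCell(cells[i] ?? '') })
    return item
  })
  return { [key]: items }
}

console.log(JSON.stringify(decodeTableSketch('users[2]{id,name}:\n  1,Alice\n  2,Bob')))
```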

Command Line Interface

TOON includes a CLI tool for converting between JSON and TOON formats.

Basic Usage

npx @toon-format/cli [options] [input]

Examples

# Encode JSON to TOON
npx @toon-format/cli input.json -o output.toon

# Decode TOON to JSON
npx @toon-format/cli data.toon -o output.json

# Pipe from stdin
cat data.json | npx @toon-format/cli

# Show token savings
npx @toon-format/cli data.json --stats

# Use tab delimiter
npx @toon-format/cli data.json --delimiter "\t"

Options

Option | Description
-o, --output | Output file path (stdout if omitted)
-e, --encode | Force encode mode
-d, --decode | Force decode mode
--delimiter | Array delimiter: comma, tab, pipe
--indent | Indentation size (default: 2)
--length-marker | Add # prefix to array lengths
--stats | Show token count and savings
--no-strict | Disable strict validation

Using TOON in LLM Prompts

TOON works best when you show the format instead of describing it. The structure is self-documenting: models parse it naturally once they see the pattern.

Sending TOON to LLMs (Input)

Wrap your encoded data in a fenced code block and label it as toon:

```toon
users[3]{id,name,role}:
  1,Alice,admin
  2,Bob,user
  3,Charlie,user
```

Question: How many users have the role "user"?

Generating TOON from LLMs (Output)

When you want the model to generate TOON, be explicit about the rules:

Task: Return only users with role "user" as TOON.

Rules:
- 2-space indent
- No trailing spaces
- [N] must match row count
- Use header format: users[N]{id,name,role}:

Output only the TOON code block.

💡 Tip
For large uniform tables, use tab delimiters: encode(data, { delimiter: '\t' }). Tabs often tokenize better than commas.
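
When a model returns TOON, the rules given in the prompt can be checked mechanically before the output is trusted. The sketch below checks a single tabular block against those rules; `checkToonTable` is a hypothetical helper, and it assumes the surrounding code fence has already been stripped from the model's reply.

```typescript
// Hypothetical sketch: returns a list of rule violations (empty when the block passes).
function checkToonTable(output: string): string[] {
  const problems: string[] = []
  const lines = output.split('\n')
  const header = lines[0] ?? ''
  const rows = lines.slice(1)
  const match = /^\w+\[(\d+)\]\{[^}]*\}:$/.exec(header)
  if (match === null) {
    problems.push('first line is not a tabular header')
    return problems
  }
  // Rule: [N] must match the row count.
  const declared = Number(match[1])
  if (rows.length !== declared) {
    problems.push(`[N] is ${declared} but there are ${rows.length} rows`)
  }
  rows.forEach((row, i) => {
    // Rule: no trailing spaces.
    if (/\s$/.test(row)) problems.push(`row ${i + 1} has trailing whitespace`)
    // Rule: 2-space indent.
    if (!/^ {2}\S/.test(row)) problems.push(`row ${i + 1} is not indented by exactly 2 spaces`)
  })
  return problems
}

console.log(checkToonTable('users[2]{id,name,role}:\n  2,Bob,user\n  3,Charlie,user'))
```

An empty result means the block satisfies the three rules; it does not prove the content is correct, so a full decode is still worthwhile.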

Notes and Limitations

When TOON Excels

  • Uniform arrays of objects (same fields, primitive values)
  • Large datasets with consistent structure
  • Tabular data with 10+ rows
  • Time-series analytics data
  • API responses with repeated structures

When JSON is Better

  • Non-uniform data with varying field sets
  • Deeply nested structures (3+ levels)
  • Small datasets (<10 items)
  • Objects with mixed or nested array values

Important Considerations

  • Token counts vary by tokenizer and model
  • Benchmarks use GPT-style tokenizers (cl100k/o200k)
  • TOON is designed for LLM input, not APIs or storage
  • Real-world performance depends on your data structure

Syntax Cheatsheet

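Input (JSON): { id: 1, name: 'Ada' }
Output (TOON):
id: 1
name: Ada

Input (JSON): { user: { id: 1 } }
Output (TOON):
user:
  id: 1

Input (JSON): { tags: ['foo', 'bar'] }
Output (TOON):
tags[2]: foo,bar

Input (JSON): { items: [{ id: 1, qty: 5 }, { id: 2, qty: 3 }] }
Output (TOON):
items[2]{id,qty}:
  1,5
  2,3

Input (JSON): { items: [1, { a: 1 }, 'x'] }
Output (TOON):
items[3]:
  - 1
  - a: 1
  - x

Input (JSON): ['x', 'y']
Output (TOON):
[2]: x,y

Input (JSON): {}
Output (TOON): (empty output)

Input (JSON): { items: [] }
Output (TOON):
items[0]: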

Full Specification

For precise formatting rules, edge cases, and implementation details, see the official TOON specification:

📋 TOON Format Specification v1.3

The specification includes:

  • Detailed syntax rules
  • Edge case handling
  • Conformance tests
  • Implementation guidelines
  • Format versioning

Other Implementations

Official Implementations

Community Implementations
