TOON vs JSON
In this post, I want to compare the costs of using TOON versus JSON for data representation when working with large language models (LLMs).
What is TOON?
While JSON is familiar to most developers, TOON is somewhat new (at least to me). TOON stands for Token-Oriented Object Notation. Instead of trying to explain it myself, I highly recommend reading its documentation — it is well written and easy to understand.
What I want to focus on in this post is calculating the actual savings, based on the documentation's example of how to use TOON in LLM prompts.
The nice thing is that I do not have to become an expert in this format to start using it. In the application code, we can continue using JSON format, but use the TOON library to convert it before sending the request.
Example TypeScript Code
```typescript
import { encode } from '@toon-format/toon';
import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: GET_IT_FROM_SECURE_PLACE });

async function analyzeMetrics(data: any) {
  // Encode the JSON-shaped payload as TOON just before building the prompt.
  const toon = encode({ users: data });

  const response = await openai.chat.completions.create({
    model: 'gpt-5',
    messages: [
      { role: 'system', content: 'You are an analytics expert.' },
      { role: 'user', content: `Analyze user data for the given dataset:\n\n${toon}\n\nSummarize key insights.` }
    ],
  });

  return response.choices[0].message.content;
}

const userData = await db.query('SELECT SOME VERY USEFUL DATA FROM users LIMIT 1000');
const insights = await analyzeMetrics(userData);
console.log(insights);
```
Test Methodology
To compare the actual cost of each format, we first need to understand how requests are billed. LLMs use "tokens" to measure the size of the input prompt: the larger the input, the more tokens are required. In the OpenAI/ChatGPT world, one token is roughly 4 characters of text. If you are wondering how that affects submitted data, be aware that ALL characters are counted — spaces, punctuation, escape characters, etc. This alone rules out pretty-printed JSON for the test, as its indentation and newlines add a lot of overhead. We will use compact JSON instead.
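To see how much overhead pretty-printed JSON carries, here is a quick sketch using the rough 4-characters-per-token rule of thumb from above. The `estimateTokens` helper and the sample dataset are illustrative only — this is not OpenAI's actual tokenizer:

```typescript
// Rough token estimate based on the ~4 characters per token rule of thumb.
// Real tokenizers (e.g. tiktoken) will give slightly different numbers.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

const users = [
  { id: 1, name: 'Alice', role: 'admin' },
  { id: 2, name: 'Bob', role: 'user' },
];

const pretty = JSON.stringify(users, null, 2); // pretty JSON: indentation and newlines
const compact = JSON.stringify(users);         // compact JSON: no whitespace at all

// Every indent space and newline in the pretty version counts toward the bill,
// which is why the comparison in this post uses compact JSON.
console.log(estimateTokens(pretty), estimateTokens(compact));
```
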
The current OpenAI input price is $1.25 per 1,000,000 tokens. We will use this value to calculate the price on a daily, monthly and yearly basis.
Token Calculation
The documentation states that the best way to use TOON is to "introduce" the dataset and show it to your model, rather than describing it. Here is the example from the docs that "works well with reading and generating the data".
Prompt:
Data is in TOON format (2-space indent, arrays show length and fields).
```toon
users[3]{id,name,role,lastLogin}:
  1,Alice,admin,2025-01-15T10:30:00Z
  2,Bob,user,2025-01-14T15:22:00Z
  3,Charlie,user,2025-01-13T09:45:00Z
```

Task: Return only users with role "user" as TOON. Use the same header. Set [N] to match the row count. Output only the code block.
Response:
```toon
users[2]{id,name,role,lastLogin}:
  2,Bob,user,2025-01-14T15:22:00Z
  3,Charlie,user,2025-01-13T09:45:00Z
```
This quick exercise used approximately 60 tokens. Repeating the same prompt with compact JSON requires roughly 90 tokens. Now let's scale the dataset up and assume 1,000 records (with 1,000 requests per day for the cost calculations below). At that size the prompt uses approximately 23,000 tokens with TOON, while the same dataset in compact JSON requires around 44,000 tokens.
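To build an intuition for where the savings come from, here is a hand-rolled sketch of TOON's tabular idea: field names, braces and quotes appear once in the header instead of once per row. The `toonish` function below is purely illustrative — the real `@toon-format/toon` encoder handles nesting, quoting and many other cases:

```typescript
type Row = Record<string, string | number>;

// Minimal illustration of TOON's tabular layout: the header carries the array
// length and field names once; each row is just comma-separated values.
function toonish(name: string, rows: Row[]): string {
  const fields = Object.keys(rows[0]);
  const header = `${name}[${rows.length}]{${fields.join(',')}}:`;
  const lines = rows.map((r) => '  ' + fields.map((f) => String(r[f])).join(','));
  return [header, ...lines].join('\n');
}

const users = [
  { id: 1, name: 'Alice', role: 'admin' },
  { id: 2, name: 'Bob', role: 'user' },
  { id: 3, name: 'Charlie', role: 'user' },
];

const asToon = toonish('users', users);
const asJson = JSON.stringify(users); // compact JSON repeats every key per object
console.log(asToon.length, asJson.length);
```

The character count (and therefore the token count) gap widens as rows are added, since JSON pays the per-object key overhead on every record while the TOON-style header is paid once.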
Summary for 1000 requests per day with 3 records
Now that we have the number of tokens required for each format, we can calculate the costs based on 1000 requests per day.
TOON
TOKENS = 60.
REQUESTS = 1000 per day.
PRICE = $1.25 per 1 million tokens.
DAILY TOTAL = 1000 requests x 60 tokens x $1.25 / 1 000 000 = $0.075.
MONTHLY TOTAL = DAILY TOTAL x 30 days = $2.25.
YEARLY TOTAL = DAILY TOTAL x 365 days = $27.38.
JSON
TOKENS = 90.
REQUESTS = 1000 per day.
PRICE = $1.25 per 1 million tokens.
DAILY TOTAL = 1000 requests x 90 tokens x $1.25 / 1 000 000 = $0.1125.
MONTHLY TOTAL = DAILY TOTAL x 30 days = $3.38.
YEARLY TOTAL = DAILY TOTAL x 365 days = $41.06.
| Format | Daily Cost | Monthly Cost | Annual Cost | Savings vs. JSON (Annual) |
|---|---|---|---|---|
| TOON | $0.075 | $2.25 | $27.38 | $13.68 |
| JSON | $0.1125 | $3.38 | $41.06 | - |
Summary for 1000 requests per day with 1000 records
The dataset is the same as above, except that we now assume it has 1,000 rows. Unfortunately, size does matter, as the math below shows.
TOON
TOKENS = 23000.
REQUESTS = 1000 per day.
PRICE = $1.25 per 1 million tokens.
DAILY TOTAL = 1000 requests x 23000 tokens x $1.25 / 1 000 000 = $28.75.
MONTHLY TOTAL = DAILY TOTAL x 30 days = $862.50.
YEARLY TOTAL = DAILY TOTAL x 365 days = $10,493.75.
JSON
TOKENS = 44000.
REQUESTS = 1000 per day.
PRICE = $1.25 per 1 million tokens.
DAILY TOTAL = 1000 requests x 44000 tokens x $1.25 / 1 000 000 = $55.00.
MONTHLY TOTAL = DAILY TOTAL x 30 days = $1,650.00.
YEARLY TOTAL = DAILY TOTAL x 365 days = $20,075.00.
| Format | Daily Cost | Monthly Cost | Annual Cost | Savings vs. JSON (Annual) |
|---|---|---|---|---|
| TOON | $28.75 | $862.50 | $10,493.75 | $9,581.25 |
| JSON | $55.00 | $1,650.00 | $20,075.00 | - |
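The arithmetic in both tables boils down to one formula: requests × tokens × price ÷ 1,000,000. A small helper makes it easy to re-run the numbers for other datasets or request volumes (the token counts below are the approximate figures measured above):

```typescript
const PRICE_PER_MILLION_TOKENS = 1.25; // OpenAI input price used throughout this post

// Daily cost in dollars for a given prompt size and request volume.
function dailyCost(tokensPerRequest: number, requestsPerDay: number): number {
  return (requestsPerDay * tokensPerRequest * PRICE_PER_MILLION_TOKENS) / 1_000_000;
}

// 1,000 records, 1,000 requests per day (approximate token counts from above)
const toonDaily = dailyCost(23_000, 1000); // $28.75
const jsonDaily = dailyCost(44_000, 1000); // $55.00

console.log(`TOON: $${toonDaily.toFixed(2)}/day, $${(toonDaily * 365).toFixed(2)}/year`);
console.log(`JSON: $${jsonDaily.toFixed(2)}/day, $${(jsonDaily * 365).toFixed(2)}/year`);
console.log(`Annual savings: $${((jsonDaily - toonDaily) * 365).toFixed(2)}`); // $9581.25
```
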
Conclusion
This initial benchmarking confirms significant savings when using the TOON format. It makes the most sense at large scale, specifically for data analytics. At small scale, with a small dataset and few requests, it looks like overkill and doesn't provide much value. The format is still relatively new to me and there is a lot to learn and try, but it looks promising and like a legitimate way to optimize costs for data analytics tasks that use LLMs.