In this article we’re going to talk about how to use JSON parsing libraries in Rust, and compare the most popular libraries and how they perform.
JSON Parsing Basics
Parsing JSON manually
To get started with JSON in Rust, you’ll want to install a library that lets you manipulate JSON easily. One of the most popular crates currently available is serde_json. You can install it by running the following:
cargo add serde_json
Once done, you can parse JSON into an untyped value and index into it like this:
use serde_json::{Result, Value};

fn untyped_example() -> Result<()> {
    // Some JSON input data as a &str. Maybe this comes from the user.
    let data = r#"
        {
            "name": "John Doe",
            "age": 43,
            "phones": [
                "+44 1234567",
                "+44 2345678"
            ]
        }"#;

    // Parse the string of data into serde_json::Value.
    let v: Value = serde_json::from_str(data)?;

    // Access parts of the data by indexing with square brackets.
    println!("Please call {} at the number {}", v["name"], v["phones"][0]);

    Ok(())
}
However, we can do much better than this. For example, we can serialize structs to JSON and deserialize JSON back into structs, which has many applications: JSON templating, web services, CLI arguments and more. Let’s have a look at this in the next section.
Parsing JSON with Serde
Serde is a crate that helps you serialize and deserialize data to and from various formats, with JSON being one of the most popular. If you write web services in Rust, Serde is your friend, as you’ll often deal with JSON data that you need to either send or receive. Serde provides two main traits to help you with this: Serialize and Deserialize. For convenience, both can be implemented with a derive macro (available through the derive feature of serde). See below for how you can carry this out:
use serde::{Deserialize, Serialize};
use serde_json::json;

#[derive(Serialize, Deserialize)]
pub struct MyStruct {
    message: String,
}

fn convert_json_to_struct() {
    // build a JSON value with the json! macro, render it as a raw JSON string
    // and turn that string into a MyStruct struct
    let raw_json_string = json!({"message": "Hello world!"}).to_string();
    let my_struct: MyStruct = serde_json::from_str(&raw_json_string).unwrap();
}
You can also create nested JSON by adding a struct that implements Serialize and Deserialize as a field of another struct that also implements Serialize and Deserialize:
// chrono must be built with its serde feature for DateTime<Utc> to implement these traits
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
pub struct Post {
    nested_json: PostMetadata,
    title: String,
    body: String,
}

#[derive(Serialize, Deserialize)]
pub struct PostMetadata {
    timestamp_created: DateTime<Utc>,
    timestamp_last_updated: DateTime<Utc>,
    categories: Vec<String>,
}
One use case for this would be nested JSON in a web service. For example, when you receive a POST request to your API that has a JSON body, you would normally pass the relevant Json type in as a handler function parameter. See below:
use axum::Json;
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};

// Debug is derived so the handler below can print the struct with {:?}
#[derive(Debug, Serialize, Deserialize)]
pub struct Post {
    nested_json: PostMetadata,
    title: String,
    body: String,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct PostMetadata {
    timestamp_created: DateTime<Utc>,
    timestamp_last_updated: DateTime<Utc>,
    categories: Vec<String>,
}

async fn receive_some_json(
    // this extractor consumes a JSON body and converts it into the struct type given
    Json(json): Json<Post>,
) -> Json<Post> {
    println!("{:?}", json);
    Json(json)
}
In addition to the previous code snippet, which shows how you can use serde_json to convert a JSON string into a struct, you can also convert to a struct from its byte representation:
// a JSON document as a byte string; note that JSON does not allow a trailing comma
let json_as_bytes = b"
{
    \"message\": \"Hello world!\"
}";
let my_struct: MyStruct = serde_json::from_slice(json_as_bytes).unwrap();
This is particularly useful if you want to store a struct somewhere as a byte array and then turn it back into a struct later on!
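As a side note on the literal itself, a raw byte string (br#"..."#) avoids the backslash-escaped quotes. The sketch below uses only the standard library, and the two helper names (escaped_literal and raw_literal) are hypothetical. It shows that both spellings produce the same bytes, so either form could be handed to serde_json::from_slice.

```rust
/// The JSON document written as a regular byte string with escaped quotes.
fn escaped_literal() -> &'static [u8] {
    b"{\"message\": \"Hello world!\"}"
}

/// The same document written as a raw byte string: no escaping needed.
fn raw_literal() -> &'static [u8] {
    br#"{"message": "Hello world!"}"#
}
```

The raw form tends to be easier to read and to paste JSON into, which helps when the payload grows beyond a single field.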
Similarly, you can also read JSON from an IO stream and turn it into a struct by deserializing from a reader. Below is an example taken from the serde_json docs that uses Deserializer::from_reader with a TCP stream:
use serde::Deserialize;

use std::error::Error;
use std::net::{TcpListener, TcpStream};

#[derive(Deserialize, Debug)]
struct User {
    fingerprint: String,
    location: String,
}

fn read_user_from_stream(tcp_stream: TcpStream) -> Result<User, Box<dyn Error>> {
    let mut to_be_deserialized = serde_json::Deserializer::from_reader(tcp_stream);
    let user = User::deserialize(&mut to_be_deserialized)?;

    Ok(user)
}

fn main() {
    let listener = TcpListener::bind("127.0.0.1:4000").unwrap();

    for stream in listener.incoming() {
        println!("{:#?}", read_user_from_stream(stream.unwrap()));
    }
}
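Doing it this way allows you to deserialize from the stream as the data arrives, instead of first reading the whole payload into a buffer in memory. If you’re receiving a lot of JSON-based data, this can help you quite a bit! Note that the serde_json docs recommend applying your own buffering, for example with std::io::BufReader, when short reads on the source are not efficient (such as a File).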
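When deserializing from a reader, how the underlying source is read matters. The sketch below does not use serde_json at all: it is a standard-library illustration, with a hypothetical CountingReader wrapper, of how std::io::BufReader reduces the number of read calls that reach the source when a consumer pulls data one byte at a time.

```rust
use std::io::{BufReader, Read};

/// Hypothetical wrapper that counts how many times `read` reaches the inner source.
struct CountingReader<R> {
    inner: R,
    calls: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        self.calls += 1;
        self.inner.read(buf)
    }
}

/// Consume a reader one byte at a time and return the number of bytes seen.
fn consume_bytewise<R: Read>(reader: R) -> usize {
    reader.bytes().filter(|b| b.is_ok()).count()
}

/// Returns (read calls without buffering, read calls with a BufReader in between).
fn compare_read_calls(payload: &[u8]) -> (usize, usize) {
    let mut direct = CountingReader { inner: payload, calls: 0 };
    consume_bytewise(&mut direct);

    let mut counted = CountingReader { inner: payload, calls: 0 };
    consume_bytewise(BufReader::new(&mut counted));

    (direct.calls, counted.calls)
}
```

For a socket or file, each of those calls may be a system call, so a BufReader between the source and the deserializer is usually worth considering.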
Comparing Rust JSON crates
Although serde_json may be the most popular crate, it is by no means the fastest. A few other crates have appeared in the meantime to improve general JSON parsing performance. In exchange for performance, however, there are certain caveats regarding CPU SIMD extension requirements. These crates also make heavier use of unsafe code, though generally speaking their maintainers make a best effort to keep the code safe to use.
For the most part, all of these crates have the same API. Unless stated otherwise, you can switch between these libraries and expect roughly the same interfaces for working with JSON in each one.
serde_json
serde_json is the easiest to use of the Rust JSON libraries. It requires no extra dependencies to use and is often recommended alongside serde when you need idiomatic manipulation of raw JSON values. serde_json also supports no_std: turn off the default std feature and enable alloc instead.
In terms of performance, serde_json itself is not slow by any means. However, it is slower than some of the other JSON libraries on this list, primarily because its parser is not built around SIMD instructions the way the other crates are. If you have access to a modern x86 CPU in particular, you may want to read on to find out more about some of the better-performing options. However, this crate is also the most widely used and best supported within the Rust community, so if you run into issues with it, it is easy to find assistance!
simd-json
simd-json is a Rust port of the simdjson C++ JSON parser, with serde compatibility built in. As the name states, this library uses SIMD, short for Single Instruction, Multiple Data: a single CPU instruction operates on several data points at once, which makes parsing significantly faster! As a caveat, however, the speedup depends on the SIMD extensions your CPU offers; on x86 the crate selects the best available SIMD feature set at runtime. If no feature set is available there is also an unoptimised Rust implementation, but the documentation mentions that it should not be relied on.
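To make the "multiple data points per instruction" idea concrete, here is a minimal sketch that counts a byte (for example the JSON quote character) eight bytes at a time using ordinary 64-bit integer arithmetic. This technique is usually called SWAR (SIMD within a register); it is not how simd-json is implemented, which relies on real vector instructions over wider registers, and the function name count_byte_swar is hypothetical.

```rust
/// Count occurrences of `needle` in `haystack`, examining 8 bytes per step.
fn count_byte_swar(haystack: &[u8], needle: u8) -> usize {
    // 0x7f repeated in every byte lane of the 64-bit word.
    const LOW7: u64 = 0x7f7f_7f7f_7f7f_7f7f;
    // The needle repeated in every byte lane.
    let pattern = u64::from_ne_bytes([needle; 8]);

    let mut chunks = haystack.chunks_exact(8);
    let mut count = 0usize;

    for chunk in &mut chunks {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(chunk);
        let word = u64::from_ne_bytes(buf);

        // Lanes that equal the needle become 0x00 after the XOR.
        let x = word ^ pattern;
        // Per lane: bit 7 ends up set if any of the low 7 bits was set.
        // The sum cannot carry between lanes (0x7f + 0x7f = 0xfe).
        let t = (x & LOW7) + LOW7;
        // 0x80 in every lane that was exactly zero, 0x00 elsewhere.
        let zero_lanes = !(t | x | LOW7);
        count += zero_lanes.count_ones() as usize;
    }

    // Handle the trailing bytes that did not fill a whole 8-byte word.
    count + chunks.remainder().iter().filter(|&&b| b == needle).count()
}
```

A call such as count_byte_swar(input, b'"') compares eight bytes against the needle with a handful of integer operations instead of eight separate comparisons; SIMD registers apply the same idea to 16, 32 or more bytes at once.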
The documentation mentions that simd-json can be used at full capacity when compiling for the native target CPU. You can do this by passing the following compiler option to rustc when building your program, like so:
rustc -C target-cpu=native
However, if you’re like most people using Cargo, you probably want to use cargo run. You can create a config file at .cargo/config.toml (older Cargo versions also read .cargo/config) and then add the following:
[build]
rustflags = ["-C", "target-cpu=native"]
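To see what the flag changes, it can help to compare what the compiler was allowed to assume with what the CPU reports at runtime. The sketch below uses only the standard library (the is_x86_feature_detected! and cfg! macros); the function names and the choice of AVX2 and SSE4.2 as the features to probe are assumptions made for illustration, not simd-json's own detection code.

```rust
/// Name of the best SIMD feature set found on the CPU running the program.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn runtime_simd_level() -> &'static str {
    if is_x86_feature_detected!("avx2") {
        "avx2"
    } else if is_x86_feature_detected!("sse4.2") {
        "sse4.2"
    } else {
        "fallback"
    }
}

/// On other architectures this sketch simply reports the fallback.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn runtime_simd_level() -> &'static str {
    "fallback"
}

/// True only if the compiler was allowed to assume AVX2 at build time,
/// for example via target-cpu=native on a machine that has AVX2.
fn compiled_with_avx2() -> bool {
    cfg!(target_feature = "avx2")
}
```

If compiled_with_avx2() returns false while runtime_simd_level() returns "avx2", the binary was built for a generic CPU and the rustflags setting has most likely not been picked up.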
Although this library is quite fast, it should be noted that there is quite a lot of unsafe code in this crate, due to it being a port of a C++ library. This is not to say that you shouldn’t use it, but rather that you should use it with caution (as the crate says). The crate also has a section on safety that details the practices (like unit testing) used to make sure it is as safe to use as possible.
It should also be mentioned that, for best performance, it’s typically best to use jemalloc or mimalloc as the allocator to be able to make the most of the library.
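Generally, the API for simd-json mirrors that of serde_json, so switching should not be a big problem. One difference to watch for is that its parsing functions take mutable input (for example, from_slice takes a &mut [u8]), so call sites may need small changes.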
sonic-rs
sonic-rs is a Rust implementation of JSON manipulation with SIMD functionality. This library also has counterpart libraries in C++ and Go! Although it used to require the Rust nightly toolchain, it now supports stable Rust. Similarly to simd-json, it also requires the x86 CPU architecture to function at full capacity.
Like simd-json, to use sonic-rs you need to pass the following compiler option to rustc when building your program:
rustc -C target-cpu=native
You can create a config file at .cargo/config.toml (older Cargo versions also read .cargo/config) and then add the following to enable it while using cargo run:
[build]
rustflags = ["-C", "target-cpu=native"]
This allows you to build for SIMD without needing to do anything else!
Like simd-json, sonic-rs uses a fair amount of unsafe code. In fact, if you search for unsafe within the library you will probably find even more of it than in simd-json. There is also not much documentation on how the unsafe guarantees are upheld, so although this library may be even faster than simd-json, you will want to double-check that there is no undefined behavior!
sonic-rs additionally has some extra methods for lazy evaluation and additional speed. For example, if you want a raw JSON string, you can use the LazyValue type when deserializing to get a JSON string value that still has its escape characters (backslashes). There are also quite a few unchecked methods you can use if you are either not afraid of unsafe behavior or are sure the input is valid.
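The trade-off behind unchecked methods is the same one the standard library makes with std::str::from_utf8 and std::str::from_utf8_unchecked. The sketch below uses only std and does not call sonic-rs; the function names text_checked and text_unchecked are hypothetical and only illustrate the checked/unchecked split.

```rust
/// Checked: validates the bytes and reports invalid UTF-8 as None.
fn text_checked(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

/// Unchecked: skips validation entirely, which is faster but shifts the
/// responsibility for correctness onto the caller.
///
/// # Safety
/// The caller must guarantee that `bytes` is valid UTF-8; passing anything
/// else is undefined behavior.
unsafe fn text_unchecked(bytes: &[u8]) -> &str {
    unsafe { std::str::from_utf8_unchecked(bytes) }
}
```

The unchecked variant only makes sense when the input has already been validated somewhere else, for example because your own program produced it. For data that comes from users or the network, the checked path is the safer default.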
Although sonic-rs is a pretty fast library, it is also a more recent crate, and therefore some methods like from_reader (to allow reading from an IO stream) are missing from the crate. This has been raised as a GitHub issue already, so hopefully it will get implemented sooner rather than later.
Benchmarks
You can find the benchmarks for simd-json and serde_json here. There is a fairly significant improvement for simd-json versus serde_json.
You can find the benchmarks for sonic-rs here, which also compare it against simd-json and serde_json. As you can see, the final results are not in the same format as the simd-json and serde_json benchmarks, so they are somewhat more difficult to interpret in terms of data processed per second. However, sonic-rs is significantly (and sometimes hugely!) faster than both simd-json and serde_json under most scenarios.
Finishing Up
Thanks for reading! I hope this article has helped you gain an understanding of how to effectively use Rust JSON parsing libraries.