Deserialization with System.Text.Json

Working with JSON is as common as working with a language’s primitive types. .NET has always had basic built-in support for JSON with things like DataContractJsonSerializer, but it didn’t have the functionality, flexibility, or performance necessary to be considered a first-class citizen. The release of .NET Core 3 shifted that narrative with the inclusion of System.Text.Json.

This post explores the different ways that you can read JSON with System.Text.Json. It’s the second post in the series, with a few more articles in the works.

Overview

System.Text.Json provides three different ways for reading JSON. Each approach exposes the data in a different way, and the one you choose depends on what you’re trying to do:

  • JsonSerializer: The “general-purpose” API, meant to deserialize JSON into POCOs. It’s similar to Newtonsoft’s DeserializeObject, with some additional overloads for reading streams and raw bytes more efficiently.
  • JsonDocument: The “advanced” API that breaks down a JSON document into its constituent parts, and exposes it through a document object model.
  • Utf8JsonReader: The “full control” API that lets you decide what to do with each JSON token while keeping memory usage to a minimum.

Let’s get a better idea of when to use each of these APIs in real-world scenarios.

JsonSerializer

It’s fairly common to have some JSON that you want to deserialize to an object. Just like with Json.NET, you can pass a string of JSON to the Deserialize method and get back a newly-instantiated POCO that represents the data. There’s nothing fancy about that. But if you’re reading data from a file or stream, you’re more likely to be working with a stream or array of bytes rather than a string.

That’s where the other overloads of the Deserialize method come in handy:

public static ValueTask<TValue> DeserializeAsync<TValue>(Stream utf8Json, JsonSerializerOptions options = null, CancellationToken cancellationToken = default);

public static TValue Deserialize<TValue>(ReadOnlySpan<byte> utf8Json, JsonSerializerOptions options = null);

The DeserializeAsync method is useful any time you’re reading a stream that contains JSON. One such place is the function trigger for an HTTP-based Azure Function. By default, a new HTTP trigger function comes pre-loaded with code similar to the following:

string requestBody = await new StreamReader(req.Body).ReadToEndAsync();
var data = JsonConvert.DeserializeObject<SomeObject>(requestBody);

The above code uses Json.NET, but you get the idea. The stream is read completely into a string, which is then deserialized to a POCO. The same thing can be accomplished with System.Text.Json’s DeserializeAsync method in a single statement:

var data = await JsonSerializer.DeserializeAsync<SomeObject>(req.Body);

It’s much neater to deserialize the request body this way, and it avoids the unnecessary string allocation, since the serializer consumes the stream for you.
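To try this outside of an Azure Function, here is a minimal sketch that swaps req.Body for an in-memory stream. The BlogPost class, the StreamExample helper, and the sample JSON are made up for illustration; only JsonSerializer.DeserializeAsync comes from the library.

```csharp
using System.IO;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical POCO, used only for this illustration.
public class BlogPost
{
    public string Topic { get; set; }
    public int Part { get; set; }
}

public static class StreamExample
{
    // Deserializes UTF-8 JSON straight from the stream,
    // without first reading it into an intermediate string.
    public static async Task<BlogPost> ReadPostAsync(Stream utf8Json)
    {
        return await JsonSerializer.DeserializeAsync<BlogPost>(utf8Json);
    }

    // Stand-in for req.Body: an in-memory stream of UTF-8 encoded JSON.
    public static Stream CreateSampleStream()
    {
        byte[] bytes = Encoding.UTF8.GetBytes(
            "{\"Topic\":\"Json Serialization Part 1\",\"Part\":1}");
        return new MemoryStream(bytes);
    }
}
```

Calling await StreamExample.ReadPostAsync(StreamExample.CreateSampleStream()) should hand back a BlogPost with both properties populated. Note that the stream has to contain UTF-8 encoded JSON, as the utf8Json parameter name suggests.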

The Deserialize method can also take a ReadOnlySpan of bytes as input. Much like the stream example above, you previously had to decode the bytes into a string before deserializing it. If you’ve already got the UTF-8 encoded data loaded in memory, this overload saves you a few allocations and parses the JSON directly into a POCO.
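As a rough sketch of the span-based overload, assuming the bytes are already in memory and UTF-8 encoded (PostHeader and SpanExample are hypothetical names for this example):

```csharp
using System;
using System.Text;
using System.Text.Json;

// Hypothetical POCO, used only for this illustration.
public class PostHeader
{
    public string Topic { get; set; }
    public int Part { get; set; }
}

public static class SpanExample
{
    // Parses UTF-8 bytes directly into a POCO; no string is created for the payload.
    public static PostHeader ReadHeader(ReadOnlySpan<byte> utf8Json)
    {
        return JsonSerializer.Deserialize<PostHeader>(utf8Json);
    }

    // Sample payload standing in for bytes read from a file or a socket.
    public static byte[] CreateSamplePayload()
    {
        return Encoding.UTF8.GetBytes("{\"Topic\":\"Json Serialization Part 1\",\"Part\":1}");
    }
}
```

A byte[] converts implicitly to ReadOnlySpan<byte>, so SpanExample.ReadHeader(SpanExample.CreateSamplePayload()) compiles without any extra ceremony.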

It’s also worth noting that some of the default options for System.Text.Json are different from Json.NET. System.Text.Json adheres to RFC 8259, so if you’re ever left wondering why a setting is different from Newtonsoft, that’s probably why.
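A few of those stricter defaults: property names are matched case-sensitively, and comments and trailing commas are rejected. If you need the more forgiving behaviour, you can opt in through JsonSerializerOptions. The sketch below is one possible configuration, not a recommendation; PostInfo and OptionsExample are hypothetical names.

```csharp
using System.Text.Json;

// Hypothetical POCO, used only for this illustration.
public class PostInfo
{
    public string Topic { get; set; }
    public int Part { get; set; }
}

public static class OptionsExample
{
    // Options that relax three of the strict defaults.
    public static readonly JsonSerializerOptions Lenient = new JsonSerializerOptions
    {
        PropertyNameCaseInsensitive = true,            // "topic" now matches Topic
        AllowTrailingCommas = true,                    // tolerate {"Part":2,}
        ReadCommentHandling = JsonCommentHandling.Skip // ignore comments in the JSON
    };

    public static PostInfo ReadLenient(string json)
    {
        return JsonSerializer.Deserialize<PostInfo>(json, Lenient);
    }
}
```

With these options, input such as {"topic":"Options", /* draft */ "part":2,} deserializes without complaint. With the defaults, the comment alone would cause a JsonException.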

You should use JsonSerializer when you:

  • Have a POCO that matches the JSON data, or it’s easy to create one.
  • Need to use most of the properties of the JSON in your application code.

JsonDocument

JsonDocument.Parse deserializes JSON to its constituent parts and makes it accessible through an object model that can represent any valid JSON. The object model gives you the power to read arbitrary parts of the JSON document, without forcing you to define a POCO. In that sense, it’s similar to the JObject type in Newtonsoft, but with a much nicer API.

A JsonDocument is composed of a single property called RootElement, of type JsonElement. Think of a JsonElement as being any JSON value, object, or array.

[Diagram: the relationship between JsonDocument, JsonProperty, and JsonElement]

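JsonElement has Get methods for primitive types, such as GetString, GetInt32, and GetBoolean. They take no arguments and are called on the element that holds the value, like so: document.RootElement.GetProperty("Topic").GetString(), or document.RootElement.GetProperty("Part").GetInt32().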

It also has a GetProperty method to retrieve a block of JSON within the document. For example, you would write document.RootElement.GetProperty("Stats") if you wanted to get a JsonElement that includes all the properties in that block of JSON.
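One thing to keep in mind is that GetProperty throws a KeyNotFoundException when the property doesn’t exist. When the shape of the JSON isn’t guaranteed, TryGetProperty is the safer choice. Here is a minimal sketch, assuming a hypothetical "Stats" object with a numeric "Views" property (both the JSON shape and the PropertyAccessExample helper are invented for this example):

```csharp
using System.Text.Json;

public static class PropertyAccessExample
{
    // Returns the "Views" number nested under "Stats",
    // or null when the JSON doesn't have that shape.
    public static int? ReadViews(string json)
    {
        // JsonDocument is IDisposable, so it is wrapped in a using declaration.
        using JsonDocument document = JsonDocument.Parse(json);
        JsonElement root = document.RootElement;

        // TryGetProperty is only valid on objects, and TryGetInt32 only on numbers,
        // so ValueKind is checked before each call.
        if (root.ValueKind == JsonValueKind.Object
            && root.TryGetProperty("Stats", out JsonElement stats)
            && stats.ValueKind == JsonValueKind.Object
            && stats.TryGetProperty("Views", out JsonElement views)
            && views.ValueKind == JsonValueKind.Number
            && views.TryGetInt32(out int count))
        {
            return count;
        }

        return null;
    }
}
```

For example, PropertyAccessExample.ReadViews("{\"Stats\":{\"Views\":120}}") yields 120, while a document without a "Stats" property yields null instead of throwing.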

There are two ways to go about getting to the data that interests you. The first, if you know what you’re looking for, is to access the element directly through the DOM. You could for example do the following to get a known property:

// {"Topic":"Json Serialization Part 1","Part":1,"Author":"Marc","Co-Author":"Helen","Keywords":["json","netcore","parsing"]}

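// JsonDocument rents pooled memory, so dispose it when you're done with it.
using var blogPost = JsonDocument.Parse(stringifiedJson);
var topic = blogPost.RootElement.GetProperty("Topic").GetString(); // "Json Serialization Part 1"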

That works great for random access to a property that you know how to find. But what if you’re looking for a property that could be anywhere in the document? Or you need to read a particular property from each object in a JSON array? That’s where EnumerateObject and EnumerateArray come in. They can be used together to walk through any JsonDocument:

// {"Topic":"Json Serialization Part 1","Part":1,"Author":"Marc","Co-Author":"Helen","Keywords":["json","netcore","parsing"]}
// Requires: using System.Linq; using System.Text.Json;
using var blogPost = JsonDocument.Parse(stringifiedJson);

// Find all authors, returns a list with "Marc", "Helen"
var authors = blogPost.RootElement.EnumerateObject()
                   .Where(it => it.Name.Contains("Author") && it.Value.ValueKind == JsonValueKind.String)
                   .Select(it => it.Value.GetString())
                   .ToList(); // materialize before the document is disposed

// Find all keywords, returns a list with "json", "netcore", "parsing"
var keywords = blogPost.RootElement.EnumerateObject()
                  .Where(it => it.Value.ValueKind == JsonValueKind.Array && it.Name == "Keywords")
                  .SelectMany(it => it.Value.EnumerateArray().Select(that => that.GetString()))
                  .ToList();
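The queries above only look at the top level of the document. For a property that could be anywhere, EnumerateObject and EnumerateArray can be combined in a recursive walk. The following is a sketch of one way to do that; DocumentWalker and its methods are hypothetical helpers, not part of System.Text.Json.

```csharp
using System.Collections.Generic;
using System.Text.Json;

public static class DocumentWalker
{
    // Returns the string value of every property named propertyName, at any depth.
    public static List<string> FindStrings(string json, string propertyName)
    {
        using JsonDocument document = JsonDocument.Parse(json);
        var results = new List<string>();
        Collect(document.RootElement, propertyName, results);
        return results;
    }

    private static void Collect(JsonElement element, string propertyName, List<string> results)
    {
        switch (element.ValueKind)
        {
            case JsonValueKind.Object:
                foreach (JsonProperty property in element.EnumerateObject())
                {
                    if (property.Name == propertyName
                        && property.Value.ValueKind == JsonValueKind.String)
                    {
                        results.Add(property.Value.GetString());
                    }

                    // Keep descending: the value may itself be an object or an array.
                    Collect(property.Value, propertyName, results);
                }
                break;

            case JsonValueKind.Array:
                foreach (JsonElement item in element.EnumerateArray())
                {
                    Collect(item, propertyName, results);
                }
                break;

            // Strings, numbers, booleans, and nulls have no children to visit.
        }
    }
}
```

The results are copied into a list of plain strings before the document is disposed, so the caller never holds on to a JsonElement that points at released memory.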


Serialization and deserialization are both expensive operations. The JsonDocument API is designed to keep allocations down to a minimum, reducing the impact it has on your application.

You should use JsonDocument and its related types when:

  • The JSON would be too complex to represent in a POCO.
  • You need access to only a few specific parts of the JSON data.
  • You don’t know the format of the JSON or the JSON could have multiple formats.

Utf8JsonReader

Utf8JsonReader is lower level than both the JsonSerializer and JsonDocument APIs. It operates on individual JSON tokens so that you can decide what to do with each token. It’s designed to customize the deserialization process and keep allocations to a minimum, allowing you to read very large documents that wouldn’t be feasible with other deserialization means. You could use it, for example, to:

  • Find the value of a particular property hidden deep within the JSON.
  • Filter for JSON tokens that match some criteria.
  • Count the number of tokens that match some criteria.
  • Deserialize only the values you need from a large JSON.
  • Read a large file from a stream.

Utf8JsonReader is for what I would consider edge cases; it’s not something you’re likely to use on a daily basis. For that reason, I won’t go into the details of its API here, but you can refer to the linked articles above for more information.

You should consider using Utf8JsonReader when:

  • You need full control of how and what you’re going to deserialize.
  • You have a really large JSON document that can’t feasibly be read any other way.
  • You have to do some special processing of the JSON document, like counting certain tokens.
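To give a flavour of the token-counting case without going deep into the API, here is a minimal sketch that counts the string values in a UTF-8 payload. TokenCounter is a hypothetical helper, and the sketch assumes the whole payload is already in memory; reading from a stream in chunks takes more bookkeeping than is shown here.

```csharp
using System;
using System.Text;
using System.Text.Json;

public static class TokenCounter
{
    // Counts string values (not property names) in a complete UTF-8 JSON payload.
    public static int CountStringValues(ReadOnlySpan<byte> utf8Json)
    {
        // Utf8JsonReader is a ref struct that reads the bytes in place.
        var reader = new Utf8JsonReader(utf8Json);
        int count = 0;

        // Read() advances one token at a time and returns false at the end of the data.
        while (reader.Read())
        {
            // Property names are reported as JsonTokenType.PropertyName,
            // so only string values are counted here.
            if (reader.TokenType == JsonTokenType.String)
            {
                count++;
            }
        }

        return count;
    }

    // The blog post JSON used in the earlier examples, encoded as UTF-8.
    public static byte[] CreateSamplePayload()
    {
        return Encoding.UTF8.GetBytes(
            "{\"Topic\":\"Json Serialization Part 1\",\"Part\":1,\"Author\":\"Marc\"," +
            "\"Co-Author\":\"Helen\",\"Keywords\":[\"json\",\"netcore\",\"parsing\"]}");
    }
}
```

For the sample payload, the count is 6: the topic, the two authors, and the three keywords. Nothing is materialized along the way, which is what makes this approach attractive for very large inputs.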

Summary

We saw a few different ways to parse JSON data with System.Text.Json. The method you choose depends on what you’re trying to accomplish and can be summarized as below:

JsonSerializer is good for:

  • Small to medium size JSON that’s deserializable to a POCO.
  • Making all properties and values accessible to your application code.

JsonDocument is good for:

  • Complex JSON documents.
  • Reading only specific parts of the JSON.
  • Walking through an unknown JSON format.

Utf8JsonReader is good for:

  • Reading extremely large JSON data sets.
  • Customizing the deserialization process to handle special scenarios.
  • Controlling deserialization behaviour.

Now that we can read JSON data any way that we like, it’s time to figure out how to write JSON for others to consume. Look for that post around mid-October.
