Is it possible to serialize to NDJSON (Newline Delimited JSON) using Json.NET? The Elasticsearch API uses NDJSON for bulk operations, and I can find nothing suggesting that this format is supported by any .NET libraries.

This answer provides guidance for deserializing NDJSON, and it was noted that one could serialize each row independently and join with newline, but I would not necessarily call that supported.

  • That link points to a domain grab. It was created only a couple of years ago, while providers like AWS and Azure have used newline-delimited JSON for several years. Commented Mar 5, 2018 at 12:16

2 Answers

13

As Json.NET does not currently have a built-in method to serialize a collection to NDJSON, the simplest answer would be to write to a single TextWriter using a separate JsonTextWriter for each line, setting CloseOutput = false for each:

using Newtonsoft.Json;
using System.Collections.Generic;
using System.Text;
using System.IO;

public static partial class JsonExtensions
{
    public static void ToNewlineDelimitedJson<T>(Stream stream, IEnumerable<T> items)
    {
        // Let caller dispose the underlying stream 
        using (var textWriter = new StreamWriter(stream, new UTF8Encoding(false, true), 1024, true))
        {
            ToNewlineDelimitedJson(textWriter, items);
        }
    }

    public static void ToNewlineDelimitedJson<T>(TextWriter textWriter, IEnumerable<T> items)
    {
        var serializer = JsonSerializer.CreateDefault();

        foreach (var item in items)
        {
            // Formatting.None is the default; I set it here for clarity.
            using (var writer = new JsonTextWriter(textWriter) { Formatting = Formatting.None, CloseOutput = false })
            {
                serializer.Serialize(writer, item);
            }
            // https://web.archive.org/web/20180513150745/http://specs.okfnlabs.org/ndjson/
            // Each JSON text MUST conform to the [RFC7159] standard and MUST be written to the stream followed by the newline character \n (0x0A).
            // The newline character MAY be preceded by a carriage return \r (0x0D). The JSON texts MUST NOT contain newlines or carriage returns.
            textWriter.Write("\n");
        }
    }
}

Sample fiddle.

Since the individual NDJSON lines are likely to be short but the number of lines might be large, this answer suggests a streaming solution to avoid the necessity of allocating a single string larger than 85kb. As explained in Newtonsoft Json.NET Performance Tips, such large strings end up on the large object heap and may subsequently degrade application performance.

7 Comments

Accepting as answer due to the use of a JsonTextWriter. It seems like this is the most sane approach in the context of what the library already provides, and it is notably more performant than the other answer's approach of creating a new TextWriter for each line.
Actually, the above is the answer that creates a JsonTextWriter for each line.
@jlavallet - JsonConvert.SerializeObject() internally creates both a StringWriter and a JsonTextWriter; see here for details. Since the individual JSON lines are likely to be short but the number of lines might be large, I suggested a streaming solution to avoid allocating a single string larger than 85kb as recommended here.
Small clarification: the original answer is disposing each JsonTextWriter to force the JsonTextWriter to call Flush(). This is what @dbc was explaining in his comment reply about large strings. As an alternative to recreating the JsonTextWriter every time, you can move its using block outside of the foreach and explicitly call Flush() at the end of each iteration.
Could you add "using" statements? Because I'm confused if this answer is Newtonsoft based or System.Text.Json based.
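
The alternative raised in the comments above (one JsonTextWriter created outside the loop, with an explicit Flush() after each item) might look like the sketch below. It is Newtonsoft-based, the class and method names are hypothetical, and it assumes JsonTextWriter accepts several complete root-level values in sequence.

```csharp
using Newtonsoft.Json;
using System.Collections.Generic;
using System.IO;

public static class NdjsonSingleWriterSketch
{
    // Hypothetical name: a variant of ToNewlineDelimitedJson that reuses one JsonTextWriter.
    public static void WriteNdjson<T>(TextWriter textWriter, IEnumerable<T> items)
    {
        var serializer = JsonSerializer.CreateDefault();

        // CloseOutput = false leaves textWriter open for the caller.
        using (var writer = new JsonTextWriter(textWriter) { Formatting = Formatting.None, CloseOutput = false })
        {
            foreach (var item in items)
            {
                serializer.Serialize(writer, item);
                // Flush the completed JSON text before writing the line terminator.
                writer.Flush();
                textWriter.Write("\n");
            }
        }
    }
}
```

Whether this is measurably faster than creating a writer per line would need to be benchmarked for the workload in question.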
1

You could try this:

string ndJson = JsonConvert.SerializeObject(value, Formatting.Indented);

but now I see that you are not just wanting the serialized object to be pretty printed. If the object you are serializing is some kind of collection or enumeration, could you not just do this yourself by serializing each element?

StringBuilder sb = new StringBuilder();
foreach (var element in collection)
{
    sb.AppendLine(JsonConvert.SerializeObject(element, Formatting.None));
}

// use the NDJSON output
Console.WriteLine(sb.ToString());
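
One detail worth noting: StringBuilder.AppendLine() appends Environment.NewLine, so on Windows each line ends in \r\n. The NDJSON spec quoted in the other answer permits that, but if a consumer expects a bare \n, a variant along these lines may be safer. The helper name is hypothetical, and like the loop above it builds the whole payload in memory.

```csharp
using Newtonsoft.Json;
using System.Collections.Generic;
using System.Text;

public static class NdjsonStringSketch
{
    // Hypothetical helper: returns the NDJSON payload as a single string.
    public static string ToNdjsonString<T>(IEnumerable<T> collection)
    {
        var sb = new StringBuilder();
        foreach (var element in collection)
        {
            // Append '\n' explicitly instead of AppendLine, which uses Environment.NewLine.
            sb.Append(JsonConvert.SerializeObject(element, Formatting.None)).Append('\n');
        }
        return sb.ToString();
    }
}
```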

4 Comments

It certainly would be valid to serialize one line at a time and append, but as I pointed out: this is not functionality I can get from Json.NET out-of-the-box. It's a fair question whether or not Json.NET should support this format explicitly. What would be the input type for NDJson, an array of objects?
I agree that it's a fair question whether Json.NET can support this out-of-the-box.
As to what the input type would be – I suppose from what I quickly read about the NDJSON format, that would depend on the context. It would be "a line of data" that should be separately handled from other "lines of data". What is your context? The line of data could be a simple object with a few properties, a complex object with multiple levels of sub objects, or just a string. You would have to tell me what should appear on each line.
@jlvallet NDJSON allows for any valid JSON to be transmitted in this format. If you wanted to produce this output with a set of mixed objects in .NET, some type boxing/unboxing would be necessary. Anyway, it was meant more as a rhetorical prompt. Maybe the best implementation is simply to build a custom JsonTextWriter for this type of serialization, eschewing any direct support in the library.
