I have a file that contains a json array of objects:

[ { "test1": "abc" }, { "test2": [1, 2, 3] } ]

I wish to use Jackson's JsonParser to read an InputStream from this file and, on every call to .next(), have it return one object from the array until it runs out of objects or fails.

Is this possible?

Use case: I have a large file with a json array filled with a large number of objects with varying schemas. I want to get one object at a time to avoid loading everything into memory.

EDIT:

I completely forgot to mention: my input is a string that accumulates JSON over time. I was hoping to parse it object by object, removing each parsed object from the string.

But I suppose that doesn't matter! I can do this manually so long as the JsonParser will return the index into the string.

4 Answers

Yes, you can achieve this sort of part-streaming-part-tree-model processing style using an ObjectMapper:

ObjectMapper mapper = new ObjectMapper();
JsonParser parser = mapper.getFactory().createParser(new File(...));
if(parser.nextToken() != JsonToken.START_ARRAY) {
  throw new IllegalStateException("Expected an array");
}
while(parser.nextToken() == JsonToken.START_OBJECT) {
  // read everything from this START_OBJECT to the matching END_OBJECT
  // and return it as a tree model ObjectNode
  ObjectNode node = mapper.readTree(parser);

  // do whatever you need to do with this object
}

parser.close();
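
For the accumulating-string case described in the question's edit, the same readTree loop can probably be combined with the location the parser reports. The following is only a minimal sketch under stated assumptions: the buffer is a String of concatenated top-level objects with no separating commas, the class name IncrementalObjects and the helper drain are hypothetical, and the char offset reported after each readTree call is assumed to sit just past that object's closing brace.

```java
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class IncrementalObjects {

    /** Objects parsed so far, plus the char offset just past the last complete one. */
    public static final class Result {
        public final List<JsonNode> nodes = new ArrayList<>();
        public long consumedChars = 0;
    }

    // Hypothetical helper: parses as many complete top-level objects as the buffer holds.
    public static Result drain(String buffer) {
        ObjectMapper mapper = new ObjectMapper();
        Result result = new Result();
        try (JsonParser parser = mapper.getFactory().createParser(buffer)) {
            while (parser.nextToken() == JsonToken.START_OBJECT) {
                JsonNode node = mapper.readTree(parser);
                result.nodes.add(node);
                // Assumption: for String input this is the offset just past the closing '}'
                result.consumedChars = parser.getCurrentLocation().getCharOffset();
            }
        } catch (IOException e) {
            // The tail is truncated or malformed: keep what was parsed so far.
            // This sketch does not distinguish the two cases.
        }
        return result;
    }

    public static void main(String[] args) {
        String buffer = "{\"test1\":\"abc\"}{\"test2\":[1,2,3]}{\"test3\":";
        Result r = drain(buffer);
        System.out.println("parsed:    " + r.nodes);
        // Whatever follows the last complete object stays in the buffer for the next round
        System.out.println("remaining: " + buffer.substring((int) r.consumedChars));
    }
}
```

The caller would keep only the substring after consumedChars and append new input to it before calling drain again. Treating every IOException as "not enough input yet" is a simplification; real code would likely want to tell a truncated tail apart from genuinely invalid JSON.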

7 Comments

Hey Ian, after nearly 2 years this code still works and saved my day. Just to confirm: every time the mapper runs readTree up to the END_OBJECT token that matches where it started, the parser's "cursor" is also moved there, right? So if I call parser.nextToken() again right after that, I should be reading the next object after the one that was just read, correct?
@JamesJiang correct. The readTree given a parser positioned on the START_OBJECT will consume events from the parser until it reaches the matching END_OBJECT and will leave the parser positioned on that.
Hmm, I tried this, but I'm getting a com.fasterxml.jackson.core.JsonParseException: Unexpected character (',' (code 44)): expected a valid value (number, String, array, object, 'true', 'false' or 'null') exception. But a JSON list is supposed to be separated by commas, so I'm not sure why Jackson is complaining... any tips?
@SnoopDougg that looks to me like an error with the JSON, maybe a property with no value ({"foo": ,"bar":"baz"}). Or possibly if you've got a series of objects at the top level that are separated by commas but not surrounded by overall array brackets ([]) - I know this technique can cope with either a well formed array ([{...},{...}]) or a stream of objects ({...}{...}) but not if you have the commas without the brackets.
I was trying to separate the objects in my stream with commas. Removing the commas fixed it, thanks! Now I just pass ({...}{...}{...})

What you are looking for is the Jackson Streaming API. Here is a code snippet using it that could help you achieve what you need.

JsonFactory factory = new JsonFactory();
JsonParser parser = factory.createJsonParser(new File(yourPathToFile));

JsonToken token = parser.nextToken();
if (token == null) {
    // return or throw exception
}

// the first token is supposed to be the start of array '['
if (!JsonToken.START_ARRAY.equals(token)) {
    // return or throw exception
}

// iterate through the content of the array
while (true) {

    token = parser.nextToken();
    // nextToken() returns null at end of input; null, END_ARRAY and any
    // other non-object token all end the iteration here
    if (!JsonToken.START_OBJECT.equals(token)) {
        break;
    }

    // parse your objects by means of parser.getXxxValue() and/or other parser's methods

}
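
As one possible way to fill in the "parse your objects" step above, here is a small sketch that stays entirely in the streaming API. It only collects the top-level field names of each object and skips over the values; the class name StreamingFieldNames and the helper fieldNamesPerObject are hypothetical, and createParser is used in place of createJsonParser.

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class StreamingFieldNames {

    // Hypothetical helper: for each object in a top-level array, list its top-level field names.
    public static List<List<String>> fieldNamesPerObject(String json) {
        JsonFactory factory = new JsonFactory();
        List<List<String>> result = new ArrayList<>();
        try (JsonParser parser = factory.createParser(json)) {
            if (parser.nextToken() != JsonToken.START_ARRAY) {
                throw new IllegalStateException("Expected an array");
            }
            while (parser.nextToken() == JsonToken.START_OBJECT) {
                List<String> names = new ArrayList<>();
                // Inside an object the tokens alternate: FIELD_NAME, value, FIELD_NAME, value, ...
                while (parser.nextToken() == JsonToken.FIELD_NAME) {
                    names.add(parser.getCurrentName());
                    parser.nextToken();     // move onto the value
                    parser.skipChildren();  // does nothing for scalars, skips nested arrays/objects
                }
                // The inner loop ends on this object's END_OBJECT
                result.add(names);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String json = "[ { \"test1\": \"abc\" }, { \"test2\": [1, 2, 3] } ]";
        System.out.println(fieldNamesPerObject(json));
    }
}
```

Only one token is held at a time, so memory use should not grow with the size of the array. The trade-off is that every value has to be picked apart by hand.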

1 Comment

Just for information: the method createJsonParser is now deprecated; you can use createParser instead.

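This is a late answer that builds on Ian Roberts' answer. You can also use a JsonPointer to locate the array when it is nested inside a larger document. This avoids hand-coding the somewhat cumbersome token-by-token navigation needed to reach the starting point. In this case, the basePath is "/", but it can be any path that JsonPointer understands. (Under RFC 6901 the empty string "" addresses the whole document, while "/" addresses a member whose name is empty, so check which of the two your input needs.)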

Path sourceFile = Paths.get("/path/to/my/file.json");
// Point the basePath to a starting point in the file
JsonPointer basePath = JsonPointer.compile("/");
ObjectMapper mapper = new ObjectMapper();
try (InputStream inputSource = Files.newInputStream(sourceFile);
     JsonParser baseParser = mapper.getFactory().createParser(inputSource);
     JsonParser filteredParser = new FilteringParserDelegate(baseParser,
                    new JsonPointerBasedFilter(basePath), false, false);) {
    // Call nextToken once to initialize the filteredParser
    JsonToken basePathToken = filteredParser.nextToken();
    if (basePathToken != JsonToken.START_ARRAY) {
        throw new IllegalStateException("Base path did not point to an array: found " 
                                       + basePathToken);
    }
    while (filteredParser.nextToken() == JsonToken.START_OBJECT) {
        // Parse each object inside of the array into a separate tree model 
        // to keep a fixed memory footprint when parsing files 
        // larger than the available memory
        JsonNode nextNode = mapper.readTree(filteredParser);
        // Consume/process the node for example:
        JsonPointer fieldRelativePath = JsonPointer.compile("/test1");
        JsonNode valueNode = nextNode.at(fieldRelativePath);
        if (!valueNode.isValueNode()) {
            throw new IllegalStateException("Did not find value at "
                    + fieldRelativePath.toString() 
                    + " after setting base to " + basePath.toString());
        }
        System.out.println(valueNode.asText());
    }
}

This example reads custom objects directly from a stream, where source is a java.io.File:

ObjectMapper mapper = new ObjectMapper();
JsonParser parser = mapper.getFactory().createParser( source );
if ( parser.nextToken() != JsonToken.START_ARRAY ) {
    throw new Exception( "no array" );
}
while ( parser.nextToken() == JsonToken.START_OBJECT ) {
    CustomObj custom = mapper.readValue( parser, CustomObj.class );
    System.out.println( "" + custom );
}
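
If the literal hasNext()/next() style from the question is wanted, the same loop can probably be expressed through the iterator returned by ObjectMapper.readValues. This is a hedged sketch, not the answer author's code: the CustomObj class below is a hypothetical stand-in with a single field, unknown properties are ignored because the question's objects have varying schemas, and the class name IteratorStyle and the helper readAll are made up for illustration.

```java
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class IteratorStyle {

    // Hypothetical stand-in for the CustomObj used in the answer
    @JsonIgnoreProperties(ignoreUnknown = true)
    public static class CustomObj {
        public String test1;
    }

    // Hypothetical helper: walks the array one element at a time and collects the test1 values.
    public static List<String> readAll(String json) {
        ObjectMapper mapper = new ObjectMapper();
        List<String> seen = new ArrayList<>();
        try (JsonParser parser = mapper.getFactory().createParser(json)) {
            if (parser.nextToken() != JsonToken.START_ARRAY) {
                throw new IllegalStateException("no array");
            }
            // readValues must not see the surrounding START_ARRAY as the current token,
            // so clear it and let the iterator advance to the first element itself
            parser.clearCurrentToken();
            Iterator<CustomObj> it = mapper.readValues(parser, CustomObj.class);
            while (it.hasNext()) {
                CustomObj custom = it.next(); // binds exactly one array element
                seen.add(custom.test1);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String json = "[ { \"test1\": \"abc\" }, { \"test2\": [1, 2, 3] } ]";
        System.out.println(readAll(json));
    }
}
```

Each call to next() binds one element, so only the current object is held in memory. The iterator stops when it reaches the array's closing bracket.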
