
I am using MongoDB 2.6.4 and am still getting an error:

uncaught exception: aggregate failed: {
    "errmsg" : "exception: aggregation result exceeds maximum document size (16MB)",
    "code" : 16389,
    "ok" : 0,
    "$gleStats" : {
        "lastOpTime" : Timestamp(1422033698000, 105),
        "electionId" : ObjectId("542c2900de1d817b13c8d339")
    }
}

Reading various advice, I came across the suggestion of saving the result in another collection using $out. My query looks like this now:

db.audit.aggregate([
    { $match: { "date": { $gte: ISODate("2015-01-22T00:00:00.000Z"),
                          $lt:  ISODate("2015-01-23T00:00:00.000Z") } } },
    { $unwind: "$data.items" },
    { $out: "tmp" }
])

But I am getting a different error: uncaught exception: aggregate failed:

{"errmsg" : "exception: insert for $out failed: { lastOp: Timestamp 1422034172000|25, connectionId: 625789, err: \"insertDocument :: caused by :: 11000 E11000 duplicate key error index: duties_and_taxes.tmp.agg_out.5.$_id_  dup key: { : ObjectId('54c12d784c1b2a767b...\", code: 11000, n: 0, ok: 1.0, $gleStats: { lastOpTime: Timestamp 1422034172000|25, electionId: ObjectId('542c2900de1d817b13c8d339') } }",
    "code" : 16996,
    "ok" : 0,
    "$gleStats" : {
        "lastOpTime" : Timestamp(1422034172000, 26),
        "electionId" : ObjectId("542c2900de1d817b13c8d339")
    }
}

Does someone have a solution?

1 Answer

The error is due to the $unwind step in your pipeline.

When you unwind on a field containing n elements, n copies of the same document are produced, all with the same _id. Each copy holds one of the elements from the array that was unwound. See the demonstration below of the records after an unwind operation.

Sample demo:

> db.t.insert({"a":[1,2,3,4]})

WriteResult({ "nInserted" : 1 })

> db.t.aggregate([{$unwind:"$a"}])

{ "_id" : ObjectId("54c28dbe8bc2dadf41e56011"), "a" : 1 }
{ "_id" : ObjectId("54c28dbe8bc2dadf41e56011"), "a" : 2 }
{ "_id" : ObjectId("54c28dbe8bc2dadf41e56011"), "a" : 3 }
{ "_id" : ObjectId("54c28dbe8bc2dadf41e56011"), "a" : 4 }
>

Since all these documents have the same _id, you get a duplicate key exception (due to the same value in the _id field for all the unwound documents) on insert into the new collection named tmp.

The pipeline will fail to complete if the documents produced by the pipeline would violate any unique indexes, including the index on the _id field of the original output collection.
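If you do want to keep the $out stage, one possible workaround (a sketch that is not part of the original answer; it assumes you only need the unwound items plus whatever other fields you choose to project) is to exclude _id before writing, so each document inserted into tmp receives a fresh ObjectId:

    db.audit.aggregate([
        { $match: { "date": { $gte: ISODate("2015-01-22T00:00:00.000Z"),
                              $lt:  ISODate("2015-01-23T00:00:00.000Z") } } },
        { $unwind: "$data.items" },
        // exclude _id so the inserts into "tmp" get new, unique ObjectIds;
        // the "item" field name is only an example -- project the fields you actually need
        { $project: { _id: 0, item: "$data.items" } },
        { $out: "tmp" }
    ])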

To solve your original problem, you could set the allowDiskUse option to true, which allows the aggregation to use disk space whenever it needs to. The documentation describes it as:

Optional. Enables writing to temporary files. When set to true, aggregation operations can write data to the _tmp subdirectory in the dbPath directory. See Perform Large Sort Operation with External Sort for an example.

as in:

db.audit.aggregate([
    { $match: { "date": { $gte: ISODate("2015-01-22T00:00:00.000Z"),
                          $lt:  ISODate("2015-01-23T00:00:00.000Z") } } },
    { $unwind: "$data.items" }  // note, the pipeline ends here
], {
    allowDiskUse: true
});

5 Comments

Do you have any other suggestions? Unfortunately, with allowDiskUse: true I am getting the same original error: Error("Printing Stack Trace")@:0 ()@src/mongo/shell/utils.js:37 ([object Array],[object Object])@src/mongo/shell/collection.js:866 @(shell):10 uncaught exception: aggregate failed: { "errmsg" : "exception: aggregation result exceeds maximum document size (16MB)", "code" : 16389, "ok" : 0, "$gleStats" : { "lastOpTime" : Timestamp(1422046336000, 21), "electionId" : ObjectId("542c2900de1d817b13c8d339") } }
@NewMongoDBUser Why do you want to unwind the array items? Can you explain your use case a bit? We can then think of other possible solutions that avoid the $unwind operation.
I need to load the data into MySQL, and each array element will be a separate row in the RDBMS.
You would be better off splitting the items on the client side, since $unwind takes a toll when the number and size of the documents is quite large (see the sketch after these comments).
I am trying to use PDI Kettle for it and seem stuck with this problem.
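For reference, a rough sketch of the client-side splitting idea mentioned above (the field names and the simple print are assumptions; replace the print with your MySQL/PDI insert step):

    db.audit.find({
        date: { $gte: ISODate("2015-01-22T00:00:00.000Z"),
                $lt:  ISODate("2015-01-23T00:00:00.000Z") }
    }).forEach(function (doc) {
        // emit one record per array element, mirroring what $unwind would produce
        doc.data.items.forEach(function (item) {
            print(doc._id + "\t" + tojson(item));
        });
    });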
