This is my first time using MongoDB in a Meteor project, which helps a group of users coordinate reading the holy scriptures. The data structure for the collection looks as follows:

//collection is named as 'logs'
[
    {
        _id: someObjectId,
        startPage: 1,
        finishPage: 4,
        status: 'in progress',
        userId: someUserId1
    },
    {
        _id: someObjectId,
        startPage: 5,
        finishPage: 10,
        status: 'done',
        userId: someUserId2
    }
    // ... and so on; normally two users won't read the same page.
]

I am trying to figure out the following:

  • Total Pages "Done"
  • Total Pages "In Progress"
  • Next available Page
  • Missing Pages

Due to my limited knowledge of MongoDB, I'm stuck at this point. I have been looking at various solutions but am not sure which one would be the right fit:

  1. Aggregation (but I'm unclear as to how I would get the required data)
  2. MapReduce (should this run every time a new log is added, or is MapReduce only meant to run as a batch job?)
  3. Create a separate collection that holds all the tracking data and update it every time a log is added or updated

e.g.

 [
    {
        page: 1,
        inProgress: 0, //users would normally not read the same page twice but they may
        done: 1
    },
    {
        page: 2,
        inProgress: 1, //users would normally not read the same page twice but they may
        done: 0
    }
]

I would be really grateful if somebody could provide some insight into this and the preferred way of doing it. It may be obvious, but I'm finding it a bit hard. Thanks.

  • Your question is unfortunately not very clear. Logs are normally written but never modified. You could also track accesses in the original collection that's being read but doing that would eventually cause your documents to get very large, perhaps exceeding the 16MB doc limit. What are you trying to do with the logs afterwards? Show who read what when and how many times? Presumably you want to create some reports. Commented Jan 19, 2016 at 8:02
  • Thanks @MichelFloyd. Apologies for that. I missed the main part out. I was trying to get the formatting right and the main bit got deleted (by fault of my own). I've updated the question now with what I am trying to figure out. You are also right that the logs are not meant to be modified and I should use a better name but in this case they can be modified. Commented Jan 19, 2016 at 10:37

1 Answer


I'd model your documents around the queries you're trying to run. If you split the logs up into one document per page read, you can do a simple group-by.

So say we change your model to the following:

[{
    _id: new ObjectId(),
    pageNumber: 1,
    status: 'in progress',
    userId: 1
},
{
    _id: new ObjectId(),
    pageNumber: 2,
    status: 'in progress',
    userId: 1
},
{
    _id: new ObjectId(),
    pageNumber: 3,
    status: 'in progress',
    userId: 1
},
{
    _id: new ObjectId(),
    pageNumber: 4,
    status: 'in progress',
    userId: 1
},
{
    _id: new ObjectId(),
    pageNumber: 5,
    status: 'in progress',
    userId: 1
},
{
    _id: new ObjectId(),
    pageNumber: 6,
    status: 'done',
    userId: 2
},
{
    _id: new ObjectId(),
    pageNumber: 7,
    status: 'done',
    userId: 2
},
{
    _id: new ObjectId(),
    pageNumber: 8,
    status: 'done',
    userId: 2
},
{
    _id: new ObjectId(),
    pageNumber: 9,
    status: 'done',
    userId: 2
},
{
    _id: new ObjectId(),
    pageNumber: 10,
    status: 'done',
    userId: 2
},
{
    _id: new ObjectId(),
    pageNumber: 9,
    status: 'done',
    userId: 1
},
{
    _id: new ObjectId(),
    pageNumber: 10,
    status: 'done',
    userId: 1
}]
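
As a rough sketch of how the original range-style logs could be turned into per-page documents like the ones above: the helper name `splitLogIntoPages` is made up for illustration, and `_id` is left out on the assumption that MongoDB generates it on insert.

```javascript
// Hypothetical helper (not part of any library): expand one range-style log
// { startPage, finishPage, status, userId } into one document per page.
function splitLogIntoPages(log) {
  const pages = [];
  for (let page = log.startPage; page <= log.finishPage; page++) {
    pages.push({
      pageNumber: page,
      status: log.status,
      userId: log.userId
    });
  }
  return pages;
}

// Example using the first log from the question (userId simplified to 1).
const pageDocs = splitLogIntoPages({
  startPage: 1,
  finishPage: 4,
  status: 'in progress',
  userId: 1
});
console.log(pageDocs.length); // 4
```

The resulting array could then be inserted into the collection in place of the single range document.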

Then we can run the following aggregation query:

> db.logs.aggregate([ {$group: { _id: { "pageNumber" : "$pageNumber", "status" : "$status"}, count : {$sum : 1}}} ]).pretty()
{ "_id" : { "pageNumber" : 8, "status" : "done" }, "count" : 1 }
{ "_id" : { "pageNumber" : 7, "status" : "done" }, "count" : 1 }
{ "_id" : { "pageNumber" : 6, "status" : "done" }, "count" : 1 }
{ "_id" : { "pageNumber" : 4, "status" : "in progress" }, "count" : 1 }
{ "_id" : { "pageNumber" : 9, "status" : "done" }, "count" : 2 }
{ "_id" : { "pageNumber" : 3, "status" : "in progress" }, "count" : 1 }
{ "_id" : { "pageNumber" : 10, "status" : "done" }, "count" : 2 }
{ "_id" : { "pageNumber" : 2, "status" : "in progress" }, "count" : 1 }
{ "_id" : { "pageNumber" : 5, "status" : "in progress" }, "count" : 1 }
{ "_id" : { "pageNumber" : 1, "status" : "in progress" }, "count" : 1 }
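
To get from this output to the "total pages done" and "total pages in progress" figures, one option is to count the grouped rows per status on the client. This is only a sketch in plain JavaScript: `totalPagesByStatus` is a hypothetical name, and the `rows` array is a hand-copied subset of the output above rather than a live query result.

```javascript
// Rows shaped like the aggregation output above (subset, for illustration).
const rows = [
  { _id: { pageNumber: 9, status: 'done' }, count: 2 },
  { _id: { pageNumber: 10, status: 'done' }, count: 2 },
  { _id: { pageNumber: 6, status: 'done' }, count: 1 },
  { _id: { pageNumber: 1, status: 'in progress' }, count: 1 },
  { _id: { pageNumber: 2, status: 'in progress' }, count: 1 }
];

// Each row is one unique (pageNumber, status) pair, so counting rows per
// status gives the number of distinct pages in that status, regardless of
// how many users logged the same page.
function totalPagesByStatus(rows) {
  const totals = {};
  for (const row of rows) {
    const status = row._id.status;
    totals[status] = (totals[status] || 0) + 1;
  }
  return totals;
}

const totals = totalPagesByStatus(rows);
console.log(totals); // { done: 3, 'in progress': 2 }
```

Summing `count` instead of counting rows would give the number of reads rather than the number of distinct pages.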

2 Comments

Thanks Kevin. That helps. However, using this solution, how can the missing pages be worked out?
You'd need something like a book collection where you store the book's size; then you could match against that.
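
A minimal sketch of that idea, assuming the book's size is available as a page count: `findMissingPages` is a hypothetical helper, `totalPages = 12` is a made-up value, "missing" is taken to mean pages with no log at all, and "next available" is taken to mean the lowest missing page. In the shell, the list of recorded pages could come from something like `db.logs.distinct("pageNumber")`.

```javascript
// Hypothetical helper: return every page from 1..totalPages that has no log.
function findMissingPages(recordedPages, totalPages) {
  const seen = new Set(recordedPages);
  const missing = [];
  for (let page = 1; page <= totalPages; page++) {
    if (!seen.has(page)) {
      missing.push(page);
    }
  }
  return missing;
}

// Page numbers from the sample documents in the answer (9 and 10 appear twice).
const recorded = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 9, 10];

// Assumption: the book has 12 pages (invented for this example).
const missing = findMissingPages(recorded, 12);

// Assumption: the next available page is the lowest page nobody has logged.
const nextAvailable = missing.length > 0 ? missing[0] : null;

console.log(missing, nextAvailable); // [ 11, 12 ] 11
```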
