I'm seeking advice on how to process this JSON:

[{
  "title":"Things",
  "text":"1. Raindrops on roses 2. Whiskers on kittens 3. Bright copper kettles 4. Warm woolen mittens 5. Brown paper packages"
},{
  "title":"Colors",
  "text":"1. White 2. Blue 3. Red 4. Yellow 5. Green"
},{
  "title":"Animals",
  "text":"1. Dog 2. Rabbit 3. Cat 4. Squirrel 5. Duck"
},{
  "title":"Colors",
  "text":"1. Red 2. Blue 3. Orange 4. Green 5. Purple"
},{
  "title":"Animals",
  "text":"1. Bear 2. Bird 3. Duck 4. Squirrel 5. Rabbit"
},{
  "title":"Colors",
  "text":"1. Yellow 2. White 3. Black 4. Brown 5. Blue"
}]

to return these collections of observable arrays:

  • Title: Colors, Count: 3, Items:
    • Name: White, Score: 9, Count: 2
    • Name: Blue, Score: 9, Count: 3
    • Name: Red, Score: 8, Count: 2
    • Name: Yellow, Score: 7, Count: 2
    • Name: Black, Score: 3, Count: 1
    • Name: Orange, Score: 3, Count: 1
    • Name: Green, Score: 3, Count: 2
    • Name: Brown, Score: 2, Count: 1
    • Name: Purple, Score: 1, Count: 1

  • Title: Animals, Count: 2, Items:
    • Name: Bear, Score: 5, Count: 1
    • Name: Dog, Score: 5, Count: 1
    • Name: Rabbit, Score: 5, Count: 2
    • Name: Bird, Score: 4, Count: 1
    • Name: Duck, Score: 4, Count: 2
    • Name: Squirrel, Score: 4, Count: 2
    • Name: Cat, Score: 3, Count: 1

  • Title: Things, Count: 1, Items:
    • Name: Raindrops on roses, Score: 5, Count: 1
    • Name: Whiskers on kittens, Score: 4, Count: 1
    • Name: Bright copper kettles, Score: 3, Count: 1
    • Name: Warm woolen mittens, Score: 2, Count: 1
    • Name: Brown paper packages, Score: 1, Count: 1

To explain, I need to do the following:

  1. Group all entries by title, count how often each title occurs, and return new List objects, e.g. Title: Colors, Count: 3
  2. Split each text value on its number separators to create an array of items, count how often each item occurs, assign each occurrence a score based on its index position (i.e. [0] = 5 down to [4] = 1), sum the scores per item, and return new Item objects, e.g. Name: White, Score: 9, Count: 2
  3. Sort the Lists collection by Count and the Items collections by Score, Count and Name.
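
As a rough illustration of step 2, the split-and-score logic for a single text value might look like the sketch below. It assumes every entry is introduced by a number followed by a period and that item names never contain that pattern themselves; scoreText is a hypothetical name, not something from a library.

```javascript
// Hypothetical helper: split one "text" value into names and score each by position.
// Assumption: entries look like "1. Foo 2. Bar", and names contain no "<digits>." sequence.
function scoreText(text) {
    var names = text.split(/[0-9]+\./)
        .map(function (s) { return s.trim(); })          // drop surrounding whitespace
        .filter(function (s) { return s.length > 0; });  // drop the empty piece before "1."

    // The first name gets the highest score, the last name gets 1.
    return names.map(function (name, i) {
        return { Name: name, Score: names.length - i };
    });
}
```

For a five-item text this gives the first name a score of 5 and the last a score of 1, which is the [0] = 5 to [4] = 1 mapping described above.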

I've created the following JavaScript objects:

function List(title, items, count) {
    var self = this;
    self.Title = title;
    self.Count = count;
    self.Items = ko.observableArray(items);
}

function Item(name, count, score) {
    var self = this;
    self.Name = name;
    self.Count = count;
    self.Score = score;
}

I've looked into various approaches. I had intended to tackle this task with underscore.js, but I was put off by reports of poor performance when iterating over arrays.

I expect the JSON file to grow much larger than the sample I've shown, so performance will be important.

Hopefully, someone can suggest the best approach to achieve my objectives and perhaps demonstrate a good starting point.

Thanks in advance.

1 Answer

I'm using Lo-Dash 2.2.1 in this example, but you could easily swap in Underscore.js, as the two libraries' _.each signatures are basically identical.

var collection = [{
  "title":"Things",
  "text":"1. Raindrops on roses 2. Whiskers on kittens 3. Bright copper kettles 4. Warm woolen mittens 5. Brown paper packages"
},{
  "title":"Colors",
  "text":"1. White 2. Blue 3. Red 4. Yellow 5. Green"
},{
  "title":"Animals",
  "text":"1. Dog 2. Rabbit 3. Cat 4. Squirrel 5. Duck"
},{
  "title":"Colors",
  "text":"1. Red 2. Blue 3. Orange 4. Green 5. Purple"
},{
  "title":"Animals",
  "text":"1. Bear 2. Bird 3. Duck 4. Squirrel 5. Rabbit"
},{
  "title":"Colors",
  "text":"1. Yellow 2. White 3. Black 4. Brown 5. Blue"
}];

//Store all the collections in a single place.
var collections = {}; 
_.each(collection, function(item){

    //Does the collection exist?
    if(!collections[item.title]){
        //No? Add a new one
        collections[item.title] = {Title: item.title, Items: {} };
    }

    //Split out the parts of the text by number period
    var parts = item.text.split(/[0-9]+\./); 
    if(parts.length > 0){
        parts.shift(); //remove the first entry in the array since it will be blank   
    }

    //Iterate over each part text we found
    _.each(parts, function(partName, i){

        partName = partName.trim(); //remove whitespace

        //Try to get an existing reference to the part
        var part = collections[item.title].Items[partName]; 

        //If the part doesn't exist
        if(!part){

            //Create a new one and add it to the collection
            part = { Score: 0, Count: 0 }; 
            collections[item.title].Items[partName] = part; 
        }

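        //Increment the score by the zero-based position (0 for the first entry).
        //For the 5-to-1 scoring described in the question, add (parts.length - i) instead.
        part.Score += i; 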

        //Increment the count by 1
        part.Count++; 
    }); 
});


//Store the final array of collections here
var finalData = []; 

//Iterate over our current collection object
_.each(collections, function(collection, i){

    var data = []; 

    //Convert the Items "hashtables"/objects into arrays
    for(var key in collection.Items){

        data.push({
            'Name': key,    
            'Score': collection.Items[key].Score, 
            'Count': collection.Items[key].Count
        }); 

    }

    collection.Items = data; //replace the Items object with the array
    collection.Count = data.length; //number of distinct items, not the number of entries sharing this title. (You could always call Items.length instead)
    finalData.push(collection); //add the collection to the array.
});

console.log(finalData); 

Results

[
  {
    "Title": "Things",
    "Items": [
      {
        "Name": "Raindrops on roses",
        "Score": 0,
        "Count": 1
      },
      {
        "Name": "Whiskers on kittens",
        "Score": 1,
        "Count": 1
      },
      {
        "Name": "Bright copper kettles",
        "Score": 2,
        "Count": 1
      },
      {
        "Name": "Warm woolen mittens",
        "Score": 3,
        "Count": 1
      },
      {
        "Name": "Brown paper packages",
        "Score": 4,
        "Count": 1
      }
    ],
    "Count": 5
  },
  {
    "Title": "Colors",
    "Items": [
      {
        "Name": "White",
        "Score": 1,
        "Count": 2
      },
      {
        "Name": "Blue",
        "Score": 6,
        "Count": 3
      },
      {
        "Name": "Red",
        "Score": 2,
        "Count": 2
      },
      {
        "Name": "Yellow",
        "Score": 3,
        "Count": 2
      },
      {
        "Name": "Green",
        "Score": 7,
        "Count": 2
      },
      {
        "Name": "Orange",
        "Score": 2,
        "Count": 1
      },
      {
        "Name": "Purple",
        "Score": 4,
        "Count": 1
      },
      {
        "Name": "Black",
        "Score": 2,
        "Count": 1
      },
      {
        "Name": "Brown",
        "Score": 3,
        "Count": 1
      }
    ],
    "Count": 9
  },
  {
    "Title": "Animals",
    "Items": [
      {
        "Name": "Dog",
        "Score": 0,
        "Count": 1
      },
      {
        "Name": "Rabbit",
        "Score": 5,
        "Count": 2
      },
      {
        "Name": "Cat",
        "Score": 2,
        "Count": 1
      },
      {
        "Name": "Squirrel",
        "Score": 6,
        "Count": 2
      },
      {
        "Name": "Duck",
        "Score": 6,
        "Count": 2
      },
      {
        "Name": "Bear",
        "Score": 0,
        "Count": 1
      },
      {
        "Name": "Bird",
        "Score": 1,
        "Count": 1
      }
    ],
    "Count": 7
  }
]
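
The output above uses zero-based position sums and counts distinct items per collection. If you want the exact numbers and ordering from the question, a dependency-free variant might look like the sketch below. It is only a sketch under a few assumptions: buildLists and compareItems are hypothetical names, the score for a position is (number of items in that text) minus (index), and the item order is Score descending, then Count ascending, then Name ascending, which is what the expected output in the question implies.

```javascript
// Sketch only: buildLists and compareItems are illustrative names, not library functions.
function buildLists(entries) {
    // Object.create(null) avoids clashes with inherited keys such as "constructor".
    var groups = Object.create(null);

    entries.forEach(function (entry) {
        var group = groups[entry.title];
        if (!group) {
            group = groups[entry.title] = { Title: entry.title, Count: 0, Items: Object.create(null) };
        }
        group.Count++; // how many entries share this title

        var names = entry.text.split(/[0-9]+\./)
            .map(function (s) { return s.trim(); })
            .filter(function (s) { return s.length > 0; });

        names.forEach(function (name, i) {
            var item = group.Items[name];
            if (!item) {
                item = group.Items[name] = { Name: name, Score: 0, Count: 0 };
            }
            item.Score += names.length - i; // first position scores highest
            item.Count++;
        });
    });

    // Convert the lookup objects into sorted arrays.
    return Object.keys(groups).map(function (title) {
        var group = groups[title];
        var items = Object.keys(group.Items).map(function (name) { return group.Items[name]; });
        items.sort(compareItems);
        return { Title: group.Title, Count: group.Count, Items: items };
    }).sort(function (a, b) { return b.Count - a.Count; }); // most frequent title first
}

// Score descending, then Count ascending, then Name ascending.
function compareItems(a, b) {
    return (b.Score - a.Score) ||
           (a.Count - b.Count) ||
           (a.Name < b.Name ? -1 : a.Name > b.Name ? 1 : 0);
}
```

Calling buildLists(collection) on the sample data should put Colors (Count 3) first, followed by Animals and Things. Each entry is visited once and lookups go through plain objects, so the work grows roughly linearly with the input, plus the cost of the final sorts. The plain objects it returns could then be passed to the List and Item constructors from the question if you need observable arrays.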

3 Comments

Wow - great answer. Thanks very much. I'm just working through it and will reply again later.
Did you get a chance to work through it? I didn't add the ko.observableArray components as I figured you were more interested in the transformation routine, but they should be super easy to add in. Let me know if you have any questions.
Hi Brandon, sorry for the late reply. I did actually try to recreate your solution in pure Knockout (models, ko.observableArray and ko.utils) because there is more functionality I need to incorporate that I suspect Knockout's MVVM would lend itself nicely to, e.g. stackoverflow.com/questions/22003718/… I was hoping for a clear answer to that question before I replied to yours. Like I said, performance is paramount, so if Lo-Dash's _.each is faster than Knockout's ko.utils.arrayForEach (for example), I'd go with that.
