I have a lot of data in a CSV that I need to convert to nested JSON to use in a D3.js tree.

Here's a sample of the CSV data:

Domain,Subject,Section,Topic
Networking,Networking Communications,Data Transmission,Data - Overview
Networking,Networking Communications,Data Transmission,Email
Networking,Networking Communications,Data Transmission,Datagram
Networking,Networking Communications,Networking Models,OSI Model
Networking,Networking Communications,Networking Models,TCP/IP Mode

This is how the JSON should look:

{
    "name":"Networking",
    "groups":["CS Analyst", "Cyber Crime"],
    "children":[
        {
            "name":"Networking Communications",
            "groups":["CS Analyst", "Cyber Crime"],
            "children":[
                {
                    "name":"Data Transmission",
                    "groups":["CS Analyst", "Cyber Crime"],
                    "children":[
                        {
                            "name":"Data - Overview",
                            "groups":["CS Analyst", "Cyber Crime"]
                        },
                        {
                            "name":"Email",
                            "groups":["CS Analyst", "Cyber Crime"]
                        },
                        {
                            "name":"Datagram",
                            "groups":[]
                        }
                    ]
                },
                {
                    "name":"Networking Models",
                    "groups":["CS Analyst"],
                    "children":[
                        {
                            "name":"OSI Model",
                            "groups":["CS Analyst"]
                        },
                        {
                            "name":"TCP/IP Model",
                            "groups":["CS Analyst"]
                        }
                    ]
                },
  • That's not a CSV. Also, please specify the headers (1st row) precisely. Commented May 11, 2018 at 5:01
  • Your example data looks incomplete -- where does the "groups":["CS Analyst", "Cyber Crime"] information come from? Commented May 11, 2018 at 13:13
  • The "groups" are added by hand. I'm sorry I didn't specify that. Just trying to avoid having to manually do the rest of the JSON. Commented May 11, 2018 at 15:13
  • @peak my comment refers to the first version of the question, before the edit. Commented May 11, 2018 at 22:10

2 Answers

Here's a jq approach that supports unlimited nesting by using recursion. Since it's unclear exactly how the groups value is to be computed, the following program (program.jq) uses a fixed value. If you can specify an algorithm for determining the value, it should be easy to incorporate it into the program.

The key to understanding program.jq is group_by(f), which sorts the input array by the value of f and collects items with equal values into sub-arrays, yielding an array of arrays with one inner array per distinct value.
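
To make the grouping step concrete, here is a minimal JavaScript sketch of what group_by(.[0]) does to rows that have already been split on commas. The helper name groupByFirst and the sample rows are hypothetical. Unlike jq's group_by, this sketch keeps groups in first-seen order instead of sorting them by key.

```javascript
// Hypothetical helper: group rows (arrays of strings) by their first element.
// Note: jq's group_by also sorts the groups by key; this sketch keeps first-seen order.
function groupByFirst(rows) {
  const groups = new Map();
  for (const row of rows) {
    if (!groups.has(row[0])) groups.set(row[0], []);
    groups.get(row[0]).push(row);
  }
  // One inner array per distinct first element.
  return [...groups.values()];
}

// Illustrative rows, with the first two CSV columns already removed.
const rows = [
  ["Data Transmission", "Email"],
  ["Networking Models", "OSI Model"],
  ["Data Transmission", "Datagram"],
];

const grouped = groupByFirst(rows);
console.log(JSON.stringify(grouped));
```

Each inner array then becomes one node: its shared first element is the node's name, and the remaining columns are grouped again to produce the children.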

program.jq

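# Group rows by their first column and emit one node per group.
# $supplement is merged into every node (here, a fixed "groups" value).
def gather($supplement):
  group_by(.[0])
  | map( {name: .[0][0]}
         + $supplement +
         {children: (if (.[0]|length) > 2
                     # more than two columns left: drop the first and recurse
                     then (map(.[1:]) | gather($supplement))
                     # otherwise the second column holds the leaf names
                     else map({name:.[1]} + $supplement)
                     end) } )
  ;

split("\n") | map(split(","))  # raw text -> array of rows
| .[1:] # skip the headers
| map(select(length>0))        # drop empty rows, e.g. after a trailing newline
| gather({"groups":["CS Analyst", "Cyber Crime"]})
| .[]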

Invocation:

 jq -Rs -f program.jq nested.csv
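
With -R and -s together, jq passes the whole file to the program as one raw string, which is why program.jq starts by splitting on newlines and commas itself. A rough JavaScript counterpart of that parsing prelude might look like the following. The function name parseRows is hypothetical, and the sketch assumes simple CSV with no quoted fields or embedded commas.

```javascript
// Rough counterpart of the first three lines of the jq pipeline.
// Assumes simple CSV: no quoted fields or embedded commas.
function parseRows(text) {
  return text
    .split("\n")
    // jq's split(",") turns "" into [], while JavaScript yields [""], so mirror jq here.
    .map((line) => (line === "" ? [] : line.split(",")))
    .slice(1)                          // skip the headers
    .filter((row) => row.length > 0);  // drop empty rows
}

const csv = "Domain,Subject\nNetworking,Networking Communications\n";
console.log(JSON.stringify(parseRows(csv)));
```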

Output:

  {
    "name": "Networking",
    "groups": [
      "CS Analyst",
      "Cyber Crime"
    ],
    "children": [
      {
        "name": "Networking Communications",
        "groups": [
          "CS Analyst",
          "Cyber Crime"
        ],
        "children": [
          {
            "name": "Data Transmission",
            "groups": [
              "CS Analyst",
              "Cyber Crime"
            ],
            "children": [
              {
                "name": "Data - Overview",
                "groups": [
                  "CS Analyst",
                  "Cyber Crime"
                ]
              },
              {
                "name": "Email",
                "groups": [
                  "CS Analyst",
                  "Cyber Crime"
                ]
              },
              {
                "name": "Datagram",
                "groups": [
                  "CS Analyst",
                  "Cyber Crime"
                ]
              }
            ]
          },
          {
            "name": "Networking Models",
            "groups": [
              "CS Analyst",
              "Cyber Crime"
            ],
            "children": [
              {
                "name": "OSI Model",
                "groups": [
                  "CS Analyst",
                  "Cyber Crime"
                ]
              },
              {
                "name": "TCP/IP Mode",
                "groups": [
                  "CS Analyst",
                  "Cyber Crime"
                ]
              }
            ]
          }
        ]
      }
    ]
  }

var data = [
  "Networking, Networking Communications, Data Transmission, Data-Overview",
  "Networking, Networking Communications, Data Transmission, Email",
  "Networking, Networking Communications, Data Transmission, Something",
  "Networking,  Networking Communications,   Collection Management,   Logs",
  "Networking,  Networking Communications,   Collection Management,   Backups",
  "Networking,  Networking Communications,   Collection Management,   Configuration files",
  "Networking,  Network Architecture,    Architecture Concepts,   Types",
  "Networking,  Network Architecture,    Architecture Concepts,  Design",
  "Networking,  Network Architecture,    Network Topologies,  Comm Medias",
  "Networking,  Network Architecture,    Network Topologies,  Implementations",
  "Networking,  Network Architecture,    Intranet Extranet,  Zoning"
]

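// Reducer: merge one row (an array of names, root first) into the forest built so far.
const arrayToTree = (finalArray, currentArray) => {
  let node

  currentArray.forEach((name, index) => {
    // Level 0 searches the top-level array; deeper levels search the parent's children.
    const children = index === 0 ? finalArray : node.children
    let newNode = children.find(child => child.name === name)

    if (!newNode) {
      // Only non-leaf nodes get a children array.
      newNode = {
        name,
        ...(
          index < currentArray.length - 1
          ? { children: [] }
          : {}
        )
      }

      children.push(newNode)
    }

    node = newNode
  })

  return finalArray
}

// Trim each field so rows parse whether or not spaces follow the commas.
const finalResult = data
  .map(row => row.split(',').map(field => field.trim()))
  .reduce(arrayToTree, [])

console.log(JSON.stringify(finalResult, null, 2))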
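
The tree built above has no groups key, which the target JSON in the question includes. As a sketch, assuming the fixed value used in the jq answer is acceptable, a recursive pass could attach it to every node. The function name withGroups and the sample tree are illustrative and not part of the original answer.

```javascript
// Hypothetical helper: return a copy of each node with a fixed "groups" value,
// placed between "name" and "children" to match the key order in the question.
function withGroups(nodes, groups) {
  return nodes.map(({ name, children }) => ({
    name,
    groups: [...groups],
    // Leaves have no children key, so only recurse when one is present.
    ...(children ? { children: withGroups(children, groups) } : {}),
  }));
}

// Small stand-in for the structure produced by arrayToTree.
const sampleTree = [
  {
    name: "Networking",
    children: [
      { name: "Networking Communications", children: [{ name: "Email" }] },
    ],
  },
];

const tagged = withGroups(sampleTree, ["CS Analyst", "Cyber Crime"]);
console.log(JSON.stringify(tagged, null, 2));
```

Because the groups in the question were assigned by hand, the values would still need per-node adjustment afterwards.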
