
I am using a program that produces a horrible, outdated text/ASCII-style table as its "Click to Copy" output. How would I go about using RegEx to pull out the headings and the values and put them into an object? The string in question looks similar to this:

*========================================================================*
|Timestamp              ||Key1  |Key2   |Key3   |Key4   |Key5   |Key6   |
*------------------------------------------------------------------------*
| 01/06/2015 12:00:00   ||0     |12     |47.5   |102    |0      |0      |
*========================================================================*

*========================================================*
|Timestamp              ||Key1  |Key2   |Key3   |Key4   |
*--------------------------------------------------------*
| 01/06/2015 12:00:00   ||0.8   |120    |475    |1.2    |
*========================================================

I spent a bit of time on it and ended up with one array of keys and one array of values for each table, all collected in an outer array, like

[[KeyArray01], [ValueArray01], [KeyArray02], [ValueArray02]]

After messing around with that for a while, I decided there has to be a far better solution. I was hoping someone had an example of how to pull the headings and values from this string and put them straight into objects, something like this:

[{
    Timestamp = 01/06/2015 12:00:00
    Key1 = 0,
    Key2 = 12,
    Key3 = 47.5,
    Key4 = 102,
    Key5 = 0,
    Key6 = 0
},{
    Timestamp = 01/06/2015 12:00:00
    Key1 = 0.8,
    Key2 = 120,
    Key3 = 475,
    Key4 = 1.2
}]

So I can then simply run a loop over it to print out the relevant data in a list or table.

I had a bit of a look around and could not come up with a solution. Any help, or a point in the right direction, would be greatly appreciated.

Regards Kirt

UPDATE: For those asking for the method I used to get the arrays:

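function objectify() {
    var rawData = codeOutput.getSession().getValue(),
        // on each line, grab everything from the first "|" to the last "|";
        // match() returns null when nothing matches, hence the fallback
        b = rawData.match(/\|.*\|/g) || [],
        j = [];

    for (var i = 0; i < b.length; i++) {
        // split each row into its raw (untrimmed) cells
        j.push(b[i].split("|"));
    }

    // Run through each array
    // $.each(j,function(i,v) { $.each(v,function(index,value) { console.log(value)});});
    console.log(j);
}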

Final update: All of the answers appear to work on the test data. The one that worked best for me is Xotic750's answer, since it takes into account possible variations in the output, which is perfect for my solution.

Thanks all.

  • How did you end up with an array? Could you please also share that code, so we have a base? Commented Jun 2, 2015 at 6:38
  • If you want JSON, you need to format the output like Timestamp: '01/06/2015 12:00:00',Key1: '0', and so on. It may be that CSV is better (greater compatibility with desktop applications). Commented Jun 2, 2015 at 6:39
  • Seems best to just make an object from the arrays you've extracted. Commented Jun 2, 2015 at 6:49
  • what if you build up a string properly, var str = "{Timestamp: '01/06/2015 XX:xx', Key1: '0.8', [...]}" and do an eval of the string to receive the object by using var obj = eval('(' + str + ')'); Commented Jun 2, 2015 at 7:02
  • Sorry, I will post up how I made the arrays. I had started doing it at the office, and it was one of them situations where I knew there was an easy solution, and it was just out of reach of my mind at the time, and I ended up leaving in a huff haha. Got home and wrote up this Question. Commented Jun 2, 2015 at 22:02

4 Answers


Another possible solution.

function trimmer(list) {
    return list.map(function (part) {
        return part.trim();
    });
}

function getLines(text) {
    return trimmer(text.split('\n')).filter(function (part) {
        return part.charAt(0) === '|';
    });
}

function getParts(text) {
    return trimmer(text.slice(1, -1).replace('||', '|').split('|'));
}

var groups = getLines(document.getElementById('data').textContent).reduce(function (acc, line) {
    var last;

    if (line.slice(0, 10) === '|Timestamp') {
        acc.push({
            headers: getParts(line)
        });
    } else {
        last = acc[acc.length - 1];
        if (!Array.isArray(last.data)) {
            last.data = [];
        }

        last.data.push(getParts(line));
    }

    return acc;
}, []).reduce(function (acc, group) {
    var headers = group.headers;

    group.data.forEach(function (datum) {
        if (datum.length !== headers.length) {
            throw new Error('datum and header lengths do not match');
        }

        acc.push(headers.reduce(function (record, header, index) {
            record[header] = datum[index];

            return record;
        }, {}));
    });

    return acc;
}, []);

document.getElementById('out').textContent = JSON.stringify(groups, null, 2);
<pre id='data'>
*========================================================================*
|Timestamp              ||Key1  |Key2   |Key3   |Key4   |Key5   |Key6   |
*------------------------------------------------------------------------*
| 01/06/2015 12:00:00   ||0     |12     |47.5   |102    |0      |0      |
| 02/06/2015 22:00:00   ||1     |24     |90.5   |204    |1      |1      |
*========================================================================*

*========================================================*
|Timestamp              ||Key1  |Key2   |Key3   |Key4   |
*--------------------------------------------------------*
| 01/06/2015 12:00:00   ||0.8   |120    |475    |1.2    |
*========================================================
</pre>

<pre id="out">
</pre>

Of course, you could add further error detection to check things such as making sure a header is not empty or repeated, and so on.
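As a rough sketch of that kind of check, the function below rejects empty or repeated headers. The name `validateHeaders` is hypothetical and not part of the answer's code; it assumes the headers have already been trimmed, as `getParts` does.

```javascript
// Hypothetical helper (not in the original answer): throws if any header
// is empty or appears more than once, otherwise returns the list unchanged.
function validateHeaders(headers) {
    var seen = {};

    headers.forEach(function (header, index) {
        if (header === '') {
            throw new Error('header at index ' + index + ' is empty');
        }

        // hasOwnProperty avoids false positives for names like "constructor"
        if (Object.prototype.hasOwnProperty.call(seen, header)) {
            throw new Error('header "' + header + '" is repeated');
        }

        seen[header] = true;
    });

    return headers;
}
```

Under these assumptions it could wrap the header extraction, for example `headers: validateHeaders(getParts(line))`.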


If you've already extracted arrays it's probably best to just form objects from those...

Here's one approach to extract and populate an object for each table:

function extract(tabularDataString) {
  // Split on double newline (space between tables)
  return tabularDataString.split(/\n\n/g).map(function(s) {

    var all = [];
    var obj = {};

    s.split(/\n/g).forEach(function(line) {
      var current = [];
      // Grab individual values
      line.replace(/\| ?([\w\/:. ]+)/g, function($0, v) {
        current.push(v.trim());
      });
      if (current.length) all.push(current);
    });

    var keys = all[0];
    var vals = all[1];

    vals.forEach(function(val, i) {
      obj[keys[i]] = val;
    });

    return obj;
  });
}

Demo: http://jsfiddle.net/Lh70o120/
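For the first suggestion, forming objects from arrays that have already been extracted, a minimal sketch might look like the following. It assumes the arrays alternate between keys and values, as in the question's `[[KeyArray01], [ValueArray01], ...]` layout; the name `pairsToObjects` is hypothetical.

```javascript
// Hypothetical helper: `pairs` is assumed to alternate key arrays and
// value arrays, e.g. [keys1, values1, keys2, values2].
function pairsToObjects(pairs) {
  var result = [];

  for (var i = 0; i + 1 < pairs.length; i += 2) {
    var keys = pairs[i];
    var vals = pairs[i + 1];
    var obj = {};

    // Pair each key with the value at the same position
    keys.forEach(function(key, index) {
      obj[key] = vals[index];
    });

    result.push(obj);
  }

  return result;
}

var objects = pairsToObjects([
  ['Timestamp', 'Key1', 'Key2'],
  ['01/06/2015 12:00:00', '0', '12'],
  ['Timestamp', 'Key1'],
  ['01/06/2015 12:00:00', '0.8']
]);
```

Here `objects` holds one object per table, with every value still a string.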


I'd go with a two step approach, first get the array-key values ('timestamp', 'key1', etc.) and in a second step get all the values.

Step 0

Copy your code into a variable

var str = "your horrible table-like string";

Step 1

Split the string into lines and extract the key values

Step 2

Populate new Array with the values

Here is a fiddle: http://jsfiddle.net/2qa8Lt53/. The only open question is how you get the horrible string in the first place.

str='*========================================================================*\n'+
'|Timestamp              ||Key1  |Key2   |Key3   |Key4   |Key5   |Key6   |\n'+
'*------------------------------------------------------------------------*\n'+
'| 01/06/2015 12:00:00   ||0     |12     |47.5   |102    |0      |0      |\n'+
'| 02/06/2015 12:00:00   ||1     |14     |45.5   |132    |0      |1      |\n'+
'*========================================================================*\n';

var aStr = str.split('\n');
// get key values
var kval = aStr[1].split("|").map(Function.prototype.call, String.prototype.trim).filter(Boolean);
// get values
var fArr = [];
for (var i = 3; i < aStr.length; i++) {
    if (aStr[i].indexOf('|') >= 0) {
        var tStr = aStr[i].split("|").map(Function.prototype.call, String.prototype.trim).filter(Boolean);
        // use a plain object (not an array) to hold the named keys
        var ttStr = {};
        for (var j = 0; j < tStr.length; j++) {
            ttStr[kval[j]] = tStr[j];
        }
        fArr.push(ttStr);
    }
}

console.log(fArr);


Because this was fun, I thought I'd offer the following approach:

// getting the element containing the relevant ASCII table
// this is restrictive in that that element must contain
// *only* the ASCII table (for predictable results):
var input = document.getElementById('asciiTable').textContent,

    // splitting the input string by new-lines to form an array,
    // filtering that array with Array.prototype.filter():
    lines = input.split(/\n/).filter(function (line) {
        // line is the array-element from the array over
        // which we're iterating; here we keep only those
        // array elements for which the assessment is true:
        return line.indexOf('|') === 0;
    }),

    // getting the first 'line' of the array, which contains
    // the headings, using Array.prototype.shift(); shift()
    // also removes that element from the array. Then we
    // match all occurrences ('g') of strings consisting of
    // a word-boundary ('\b') followed by one-or-more
    // alpha-numerics (\w+) followed by another word-boundary;
    // String.prototype.match() returns either null, or an Array
    // containing the matched-strings:
    keys = lines.shift().match(/\b\w+\b/g),

    // initialising an Array:
    objArray = [],

    // initialising a variable to use within the loop:
    values;

// iterating over the lines Array, using
// Array.prototype.forEach():
lines.forEach(function (line, index) {
    // the first argument is the array-element,
    // the second is the index (the names are user-chosen
    // and, so long as they're valid, entirely irrelevant).

    // creating an Object to hold the key-value data:
    objArray[index] = {};

    // using String.prototype.split() on the line
    // to form an Array; filtering that array (again
    // with Array.prototype.filter()):
    values = line.split('|').filter(function (value) {
        // keeping those array-elements for which
        // the trimmed value (removing leading and
        // trailing white-space) still has a length;
        // 0 is falsey, any non-zero value is truthy:
        return value.trim().length;
    });

    // iterating over the keys array, using
    // Array.prototype.forEach() (again):
    keys.forEach(function (key, i) {

        // setting a key on the Object held
        // at the index-th index of the array
        // (that we created a few lines ago),
        // and setting its value to that of
        // the trimmed value, found earlier,
        // found at the i-th index of the values
        // array:
        objArray[index][key] = values[i].trim();
    });
});

// logging to the console:
console.log(objArray);
// [{"Timestamp":"01/06/2015 12:00:00","Key1":"0","Key2":"12","Key3":"47.5","Key4":"102","Key5":"0","Key6":"0"}]

<pre id="asciiTable">
*========================================================================*
|Timestamp              ||Key1  |Key2   |Key3   |Key4   |Key5   |Key6   |
*------------------------------------------------------------------------*
| 01/06/2015 12:00:00   ||0     |12     |47.5   |102    |0      |0      |
*========================================================================*
</pre>

External JS Fiddle demo, for experimentation.

And, as a function:

function parseAsciiTable(opts) {

    // setting the defaults, these can
    // be adjusted to taste either
    // directly (here) or in the 'opts'
    // object supplied to the function:
    var defaults = {
        // a valid CSS-style selector:
        'source': '#asciiTable',

        // the character that relevant
        // lines start with, and the
        // separator between values:
        'linesStartWith': '|',
        'separator': '|'
    };

    // iterating over the user-supplied object,
    // 'opts', using for..in:
    for (var property in opts) {

        // using Object.prototype.hasOwnProperty()
        // to check that it's not a native/inherited
        // property:
        if (opts.hasOwnProperty(property)) {
            // setting the default property with
            // the user-supplied property-value:
            defaults[property] = opts[property];
        }
    }

    // getting the element containing the ASCII table
    // and - again - only the ASCII table, using
    // document.querySelector() to retrieve the first of,
    // or no, elements matching the supplied selector:
    var source = document.querySelector(defaults.source),

    // what follows is exactly the same as above:
        lines = source.textContent.split(/\n/).filter(function (line) {
            return line.trim().indexOf(defaults.linesStartWith) === 0;
        }),
        keys = lines.shift().match(/\b\w+\b/g),
        objArray = [],
        values;
    lines.forEach(function (line, index) {
        objArray[index] = {};
        values = line.split(defaults.separator).filter(function (value) {
            return value.trim().length;
        });
        keys.forEach(function (key, i) {
            objArray[index][key] = values[i].trim();
        });
    });

    // returning the array of objects:
    return objArray;
}

console.log(parseAsciiTable());
// [{"Timestamp":"01/06/2015 12:00:00","Key1":"0","Key2":"12","Key3":"47.5","Key4":"102","Key5":"0","Key6":"0"}]
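Note that the values in the output above are all strings, whereas the question's example shows numbers. If numbers are wanted, one possible post-processing step is sketched below. The helper name `toNumbers` is hypothetical, and it assumes that only plain decimal values such as `0`, `12` or `47.5` should be converted.

```javascript
// Hypothetical helper: returns a copy of `record` in which numeric-looking
// string values are converted to numbers; everything else is left as-is.
function toNumbers(record) {
    var converted = {};

    Object.keys(record).forEach(function (key) {
        var value = record[key];

        // Only plain decimals match; the timestamp contains "/" and ":"
        // and therefore stays a string.
        converted[key] = /^-?\d+(\.\d+)?$/.test(value) ? Number(value) : value;
    });

    return converted;
}

var record = toNumbers({
    Timestamp: '01/06/2015 12:00:00',
    Key1: '0',
    Key3: '47.5'
});
```

Under these assumptions it could be applied to every parsed row with `parseAsciiTable().map(toNumbers)`.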

