I have a string like this:

'??__HELLO__?? WORLD ##SAMPLE --MAIN--##'

I need to parse it into an array that contains:

[{ marker: '??', value: { marker: '__', value: 'HELLO' } }, ' WORLD ', { marker: '##', value: ['SAMPLE ', { marker: '--', value: 'MAIN' }] }]

So I have these markers:

this.markers = {
        b: '??',
        i: '##',
        u: '__',
        s: '--',
    };

And I have a function that generates a stack:

parse(string) {
    this.string = string;
    this.stack = [];

    for (let i = 0; i < string.length; i++) {
        for (let marker of Object.values(this.markers)) {
            if (string[i] + string[i + 1] === marker) {
                this.stack.push({ marker: marker, index: i });
                this.stack.push('');
                i++;
                break;
            } else if (marker === Object.values(this.markers)[Object.values(this.markers).length - 1]) {
                this.stack[this.stack.length - 1] = this.stack[this.stack.length - 1].concat(string[i]);
                break;
            }
        }
    }

    for (let i = 0; i < this.stack.length; i++) {
        if (this.stack[i] === '') {
            this.stack.splice(i, 1);
            i--;
        }
    }

    console.log(this.stack);
    return this.parseRecursively(this.stack[0]);
}

For my example, the stack will contain:

[ { marker: '??', index: 0 },
{ marker: '__', index: 2 },
'HELLO',
{ marker: '__', index: 9 },
{ marker: '??', index: 11 },
' WORLD ',
{ marker: '##', index: 20 },
'SAMPLE ',
{ marker: '--', index: 29 },
'MAIN',
{ marker: '--', index: 35 },
{ marker: '##', index: 37 } ]
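
For comparison, here is a minimal sketch of producing a similar flat token list with a single split call. It relies on the fact that String.prototype.split keeps the text of capture groups in its result. The names input and tokens are only illustrative, and the tokens here are plain strings without the index field.

```javascript
// Sample string from the question.
const input = '??__HELLO__?? WORLD ##SAMPLE --MAIN--##';

// Splitting on a capture group keeps the matched markers in the result.
// filter(Boolean) drops the empty strings produced between adjacent markers.
const tokens = input.split(/(\?\?|##|__|--)/).filter(Boolean);

console.log(tokens);
```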

That function then calls another one, which should recursively generate the output array:

parseRecursively(element) {
    if (this.stack.length === 0) {
        return;
    }

    let parsed = [];

    for (let i = this.stack.indexOf(element); i < this.stack.length; i++) {
        if (typeof this.stack[i] === 'object') {
            if (this.stack[i].marker === this.stack[this.stack.indexOf(this.stack[i]) + 1].marker) {
                let popped = this.stack.splice(this.stack.indexOf(this.stack[i]) + 1, 1)[0];
                let popped2 = this.stack.splice(this.stack.indexOf(this.stack[i]), 1)[0];

                return { marker: popped.marker, value: this.string.substring(popped2.index + 2, popped.index) };
            } else {
                parsed.push({ marker: this.stack[i].marker, value: this.parseRecursively(this.stack[this.stack.indexOf(this.stack[i]) + 1]) });
                i = -1;
            }
        } else {
            parsed.push(this.stack.splice(this.stack.indexOf(this.stack[i]), 1)[0]);
            i -= 2;
        }
    }

    return parsed;
}

I tried many implementations of the above function but it still fails to parse the string.

So how can I rewrite this function so it would work?

Thanks!

P.S. Plain JavaScript only, nothing more. I think regular expressions would make this easier to solve; here's my regex:

this.regex = /(\?{2}|#{2}|\-{2}|_{2})(.+?)(\1)/g;
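
As a rough illustration of what this regex captures on the sample string, the sketch below runs it with matchAll. The backreference \1 forces the closing marker to equal the opening one, so only the outermost groups are found. The inner text would still need to be parsed again to reach the nested markers. The names input and topLevel are only illustrative.

```javascript
const regex = /(\?{2}|#{2}|\-{2}|_{2})(.+?)(\1)/g;
const input = '??__HELLO__?? WORLD ##SAMPLE --MAIN--##';

// Each match exposes the opening marker in group 1 and the enclosed text in group 2.
const topLevel = [...input.matchAll(regex)].map(m => ({ marker: m[1], inner: m[2] }));

console.log(topLevel);
// [ { marker: '??', inner: '__HELLO__' }, { marker: '##', inner: 'SAMPLE --MAIN--' } ]
```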
  • Are there any obvious separators, or are all the groups enclosed? (e.g. ## Something\n vs ## Something ##)? Commented Mar 28, 2018 at 10:24
  • They are all enclosed, like tags, e.g. ??Hello??, not separators Commented Mar 28, 2018 at 10:27
  • Given the following; function parse(str) { const regex = /(\?\?|__|##|--)(.*?)(\1)/; return regex.test(str) ? str.split(regex).filter(_ => _).map(parse).reduce((c, v) => c.concat(v), []) : str; } would that be a start? Commented Mar 28, 2018 at 10:39
  • Ok, so what's next? Commented Mar 28, 2018 at 11:27
  • Give me about 1 min, to post the followup :D Commented Mar 28, 2018 at 11:28

1 Answer

Okay, so after a bit of thought, here is my take on your problem:

function parse(str, markers = ['??', '__', '##', '--']) {
  // Escape the markers (mostly useless...)
  const e = markers.map(m => m.replace(/./g, '\\$&'))

  // Create regexs to match each individual marker.
  const groups = e.map(m => new RegExp('(' + m + ')(.*?)' + m));

  // Create the regex to match any group.
  const regex = new RegExp('(' + e.map(m => m + '.*?' + m).join('|') + ')');

  // 'Match' the groups markers.
  str = str.split(regex).filter(_ => _);

  // Iterate over each of the split markers. e.g.
  // From: '??__HELLO__?? WORLD ##SAMPLE --MAIN--##'
  //   To: ['??__HELLO__??', ' WORLD ', '##SAMPLE --MAIN--##']
  return str.map(match => {
    // Find the marker if it is a marker.
    const marker = groups.find(m => m.test(match));

    // If it's not a marker return the value.
    if (!marker) {
      return match.trim();
    }

    // It is a marker so make the marker object.
    match = match.match(marker);
    return {
      marker: match[1],
      // Do the recursion.
      value: parse(match[2], markers)
    }
  })
}

// Usage example:
console.log(
  parse('??__HELLO__?? WORLD ##SAMPLE --MAIN--##')
);
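
Note that the function above trims plain text and always wraps nested values in arrays, so its result differs slightly in shape from the array requested in the question. As an alternative sketch (a different technique from the regex approach above, not a drop-in replacement), a single left-to-right pass with an explicit stack can produce that exact shape. The name parseMarkers and the rule of unwrapping single-element values are assumptions made for illustration.

```javascript
// Hypothetical helper: stack-based parser for two-character paired markers.
function parseMarkers(input, markers = ['??', '##', '__', '--']) {
  const root = { marker: null, children: [] };
  const stack = [root]; // the innermost open group is on top
  let text = '';        // plain text collected since the last marker

  // Move any collected plain text into the innermost open group.
  const flush = () => {
    if (text !== '') {
      stack[stack.length - 1].children.push(text);
      text = '';
    }
  };

  let i = 0;
  while (i < input.length) {
    const marker = markers.find(m => input.startsWith(m, i));
    if (!marker) {
      text += input[i];
      i += 1;
      continue;
    }

    flush();
    const top = stack[stack.length - 1];
    if (top.marker === marker) {
      // Same marker as the innermost open group: treat it as the closing one.
      stack.pop();
      // Assumption: a single child is stored directly, several as an array.
      const value = top.children.length === 1 ? top.children[0] : top.children;
      stack[stack.length - 1].children.push({ marker, value });
    } else {
      // Otherwise it opens a new nested group.
      stack.push({ marker, children: [] });
    }
    i += marker.length;
  }

  flush();
  if (stack.length !== 1) {
    throw new Error('Unclosed marker: ' + stack[stack.length - 1].marker);
  }
  return root.children;
}

console.log(JSON.stringify(parseMarkers('??__HELLO__?? WORLD ##SAMPLE --MAIN--##')));
```

Because a marker equal to the innermost open one is always read as its closing marker, this sketch also cannot nest a marker directly inside itself.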

The individual regexes used in this code are built in the following way:

  1. Escape every character; ?? becomes \?\?
  2. The markers are isolated; \?\? becomes (\?\?)
  3. The contents are matched using the following regex; (.*?)
  4. The content is then surrounded by the markers; (\?\?)(.*?)\?\?

This means the default regex array looks like this:

[
  /(\?\?)(.*?)\?\?/,
  /(\_\_)(.*?)\_\_/,
  /(\#\#)(.*?)\#\#/,
  /(\-\-)(.*?)\-\-/
]
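
The construction steps above can be checked in isolation. The sketch below rebuilds the per-marker regexes the same way the parse function does and prints their sources. The names markers, escaped, groups and sample are only illustrative.

```javascript
const markers = ['??', '__', '##', '--'];

// Step 1: escape every character of each marker, e.g. '??' becomes '\?\?'.
const escaped = markers.map(m => m.replace(/./g, '\\$&'));

// Steps 2 to 4: capture the opening marker and the lazy content,
// then require the same marker again as the closing one.
const groups = escaped.map(m => new RegExp('(' + m + ')(.*?)' + m));

console.log(groups.map(g => g.source));

// Group 1 holds the marker and group 2 holds the enclosed text.
const sample = '??__HELLO__??'.match(groups[0]);
console.log(sample[1], sample[2]); // ?? __HELLO__
```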

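The "match any marker" regex looks like this:

/(\?\?.*?\?\?|\_\_.*?\_\_|\#\#.*?\#\#|\-\-.*?\-\-)/

It is effectively the same set of patterns joined with |, but with a single outer capture group instead of one group per marker. The outer group is what makes split keep the matched text in its result.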
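
A small sketch of how this combined pattern behaves with String.prototype.split may help. In the parse function the alternatives are wrapped in one capture group, and split keeps captured text in its result. Without that group the marked sections would be discarded. The variable names here are only illustrative.

```javascript
const input = '??__HELLO__?? WORLD ##SAMPLE --MAIN--##';

// With an outer capture group, the matched sections are kept.
const withGroup = input
  .split(/(\?\?.*?\?\?|__.*?__|##.*?##|--.*?--)/)
  .filter(Boolean);
console.log(withGroup);
// [ '??__HELLO__??', ' WORLD ', '##SAMPLE --MAIN--##' ]

// Without the capture group, only the text between the sections survives.
const withoutGroup = input
  .split(/\?\?.*?\?\?|__.*?__|##.*?##|--.*?--/)
  .filter(Boolean);
console.log(withoutGroup);
// [ ' WORLD ' ]
```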

1 Comment

I will note that you cannot nest markers of the same type, e.g. ??test__thing??nest!??__??, but I can't see why anyone would.
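
To make that limitation concrete, here is a short sketch using a simpler, hypothetical input. Because the content pattern is lazy, the first closing ?? that appears ends the group, so an inner ?? pair is not treated as nested.

```javascript
// Hypothetical input with a ?? group nested inside another ?? group.
const nested = '??a ??b?? c??';

// Same style of pattern as the answer, reduced to the ?? marker only.
const pieces = nested.split(/(\?\?.*?\?\?)/).filter(Boolean);

console.log(pieces);
// [ '??a ??', 'b', '?? c??' ]
```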
