1

I have an HTML string that contains multiple <p> tags. Within each <p> tag there is a word and its definition.

let data = "<p><strong>Word 1:</strong> Definition of word 1</p><p><strong>Word 2:</strong> Definition of word 2</p>"

My goal is to convert this HTML string into an array of objects that looks like this:

[
 {"word": "Word 1", "definition": "Definition of word 1"},
 {"word": "Word 2", "definition": "Definition of word 2"}
]

I am doing it as follows:

var parser = new DOMParser();
  var parsedHtml    = parser.parseFromString(data, "text/html");
  let pTags = parsedHtml.getElementsByTagName("p");
  let vocab = []
  pTags.forEach(function(item){
    // This is where I need help to split and convert item into object
    vocab.push(item.innerHTML)
  });

As the comment in the code above shows, that is where I'm stuck. Any help is appreciated.

4
  • Please create and share a fiddle which can describe what you tried to do Commented Dec 21, 2018 at 10:32
  • How about this link stackoverflow.com/questions/13272406/…? Commented Dec 21, 2018 at 10:33
  • @manfromnowhere That's not about parsing HTML, it's JSON. Commented Dec 21, 2018 at 10:37
  • Can you change the HTML to put a tag around the definition, like <span class="definition">Definition of word 1</span>? Commented Dec 21, 2018 at 10:38

3 Answers

3

Use textContent to get the text out of an element. The word is in the <strong> child element, and the definition is the rest of the paragraph's text.

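var parser = new DOMParser();
var parsedHtml = parser.parseFromString(data, "text/html");
let pTags = parsedHtml.getElementsByTagName("p");
let vocab = [];
// getElementsByTagName returns an HTMLCollection, which has no forEach, so copy it into an array first
Array.from(pTags).forEach(function(item){
  // label is the text of the <strong> element, e.g. "Word 1:"
  let label = item.getElementsByTagName("strong")[0].textContent;
  let word = label.replace(":", "").trim();
  // the definition is whatever text remains once the label is removed
  let definition = item.textContent.replace(label, "").trim();
  vocab.push({word: word, definition: definition});
});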
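The splitting step can also be pulled out into a small pure function, which makes it easy to check outside a browser. This is only a sketch: toEntry is a hypothetical helper name, and it assumes the label text is the beginning of the paragraph text and ends with an optional colon, as in the question's data.

```javascript
// Hypothetical helper: turns the text of one <p> into { word, definition }.
// labelText is the <strong> element's textContent, fullText is the <p>'s textContent.
function toEntry(labelText, fullText) {
  const label = labelText.trim();
  const text = fullText.trim();
  // Strip the label only when it really is the prefix of the paragraph text.
  const rest = text.startsWith(label) ? text.slice(label.length) : text;
  return {
    word: label.replace(/:\s*$/, ""), // drop the trailing colon, if any
    definition: rest.trim(),
  };
}

// Browser usage (DOMParser is not available in plain Node.js):
// const doc = new DOMParser().parseFromString(data, "text/html");
// const vocab = Array.from(doc.getElementsByTagName("p"), (p) =>
//   toEntry(p.getElementsByTagName("strong")[0].textContent, p.textContent));

console.log(toEntry("Word 1:", "Word 1: Definition of word 1"));
```

Slicing off the prefix rather than calling replace() keeps the definition intact even if the word happens to appear again inside it.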

0

A bit ad hoc, but it works.

const data = "<p><strong>Word 1:</strong> Definition of word 1</p><p><strong>Word 2:</strong> Definition of word 2</p>";
// Hard-coded for exactly two entries; every object uses the same "word" key, with the trailing colon removed
const parsedData = [
  {
    "word": data.split('<strong>')[1].split('</strong>')[0].replace(':', '').trim(),
    "definition": data.split('</strong>')[1].split('</p>')[0].trim()
  },
  {
    "word": data.split('</p>')[1].split('<strong>')[1].split('</strong>')[0].replace(':', '').trim(),
    "definition": data.split('</p>')[1].split('</strong>')[1].split('</p>')[0].trim()
  }
]
console.log(parsedData);
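If you do stay with plain string handling, the same idea can be generalized to any number of entries with a regular expression. This is a rough sketch, not a robust HTML parser: parseVocab and entryPattern are hypothetical names, and the pattern assumes every entry has exactly the shape <p><strong>Word:</strong> Definition</p> with no attributes, nested tags or line breaks.

```javascript
// Assumption: each entry looks like <p><strong>Word:</strong> Definition</p>.
// Group 1 captures the word (without the optional trailing colon), group 2 the definition.
const entryPattern = /<p>\s*<strong>(.*?):?\s*<\/strong>(.*?)<\/p>/g;

function parseVocab(html) {
  // matchAll yields one match per <p> entry; map each match to an object.
  return Array.from(html.matchAll(entryPattern), (m) => ({
    word: m[1].trim(),
    definition: m[2].trim(),
  }));
}

const sample = "<p><strong>Word 1:</strong> Definition of word 1</p>" +
               "<p><strong>Word 2:</strong> Definition of word 2</p>";
console.log(parseVocab(sample));
```

For markup that is not under your control, a DOM-based approach is the safer choice.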

1 Comment

Going from a DOM parser to string function is backwards.

0

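Two things need fixing:

  • The constructor is DOMParser, not DomParser (JavaScript names are case-sensitive).
  • getElementsByTagName() returns an HTMLCollection, which has no .forEach() method; use a for...of loop instead.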

My solution for your problem:

let data = "<p><strong>Word 1:</strong> Definition of word 1</p><p><strong>Word 2:</strong> Definition of word 2</p>"

var parser = new DOMParser();
var parsedHtml = parser.parseFromString(data, "text/html");
let pTags = parsedHtml.getElementsByTagName("p");
let vocab = [];
for (let p of pTags) {
  const word = p.getElementsByTagName('strong')[0].innerHTML.replace(':', '').trim();
  const definition = p.innerHTML.replace(/<strong>.*<\/strong>/, '').trim();
  vocab.push( { word, definition } )
}

console.log(vocab);
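The forEach problem can be reproduced without a browser. In the sketch below, a plain array-like object stands in for the HTMLCollection (an assumption made purely for illustration; a real HTMLCollection is also iterable, which is why for...of works on it in current browsers).

```javascript
// Stand-in for an HTMLCollection: indexed items plus length, but no array methods.
const arrayLike = { 0: "first <p>", 1: "second <p>", length: 2 };

console.log(typeof arrayLike.forEach); // "undefined", so arrayLike.forEach(...) would throw

// Option 1: copy into a real array, then use any array method.
const viaFrom = Array.from(arrayLike).map((s) => s.toUpperCase());

// Option 2: borrow forEach from Array.prototype without copying.
const viaCall = [];
Array.prototype.forEach.call(arrayLike, (s) => viaCall.push(s));

console.log(viaFrom, viaCall);
```

Alternatively, querySelectorAll("p") returns a NodeList, which does provide forEach in current browsers.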
