I'm working on a script and need to split strings that contain both HTML tags and text. I'm trying to isolate the tags and discard the text.

For example, I want this:

string = "<b>Text <span>Some more text</span> more text</b>";

to be split like this:

separation = string.split(/some RegExp/);

and become:

separation[0] = "<b>";
separation[1] = "<span>";
separation[2] = "</span>";
separation[3] = "</b>";

I would really appreciate any help or advice.

  • What is supposed to happen in the case of something like <b attribute="...">? Do you want everything up to the >? If so, you'll need a more advanced parser to cover all bases, or consider using the one built into your browser (HTML -> DOM). Commented May 5, 2012 at 19:03

1 Answer

You'll probably want to look into String.match instead:

var str = "<b>Text <span>Some more text</span> more text</b>";
var separation = str.match(/<[^]+?>/g);

console.log(separation); // ["<b>", "<span>", "</span>", "</b>"]
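
As a follow-up to the snippet above, here is a minimal sketch of the same idea wrapped in a function. The name extractTags is hypothetical and not part of the original answer, and the pattern is swapped for the negated class [^>]+, which stops at the first ">" and should cover tags with attributes as well as tags that span several lines. It also falls back to an empty array, because String.prototype.match returns null when nothing matches.

```javascript
// Hypothetical helper (not from the original answer): collect every tag in `html`.
function extractTags(html) {
  // [^>]+ matches one or more characters other than ">", newlines included,
  // so attributes and multi-line tags are captured as part of the tag.
  // match() with the g flag returns null when there is no match, hence the fallback.
  return html.match(/<[^>]+>/g) || [];
}

console.log(extractTags("<b>Text <span class='class'>Some more text</span> more text</b>"));
// ["<b>", "<span class='class'>", "</span>", "</b>"]

console.log(extractTags("plain text, no tags"));
// []
```

This is still only a regular expression: an attribute value that itself contains ">" would cut the tag short, which is the kind of case where the browser's own HTML parser, as suggested in the comment on the question, is the safer route.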

2 Comments

Actually, do you have any idea what the regular expression would be to cover <span class='class'> as well?
@user433351: This one should work for that already. (Unless it's split across multiple lines?)
