Could someone please help me with a JavaScript regular expression that replaces every <br /> tag found inside a <pre> element with a newline character ("\n")? For example, given a string containing the following:

<pre class="exampleclass">1<br />2<br />3</pre>

Should be returned as (newlines not shown, though I hope you get the idea):

<pre class="exampleclass">1(newline)2(newline)3</pre>

Another example:

<div>foo<br />bar<pre>1<br />2</pre></div>

Returned as:

<div>foo<br />bar<pre>1(newline)2</pre></div>

Note that the class and the element content are dynamic, as is the other content in the string (other divs, etc.). The <br /> tag, however, never varies, so there's no need to cater for <br> or other variants.

NB: I'm working with strings, not HTML elements, in case the way I have presented the question causes any confusion.

  • Graham: In a comment below, you say something that really needs to be in the question above, specifically that there is other content being passed as part of the string outside the pre tags. So from your comment, it sounds as though you're saying another example string would be "<div>foo<br />bar<pre>1<br />2</pre></div>", with the expected result "<div>foo<br />bar<pre>1(newline)2</pre></div>". That completely changes the question. Commented Dec 16, 2010 at 12:55
  • Thanks T.J. Crowder - right you are. Commented Dec 16, 2010 at 13:04
  • By the way - if you are using javascript, you are likely in a browser. If you render the string in a container, you can use DOM to parse it. Commented Dec 16, 2010 at 13:16

5 Answers

You could use

str.match(/<pre(?:.*?)>(?:.*?)<\/pre>/g);

And then, for each match:

replaced = match.replace(/<br \/>/g, '\n');
str = str.replace(match, replaced); // replace() returns a new string, so keep the result

So probably something like this:

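// match() returns null when nothing matches, so fall back to an empty array
var matches = str.match(/<pre(?:.*?)>(?:.*?)<\/pre>/g) || [],
    len = matches.length,
    i;

for (i = 0; i < len; i++) {
    // swap each matched <pre> block for a copy with its <br /> tags converted
    str = str.replace(matches[i], matches[i].replace(/<br \/>/g, '\n'));
}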

EDIT: changed to match <pre class=""> as well.
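
The match-then-replace loop can also be folded into a single replace call that takes a callback. The following is only a sketch of that variation, not part of the original answer: the function name preBrToNewline is made up for illustration, and the pattern uses [\s\S] instead of . on the assumption that a <pre> block may already contain line breaks.

```javascript
// Hypothetical helper: converts <br /> to "\n" only inside <pre>...</pre> blocks.
function preBrToNewline(str) {
    // <pre\b[^>]*> matches <pre> with or without attributes;
    // [\s\S]*? lazily matches any content, including existing newlines.
    return str.replace(/<pre\b[^>]*>[\s\S]*?<\/pre>/g, function (block) {
        // The callback receives each whole <pre> block; only that block is changed.
        return block.replace(/<br \/>/g, '\n');
    });
}

var result = preBrToNewline('<div>foo<br />bar<pre>1<br />2</pre></div>');
// result is '<div>foo<br />bar<pre>1\n2</pre></div>'
```

Using a callback also avoids a subtle issue with passing a string as the replacement: characters such as $& in the <pre> content would otherwise be interpreted as replacement patterns. Nested <pre> elements are not handled by this sketch.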

9 Comments

No sorry Spiny Norman. There are other div tags in the string
but within the <pre> tag you will see the \n as it is, not the new line.
@user160820 This question is to resolve a compatibility issue with Joomla JCE Editor and RJ_InsertCode. Within <pre> tags, newline characters are converted to <br /> upon loading an article (luckily it works fine when saving content)
Ok, then you could first get all occurrences of <pre> tags using /<pre>(.*?)<\/pre>/g, replace the <br /> s out of that and then replace the old <pre> tags with the new ones. There's probably a single regexp you could use for this whole operation, but this should do as well.
That sounds like a great approach Spiny. I'm only familiar with global or single string replacements - could you guide me as to how I could handle multiple occurrences?

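Had it been a document, then:

var allPre = document.getElementsByTagName('pre');
for (var i = 0, n = allPre.length; i < n; i++) {
    // innerHTML is re-serialized by the browser, so accept <br>, <br/>, <br /> and <BR>
    allPre[i].innerHTML = allPre[i].innerHTML.replace(/<br\s*\/?>/gi, "\n");
}

The pattern is case-insensitive and lenient because innerHTML may return <BR> or <br> rather than the original <br />, depending on the browser.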

Have a look here too: Replace patterns that are inside delimiters using a regular expression call

3 Comments

Thank you for your contribution mplungjan, though I need a solution that works with strings (not document HTML elements).
but within the <pre> tag you will see the \n as it is, not the new line.
Not correct user160820, I think you are confusing the pre with xmp

You could use the DOM to do this and avoid trying to parse HTML with regex. However, this will leave you at the mercy of the browser's implementation of innerHTML. For example, IE will return tag names in upper case and will not necessarily close all tags.

See it in action: http://jsfiddle.net/timdown/KYRSU/

var preBrsToNewLine = (function() {
    function convert(node, insidePre) {
        if (insidePre && node.nodeType == 1 && node.nodeName == "BR") {
            node.parentNode.replaceChild(document.createTextNode("\n"), node);
        } else {
            insidePre = insidePre || (node.nodeType == 1 && node.nodeName == "PRE");
            for (var i = 0, children = node.childNodes, len = children.length; i < len; ++i) {
                convert(children[i], insidePre);
            }
        }
    }

    return function(str) {
        var div = document.createElement("div");
        div.innerHTML = str;
        convert(div, false);
        return div.innerHTML;
    }
})();

var str = "<div>foo<br />bar<pre>1<br />2</pre></div>";
window.alert(preBrsToNewLine(str));


I (and others) think it's a bad idea to use regular expressions to parse HTML (or XML). You probably want to use a recursive state machine. Will something like this resolve the issue? There's a lot of room to optimize, but I think it illustrates the idea.

function replace(input, pre) {
    var output = [];
    var tag = null;
    var tag_re = /<(\w+)[^>]*?(\/)?>/; // This is a bit simplistic and will have problems with > in attribute values
    while (tag_re.exec(input)) {
        output.push(RegExp.leftContext);
        input = RegExp.rightContext;
        tag = RegExp.$1;
        if (pre && tag == 'br') {
            output.push('\n');
        } else {
            output.push(RegExp.lastMatch);
        }

        if (!RegExp.$2) {
            // not a self closing tag
            output.push(replace(input, tag=='pre'));
            return output.join('');
        }
    }
    output.push(input);
    return output.join('');
}
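
One thing to watch in the function above: tag_re only matches opening tags, so a closing </pre> never switches the pre state off, and a <br /> that follows </pre> before any other opening tag would still be converted. A non-recursive variation that tracks <pre> depth explicitly might look like the sketch below. The name brToNewlineInPre is hypothetical, and the same simplification applies as above (a > inside an attribute value will confuse it).

```javascript
// Hypothetical sketch: scan tags left to right, tracking how deep we are inside <pre>.
function brToNewlineInPre(input) {
    var tagRe = /<(\/?)(\w+)[^>]*>/g; // captures: optional "/" for closing tags, then the tag name
    var depth = 0;                    // number of currently open <pre> elements

    return input.replace(tagRe, function (tag, closing, name) {
        name = name.toLowerCase();
        if (name === 'pre') {
            // Entering or leaving a <pre>; never let depth go negative on stray </pre>.
            depth = closing ? Math.max(0, depth - 1) : depth + 1;
            return tag;
        }
        // Convert <br ...> only while inside at least one <pre>.
        return (depth > 0 && name === 'br' && !closing) ? '\n' : tag;
    });
}

var converted = brToNewlineInPre('<pre>1<br />2</pre>foo<br />bar');
// converted is '<pre>1\n2</pre>foo<br />bar'
```

Because every tag passes through one callback, text between tags is left untouched and no recursion or RegExp.leftContext bookkeeping is needed.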


I use this type of 'replaceBetween' quite a lot and have this method for it:

function replaceBetween(input, start, end, newText) {
        var reg = new RegExp(start + ".*?" + end, "g");
        var newString = input.replace(reg, start + newText + end);
        return newString;
}
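
As written, replaceBetween swaps everything between the delimiters for newText, so it cannot by itself turn the <br /> tags inside a <pre> into newlines while keeping the rest of the content. A possible adaptation, sketched here under a few assumptions, passes the inner text to a callback instead. The names transformBetween and escapeRegExp are invented for this example, and the delimiters are treated as literal strings, so a <pre> with attributes would need a different start pattern.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical variant of replaceBetween: transform(inner) decides the new inner text.
function transformBetween(input, start, end, transform) {
    // [\s\S]*? lazily captures everything between the delimiters, newlines included.
    var reg = new RegExp(escapeRegExp(start) + '([\\s\\S]*?)' + escapeRegExp(end), 'g');
    return input.replace(reg, function (whole, inner) {
        return start + transform(inner) + end;
    });
}

var out = transformBetween('a<br />b<pre>1<br />2</pre>', '<pre>', '</pre>', function (inner) {
    return inner.replace(/<br \/>/g, '\n');
});
// out is 'a<br />b<pre>1\n2</pre>'
```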
