3

I am trying to replace all occurrences of ???some.text.and.dots??? in an HTML page with a link. I've built this regexp to match them:

\?\?\?([a-z0-9.]*)\?\?\?

However, I would like to exclude any match that is inside a link: "<a ...> ... MY PATTERN ... </a>". I am a little stuck on how to do that; all my attempts so far have failed.

2 Answers

9

It's not really clear what kind of "HTML" you are working with. If it is HTML source held in a string, from an Ajax request for example, then you can use a regular expression that matches either a whole link or the pattern, and decide what to do in a callback:

var html = document.body.innerHTML;
html = html.replace(/(<a\s.*?>.*?<\/a>)|(\?\?\?([a-z0-9.]*)\?\?\?)/g,
    function ( match, link, pattern, key ) {
       // existing links are returned untouched; bare patterns become links
       return link ? match : '<a href="#">' + key + '</a>';
    });
document.body.innerHTML = html;

Conveniently, replace() can take a callback function as a replacement generator rather than a simple string.
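
As a minimal sketch of the callback form, assuming the markup is available as a plain string: the function name linkifyPlaceholders, the sample input, and the "#" href are illustrative only and not part of the original code.

```javascript
// Hypothetical helper: wraps ???key??? placeholders in links, skipping
// any placeholder that already sits inside an <a ...>...</a> element.
function linkifyPlaceholders(html) {
  // Alternative 1 consumes a whole link; alternative 2 matches a placeholder.
  var pattern = /(<a\s[^>]*>[\s\S]*?<\/a>)|\?\?\?([a-z0-9.]*)\?\?\?/g;
  return html.replace(pattern, function (match, link, key) {
    // When the link alternative matched, hand the text back unchanged.
    if (link) {
      return match;
    }
    return '<a href="#">' + key + '</a>';
  });
}

var sample = 'See ???a.b??? and <a href="/x">???c.d???</a>.';
var result = linkifyPlaceholders(sample);
// result: 'See <a href="#">a.b</a> and <a href="/x">???c.d???</a>.'
```

Because the link alternative comes first and consumes the whole element, a placeholder inside a link is never seen by the second alternative.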

If you are working on a live DOM tree, however, you might want to preserve event handlers on nodes and not simply reset the innerHTML. You'll need a somewhat lower-level approach for that:

// returns all descendant text nodes that are not inside an A element
function walker ( node ) {
  var nodes = [];
  for (var c, i = 0; c = node.childNodes[i]; i++) {
    if ( c.nodeType === 1 && c.tagName !== 'A' ) {
      // recurse by name; arguments.callee is not allowed in strict mode
      nodes = nodes.concat( walker( c ) );
    }
    else if ( c.nodeType === 3 ) { 
      nodes.push( c );
    }
  }
  return nodes;
}

var textNodes = walker( document.body );
for (var i = 0; i < textNodes.length; i++) {
  // split the text around the pattern; a single capture group keeps each
  // matched pattern in the array without adding a second entry for its key
  var m = textNodes[i].nodeValue.split( /(\?\?\?[a-z0-9.]*\?\?\?)/ );
  if ( m.length > 1 ) {
    var parent = textNodes[i].parentNode;
    for (var j = 0; j < m.length; j++) {
      var t, match = /^\?\?\?([a-z0-9.]*)\?\?\?$/.exec( m[j] );
      // create a link for any occurrence of the pattern
      if ( match ) {
        var a = document.createElement( 'a' );
        a.href = "#";
        a.textContent = match[1];  // m[j] if you don't want to crop the ???'s
        parent.insertBefore( a, textNodes[i] );
        t = document.createTextNode( ' ' ); // whitespace padding
      }
      else {
        t = document.createTextNode( m[j] );
      }
      parent.insertBefore( t, textNodes[i] );
    }
    // remove original text node
    parent.removeChild( textNodes[i] );
  }
}

This method only touches text nodes, and then only those that match the pattern.
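
The split step depends on how String.prototype.split treats capture groups: every captured group is spliced into the result array. A small sketch of that behaviour follows; the variable names and the sample string are illustrative only.

```javascript
var text = 'foo ???a.b??? bar';

// One capture group around the whole placeholder: each match appears once.
var parts = text.split(/(\?\?\?[a-z0-9.]*\?\?\?)/);
// parts: ['foo ', '???a.b???', ' bar']

// A nested group adds one more entry per match (the inner key), which
// would become a stray text node if every entry were re-inserted.
var nested = text.split(/(\?\?\?([a-z0-9.]*)\?\?\?)/);
// nested: ['foo ', '???a.b???', 'a.b', ' bar']
```

This is why a loop that turns every array entry into a node is safest with exactly one capture group in the split expression.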


2 Comments

Well detailed. Impressive. +1
Well, I was trying to use the regexp look-behind feature, not knowing it wasn't supported by JavaScript. However, your function is excellent for my purposes. Thanks!
0

JavaScript historically had no look-behind support (it was only added in ES2018, so older engines lack it). Without it, you'd need to run .match() and then, for each of your matches, check the surrounding text for your tags (such as /<a\s+.*?>/ coming before your match and </a> after it).
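
A rough sketch of that match-then-check idea, working on a plain string. The helper names isInsideLink and findUnlinkedPlaceholders are hypothetical, and the context check is a simplification: it assumes links are not nested and that an opening tag is written as "<a " or "<a>".

```javascript
// Hypothetical helper: true when the position `index` lies after an
// opening <a> tag that has not yet been closed by </a>.
function isInsideLink(html, index) {
  var before = html.slice(0, index).toLowerCase();
  var lastClose = before.lastIndexOf('</a>');
  var lastOpen = Math.max(before.lastIndexOf('<a '), before.lastIndexOf('<a>'));
  return lastOpen > lastClose;
}

// Collects every placeholder whose match position is outside a link.
function findUnlinkedPlaceholders(html) {
  var pattern = /\?\?\?([a-z0-9.]*)\?\?\?/g;
  var found = [];
  var m;
  while ((m = pattern.exec(html)) !== null) {
    if (!isInsideLink(html, m.index)) {
      found.push({ key: m[1], index: m.index });
    }
  }
  return found;
}

var hits = findUnlinkedPlaceholders(
  '???a.b??? <a href="/x">???c.d???</a> ???e???'
);
// hits contains the keys 'a.b' and 'e'; 'c.d' is skipped.
```

String scanning like this is fragile on real-world markup (comments, attributes containing ">", and so on), so it is best treated as a fallback when walking the DOM is not an option.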

Good luck!!

1 Comment

+1 "Mimicking Lookbehind in Javascript": blog.stevenlevithan.com/archives/mimic-lookbehind-javascript
