I am looking for a way to access JavaScript comments from some (other) JavaScript code. I plan to use this to display low-level help information for elements on the page that call various JS functions, without duplicating that information in multiple places.

mypage.html:

...
<script src="foo.js"></script>
...
<span onclick="foo(bar);">clickme</span>
<span onclick="showhelpfor('foo');">?</span>
...

foo.js:

/**
 * This function does foo.
 * Call it with bar.  Yadda yadda "groo".
 */
function foo(x)
{
    ...
}

I figure I can use getElementsByTagName to grab the script tag, then load the file with an AJAX request to get its plain-text content. However, I'd then need a way to parse the JavaScript reliably (i.e. not a bunch of hacked-together regexps) that preserves the comment text that simply eval'ing it would throw away.
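A rough sketch of that first step, assuming a browser environment with XMLHttpRequest and scripts served from the same origin. The names scriptUrls and loadSource are made up for illustration:

```javascript
// Collect the src attribute of every external <script> tag.
// `doc` is expected to behave like the browser's `document`.
function scriptUrls(doc) {
    var tags = doc.getElementsByTagName('script');
    var urls = [];
    for (var i = 0; i < tags.length; i++) {
        var src = tags[i].getAttribute('src');
        if (src) {
            urls.push(src); // inline scripts have no src and are skipped
        }
    }
    return urls;
}

// Fetch one script as plain text, comments included.
// Calls back with the source text, or null if the request failed.
function loadSource(url, callback) {
    var xhr = new XMLHttpRequest();
    xhr.open('GET', url, true);
    xhr.onreadystatechange = function () {
        if (xhr.readyState !== 4) {
            return; // request not finished yet
        }
        callback(xhr.status === 200 ? xhr.responseText : null);
    };
    xhr.send();
}
```

Usage would be along the lines of scriptUrls(document).forEach(function (url) { loadSource(url, parseIt); }), where parseIt is whatever ends up extracting the comments.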

I was thinking of simply putting the documentation after the function, in a JS string, but that's awkward and I suspect getting doxygen to pick that up would be difficult.

function foo(x) { ... }
foo.comment = "\
This function does foo.\
Call it with bar.  Yadda yadda \"groo\".\
";

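For what it's worth, reading that property back would be simple. A minimal sketch, assuming the documented functions are registered in a lookup object (helpTargets and helpTextFor are names invented for this example, and the body of foo is a placeholder):

```javascript
function foo(x) {
    return x; // placeholder body
}
foo.comment = "This function does foo.\n" +
              "Call it with bar.  Yadda yadda \"groo\".";

// Hypothetical registry of functions that carry a .comment property.
var helpTargets = { foo: foo };

// Return the help text for a function name, or a fallback message.
function helpTextFor(name) {
    var known = Object.prototype.hasOwnProperty.call(helpTargets, name);
    var fn = known ? helpTargets[name] : null;
    if (typeof fn !== 'function' || typeof fn.comment !== 'string') {
        return 'No help available for ' + name + '.';
    }
    return fn.comment;
}
```

showhelpfor('foo') could then display helpTextFor('foo') however the page prefers.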
2 Answers

You could create a little parser that does not parse the complete JS language, but only matches string literals, single- and multi-line comments and, of course, functions.

There's a JS parser generator called PEG.js that could do this fairly easily. The grammar could look like this:

{
var functions = {};
var buffer = '';
}

start
  =  unit* {return functions;}

unit
  =  func
  /  string
  /  multi_line_comment
  /  single_line_comment
  /  any_char

func
  =  m:multi_line_comment spaces? "function" spaces id:identifier {functions[id] = m;}
  /  "function" spaces id:identifier                              {functions[id] = null;}

multi_line_comment
  =  "/*" 
     ( !{return buffer.match(/\*\//)} c:. {buffer += c;} )*               
     {
       var temp = buffer; 
       buffer = ''; 
       return "/*" + temp.replace(/\s+/g, ' ');
     }

single_line_comment
  =  "//" [^\r\n]*

identifier
  =  a:([a-z] / [A-Z] / "_") b:([a-z] / [A-Z] / [0-9] /"_")* {return a + b.join("");}

spaces
  =  [ \t\r\n]+ {return "";}

string
  =  "\"" ("\\" . / [^"])* "\""
  /  "'" ("\\" . / [^'])* "'"

any_char
  =  .

When you parse the following source with the generated parser:

/**
 * This function does foo.
 * Call it with bar.  Yadda yadda "groo".
 */
function foo(x)
{
    ...
}

var s = " /* ... */ function notAFunction() {} ... ";

// function alsoNotAFunction() 
// { ... }

function withoutMultiLineComment() {
}

var t = ' /* ... */ function notAFunction() {} ... ';

/**
 * BAR!
 * Call it?
 */





            function doc_way_above(x, y, z) {
    ...
}

// function done(){};

the start() function of the parser returns the following map:

{
   "foo": "/** * This function does foo. * Call it with bar. Yadda yadda \"groo\". */",
   "withoutMultiLineComment": null,
   "doc_way_above": "/** * BAR! * Call it? */"
}
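The values in that map still carry the comment delimiters and the " * " gutters. A small sketch of how they might be tidied up before display (cleanDocComment is a made-up helper, not part of PEG.js, and it would also strip a literal " * " that happens to occur in the comment text):

```javascript
// Turn "/** * Some text. * More text. */" into "Some text. More text."
function cleanDocComment(raw) {
    if (raw === null || raw === undefined) {
        return ''; // function had no doc comment
    }
    return raw
        .replace(/^\/\*+\s*/, '')      // leading "/**"
        .replace(/\s*\*+\/$/, '')      // trailing "*/"
        .replace(/(^|\s)\*\s/g, '$1')  // per-line "* " gutters
        .replace(/\s+/g, ' ')          // collapse leftover whitespace
        .trim();
}
```

With the map above, cleanDocComment(map.foo) gives the plain help text, and cleanDocComment(map.withoutMultiLineComment) gives an empty string.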

I realize there are some gaps to be filled (like this.id = function() { ... }), but after reading the PEG.js docs a bit, that shouldn't be a big problem (assuming you know a little about parser generators). If it is a problem, post back and I'll add it to the grammar and explain a bit about what's happening there.
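If a parser generator feels like too much, the same idea can be hand-rolled. The following is only a sketch of a hand-written scanner, not the PEG.js approach itself: it recognises the same units as the grammar (strings, both comment styles, named functions) and has the same gaps, plus it knows nothing about regex literals or template strings. The name extractFunctionComments is made up.

```javascript
// Map each "function name" declaration to the block comment directly
// before it (whitespace-collapsed), or to null if there is none.
function extractFunctionComments(source) {
    var functions = {};
    var pending = null; // block comment seen just before the current position
    var i = 0;

    while (i < source.length) {
        var ch = source.charAt(i);
        var next = source.charAt(i + 1);

        if (ch === '/' && next === '*') {
            // Block comment: remember it in case a function follows.
            var end = source.indexOf('*/', i + 2);
            end = end === -1 ? source.length : end + 2;
            pending = source.slice(i, end).replace(/\s+/g, ' ');
            i = end;
        } else if (ch === '/' && next === '/') {
            // Line comment: skip to the end of the line.
            while (i < source.length && !/[\r\n]/.test(source.charAt(i))) {
                i++;
            }
            pending = null;
        } else if (ch === '"' || ch === "'") {
            // String literal: skip it, honouring backslash escapes.
            i++;
            while (i < source.length && source.charAt(i) !== ch) {
                i += source.charAt(i) === '\\' ? 2 : 1;
            }
            i++;
            pending = null;
        } else if (/\s/.test(ch)) {
            i++; // whitespace does not detach a comment from its function
        } else {
            var m = null;
            var startsWord = i === 0 || !/[A-Za-z0-9_$]/.test(source.charAt(i - 1));
            if (startsWord && source.slice(i, i + 8) === 'function') {
                m = /^function\s+([A-Za-z_][A-Za-z0-9_]*)/.exec(source.slice(i));
            }
            if (m) {
                functions[m[1]] = pending;
                i += m[0].length;
            } else {
                i++;
            }
            pending = null; // any other code breaks the comment/function link
        }
    }
    return functions;
}
```

Fed the sample source above, this should produce the same three keys as the grammar, with the functions mentioned inside strings and line comments ignored.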

You can even test the grammar posted above online!

1 Comment

eek, that's a bit more involved than I was hoping for, but it looks like it should do what I need. Thanks!

You could put a unique string identifier at the beginning of every comment; with that marker in place, a simple regex can extract the comment.
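A minimal sketch of that idea, assuming a made-up convention where each help comment starts with "/**@help" followed by the function name. Both the marker and the name extractMarkedComments are assumptions for illustration. Note that a plain regex will also match the marker if it appears inside a string literal.

```javascript
// Collect comments of the form:
//   /**@help foo
//    * Help text...
//    */
// into a map of function name -> help text.
function extractMarkedComments(source) {
    var pattern = /\/\*\*@help\s+([A-Za-z_$][\w$]*)([\s\S]*?)\*\//g;
    var help = {};
    var match;
    while ((match = pattern.exec(source)) !== null) {
        help[match[1]] = match[2]
            .replace(/^\s*\*+ ?/gm, '') // strip the "* " gutter on each line
            .replace(/\s+/g, ' ')       // collapse whitespace
            .trim();
    }
    return help;
}
```

Comments without the marker are ignored, so ordinary block comments elsewhere in the file do not end up in the help map.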
