The best solution I have come up with so far: given a block of text, it finds the method calls that have parameters. It also picks up function definitions with a parameter, such as "get: function(key)", where key is reported as the parameter.

using System.Collections.Generic;
using System.Dynamic;
using System.Text.RegularExpressions;

public class JavaScriptMethodFinder
{
    // Each match is one argument. "Begin" is set on the first argument of a call
    // (it holds the method name); "IsEnd" is set on the argument followed by ")".
    static readonly string pattern = @"(?<=\s(?<Begin>[a-zA-Z_][a-zA-Z0-9_]*?)\(|\G)\s*((['""]).+?(?<!\\)\2|\{[^}]+\}|[^,;'""(){}\)]+)\s*(?:,|(?<IsEnd>\)))";
    private static readonly Regex RegEx = new Regex(pattern, RegexOptions.Compiled);

    public IEnumerable<dynamic> Find(string text)
    {
        dynamic current = null;
        foreach (Match item in RegEx.Matches(text))
        {
            if (item.Groups["Begin"].Success)
            {
                // First argument of a new call: start a new result object.
                current = new ExpandoObject();
                current.MethodName = item.Groups["Begin"].Value;
                current.Parameters = new List<string>();
            }
            else if (current == null)
            {
                // A \G continuation with no open call (e.g. text after a closing ")"); skip it.
                continue;
            }

            // Add the argument exactly once (the earlier version added the last one twice).
            current.Parameters.Add(item.Groups[1].Value);

            if (item.Groups["IsEnd"].Success)
            {
                // Closing parenthesis reached: emit the call and reset.
                yield return current;
                current = null;
            }
        }
    }
}

I want to find method calls and their parameters. Here are two examples.

First Example

function loadMarkers(markers)
{
     markers.push(
            new Marker(
              "Hdsf", 
              40.261330438503,
              10.4877055287361,
              "some text"
            ) 
      );
}

Second Example

var block = new AnotherMethod('literal', 'literal', {"key":0,"key":14962,"key":false,"key":2});

This is what I have so far, tested here: http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx

(?<=Marker\(|\G)\s*((?<name>['""]).+?(?<!\\)\2|\{[^}]+\}|[^,;'""(){}\)]+)\s*(?:,|\))

Found 5 matches:

  • "Hdsf", has 2 groups: "Hdsf" "
  • 40.261330438503, has 2 groups: 40.261330438503
  • 10.4877055287361, has 2 groups: 10.4877055287361
  • "some text" ) has 2 groups: "some text" "
  • ) has 2 groups:

(?<=AnotherMethod\(|\G)\s*((?<name>['""]).+?(?<!\\)\2|\{[^}]+\}|[^,;'""(){}\)]+)\s*(?:,|\))

Found 3 matches:

  • 'literal', has 2 groups: 'literal' ' (name)
  • 'literal', has 2 groups: 'literal' ' (name)
  • {"key":0,"key":14962,"key":false,"key":2}) has 2 groups: {"key":0,"key":14962,"key":false,"key":2} (name)

I would like to combine them into a single expression that gives me this structure:

  • Match<(methodname)>
    • Group : parameter
    • Group : parameter
    • Group : parameter
  • Match<(methodname)>
    • Group : parameter
    • Group : parameter
    • Group : parameter

So when I scan a page that contains both cases, I will get two matches, each with the method name as the first capture and the parameters as the captures that follow.

I have been trying to modify what I already have, but the lookbehind part is too complex for me to understand.

1 Answer

Regexes are a very problematic approach for this type of project. Have you looked at using a genuine JavaScript parser/compiler like Rhino? That will give you full awareness of JavaScript syntax "for free" and the ability to walk your source code meaningfully.
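
To illustrate where a regex tends to struggle, consider arguments that themselves contain commas, such as a nested call or an object literal. The following is a minimal sketch, not a JavaScript parser: it assumes you already have the text between a call's outer parentheses, and it splits that text on top-level commas only by tracking bracket depth and string quotes. The names ArgumentSplitter and Split are hypothetical, and comments, regex literals and template strings are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ArgumentSplitter
{
    // Splits the text between a call's outer parentheses on top-level commas only.
    public static List<string> Split(string argumentList)
    {
        var result = new List<string>();
        var current = new StringBuilder();
        int depth = 0;       // nesting level of (), {} and []
        char quote = '\0';   // active string delimiter, or '\0' when outside a string

        for (int i = 0; i < argumentList.Length; i++)
        {
            char c = argumentList[i];

            if (quote != '\0')
            {
                // Inside a string literal: copy characters until the closing quote.
                current.Append(c);
                if (c == '\\' && i + 1 < argumentList.Length)
                    current.Append(argumentList[++i]);   // keep the escaped character verbatim
                else if (c == quote)
                    quote = '\0';
                continue;
            }

            if (c == '"' || c == '\'') quote = c;
            else if (c == '(' || c == '{' || c == '[') depth++;
            else if (c == ')' || c == '}' || c == ']') depth--;

            if (c == ',' && depth == 0)
            {
                // A comma outside any bracket or string ends the current argument.
                result.Add(current.ToString().Trim());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        // Add the final argument, if there is one.
        string last = current.ToString().Trim();
        if (last.Length > 0)
            result.Add(last);
        return result;
    }
}
```

Calling ArgumentSplitter.Split on the argument text of the second example would keep the object literal together as one argument instead of cutting it at its inner commas.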

1 Comment

It's for scraping a few HTML pages. I would prefer a regex solution rather than using an external library.
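
If the goal is to stay with a regex and no external library, one option that gives the shape asked for in the question (one match per call, with the parameters attached to it) is .NET's capture collections: a group that matches more than once keeps every value in Group.Captures. The following is a minimal sketch, not a drop-in replacement. The class name CallScanner, the method Scan and the group names name and param are placeholders of mine, and the pattern is a simplified variant rather than the original lookbehind/\G expression. It treats anything of the form identifier( as a call, so function(key) is reported too, and it skips calls whose arguments contain nested calls or nested braces.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class CallScanner
{
    // One argument: a double-quoted string, a single-quoted string,
    // a flat {...} literal, or a bare token such as a number or identifier.
    const string Arg = @"""(?:[^""\\]|\\.)*""|'(?:[^'\\]|\\.)*'|\{[^{}]*\}|[^\s,;'""(){}]+";

    // name( arg , arg , ... ) where every argument lands in the same "param" group.
    static readonly Regex CallRegex = new Regex(
        @"\b(?<name>[A-Za-z_]\w*)\(\s*(?:(?<param>" + Arg + @")(?:\s*,\s*(?<param>" + Arg + @"))*)?\s*\)",
        RegexOptions.Compiled);

    public static IEnumerable<(string Name, List<string> Parameters)> Scan(string text)
    {
        foreach (Match m in CallRegex.Matches(text))
        {
            // Captures holds every value the repeated "param" group matched, in order.
            var parameters = m.Groups["param"].Captures
                .Cast<Capture>()
                .Select(c => c.Value)
                .ToList();
            yield return (m.Groups["name"].Value, parameters);
        }
    }

    public static void Main()
    {
        // Second example from the question.
        string sample = @"var block = new AnotherMethod('literal', 'literal', {""key"":0,""key"":14962,""key"":false,""key"":2});";
        foreach (var call in Scan(sample))
            Console.WriteLine(call.Name + ": " + string.Join(" | ", call.Parameters));
    }
}
```

Each Match then corresponds to one call, with Groups["name"] holding the method name and Groups["param"].Captures holding the parameters. String parameters keep their surrounding quotes.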
