Using the following string:

http://www.google.com.ar/setprefs?prev=http://www.google.com.ar/&sig=0_Kxz_cp1G52p8pcrDBlMIQhwJAL0%3D&suggon=2 https://plus.google.com/?gpsrc=ogpy0&tab=wX http://www.google.com.ar/webhp?hl=es&tab=ww http://www.google.com.ar/imghp?hl=es&tab=wi http://video.google.com.ar/?hl=es&tab=wv http://news.google.com.ar/nwshp?hl=es&tab=wn http://translate.google.com.ar/?hl=es&tab=wT https://mail.google.com/mail/?tab=wm http://www.google.com.ar/intl/es/options/ http://books.google.com.ar/bkshp?hl=es&tab=wp http://scholar.google.com.ar/schhp?hl=es&tab=ws http://www.google.com.ar/blogsearch?hl=es&tab=wb https://www.google.com/calendar?tab=wc https://docs.google.com/?tab=wo https://sites.google.com/?tab=w3 http://groups.google.com.ar/grphp?hl=es&tab=wg http://www.google.com.ar/reader/?hl=es&tab=wy http://www.google.com.ar/intl/es/options/ https://accounts.google.com/ServiceLogin?hl=es&continue=http://www.google.com.ar/ http://www.google.com.ar/preferences?hl=es http://www.google.com.ar/preferences?hl=es-419 http://www.google.com.ar/advanced_search?hl=es-419 http://www.google.com.ar/language_tools?hl=es-419 http://www.google.com/history/optout?hl=es http://www.google.com.ar/webhp?hl=es-419 http://www.google.com.ar/support/websearch/bin/answer.py?answer=186645&form=bb&hl=es-419 http://www.google.com.ar/intl/es-419/ads/ http://www.google.com.ar/services/ http://www.google.com.ar/intl/es-419/privacy.html https://plus.google.com/112209395112018703654 http://www.google.com.ar/intl/es-419/about.html http://www.google.com/ncr javascript:void(0)

And this regex:

(http://)(www.){0,1}(google.com.ar)[\S]*

This code:

var result = links.match(new RegExp("(http://)(www.){0,1}(google.com.ar)[\S]*"));
for(var i = 0;i<result.length;i++)
{
    alert(result[i]);
}

Gives me this output:

  1. http://www.google.com.ar
  2. http://
  3. www.
  4. google.com.ar

I have already tested the regex at http://regexpal.com/ and www.regextester.com, and in both cases the highlighted matches are correct, so I guess the problem is with the code. I'm really new to JavaScript, so I can't see the problem.

Thanks in advance

  • ...what are you expecting that isn't happening? Commented Jan 23, 2012 at 2:07
  • "... not working as expected" What exactly did you expect? Commented Jan 23, 2012 at 2:07
  • I expected all the correct matches. For example, the regex should match at least google.com.ar/intl/es-419/about.html, but it does not. Commented Jan 23, 2012 at 2:14

1 Answer

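Use the g flag on your regex. Without it, match returns only the first match followed by its capture groups, which is the four-item output you are seeing. With it, match returns every full match in the string.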

var result = links.match(/http:\/\/(?:www\.)?google\.com\.ar([\S]*)/g);
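
The difference can be sketched roughly as follows. The input here is a shortened version of the string from the question, and the variable names (first, all, broken, fixed) are only illustrative. The last two lines show a second pitfall that probably explains why the first reported match stopped at the host name: inside a string literal "\S" is just the letter S, so the backslash has to be doubled when the pattern is passed to new RegExp as a string.

```javascript
// Shortened sample input (assumption: the full string from the question behaves the same way).
var links = "http://www.google.com.ar/webhp?hl=es&tab=ww " +
            "http://news.google.com.ar/nwshp?hl=es&tab=wn " +
            "http://www.google.com.ar/services/ " +
            "https://plus.google.com/?gpsrc=ogpy0&tab=wX";

// Without g: an array holding the first match, then its capture groups.
var first = links.match(/(http:\/\/)(www\.)?(google\.com\.ar)\S*/);
// first[0] is "http://www.google.com.ar/webhp?hl=es&tab=ww"
// first[1] is "http://", first[2] is "www.", first[3] is "google.com.ar"

// With g: an array of every full match, without capture groups.
var all = links.match(/http:\/\/(?:www\.)?google\.com\.ar\S*/g);
// all holds the two www.google.com.ar links; the news. and plus. links are skipped.

// String-literal pitfall: "\S" collapses to "S" before RegExp ever sees it.
var broken = new RegExp("[\S]*");   // broken.source is "[S]*"
var fixed = new RegExp("[\\S]*");   // fixed.source is "[\S]*"
```

A regex literal such as the one above avoids the doubling entirely, which is one reason to prefer it when the pattern is fixed.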

4 Comments

Thanks, this worked perfectly. But I need the webpage to be a variable. Is there any way to achieve this with the regex "format" I posted?
Well, if you loop through the result array, you can then test each match against the regex again, but without the g flag. This will give you the subpatterns.
This is what I mean: new RegExp("(http://)(www.){0,1}("+ URL +")[\S]*"). I need a simpler way to get the links from any website (not including subdomains such as groups.google.com.ar).
Thanks, I split your pattern in two: http:\\/\\/(?:www\\.)? and ([\\S]*), and then checked: links.match(new RegExp(pattern1+URL+pattern2,'gi'));
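
For anyone landing here later, the approach worked out in these comments might look something like the sketch below. It is not the exact code from the thread: escapeRegExp and buildLinkRegExp are hypothetical helper names, site plays the role of the URL variable mentioned above, and escaping the host is an extra precaution so that the dots in "google.com.ar" match literally.

```javascript
// Hypothetical helper: escape regex metacharacters so the host is matched literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build the pattern around a variable host.
// Every regex backslash is doubled because the pattern is a string literal.
function buildLinkRegExp(host, flags) {
    return new RegExp("http:\\/\\/(?:www\\.)?" + escapeRegExp(host) + "(\\S*)", flags);
}

var site = "google.com.ar";
// Shortened sample input (assumption: stands in for the full string from the question).
var links = "http://www.google.com.ar/services/ " +
            "http://groups.google.com.ar/grphp?hl=es&tab=wg " +
            "http://www.google.com.ar/intl/es-419/about.html";

// Global search: every full match; match returns null when nothing is found.
var matches = links.match(buildLinkRegExp(site, "gi")) || [];

// Re-test each match without g to read the subpattern (the path after the host).
var single = buildLinkRegExp(site, "i");
var paths = [];
for (var i = 0; i < matches.length; i++) {
    paths.push(single.exec(matches[i])[1]);
}
// matches holds the two www.google.com.ar links; the groups. subdomain is skipped.
// paths is ["/services/", "/intl/es-419/about.html"]
```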
