
I've got the following string: |Africa||Africans||African Society||Go Africa Go||Mafricano||Go Mafricano Go||West Africa|.

I am trying to write a regular expression that only matches terms that include the word Africa or any derivative of it (meaning yes to all terms above except for |Mafricano| and |Go Mafricano Go|). Each term is enclosed between two | characters.

Right now I've come up with: /\|[^\|]*africa[^\|]*\|/gi, which says:


  1. \| Match |

  2. [^\|]* Match zero to unlimited instances of any character except |

  3. africa Match africa literally

  4. [^\|]* Match zero to unlimited instances of any character except |

  5. \| Match |
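To make the starting point concrete, here is a minimal sketch that runs the pattern above against the sample string. The variable names `str` and `current` are illustrative and not part of the original post.

```javascript
const str = "|Africa||Africans||African Society||Go Africa Go||Mafricano||Go Mafricano Go||West Africa|";

// The pattern from the question: any pipe-delimited term containing "africa".
const current = str.match(/\|[^\|]*africa[^\|]*\|/gi);

console.log(current.length);                  // 7: every term matches
console.log(current.includes("|Mafricano|")); // true: the unwanted term is included
```

Nothing in the pattern constrains the character before africa, which is why the two Mafricano terms come through.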

I've tried inserting ((?:\s)|(?!\w)) to make it /\|[^\|]*((?:\s)|(?!\w))africa[^\|]*\|/gi. Although this excludes |Mafricano| and |Go Mafricano Go|, it also excludes every other entry except |West Africa| and |Go Africa Go|. That is a step forward, but the pattern also needs to match Africa as a single word, along with its derived forms.
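A short sketch of why the attempt behaves this way, assuming a standard JavaScript regex engine: the (?!\w) alternative looks at the character that follows the current position, which is always the a of africa, so it can never succeed there. Only the \s alternative can match, which leaves just the terms where a space precedes Africa. The names `attempt` and `attemptMatches` are illustrative.

```javascript
const str = "|Africa||Africans||African Society||Go Africa Go||Mafricano||Go Mafricano Go||West Africa|";

// The attempted pattern from the question.
const attempt = /\|[^\|]*((?:\s)|(?!\w))africa[^\|]*\|/gi;
const attemptMatches = str.match(attempt);
console.log(attemptMatches); // ["|Go Africa Go|", "|West Africa|"]

// (?!\w) inspects the NEXT character, which is the "a" of "africa",
// so this alternative fails wherever "africa" follows.
console.log(/(?!\w)africa/i.test("Africa")); // false

// A word boundary (\b) also considers the PREVIOUS character.
console.log(/\bafrica/i.test("|Africa|"));    // true
console.log(/\bafrica/i.test("|Mafricano|")); // false
```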

Can anybody help me?

  • Why don't you use replace? It should be much easier. Commented Oct 18, 2014 at 8:33
  • @JhKaiz, @AvinashRaj: I need an array of all matched terms, so I'll be using the match() function. Commented Oct 18, 2014 at 8:36

2 Answers


You can use this regex:

[^|]*\bAfrica[a-z]*\b[^|]*


var str = "|Africa||Africans||African Society||Go Africa Go||Mafricano||Go Mafricano Go||West Africa|";
var arr = str.match(/[^|]*\bAfrica[a-z]*\b[^|]*/g);
console.log(arr); // ["Africa", "Africans", "African Society", "Go Africa Go", "West Africa"] 
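As an alternative sketch (not the approach of this answer), the same result can be reached by splitting on the delimiter and testing each term separately. This assumes the terms never contain | themselves and that matching should be case-insensitive; the names `terms` and `matches` are illustrative.

```javascript
const str = "|Africa||Africans||African Society||Go Africa Go||Mafricano||Go Mafricano Go||West Africa|";

// Split on the pipe delimiter and drop the empty strings between "||".
const terms = str.split("|").filter(Boolean);

// Keep only terms in which "africa" begins at a word boundary.
const matches = terms.filter((term) => /\bafrica/i.test(term));

console.log(matches);
// ["Africa", "Africans", "African Society", "Go Africa Go", "West Africa"]
```

With the i flag on, it is the \b that rejects Mafricano, since M and a are both word characters and no boundary lies between them.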


I think you want something like this:

\|(?:(?!Mafrica|\|).)*?africa(?:(?!Mafrica|\|).)*?\|


> "|Africa||Africans||African Society||Go Africa Go||Mafricano||Go Mafricano Go||West Africa|".match(/\|(?:(?!Mafrica|\|).)*?africa(?:(?!Mafrica|\|).)*?\|/gi);
[ '|Africa|',
  '|Africans|',
  '|African Society|',
  '|Go Africa Go|',
  '|West Africa|' ]

Don't forget to turn on the i modifier to do a case-insensitive match.

Explanation:

\|                       '|'
(?:                      group, but do not capture (0 or more
                         times):
  (?!                      look ahead to see if there is not:
    Mafrica                  'Mafrica'
   |                        OR
    \|                       '|'
  )                        end of look-ahead
  .                        any character except \n
)*?                      end of grouping
africa                   'africa'
(?:                      group, but do not capture (0 or more
                         times):
  (?!                      look ahead to see if there is not:
    Mafrica                  'Mafrica'
   |                        OR
    \|                       '|'
  )                        end of look-ahead
  .                        any character except \n
)*?                      end of grouping
\|                       '|'
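The pattern explained above rejects only the literal prefix Mafrica. A hedged sketch of a more general variant replaces the hard-coded lookahead with a word boundary, assuming that "derivative" means africa must start a word. The names `generalized`, `withPipes`, and `terms` are illustrative.

```javascript
const str = "|Africa||Africans||African Society||Go Africa Go||Mafricano||Go Mafricano Go||West Africa|";

// \b requires "africa" to start a word, whatever letters might precede it elsewhere.
const generalized = /\|[^|]*\bafrica[^|]*\|/gi;

// match() returns null when nothing matches, so fall back to an empty array.
const withPipes = str.match(generalized) || [];

// Strip the enclosing pipes from each match.
const terms = withPipes.map((m) => m.slice(1, -1));

console.log(terms);
// ["Africa", "Africans", "African Society", "Go Africa Go", "West Africa"]
```

Each match consumes both of its enclosing pipes, so this relies on every term having its own pair of delimiters, as in the sample string.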

3 Comments

The entries are way more diverse and more random than Mafricano, so this would not scale for other entries. Thanks though.
My apologies for that. I thought that was implied.
You're welcome. Next time, don't forget to post the question with a detailed explanation from the start.
