I want to match a word in a paragraph. I do not want to use a word boundary (\b) because it mishandles Unicode characters (şöüİıçğ), so I use a regex like the one below. It throws an "Invalid group" error. Can anyone help?

var paragraphy= "Bu örnek bir metindir <span>bu</span> metin; test amaçlı yazılmıştır.";
var word="metin;";
var regex = new RegExp("([\\s>]|^)("+word+")(?=([\\.\\,\\;\\?\\!](?=[\\s<])|(?<![\\.\\,\\;\\?\\!])[<\\s]|$))", "gi");
console.log(paragraphy.match(regex));

I want this result: ["metin"]

  • (?<!...) is a negative lookbehind and JS doesn't support it. Commented Apr 26, 2016 at 14:08
  • @anubhava Well, what can I do? Commented Apr 26, 2016 at 14:12
  • What's the expected output for the above input? Commented Apr 26, 2016 at 14:13
  • @anubhava I want this result: ["metin"], so only the word without the punctuation. Commented Apr 26, 2016 at 14:15
  • Will (?:[\s>]|^)(metin)(?=[.,;?!]?(?:[<\s]|$)) work for you? Commented Apr 26, 2016 at 14:21

2 Answers

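You can simplify the boundary check with a ([\\s>]|^) group before the word and a (?=[.,;?!\\s<]) lookahead after it. Also, because the regex uses the global flag and defines capture groups, and you need to read one of those groups after matching, it is better to call RegExp#exec() inside a loop: String#match() with the g flag returns only the full matches (which here include the leading whitespace or >), not the groups.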

Also, if the search word itself contains punctuation, you should strip it first. If the punctuation only appears at the end of the word, pre-process it with word = word.replace(/[,.;?!<]+$/, '').

var paragraphy = "Bu örnek bir metindir <span>bu</span> metin; test amaçlı yazılmıştır.";
var word="metin;";
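// Strip trailing punctuation from the search word before building the pattern.
var regex = new RegExp("([\\s>]|^)("+word.replace(/[,.;?!<]+$/, '')+")(?=[.,;?!\\s<])", "gi");
// $1 restores the leading boundary character; $2 is the matched word itself.
var res = paragraphy.replace(regex, '$1<span>$2</span>');
document.body.innerHTML = "<pre>" + res + "</pre>";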
span {
  color: #FF0000;
}
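
The RegExp#exec() loop mentioned above could look roughly like the sketch below. The helper names findWord and escapeRegExp are hypothetical (they are not part of the question or of any library), and the `|$` alternative in the lookahead is an added assumption so that a word at the very end of the text also matches.

```javascript
// Hypothetical helper: escape regex metacharacters in user-supplied text.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: collect every stand-alone occurrence of `word` in `text`.
function findWord(text, word) {
  // Strip trailing punctuation from the search word, as described above.
  var bare = word.replace(/[,.;?!<]+$/, "");
  if (bare === "") {
    return []; // nothing left to search for; also avoids zero-length matches
  }
  var regex = new RegExp(
    "([\\s>]|^)(" + escapeRegExp(bare) + ")(?=[.,;?!\\s<]|$)",
    "gi"
  );
  var hits = [];
  var m;
  // With the g flag, each exec() call resumes after the previous match.
  while ((m = regex.exec(text)) !== null) {
    hits.push(m[2]); // group 2 is the word without the boundary character
  }
  return hits;
}

var sampleText = "Bu örnek bir metindir <span>bu</span> metin; test amaçlı yazılmıştır.";
var found = findWord(sampleText, "metin;");
console.log(found); // found is ["metin"]
```

Reading group 2 from each exec() result is what leaves out the leading space or > that the full match contains.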

3 Comments

It must be word="metin;" and the result should be only the word "metin", without the punctuation.
When you search for metin; there is no way to get back just metin. Regexes do not work that way: you must pre-process the pattern before searching.
I am trying to do this: paragraphy.replace(regex, '<span>metin</span>'); and the final result should be: paragraphy= "Bu örnek bir metindir <span>bu</span> <span>metin</span>; test amaçlı yazılmıştır.";

Based on the discussion in the comments under your question, you can use this replace:

    var word = "metin";

    var re = new RegExp("(^|[\\s>])(" + word + ")[.,;?!]?(?=[\\s<]|$)", "gi");

    var str = 'Bu örnek bir metindir <span>bu</span> metin; test amaçlı yazılmıştır';
     
    var result = str.replace(re, '$1<span>$2</span>');

    alert(result);

//=> Bu örnek bir metindir <span>bu</span> <span>metin</span> test amaçlı yazılmıştır
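
Note that the replace above consumes the optional punctuation mark, so the semicolon is dropped from the output. If the punctuation should stay in the text, one option is to move it into the lookahead, as in the pattern suggested in the comments under the question. This is a minimal sketch: highlightWord is a hypothetical helper name, and it assumes the word contains no regex metacharacters.

```javascript
// Hypothetical helper: wrap each stand-alone occurrence of `word` in a <span>,
// leaving any punctuation that follows the word untouched.
function highlightWord(text, word) {
  // The optional punctuation sits inside the lookahead, so it is checked
  // but not consumed, and therefore survives the replacement.
  var re = new RegExp(
    "(^|[\\s>])(" + word + ")(?=[.,;?!]?(?:[\\s<]|$))",
    "gi"
  );
  return text.replace(re, "$1<span>$2</span>");
}

var input = "Bu örnek bir metindir <span>bu</span> metin; test amaçlı yazılmıştır.";
var highlighted = highlightWord(input, "metin");
console.log(highlighted);
// Bu örnek bir metindir <span>bu</span> <span>metin</span>; test amaçlı yazılmıştır.
```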

1 Comment

Thank you for the help. I guess I need to reconsider the whole script, @anubhava.
