RegExp not working as expected while removing some string

Question

I am extracting some data from some websites and would like to remove some unnecessary text from it.

So I wrote some parsers that can clean up the parsed content before presenting it to the users.

Here is my test code.

// Tried using this, but it still did not work
function escapeRegex(string) {
  return string.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

var div = document.getElementById("content");
var txArray = ["If you find any errors ( broken links, non-standard content, etc.. ), Please let us know < report chapter > so we can fix it as soon as possible.", "KobatoChanDaiSuki"]
txArray.forEach(x => {
  var reg = new RegExp(escapeRegex(x), "gi");
  div.innerHTML = div.innerHTML.replace(reg, "");
});

<div id="content">
Hyung : Big/older brother. Kind of an equivalent to the japanese “onii-san” but only used between male (male to male).
Translator :Pumba
TL Check : KobatoChanDaiSuki
 If you find any errors ( broken links, non-standard content, etc.. ), Please let us know < report chapter > so we can fix it as soon as possible.
</div>

As you can see above, it is not removing all of the content. Why is that?

Maybe I need to break up the long string and then try to clean it? I really do not know. What do you think?

edemaine · Accepted Answer · 2021-08-06 15:06:21Z


The problem is that (, ), and . have special meanings in JavaScript regular expressions. An additional problem is that < and > are written as &lt; and &gt; respectively in innerHTML, so a pattern containing the literal characters never matches the markup. innerText avoids this problem. (I figured this out by adding console.log(div.innerHTML) to look at the contents; see the snippet below.)
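To illustrate the entity issue outside a browser, here is a minimal sketch. `serializeText` is a hypothetical stand-in that mimics how innerHTML escapes &, < and > in text; it is an assumption for illustration, not a DOM API. `escapeRegex` is the helper from the question.

```javascript
// Hypothetical stand-in for how innerHTML serializes text content:
// "&", "<" and ">" become entities. This is not a real DOM API.
function serializeText(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Same escaping helper as in the question.
function escapeRegex(string) {
  return string.replace(/[-\/\\^$*+?.()|[\]{}]/g, "\\$&");
}

const phrase = "Please let us know < report chapter > so we can fix it";
const reg = new RegExp(escapeRegex(phrase), "i");

const asText = "TL Check. " + phrase + ".";
const asHtml = serializeText(asText);

console.log(reg.test(asText)); // the plain text contains the phrase
console.log(reg.test(asHtml)); // the markup contains &lt; and &gt; instead
```

The pattern itself is fine in both cases; only the string it is tested against differs.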

Try this:

var txArray = ["If you find any errors \\( broken links, non-standard content, etc\\.\\. \\), Please let us know < report chapter > so we can fix it as soon as possible\\.", "KobatoChanDaiSuki"]
txArray.forEach(x => {
  var reg = new RegExp(x, "gi");
  div.innerText = div.innerText.replace(reg, "");
});

Or you can write code to escape your regular expressions, as in the following:

var reg = new RegExp(x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'), "gi");

var div = document.getElementById("content");
var txArray = ["If you find any errors \\( broken links, non-standard content, etc\\.\\. \\), Please let us know < report chapter > so we can fix it as soon as possible\\.", "KobatoChanDaiSuki"];

txArray.forEach(x => {
  var reg = new RegExp(x, "gi");
  console.log(div.innerHTML);
  div.innerText = div.innerText.replace(reg, "");
});

<div id="content">
Hyung : Big/older brother. Kind of an equivalent to the japanese “onii-san” but only used between male (male to male).
Translator :Pumba
TL Check : KobatoChanDaiSuki
 If you find any errors ( broken links, non-standard content, etc.. ), Please let us know < report chapter > so we can fix it as soon as possible.
</div>
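Combining the two ideas (escape the phrases in code, and match against plain text rather than markup) could look like the following sketch. `removePhrases` is a hypothetical helper name, not part of the original code, and the sample string is a shortened copy of the div contents so that the example runs without a DOM.

```javascript
// Escape every regex metacharacter so the phrase is matched literally.
function escapeRegex(string) {
  return string.replace(/[-\/\\^$*+?.()|[\]{}]/g, "\\$&");
}

// Hypothetical helper: remove each phrase (case-insensitively) from plain text.
function removePhrases(text, phrases) {
  return phrases.reduce(
    (result, phrase) => result.replace(new RegExp(escapeRegex(phrase), "gi"), ""),
    text
  );
}

// The phrases are written exactly as they appear on the page, with no manual backslashes.
const txArray = [
  "If you find any errors ( broken links, non-standard content, etc.. ), Please let us know < report chapter > so we can fix it as soon as possible.",
  "KobatoChanDaiSuki",
];

const sample =
  "Translator :Pumba\n" +
  "TL Check : KobatoChanDaiSuki\n" +
  " If you find any errors ( broken links, non-standard content, etc.. ), Please let us know < report chapter > so we can fix it as soon as possible.\n";

const cleaned = removePhrases(sample, txArray);
console.log(JSON.stringify(cleaned));

// In a browser, apply it to the element's text rather than its markup:
//   var div = document.getElementById("content");
//   div.innerText = removePhrases(div.innerText, txArray);
```

Note that assigning to innerText replaces the element's children with plain text, so any nested markup inside the div would be lost; that is acceptable here only because the div holds text alone.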

edited Aug 6, 2021 at 15:06

answered Aug 6, 2021 at 14:54

edemaine


5 Comments

Alen.Toma Over a year ago

Not working. I tried what you wrote and it's not working. Please try to create a simple runnable snippet here on Stack Overflow.

Alen.Toma Over a year ago

See above: I updated the code as you said, and it is still not working.

edemaine Over a year ago

Ah, < and > are causing trouble too. I've updated my answer.

Alen.Toma Over a year ago

I do not want to write `\` manually. Could you fix the `escapeRegex` function I wrote instead, so I can escape those unwanted characters in txArray?

edemaine Over a year ago

That code should work if you switch from innerHTML to innerText.
