<span class="X">(&#8237;−&#8237;500&#8236;&#8236;</span>

I get the innerHTML from this span: var abc = document.querySelector("SELECTOR").innerHTML

It displays as "(-500", but when I copy it into Notepad it comes with invisible Unicode characters. How can I get the innerHTML as the plain text "-500", without the invisible characters and without the "("?

  • Who is inserting this text into the span? Commented Feb 22, 2020 at 23:58
  • Note that all characters are Unicode. These are just entities. Commented Feb 23, 2020 at 0:01
  • I'm scraping with puppeteer/nodejs. Commented Feb 23, 2020 at 0:03

1 Answer

You have to explicitly remove the invisible Unicode characters and convert some Unicode characters into their ASCII equivalents:

let x = document.querySelector('.X').innerHTML; // the class in the question is "X"; class selectors are case-sensitive
x = x.replace(/\u202d/g, '');  // U+202D = 8237, "LEFT-TO-RIGHT OVERRIDE"
x = x.replace(/\u2212/g, '-'); // U+2212 = 8722, "MINUS SIGN"
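The same idea can be wrapped in a small function that also covers the closing U+202C and the "(" from the question. This is only a sketch: the name cleanNumberText is made up for illustration, and the set of bidirectional control characters it strips (U+200E, U+200F, U+202A to U+202E, U+2066 to U+2069) is an assumption about what might show up, not something the page is known to contain.

```javascript
// Hypothetical helper; the name and the exact character set are assumptions.
function cleanNumberText(text) {
  return text
    // Remove bidirectional control characters (marks, embeddings, overrides, isolates).
    .replace(/[\u200e\u200f\u202a-\u202e\u2066-\u2069]/g, '')
    // Map the Unicode MINUS SIGN (U+2212) to the ASCII hyphen-minus.
    .replace(/\u2212/g, '-')
    // Drop the parentheses.
    .replace(/[()]/g, '');
}

// The span content from the question, written with escapes so the invisible characters are visible.
const raw = '(\u202d\u2212\u202d500\u202c\u202c';
const cleaned = cleanNumberText(raw);
console.log(cleaned);         // -500
console.log(Number(cleaned)); // -500
```

Because the function takes a plain string, it should work the same way on a value read in the browser or on one returned to Node from a scraper.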

2 Comments

This is great: x = x.replace(/\u202d/g, '') and x = x.replace(/\u202c/g, '') remove the &#8237; and &#8236; characters. Is it possible to do this in a single replace?
This works for what I need :D : x = x.replace(/\u202c|\u202d|\(/g, ''); thank you!
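A single replace can also handle the minus sign if the replacement is a function instead of a fixed string. The sketch below is one possible way to do it, not the answerer's method; SUBSTITUTES and cleanInOnePass are hypothetical names.

```javascript
// Characters that should be rewritten rather than deleted (hypothetical lookup table).
const SUBSTITUTES = { '\u2212': '-' };

// One pass: every matched character is either mapped via SUBSTITUTES or removed.
function cleanInOnePass(text) {
  return text.replace(/[\u202c\u202d\u2212(]/g, (ch) => SUBSTITUTES[ch] || '');
}

console.log(cleanInOnePass('(\u202d\u2212\u202d500\u202c\u202c')); // -500
```

Inside a character class the "(" needs no backslash, which avoids the escaping problem that an alternation such as \u202c|\u202d|\( has to deal with.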