In JavaScript, I would like to remove from text_string every character that is not a letter (in any language) or a number, while keeping the # character. I can apply each condition individually, but how can I combine them in ONE expression so that all conditions hold at the same time?

var text_string = '!#Ab+Z1_↕.🍏2ü翻訳';
text_string = text_string.replace(/\P{Letter}/gu, ''); 
text_string = text_string.replace(/\P{Number}/gu, ''); 
text_string = text_string.replace(/[^#]/, ''); 
// desired result: #AbZ12ü翻訳
  • If I understand correctly: text_string.replace(/[^(a-zA-Z0-9#)]/g, ''); Commented Sep 13, 2022 at 11:22
  • No. That does not work for languages that do not use Latin characters. Commented Sep 13, 2022 at 16:58

1 Answer

You can search with this regex, using the unicode (u) flag so that \p{...} property escapes are recognized:

[^\p{Letter}\p{Number}#]+

and replace each match with an empty string.

Code:

const regex = /[^\p{Letter}\p{Number}#]+/gu;

// Alternative syntax using RegExp constructor
// const regex = new RegExp('[^\\p{Letter}\\p{Number}#]+', 'gu')

const str = `!#Ab+Z1_↕.🍏2ü翻訳`;

const result = str.replace(regex, '');

console.log(result);

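RegEx Breakdown:

  • [^\p{Letter}\p{Number}#]+: A negated character class that matches one or more characters that are not #, not a Unicode letter, and not a Unicode number.

Remember that \P{something} is the inverse of \p{something}: \p{Letter} matches any letter, while \P{Letter} matches any character that is not a letter.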
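
To illustrate why the two conditions have to live inside one negated character class, here is a minimal sketch that compares the chained approach from the question with the combined pattern. The helper name keepLettersNumbersAndHash is made up for this example; only the regex comes from the answer.

```javascript
// Hypothetical helper name; the pattern is the one from the answer.
function keepLettersNumbersAndHash(text) {
  // Remove every run of characters that is neither a letter, a number, nor '#'.
  return text.replace(/[^\p{Letter}\p{Number}#]+/gu, '');
}

const input = '!#Ab+Z1_↕.🍏2ü翻訳';

// Chaining the two negated properties removes everything:
// \P{Letter} strips the digits (and all symbols), then
// \P{Number} strips the letters that were left over.
const chained = input
  .replace(/\P{Letter}/gu, '')
  .replace(/\P{Number}/gu, '');

console.log(JSON.stringify(chained));          // ""
console.log(keepLettersNumbersAndHash(input)); // #AbZ12ü翻訳
```

Inside [^...] the listed items are combined with "or" before the whole set is negated, so a character is removed only if it fails all three tests at once.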

2 Comments

What would the regex be if there were an additional requirement that the hash mark is only allowed in position 1? At all other positions the hash mark should be removed as well.
That would be: [^\p{Letter}\p{Number}#]+|(?<!^)#
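
A short sketch of that variant, assuming a JavaScript engine with lookbehind support (ES2018, for example a current Node.js release). The function name keepLeadingHashOnly is hypothetical; the pattern is the one from the comment above.

```javascript
// Pattern from the comment: strip disallowed characters, plus any '#'
// that is not at the very start of the string.
const leadingHashPattern = /[^\p{Letter}\p{Number}#]+|(?<!^)#/gu;

// Hypothetical helper name, used only for this illustration.
function keepLeadingHashOnly(text) {
  return text.replace(leadingHashPattern, '');
}

console.log(keepLeadingHashOnly('#Ab#Z1_2')); // #AbZ12
console.log(keepLeadingHashOnly('!#Ab#Z1'));  // AbZ1
```

Note that (?<!^) tests the position in the original string, so in the second call the hash sits at index 1 behind the "!" and is removed along with it.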
