3

A backslash in a JavaScript string doesn't seem to count toward the length, and it disappears together with the number after it when the string is printed. I tried to google it, but I'm not quite sure what to call it. What is going on in this code?

var b = "world\0";
alert(b.length)                        // 6
console.log( "world\0" );       // world  (we see here only 5 characters!)

Why so?

5
  • because \ is an escape character Commented Feb 12, 2018 at 7:32
  • The question is unclear. And the null character isn't supposed to disappear; how come you see nothing? It normally appears as a rectangle: i.imgur.com/od4wdED.png Commented Feb 12, 2018 at 7:35
  • Your title and question don't match up. Commented Feb 12, 2018 at 7:36
  • Guys, we don't need so many identical answers. Commented Feb 12, 2018 at 7:38
  • Sorry for the inconvenience. I couldn't put it all together in a question because there are a lot of details to take into consideration before asking, and I didn't know what to call it. That's why I used what I knew at that moment: "string and back slash behaves very strange" would sound even worse, in my opinion. Commented Feb 12, 2018 at 8:27

3 Answers

4

\0 is an escape sequence that creates a single character (just like \t creates a single character [a tab], \n creates a single newline, etc.) The character's character code is 0 (vs. 9 for tab or 10 for newline). So the string really has six characters in it:

w (U+0077)
o (U+006F)
r (U+0072)
l (U+006C)
d (U+0064)
\0 (U+0000)

On most consoles, you'll see character 0 as a little box or a diamond with a question mark in it, but some may not show it at all (Node.js's console for instance, at least as of this writing).
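To check this without relying on how a particular console renders character 0, here is a minimal sketch (the variable names are illustrative, not from the question) that lists the numeric value of each character in the string:

```javascript
// The string from the question: five letters plus the null character.
const s = "world\0";

// Map each character to its numeric character code.
const units = Array.from(s, (ch) => ch.charCodeAt(0));

console.log(s.length);          // 6
console.log(units);             // [119, 111, 114, 108, 100, 0]
console.log(s[5] === "\u0000"); // true
```

The last element being 0 shows that the sixth character is present even when the console draws nothing for it.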

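I should note that \0 producing U+0000 is a special case. \1 through \7 are legacy octal escapes: they are a syntax error in strict mode and in template literals, and are only tolerated in sloppy-mode string literals, so don't rely on them. Perhaps straying somewhat from the topic, but if you want to specify a character by its character code, you have three choices: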

  • If the character fits in a single UTF-16 code unit whose value fits in two hex digits, you can use a hex escape sequence: \xXX where XX is exactly two hex digits.

  • If the character fits in a single UTF-16 code unit (regardless of whether that value fits in two digits), you can also use a Unicode code unit escape sequence: \uXXXX where XXXX is exactly four hex digits.

  • If the character requires two UTF-16 code units (a "surrogate pair"), you can either use two Unicode code unit escapes, or the newer (ES2015+) Unicode code point escape sequence: \u{X+} where X+ is one or more hex digits whose value is <= 0x10FFFF (the maximum code point in Unicode at present).

(If you're unfamiliar with the terms "code unit" and "code point," I've written up this blog post about it and about JavaScript strings in general.)

For example, we can write the English capital letter A as \x41, \u0041, or \u{41} because it needs just a single UTF-16 code unit and that code unit's value fits in two hex digits. We can write the Latin capital A-with-macron (Ā) as \u0100 or \u{100} because it, too, requires only a single code unit, but we can't use a hex escape sequence (\xXX) because the code unit's value doesn't fit in two hex digits. The emoji 😉 requires two code units; we can write it either by writing out those two code units (\uD83D\uDE09) or by writing a single code point (\u{1F609}). We can't use \xXX to write it, because each of its code units is too large for two hex digits; that's true of all UTF-16 code units that are part of a surrogate pair, because of the way UTF-16 is defined.

Live Examples:

// The letter A
const a1 = "A";
const a2 = "\x41";
const a3 = "\u0041";
const a4 = "\u{41}";
console.log(a1, a2, a3, a4, a1 === a2, a2 === a3, a3 === a4);

// The letter Ā (A-with-macron)
const m1 = "Ā";
const m2 = "\u0100";
const m3 = "\u{100}";
console.log(m1, m2, m3, m1 === m2, m2 === m3);

// The emoji 😉
const e1 = "😉";
const e2 = "\uD83D\uDE09";
const e3 = "\u{1F609}";
console.log(e1, e2, e3, e1 === e2, e2 === e3);
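As a follow-on sketch, assuming an ES2015+ environment (the variable name `wink` is illustrative): `length` counts UTF-16 code units rather than code points, so the emoji above reports a length of 2 even though it is a single code point.

```javascript
// One code point, two UTF-16 code units (a surrogate pair).
const wink = "\u{1F609}";

console.log(wink.length);      // 2 (code units)
console.log([...wink].length); // 1 (string iteration goes by code point)

// The full code point, and the two code units that encode it.
console.log(wink.codePointAt(0).toString(16)); // "1f609"
console.log(wink.charCodeAt(0).toString(16));  // "d83d"
console.log(wink.charCodeAt(1).toString(16));  // "de09"

// Building the same string from its numeric code point.
console.log(String.fromCodePoint(0x1F609) === wink); // true
```

This is the same distinction that makes "world\0" six characters long: `length` reports how many code units are stored, regardless of how (or whether) they are displayed.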


2 Comments

Indeed, I checked on Node.js console... Thank you for clarification!
This is a great explanation - seeing \u0000 and how it's counted as a single character makes sense
1

Here \ is an escape character, so it is not counted on its own toward the string's length.

So "world\0" has 6 characters: w, o, r, l, d, \0

You don't see \0 in the output because it is a null character, which has no visible glyph, but it is still there in the string.
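One way to make the hidden character visible, as a rough sketch (the variable name `hidden` is illustrative): `JSON.stringify` writes control characters as escape sequences, so the null character shows up in the printed text.

```javascript
const hidden = "world\0";

// JSON.stringify escapes control characters, so U+0000 becomes visible.
console.log(JSON.stringify(hidden)); // "world\u0000"

// Removing the null character leaves the five visible letters.
const visibleOnly = hidden.replace(/\0/g, "");
console.log(hidden.length, visibleOnly.length); // 6 5
```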

1 Comment

Thank you! So \0 is the escape sequence for the null character.
0

Special Characters In Javascript

Because strings must be written within quotes, JavaScript will misunderstand this string:

var x = "We are the so-called "Vikings" from the north.";

The string will be chopped to "We are the so-called ".

The solution to avoid this problem is to use the backslash escape character.

The backslash (\) escape character turns special characters into string characters:

  • \' -> output '
  • \" -> output "
  • \\ -> output \

sample:

var x = "We are the so-called \"Vikings\" from the north.";

That's why the single backslash character will not count.

Source: https://www.w3schools.com/js/js_strings.asp
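A quick check of the point above, as a sketch that is not taken from the cited page (the variable name `quoted` is illustrative): each escape sequence contributes exactly one character, and the backslash itself never ends up in the string.

```javascript
const quoted = "We are the so-called \"Vikings\" from the north.";

// The quotes are in the string; the backslashes are not.
console.log(quoted);               // We are the so-called "Vikings" from the north.
console.log(quoted.indexOf("\\")); // -1

// Each escape sequence yields a single character.
console.log("\"".length, "\'".length, "\\".length); // 1 1 1
```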
