
I use UTF-8 and have to store a constant in a char array:

const char s[] = {0xE2,0x82,0xAC, 0}; //the euro sign

However, it gives me an error:

test.cpp:15:40: error: narrowing conversion of ‘226’ from ‘int’ to ‘const char’ inside { } [-fpermissive]

I have to cast all the hex numbers to char, which feels tedious and doesn't smell good. Is there a more proper way of doing this?

  • @AaronMcDaid Look at my first sentence? Commented Oct 31, 2013 at 19:55
  • Why not const char s[] = u8"\u20AC";? Commented Oct 31, 2013 at 19:56
  • As @KerrekSB mentioned, but it's a C++11 feature. Commented Oct 31, 2013 at 19:57

4 Answers

38

char may be signed or unsigned (and the default is implementation specific). You probably want

  const unsigned char s[] = {0xE2,0x82,0xAC, 0}; 

or

  const char s[] = "\xe2\x82\xac";

or with many recent compilers (including GCC)

  const char s[] = "€";

(a string literal is an array of char unless you give it some prefix)

See -funsigned-char (or -fsigned-char) option of GCC.

On some implementations char is unsigned, so CHAR_MAX is 255 and CHAR_MIN is 0. On others char is signed, so CHAR_MIN is -128 and CHAR_MAX is 127 (for example, the defaults differ between Linux/PowerPC/32 bits and Linux/x86/32 bits). AFAIK nothing in the standard prohibits 19-bit signed chars.
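As a minimal sketch of the points above (the names plain_char_is_signed, euro_bytes, euro_escaped and same_bytes are made up for illustration), the following checks which signedness the compiler picked for plain char and confirms that the unsigned char array and the escaped string literal hold the same bytes:

```cpp
#include <climits>
#include <cstring>
#include <limits>

// True when plain char is signed on this implementation.
// The answer is implementation-defined, so do not rely on either outcome.
constexpr bool plain_char_is_signed() {
    return std::numeric_limits<char>::is_signed;
}

// UTF-8 encoding of the euro sign (U+20AC), written two ways.
const unsigned char euro_bytes[] = {0xE2, 0x82, 0xAC, 0};
const char euro_escaped[] = "\xe2\x82\xac";

// Returns true when both arrays hold the same bytes, terminator included.
bool same_bytes() {
    static_assert(sizeof(euro_bytes) == sizeof(euro_escaped),
                  "both arrays should be 4 bytes long");
    return std::memcmp(euro_bytes, euro_escaped, sizeof(euro_bytes)) == 0;
}
```

Calling same_bytes() should return true regardless of the signedness of char, because memcmp compares the underlying bytes rather than their signed or unsigned interpretation.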


23 Comments

@John If you do not specify the signedness of char, you are using the compiler's default ... which can (and likely will) change between different compiler vendors (or even different versions of the same compiler). When you need a char to be a byte, you should declare it as such (and not make assumptions about what the compiler may or may not do).
@BasileStarynkevitch: Yes, just a few days ago I spent a good while in the depths of the Standard to figure out why my code wasn't working, and I came across this gem from which I realized I needed three overloads, not two. Reference from C++03: 3.9.1 Fundamental types "1/ [...] Plain char, signed char, and unsigned char are three distinct types. [...]"
@ZacHowland: The same clause goes on to say that, "In any particular implementation, a plain char object can take on either the same values as a signed char or an unsigned char; which one is implementation-defined." So char isn't the same as signed char or unsigned char, but they are so close on a fundamental level that in 15 years of programming C++ professionally I only needed to distinguish between them once.
Just my personal opinion, but from a stylistic point of view, if it is text, use char. I've tried to use unsigned char in the past (because I often have to deal with accented characters): it just doesn't work (because so many functions expect char* or std::string, and string literals are char[]), and it confuses the reader.
@ZacHowland: I predict in two years you'll have to write a third overload for something. But then you'll be good for another 15 years. :)
0

The short answer to your question is that you are overflowing a char. On your implementation char is signed, so it has the range [-128, 127], and 0xE2 = 226 > 127. What you need is an unsigned char, which has the range [0, 255].

unsigned char s[] = {0xE2,0x82,0xAC, 0};

3 Comments

So by default if there is no specifier, a char is signed?
No, on some implementations char is unsigned and CHAR_MAX is 255 (and CHAR_MIN is 0). On others char is signed, so CHAR_MIN is -128 and CHAR_MAX is 127 (and e.g. things are different on Linux/PowerPC/32 bits and Linux/x86/32 bits).
@texasbruce It is up to the compiler. On many compilers, the default is signed. If you need an unsigned, you should always specify it explicitly.
0

While it may well be tedious to put lots of casts in your code, it actually smells extremely GOOD to me to use typing that is as strong as possible.

As noted above, when you specify type "char" you are inviting a compiler to choose whatever the compiler writer preferred (signed or unsigned). I'm no expert on UTF-8, but there is no reason to make your code non-portable if you don't need to.

As far as your constants, I've used compilers that default constants written that way to signed ints, as well as compilers that consider the context and interpret them accordingly. Note that converting between signed and unsigned can overflow EITHER WAY. For the same number of bits, a negative overflows an unsigned (obviously) and an unsigned with the top bit set overflows a signed, because the top bit means negative.

In this case, your compiler is taking your constants as unsigned 8 bit--OR LARGER--which means they don't fit as signed 8 bit. And we are all grateful that the compiler complains (at least I am).

My perspective is that there is nothing at all bad about casting to show exactly what you intend to happen. And if a compiler lets you assign between signed and unsigned, it should require that you cast, regardless of whether variables or constants are involved. For example:

const int8_t a = (int8_t) 0xFF; // will be -1

although in my example, it would be better to assign -1. When you are having to add extra casts, they either make sense, or you should code your constants so they make sense for the type you are assigning to.
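One way to keep the casts explicit without repeating them on every element is a small helper. This is only a sketch: to_char, euro_cast and byte_value are hypothetical names that do not come from the thread, and it assumes C++11 or later. When char is signed, the result of the cast for values above 127 is implementation-defined before C++20, although mainstream compilers wrap around as expected.

```cpp
// Hypothetical helper: converts a byte value in 0x00-0xFF to plain char.
// The cast is explicit, so the brace initializer below sees char values
// and no narrowing conversion takes place.
constexpr char to_char(unsigned char b) {
    return static_cast<char>(b);
}

// UTF-8 encoding of the euro sign, stored in a plain char array.
const char euro_cast[] = {to_char(0xE2), to_char(0x82), to_char(0xAC), 0};

// Reads a char back as an unsigned byte value in 0-255.
constexpr unsigned byte_value(char c) {
    return static_cast<unsigned char>(c);
}
```

With this, byte_value(euro_cast[0]) gives back 0xE2 whether or not plain char is signed, since the conversion to unsigned char is well defined.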

1 Comment

While the stronger type checking is probably good for catching bugs, it causes a lot of hurt for projects which have to deal with legacy code. Initializing char arrays from hex constants spanning 0x00-0xFF is quite common, cases in point: the X Bitmap (XBM) file format (which is actually a snippet of C source code with precisely such an initialization), along with many X library functions dealing with gradients, color maps, etc. which expect arrays of chars, not arrays of unsigned chars.
0

Is there a way to mix these? I want a define macro FX_RGB(R,G,B) that makes a const string "\x01\xRR\xGG\xBB" so I can do the following: const char* LED_text = "Hello " FX_RGB(0xff, 0xff, 0x80) "World"; and get a string: const char* LED_text = "Hello \x01\xff\xff\x80World";
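A possible sketch, under a different calling convention than the one asked for: escape sequences are translated before adjacent string literals are concatenated, so a macro cannot glue "\x" onto a stringized 0xff. If each component is instead passed as its own string literal, plain literal concatenation is enough. The macro body and the helper led_text_matches below are assumptions for illustration only.

```cpp
#include <cstring>

// Hypothetical variant of FX_RGB: each argument is already a string literal
// holding one escaped byte, e.g. "\xff". Adjacent literals are concatenated
// by the compiler after the escapes have been translated.
#define FX_RGB(R, G, B) "\x01" R G B

const char LED_text[] = "Hello " FX_RGB("\xff", "\xff", "\x80") "World";

// Returns true when the macro produced the same bytes as the literal
// written out by hand.
bool led_text_matches() {
    const char expected[] = "Hello \x01\xff\xff\x80World";
    return sizeof(LED_text) == sizeof(expected) &&
           std::memcmp(LED_text, expected, sizeof(expected)) == 0;
}
```

Keeping each escape in its own literal also avoids a hex escape swallowing following text that happens to start with a hex digit, which a single hand-written literal would be exposed to.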
