
I (C++ newbie) am currently trying to implement the following function:

std::string bytes_to_hex(const std::string &bytes);

The function should basically return a base16 encoding of a given byte array:

std::string input{0xde, 0xad, 0xbe, 0xef} => "deadbeef"

My first version doesn't quite work how I imagined:

std::string bytes_to_hex(const std::string &bytes) {
    std::ostringstream ss;
    ss << std::hex;

    for (auto &c : bytes) {
        ss << std::setfill('0') << std::setw(2) << +c;
    }

    return ss.str();
}

With this function the output is:

ffffffdeffffffadffffffbeffffffef

After some experiments, I've found out that this version looks better:

std::string bytes_to_hex(const std::string &bytes) {
    std::ostringstream ss;
    ss << std::hex;

    for (const char &c : bytes) {
        ss << std::setfill('0') << std::setw(2) << +(static_cast<uint8_t>(c));
    }

    return ss.str();
}

The output is as expected:

deadbeef

My question is:

  • Why does the second version work and the first doesn't? What is the main difference here?
  • Is the second version a correct implementation of my original intention, or can there be other problems?
  • The + does nothing; the cast, however, did what you wanted, i.e. it made sure the bytes are interpreted as positive numbers. Commented Oct 19, 2017 at 7:05
  • Remember that the char type can be either signed or unsigned; it's up to the compiler. In your case it seems to be signed, which means that when the characters are promoted to int they are also sign-extended. Commented Oct 19, 2017 at 7:05
  • @PasserBy The unary + forces integer promotion of the character. Commented Oct 19, 2017 at 7:06
  • @Someprogrammerdude I meant as in not changing its value Commented Oct 19, 2017 at 7:07
  • @bmk While a std::string can be used for arbitrary data, it does have its drawbacks. One of them you just discovered. That's why I always recommend std::vector<uint8_t> for arbitrary byte data. Commented Oct 19, 2017 at 7:10
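The std::vector<uint8_t> suggestion from the last comment could look roughly like the sketch below. The name bytes_to_hex_vec is hypothetical and not part of the question. Note that uint8_t is usually an alias for unsigned char, so the stream would still print it as a character; the value has to be converted to a wider integer type before formatting.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical variant of bytes_to_hex that takes raw bytes instead of a std::string.
std::string bytes_to_hex_vec(const std::vector<std::uint8_t> &bytes) {
    std::ostringstream ss;
    ss << std::hex << std::setfill('0');

    for (const std::uint8_t b : bytes) {
        // b is already unsigned, so no sign extension can occur;
        // the cast only makes the stream print a number rather than a character.
        ss << std::setw(2) << static_cast<unsigned>(b);
    }

    return ss.str();
}
```

Because the element type is unsigned regardless of how the compiler treats plain char, the result does not depend on char's signedness.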

1 Answer

As mentioned in my comment, the unary + forces integer promotion. When that happens, signed types are sign-extended, which for two's complement integers means that negative numbers (where the left-most bit is 1) are left-padded with binary ones (i.e. 0xde becomes 0xffffffde).
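The promotion can be observed in isolation with a small sketch. It assumes a 32-bit int and two's complement representation; hex_of_promoted is a hypothetical helper name, and signed char / unsigned char are used explicitly so the result does not depend on how the compiler treats plain char.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: prints, in hex, the int that a signed char promotes to.
std::string hex_of_promoted(signed char c) {
    std::ostringstream ss;
    ss << std::hex << +c;  // unary + promotes to int; a negative value is sign-extended
    return ss.str();
}

// Same helper for unsigned char: the promoted int keeps the value 0..255.
std::string hex_of_promoted(unsigned char c) {
    std::ostringstream ss;
    ss << std::hex << +c;  // unary + promotes to int; no sign extension is involved
    return ss.str();
}
```

Under those assumptions, hex_of_promoted(static_cast<signed char>(0xde)) gives "ffffffde", while hex_of_promoted(static_cast<unsigned char>(0xde)) gives "de".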

Also mentioned is that char can be either signed or unsigned, a decision that is up to the compiler. From the output you get, we can tell that in your case char is signed, i.e. it behaves like signed char.

The simple solution you found is to first cast the character to an unsigned type (uint8_t, which is typically an alias for unsigned char), and then (with the unary +) promote it to int.


2 Comments

Thank you for the answer! Do you also have a comment on my second question?
@bmk Your second solution will work fine, and should be portable. The only "improvement" might be an explicit static_cast<unsigned> in place of the + promotion (still applied after the cast to uint8_t), which is more to read and write, but expresses intent better (and will be clearer for those that don't understand promotion yet).
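A sketch of that variant, keeping the question's signature, might look like this. It is one reading of the suggestion, not a definitive version: the inner cast to unsigned char stays, and only the unary + is replaced. Casting a signed char directly to unsigned would still produce the sign-extended value.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

std::string bytes_to_hex(const std::string &bytes) {
    std::ostringstream ss;
    ss << std::hex << std::setfill('0');  // both settings persist across insertions

    for (const char c : bytes) {
        // Step 1: char -> unsigned char maps the byte to 0..255, whatever char's signedness.
        // Step 2: unsigned char -> unsigned makes the stream print a number, not a character.
        ss << std::setw(2)  // setw applies only to the next insertion, so repeat it per byte
           << static_cast<unsigned>(static_cast<unsigned char>(c));
    }

    return ss.str();
}
```

Called with a string holding the four bytes 0xde, 0xad, 0xbe, 0xef, this returns "deadbeef".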
