
I was wondering whether there is a C++ equivalent to Java's .getBytes() method. I'm reading a .txt file and need to convert each line into bytes.

Thanks in advance!


2 Answers


std::string::data is the equivalent.
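
As a rough illustration of what that looks like in practice, here is a minimal sketch that walks the buffer returned by data(). The function name sum_bytes is made up for this example. Note one difference from Java: getBytes() returns a newly allocated, charset-encoded array, whereas data() only exposes the string's existing storage, so nothing is copied or re-encoded here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: adds up the byte values of a string by reading
// the buffer that std::string::data() exposes.
unsigned long sum_bytes(const std::string& s) {
    const char* p = s.data();  // pointer to the string's own storage, no copy
    unsigned long total = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // char may be signed, so go through unsigned char to get 0..255
        total += static_cast<unsigned char>(p[i]);
    }
    return total;
}
```

The pointer is only valid while the string is alive and unmodified, which is one reason to prefer copying into a container if the bytes need to outlive the string.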


10 Comments

Are you saying that Java .getBytes() produces a raw pointer? Is that what it does?
There are no pointers in Java. getBytes just gives you direct access to the array of characters, just like std::string::data does, and in both cases you're looking at an array of bytes. In neither language is it advisable that you use these arrays rather than the original string objects.
Java is full of pointers. That's why the Java language specification, which you should be familiar with in order to answer questions concerning Java, calls them pointers. However, Java does not have raw pointers, and that was my point here: saying that std::string::data is an equivalent to anything in Java, is meaningless.
I wrote a JVM, so I am familiar with the JVM spec. I also wrote java compilers and decompilers, escape analysis tools for Java, java bytecode compression, code for Java JIT compilation, papers for real-time Java (JSR-1), Java garbage collectors, and much much more. I have forgotten more about Java than you have ever known about the language. There are no pointers in Java. Get over it. There are pointers in whatever languages you use to write the JVM.
The Java language specification calls Java pointers, pointers. You are saying that you are not familiar with the spec, but that you view yourself as very competent. You're also maintaining that you've done a lot of Java programming but that you're unfamiliar with e.g. NullPointerException.

In C++ a char is a byte. And so a std::string is already a sequence of bytes.

However, you may want a sequence of unsigned char.

One way is to just copy the byte values from the string, e.g. into a std::vector:

#include <vector>
using Byte = unsigned char;
// s is the std::string holding one line of the file
std::vector<Byte> const bytes( s.begin(), s.end() );

If you're reading the text file into a std::wstring per line, e.g. using a wide stream, then the bytes depend on your preferred encoding of that string.

In practice, except possibly on an IBM mainframe, a C++ wide string is either UTF-16 or UTF-32 encoded. For these two encodings the standard library provides specializations of std::codecvt that can convert to and from UTF-8.
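
One possible way to do such a conversion is sketched below. It uses std::wstring_convert with std::codecvt_utf8<wchar_t>, which is my choice for illustration and not necessarily the facility the answer had in mind. Both are deprecated since C++17, so a compiler may warn, and newer standards drop them. The function name wide_to_utf8 is made up. Where wchar_t is 16 bits, std::codecvt_utf8<wchar_t> only covers UCS-2, and std::codecvt_utf8_utf16<wchar_t> would be the facet to try for full UTF-16.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: convert a wide string to UTF-8 bytes held in a
// std::string. Assumes wchar_t holds UCS-2 or UTF-32 code units here.
std::string wide_to_utf8(const std::wstring& wide) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    return converter.to_bytes(wide);  // throws std::range_error on failure
}
```

The resulting std::string can then be copied into a sequence of unsigned char in the same way as any other narrow string.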


If you want an arbitrary encoding from a wide string, then you're out of luck as far as the C++ standard library is concerned, sorry.

5 Comments

A Java java.lang.String also wraps an array of characters just like std::string, there is no difference in that regard.
@SeanF: A Java String is Unicode encoded as UTF-16 internally. A C++ std::string is a sequence of bytes. That's a very big difference.
@ Cheers and hth. In C++ you can encode strings whichever way you like. If you want a UTF-16 std::string, there is nothing stopping you.
No it isn't, and I don't consider that much of an argument. stackoverflow.com/questions/11086183/…
You're not making sense. Nobody's said anything about restriction to ascii. And you can put a picture in a std::string if you want. Talking about that in this context is however meaningless, dumb, nonsense, just not relevant at all except on maybe a purely associative level.
