24

I'm using a low level native API where I send an unsafe byte buffer pointer to get a c-string value.

So it gives me

// using byte[255] c_str
string s = new string(Encoding.ASCII.GetChars(c_str));

// now s == "heresastring\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0(etc)";

So obviously I'm not doing it right. How do I get rid of the excess?

3
  • I got something similar when receiving a string via RS-232. It turned out I was doing it wrong: I discovered that the handler is called for each byte received, and in the handler I was using serialPortInstance.Read(...) to read more than 1 byte. Commented Apr 5, 2010 at 21:59
  • I am not sure, but you might have a look at regular expressions, something like string re1="((?:[a-z][a-z]+))"; and take the first match. Commented Mar 24, 2017 at 20:37
  • 2
    The "rule" of null-terminated strings is that everything beginning with the first null should be ignored. Several of the other answers, which just Trim() or Replace(), don't consider that there might be some non-null "junk" after the initial null. This answer gives a one-line solution. Commented Jan 26, 2018 at 16:38

7 Answers 7

44

.NET strings are not null-terminated (as you may have guessed from this). So, you can treat the '\0' as you would treat any normal character. Normal string manipulation will fix things for you. Here are some (but not all) options.

s = s.Trim('\0');

s = s.Replace("\0", "");

var strings =  s.Split(new char[] {'\0'}, StringSplitOptions.RemoveEmptyEntries);

If you definitely want to throw away everything after the first null character, this might work better for you. But be careful: it only works on strings that actually include the null character. When there is none, IndexOf returns -1 and the result is an empty string.

s = s.Substring(0, Math.Max(0, s.IndexOf('\0')));
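To make the differences between these options concrete, here is a minimal sketch that applies each of them to the same input. The class and method names (NullOptionsDemo, Apply) are hypothetical and exist only for illustration.

```csharp
using System;

public static class NullOptionsDemo
{
    // Returns what each option produces for the same input, in this order:
    // Trim, Replace, first Split entry, Substring up to the first null.
    public static string[] Apply(string s)
    {
        string[] parts = s.Split(new[] { '\0' }, StringSplitOptions.RemoveEmptyEntries);
        return new[]
        {
            s.Trim('\0'),                                   // strips nulls at the ends only
            s.Replace("\0", ""),                            // removes every null, keeps all other text
            parts.Length > 0 ? parts[0] : "",               // first non-empty piece between nulls
            s.Substring(0, Math.Max(0, s.IndexOf('\0'))),   // text before the first null ("" if no null)
        };
    }
}
```

For the input "abc\0junk\0\0" this yields "abc\0junk", "abcjunk", "abc" and "abc". For "abc" with no null at all, the first three yield "abc" and the Substring variant yields an empty string.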

7 Comments

These approaches miss the fact that there might well be non-null characters after the first null in the string. This answer gives a more robust solution.
Um... how are any of these approaches missing characters after nulls? Trim only works on the ends of strings. Replace doesn't do anything to any part of the string except the null character. Split explicitly keeps everything except the null character, producing an array of strings. It looks like each option safely handles every non-null character in any string.
Your solutions work well for the specific string the OP gave. But strings returned from a native (C++) API can include junk after the initial null. A general solution must ignore everything after the initial null, not just omit the null(s). Try each of your solutions on this sample string ("Here's a string\0memoryjunkhere") to see what I mean.
@Richard: I appreciate your attempt to clear this up, but my answer wasn't intended as code to simply copy and place into someone's app. Rather, it was to point out that normal string manipulation can easily detect and operate over null characters. If the developer wants to keep what comes after the \0, he can. If the developer doesn't want it, he can ignore it.
The context given in OP's question ("c-string value", "get rid of the excess") indicates they would want to ignore everything after the first null, even though they might not have known it yet :-) The approach you added yesterday is better, but as you mention, works only if the input string contains at least one null, requiring caller to implement another "if" test. The answer I referenced in my initial comment (that of @MrHIDEn) works in all these scenarios.
7

There may be an option to strip NULs in the conversion.

Beyond that, you could probably clean it up with:

s = s.Trim('\0');

...or, if you think there may be non-NUL characters after some NULs, this may be safer:

int pos = s.IndexOf('\0');  
if (pos >= 0)
    s = s.Substring(0, pos);
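One way to "strip NULs in the conversion" is to find the terminator in the byte buffer first and decode only the bytes before it. The following is a sketch under that assumption; CStringDecoder and FromBuffer are hypothetical names, and ASCII is assumed because the question uses Encoding.ASCII.

```csharp
using System;
using System.Text;

public static class CStringDecoder
{
    // Decodes the bytes before the first 0 byte; if there is no 0 byte,
    // the whole buffer is decoded.
    public static string FromBuffer(byte[] buffer)
    {
        int length = Array.IndexOf(buffer, (byte)0);
        if (length < 0)
            length = buffer.Length;
        return Encoding.ASCII.GetString(buffer, 0, length);
    }
}
```

Called as string s = CStringDecoder.FromBuffer(c_str);, this never builds the padded intermediate string, and anything after the first null is ignored.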


6

From .NET Core 2.1 onward, the following can be used, which helps avoid unnecessary allocations for intermediate arrays or strings:

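var bytesAsSpan = bytes.AsSpan();
var terminatorIndex = bytesAsSpan.IndexOf(byte.MinValue);
// IndexOf returns -1 when there is no terminator; use the whole buffer in that case
if (terminatorIndex < 0) terminatorIndex = bytesAsSpan.Length;
var s = Encoding.ASCII.GetString(bytesAsSpan.Slice(0, terminatorIndex));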

Only the last line actually requires .NET Core 2.1 or later, because that is when the Encoding.GetString(ReadOnlySpan<byte>) overload was introduced. On earlier targets you can still do Span-based operations through the System.Memory package, but Encoding.GetString has no overload that accepts ReadOnlySpan<byte> there, so the last line has to allocate an array:

var s = Encoding.ASCII.GetString(bytesAsSpan.Slice(0, terminatorIndex).ToArray());


4
// s == "heresastring\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0(etc)"    
s = s.Split(new[] { '\0' }, 2)[0];
// s == "heresastring"

4 Comments

Great! This is a single-statement answer that properly handles the scenario in which a string returned from a native (C++) API might well contain junk non-null characters following the first null. (e.g., "Here's a string\0memoryjunkhere") Some of the other answers don't handle this important scenario properly (or require an IF test).
This would create a temporary array containing a string per null. So if I have a buffer of 300 NULs (\0), then put "hello" at the start - split will give me an array of ~ 294 empty strings. Better to use the s = s.Substring(0, Math.Max(0, s.IndexOf('\0'))); method I think.
@TomLeys you are forgetting about the second "2" parameter of the Split() method. In this case, the array will contain either 1 or 2 members.
@Tom Leys this code will split into an array with only 1 or 2 strings. Check that "abc\0\0\0\0".Split(new[] { '\0' }, 2) => String[] { "abc", "\0\0\0" }
4

How about one of the System.Runtime.InteropServices.Marshal.PtrToString* methods?

Marshal.PtrToStringAnsi - Copies all characters up to the first null character from an unmanaged ANSI string to a managed String, and widens each ANSI character to Unicode.

Marshal.PtrToStringUni - Allocates a managed String and copies all characters up to the first null character from an unmanaged Unicode string into it.
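A minimal sketch of how Marshal.PtrToStringAnsi could be applied to a managed byte buffer like the one in the question. PtrToStringDemo and ReadCString are hypothetical names. If the native API already hands you an IntPtr, you can pass it to Marshal.PtrToStringAnsi directly and skip the pinning. Note that "ANSI" means the system code page on Windows, which agrees with ASCII only for 7-bit characters.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PtrToStringDemo
{
    public static string ReadCString(byte[] buffer)
    {
        // PtrToStringAnsi scans memory until it finds a 0 byte, so make sure
        // one exists inside the buffer before handing out a raw pointer.
        if (Array.IndexOf(buffer, (byte)0) < 0)
            throw new ArgumentException("Buffer has no null terminator.", nameof(buffer));

        // Pin the array so the GC cannot move it while we hold its address.
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            return Marshal.PtrToStringAnsi(handle.AddrOfPinnedObject());
        }
        finally
        {
            handle.Free();
        }
    }
}
```

The guard is a design choice for this sketch: without a terminator inside the buffer, the call would keep reading past the end of the array.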

1 Comment

This looks like it would perform better. Can you give me an example? Thanks.
1

The safest way is to use:

s = s.Replace("\0", "");


0

I believe \0 is "null" in ascii -- are you sure the string you're getting is actually ascii encoded?

2 Comments

I think he means that he's getting a series of null bytes, not that he's actually getting the "\0" string sequence.
I guess I'll do something like .Trim('\0') haha
