c# string performance - what is faster to compare, string text or string length

Question

I have to read a huge xml file which consists of over 3 million records and over 10 million nested elements.

Naturally I am using XmlTextReader and have got my parsing time down to about 40 seconds from the earlier 90 seconds using multiple optimization tricks and tips.

But I want to save as much further processing time as I can, hence the question below.

Quite a few elements are of type xs:boolean and the data provider always represents values as "true" or "false" - never "1" or "0".

For such cases my earliest code was:

if (xmlTextReader.Value == "true")
{
    bool subtitled = true;
}

which I further optimized to:

if (string.Equals(xmlTextReader.Value, "true", StringComparison.OrdinalIgnoreCase))
{
    bool subtitled = true;
}

I wanted to know if the code below would be fastest (because it's either "true" or "false")?

if (xtr.Value.Length == 4)
{
    bool subtitled = true;
}

Why don't you benchmark the two approaches and see for yourself? (For what it's worth, I'd guess that the length comparison would be quicker, but probably not significantly.)
– LukeH, Commented Sep 6, 2010 at 14:00
Why not just test it? I would not be surprised if string.Equals short-circuited its test on a length comparison anyway. It would check first for reference equality, then the length of the two strings, then, if the lengths are the same, perform a character-by-character test. Just a guess.
– Chris Taylor, Commented Sep 6, 2010 at 14:03
@Chris Taylor: Equals does this short-circuit only for Ordinal and OrdinalIgnoreCase. In all the others, "\x00e9".Equals("e\x0301") is true despite the strings being of different lengths.
– Timwi, Commented Sep 6, 2010 at 14:20
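
The point in the last comment can be tried out with a small sketch like the one below. The class and variable names are made up for illustration. On a default configuration the two length and ordinal checks should print False and the culture-sensitive check should print True, although the last result can differ when culture data is disabled (for example in invariant globalization mode).

```csharp
using System;

class ComparisonKinds
{
    static void Main()
    {
        string precomposed = "\u00e9";  // one code unit: 'e' with an acute accent
        string decomposed = "e\u0301";  // two code units: 'e' plus a combining acute accent

        // The strings differ in length (1 versus 2).
        Console.WriteLine(precomposed.Length == decomposed.Length);

        // Ordinal comparison works on raw UTF-16 code units,
        // so a length mismatch is enough to decide the result.
        Console.WriteLine(string.Equals(precomposed, decomposed, StringComparison.Ordinal));

        // Culture-sensitive comparison applies linguistic rules,
        // so it cannot use the length as a shortcut.
        Console.WriteLine(string.Equals(precomposed, decomposed, StringComparison.InvariantCulture));
    }
}
```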

peterh · Accepted Answer · 2020-12-04 12:13:52Z

7

Yes, it is faster, because you only compare exactly one value, namely the length of the string.

When you compare two strings with each other, you compare them character by character for as long as the characters match. So if you're finding a match for the string "true", you're going to do 4 comparisons before the predicate evaluates to true.

The only problem with this solution is that if someday the value changes from true to, let's say, 1, the length check will no longer work.

edited Dec 4, 2020 at 12:13

peterh

answered Sep 6, 2010 at 13:59

Giuseppe Accaputo

4 Comments

MSalters Over a year ago

It's actually not a problem: you'd have the same failing comparisons when "1" != "true" and "0" != "false". You can't change one half of an interface implementation and expect the interface and all other implementations of that interface to magically change. See also Postel's Law: Be conservative in what you send; be liberal in what you accept.

Rune FS Over a year ago

@MSalters but the lengths of "1" and "0" are the same, so you can't determine the value based on the length.

Rune FS Over a year ago

But is it faster than xmlTextReader.Value[0] == 't'? Just wanted to raise the question. Of course, the right thing to do is to benchmark.

MSalters Over a year ago

@RuneFS: that's beside the point. If the interface says to use "true" and "false", then it works. If the interface is changed to use "0" and "1", then String.Equals will work. But if the interface were changed to use only "false", with true being the implied default if there is no element in your XML, then your string comparison breaks. Ergo, you can't speculate whether your parsing algorithm will understand a future protocol version, and you must consider that incompatible by default.

Alex Reitbort · Accepted Answer · 2010-09-06 13:59:58Z

4

Comparing length will be faster, but less readable. I wouldn't use it unless I profile the performance of the code and conclude that I need this optimization.

answered Sep 6, 2010 at 13:59

Alex Reitbort

Øyvind Skaar · Accepted Answer · 2010-09-06 14:51:05Z

3

What about comparing the first character to "t"?

Should (maybe :) be faster than comparing the whole string..
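
For illustration, the first-character idea and the length idea from the question could be written as the two helpers below. The class and method names are hypothetical, and both helpers are only meaningful under the question's assumption that the value is always exactly "true" or "false"; any other input would be misclassified silently.

```csharp
using System;

static class BoolText
{
    // Hypothetical helper: treats any text starting with 't' as true.
    // The length guard avoids an IndexOutOfRangeException on an empty string.
    public static bool IsTrueByFirstChar(string value)
    {
        return value.Length > 0 && value[0] == 't';
    }

    // Hypothetical helper: treats any four-character text as true.
    public static bool IsTrueByLength(string value)
    {
        return value.Length == 4;
    }
}

class Demo
{
    static void Main()
    {
        Console.WriteLine(BoolText.IsTrueByFirstChar("true"));   // True
        Console.WriteLine(BoolText.IsTrueByFirstChar("false"));  // False
        Console.WriteLine(BoolText.IsTrueByLength("true"));      // True
        Console.WriteLine(BoolText.IsTrueByLength("false"));     // False
    }
}
```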

answered Sep 6, 2010 at 14:51

Øyvind Skaar

Patrick · Accepted Answer · 2010-09-06 14:45:05Z

2

Measuring the length would almost invariably be faster. That said, unless this is an experiment in micro-optimization, I'd just focus on making the code readable and conveying the proper semantics.

You might also try something like the following:

Boolean.TryParse(xmlTextReader.Value, out subtitled)

I know that has nothing to do with your question, but I figured I'd throw it out there anyway.
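
A slightly fuller sketch of that approach, with a literal string standing in for xmlTextReader.Value. It also shows XmlConvert.ToBoolean from System.Xml, which is not mentioned above; it accepts the xs:boolean forms "true", "false", "1" and "0" and throws a FormatException for anything else.

```csharp
using System;
using System.Xml;

class ParseBoolean
{
    static void Main()
    {
        string text = "true"; // stand-in for xmlTextReader.Value

        // bool.TryParse ignores case and surrounding white space,
        // and returns false instead of throwing on unrecognised input.
        bool subtitled;
        if (!bool.TryParse(text, out subtitled))
        {
            Console.WriteLine("Not a recognised boolean: " + text);
        }
        Console.WriteLine(subtitled);

        // XmlConvert.ToBoolean also understands the numeric xs:boolean forms.
        Console.WriteLine(XmlConvert.ToBoolean("1"));
    }
}
```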

answered Sep 6, 2010 at 14:45

Patrick

gkrogers · Accepted Answer · 2010-09-06 14:41:27Z

0

Can't you just write a unit test? Run each scenario, for example, 1000 times and compare the datetimes.
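
A rough sketch of such a measurement is below. It uses System.Diagnostics.Stopwatch rather than DateTime values, since Stopwatch is designed for measuring elapsed time, and it uses a much larger iteration count than 1000 because each comparison is very cheap. The iteration count and all names are assumptions, and a loop like this gives only a coarse indication; results depend on the runtime, JIT and hardware.

```csharp
using System;
using System.Diagnostics;

class CompareBenchmark
{
    static void Main()
    {
        // Built at run time so it is a different object from the "true" literal;
        // otherwise the reference-equality shortcut could answer immediately.
        string value = new string(new[] { 't', 'r', 'u', 'e' });
        const int iterations = 10000000; // assumed count, adjust as needed
        int hits = 0;

        Stopwatch timer = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            if (string.Equals(value, "true", StringComparison.Ordinal)) hits++;
        }
        timer.Stop();
        Console.WriteLine("string.Equals (Ordinal): " + timer.ElapsedMilliseconds + " ms");

        timer.Restart();
        for (int i = 0; i < iterations; i++)
        {
            if (value.Length == 4) hits++;
        }
        timer.Stop();
        Console.WriteLine("Length == 4: " + timer.ElapsedMilliseconds + " ms");

        // Print the counter so the loops have an observable result.
        Console.WriteLine(hits);
    }
}
```

Run it as a Release build and repeat it a few times before drawing conclusions; a dedicated benchmarking tool would give more trustworthy numbers.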

edited Sep 6, 2010 at 14:41

gkrogers

answered Sep 6, 2010 at 14:00

femseks

tia · Accepted Answer · 2010-09-06 18:01:51Z

0

If you know it's either "true" or "false", the last snippet must be fastest.

Anyway, you can also write:

bool subtitled = (xtr.Value.Length == 4);

That should be even faster.

answered Sep 6, 2010 at 18:01

tia

MikeJ · Accepted Answer · 2020-12-04 16:27:16Z

0

Old question, I know, but the accepted answer is wrong, or at least incorrect in its explanation.

Comparing the lengths may be the slightest bit faster, but only because string.Equals is likely doing some other comparisons before it too checks the lengths and decides that they are not equal strings.

So in practice this is an optimization of last resort.

Here you can find the source for .NET Core string comparison.
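
As a simplified illustration of the order of checks described above (and guessed at in the comments on the question), an ordinal equality test can be sketched like this. This is not the actual .NET source, which is more heavily optimized; the class and method names are hypothetical.

```csharp
using System;

static class OrdinalSketch
{
    // Illustrative only: mirrors the order of checks, not the real implementation.
    public static bool OrdinalEquals(string a, string b)
    {
        // 1. Same object (or both null): equal without looking at any characters.
        if (object.ReferenceEquals(a, b)) return true;

        // 2. Exactly one side is null: not equal.
        if (a == null || b == null) return false;

        // 3. Different lengths: not equal, again without reading characters.
        if (a.Length != b.Length) return false;

        // 4. Only now compare character by character.
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i]) return false;
        }
        return true;
    }
}

class Demo
{
    static void Main()
    {
        Console.WriteLine(OrdinalSketch.OrdinalEquals("false", "true")); // False, decided by length
        Console.WriteLine(OrdinalSketch.OrdinalEquals("true", "true"));  // True
    }
}
```

Under this model, comparing "false" against "true" already stops at the length check, which is why a hand-written length test saves very little.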

answered Dec 4, 2020 at 16:27

MikeJ

Dmitry Karpezo · Accepted Answer · 2010-09-06 15:56:12Z

-1

String comparison and parsing are very slow in .NET, so I'd recommend avoiding intensive string parsing and comparison in .NET.

If you're forced to do it, use highly optimized unmanaged or unsafe code and use parallelism.

IMHO.

answered Sep 6, 2010 at 15:56

Dmitry Karpezo

5 Comments

Just another metaprogrammer Over a year ago

Any links to support this claim?

Christian.K Over a year ago

-1: Out of context, or especially in this context, this statement is already questionable. But even with the benefit of doubt, sorry, but without any evidence/facts/experience whatsoever, this is nothing but FUD.

Dmitry Karpezo Over a year ago

Just write simple tests and see for yourself. I did. Write, for instance, string comparison, char access, and, for example, atoi() and Convert.ToInt32() in C/C++ and C#, and you will see that the native unmanaged code is hundreds of times more efficient.

Christian.K Over a year ago

Using out-of-context micro-benchmarks, maybe yes. However, you give a pretty bold recommendation in your answer to use unsafe or unmanaged code. That, as such, is not necessarily bad, but it should only be used, if it is even possible (think portability, mobile, Silverlight, medium-trust environments, etc.), after really figuring out whether that particular part of the process is the bottleneck, compared to the other heavy lifting that goes on. Regarding the original question, I would assume the XML parsing to have the lion's share. Besides, the question was about string-to-string comparison anyway.

Dmitry Karpezo Over a year ago

Completely agree with you, optimization is not necessarily low-level tuning and hacks such as using unsafe code instead of BCL routines. My point was about .NET string performance itself. Sorry for the misunderstanding.
