4

I think the best way to explain my question is with a short (generic) linq-to-objects code sample:

IEnumerable<string> ReadLines(string filename)
{
    string line;
    using (var rdr = new StreamReader(filename))
        while ( (line = rdr.ReadLine()) != null)
           yield return line;
}

IEnumerable<int> XValuesFromFile(string filename)
{
    return ReadLines(filename)
               .Select(l => l.Substring(3,3))
               .Where(l => { int ignored; return int.TryParse(l, out ignored); })
               .Select(i => int.Parse(i));
}

Notice that this code parses each integer twice: once to test it and once to get the value. I know I'm missing an obvious, simple way to safely eliminate one of those calls (mainly because I've done it before); I just can't find it right now. How can I do this?

3 Answers 3

9

How about:

int? TryParse(string s)
{
    int i;
    return int.TryParse(s, out i) ? (int?)i : (int?)null;
}
IEnumerable<int> XValuesFromFile(string filename)
{
    return from line in ReadLines(filename)
           let start = line.Substring(3,3)
           let parsed = TryParse(start)
           where parsed != null
           select parsed.GetValueOrDefault();
}

You could probably combine the second/third lines if you like:

    return from line in ReadLines(filename)
           let parsed = TryParse(line.Substring(3,3))

GetValueOrDefault is chosen because it skips the HasValue check that a cast to (int) or .Value performs, so it is (ever so slightly) faster, and we've already checked that the value isn't null.
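To make that difference concrete, here is a small sketch (the class and method names are made up for illustration, not part of the answer). `.Value` throws `InvalidOperationException` on an empty `int?`, whereas `GetValueOrDefault()` simply returns the stored field, which is `0` when there is no value. Skipping the check is only safe in the query above because the `where parsed != null` filter has already run.

```csharp
using System;

static class NullableUnwrapDemo
{
    // Returns the wrapped value without checking HasValue; an empty int? yields 0.
    public static int Unchecked(int? n)
    {
        return n.GetValueOrDefault();
    }

    // Returns true when .Value throws, which happens only for an empty int?.
    public static bool ValueThrows(int? n)
    {
        try
        {
            int ignored = n.Value;
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```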


5 Comments

I guess I'm looking more for the generic case of filtering an enumerable based on a complex transformation: keep the transformed version of every item for which the transformation succeeded. This may be a case for writing a new "operator".
Is using != null and GetValueOrDefault() really faster than using where parsed.HasValue and select parsed.Value? I guess I should go run some tests, because that seems counter-intuitive to me.
Another approach is to write a method that returns a Tuple<bool, T> result, if you've got .NET 4 or want to write your own Tuple class. This is how F# automatically handles TryParse and similar methods. Then the LINQ would be where tuple.Item1 select tuple.Item2
Joel Mueller: yeah, I was already working on something kinda like that :)
@Joel Mueller - !=null is exactly HasValue, so that is no different. The GetValueOrDefault() is a tiny bit faster by skipping the check - it simply returns the inner field directly.
3

It's not exactly pretty, but you can do:

return ReadLines(filename)
    .Select(l =>
    {
        string tmp = l.Substring(3, 3);
        int result;
        bool success = int.TryParse(tmp, out result);
        return new { Success = success, Value = result };
    })
    .Where(i => i.Success)
    .Select(i => i.Value);

Granted, this is mostly just pushing the work into the lambda, but it does provide the correct answers, with a single parse (but extra memory allocations).

1 Comment

Marc's option of using a Nullable<int> could be used here instead of the anonymous class, as well, which would prevent the GC pressure from occurring...
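A minimal sketch of what that comment suggests, assuming the same `TryParse` helper as in Marc's answer. The class name `NullableParse` and the method `XValues` are hypothetical, and the method takes the lines directly rather than a filename so that it runs without a file on disk. Because `int?` is a struct, no per-item object is created to carry the success flag and value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class NullableParse
{
    // Same idea as the helper in Marc's answer: null means "could not parse".
    static int? TryParse(string s)
    {
        int i;
        return int.TryParse(s, out i) ? (int?)i : null;
    }

    // Note: Substring(3, 3) still throws if a line is shorter than six characters.
    public static IEnumerable<int> XValues(IEnumerable<string> lines)
    {
        return lines
            .Select(l => TryParse(l.Substring(3, 3)))  // parse each line once
            .Where(n => n.HasValue)                    // drop failed parses
            .Select(n => n.GetValueOrDefault());       // unwrap; null already excluded
    }
}
```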
3

I think I'll go with something like this:

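static IEnumerable<O> Reduce<I, O>(this IEnumerable<I> source, Func<I, Tuple<bool, O>> transform)
{
    foreach (var item in source)
    {
        Tuple<bool, O> r;
        try
        {
            r = transform(item);
        }
        catch
        {
            continue; // a throwing transform counts as a failed conversion
        }
        // yield return is not allowed inside a try block that has a catch clause
        if (r.Item1) yield return r.Item2;
    }
}

ReadLines(filename).Reduce(l => { int i; return new Tuple<bool, int>(int.TryParse(l.Substring(3, 3), out i), i); });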

I don't really like this, though, as I'm already on the record as not liking using tuples in this way. Unfortunately, I don't see many alternatives outside of abusing exceptions or restricting it to reference types (where null is defined as a failed conversion), neither of which is much better.

2 Comments

I looked at this approach. I just didn't like the fact that the compiler can't infer the type (at least in C# 3), so the "Reduce" extension usability suffers...
My main complaints are 1) that I can't express the conversion in a single statement (I still need a variable declaration inside the lambda), and 2) that I have to express the result in the form of a tuple rather than the converted item.
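One possible way around the tuple, offered only as a sketch: declare a delegate with the same shape as the framework's `TryParse` methods and let the operator call it directly. The names `TryFunc` and `SelectWhere` are invented here and are not framework types. As the earlier comment notes, the compiler cannot infer the output type from an `out` parameter, so the type arguments still have to be spelled out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical delegate matching the shape of int.TryParse, double.TryParse, etc.
public delegate bool TryFunc<TIn, TOut>(TIn input, out TOut output);

public static class TryExtensions
{
    // Hypothetical operator: yields the converted value of every item
    // whose conversion reported success, and skips the rest.
    public static IEnumerable<TOut> SelectWhere<TIn, TOut>(
        this IEnumerable<TIn> source, TryFunc<TIn, TOut> tryConvert)
    {
        foreach (var item in source)
        {
            TOut converted;
            if (tryConvert(item, out converted))
                yield return converted;
        }
    }
}
```

With this in place the original query could be written as `ReadLines(filename).Select(l => l.Substring(3, 3)).SelectWhere<string, int>(int.TryParse)`, which needs neither a tuple nor a local variable in a lambda, at the cost of the explicit type arguments.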
