I'm trying to understand the following line of code. sequence1 and sequence2 are two arrays of strings, and the code is supposed to achieve an inner-join effect. Can someone please explain how to read it, e.g. what x => x.gn2 is? I understand that n1 => n1.Length is the join condition. I'm struggling with lambda expressions. Many thanks in advance!

var j = sequence1.GroupJoin ( sequence2 , 
n1 => n1.Length , n2 => n2.Length , (n1, gn2) => new { n1, gn2 })
.SelectMany (x => x.gn2,(x, n2) => new { x.n1, n2 });

1 Answer

I am not sure that the expression does what you think it does. But here it is. Let's rewrite this a little bit:

// Requires: using System; using System.Linq;
static void Foo1()
{
    string[] sequence1 = new[] { "12", "34", "567" };
    string[] sequence2 = new[] { "ab", "cd", "efg" };

    var result = sequence1.GroupJoin(sequence2,
            n1 => n1.Length, n2 => n2.Length, (n1, gn2) => new { n1, gn2 })
        .SelectMany(x => x.gn2, (x, n2) => new { x.n1, n2 });

    result.ToList().ForEach(Console.WriteLine);
}

and now rewrite it again in another equivalent form:

static void Foo2()
{
    string[] sequence1 = new[] { "12", "34", "567" };
    string[] sequence2 = new[] { "ab", "cd", "efg" };

    var joinResult = sequence1.GroupJoin(
        sequence2,
        n1 => n1.Length,
        n2 => n2.Length,
        (n1, gn2) => new {n1, gn2});

    Console.WriteLine("joinResult: ");
    joinResult.ToList().ForEach(Console.WriteLine);

    var result = joinResult.SelectMany(
        x => x.gn2,
        (x, n2) => new { x.n1, n2 });

    Console.WriteLine("result: ");
    result.ToList().ForEach(Console.WriteLine);
}

Now let's take the first part (the GroupJoin):

    var joinResult = sequence1.GroupJoin(
        sequence2,
        n1 => n1.Length,
        n2 => n2.Length,
        (n1, gn2) => new {n1, gn2});

We are joining two collections. Note that GroupJoin is an extension method that is invoked on sequence1. According to the GroupJoin documentation, sequence1 is the outer sequence and the first parameter, sequence2, is the inner sequence.
The second parameter, n1 => n1.Length, is a function that computes the key for each element of the outer sequence.
The third parameter, n2 => n2.Length, is a function that computes the key for each element of the inner sequence.
GroupJoin now has enough information to match elements of the first sequence with elements of the second sequence. In our case strings are matched by length: each string of length 2 in the first sequence is matched with all strings of length 2 in the second sequence, each string of length 3 with all strings of length 3, and so on for any length.
The last parameter, (n1, gn2) => new {n1, gn2}, is a function that takes an element of the outer sequence (sequence1) and the collection of all matching elements from sequence2 and produces a result. In this case the result is an anonymous type with two fields:

  • The first field named n1 is the element from sequence1.
  • The second field named gn2 is the collection of all matching elements from sequence2.
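To make the contents of gn2 visible (Console.WriteLine only prints the type name of the group), here is a minimal sketch that formats each group as text. The Describe helper and the extra outer element "9" are assumptions added for illustration; they are not part of the original code. The "9" element shows that GroupJoin still produces a result for an outer element that has no match, with an empty gn2.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GroupJoinSketch
{
    // Hypothetical helper: formats every GroupJoin result as "n1 -> [matches]".
    public static List<string> Describe(string[] outer, string[] inner)
    {
        return outer
            .GroupJoin(
                inner,
                n1 => n1.Length,   // key of each outer element
                n2 => n2.Length,   // key of each inner element
                // gn2 holds all inner elements whose key equals the key of n1
                (n1, gn2) => n1 + " -> [" + string.Join(", ", gn2) + "]")
            .ToList();
    }

    static void Main()
    {
        var sequence1 = new[] { "12", "34", "567", "9" }; // "9" added for illustration
        var sequence2 = new[] { "ab", "cd", "efg" };

        Describe(sequence1, sequence2).ForEach(Console.WriteLine);
        // 12 -> [ab, cd]
        // 34 -> [ab, cd]
        // 567 -> [efg]
        // 9 -> []
    }
}
```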

Next comes the SelectMany:

var result = joinResult.SelectMany(
    x => x.gn2,
    (x, n2) => new { x.n1, n2 });

SelectMany is an extension method that here is invoked on joinResult. Take a moment and look at the end of my post, where I copied the output of the application, to see what the joinResult sequence looks like. Note that each element x in joinResult is an anonymous type with fields {n1, gn2}, where gn2 is itself a sequence.

The first parameter x => x.gn2 is a delegate written in lambda form. SelectMany calls it once for each element of the input sequence joinResult, and each call returns an intermediate collection. Remember that each element x in joinResult is an anonymous type with fields {n1, gn2}, where gn2 is itself a sequence. So the lambda x => x.gn2 maps each element x to the collection x.gn2.

Now that SelectMany can generate an intermediate sequence from each element of the input sequence, it proceeds to process that intermediate sequence. That is what the second parameter is for.

The second parameter (x, n2) => new { x.n1, n2 } is another delegate written in lambda form. This delegate is called by SelectMany for each element of the intermediate sequence with two parameters:

  • The first parameter is the current element from the input sequence.
  • The second parameter is a successive element of the intermediate sequence.

This lambda transforms these two parameters into another anonymous type with two fields:

  • The first field, named n1, is copied from field n1 of the current element of joinResult (that is, the original element from sequence1).
  • The second field named n2 is the current element of the intermediate sequence.
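One way to picture the two delegates is as a pair of nested loops. The sketch below is a hand-written approximation of this SelectMany overload, not the library's actual implementation; the Flatten name and its generic parameter names are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SelectManySketch
{
    // Hypothetical stand-in for source.SelectMany(collectionSelector, resultSelector).
    public static IEnumerable<TResult> Flatten<TSource, TItem, TResult>(
        IEnumerable<TSource> source,
        Func<TSource, IEnumerable<TItem>> collectionSelector,
        Func<TSource, TItem, TResult> resultSelector)
    {
        foreach (TSource x in source)                    // each element of the input sequence
        {
            foreach (TItem n2 in collectionSelector(x))  // each element of the intermediate sequence
            {
                yield return resultSelector(x, n2);      // combine the pair into one result
            }
        }
    }

    static void Main()
    {
        var sequence1 = new[] { "12", "34", "567" };
        var sequence2 = new[] { "ab", "cd", "efg" };

        var joinResult = sequence1.GroupJoin(
            sequence2, n1 => n1.Length, n2 => n2.Length,
            (n1, gn2) => new { n1, gn2 });

        // Same two lambdas as in the SelectMany call discussed above.
        var result = Flatten(joinResult, x => x.gn2, (x, n2) => new { x.n1, n2 });

        foreach (var item in result)
        {
            Console.WriteLine(item);
        }
    }
}
```

An element whose intermediate sequence is empty never reaches the inner loop body, so it contributes nothing to the result.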

This all sounds awfully complicated, but if you debug the app and place breakpoints at strategic points it will become clear.

Let's rewrite this one more time in an equivalent form:

static void Foo3()
{
    string[] sequence1 = new[] { "12", "34", "567" };
    string[] sequence2 = new[] { "ab", "cd", "efg" };

    var joinResult = sequence1.GroupJoin(
        sequence2,
        element1 => GetKey1(element1),
        element2 => GetKey2(element2),
        (n1, gn2) =>
        {
            // place a breakpoint on the next line
            return new {n1, gn2};
        });

    Console.WriteLine("joinResult: ");
    joinResult.ToList().ForEach(Console.WriteLine);

    var result = joinResult.SelectMany(
        x =>
        {
            // place a breakpoint on the next line
            return x.gn2;
        },
        (x, n2) =>
        {
            // place a breakpoint on the next line
            return new {x.n1, n2};
        });

    Console.WriteLine("result: ");
    result.ToList().ForEach(Console.WriteLine);
}

private static int GetKey1(string element1)
{
    // place a breakpoint on the next line
    return element1.Length;
}

private static int GetKey2(string element2)
{
    // place a breakpoint on the next line
    return element2.Length;
}

I suggest you run Foo3, the most verbose version, and put breakpoints where indicated. That will help you figure out in more detail how all this works.

Finally, I must say that one reason all this appears as complicated as it does is how the variables were named. Here is another form, not as verbose as Foo3, that may be reasonably easy to read:

static void Foo4()
{
    string[] sequence1 = new[] { "12", "34", "567" };
    string[] sequence2 = new[] { "ab", "cd", "efg" };

    var groupJoinResult = sequence1.GroupJoin(
        sequence2,
        elementFromSequence1 => elementFromSequence1.Length,
        elementFromSequence2 => elementFromSequence2.Length,
        (elementFromSequence1, matchingCollectionFromSequence2) => new { elementFromSequence1, matchingCollectionFromSequence2 });

    var result = groupJoinResult.SelectMany(
        inputElement => inputElement.matchingCollectionFromSequence2,
        (inputElement, elementFromMatchingCollection) => new { inputElement.elementFromSequence1, elementFromMatchingCollection });

    result.ToList().ForEach(Console.WriteLine);
}

Note: The output of running Foo3 is:

joinResult:
{ n1 = 12, gn2 = System.Linq.Lookup`2+Grouping[System.Int32,System.String] }
{ n1 = 34, gn2 = System.Linq.Lookup`2+Grouping[System.Int32,System.String] }
{ n1 = 567, gn2 = System.Linq.Lookup`2+Grouping[System.Int32,System.String] }
result:
{ n1 = 12, n2 = ab }
{ n1 = 12, n2 = cd }
{ n1 = 34, n2 = ab }
{ n1 = 34, n2 = cd }
{ n1 = 567, n2 = efg }
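
For comparison, the pairs in the result above are the ones a plain Join on the same keys would give. The sketch below produces them in two ways: with Enumerable.Join, and with query syntax using join ... into followed by a second from, which the C# compiler translates into a GroupJoin followed by a SelectMany. The class and method names and the "n1/n2" string format are assumptions chosen for illustration.

```csharp
using System;
using System.Linq;

static class InnerJoinSketch
{
    // Inner join written directly with Join.
    public static string[] WithJoin(string[] s1, string[] s2)
    {
        return s1.Join(
                s2,
                n1 => n1.Length,
                n2 => n2.Length,
                (n1, n2) => n1 + "/" + n2)
            .ToArray();
    }

    // Same pairs through a group join that is flattened again.
    public static string[] WithQuery(string[] s1, string[] s2)
    {
        return (from n1 in s1
                join n2 in s2 on n1.Length equals n2.Length into gn2
                from n2 in gn2
                select n1 + "/" + n2)
            .ToArray();
    }

    static void Main()
    {
        var sequence1 = new[] { "12", "34", "567" };
        var sequence2 = new[] { "ab", "cd", "efg" };

        Console.WriteLine(string.Join(", ", WithJoin(sequence1, sequence2)));
        Console.WriteLine(string.Join(", ", WithQuery(sequence1, sequence2)));
        // Both lines print: 12/ab, 12/cd, 34/ab, 34/cd, 567/efg
    }
}
```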

2 Comments

Many thanks! I still don't quite get this line SelectMany(x => x.gn2, (x, n2) => new { x.n1, n2 });. Could you elaborate on that?
I updated my response to cover SelectMany. And if this helps, don't forget to mark the answer as accepted.