I have an array of strings, for example:

string [] letters = { "a", "a", "b", "c" };

I need to determine whether any string in the array appears more than once. I thought the best way would be to build a new string array without the string in question and then use Contains,

foreach (string letter in letters)
{
    string [] otherLetters = //?
    if (otherLetters.Contains(letter))
    {
        //etc.     
    }
}

but I cannot figure out how. If anyone has a solution for this or a better approach, please answer.

4 Answers

12

The easiest way is to use GroupBy:

var lettersWithMultipleOccurences = letters.GroupBy(x => x)
                                           .Where(g => g.Count() > 1)
                                           .Select(g => g.Key);

This first groups your array using the letters as keys. It then keeps only the groups with more than one entry and selects the key of each of those groups. As a result, you get an IEnumerable<string> containing all letters that occur more than once in the original array. In your sample, this is only "a".

Beware: Because LINQ uses deferred execution, enumerating lettersWithMultipleOccurences multiple times will perform the grouping and filtering each time. To avoid this, call ToList() on the result:

var lettersWithMultipleOccurences = letters.GroupBy(x => x)
                                           .Where(g => g.Count() > 1)
                                           .Select(g => g.Key)
                                           .ToList();

lettersWithMultipleOccurences will now be of type List<string>.
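
For reference, here is a minimal self-contained sketch of the same query wrapped in a method. The class name DuplicateFinder and the method name FindDuplicates are made up for illustration and are not part of any library:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateFinder
{
    // Hypothetical helper: returns every value that occurs more than once.
    public static List<string> FindDuplicates(IEnumerable<string> items)
    {
        return items.GroupBy(x => x)             // one group per distinct value
                    .Where(g => g.Count() > 1)   // keep groups with 2+ entries
                    .Select(g => g.Key)          // project each group to its value
                    .ToList();                   // materialize once, up front
    }
}
```

Called with the sample array { "a", "a", "b", "c" }, this returns a list that contains only "a".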


4

You can use the LINQ extension methods:

if (letters.Distinct().Count() == letters.Count()) {
    // no duplicates
}

Enumerable.Distinct removes duplicates. Thus, letters.Distinct() would return three elements in your example.
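
If you want the check as a reusable yes/no answer, it could be sketched roughly like this. The names DistinctCheck and HasDuplicates are placeholders chosen for this example, and the comparison is inverted relative to the snippet above so that true means "at least one duplicate":

```csharp
using System;
using System.Linq;

public static class DistinctCheck
{
    // Hypothetical helper: true if at least one value appears more than once.
    public static bool HasDuplicates(string[] items)
    {
        // Distinct() drops repeated values, so a smaller count means duplicates.
        return items.Distinct().Count() != items.Length;
    }
}
```

For the sample array, Distinct() yields three elements while the array has four, so the method returns true.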

3 Comments

It would be a tiny bit more efficient to use the array's Length property (letters.Length) instead of the Count() extension method, but this is surely the most elegant and efficient way.
@Shadow: Good point. I'll leave it at Count() for aesthetic reasons, though, since it would look strange to use Count() on the left side and Length on the right.
It doesn't really matter. Enumerable.Count() uses the Count property if the input is an ICollection or an ICollection<T>, and a .NET array is both.
1

Create a HashSet from the array and compare their sizes:

var set = new HashSet<string>(letters);
// The set drops repeated values, so a smaller Count means there were duplicates.
bool hasDoubleLetters = set.Count != letters.Length;

1

A HashSet will give you good performance:

HashSet<string> hs = new HashSet<string>();
foreach (string letter in letters)
{
    if (hs.Contains(letter))
    {
        // letter occurs more than once
    }
    else
    {
        hs.Add(letter);
    }
}
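
If you only need to know whether any duplicate exists, a variant of this loop can stop at the first repeat. The sketch below relies on HashSet<T>.Add returning false when the value is already in the set; the names EarlyExitCheck and HasDuplicates are invented for this example:

```csharp
using System;
using System.Collections.Generic;

public static class EarlyExitCheck
{
    // Hypothetical helper: true as soon as a repeated value is seen.
    public static bool HasDuplicates(IEnumerable<string> items)
    {
        var seen = new HashSet<string>();
        foreach (string item in items)
        {
            // Add returns false if the item was already in the set.
            if (!seen.Add(item))
            {
                return true;
            }
        }
        return false;
    }
}
```

When a duplicate shows up early in a large array, this avoids scanning the rest of it; when there are no duplicates, it still has to visit every element.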

2 Comments

This is approximately five times longer than my code and not necessarily more efficient.
That depends on the filling and size of the initial array.
