1

I have 2 byte arrays:

    Dim A() As Byte = {1, 2, 3, 4, 5, 6, 7, 8, 9}
    Dim B() As Byte = {5, 6, 7}

Now I want to find the occurance of the full B in A. I tried Array.IndexOf(A, B) with no luck. Is there a simple way to search an array by array without the need to use any loops?

It should find the index (position) of 5,6,7 in the same order as in B(). If A() contains {1,2,3,4,7,6,5,9} it should return false or -1 because they are not in the same order.

2
  • 3
    almost the same: stackoverflow.com/questions/1020438/c-array-subset-fetching Commented Sep 8, 2011 at 8:30
  • 2
    "without any loops" is impossible. You can't get more than one value out of an array without a loop, since the array can be any size. Do you mean without an explicit loop in your code? Commented Sep 8, 2011 at 8:38

5 Answers 5

4

The following Linq statement will give an IEnumerable<int> containing the positions of b in a (or an empty set if none occur):

Enumerable
    .Range( 0, 1 + a.Length - b.Length )
    .Where( i => a.Skip(i).Take(b.Length).SequenceEqual(b) );

I have no idea how to translate to VB.NET.

Sign up to request clarification or add additional context in comments.

2 Comments

this method looks nice but is very slow. 16 Seconds for 100000 elements in byte() but it is good for smaller sizes. the length should be also +1 (at least in VB.NET) because if the elements are on the end of the list they wont be found, here is the corrected version: Enumerable.Range(0, A.Length - B.Length + 1).Where(Function(i) A.Skip(i).Take(B.Length).SequenceEqual(B))
@qxxx: You're right about the bug. Fixed. There are definitely faster ways to do this i.e. Knuth-Morris-Pratt, but I got hung up on the (explicit) loops requirement.
2

This might work, but it's C# and uses a loop:

private static int[] GetIndicesOf(byte[] needle, byte[] haystack)
{
    int[] foundIndices = new int[needle.Length];

    int found = 0;

    for (int i = 0; i < haystack.Length; i++)
    {

        if (needle[found] == haystack[i])
        {
            foundIndices[found++] = i;

            if (found == needle.Length)
                return foundIndices;
        }
        else
        {
            i -= found;   // Re-evaluate from the start of the found sentence + 1
            found = 0;    // Gap found, reset, maybe later in the haystack another occurrance of needle[0] is found
            continue;
        }
    }

    return null;            
}

Tested with input:

Byte[] haystack = { 5, 6, 7, 8, 9, 0, 5, 6, 7 };
Byte[] needle = { 5, 6, 7 };
// Returns {0, 1, 2}

Byte[] haystack = { 5, 6, 0, 8, 9, 0, 5, 6, 7 };
Byte[] needle = { 5, 6, 7 };
// Returns {6, 7, 8}

Byte[] haystack = { 5, 6, 0, 7, 9, 0, 5, 6, 8 };
Byte[] needle = { 5, 6, 7 };
// Returns null

Byte[] haystack = { 1, 2, 1, 2, 2 };
Byte[] needle = { 1, 2, 2 };
// Returns {2, 3, 4}

Byte[] haystack = { 1, 2, 1, 2, 1, 2, 3 };
Byte[] needle = { 1, 2, 1, 2, 3 };
// Returns {2, 3, 4, 5, 6}

Byte[] haystack = { 1, 1, 1, 1, 2 };
Byte[] needle = { 1, 2 };
// Returns {3, 4}

But the Linq implementation of @spender looks nicer. :-P

5 Comments

This is obviously broken for heystack = {1, 2, 1, 2, 2} and needle = {1, 2, 2}. It maybe works if you backtrack needle.length each time.
this is very fast for what i need, but didn't test it with the issue from Gleno
@Gleno not true, see my second test.
@CodeCaster, in my example there is overlap. Just try it.
@CodeCaster, yes that looks better. Still, good argument for unit testing. :)
0

How about creating a method that:

  1. Concatinates the elements of the searched list to one string
  2. Concatinates the elements of the list to search for to one string
  3. Looks in the first string for the precense of the second string

Like so:

public bool IsSubSetOf(IList<int> list1, IList<int> list2){
   var string1 = string.Join("", list1);
   var string2 = string.Join("", list2);
   return string1.Contains(string2);
}

Not tested...

2 Comments

This is broken for {1, 11, 111, 11, 11} and looking for {1, 1, 1}, say.
This is a good idea (no explicit loops!), but will break in a lot of cases. You probably need to add delimiters around each value (also before the first, and after the last) to get this to work flawlessly, which is slightly more complicated than a string.Join can provide. You also need to return an index, not just a boolean value - the OP said "find" not "determine if". This answer can be extended to work, but I wouldn't leave it as an exercise for the reader :)
0

An efficient way of solving this problem in general is the KMP algorithm. Quick googling suggest that a .NET implementation may be found here. It's implementational pseudocode is availible from Wikipedia.

An inefficient, but harmlessly easy to code way is presented in one of the links above as follows:

int[] T = new[]{1, 2, 3, 4, 5};
int[] P = new[]{3, 4};

for (int i = 0; i != T.Length; i++)
{
    int j = 0
    for (;i+j != T.Length && j != P.Length && T[i+j]==P[j]; j++);
    if (j == P.Length) return i;
}

Comments

0

My take would be:

public static int Search<T>(T[] space, T[] searched) {
  foreach (var e in Array.FindAll(space, e => e.Equals(searched[0]))) {
    var idx = Array.IndexOf(space, e);
    if (space.ArraySkip(idx).Take(searched.Length).SequenceEqual(searched))
      return idx;
  }
  return -1;
}

public static class Linqy {
  public static IEnumerable<T> ArraySkip<T>(this T[] array, int index) {
    for (int i = index;  i < array.Length; i++) {
      yield return array[i];
    }
  }
}

As always, it depends on your data whether this is "good enough" or you will have to resort to more complex yet efficient algorithms. I introduced an arrayskip as the Linq skip does indeed only assume the IEnumerable interface and would enumerate up to the index.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.