6

I need to verify if the bits in an array of bytes (i.e. chars) are a subset of another array of the same type: for example, 0001.0011 (19) is a subset of 0011.0011 (51), while 0000.1011 (11) is not.

I started playing with bitwise operations, and almost solved it with a XOR/OR/XOR sequence:

int is_subset (char *set_a, char *set_b, int size)
{
  /* The operation is performed with three bitwise operations, resulting in a
   * sequence of bits that will be equal to zero if set_a is a subset of
   * set_b. As a bonus, the positions where the sets differ will be
   * available in the resulting sequence, and thus the number of differing
   * positions can be obtained by counting the number of bits set (for exemple,
   * with __builtin_popcount in GCC).
   *
   *   Exemple (TRUE):              Exemple (FALSE):
   *   ================             ================
   *   set_a   00010011             set_a   00001011
   *   set_b   00110011             set_b   00110011
   *   ----------------             ----------------
   *   XOR     00100000             XOR     00111000
   *   set_b   00110011             set_b   00110011
   *   ----------------             ----------------
   *   OR      00110011             OR      00111011
   *   set_b   00110011             set_b   00110011
   *   ----------------             ----------------
   *   XOR     00000000             XOR     00001000
   */

  int i;
  for (i = 0; i < size; i++)
    if ( (((set_a[i] ^ set_b[i]) | set_b[i]) ^ set_b[i]) != 0)
      return FALSE;

  return TRUE;
}

but it fails (always returning TRUE) if set_a is zero (0000.0000). I tried different strategies (such as Bloom filters), but probably due to my programming skills it was far from fast or at least elegant.

Is there any standard, elegant way of doing this without exceptions?

EDIT: to be clear, in this context "subset" means that all bits TRUE in the first array (set_a) are also TRUE in the second one (set_b). There might be other bits TRUE in the second array, but it does not matter if they are FALSE in first array.

3
  • 1
    Perhaps I'm missing something in the question, but what is your definition of a subset in this context? Commented Dec 26, 2011 at 22:09
  • 1
    If your definition of a subset is what you appear to think it is, (judging by your code,) then you should not be bothering us with arrays, and you could just be asking us how to figure out whether the '1' bits of an int are a subset of the '1' bits of another int. Commented Dec 26, 2011 at 22:17
  • @MikeNakis that is exactly what I want, but the number of bits to verify is not constant (ranging from ~15 to ~100). I though about using two long integers, but it was more difficult than using an array of bytes... Commented Dec 26, 2011 at 22:24

4 Answers 4

18

a is a subset of b if and only if (a | b) == b. If this condition is satisfied for each byte, return TRUE. Otherwise return FALSE.

Or equivalently (a & b) == a.

Sign up to request clarification or add additional context in comments.

2 Comments

Alternatively: if and only if a & b == a.
great answer... anybody know how to modify this so that it is a strict subset (i.e. equality returns false)?
5

I am not sure it is correct to say that your code fails just because it returns TRUE if set_a is an array of zeros, because from a purely theoretical mathematical point of view, the empty set is a subset of any other set. If you do not like that, then you should just add an extra check to see if set_a is an array of zeros, and if so, return FALSE right away.

Comments

5

a is a subset of b is every bit in a implies the corresponding bit in b

a -> b

or equivalently,

~a | b //not a or b

should give 1111111.

Testing the negation agains zero might be simpler though (checking if there are no cases where we have a bit set in a but not in b)

0 == ( a & ~b)

int is_subset (char *set_a, char *set_b, int size)
{
  int i;
  for (i = 0; i < size; i++){
    if(0 != (set_a[i] & (~ set_b[i])))
      return FALSE;
  }
  return TRUE;
}

I don't remember if the bitwise stuff works properly with chars or if you need to cast to unsigned first.

3 Comments

Much better than my code, but it still returing me TRUE if set_a is an array os zeros... Perhaps I messed up somewhere else?
Giacomo is right: your solution will always return true if set_a is zero.
@Giacomo: However, by your definition it's unclear whether missingno's solution is correct or not. By a very strict reading of your definition, his solution is correct.
0

A technical trivia , adding "(theSubsetUnderTest) &&" to the left of your expression should exclude the special case of 0.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.