
A comment on the docs for array_uintersect notes that the callback function MUST return -1 ($a < $b), 0 ($a === $b), or 1 ($a > $b).

The callback function's purpose is to compare $a and $b to determine whether to include them in the intersection, or exclude them. So why return -1, 0, or 1 instead of a simple boolean?

Here is some (working) example code of what I wanted to achieve; I'm just curious why it works that way.

  • But why are three values required? Commented Nov 10, 2012 at 2:15
  • I think those 3 conditions are used by some kind of internal sorting, considering -1, 0 and 1 are actually typical sorting criteria. Commented Nov 10, 2012 at 2:24
  • Sorry, I didn't consider this particular case. Commented Nov 10, 2012 at 2:24
  • In this case, only two values are needed: whether a value is present in the other array or not. The sort functions do need to know whether one value differs because it's larger or because it's smaller. To avoid having two kinds of callbacks with different semantics, they just make array_uintersect() use the same type as usort(). Commented Nov 10, 2012 at 2:28
  • Maybe they just want to be consistent with respect to array sorting functions. edit: basically what cleong said ;) Commented Nov 10, 2012 at 2:28

4 Answers

It is important to mention that array_uintersect() operates on your array inputs more strangely than one would hope. One expects that calling array_uintersect($firstArray, $secondArray, function ($a, $b) {}) would compare each entry from $firstArray against each entry from $secondArray once (with the optimization of stopping comparisons for an entry after the first intersection is found). Any sane person would expect each entry of $firstArray to land in the callback's $a argument and each entry of $secondArray to land in its $b argument.

This is not the case! Believe it or not, PHP's first call to your callback has $a and $b both set to entries from $firstArray! You're calling a function named after the intersection of arrays, but that function also compares entries within each individual array rather than only comparing between arrays. It's mind-numbing, really.

Thus, array_uintersect is not a replacement for the following block. Users beware.

$intersection = [];
foreach ($firstArray as $a) {
    foreach ($secondArray as $b) {
        if (user_compare_function($a, $b) === 0) {
            $intersection[] = $a;
            break;
        }
    }
}
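The same-array comparisons described above can be observed directly. The following is a minimal sketch, assuming PHP 7 or later (for the <=> operator). The 'src' tag on each entry and the two counter variables are illustrative names, not part of any PHP API. The exact number and order of calls is an implementation detail and may differ between PHP versions, so only the totals are printed.

```php
<?php
// Tag every entry with the array it came from ('src' is purely illustrative).
$firstArray = [
    ['src' => 'first', 'v' => 3],
    ['src' => 'first', 'v' => 1],
    ['src' => 'first', 'v' => 2],
];
$secondArray = [
    ['src' => 'second', 'v' => 2],
    ['src' => 'second', 'v' => 3],
    ['src' => 'second', 'v' => 4],
];

$sameArrayCalls = 0;  // both arguments came from the same input array
$crossArrayCalls = 0; // one argument came from each input array

$result = array_uintersect(
    $firstArray,
    $secondArray,
    function (array $a, array $b) use (&$sameArrayCalls, &$crossArrayCalls): int {
        if ($a['src'] === $b['src']) {
            $sameArrayCalls++;
        } else {
            $crossArrayCalls++;
        }
        // Three-way comparison on the value only; the tag is ignored.
        return $a['v'] <=> $b['v'];
    }
);

// Collect the matched values from the result and sort them for display.
$values = array_column($result, 'v');
sort($values);

echo "same-array calls: $sameArrayCalls, cross-array calls: $crossArrayCalls\n";
echo implode(',', $values), "\n";
```

A non-zero same-array count shows that the callback is also used to order each input array, which is why it has to behave like a sorting comparator.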

1 Comment

So, in simple terms, under what circumstances does array_uintersect() give a different result from your code above?

The PEAR replacement accepts callbacks that return only a boolean; the PHP function does not. So the reason is probably an optimization in PHP. You might check this here.


I think the reason is in the PHP source: usort, array_uintersect, and other similar functions that take a user comparison callback all go through php_array_user_compare.

xref: /PHP_5_3/ext/standard/array.c

static int php_array_user_compare(const void *a, const void *b TSRMLS_DC) /* {{{ */
{
    Bucket *f;
    Bucket *s;
    zval **args[2];
    zval *retval_ptr = NULL;

    f = *((Bucket **) a);
    s = *((Bucket **) b);

    args[0] = (zval **) f->pData;
    args[1] = (zval **) s->pData;

    BG(user_compare_fci).param_count = 2;
    BG(user_compare_fci).params = args;
    BG(user_compare_fci).retval_ptr_ptr = &retval_ptr;
    BG(user_compare_fci).no_separation = 0;
    if (zend_call_function(&BG(user_compare_fci), &BG(user_compare_fci_cache) TSRMLS_CC) == SUCCESS && retval_ptr) {
        long retval;

        convert_to_long_ex(&retval_ptr);
        retval = Z_LVAL_P(retval_ptr);
        zval_ptr_dtor(&retval_ptr);
        return retval < 0 ? -1 : retval > 0 ? 1 : 0;
    } else {
        return 0;
    }
}

The function converts the callback's return value to an integer (retval) and normalizes it, as you can see in:

retval < 0 ? -1 : retval > 0 ? 1 : 0

If you return a Boolean, the conversion can only yield 0 or 1.

Example

var_dump((int) true); // 1
var_dump((int) false); // 0

This means that you might be able to get away with a Boolean during an intersect, because only the $a === $b case (0) is required, but not in other implementations that depend on retval < 0.
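To make the conversion concrete, here is a small sketch that mirrors the C expression above in PHP. normalize_compare_result and $isEqual are hypothetical names introduced only for illustration; they are not part of PHP. The point to notice is that a callback answering "are these equal?" with a Boolean is read the wrong way round: true becomes 1 ("greater than") and false becomes 0 ("equal").

```php
<?php
// Hypothetical helper mirroring: retval < 0 ? -1 : retval > 0 ? 1 : 0
function normalize_compare_result($retval): int
{
    $retval = (int) $retval; // same idea as convert_to_long_ex()
    return $retval < 0 ? -1 : ($retval > 0 ? 1 : 0);
}

// A Boolean "equality" callback, as one might naively write it.
$isEqual = function ($a, $b) {
    return $a === $b;
};

$whenEqual = normalize_compare_result($isEqual(1, 1));     // true  -> 1
$whenDifferent = normalize_compare_result($isEqual(1, 2)); // false -> 0

var_dump($whenEqual);     // int(1): equal values are reported as "greater than"
var_dump($whenDifferent); // int(0): different values are reported as "equal"
```

A Boolean can never produce -1 either, so the "less than" case is simply unreachable.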

1 Comment

You don't get away with bool:

function myfunction($v1, $v2)
{
    if ($v1 === $v2) return 0;
    //if ($v1 > $v2) return 1; //uncomment this to see the correct result
    else return -1;
}
$a1 = array(2, 4, 1);
$a2 = array(3, 1, 4);
print_r(array_uintersect($a1, $a2, "myfunction"));

Under the hood, there is a call to the C function zend_qsort.

if (behavior == INTERSECT_NORMAL) {
    zend_qsort((void *) lists[i], hash->nNumOfElements, sizeof(Bucket *), intersect_data_compare_func TSRMLS_CC);
} else if (behavior & INTERSECT_ASSOC) { /* triggered also when INTERSECT_KEY */
    zend_qsort((void *) lists[i], hash->nNumOfElements, sizeof(Bucket *), intersect_key_compare_func TSRMLS_CC);
}

Quicksort is sensitive to these relations so that it can perform the partition step of its algorithm. Items with the same value as the pivot are placed adjacent to, and on either side of, the pivot.
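Because the callback feeds a sort, any function with usort-style semantics fits. As a short sketch, the built-in strcasecmp already returns a negative, zero, or positive integer, so it can be passed as the callback for a case-insensitive intersection. The sample arrays and variable names are made up for illustration.

```php
<?php
$first = ['Apple', 'banana', 'Cherry'];
$second = ['apple', 'CHERRY', 'durian'];

// strcasecmp() is a three-way comparator: <0, 0, or >0.
$common = array_uintersect($first, $second, 'strcasecmp');

// The returned values are the entries from $first that have a match in $second.
$found = array_values($common);
sort($found);
print_r($found); // contains 'Apple' and 'Cherry'
```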

Interestingly, the greater-than comparison operator > works for object comparison, which is something of an undocumented behavior. According to one comment, PHP looks at the values of public objects for this comparison. This is actually a discussion point on the internals list right now!
