I'll give it a shot!
For an example, let's take a list of names. You and I both know how to sort a list of names alphabetically, and done it a hundred bazillion times. Same thing applies to, say, numbers. Computers need to be able to sort data where the sorting may not be so intuitive.
Callback functions for sorting arrays involve the idea of allowing the array to contain an arbitrary kind of data, with the callback supplying the "rules" for saying how any two items in that array relate to each other. In this way, a sorting algorithm can take that array of data, call the "callback" function repeatedly to determine which elements belong in which order, and return the sorted list. That's the very essence of a callback.
If you want a literal analogy, pretend you hand a list of data to a friend, and they only know a method for sorting, but don't know how the items in your list relate. They "call back" to you, asking the question "Which of these two elements comes first?" And you, knowing the data, and the rule, take the items, apply your rules, and then tell the friend the answer. Ultimately, after repeating this process multiple times, your data returns to you sorted.
In this way, a "callback" that returns -1, 0, or 1, returns a value that says, when called with two pieces of data, -1 says "item one precedes item two," 0 says "item one equals item two," and +1 says "item one follows item two." You simply supply the comparisons that determine which value is returned based on the rules for your data. You can define an arbitrary set of data and define whatever precedence rule you wish within your problem "space." As an aside, that's an important part of object-oriented programming - I can leverage this "callback" idea to implement a generic version of a complex sorting algorithm and never know or care about the kind of data being sorted - all because the programmer using this "canned" sorting mechanism will provide the "logical plumbing" the sort routine needs.
I think that's a decent shot at an explanation, and all in a language-agnostic way, too! :) Hope that helps.
Edit. Let me provide an example:
List:
Item # Value
1 12
2 15
3 9
4 26
5 4
if I set out to find the smallest item in the list, I'd start with item 1, and compare it against each remaining item in the list. So, first pass, I'd compare Item 1 to Item 2:
compare(item(1),item(2))
which would return -1, because the value of 12 precedes 15. Now, I go to the next item in the list:
compare(item(1),item(3))
which this time returns 1, because 12 follows 9, meaning item(3) is now the smallest item found so far. Now, we compare thusly:
compare(item(3),item(4))
In this case, compare return -1, because 9 precedes 26, leading us to the final compare:
compare(item(3),item(5))
And this call will return +1, because 9 comes after 4. As we've exhausted all items in the list, we know item(3) is the smallest item. We'd then swap that item for the "previously" top item, and repeat the entire process starting at item(2). This is an example of a notoriously inefficient "bubble sort," but works for the purposes of this illustration. That's where the "first item" and "second item" references come from. Sorting, regardless of language, and like any other problem in computer science, boils down to breaking down big problem in to small bites, and sorting boils down to repeatedly comparing two items from a larger list.