
Fairly often, I find myself writing a GROUP BY where I know that every row in a group will have the same value in a particular column, but SQL Server doesn't know that.

Most often, I've grouped by DATEPART(Month, my_date_column) and then want to SELECT DATEPART(Year, my_date_column), where all the data is in a single year, or SELECT DATENAME(Month, my_date_column).

SQL Server doesn't know that these are implicitly all identical, so I end up using MIN() or MAX().
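A minimal sketch of the kind of query I mean. The table dbo.Orders and its columns order_date and amount are hypothetical names made up for illustration:

```sql
-- Hypothetical table and column names, for illustration only.
SELECT
    DATEPART(MONTH, order_date)      AS month_number,
    -- Every row in a group shares the same month name, but SQL Server
    -- still requires an aggregate (or a GROUP BY entry) here.
    MIN(DATENAME(MONTH, order_date)) AS month_name,
    -- Only meaningful because all the data is assumed to be in one year.
    MIN(DATEPART(YEAR, order_date))  AS year_number,
    SUM(amount)                      AS total_amount
FROM dbo.Orders
GROUP BY DATEPART(MONTH, order_date);
```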

This works, but it feels wrong. (And misleading for future developers!)

Is there a SINGLE() function or anything comparable?

Ideally it would raise an error if the values in a group weren't all identical, but I'd take anything that was more explicit about what I was doing.

  • To document your intentions, just group by it like everything else. Commented Mar 23, 2017 at 10:36
  • SINGLE is the misleading function. What would SINGLE do if it encountered a different value? It can't throw random errors. It could only work if you somehow ensured that all results were identical, as if you had called DISTINCT on them. That's not how aggregates are expected to work. Commented Mar 23, 2017 at 10:37
  • SQL, the language, deals with data sets. Aggregate functions work on an entire set and produce a result. It's OK for a function to throw when it encounters invalid data, like NaN or NULL. It's not OK if that happens at random, depending on the order or distribution of the data. For such a function to work deterministically, the rest of the query would have to ensure that all values are identical. Commented Mar 23, 2017 at 10:44

1 Answer

I just use MIN. There are only 13 aggregate functions and there's nothing that is more suitable.

If you wish to document that the value should be the same for every row in the group, and that multiple distinct values are an error, put a tripwire in:

...
MIN(Expression) as a,
CASE WHEN MIN(Expression) != MAX(Expression) THEN 1/0 END as EnsureUnique,
...
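Filled out into a complete query, the tripwire might look like the sketch below. The table dbo.Orders and its columns are hypothetical names chosen for illustration, and the divide-by-zero error assumes the default ANSI_WARNINGS and ARITHABORT settings:

```sql
-- Hypothetical table and column names, for illustration only.
SELECT
    DATEPART(MONTH, order_date)     AS month_number,
    MIN(DATEPART(YEAR, order_date)) AS year_number,
    -- Tripwire: NULL when the group contains a single year. If a group
    -- spans more than one year, 1/0 is evaluated and the query fails
    -- with a divide-by-zero error (assuming default session settings).
    CASE
        WHEN MIN(DATEPART(YEAR, order_date)) <> MAX(DATEPART(YEAR, order_date))
        THEN 1/0
    END                             AS EnsureUnique,
    SUM(amount)                     AS total_amount
FROM dbo.Orders
GROUP BY DATEPART(MONTH, order_date);
```

Note that MIN and MAX ignore NULLs, so a group that mixes NULLs with one non-NULL value will not trip the check.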

The alternative is to write your own CLR Aggregate function for this.


3 Comments

A SQLCLR aggregate that randomly throws after the results are selected isn't the best idea, which is why SQL, the language, doesn't have such an aggregate function. You can't have an aggregate function that may throw even though the data is perfectly valid (i.e. not NULL or extremes). It would also cause serious performance problems.
@PanagiotisKanavos - personally, I'd just use MIN and not try to generate errors. Having said that, I'd still prefer any variation of this aggregate over MySQL's "it's not in an aggregate, it's not in a GROUP BY, I'll just give you a random value from one of the rows".
You could also use FIRST_VALUE, if it doesn't introduce additional sorting. MySQL doesn't have windowing functions, so it uses various tricks like previous + 1 to calculate row numbers.
