3

I have data in an MSSQL table (tableB) where [dbo].tableB.myColumn changes format after a certain date.

I'm doing a simple join to that table:

Select [dbo].tableB.theColumnINeed from [dbo].tableA 
left outer join [dbo].tableB on [dbo].tableA.myColumn = [dbo].tableB.myColumn

However, I need to join, using different formatting, based on a date column in Table A ([dbo].tableA.myDateColumn).

Something like...

Select [dbo].tableB.theColumnINeed from [dbo].tableA 
left outer join [dbo].tableB on [dbo].tableA.myColumn = 
    IF [dbo].tableA.myDateColumn > '1/1/2009'
        BEGIN
            FormatColumnOneWay([dbo].tableB.myColumn)
        END
    ELSE
        BEGIN
            FormatColumnAnotherWay([dbo].tableB.myColumn)
        END

I'm wondering if there's a way to do this, or a better approach I'm not thinking of.

8 Answers

8
SELECT [dbo].tableB.theColumnINeed
FROM   [dbo].tableA 
LEFT OUTER JOIN [dbo].tableB
ON [dbo].tableA.myColumn = 
   CASE
    WHEN [dbo].tableA.myDateColumn > '1/1/2009' THEN FormatColumnOneWay([dbo].tableB.myColumn)
    ELSE FormatColumnAnotherWay([dbo].tableB.myColumn)
   END

5

Rather than having a CASE expression in the JOIN, which will prevent the query from using indexes, you could consider using a UNION ALL with one branch per date range:

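-- The date test goes in WHERE, not ON: a LEFT OUTER JOIN keeps every tableA row,
-- so with the test in ON each row would come back from both branches.
SELECT [dbo].tableB.theColumnINeed 
FROM   [dbo].tableA 
    LEFT OUTER JOIN [dbo].tableB 
         ON [dbo].tableA.myColumn = FormatColumnOneWay([dbo].tableB.myColumn)
WHERE  [dbo].tableA.myDateColumn > '1/1/2009'
UNION ALL
SELECT [dbo].tableB.theColumnINeed 
FROM   [dbo].tableA 
    LEFT OUTER JOIN [dbo].tableB 
         ON [dbo].tableA.myColumn = FormatColumnAnotherWay([dbo].tableB.myColumn)
WHERE  [dbo].tableA.myDateColumn <= '1/1/2009'
-- Rows with a NULL myDateColumn match neither branch; add OR ... IS NULL above if they are needed.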

But if FormatColumnOneWay / FormatColumnAnotherWay are functions or column expressions, that will probably rule out the use of indexes on [myColumn], although any index on myDateColumn should still be used.

However, it would help to understand what the FormatColumnOneWay / FormatColumnAnotherWay logic is, as knowing that may enable a better optimisation.

A couple of things to note:

UNION ALL will not remove duplicates (unlike UNION). Because the two sub-queries are mutually exclusive, this is OK, and it saves the SORT step that UNION would need in order to remove duplicates.

You should not use the '1/1/2009' style for string dates; use the 'yyyymmdd' style, without any slashes or hyphens. (You can also use CONVERT with a style parameter to indicate explicitly whether the string is in d/m/y or m/d/y order.)
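To illustrate the point about date strings, here is a small T-SQL sketch. The literal values and column aliases are made up for the example; styles 101 and 103 are the documented CONVERT styles for mm/dd/yyyy and dd/mm/yyyy.

```sql
-- 'yyyymmdd' is interpreted the same way under every language / DATEFORMAT setting.
SELECT CAST('20090101' AS datetime) AS unambiguous_literal;

-- For a slashed string, state the style explicitly instead of relying on session settings:
-- style 101 = mm/dd/yyyy, style 103 = dd/mm/yyyy.
SELECT CONVERT(datetime, '01/02/2009', 101) AS read_as_jan_2nd,
       CONVERT(datetime, '01/02/2009', 103) AS read_as_feb_1st;
```

In the queries above, that would mean comparing myDateColumn against '20090101' rather than '1/1/2009'.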


0

In SQL Server you'd use a CASE such as:

SELECT * 
FROM TableA
INNER JOIN TableB on TableA.[Column]=   -- COLUMN is a reserved word, so it needs brackets
CASE WHEN TableA.RecordDate>'1/2/08'
       THEN FormatColumn(TableB.[Column]) 
     ELSE FormatColumnOtherWay(TableB.[Column])
END

3 Comments

My suggestion would be to fix the data, because the optimizer will not be able to use indexes on the column with those functions in the JOIN condition.
Yes, but sometimes you can't fix the data ;-)
It is the same column, so I would fix it and put a CHECK CONSTRAINT on it so that it won't happen again. Sooner or later someone will scream that performance is unacceptable, and then what?
0

You know that this is bad for performance, since you won't be able to use indexes, right?

You can use a CASE statement kludge, or you can go and fix the data so that you CAN use the index, and it will be many times faster.


0

I agree that a CASE syntax would be more appropriate for reading purposes, although I don't know whether there's any significant difference in running time.

The "right" thing to do, really, is to redo it properly. Your dates should be stored in datetime columns, and you probably have quite a lot to gain from migrating all your dates in tableB to a datetime column. You could do it this way, among others:

  1. Add a dummy column to TableB with type datetime.
  2. Run a query that takes the date value from the current column and puts it in the datetime column.
  3. Rename and delete columns to match the previous data structure.
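
The three steps might look roughly like this in T-SQL. This is only a sketch under assumptions: the new column name myColumnAsDate is hypothetical, and style 112 assumes the existing strings look like 'yyyymmdd', which the question does not confirm.

```sql
-- 1. Add a nullable datetime column (the name is hypothetical).
ALTER TABLE dbo.tableB ADD myColumnAsDate datetime NULL;
GO

-- 2. Copy the values across. Style 112 assumes 'yyyymmdd' strings;
--    rows stored in another format need their own UPDATE with the matching style,
--    and the statement fails if a value cannot be converted.
UPDATE dbo.tableB
SET    myColumnAsDate = CONVERT(datetime, myColumn, 112);
GO

-- 3. After verifying the copy, drop the old column and give the new one its name.
ALTER TABLE dbo.tableB DROP COLUMN myColumn;
EXEC sp_rename 'dbo.tableB.myColumnAsDate', 'myColumn', 'COLUMN';
GO
```

GO is the batch separator used by SSMS and sqlcmd, not a T-SQL statement; it is there so the UPDATE is compiled after the new column exists. The DROP COLUMN will be refused while any index or constraint still depends on the old column.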

3 Comments

You forgot step 4: Spend weeks or months hunting down all the errors caused in other code/reports from deleting a column
Well, datetime values not stored in datetime columns are also evil. Depending on how large the application using the database is, there might be a lot of problems - yes, but if you've used a good separation of concerns etc you won't have a lot of places to change. Why spend time hacking smelly code?
He does say it is a date column, he never actually said it was a varchar/nvarchar/whatever.
0

Ok, hold up. What is the actual data type of the column? I'm guessing it isn't DateTime, because you don't really control the formatting... it just stores a date. Can it be CAST or CONVERTed to a DateTime though?

So you might want

left outer join tableb on tableA.myColumn = CAST(tableb.MyColumn as DateTime)

That way you're not matching up a string, but the actual date, which should be more reliable. It's also simpler and easier to read. The real question is why the date isn't stored as a DateTime in the first place...


0

From the [dbo] prefix, I believe you're using SQL Server. While I don't have much experience with it, you can convert both fields to a specific date format:

select * from tableA
  Left Outer join tableB
       On CONVERT(CHAR(8), tableA.myColumn, 112) = CONVERT(CHAR(8), tableB.myColumn, 112)

The same should work on any DBMS, using the appropriate date formatting functions.

I don't know about SQL Server, but in Oracle you can create an index for the join expression.
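
For what it's worth, SQL Server can get a similar effect with an index on a persisted computed column. A minimal sketch, assuming tableB.myColumn is a datetime (the question does not say) and using the hypothetical names myColumnKey and IX_tableB_myColumnKey:

```sql
-- Store the formatted value once, as a persisted computed column.
-- CONVERT with style 112 on a datetime is deterministic, which indexing requires.
ALTER TABLE dbo.tableB
    ADD myColumnKey AS CONVERT(CHAR(8), myColumn, 112) PERSISTED;
GO

-- Index the computed column so the join can seek on it.
CREATE INDEX IX_tableB_myColumnKey ON dbo.tableB (myColumnKey);
GO

SELECT b.theColumnINeed
FROM   dbo.tableA AS a
LEFT OUTER JOIN dbo.tableB AS b
       ON CONVERT(CHAR(8), a.myColumn, 112) = b.myColumnKey;
```

The function is still applied to tableA.myColumn, but tableB is now matched through a plain indexed column.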


0

Well, you could use a subquery to properly format the data in either table before the join.

SELECT
  newB.columnINeed
FROM
  tableA AS A
LEFT OUTER JOIN (
  SELECT
    columnINeed
  , CASE WHEN myColumn > '1/1/2009' THEN FormatColumnOneWay(myColumn)
    ELSE FormatColumnAnotherWay(myColumn)
    END AS myColumn
  FROM
    tableB
) AS NewB ON A.myColumn = NewB.myColumn

If performance matters, you could use an indexed view (based on the subquery) instead of hard-coding the subquery into the overall query.

1 Comment

You may not be able to do this. I notice you are formatting B on the basis of A. My guess is you can probably format B without involving A, then do the join?
