22

Are there any formal techniques for refactoring SQL similar to this list here that is for code?

I am currently working on a massive query for a particular report and I'm sure there's plenty of scope for refactoring here which I'm just stumbling through myself bit by bit.

3
  • 1
    Not that I'm really aware of but I keep meaning to get around to looking at this book amazon.co.uk/gp/product/0321293533 Commented Mar 24, 2010 at 12:03
  • 1
    That book does look interesting, although I get the feeling it is more about refactoring the database design than the queries against an existing design. I could be wrong, though. Commented Mar 24, 2010 at 12:07
  • 1
    This one oreilly.com/library/view/refactoring-sql-applications/… was a fairly new reference when the question was asked. It has a chapter on the subject. Commented Oct 13, 2020 at 6:15

4 Answers

8

I have never seen an exhaustive list like the sample you provided.

The most effective way to refactor SQL that I have seen is to use the WITH clause (common table expressions). It allows you to break the SQL up into manageable parts, which can frequently be tested independently. In addition it can enable the reuse of query results, sometimes by the use of a system temporary table. It is well worth the effort to examine.

Here is a silly example:

WITH 
mnssnInfo AS
(
    SELECT SSN, 
           UPPER(LAST_NAME)  AS LAST_NAME,   -- aliased so later parts can refer to it by name
           UPPER(FIRST_NAME) AS FIRST_NAME, 
           TAXABLE_INCOME,
           CHARITABLE_DONATIONS
    FROM IRS_MASTER_FILE
    WHERE STATE = 'MN'                 AND -- limit to Minne-so-tah
          TAXABLE_INCOME > 250000      AND -- is rich 
          CHARITABLE_DONATIONS > 5000      -- might donate too
),
doltishApplicants AS
(
    SELECT SSN, SAT_SCORE, SUBMISSION_DATE
    FROM COLLEGE_ADMISSIONS
    WHERE SAT_SCORE < 100          -- Not as smart as the average moose.
),
todaysAdmissions AS
(
    SELECT doltishApplicants.SSN, 
           TRUNC(doltishApplicants.SUBMISSION_DATE)  SUBMIT_DATE, 
           mnssnInfo.LAST_NAME, mnssnInfo.FIRST_NAME, 
           mnssnInfo.TAXABLE_INCOME
    FROM mnssnInfo
    JOIN doltishApplicants
      ON mnssnInfo.SSN = doltishApplicants.SSN
)
SELECT 'Dear ' || FIRST_NAME || 
       ' your admission to WhatsaMattaU has been accepted.'
FROM todaysAdmissions
WHERE SUBMIT_DATE = TRUNC(SYSDATE)    -- For stuff received today only

One of the other things I like about this form is that it allows you to separate the filtering from the joining. As a result, you can frequently copy out the subqueries and execute them standalone to view the result set associated with each one.
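To make the "copy out the subqueries and run them on their own" point concrete, here is a minimal sketch. It is only an illustration: it uses Python's built-in sqlite3 module rather than Oracle, and the table layout, sample rows, and the names `RICH_MN`, `LOW_SCORERS` and `FULL_QUERY` are all made up for the demo, loosely following the example above.

```python
import sqlite3

# Hypothetical, simplified stand-ins for the tables in the example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE irs_master_file (ssn TEXT, first_name TEXT, state TEXT, taxable_income INTEGER);
    CREATE TABLE college_admissions (ssn TEXT, sat_score INTEGER);
    INSERT INTO irs_master_file VALUES
        ('111', 'Rocky',   'MN', 300000),
        ('222', 'Boris',   'MN', 40000),
        ('333', 'Natasha', 'WI', 900000);
    INSERT INTO college_admissions VALUES ('111', 90), ('222', 80), ('333', 1500);
""")

# Each filtering step is kept as its own named fragment.
RICH_MN = """
    SELECT ssn, UPPER(first_name) AS first_name
    FROM irs_master_file
    WHERE state = 'MN' AND taxable_income > 250000
"""
LOW_SCORERS = """
    SELECT ssn, sat_score
    FROM college_admissions
    WHERE sat_score < 100
"""

# The final statement only joins the named parts; the filtering lives in the CTEs.
FULL_QUERY = f"""
    WITH rich_mn AS ({RICH_MN}),
         low_scorers AS ({LOW_SCORERS})
    SELECT rich_mn.first_name, low_scorers.sat_score
    FROM rich_mn
    JOIN low_scorers ON low_scorers.ssn = rich_mn.ssn
"""

# Every part can be executed standalone to inspect its result set.
rich_rows = conn.execute(RICH_MN).fetchall()
low_rows = conn.execute(LOW_SCORERS + " ORDER BY ssn").fetchall()
full_rows = conn.execute(FULL_QUERY).fetchall()
```

If a step returns something unexpected, you only have to look at that one fragment instead of the whole report query.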


4 Comments

Good, if they're on SQL Server 2005+. Before that you can't use WITH, so temp tables are your friend. (Note that for testing, it can be best to start with temp tables and convert them to WITH clauses once you're happy with them, so your built-up tables stay around and waiting rather than having to be rebuilt each time you want to check something.)
You can also use views and inline table-valued functions instead of CTEs.
"In addition it can enable the reuse of query results, sometimes by the use of a system temporary table" - This is false. In every SQL implementation I know, CTEs are inlined; they are not cached.
@j. Oracle can store the results in a temp table.
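
The temp-table workflow from the first comment above (build the step as a temp table while exploring, then fold it into a WITH clause) might look roughly like this. It is a sketch under assumptions: SQLite via Python's sqlite3 stands in for SQL Server, and the table contents and the names `STEP_SQL`, `low_scorers` and `low_scorers_cte` are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE college_admissions (ssn TEXT, sat_score INTEGER);
    INSERT INTO college_admissions VALUES ('111', 90), ('222', 1400), ('333', 60);
""")

# One step of the bigger query, written once and reused in both forms.
STEP_SQL = "SELECT ssn, sat_score FROM college_admissions WHERE sat_score < 100"

# While exploring: materialise the step once, then poke at it as often as needed.
conn.execute(f"CREATE TEMP TABLE low_scorers AS {STEP_SQL}")
from_temp = conn.execute("SELECT COUNT(*) FROM low_scorers").fetchone()[0]

# Once happy: fold the same step into the final statement as a CTE.
from_cte = conn.execute(
    f"WITH low_scorers_cte AS ({STEP_SQL}) SELECT COUNT(*) FROM low_scorers_cte"
).fetchone()[0]
```

Both forms return the same rows here; whether the engine materialises or inlines the CTE is a separate question and depends on the database, as the comments above point out.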
3

There is a book on the subject: "Refactoring Databases". I haven't read it, but it got 4.5/5 stars on Amazon and is co-authored by Scott Ambler, which are both good signs.

1 Comment

It's also from the Martin Fowler signature series, which is another good sign - I don't know of anyone who has done more for popularising refactoring as a technique for clean code than Fowler has.
1

Not that I've ever found. I've mostly done SQL Server work and the standard techniques are:

  • Parameterise hard-coded values that might change (so the query can be cached)
  • Review the execution plan, check where the big monsters are and try changing them
  • Index tuning wizard (but beware you don't cause chaos elsewhere from any changes you make for this)
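
As a rough illustration of the first bullet, here is a parameterised query. This is a sketch only: it uses Python's sqlite3 with `?` placeholders instead of SQL Server, and the `orders` table, its rows, and the `region_report` helper are hypothetical. The point is that the statement text stays identical from call to call and only the bound values change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, total INTEGER);
    INSERT INTO orders (region, total) VALUES ('north', 120), ('north', 30), ('south', 500);
""")

# The values that might change are placeholders, not literals baked into the text.
REPORT_SQL = """
    SELECT COUNT(*), SUM(total)
    FROM orders
    WHERE region = ? AND total >= ?
"""

def region_report(region, min_total):
    """Run the same statement text with different bound values."""
    return conn.execute(REPORT_SQL, (region, min_total)).fetchone()

north = region_report("north", 100)
south = region_report("south", 100)
```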

If you're still stuck, many reports don't depend on 100% live data - try precalculating portions of the data (or the whole lot) on a schedule such as overnight.
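
A minimal sketch of the precalculation idea, assuming a summary table that a scheduled job rebuilds. The `sales` and `daily_sales_summary` tables and the `refresh_summary` function are made up for the example, and SQLite via Python's sqlite3 is used only so the snippet runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (sale_day TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('2010-03-23', 10), ('2010-03-23', 15), ('2010-03-24', 7);
    CREATE TABLE daily_sales_summary (sale_day TEXT PRIMARY KEY, total INTEGER);
""")

def refresh_summary(conn):
    """Rebuild the summary table; meant to be called by a scheduled (e.g. nightly) job."""
    with conn:  # one transaction: readers never see a half-built summary
        conn.execute("DELETE FROM daily_sales_summary")
        conn.execute("""
            INSERT INTO daily_sales_summary (sale_day, total)
            SELECT sale_day, SUM(amount) FROM sales GROUP BY sale_day
        """)

refresh_summary(conn)

# The report now reads the small precalculated table instead of the raw data.
summary = conn.execute(
    "SELECT sale_day, total FROM daily_sales_summary ORDER BY sale_day"
).fetchall()
```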

1 Comment

Sounds like you're talking about optimising (improving performance) rather than refactoring (improving design).
1

Not about techniques as much, but this question might help you find SQL refactoring tools:

Is there a tool for refactoring SQL, a bit like a ReSharper for SQL
