I can't quite seem to figure out how to add data only where it exists.

I have a statement that I would like to add fields to. Rather than only pulling employees when the criteria are met (i.e. the WHERE clause), I would like to attach the extra data if and only if it exists. My base statement pulls 30 records, but when I add more conditions to my WHERE clause (to include other fields), the record count drops to 20. How do I retain my 30 records while also including details from separate tables (if they exist)?

My base statement - it pulls 30 records

SELECT DISTINCT EMPLOYEE_NUM "Employee #", 
START_DATE "Start Date",
NAME "Employee Name"
FROM EMPLOYEES E
JOIN EMPLOYEE_DETAILS D ON D.EMPLOYEE_ID = E.EMPLOYEE_ID
WHERE D.START_DATE >= DATE '2016-12-14'
ORDER BY 1;

Output Ex.

    Employee # | Start Date | Employee Name
    1234         12/15/2017   Jim Doe
    1456         01/16/2017   John Dillin
    5435         04/23/2017   Jane Mitchel
    9876         09/12/2017   Joan Smith
    7655         10/14/2017   Barry Gibb 
   ...25 more records

Detailed Statement to include extra fields - it only pulls 20 records

SELECT DISTINCT EMPLOYEE_NUM "Employee #", 
START_DATE "Start Date",
NAME "Employee Name",
OS.ONBOARDING_LOCATION "On-boarding Location",
OS.COMPLETION_DATE "Completion Date"
FROM EMPLOYEES E
JOIN EMPLOYEE_DETAILS D ON D.EMPLOYEE_ID = E.EMPLOYEE_ID
JOIN ONBOARDING_STATUS OS ON OS.EMPLOYEE_ID = E.EMPLOYEE_ID
WHERE D.START_DATE >= DATE '2016-12-14'
AND OS.DESCRIPTION LIKE 'START'
AND OS.CANCELLED IS NULL
ORDER BY 1;

Output Example

    Employee # | Start Date | Employee Name | On-boarding Location | Completion Date
    1234         12/15/2017   Jim Doe           Sacramento, CA         12/13/2017
    1456         01/16/2017   John Dillin       Atlanta, GA            01/19/2017
    7655         10/14/2017   Barry Gibb        Los Angeles, CA        10/17/2017
   ...17 more records

Here is what I tried to do, but it only duplicated the records:

SELECT DISTINCT EMPLOYEE_NUM "Employee #", 
START_DATE "Start Date",
NAME "Employee Name",
(CASE 
   WHEN OS.DESCRIPTION LIKE 'START' AND OS.CANCELLED IS NULL
   THEN OS.ONBOARDING_LOCATION
   ELSE NULL
END)"On-boarding Location",
(CASE 
   WHEN OS.DESCRIPTION LIKE 'START' AND OS.CANCELLED IS NULL
   THEN OS.COMPLETION_DATE
   ELSE NULL
END)"Completion Date"
FROM EMPLOYEES E
JOIN EMPLOYEE_DETAILS D ON D.EMPLOYEE_ID = E.EMPLOYEE_ID
JOIN ONBOARDING_STATUS OS ON OS.EMPLOYEE_ID = E.EMPLOYEE_ID
WHERE D.START_DATE >= DATE '2016-12-14'
ORDER BY 1;

My last attempt pulls the data, but it doesn't seem to adhere to the CASE WHEN expressions and it duplicates a lot of the records. Please let me know if that doesn't make sense. Any help or tips you can provide would be much appreciated.

Thanks in advance!

  • Use left join instead of join. This way when no related rows exist, you'll still get the main row (and null values in the related columns). Commented May 31, 2018 at 19:10

1 Answer


Use OUTER joins, as in:

SELECT DISTINCT
    EMPLOYEE_NUM "Employee #",
    START_DATE "Start Date",
    NAME "Employee Name",
    OS.ONBOARDING_LOCATION "On-boarding Location",
    OS.COMPLETION_DATE "Completion Date"
FROM EMPLOYEES E
LEFT JOIN EMPLOYEE_DETAILS D ON D.EMPLOYEE_ID = E.EMPLOYEE_ID
    AND D.START_DATE >= DATE '2016-12-14'
LEFT JOIN ONBOARDING_STATUS OS ON OS.EMPLOYEE_ID = E.EMPLOYEE_ID
    AND OS.DESCRIPTION LIKE 'START'
    AND OS.CANCELLED IS NULL
ORDER BY 1;

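Please note that I moved the filtering conditions from the WHERE clause into the join clauses so the outer joins are preserved. If you keep filters on the outer-joined tables in the WHERE clause, the unmatched rows (whose columns from those tables are all NULL) fail comparisons such as LIKE 'START' or >=, which effectively converts the joins back into inner joins, and you don't want that.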
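If the goal is to keep exactly the rows of the base statement (the 30 employees whose start date passes the filter) and only make the on-boarding columns optional, a variant worth trying is to leave EMPLOYEE_DETAILS as an inner join with its filter in WHERE, and outer-join only ONBOARDING_STATUS. This is a sketch, not tested against the real schema. The E. prefixes on EMPLOYEE_NUM and NAME are an assumption about which table holds those columns.

```sql
-- Sketch: keep the base row set, make only the on-boarding data optional.
SELECT DISTINCT
    E.EMPLOYEE_NUM "Employee #",           -- assumed to be a column of EMPLOYEES
    D.START_DATE "Start Date",
    E.NAME "Employee Name",                -- assumed to be a column of EMPLOYEES
    OS.ONBOARDING_LOCATION "On-boarding Location",
    OS.COMPLETION_DATE "Completion Date"
FROM EMPLOYEES E
-- Inner join plus WHERE filter: same row set as the base statement.
JOIN EMPLOYEE_DETAILS D ON D.EMPLOYEE_ID = E.EMPLOYEE_ID
-- Outer join with the OS filters in ON: employees without a matching
-- OS row are kept, with NULL in the two on-boarding columns.
LEFT JOIN ONBOARDING_STATUS OS
       ON OS.EMPLOYEE_ID = E.EMPLOYEE_ID
      AND OS.DESCRIPTION LIKE 'START'
      AND OS.CANCELLED IS NULL
WHERE D.START_DATE >= DATE '2016-12-14'
ORDER BY 1;
```

The difference matters because a date filter placed in the ON clause of a LEFT JOIN does not remove employees. Those who fail it are still returned, just with a NULL start date.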


5 Comments

Thanks for this! Unfortunately it's still dropping records that don't have OS.DESCRIPTION LIKE 'START'. I can't quite seem to figure out how to add data to the original data set, rather than further thinning the data set to meet the criteria.
Trying to understand what you want exactly. The query retrieves all rows from E with (optional) accompanying rows from OS, but not all rows in OS are considered: only the ones with START in the description. Do you want something different?
Sorry about that, I'm still trying to wrap my head around it myself. My base statement above returns 30 unique items. I'd like to add a column that shows which of these records have a start date. If they have a start date, display it. If not, leave it blank. Instead, what I'm getting in return is fewer records. It's dropping the records that don't meet OS.DESCRIPTION LIKE 'START'. When I try a LEFT JOIN it seems to pull what I need, but it duplicates some of the records. And in those duplicated records the Completion Date is blank. There may be an issue with their data...
If you have multiple rows in OS for each row in E, then of course you'll get duplicate rows. What do you want in those cases? Randomly pick one OS row, the first one, the last one?
You're right. I was thinking about this backwards. Your original solution was the right one. There was a secondary field forcing the record to be duplicated. When I added the new condition, it cut out the duplicated record (it wasn't actually duplicated, but since I didn't tell it NOT to pull those, it pulled them) while maintaining the original set. Thanks for working with me through this! I really appreciate it!
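For the case raised in the comments above, where OS holds several qualifying rows per employee and you need to pick exactly one, one possible approach is to rank the OS rows per employee with ROW_NUMBER() and join only the top-ranked row. This is a sketch under assumptions: "latest COMPLETION_DATE wins" is a hypothetical rule, the alias OS1 and column RN are made up for illustration, and EMPLOYEE_DETAILS is assumed to hold one row per employee.

```sql
-- Sketch: at most one on-boarding row per employee.
SELECT
    E.EMPLOYEE_NUM "Employee #",           -- assumed to be a column of EMPLOYEES
    D.START_DATE "Start Date",
    E.NAME "Employee Name",                -- assumed to be a column of EMPLOYEES
    OS1.ONBOARDING_LOCATION "On-boarding Location",
    OS1.COMPLETION_DATE "Completion Date"
FROM EMPLOYEES E
JOIN EMPLOYEE_DETAILS D ON D.EMPLOYEE_ID = E.EMPLOYEE_ID
LEFT JOIN (
    SELECT OS.EMPLOYEE_ID,
           OS.ONBOARDING_LOCATION,
           OS.COMPLETION_DATE,
           -- Rank qualifying rows per employee; the ordering is an assumption.
           ROW_NUMBER() OVER (
               PARTITION BY OS.EMPLOYEE_ID
               ORDER BY OS.COMPLETION_DATE DESC NULLS LAST
           ) AS RN
    FROM ONBOARDING_STATUS OS
    WHERE OS.DESCRIPTION LIKE 'START'
      AND OS.CANCELLED IS NULL
) OS1
       ON OS1.EMPLOYEE_ID = E.EMPLOYEE_ID
      AND OS1.RN = 1                       -- keep only the top-ranked row
WHERE D.START_DATE >= DATE '2016-12-14'
ORDER BY 1;
```

Filtering inside the subquery is safe here because it runs before the outer join, so it narrows the OS rows without removing any employees.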
