Postgres "NOT IN" operator usage

Question

In my database when I run the following query, I get 1077 as output.

select count(distinct a_t1) from t1;

Then, when I run this query, I get 459.

select count(distinct a_t1) from t1 
where a_t1 in (select a_t1 from t1 join t2 using (a_t1_t2) where a_t2=0);

The above is equivalent to this query, which also gives 459:

select count(distinct a_t1) from t1 join t2 using (a_t1_t2) where a_t2=0;

But when I run this query, I get 0 instead of 618 which I was expecting:

select count(distinct a_t1) from t1 
where a_t1 not in (select a_t1 from t1 join t2 using (a_t1_t2) where a_t2=0);

I am running PostgreSQL 9.1.5, though the version probably doesn't matter here. Please point out my mistake in the above query.

UPDATE 1: I created a new table and output the result of the subquery above into that one. Then, I ran a few queries:

select count(distinct a_t1) from t1
where a_t1 not in (select a_t1 from sub_query_table order by a_t1 limit 10);

And hooray, now I get 10 as the answer! I was able to increase the limit up to 450. After that, I started getting 0 again.

UPDATE 2:

The sub_query_table has 459 values in it. Finally, this query gives me the required answer:

select count(distinct a_t1) from t1
where a_t1 not in (select a_t1 from sub_query_table order by a_t1 limit 459);

Whereas this one gives 0 as the answer:

select count(distinct a_t1) from t1
where a_t1 not in (select a_t1 from sub_query_table);

But, why is this happening?

OkieOth · Accepted Answer · 2012-09-24 04:18:51Z

2

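The 'NOT IN' operator only behaves as expected over non-NULL values. A row whose a_t1 is NULL is never matched, because comparing NULL with anything yields NULL rather than true. Add an explicit IS NULL check to include such rows: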

select count(distinct a_t1) from t1 
where a_t1 not in (select a_t1 from t1 join t2 using (a_t1_t2) where a_t2=0) OR a_t1 IS NULL;
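
To make the NULL behavior described above concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption; both follow the same three-valued logic for this case), and the table `excluded` and all sample rows are hypothetical, not taken from the question.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a_t1 INTEGER);
    INSERT INTO t1 VALUES (1), (2), (3), (NULL);
    -- 'excluded' is a hypothetical stand-in for the subquery result.
    CREATE TABLE excluded (a_t1 INTEGER);
    INSERT INTO excluded VALUES (1);
""")

# NULL NOT IN (1) evaluates to NULL, not true, so the NULL row is dropped.
# count(*) is used because count(a_t1) would skip NULL values anyway.
plain = conn.execute(
    "SELECT count(*) FROM t1 "
    "WHERE a_t1 NOT IN (SELECT a_t1 FROM excluded)"
).fetchone()[0]

# The extra IS NULL check brings the NULL row back.
with_null = conn.execute(
    "SELECT count(*) FROM t1 "
    "WHERE a_t1 NOT IN (SELECT a_t1 FROM excluded) OR a_t1 IS NULL"
).fetchone()[0]

print(plain, with_null)
```

Note that `count(distinct a_t1)`, as used in the question, ignores NULL values, so the `OR a_t1 IS NULL` clause changes which rows pass the filter but may not change that particular count.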


6 Comments

Phani Over a year ago

True, but that only accounts for 1 value of a_t1, there are 617 other values of a_t1 which I don't see in the result of my last query.

OkieOth Over a year ago

@Phani Is it possible that the DISTINCT reduces the count? Do all entries in the subselect have the same value?

Phani Over a year ago

I'm not sure I follow what you say. The 1st query shows that there are 1077 distinct a_t1 values. Then 2nd query shows that out of 1077 values, 459 distinct values are present in the result of subquery. So, there should be 618 distinct a_t1 values that are not present in the result of the subquery. But the last query returns 0 and if I update it with OR a_t1=0, I get 1 as the result.

OkieOth Over a year ago

@Phani Did you type 'OR a_t1=0' or 'OR a_t1 IS NULL'?

Phani Over a year ago

I am sorry. I did use OR a_t1 IS NULL.
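
The thread leaves the other 617 values unexplained. One explanation consistent with standard SQL semantics, though not confirmed anywhere in this thread, is that the subquery result itself contains a NULL: in that case `x NOT IN (...)` evaluates to NULL for every x that is not in the list, so no row passes the filter. The sketch below checks for that and shows two common workarounds. It again uses Python's sqlite3 as a stand-in for PostgreSQL, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a_t1 INTEGER);
    INSERT INTO t1 VALUES (1), (2), (3), (4);
    -- Hypothetical contents: note the NULL among the subquery values.
    CREATE TABLE sub_query_table (a_t1 INTEGER);
    INSERT INTO sub_query_table VALUES (1), (2), (NULL);
""")

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

# Diagnostic: does the subquery result contain any NULLs?
nulls_in_sub = scalar(
    "SELECT count(*) FROM sub_query_table WHERE a_t1 IS NULL"
)

# With a NULL in the list, NOT IN is never true, so nothing is counted.
not_in = scalar(
    "SELECT count(DISTINCT a_t1) FROM t1 "
    "WHERE a_t1 NOT IN (SELECT a_t1 FROM sub_query_table)"
)

# Workaround 1: filter NULLs out of the subquery.
filtered = scalar(
    "SELECT count(DISTINCT a_t1) FROM t1 "
    "WHERE a_t1 NOT IN "
    "(SELECT a_t1 FROM sub_query_table WHERE a_t1 IS NOT NULL)"
)

# Workaround 2: NOT EXISTS, which is not affected by NULLs in the subquery.
not_exists = scalar(
    "SELECT count(DISTINCT a_t1) FROM t1 "
    "WHERE NOT EXISTS "
    "(SELECT 1 FROM sub_query_table s WHERE s.a_t1 = t1.a_t1)"
)

print(nulls_in_sub, not_in, filtered, not_exists)
```

If the diagnostic count is zero on the real data, this explanation does not apply and the cause lies elsewhere.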
