4

I'm looking for a easy function to get the number of datasets (of any table) containing NULL values. It's not important how many columns are involved.

For example:

CREATE TABLE tab1 (
id INTEGER PRIMARY KEY,
dat1 VARCHAR,
dat2 VARCHAR);

INSERT INTO tab1 VALUES 
(1, NULL, 'abc'), 
(2, 'abc', NULL), 
(3, 'abc', 'abc');

SELECT count(*)
FROM tab1
WHERE id IS NULL OR dat1 IS NULL OR dat2 IS NULL;
-- 2

(I know that Primary Keys are not likely to habe NULLs ;)

My current solution uses R to build the query. Any suggestions?

1
  • What version of postgres are you on? Commented Jun 6, 2011 at 13:15

2 Answers 2

3

You will need to use dynamic SQL to acheive this, perhaps like this:

testbed:

create role stack;
create schema authorization stack;
set role stack;

create table my_table as 
select generate_series(0,9) as id, null::integer as val;

create table my_table2 as 
select g as id, case when g<=5 then null::integer else 1 end as val
from generate_series(0,9) g;

function:

create function get_with_nulls_rowcount(p_schema in text, p_table in text) 
                returns integer language plpgsql as $$
declare 
  n integer;
  s text;
begin
  --
  select array_to_string(array_agg(column_name::text||' is null'), ' or ')
  into s
  from information_schema.columns
  where table_schema=p_schema and table_name=p_table;
  --
  execute 'select count(*) from '||p_table||' where '||s into n;
  return n;
end;$$;

query:

select table_schema, table_name, 
       get_with_nulls_rowcount(table_schema, table_name)
from information_schema.tables
where table_schema='stack';

result:

 table_schema | table_name | get_with_nulls_rowcount
--------------+------------+-------------------------
 stack        | my_table   |                      10
 stack        | my_table2  |                       6
(2 rows)
2

Add anything to NULL and the result is NULL. So I'd go by using:

SELECT count(*)
FROM tab1
WHERE --id IS NULL OR  /*PK is NULL? :-)*/
    (dat1 + dat2) IS NULL;

But that it's just easier to write, I don't think it's faster than you solution.

2
  • Thanks for the quick answer. The query for a paticular table seems to be quite clear - I'm actually looking for a function which can calculate this for any given table! Commented Jun 6, 2011 at 11:21
  • 1
    I don't think there is a function for showing if a row has or not null values, but I'll better let the wiser guys tell you. If not, I would build a dynamic sql statement that goes to the system tables and shows the table's columns and build a statement similar to the one from the answer. Commented Jun 6, 2011 at 11:47

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.