In Postgres 9.1, is it possible to skip rows whose NAME value is equal to that of the row before? For example, given the following table:

ID | NAME | AGE | SEX | CLASS
---------------------------------
1    Paul   17    M     2b
2    Paul   16    M     2b
3    Paul   18    F     2b
4    Lexi   18    M     2b
5    Sarah  16    F     2b
6    Sarah  17    F     2b

The result should be:

1    Paul   17    M     2b
4    Lexi   18    M     2b
5    Sarah  16    F     2b

Thanks for your help,

t book

2 Answers

select *
from (
  select id, 
         name, 
         age, 
         sex, 
         class, 
         lag(name) over (order by id) as prev_name
  from the_table
) as t
where name is distinct from prev_name; -- unlike <>, this keeps the first row, whose prev_name is null

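Alternatively, keep only the first row for each name (unlike the lag() version, this also drops a name that reappears later in non-adjacent rows):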

select *
from (
  select id, 
         name, 
         age, 
         sex, 
         class, 
         row_number() over (partition by name order by id) as rn
  from the_table
) as t
where rn = 1;

Another option would be to use Postgres' distinct on clause:

select distinct on (name) 
       id, 
       name,
       age,
       sex,
       class
from the_table
order by name,id

but that will return the result ordered by name (which is a limitation of distinct on: its expressions must come first in the order by). If you don't want that, you'll need to wrap the query in another select:

select *
from (
  select distinct on (name) 
         id, 
         name,
         age,
         sex,
         class
  from the_table
  order by name,id
) t
order by id;
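
To try the queries above, the question's sample data can be loaded with something like the following. This is only a sketch: the column types are assumptions (the question shows values, not a schema), and the table name the_table simply follows the queries above.

```sql
-- Assumed schema: the question only shows the column names and values.
create table the_table (
  id    integer primary key,
  name  text,
  age   integer,
  sex   text,
  class text
);

-- Sample rows copied from the question.
insert into the_table (id, name, age, sex, class) values
  (1, 'Paul',  17, 'M', '2b'),
  (2, 'Paul',  16, 'M', '2b'),
  (3, 'Paul',  18, 'F', '2b'),
  (4, 'Lexi',  18, 'M', '2b'),
  (5, 'Sarah', 16, 'F', '2b'),
  (6, 'Sarah', 17, 'F', '2b');
```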

1 Comment

WOW, this was fast! I'll try it…
1
SELECT ID , NAME , AGE , SEX , CLASS
FROM thetable t
WHERE NOT EXISTS (
    SELECT * FROM thetable nx
    WHERE nx.NAME = t.NAME
    -- AND nx.ID < t.ID -- ANY one before it
    AND nx.ID = t.ID-1  -- THE one before it
    );
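
Note that nx.ID = t.ID-1 only identifies "the one before" when the ids are consecutive. If the ids can have gaps (an assumption, since the question's sample has none), the preceding row could be looked up explicitly instead. A possible sketch of that variant, not benchmarked:

```sql
SELECT t.id, t.name, t.age, t.sex, t.class
FROM thetable t
WHERE NOT EXISTS (
    SELECT 1
    FROM thetable nx
    WHERE nx.name = t.name
      -- the closest smaller id, i.e. the row directly before t;
      -- for the first row this is NULL, so nothing matches and the row is kept
      AND nx.id = (SELECT max(p.id) FROM thetable p WHERE p.id < t.id)
    );
```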

8 Comments

I know that, but the OP was not very clear in his requirements ("before it"). That's why I added the other exclusion clause and commented it out. BTW: in most cases, EXISTS will be faster than anything else.
Both work, thanks; wildpassers' seems a bit faster with 11000 rows! One more stupid question: would you add a where clause for the class (2b) after AND nx.ID = t.ID-1, i.e. AND t.CLASS = "2b"?
I'm actually a bit surprised that the correlated subquery seems to be faster. It requires two scans on the table (albeit the second one only partially), whereas the solution with the window function only requires a single scan.
My fault, too long in front of the computer: yours was faster, I mixed something up. Time for a dog walk, thanks for your help!
Speed depends on the index structure, obviously. Normally one would expect name and id to be covered by usable indexes. Small queries will always use a hashed plan, which will probably favor the DISTINCT ON. For queries that outgrow work_mem, EXISTS will probably win again.
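
On the class filter asked about in the comments: in Postgres, double quotes denote identifiers, so "2b" would be read as a column name; string literals take single quotes. A minimal sketch, assuming the filter should apply to the outer row only (the thread does not say whether the preceding row should also be restricted to the same class):

```sql
SELECT t.id, t.name, t.age, t.sex, t.class
FROM thetable t
WHERE t.class = '2b'          -- string literal, hence single quotes
  AND NOT EXISTS (
    SELECT 1
    FROM thetable nx
    WHERE nx.name = t.name
      AND nx.id = t.id - 1    -- THE one before it, as in the answer above
    );
```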