
I have a simple table generated with a subquery that applies many different filters.

  | project
1 | Hello
2 | Hello 2.0
3 | Ordinary Sheep
4 | Sheep

The next step is to remove projects with very similar names (for example, if a project has the same name but followed by a 2.0).

In this case I need my query to remove Hello 2.0 from the results. This little issue is more challenging than I expected.

My best attempt so far is the query below. It correctly identifies the project that should be excluded, but if I invert the condition I end up with duplicated rows because of the self join.

SELECT 
    q1.name,
    q2.name
FROM subquery q1
JOIN subquery q2 ON q1.name LIKE q2.name || '%'
WHERE q1.id <> q2.id;

Thank you so much!

1 Answer


Maybe you can match the first occurrence of a digit in the project name and strip it together with everything after it, then apply RTRIM and DISTINCT to the result. However, this will not work if the base project name itself contains a digit.

with s as 
( 
   --your query that you have inside sub-query
)
select DISTINCT RTRIM(regexp_replace(project, '^([^\d]+)\d.*$','\1')) from s;
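If digits inside project names are a concern, another option might be an anti-join: keep a row only when no other row's name is a prefix of it. This is just a sketch, not tested against your real subquery. It assumes PostgreSQL syntax, and the inline sample rows and the id column are stand-ins for whatever your subquery actually returns.

-- Hypothetical sample rows standing in for the real subquery.
WITH s (id, project) AS (
    VALUES (1, 'Hello'),
           (2, 'Hello 2.0'),
           (3, 'Ordinary Sheep'),
           (4, 'Sheep')
)
SELECT q1.id, q1.project
FROM s q1
-- Keep q1 only if no OTHER row's name is a prefix of q1's name.
WHERE NOT EXISTS (
    SELECT 1
    FROM s q2
    WHERE q2.id <> q1.id
      AND q1.project LIKE q2.project || '%'
);

Because NOT EXISTS only filters q1 and never multiplies rows, it avoids the duplicates that the plain self join produces. Two caveats: rows with exactly the same name but different ids would exclude each other, and any % or _ characters inside a project name are treated as LIKE wildcards. If a prefix match is too broad for your data (for example, Sheep would also exclude Sheepdog), concatenating ' %' instead of '%' would require a space after the shorter name.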

