2

I am trying to concatenate 2 columns (one is a CHAR(7) and the other is CHAR(6)), add an _ separator between the 2 and limit the result to 10,000 records. I also want to take a random sample.

I have read a number of excellent posts on here including:

Combine two columns and add into one new column

The above post as well as others I have looked at had great starting points for me but I have not yet managed to deliver the expected result.

First I tried:

SELECT osnrth1m||'_'||oseast1m FROM postzon
ORDER BY RANDOM()
LIMIT 10000

While it delivered the correct result of the form '1234567_123456' (ignore the quotes) there were a number of results that were '_'.

The linked post above said that this was because using the concatenation operator (i.e attribute1||attribute2) does not account for NULL values.

I then tried to use the COALESCE function:

SELECT COALESCE(osnrth1m,'')||'_'||COALESCE(oseast1m,'') FROM postzon
ORDER BY RANDOM()
LIMIT 10000;

But this also produced some results that were just underscores as per previous.

I then tried the CONCAT_WS function:

SELECT CONCAT_WS('_',osnrth1m,oseast1m) FROM postzon
ORDER BY RANDOM()
LIMIT 10000;

But unfortunately again some rows returned with just an underscore.

Finally I tried messing about with CASE - I couldn't work out a way to write the query so that NULL values were ignored. In fact the last one I tried specifically says if NULL then show result as NULL (that's what I think it's saying anyway!):

SELECT CASE WHEN (osnrth1m,oseast1m) IS NULL THEN NULL
        ELSE CONCAT_WS('_',osnrth1m,oseast1m) 
        END
        FROM postzon
ORDER BY RANDOM()
LIMIT 10000;

I'm sorry if I'm missing something glaringly obvious but could anyone suggest a way achieve this that will just skip NULL values completely?

Something along the lines of:

SELECT CASE WHEN (osnrth1m,oseast1m) IS NULL THEN IGNORE/DROP/EXCLUDE
            ELSE CONCAT_WS('_',osnrth1m,oseast1m) 
            END
            FROM postzon
    ORDER BY RANDOM()
    LIMIT 10000;

Is there a PostgreSQL operator like IGNORE or DROP or EXCLUDE?

For each of the queries above I tried including the following line after the SELECT statement:

WHERE (osnrth1m IS NOT NULL OR oseast1m IS NOT NULL) 

but this didn't seem to change the results.

Any help would be gratefully received!

Thanks

4
  • this should run fine, weird SELECT COALESCE(osnrth1m,'')||'_'||COALESCE(oseast1m,'') FROM postzon WHERE (osnrth1m IS NOT NULL OR oseast1m IS NOT NULL) ORDER BY RANDOM() LIMIT 10000; Commented Nov 16, 2015 at 15:32
  • @LongBeard_Boldy - thanks for the quick reply. I tried this again but this time copying it from your answer incase I had made a silly typo - still gives me some results with underscores. Have you got any other suggestions? Many thanks Commented Nov 16, 2015 at 15:46
  • maybe ty this one ? SELECT COALESCE(osnrth1m,'')||''||COALESCE(oseast1m,'') FROM postzon WHERE not (COALESCE(osnrth1m,'')||''||COALESCE(oseast1m,'') = '_') ORDER BY RANDOM() LIMIT 10000; Commented Nov 16, 2015 at 16:01
  • @LongBeard_Boldy - thanks so much mate, managed to get it working using stas.yaranov's suggestion. Really appreciate your help - if I could select 2 answers than I would! Commented Nov 16, 2015 at 16:05

1 Answer 1

1

You need both osnrth1m and oseast1m to be not null. This is why you should use and but not or:

WHERE (osnrth1m IS NOT NULL AND oseast1m IS NOT NULL)

And the query is:

SELECT
        CONCAT_WS('_',osnrth1m,oseast1m)
    FROM postzon
    WHERE (osnrth1m IS NOT NULL AND oseast1m IS NOT NULL)
    ORDER BY RANDOM()
    LIMIT 10000;

If there a empty osnrth1m and oseast1m values which also need to be excluded then:

SELECT
            CONCAT_WS('_',osnrth1m,oseast1m)
        FROM postzon
        WHERE
            osnrth1m IS NOT NULL AND
            oseast1m IS NOT NULL AND
            osnrth1m <> '' AND
            oseast1m <> ''
        ORDER BY RANDOM()
        LIMIT 10000;
Sign up to request clarification or add additional context in comments.

3 Comments

Hi @stas.yaranov, thanks for the quick reply. I had actually tried the AND instead of OR but was worried that missed the case where one of them was NULL but the other wasn't. I tried your query anyway (the one I tried before using AND just used ||) unfortunately it also produced some results with just underscores. Have you got any other ideas? Many thanks
stas.yaranov is right, your main problem is not with null values, those are empty strings. thats why you get those underscores
@stas.yaranov Thanks so much that was exactly what that was! I was under the impression that all blank attributes in this database were NULL, not empty strings so a massive massive thanks for pointing that out to me asI would never have thought about that - life saver!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.