11

I'm writing a PostgreSQL function to count the number of times a particular text substring occurs in another piece of text. For example, calling count('foobarbaz', 'ba') should return 2.

I understand that to test whether the substring occurs at all, I can use a condition like the one below:

    WHERE 'foobarbaz' like '%ba%'

However, I need it to return 2 for the number of times 'ba' occurs. How can I proceed?

Thanks in advance for your help.

5 Answers

14

I would highly suggest checking out this answer I posted to "How do you count the occurrences of an anchored string using PostgreSQL?". The chosen answer was shown to be massively slower than an adapted version of regexp_replace(). The overhead of creating the rows and then running the aggregate is simply too high.

The fastest way to do this is as follows...

SELECT
  (length(str) - length(replace(str, replacestr, '')) )::int
  / length(replacestr)
FROM ( VALUES
  ('foobarbaz', 'ba')
) AS t(str, replacestr);

Here we

  1. Take the length of the string, L1.
  2. Subtract from L1 the length of the string with every occurrence of the substring removed, L2, to get the difference in length, L3.
  3. Divide L3 by the length of the substring to get the number of occurrences.
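To make the three steps concrete, here is a small worked sketch using the values from the question. The column aliases l1, l2, l3 and occurrences are only illustrative names for the quantities described above.

```sql
SELECT
  length('foobarbaz')                         AS l1,  -- 9
  length(replace('foobarbaz', 'ba', ''))      AS l2,  -- 5, because 'foorz' is left
  length('foobarbaz')
    - length(replace('foobarbaz', 'ba', ''))  AS l3,  -- 4
  (length('foobarbaz')
    - length(replace('foobarbaz', 'ba', '')))
    / length('ba')                            AS occurrences;  -- 2
```

Both operands of the division are integers, so the result is an integer as well.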

For comparison, the replace() approach is about five times faster than using regexp_matches(), which looks like this:

SELECT count(*)
FROM ( VALUES
  ('foobarbaz', 'ba')
) AS t(str, replacestr)
CROSS JOIN LATERAL regexp_matches(str, replacestr, 'g');
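Since the question asks for a function, the replace()-based expression could be wrapped roughly as follows. This is a minimal sketch: the name count_occurrences and the parameter names haystack and needle are hypothetical, and the NULLIF guard is an addition (not part of the query above) so that an empty search string yields NULL instead of a division-by-zero error.

```sql
-- Hypothetical helper; name and parameters are illustrative only.
CREATE OR REPLACE FUNCTION count_occurrences(haystack text, needle text)
RETURNS integer
LANGUAGE sql
IMMUTABLE
AS $$
  SELECT (length(haystack) - length(replace(haystack, needle, '')))
         / NULLIF(length(needle), 0);  -- NULL when needle is ''
$$;

SELECT count_occurrences('foobarbaz', 'ba');  -- 2
SELECT count_occurrences('foobarbaz', '');    -- NULL
```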

10

How about using a regular expression:

SELECT count(*)
FROM regexp_matches('foobarbaz', 'ba', 'g');

The 'g' flag makes regexp_matches() return every match in the string, not just the first.
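One thing to keep in mind with this approach is that the second argument is a regular expression, not a literal string, so characters such as '.' carry special meaning. A small sketch of the difference, assuming the default standard_conforming_strings = on so that the backslash is passed through to the regex engine as written:

```sql
-- '.' matches any character, so every character is counted
SELECT count(*) FROM regexp_matches('a.b.c', '.', 'g');   -- 5

-- escaping the dot counts only the literal dots
SELECT count(*) FROM regexp_matches('a.b.c', '\.', 'g');  -- 2
```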

1 Comment

Check out my answer here for an update to this question and a comparison of both this method and an optimal way of doing this. Or, my answer to another question on DBA.SE, "How do you count the occurrences of an anchored string using PostgreSQL?".

1

There is a

str_count( src, occurrence )

function based on

SELECT (length( str ) - length(replace( str, occurrence, '' ))) / length( occurrence )

and a

str_countm( src, regexp )

based on the @MikeT-mentioned

SELECT count(*) FROM regexp_matches( str, regexp, 'g')

available here: postgres-utils

1

Try with:

SELECT array_length (string_to_array ('1524215121518546516323203210856879', '1'), 1) - 1

--RESULT: 7
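This works because splitting a string on a delimiter that occurs n times produces n + 1 pieces. A short sketch with the strings from the question follows. The COALESCE in the last query is an added guard, not part of the answer above: for an empty input string the split yields an empty array, array_length() returns NULL for it, and the guard turns that into 0.

```sql
SELECT string_to_array('foobarbaz', 'ba');                           -- {foo,r,z}
SELECT array_length(string_to_array('foobarbaz', 'ba'), 1) - 1;      -- 2

-- empty input: array_length() is NULL, so fall back to 0
SELECT COALESCE(array_length(string_to_array('', 'ba'), 1) - 1, 0);  -- 0
```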

0

You can use regexp_count (available in PostgreSQL 15 and later).

SELECT regexp_count('foobarbaz', 'ba');

The above command will give you the number 2.
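On a server where regexp_count is available, it also accepts an optional start position and a flags argument after the pattern. A brief sketch of both, using variations of the string from the question:

```sql
-- start searching at character 5 ('arbaz'), so only the second 'ba' is found
SELECT regexp_count('foobarbaz', 'ba', 5);       -- 1

-- the 'i' flag makes the match case-insensitive
SELECT regexp_count('fooBArbaz', 'ba', 1, 'i');  -- 2
```

As with regexp_matches(), the pattern is a regular expression, so literal metacharacters need escaping.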
