87

I'm trying to count the number of times a string appears in another string.

I know you can count the number of times a letter appears in a string:

string = "aabbccddbb"
string.count('a')
=> 2

But if I search for how many times 'aa' appears in this string, I also get two.

string.count('aa')
=> 2

I don't understand this. I put the value in quotation marks, so I'm searching for the number of times the exact string appears, not just the letters.

  • Please clarify (with an edit): does 'aa' appear once or twice in the string 'aaa'? Commented Sep 19, 2014 at 16:35
  • It should probably be twice in that case. Positions 0 and 1 && Positions 1 and 2 Commented Sep 19, 2014 at 16:38
  • Certainly, you are an excellent poster. I rewarded you Cary Swoveland. Commented Sep 24, 2014 at 1:27

3 Answers

91

Here are two ways to count the number of times a given substring appears in a string (the first being my preference). Note that, as confirmed by the OP, overlapping occurrences count: the substring 'aa' appears twice in the string 'aaa', and therefore five times in:

str = "aaabbccaaaaddbab"

1. Use String#scan with a regex that contains a positive lookahead that looks for the given substring

def count_em(str, substr)
  # Regexp.escape makes any regex metacharacters in substr match literally
  str.scan(/(?=#{Regexp.escape(substr)})/).count
end
count_em(str,"aa")
  #=> 5
count_em(str,"ab")
  #=> 2
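One caveat: interpolating a substring straight into a regex treats any metacharacters in it (such as . or *) as pattern syntax. A minimal sketch of what escaping with Regexp.escape changes; the name count_literal is only illustrative and not part of the original answer:

```ruby
# Hypothetical helper: counts overlapping occurrences of a literal substring.
def count_literal(str, substr)
  # Regexp.escape turns metacharacters such as "." into literal characters.
  str.scan(/(?=#{Regexp.escape(substr)})/).count
end

count_literal("1.5 + 2.5", ".")
  #=> 2
# Without escaping, "." matches any character, so every position is counted:
"1.5 + 2.5".scan(/(?=.)/).count
  #=> 9
```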

Note:

"aaabbccaaaaddbab".scan(/(?=aa)/)
  #=> ["", "", "", "", ""]

A positive lookbehind produces the same result:

"aaabbccaaaaddbab".scan(/(?<=aa)/)
  #=> ["", "", "", "", ""]

Alternatively, String#scan could be replaced with the form of String#gsub that takes one argument (here the same regular expression) and no block, and returns an enumerator. That form of gsub is unusual in that it has nothing to do with character replacement; it simply generates matches of the regular expression.
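For illustration, a short sketch of the one-argument, no-block form of gsub. Because the enumerator yields one element per match, it can also report where each match starts via Regexp.last_match (an idiom assumed here, not taken from the original answer):

```ruby
str = "aaabbccaaaaddbab"

# With a pattern and no block, gsub returns an enumerator over the matches.
enum = str.gsub(/(?=aa)/)
enum.count
  #=> 5

# Starting offset of each (overlapping) match.
offsets = str.gsub(/(?=aa)/).map { Regexp.last_match.begin(0) }
  #=> [0, 1, 7, 8, 9]
```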

2. Convert the given string to an enumerator of characters with String#each_char, apply Enumerable#each_cons, then Enumerable#count

def count_em(str, substr)
  subarr = substr.chars
  str.each_char
     .each_cons(substr.size)
     .count(subarr)
end
count_em(str,"aa")
  #=> 5
count_em(str,"ab")
  #=> 2

We have:

subarr = "aa".chars
  #=> ["a", "a"]
enum0 = "aaabbccaaaaddbab".each_char
  #=> #<Enumerator: "aaabbccaaaaddbab":each_char>

We can see the elements that will be generated by this enumerator by converting it to an array:

enum0.to_a
  #=> ["a", "a", "a", "b", "b", "c", "c", "a", "a", "a",
  #    "a", "d", "d", "b", "a", "b"]

enum1 = enum0.each_cons("aa".size)
  #=> #<Enumerator: #<Enumerator:
  #      "aaabbccaaaaddbab":each_char>:each_cons(2)> 

Convert enum1 to an array to see what values the enumerator will pass on to count:

enum1.to_a
  #=> [["a", "a"], ["a", "a"], ["a", "b"], ["b", "b"], ["b", "c"],
  #    ["c", "c"], ["c", "a"], ["a", "a"], ["a", "a"], ["a", "a"], 
  #    ["a", "d"], ["d", "d"], ["d", "b"], ["b", "a"],
  #    ["a", "b"]]
 
enum1.count(subarr)
  #=> enum1.count(["a", "a"])
  #=> 5
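For comparison, a sketch of a third approach that uses neither a regex nor each_cons: repeatedly calling String#index with a start offset. This is not one of the two methods above, and the name count_with_index is hypothetical:

```ruby
# Hypothetical helper: counts overlapping occurrences using String#index.
def count_with_index(str, substr)
  return 0 if substr.empty?   # avoid counting the empty string at every offset
  count = 0
  pos = 0
  # index returns the offset of the next occurrence at or after pos, or nil.
  while (pos = str.index(substr, pos))
    count += 1
    pos += 1                  # step one character so overlaps are found
  end
  count
end

count_with_index("aaabbccaaaaddbab", "aa")
  #=> 5
count_with_index("aaabbccaaaaddbab", "ab")
  #=> 2
```

Stepping by substr.size instead of 1 would count non-overlapping occurrences only.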

63

It's because String#count counts characters, not instances of strings. Its argument is treated as a set of characters to count, so in this case 'aa' means the same thing as 'a'.
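A quick sketch of that behaviour, using the string from the question:

```ruby
string = "aabbccddbb"

# The argument is a set of characters, so repeating a letter changes nothing.
string.count('a')
  #=> 2
string.count('aa')
  #=> 2
# 'ab' means "any a or b": 2 a's plus 4 b's.
string.count('ab')
  #=> 6
# Ranges and negation are supported: 'a-c' is a, b or c; '^a' is anything but a.
string.count('a-c')
  #=> 8
string.count('^a')
  #=> 8
```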

To count the number of times 'aa' appears in the string (without overlaps):

string = "aabbccddbb"
string.scan(/aa/).length
# => 1
string.scan(/bb/).length
# => 2
string.scan(/ff/).length
# => 0
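Note that scan consumes each match before looking for the next one, so the matches it counts never overlap. A short sketch of the difference, using the 'aaa' case raised in the comments on the question; the lookahead variant is the technique from the other answer, not something this answer proposed:

```ruby
# Non-overlapping: after matching "aa" at position 0, only "a" is left.
"aaa".scan(/aa/).length
  #=> 1
# Overlapping: a zero-width lookahead matches at positions 0 and 1.
"aaa".scan(/(?=aa)/).length
  #=> 2
# scan also accepts a plain string, which is matched literally.
"aabbccddbb".scan("bb").length
  #=> 2
```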

5 Comments

I see, to find the count of actual strings, you use the scan method instead of the count method. Thank you.
Yeah scan takes a regular expression like /aa/ or even a string like "aa" if you prefer and returns the matches. length tells you how many matches if you don't care what the matches are.
You can also use count or size instead of length
There is no reason to use a Regexp instead of a String in this example.
Good answer. Worked on ruby 2.1.5 !
-5

Try using string.split('a').count - 1

2 Comments

Welcome to StackOverflow. Could you possibly elaborate on your answer a bit? It would help others in the future who might have this same question if you could explain the logic behind your solution!
"a".split('a').count == 0; "ba".split('a').count == 1; "bad".split('a').count == 2;
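Those results follow from String#split dropping trailing empty fields by default. A sketch of a variant that keeps them by passing a negative limit; count_by_split is a hypothetical name, and like scan with a plain pattern it counts non-overlapping occurrences only:

```ruby
# Hypothetical helper: non-overlapping count via split.
# Assumes substr is not a single space, which split treats specially
# (it splits on runs of whitespace).
def count_by_split(str, substr)
  # split returns [] for an empty string, and an empty separator splits
  # into single characters, so handle both cases up front.
  return 0 if str.empty? || substr.empty?
  # A negative limit keeps trailing empty fields, so n separators
  # always produce n + 1 fields.
  str.split(substr, -1).count - 1
end

"a".split('a').count - 1
  #=> -1
count_by_split("a", "a")
  #=> 1
count_by_split("aabbccddbb", "bb")
  #=> 2
```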
