I'm trying to take two pieces of text in PHP like this:

"A cat jumped over the hat"

"The mad hatter jumped over his cat"

And get results like this...

the
cat
jumped
over 

(i.e. the common words between the strings, where hat is NOT included because it's part of another word in the second string)

I've found a bunch of examples that count occurrences of one string within another, but that would give me the "hatter" problem, so I'm guessing I need to tokenize both strings into word lists and compare them one-to-one somehow.

I'm struggling to visualise an efficient way to achieve that, though, so I'd appreciate any thoughts on what the correct approach is. Thanks!

  • str_word_count() is a very useful function, especially with 1 or 2 as the format value; you can walk each array forcing lowercase and then use array_intersect. Commented Jun 7, 2016 at 23:05
  • But my understanding is that I can just use that to do something like str_word_count(string, 1) to get the array of words for each string? Commented Jun 7, 2016 at 23:07
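
The suggestion in these comments could be sketched roughly as follows. This is only an illustration, not code from the thread: the function name commonWords is hypothetical, and it assumes plain ASCII text, since str_word_count() decides what counts as a letter based on the current locale.

```php
<?php
// Hypothetical helper (not from the thread): tokenize with str_word_count(),
// then intersect the two word lists.
function commonWords(string $a, string $b): array
{
    // Format 1 returns a plain array of the words found in the string,
    // so punctuation such as commas and full stops is dropped.
    $wordsA = str_word_count(strtolower($a), 1);
    $wordsB = str_word_count(strtolower($b), 1);

    // array_intersect() keeps the keys of the first array, so remove
    // repeated words and reindex from 0 before returning.
    return array_values(array_unique(array_intersect($wordsA, $wordsB)));
}

print_r(commonWords("A cat jumped over the hat", "The mad hatter jumped over his cat"));
```

Because whole words are compared, "hat" does not match "hatter". Unlike splitting on spaces, this should also cope with punctuation attached to a word.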

2 Answers

Here's a one-liner using array_intersect, array_map and explode:

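<?php

$str1 = "A cat jumped over the hat";
$str2 = "The mad hatter jumped over his cat";

// Split each string on spaces, lowercase every word, then keep the words of $str1 that also occur in $str2.
print_r(array_intersect(array_map("strtolower", explode(' ', $str1)), array_map("strtolower", explode(' ', $str2))));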

Results in this output:

Array
(
    [1] => cat
    [2] => jumped
    [3] => over
    [4] => the
)

For this problem, I'd use explode to separate each string into words, then create an array for each string where the keys are the words and the values are all just true. Then you can take one of the arrays, loop through its keys, and check whether each one is present in the other array.
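
A minimal sketch of that approach might look like this. The names wordSet and commonWordsByLookup are hypothetical, and lowercasing the text first is an assumption carried over from the question's expected output rather than something this answer specifies.

```php
<?php
// Hypothetical helper: build an array whose keys are the words of $text
// and whose values are all true.
function wordSet(string $text): array
{
    $set = [];
    foreach (explode(' ', strtolower($text)) as $word) {
        if ($word !== '') {        // skip empty pieces left by repeated spaces
            $set[$word] = true;
        }
    }
    return $set;
}

// Hypothetical helper: loop through the keys of one set and keep those
// that are also keys of the other.
function commonWordsByLookup(string $a, string $b): array
{
    $setA = wordSet($a);
    $setB = wordSet($b);

    $common = [];
    foreach (array_keys($setA) as $word) {
        if (isset($setB[$word])) {
            // Cast because PHP turns numeric-looking keys into integers.
            $common[] = (string) $word;
        }
    }
    return $common;
}

print_r(commonWordsByLookup("A cat jumped over the hat", "The mad hatter jumped over his cat"));
```

Each isset() check is a hash lookup, so the loop does not rescan the second word list for every word in the first. Since this splits on spaces only, punctuation attached to a word would stay part of it.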
