I decided to give it a go myself, and there are a ton of problems with this concept. Here are a couple:
/^(tell me|hey what is) your name$/
The correct answer here is both 4 words ("tell me your name") and 5 words ("hey what is your name"), so there is no single word count to return.
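To see the inconsistency concretely, here is a quick sketch that runs both phrasings through the pattern above and counts the words with a plain whitespace split. The sample strings and the `$inputs` / `$word_counts` names are made up for illustration.

```php
<?php
// Both branches of the alternation match, yet they differ in word count.
$pattern = '/^(tell me|hey what is) your name$/';

// Hypothetical sample inputs, chosen to hit each branch of the group.
$inputs = ['tell me your name', 'hey what is your name'];

$word_counts = [];
foreach ($inputs as $input) {
    if (preg_match($pattern, $input) === 1) {
        // Split on runs of whitespace to count the words in the match.
        $word_counts[$input] = count(preg_split('/\s+/', trim($input)));
    }
}

print_r($word_counts);
```

Both inputs match, with counts of 4 and 5 respectively, so the best a function can report for this pattern is a range.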
/^hey what (.+) up to$/
What happens in this instance? The parentheses could match any number of words, so there is no upper bound at all.
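A small sketch of why no fixed count exists here: `(.+)` happily swallows spaces, so the same pattern accepts sentences of very different lengths. The sample sentences are invented for the example.

```php
<?php
$pattern = '/^hey what (.+) up to$/';

// Hypothetical inputs of increasing length; every one of them matches.
$inputs = [
    'hey what r up to',
    'hey what are you up to',
    'hey what is your little brother up to',
];

$word_counts = [];
foreach ($inputs as $input) {
    if (preg_match($pattern, $input) === 1) {
        // Count whitespace-separated words in each matching input.
        $word_counts[] = count(preg_split('/\s+/', trim($input)));
    }
}

print_r($word_counts);
```

All three inputs match, with 5, 6 and 8 words, and nothing stops a longer sentence from matching too.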
So, all in all, the idea of a function to detect a definitive answer was, perhaps, pretty silly ^o^
Nevertheless, I gave it a shot. Here's what I came up with; it is incompatible with (.+) and fairly untested. Unleash the horror...
/**
 * Try to detect min/max amount of words in the given pattern.
 *
 * @param string $pattern
 * @param string $or_words_pattern
 * @param string $unwanted_pattern
 * @return array
 */
function regex_word_count(
    $pattern,
    $or_words_pattern = '/\((\w|\s|\|)+\)/',
    $unwanted_pattern = '/[^a-zA-Z0-9\|\(\)\s]/'
) {
    $result = ['min' => 0, 'max' => 0];

    // Treat \s as a literal space, then strip everything except
    // letters, digits, pipes, parentheses and whitespace.
    $pattern = str_replace('\s', ' ', $pattern);
    $pattern = preg_replace($unwanted_pattern, '', $pattern);

    if (preg_match_all($or_words_pattern, $pattern, $ors)) {
        // $ors[0] holds each full "(a|b c)" group, parentheses included.
        foreach ($ors[0] as $match) {
            $counts = [];
            foreach (explode('|', $match) as $string) {
                $counts[] = count(explode(' ', $string));
            }
            // Count each group on its own so identical groups are not merged.
            $result['min'] += min($counts);
            $result['max'] += max($counts);
        }
        // Drop the groups so only the fixed words remain.
        $pattern = preg_replace($or_words_pattern, '', $pattern);
    }

    // Collapse the whitespace runs left behind by the removals.
    $pattern = trim(preg_replace('/\s+/', ' ', $pattern));

    if ($pattern !== '') {
        $count = count(explode(' ', $pattern));
        $result['min'] += $count;
        $result['max'] += $count;
    }

    return $result;
}
Example:
$x = regex_word_count('/^(a{3}) ([abc]) (what is the|tell me) your (name|alias dude)$/');
die(var_dump($x));
// array(2) {
// 'min' =>
// int(6)
// 'max' =>
// int(8)
// }
It was a fun exercise in trying to do something that is, well, impossible.
sizeof(explode(" ", $str))^.+$?