Well, I'm a newbie in PHP, so I was making a program that counts words from a specific text file. This is my text file:

Hello Hello Hello Hello
Hello Word array sum
Hello Find

This is my code (PHP):

/* Open file */
$handle = fopen($_FILES['file']['tmp_name'], 'r');

/* Read all lines */
while (!feof($handle)) {
    $line = fgets($handle);

    /* Use array_count_values with str_word_count to count words */
    $result = array_count_values(str_word_count(strip_tags(strtoupper($line)), 1));

    /* Sort array */
    arsort($result);

    /* Show the first ten positions and print array */
    $top10words2 = array_slice($result, 0, 10);
    print "<pre>";
    print_r($top10words2);
    print "</pre>";
}
fclose($handle);

But my output is like this:

Array{
[Hello] => 4
}
Array{
[Hello] => 1
[Word] => 1
[array] => 1
[sum] => 1
}
Array{
[Hello] => 1
[Find] => 1
}

I need the output to be like this:

Array{
[Hello] => 6
[Word] => 1
[array] => 1
[sum] => 1
[find] => 1
}

Any tips?

    Of course you get that kind of output when you operate on each line separately, and don't do anything to merge the new count with those of the previous line. Just use file_get_contents to get the whole file content as one string. Commented Feb 24, 2017 at 17:36

3 Answers

Use file_get_contents instead, so the whole file is counted as a single string:

$fileContent = file_get_contents($_FILES['file']['tmp_name']);
/* using array_count_values with str_word_count to count words */
$result = (array_count_values(str_word_count(strip_tags(strtoupper($fileContent)), 1)));
/* sort array */
arsort($result);
/* show the first ten positions and print array */
$top10words2 = array_slice($result, 0, 10);
print "<pre>";
print_r($top10words2);
print "</pre>";

Here is the output:

Array
(
    [HELLO] => 6
    [FIND] => 1
    [SUM] => 1
    [ARRAY] => 1
    [WORD] => 1
)
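
To try the same idea without an upload form, here is a minimal sketch that wraps the pipeline in a function and runs it on the sample text from the question. The function name topWordsFromString and the $sample variable are made up for this illustration; they are not part of the original code.

```php
<?php
// Hypothetical helper (name chosen for this example only):
// counts words case-insensitively and returns the $limit most frequent.
function topWordsFromString(string $text, int $limit = 10): array
{
    // Upper-case, strip HTML tags, then split into a list of words.
    $words = str_word_count(strip_tags(strtoupper($text)), 1);

    // word => number of occurrences
    $counts = array_count_values($words);

    // Highest counts first, keeping the word keys.
    arsort($counts);

    return array_slice($counts, 0, $limit, true);
}

// The sample text from the question, as one string.
$sample = "Hello Hello Hello Hello\nHello Word array sum\nHello Find\n";

$top = topWordsFromString($sample);
print_r($top);
```

With the sample text, HELLO should come first with a count of 6, and the other four words each have a count of 1. The order among words with the same count can differ between PHP versions.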

I agree with the file_get_contents() answer from Ayaou. However, for very large files you may need to read line by line, as you started to do. Build the array of words in the loop, then count, sort and slice afterward:

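$handle = fopen($_FILES['file']['tmp_name'], 'r');
$result = array();
while (!feof($handle)) {
    $line = fgets($handle);
    // Append this line's words to the running list of all words
    $result = array_merge($result, str_word_count(strip_tags(strtoupper($line)), 1));
}
fclose($handle);

// Count, sort and slice once, after the whole file has been read
$result = array_count_values($result);
arsort($result);
$top10words2 = array_slice($result, 0, 10);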

2 Comments

(disregard comment; didn't look closely enough at your code)
@ethan OK, I didn't see it anyway ;-)

You're not doing anything to combine the word counts that you calculate on each line. By setting $result = array_count_values(...) you're discarding the results from the previous iteration. Additionally, because you perform the slice and dump inside the loop, you never act on the full result set, and so never get a real picture of the top 10 most used words.

Your code needs two changes:

  1. Combine the counts from each line into a single array.
  2. Wait until you're finished processing the file before looking at the results.

Using file_get_contents() will work, but depending on how large the file is that you're processing, this can cause memory limit errors. A solution that utilizes your initial method would look like this:

$results = [];
while (!feof($handle)) {
  $line = fgets($handle);
  $line_results = array_count_values(str_word_count(strip_tags(strtoupper($line)), 1));
  foreach ($line_results as $word => $count) {
    if (isset($results[$word])) {
      $results[$word] += $count;
    }
    else {
      $results[$word] = $count;
    }
  }
}

arsort($results);
// etc...
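
For reference, here is one way the whole thing could look end to end, as a sketch rather than a definitive version. The function name topWordsFromFile and the temporary demo file are assumptions made for this example. It also reads with fgets() until it returns false instead of testing feof(), which avoids one extra pass over an empty read at the end of the file.

```php
<?php
// Hypothetical helper (name chosen for this example only):
// streams a file line by line and keeps a single running tally.
function topWordsFromFile(string $path, int $limit = 10): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $counts = [];
    // fgets() returns false at end of file, which ends the loop.
    while (($line = fgets($handle)) !== false) {
        foreach (str_word_count(strip_tags(strtoupper($line)), 1) as $word) {
            // Add to the existing count, or start from 0 for a new word.
            $counts[$word] = ($counts[$word] ?? 0) + 1;
        }
    }
    fclose($handle);

    // Sort and slice only once, after the whole file has been read.
    arsort($counts);
    return array_slice($counts, 0, $limit, true);
}

// Demo on a temporary file holding the sample text from the question.
$path = tempnam(sys_get_temp_dir(), 'words');
file_put_contents($path, "Hello Hello Hello Hello\nHello Word array sum\nHello Find\n");

$top = topWordsFromFile($path, 3);
unlink($path);
print_r($top);
```

Only the tally array is held in memory, so usage grows with the number of distinct words rather than with the file size. As in the original code, strip_tags() runs per line, so a tag split across two lines would not be removed cleanly.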
