I am using two dynamic arrays to read from a file. They keep track of each word and the number of times it appears. If a word has already appeared, I must update its count in one array and not add it to the other array again, since it already exists there. However, I am getting blank spaces in my array whenever I meet a duplicate. I think it's because my pointer continues to advance when it shouldn't, and I do not know how to combat this. The only workaround I have is to skip empty entries when I print the results: if (*(words + i) == "") continue;. This basically ignores the blanks in the array, but I think that is messy. I just want to figure out how to move the pointer back in this method. words and frequency are my dynamic arrays.

I would like guidance in what my problem is, rather than solutions.

I have now changed my outer loop to be a while loop, and only increment when I have found the word. Thank you WhozCraig and poljpocket.

Now this occurs: [screenshot not preserved]

12
  • Just so you know, we have no idea whatsoever if the words and frequency arrays are properly allocated at all. And you need to look closely at what entry's word and frequency is being updated with your usage of i. Look very closely at it. Commented Jan 23, 2014 at 6:40
  • Edited in the creations of the arrays. I will look into what you mean right now. Commented Jan 23, 2014 at 6:43
  • @WhozCraig I try to set i-- when there is a duplicate found, but when I run, it seems to go nowhere. Some infinite loop? Not sure why, either. Commented Jan 23, 2014 at 6:47
  • You're incorrectly using i for insertions and as your loop control variable simultaneously. Further, is this supposed to read the entire file (as opposed to just some passed-in count of words)? Commented Jan 23, 2014 at 7:18
  • @WhozCraig I will look again and see where I am goofing up. This is supposed to read the whole file. What I did is have a function that counted all the words earlier and returned that count, thus giving me a reasonable initial size for the dynamic arrays. Then I reopen the file and do the addWords. Commented Jan 23, 2014 at 7:24

5 Answers

1

Instead of incrementing your loop variable [i] every loop, you need to only increment it when a NEW word is found [i.e. not one already in the words array].

Also, you're wasting time in your inner loop by looping through your entire words array, since words will only exist up to index i.

 int idx = 0;            // next free slot, i.e. number of distinct words so far
 bool isFound = false;
 // Check the bound first so no word is read and then dropped.
 while (idx < count && file >> hold) {
    if (!valid_word(hold)) {
        continue;
    }

    // You don't need to check past idx because you
    // only have <idx> words so far.
    for (int i = 0; i < idx; i++) {
        if (toLower(words[i]) == toLower(hold)) {
            frequency[i]++;
            isFound = true;
            break;
        }
    }

    if (!isFound) {
        words[idx] = hold;
        frequency[idx] = 1;
        idx++;
    }
    isFound = false;
 }

1 Comment

Thank you, I see now. I was able to implement it using pointers.
0

First, to address your code, this is what it should probably look like. Note how we only increment i as we add words, and we only ever scan the words we've already added for duplicates. Note also how the first pass will skip the j-loop entirely and simply insert the first word with a frequency of 1.

void addWords(const std::string& fname, int count, std::string *words, int *frequency)
{
    std::ifstream file(fname);
    std::string hold;

    int i = 0;
    while (i < count && (file >> hold))
    {
        int j = 0;
        for (; j<i; ++j)
        {
            if (toLower(words[j]) == toLower(hold))
            {
                // found a duplicate at j
                ++frequency[j];
                break;
            }
        }

        if (j == i)
        {
            // didn't find a duplicate
            words[i] = hold;
            frequency[i] = 1;
            ++i;
        }
    }
}

Second, to really address your code, this is what it should actually look like:

#include <iostream>
#include <fstream>
#include <map>
#include <string>

//
// Your implementation of toLower() goes here.
//


typedef std::map<std::string, unsigned int> WordMap;

WordMap addWords(const std::string& fname)
{
    WordMap words;

    std::ifstream inf(fname);
    std::string word;

    while (inf >> word)
        ++words[toLower(word)];

    return words;
}

If it isn't obvious by now how a std::map<> makes this task easier, it never will be.
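For readers who want to try the map approach without a file on disk, here is a minimal sketch that reads from any std::istream. The names toLowerCopy and countWords are hypothetical (they are not from the answer), and toLowerCopy is only a stand-in for the asker's own toLower(), assuming plain ASCII text.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for the asker's toLower(); assumes plain ASCII input.
std::string toLowerCopy(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Counts case-insensitive word occurrences from any input stream.
// operator[] value-initializes a missing entry to 0 before the increment,
// so new words and duplicates are handled by the same line.
std::map<std::string, unsigned int> countWords(std::istream& in)
{
    std::map<std::string, unsigned int> counts;
    std::string word;
    while (in >> word)
        ++counts[toLowerCopy(word)];
    return counts;
}
```

Passing a std::istringstream makes this easy to test; passing a std::ifstream gives the file-based behaviour. Iterating over the returned map visits the words in sorted order, and there are never blank entries because an entry only exists once a word has been seen.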

8 Comments

I understand Maps from Java. However, this is to practice pointers since I'm new to C++. I completely understand what you mean by how much better a map is for this.
@Brandon granted. now look at the code before it, particularly the increment of i, which only happens once we know there is no prior entry matching.
Ah, so it seems the outer for loop was a poor choice. I should only increment when I have NOT found the word, and this will be done in a while loop. Hmm, OK, I took a stab at it, but it seems to create empty spaces again, only now they are shoved to the end of the array.
For the blanks at the end of the array: look at my last comment in my answer.
Also, it now reports a segmentation fault. Do you have any idea why it is doing that?
0

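Check out SEEK_CUR if you want to set the cursor back. It is a constant passed to fseek(), not a function; with C++ streams the equivalent is seekg() with std::ios::cur.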

Comments

0

The problem is a logical one, consider several situations:

  1. Your algorithm does not find the current word. It is inserted at position i of your arrays.
  2. Your algorithm does find the word. The frequency of the word is incremented along with i, which leaves you with blank entries in your arrays whenever there's a word which is already present.

To conclude, 1 works as expected but 2 doesn't.
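The two situations can be reproduced with a small sketch. Both functions below are hypothetical reconstructions (the asker's original code is not shown here), they read from any std::istream so they can be tried with a string, and they compare words case-sensitively to stay short. countBuggy uses the loop index as the insertion position; countFixed keeps a separate insertion index that only moves when a new word is stored.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical reconstruction of the bug: i is both the loop counter and the
// insertion position, so a duplicate still consumes slot i and leaves it blank.
// Returns the number of slots consumed, blanks included.
std::size_t countBuggy(std::istream& in, std::string* words, int* frequency,
                       std::size_t capacity)
{
    std::string hold;
    std::size_t i = 0;
    for (; i < capacity && (in >> hold); ++i) {
        bool found = false;
        for (std::size_t j = 0; j < i; ++j) {
            if (*(words + j) == hold) {
                ++*(frequency + j);
                found = true;
                break;
            }
        }
        if (!found) {
            *(words + i) = hold;
            *(frequency + i) = 1;
        }
    }
    return i;
}

// Sketch of the fix: `used` advances only when a new word is stored.
// Returns the number of distinct words.
std::size_t countFixed(std::istream& in, std::string* words, int* frequency,
                       std::size_t capacity)
{
    std::string hold;
    std::size_t used = 0;
    while (used < capacity && (in >> hold)) {
        std::size_t j = 0;
        while (j < used && *(words + j) != hold)
            ++j;
        if (j < used) {
            ++*(frequency + j);          // duplicate: bump its count only
        } else {
            *(words + used) = hold;      // new word: store at the next free slot
            *(frequency + used) = 1;
            ++used;
        }
    }
    return used;
}
```

For the input "a b a c", countBuggy leaves an empty string at index 2 and puts "c" at index 3, while countFixed stores "a", "b", "c" in the first three slots.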

My advice is that you don't rely on a for loop to traverse the file, but use a "get-next-until-end" approach built on a while loop. With this, you can track your next insertion point separately and thus get rid of the blank entries.

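int currentCount = 0;
// Read inside the condition so a failed read never reaches the body.
while (currentCount < count && (file >> hold))
{
     bool found = false;
     // your inner for loop, scanning only indices [0, currentCount)
     // and setting found = true on a match
     if (!found)
     {
         *(words + currentCount) = hold;
         *(frequency + currentCount) = 1;
         currentCount++;
     }
}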

4 Comments

You also might want to check array boundaries with currentCount.
This makes sense now. However, the blank spaces have moved to the end of the array. Not sure why.
That's because you are allocating space for the total number of words, not the number of distinct words.
Is there any way around this, or is this as good as it will be if I choose this method?
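One possible way around it, sketched under the assumption that the arrays were allocated with new[] and that the number of distinct words (currentCount in the answer) is known once reading finishes: either print only the first currentCount entries, or copy them into arrays of exactly that size. The helper name trimToUsed is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replaces the oversized arrays with new ones holding
// exactly `used` entries. Assumes both arrays came from new[] and that
// `used` does not exceed their original size.
void trimToUsed(std::string*& words, int*& frequency, std::size_t used)
{
    std::string* trimmedWords = new std::string[used];
    int* trimmedFrequency = new int[used];

    for (std::size_t i = 0; i < used; ++i) {
        *(trimmedWords + i) = *(words + i);
        *(trimmedFrequency + i) = *(frequency + i);
    }

    // Release the oversized arrays and hand back the trimmed ones.
    delete[] words;
    delete[] frequency;
    words = trimmedWords;
    frequency = trimmedFrequency;
}
```

The caller still owns the arrays afterwards and must delete[] them. If the trailing blanks only matter when printing, looping up to currentCount instead of the allocated size avoids the copy entirely.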
0

Why not use a std::map?

void collect( std::string name, std::map<std::string,int> & freq ){
  std::ifstream file;
  file.open(name.c_str(), std::ifstream::in );
  std::string word;
  // Test the extraction itself: checking eof() after the read can drop the
  // last word when the file does not end in whitespace.
  while( file >> word ){ // add toLower
    freq[word]++;
  }
  file.close();
}

The problem with your solution is the use of count in the inner loop where you look for duplicates. You'll need another variable, say nocc, initially 0, used as limit in the inner loop and incremented whenever you add another word that hasn't been seen yet.

3 Comments

I would if I could, but this program is to work with pointers (dynamic arrays).
while(!file.eof()) is wrong. You'll insert the last word twice. See this for why.
@WhozCraig Thanks - I should have checked the doc for tokenizing read.
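The eof() point from the comments can be seen in a small sketch. The function names are hypothetical, and the input is assumed to end with a newline, as most text files do. In readWithEofLoop the final extraction fails before touching word, so the previous value is pushed a second time; readWithExtractionLoop tests the extraction itself and stops cleanly.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Problematic pattern: eof() only becomes true after a read has hit the end,
// so with trailing whitespace the loop body runs one extra time.
std::vector<std::string> readWithEofLoop(std::istream& in)
{
    std::vector<std::string> out;
    std::string word;
    while (!in.eof()) {
        in >> word;            // the last attempt fails and leaves word unchanged
        out.push_back(word);   // so the previous word is stored again
    }
    return out;
}

// Preferred pattern: the body runs only when a word was actually read.
std::vector<std::string> readWithExtractionLoop(std::istream& in)
{
    std::vector<std::string> out;
    std::string word;
    while (in >> word)
        out.push_back(word);
    return out;
}
```

With the input "one two\n", the first function returns three entries and the second returns two.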
