
I am not an expert in C++. I am writing a program that reads multiple URLs from a single line of an HTML file, so I wrote this code:

ifstream bf;
short chapters=0;
string blkhtml;
string blktmpfile; //given
string urldown;    //given
size_t found = 0, limit;

while(getline(bf, blkhtml)){
    while((blkhtml.find(urldown, found) != string::npos) == 1){
        found = blkhtml.find(urldown);
        limit = blkhtml.find("\"", found);
        found=limit + 1;
        chapters++;
    }
}

My problem is that found does not seem to be updated before it is used in the while condition. As far as I've seen, a std::string object isn't updated except through another std::string member function (for a string, str.erase() updates its value, but str.at() = '' doesn't). What can I do if I want found to be updated every time the loop begins, so that the condition uses the new value?

What I want to do is:

  • Check whether the line contains a match for the given string urldown.

  • Locate the first and last characters of that match.

  • Update 'pos' in the loop to just past the URL that was found, and then look for the next one.

I've looked all over cplusplus.com and cppreference.com and haven't found anything that helps me.

I thought about calling std::list::remove in a loop with every number from 0 to 9 and then giving it a new value, but I don't know if that is the best option.

  • You check the condition with blkhtml.find(urldown, found), but you update found with blkhtml.find(urldown). Can you spot the difference? Commented May 7, 2015 at 1:50
  • Yes :) it is because what I am reading is an HTML file with one very long line containing all the URLs I am looking for. I check the condition to see whether there is a match for urldown, then I use found and limit to go to the end of that URL. As limit is the end, found = limit + 1; modifies found. So when the while loop begins again, the starting position has moved past the last match. Commented May 7, 2015 at 2:18
  • ... but when you update found, you are searching from the beginning of the string again, not from where you left off last time. blkhtml.find(urldown) doesn't depend on the current value of found - it gives you the same position every time through the loop, so you are just running in circles. Commented May 7, 2015 at 5:17
  • O.o you are right, I will check it. I'm not at my PC right now, but I'll let you know. Commented May 7, 2015 at 12:29
  • Yes @IgorTandetnik that was the problem, thanks, I checked it right now and it works :) thank you everybody Commented May 7, 2015 at 13:15
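
The point made in the comments can be sketched in isolation. The helper below is hypothetical (the name match_positions and the sample text in the note after it are not from the question); it only illustrates that std::string::find needs the start position passed as its second argument to move forward, and that the one-argument form always restarts at index 0.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collect the start index of every occurrence of
// `needle` in `text` by passing the previous position back into find().
std::vector<std::size_t> match_positions(const std::string& text,
                                         const std::string& needle) {
    std::vector<std::size_t> positions;
    if (needle.empty()) {
        return positions;  // an empty needle would match at every index
    }
    std::size_t pos = 0;
    // find(needle, pos) searches at or after pos; find(needle) alone
    // would return the first match again on every iteration.
    while ((pos = text.find(needle, pos)) != std::string::npos) {
        positions.push_back(pos);
        pos += needle.size();  // step past this match before searching again
    }
    return positions;
}
```

For the made-up input "a.b/x a.b/y" and needle "a.b/", this yields the indices 0 and 6, whereas text.find(needle) on its own keeps returning 0.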

1 Answer

The problem is that you search from the beginning each time:

while((blkhtml.find(urldown, found) != string::npos) == 1){
    found = blkhtml.find(urldown); // Searches from beginning of the string

This should be:

while((blkhtml.find(urldown, found) != string::npos) == 1){
    found = blkhtml.find(urldown, found); // Searches from "found"

Or, to search only once per iteration, you can put the assignment in the while condition:

while((found = blkhtml.find(urldown, found)) != string::npos){

Also, you don't reset found each time a new line is read:

while(getline(bf, blkhtml)){
    found = 0;

3 Comments

But I don't need found to be 0, I need it to increase. It is a single line with "url1,url2,etc", so I find the first, then move to the end of the first, and look for the second, until there are no more matches
@JorgeBarroso Oh, I see the problem as well
Thank you, that was right, I was looking from the beginning on an infinite loop, now it's solved :D
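
For the single-line case described in these comments, the same loop can also collect each URL rather than only counting it. This is a sketch under assumptions: the function name extract_urls is hypothetical, and it assumes, as the question's code does, that every URL ends at the next double quote.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: return every substring that starts with `urldown`
// and runs up to (not including) the next double quote.
std::vector<std::string> extract_urls(const std::string& line,
                                      const std::string& urldown) {
    std::vector<std::string> urls;
    if (urldown.empty()) {
        return urls;
    }
    std::size_t found = 0;
    while ((found = line.find(urldown, found)) != std::string::npos) {
        std::size_t limit = line.find('"', found);
        if (limit == std::string::npos) {
            urls.push_back(line.substr(found));  // unterminated: take the rest
            break;
        }
        urls.push_back(line.substr(found, limit - found));
        found = limit + 1;  // move past the closing quote
    }
    return urls;
}
```

Here found only ever increases within the line, which is the behaviour asked for in the first comment.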
