The time complexity of the for loop nested with if-statement and while loop

Question

void makeGraph(){
    for(auto itr = inputs.begin(); itr != inputs.end(); itr++){
        string str = itr->second;
        if (strstr(str.c_str(), "->") != NULL){
            char ch = '>';
            size_t pos = str.find_last_of(ch);
            parent = str.substr(0, pos-2);
            string child = str.substr(pos+2);
            if (strstr(parent.c_str(), ",") != NULL){
                int size_parent = parent.length();
                char ch_parent[size_parent+1];
                strcpy(ch_parent, parent.c_str());
                char *token = strtok(ch_parent, ",");
                while (token != NULL){
                    string sub_Parent = token;
                    sub_Parent.erase(std::remove(sub_Parent.begin(),
                                                 sub_Parent.end(),
                                                 ' '),
                                     sub_Parent.end());
                    graph[sub_Parent].push_back(child);
                    token = strtok(NULL, ",");
                }
            } else{
                graph[parent].push_back(child);
            }
        } else{
            cout<<"Just for TEST"<<endl;
        }
    }
    printGraph();
}

What is the time complexity of this function? Is it n O(n)?? How do you handle the loop inside the else statement?

The real question here is why you're bashing around with C strings in the middle of C++ code using std::string. — tadman
– tadman, Commented Mar 10, 2021 at 19:28
There's really no reason to drop down to dirty C strings in the middle of some C++ code, especially not just to use strtok(). Remember, C++ has regular expressions, so steer towards solutions in that domain rather than going back to the 1970s. — tadman
– tadman, Commented Mar 10, 2021 at 19:39
@simsim I've gone through and formatted your code consistently. In the future, please try to format code to make it as readable as possible. At the very least, use consistent indentation. On another note, empty character literals don't exist in C++, so this expression isn't valid: std::remove(sub_Parent.begin(), sub_Parent.end(), ''). Hence the odd highlighting going on. You might want to think about what that expression is meant to be doing. As it is, I think the answer to your question is "could be anything because code not valid". — Ken Wayne VanderLinde
– Ken Wayne VanderLinde, Commented Mar 10, 2021 at 19:51
@simsim Going back to Jarod's question ("What is N?"). You're code has multiple ways in which the inputs could scale, and therefore there are at least these variables you need to account for: 1. The number of elements in input. 2. The size of the strings stored in input. Are you assuming the strings are constant-bounded, and so focusing only on the size of input? Or perhaps the other way around? If both can scale independently, you need to use multiple variables (M, N) to determine its complexity. — Ken Wayne VanderLinde
– Ken Wayne VanderLinde, Commented Mar 10, 2021 at 19:59
@KenWayneVanderLinde I used this line of code only to remove extra spaces in the string. In the questions asked here(i mean stack overflow), many people suggested this way remove spaces in the string. — simsim
– simsim, Commented Mar 10, 2021 at 19:59

Ken Wayne VanderLinde · Accepted Answer · 2021-03-10 20:57:04Z

The difficulty in this case is that you have many variables that need to be accounted for. The important one being:

The number of elements in input.
For each string in input, the number of characters in the string.

If you want a precise estimate, you would have to separately account for each of these variables. If we use N to refer to the length of input, then we have N strings whose lengths are M1, M2, ..., MN. Each string can be handled differently, so that complicates things as well.

Since Big-O provides an upper bound, we can save ourself some trouble by instead considering these variables:

N, being the number of elements in input.
M, being the length of the longest input string.

The last big simplification comes from looking at the worst case behaviour. Without know more about the input data, we can't really do more nuanced calculation anyways, so let's just do the worst case.

Since there is so much going on here, I like to look at the "leaves" - the most nested blocks - and then work my way up. Let's hone in on the elephant in the room: the while loop. What's going on here? The loop itself is basically just iterating over parent. Note that the length of parent can scale linearly with the length of str, which means the complexity of the while loop will look like O(M * [complexity of the while body]). The body is only composed of a few steps:

string sub_Parent = token;
sub_Parent.erase(std::remove(sub_Parent.begin(), sub_Parent.end(), ' '), sub_Parent.end());
graph[sub_Parent].push_back(child);
token = strtok(NULL, ",");

The work involving sub_Parent and token plays into the looping behaviour of the while loop (i.e., the fewer while loop iterations there are, the longer token will generally be, and vice versa). However, none of those operations will scale worse than O(M) since the length of parent is itself O(M). Instead let's focus on the push_back() call. Since the length of child is O(M), and since this push_back() copies child, this operation is O(M) in time complexity. Therefore, we can confidently say that the complexity of the while loop body is O(M) as well. Combining the body with the loop itself, the total complexity of the while loop is:

  O(M * [complexity of the while body]
= O(M * M)
= O(M^2)

Now let's go up one level. The operations outside of the while loop are clearly O(M) since they deal with copying and searching parent. The while loop dominates this, so we can ignore those contributions.

Going up one more level, we hit the if. What do we do here? Well, in the true case, we get the while loop with complexity O(M^2). In the false case, we get a push_back() with a copy of child, making that O(M). Since we're looking at the worst case, we'll say the complexity of the if/else is the larger expression: O(M^2).

Going up another level (just outside the if from the previous step), we have a similar situation as before in that the operations are at worst O(M) as they search and copy str to create parent and child. These are dominated by the if complexity, so we can ignore their contributions.

Now for the outermost if. We again compare branches. In the true branch, we have the O(M^2) complexity we just worked out. In the false branch we have O(1) complexity (just printing a constant string). So in the worst case, this if has O(M^2) complexity.

We're almost there. Outside of that if we have a copy to create str. This is O(M), so we can ignore it as it is dominated by the O(M^2) from the if. So the for loop body is O(M^2) in total.

Finally we look at the for loop itself. It is a simple linear iteration over input (being N elements in length), meaning its complexity is:

  O(N * [complexity of the body])
= O(N * M^2)

So there you have it. The worst case complexity is linear in the length of input, but quadratic in the length of the longest string.

Thank you for taking the time to help me. You explained very well. Thank you very much.

Collectives™ on Stack Overflow

The time complexity of the for loop nested with if-statement and while loop

1 Answer 1

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related