0

I haven't been able to figure this out. Let's say I have a pandas dataframe (port_info) that looks like this:

         chass  olt  port   BW
0        1      1     1      80000
1        1      1     2     212000
2        1      1     3     926600
3        1      1     4      50000
4        1      1     5     170000
5        1      1     6     840000
6        1      1     7     320000
7        1      1     8     500000
8        1      1     9     270000
9        1      1    10     100000
10       1      2     1     420000
11       1      2     2      60000
12       1      2     3     480000
13       1      2     4      90000
14       1      2     5          0
15       1      2     6     520000
16       1      2     7     840000
17       1      2     8     900000
18       1      2     9     110000
19       1      2    10          0

I want to add a column depending on how many ports per olt per chassis. If there are more than 8 ports per olt per chass, then add a value of 1 to every row for that olt for that chass. Otherwise, add a value of 10 to every row for that olt for that chass.

In the end, I need a new column (port_info.BW_cap) that has a value for each port dependent on how many ports there are in that olt in that chass.

So far I have this to check the max port per olt:

test = pd.DataFrame(table.groupby(['chass','olt'])['port'].max()).reset_index()

That gets me a minimalist dataframe that looks like this:

chass  olt
1      1      10
       2      10
       3      10
       4      10
       5      10
       6      10
       7      10
       8      10
       11     10
       12     10
       13     10
       14     10
       15     10
       16     10
       17     10
       18     10

What's the best way to take all of the above and basically have pandas iterate through every row in the initial dataframe and compare to the appropriate row in the minimalist dataframe to check what the max port is for that olt for that chassis, and add a value to the row in the initial dataframe under a new column named 'BW_cap' dependant on the value in the minimalist dataframe for that same chass/olt combo?

So in the end, something that looks like this:

       chass  olt  port       BW    BW_cap
0        1    1     1    80000        1
1        1    1     2   212000        1
2        1    1     3   926600        1
3        1    1     4    50000        1
4        1    1     5   170000        1
2
  • 2
    You're question is not clear for me, for example, when I do table.groupby(['chass','olt'])['port'].max().reset_index() on table I get only two rows. This is correct since you have one chass which is (1) and two olt which are (1 and 2). Commented Mar 10, 2019 at 20:40
  • this isn't very clearly defined, could you demonstrate what exactly you're trying to accomplish with this BW_cap? Commented Mar 10, 2019 at 22:44

1 Answer 1

1

I think I get what you want. You just need the bottom 3 lines in this code. You were close, you can just join your groupby max result to the original dataframe.

One thing to note, saying "if there are more than 8 ports per chass/olt combination" is different than saying "the max port is > 8". If your ports aren't always number ascending 1 to 10. if there are chass/olt combinations that have 3, 6, 9 as the 3 ports, thats only 3 ports but the max is 9.

import random
random.seed(123)

df = pd.DataFrame({
        'chass':[random.randint(1, 10) for x in range(200)],
        'olt':[random.randint(1, 10) for x in range(200)],
        'port':[random.randint(1, 10) for x in range(200)],
        'BW':[random.randint(0, 1000000) for x in range(200)]})

g = df.groupby(['chass', 'olt']).apply(lambda x: 1 if x.port.max() > 8 else 10).reset_index()
g.columns = ['chass', 'olt', 'BW_cap']
df = pd.merge(df, g, on=['chass', 'olt'])
Sign up to request clarification or add additional context in comments.

2 Comments

This is looking like it. Thank you!. Will try this out and mark your answer once I can properly test this. And yes your right about the count vs the MAX. What i needed was the MAX specifically. We cant always account for how many ports there are. But the way things are done here, we can usually rely on the highest number port being used.
ok, if you need to see # of ports, you can just use .nunique() instead of max

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.