I have a section of Python code as shown:

# Main loop: goes through the DataFrame row by row and copies each row's
# 'Value' into the newly generated column whose name matches that row's 'Name'.
listed_names = list(df_cv)  # List of column names to reference later.
variable = listed_names[3:]  # Columns from the 4th onward; [3:] skips the first three.
for i in df_cv.index:  # For each row label in the DataFrame (DF)
    for m in variable:  # For each generated column name
        if df_cv.loc[i, 'Name'] == m:  # If this row's 'Name' matches the column name...
            df_cv.loc[i, m] = df_cv.loc[i, 'Value']  # ...copy the row's 'Value' into that column

Basically, it takes an n-row list of time/name/value triples and spreads it into a pandas DataFrame with n rows and one column per unique name. It turns this:

Time   Name    Value
1      Color   Red
2      Age     6
3      Temp    25
4      Age     1

Into this:

Time   Color   Age    Temp
1      Red     
2              6
3                     25
4              1

My code takes a terribly long time to run, and I wanted to know if there is a better way to set up my loops. I come from a MATLAB background, so the style of Python (i.e. not using rows/columns for everything) is still alien.

How can I make this section of code run faster?

1 Answer

Instead of looping, think of it as a pivot operation. This assumes that Time is a column and not the index (if it is the index, call reset_index first):

In [96]: df
Out[96]: 
   Time   Name Value
0     1  Color   Red
1     2    Age     6
2     3   Temp    25
3     4    Age     1

In [97]: df.pivot(index="Time", columns="Name", values="Value")
Out[97]: 
Name   Age Color  Temp
Time                  
1     None   Red  None
2        6  None  None
3     None  None    25
4        1  None  None

In [98]: df.pivot(index="Time", columns="Name", values="Value").fillna("")
Out[98]: 
Name Age Color Temp
Time               
1          Red     
2      6           
3                25
4      1         

This should be much faster on real datasets, and is simpler to boot.
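
For anyone who wants to try this outside an interactive session, here is a minimal self-contained sketch of the same pivot. The data is the four-row sample from the question; the variable names (`wide`, `indexed`, `wide_from_index`) are made up for illustration, and the second half only shows the reset_index case mentioned above.

```python
import pandas as pd

# Sample data copied from the question (Time / Name / Value).
df = pd.DataFrame(
    {
        "Time": [1, 2, 3, 4],
        "Name": ["Color", "Age", "Temp", "Age"],
        "Value": ["Red", 6, 25, 1],
    }
)

# One row per Time, one column per unique Name.
# Cells with no matching row are missing, so fill them with empty strings.
wide = df.pivot(index="Time", columns="Name", values="Value").fillna("")
print(wide)

# Illustration only: if Time is the index rather than a column,
# move it back into a column before pivoting.
indexed = df.set_index("Time")
wide_from_index = (
    indexed.reset_index()
    .pivot(index="Time", columns="Name", values="Value")
    .fillna("")
)
```

Note that pivot expects each Time/Name pair to appear at most once; the sample data satisfies that, but it is worth checking on a real dataset.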
