1

I´m trying to create a Df to put together the time vol of the pairs. Thereby, I tried the code below. It is working, but, wondering a way more dynamically dealing with list append to DF. Any help will be appreciated.. Thanks in advance!

     time = []

     for i in (dfeuh,dfguh,dfuch,dfujh,dfauh,
               dfnuh,dfejh,dfegh,dfgjh,dfuchh,dfgoldh,dfdaxh):

         a = i.groupby(by='Time')['Ch Pip H'].mean()
                                  .sort_values(ascending=False).head(1)
         time.append(a)

    df = pd.DataFrame(time[0])
    df1 = pd.DataFrame(time[1])
    df2 = pd.DataFrame(time[2]) 
    df3 = pd.DataFrame(time[3])
    df4 = pd.DataFrame(time[4])
    df5 = pd.DataFrame(time[5])
    df6 = pd.DataFrame(time[6])
    df7 = pd.DataFrame(time[7])
    df8 = pd.DataFrame(time[8])
    df9 = pd.DataFrame(time[9])
    df10 = pd.DataFrame(time[10])
    df11= pd.DataFrame(time[11]) 
    dfto = pd.concat([df,df1,df2,df3,df4,df5,df6,df7,df8, 
                      df9,df10,df11],ignore_index=False)
    pairs = ['EU','GU','UC','UJ','AU','NU','EJ',
             'EG','GJ','UCH','GOLD','DAX']
    dfto['Ch Pip H'] = pairs
    dfto = dfto.reset_index()
    dfto.set_index(['Ch Pip H'],inplace=True)
    col = ['Time Vol Max']
    dfto.columns = col
4
  • Welcome to SO! Please provide a minimal reproducible example. If anything, this will allow you to judge easily which optimisations are optimal for your use case. Commented Mar 9, 2018 at 22:56
  • "put together the time vol of the pairs" -- what is a "time vol", please? Also, what is "a way more dynamically dealing with list append"? Are you talking about reducing lines of code, or reducing elapsed time, or something else? Commented Mar 9, 2018 at 22:58
  • talking about reduce line od codes.. mainly on part mentioned below by thesilkworm . Commented Mar 9, 2018 at 23:11
  • 1
    If the code is working without any errors, ask on Code Review. Commented Mar 9, 2018 at 23:42

2 Answers 2

1

I would recommend replacing this:

df = pd.DataFrame(time[0])
df1 = pd.DataFrame(time[1])
df2 = pd.DataFrame(time[2]) 
df3 = pd.DataFrame(time[3])
df4 = pd.DataFrame(time[4])
df5 = pd.DataFrame(time[5])
df6 = pd.DataFrame(time[6])
df7 = pd.DataFrame(time[7])
df8 = pd.DataFrame(time[8])
df9 = pd.DataFrame(time[9])
df10 = pd.DataFrame(time[10])
df11= pd.DataFrame(time[11]) 
dfto = pd.concat([df,df1,df2,df3,df4,df5,df6,df7,df8, 
                  df9,df10,df11],ignore_index=False)

With this:

dfs = []
for i in range(12):
    dfs.append(pd.DataFrame(time[i])

dfto = pd.concat(dfs,ignore_index=False)

Depending on exactly what your data looks like, there might be some way you can do what's needed without even needing to use a loop - loops are usually a last resort when working with Pandas, but with the information available I can at least say this does the same thing as your existing code but in a more concise and adaptable way.

Edit to add that the first three lines of the above can be reduced further to simply:

dfs = [pd.DataFrame(time[i]) for i in range(12)]
Sign up to request clarification or add additional context in comments.

5 Comments

great! I will test. Thanks
this is exactly that I was trying. ..I can not give you vote due to low reputation :/ .. anyway, Thank very much !
Glad I could help. Just realised that the first three lines in the code I posted can actually be cut down to one (see edit).
This simplifies the code tremendously. The only comment I have, is that it looks like you are trying to create a DF for individual financial data. If that is the case and you expect to have tens of thoursands of these dataframes, consider using a dict instead of a list. You will need to balance memory concerns with performance concerns, but the same concept applies. Simplify your logic by iterating over it.
Great. Thanks again!!
0

From what I understand, you want a cleaner way of creating the data frames?

You can use the higher-order function map in this case and reduce the number of lines to one!

data_frames = map(pd.DataFrame, time)
dfto = pd.concat(data_frames, ignore_index=False)

This will give you a list of data frames, and you can get rid of all those lines that create DataFrames individually. Now data_frames should include all those but implement with one line.

The higher-order function map takes as first argument a function, in this case pd.DataFrame(), and as a second argument a sequence of items, in this case, the list time, and will apply that function to every element of the sequence and will return back a list with the result of applying the function to every element of the sequence.

3 Comments

Really good approuch. O Will study more about map. Seems to be Very usefull. Thanks a Lot!
Map is great. Anything you can do with map can be done with a list comp. Generally it is recommended you use a list comp instead. It's not as if one is ultimately better than the other, but it's generally more readily recognizable by others in the python community, so moving forward if you need more help, others will be more prepared to read/understand your code. In other languages like JavaScript/Functional languages, Map is generally the way to go.
@Matt Map is cleaner and simpler. I think we should always aim for the cleaner and the simpler. Since you need to call a function over a sequence of items then function is all about this.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.