1

I have a dataframe and I am trying to get columns A,B,and C lined up next to each other in a new dataframe.

I have 15 columns for each letter. I tried to create a for-loop to loop through them so that A1,B1,C1 are next to each other up until A15,B15,and C15 are next to each other as well.

def organize_data(df):

rng = int(input('How many peptides do you have to analyze:  '))

number = 1
frames = []
for i in range(rng):
    if number == 16:
        break
    else:
        Ax='A'+ str(number)
        Bx='B'+ str(number)
        Cx='C'+ str(number)

        A = df.Ax[:41]
        B = df.Bx[:41]
        C = df.Cx[:41]
        dfABC = pd.concat([A,B,C], axis=1)
        frames.append(dfABC)
        number = number+1 

df1 = pd.concat(frames)
return(df1)

I keep getting this error: AttributeError: 'DataFrame' object has no attribute 'Ax'

Is there a way to get around this?

Here is my data set that I'm trying to organize: The "Wavelength" cell is at B29.

5
  • 1
    It seems you need A = df.iloc[:41, df.columns.get_loc(Ax)] Commented Aug 30, 2017 at 14:17
  • If you're trying to dynamically access columns, you'll need to use the [...] notation. Commented Aug 30, 2017 at 14:21
  • I reopen question because better solution is in my comment - df[Ax][:41] is not nice solution and multiple ][ creates chaining indexing. Commented Aug 30, 2017 at 14:35
  • Can you show example of your data and fix the indentation please? Commented Aug 30, 2017 at 14:38
  • @zipa i just edited the question to include a picture of my data Commented Aug 30, 2017 at 15:17

1 Answer 1

2

You need iloc with get_loc if need select first 41 rows with custom column name:

A = df.iloc[:41, df.columns.get_loc(Ax)]

EDIT:

I change solution completely - idea is use MultiIndex in columns with level with strings and numbers. Then sort it by second numeric level and last filter by rng. concat function is not necessary.

Sample:

np.random.seed(100)
mux = pd.MultiIndex.from_product([list('ABC'), range(1,16)])

df = pd.DataFrame(np.random.randint(10, size=(3,45)), columns=mux)
df.columns = [''.join((x[0], str(x[1]))) for x in df.columns]
print (df)
   A1  A2  A3  A4  A5  A6  A7  A8  A9  A10 ...   C6  C7  C8  C9  C10  C11  \
0   8   8   3   7   7   0   4   2   5    2 ...    9   3   2   5    8    1   
1   0   8   2   5   1   8   1   5   4    2 ...    6   6   0   7    2    3   
2   3   7   9   0   0   5   9   6   6    5 ...    9   0   9   8    6    2   

   C12  C13  C14  C15  
0    0    7    6    2  
1    5    4    2    4  
2    0    5    3    2  

[3 rows x 45 columns]

#helper df 
df1 = df.columns.to_series().str.extract('([a-zA-Z]+)(\d+)', expand=True)
#convert second column to int
df1[1] = df1[1].astype(int)
#create MultiIndex from df1
df.columns = df1.T.values.tolist()
#sort second level
df = df.sort_index(level=1, axis=1)
print (df)
   A  B  C  A  B  C  A  B  C  A ...  C  A  B  C  A  B  C  A  B  C
  1  1  1  2  2  2  3  3  3  4  ... 12 13 13 13 14 14 14 15 15 15
0  8  4  7  8  0  7  3  9  0  7 ...  0  1  7  7  0  1  6  8  1  2
1  0  3  2  8  6  4  2  3  2  5 ...  5  5  7  4  0  6  2  9  6  4
2  3  2  8  7  3  5  9  8  2  0 ...  0  7  4  5  3  8  3  9  9  2

#filter by condition
rng = 4
df2 = df.loc[:, df.columns.get_level_values(1) <= rng]
#convert MultiIndex to columns
df2.columns = [''.join((x[0], str(x[1]))) for x in df2.columns]
print (df2)
   A1  B1  C1  A2  B2  C2  A3  B3  C3  A4  B4  C4
0   8   4   7   8   0   7   3   9   0   7   6   2
1   0   3   2   8   6   4   2   3   2   5   4   7
2   3   2   8   7   3   5   9   8   2   0   7   7

All together in function:

def organize_data(df):

    rng = int(input('How many peptides do you have to analyze:  '))

    df1 = df.columns.to_series().str.extract('([a-zA-Z]+)(\d+)', expand=True)
    df1[1] = df1[1].astype(int)
    df.columns = df1.T.values.tolist()
    df = df.sort_index(level=1, axis=1)

    df2 = df.loc[:, df.columns.get_level_values(1) <= rng]
    df2.columns = [''.join((x[0], str(x[1]))) for x in df2.columns]
    return df2

a = organize_data(df)
print (a)
Sign up to request clarification or add additional context in comments.

1 Comment

This did the trick for me! My only issue now is aligning the data correctly in the final dataframe

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.