I have a DataFrame with four columns. For each row, I want the minimum of the first two columns and the minimum of the last two columns, stored together as a list. Code:

import numpy as np
import pandas as pd

np.random.seed(0)
xdf = pd.DataFrame({'a':np.random.rand(1,10)[0]*10,'b':np.random.rand(1,10)[0]*10,'c':np.random.rand(1,10)[0]*10,'d':np.random.rand(1,10)[0]*10,},index=np.arange(0,10,1))

xdf['ab_min'] = xdf[['a','b']].min(axis=1)
xdf['cd_min'] = xdf[['c','d']].min(axis=1)
xdf['minimum'] = xdf['ab_min'].list()+xdf['cd_min'].list()

Expected output:

xdf['minimum']

0   [ab_min,cd_min]
1   [ab_min,cd_min]
2   [ab_min,cd_min]
3   [ab_min,cd_min]

Actual output:

AttributeError: 'Series' object has no attribute 'list'

3 Answers

Select the columns ab_min and cd_min, use to_numpy to convert them to a NumPy array, call tolist to turn each row into a list, and assign the result to the minimum column:

xdf['minimum'] = xdf[['ab_min', 'cd_min']].to_numpy().tolist()

>>> xdf['minimum']

0      [3.23307959607905, 1.9836323494587338]
1     [6.189440334168731, 1.0578078219990983]
2    [3.1194570407645217, 1.2816570607783184]
3     [1.9170068676155894, 7.158027504597937]
4     [0.6244579166416464, 8.568849995324166]
5     [4.108986697339397, 0.6201685780268684]
6     [4.170639127277155, 2.3385281968695693]
7      [2.0831140755567814, 5.94063873401418]
8     [0.4887113296319978, 6.380570614449363]
9     [2.844815261473105, 0.9146457613970793]
Name: minimum, dtype: object
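
To make the difference from the original attempt concrete, here is a minimal self-contained sketch. The toy values and the names `flat` and `pairs` are made up for illustration and are not the question's random data. It assumes only that `Series.tolist()` is the method the asker was reaching for (there is no `Series.list()`), and it shows that joining two lists with `+` concatenates them instead of pairing them row by row.

```python
import pandas as pd

# Toy data chosen by hand for illustration (not the question's random data)
xdf = pd.DataFrame({
    "a": [3.0, 1.0, 5.0],
    "b": [2.0, 4.0, 6.0],
    "c": [9.0, 7.0, 0.5],
    "d": [8.0, 7.5, 1.5],
})

xdf["ab_min"] = xdf[["a", "b"]].min(axis=1)
xdf["cd_min"] = xdf[["c", "d"]].min(axis=1)

# "+" on two Python lists concatenates them: one flat list of
# 2 * len(xdf) values, which is not one pair per row.
flat = xdf["ab_min"].tolist() + xdf["cd_min"].tolist()

# Converting the two-column selection to a 2-D array and then to nested
# lists gives one [ab_min, cd_min] list per row.
pairs = xdf[["ab_min", "cd_min"]].to_numpy().tolist()
xdf["minimum"] = pd.Series(pairs, index=xdf.index)

print(len(flat))
print(xdf["minimum"])
```

Wrapping the nested list in a `pd.Series` with the frame's index is a cautious choice made in this sketch; the direct assignment shown in the answer produces the same column.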

Try this, which uses pd.concat to put the two row-wise minimum Series side by side:

import pandas as pd
import numpy as np

xdf = pd.DataFrame({'a':np.random.rand(1,10)[0]*10,'b':np.random.rand(1,10)[0]*10,'c':np.random.rand(1,10)[0]*10,'d':np.random.rand(1,10)[0]*10,},index=np.arange(0,10,1))

print(xdf)

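# Row-wise minimum of each column pair, also stored back on xdf
ab = xdf['ab_min'] = xdf[['a','b']].min(axis=1)
cd = xdf['cd_min'] = xdf[['c','d']].min(axis=1)
# Place the two Series side by side as columns of a new DataFrame
blah = pd.concat([ab, cd], axis=1)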

print(blah)
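
Note that `pd.concat` with `axis=1` yields a DataFrame with the two minima in separate columns, not one list per row as the question asks. If the list form is wanted, one possible follow-up is sketched below. The toy values and the name `row_lists` are made up for illustration, and the `keys` argument is used here only to label the resulting columns.

```python
import pandas as pd

# Toy data chosen by hand for illustration
xdf = pd.DataFrame({
    "a": [3.0, 1.0],
    "b": [2.0, 4.0],
    "c": [9.0, 7.0],
    "d": [8.0, 7.5],
})

ab = xdf[["a", "b"]].min(axis=1)
cd = xdf[["c", "d"]].min(axis=1)

# keys= labels the two columns of the concatenated frame
blah = pd.concat([ab, cd], axis=1, keys=["ab_min", "cd_min"])

# Turn each row of the two-column frame into a Python list
row_lists = blah.to_numpy().tolist()
xdf["minimum"] = pd.Series(row_lists, index=xdf.index)

print(blah)
print(xdf["minimum"])
```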


You can use .apply with a lambda function along axis=1:

xdf['minimum'] = xdf.apply(lambda x: [x[['a','b']].min(),x[['c','d']].min()], axis=1)

Result:

>>> xdf
          a         b         c         d                                    minimum
0  0.662634  4.166338  8.864823  9.004818    [0.6626341544146663, 8.864822751494284]
1  6.854054  6.163417  6.510728  0.049498   [6.163416966676091, 0.04949754019059838]
2  6.389760  4.462319  2.435369  3.732534    [4.462318678134215, 2.4353686460846893]
3  4.628735  7.571098  1.900726  9.046384    [4.628735362058981, 1.9007255361271058]
4  3.203285  4.364302  2.473973  2.911911    [3.203285015796596, 2.4739732602476727]
5  5.357440  3.166420  9.908758  0.910704      [3.166420385020304, 0.91070444348338]
6  8.120486  6.395869  0.970977  5.278279    [6.395868901095546, 0.9709769503958143]
7  1.574765  7.184971  3.835641  4.495135     [1.574765093192545, 3.835640598199231]
8  8.688497  0.069061  0.771772  8.971878  [0.06906065557899743, 0.7717717844423222]
9  5.455920  2.630342  1.966357  7.374366    [2.6303421168291843, 1.966357159086991]

2 Comments

You nailed it, thanks. I tried .apply but couldn't figure out how to iterate over only two columns at a time. Also, my actual dataframe has 12 columns and I don't want to spell out the column names, so I considered xdf.apply(lambda x: [x[xdf.columns[0:2]].min(),x[xdf.columns[2:4]].min()], axis=1)
@Mainland Yeah, that would work too, but it has the possible drawback of hardcoding the positions of the columns. Glad my answer was helpful!
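
For a wider frame such as the 12-column one mentioned in the comment, one way to avoid typing column names is to take the minimum over consecutive groups of columns by position. This is only a sketch under stated assumptions: the helper name `pairwise_row_min` and its `width` parameter are hypothetical, and it assumes the columns to compare sit next to each other and are all numeric. Like the slicing idea in the comment, it depends on column order.

```python
import numpy as np
import pandas as pd


def pairwise_row_min(df: pd.DataFrame, width: int = 2) -> pd.Series:
    """Hypothetical helper: for each row, the minimum of every
    consecutive group of `width` columns, returned as a list."""
    n_rows, n_cols = df.shape
    if n_cols % width != 0:
        raise ValueError("number of columns must be a multiple of width")
    # Reshape to (rows, groups, width) and reduce over the last axis
    grouped = df.to_numpy().reshape(n_rows, n_cols // width, width)
    mins = grouped.min(axis=2)
    return pd.Series(mins.tolist(), index=df.index)


# Made-up 12-column frame standing in for the asker's real data
rng = np.random.default_rng(0)
wide = pd.DataFrame(rng.random((5, 12)) * 10, columns=list("abcdefghijkl"))
wide["minimum"] = pairwise_row_min(wide)

print(wide["minimum"])
```

Each entry of `wide["minimum"]` then holds six values, one per adjacent column pair.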
