1

I have a dataset of 50000 farmers who are growing crop in some villages. I Have to figure out how many farmers in same survey number land and how much his crop area [output image attached]

Here is my dummy data set

df
Out[5]: 
       Name    Village  Survey_no  Land_Area
0  Farmer_1  Village_1         26       0.33
1  Farmer_1  Village_1         26       0.40
2  Farmer_2  Village_1         26       0.30
3  Farmer_2  Village_1         26       0.40
4  Farmer_2  Village_1         26       0.50
5  Farmer_3  Village_1         26       0.52
6  Farmer_3  Village_1         26       0.40
7  Farmer_4  Village_1        151       0.23
8  Farmer_5  Village_1        151       0.25
9  Farmer_5  Village_1        151       0.10

Here is actual output required

enter image description here

Here is what I have so far:

df = (df.set_index(['Village','Survey_no', df.groupby(['Village','Survey_no']).cumcount().add(1)]).unstack().sort_index(axis=1, level=1))
df.columns = ['{}-{}'.format(x, y) for x, y in df.columns]

df = df.reset_index()


df

Village  Survey_no  Land_Area-1    ...       Name-6  Land_Area-7    Name-7
0  Village_1         26         0.33    ...     Farmer_3          0.4  Farmer_3
1  Village_1        151         0.23    ...          NaN          NaN       NaN

The output is not correct, because I don't get actual farmers wise total area of the same land and number of farmers in the same land.

2
  • 1
    look, is this what you want? Commented May 31, 2022 at 12:54
  • @inquirer, output required as per attached image, where survey number wise each farmers in same survey number show in different different columns with his area Commented May 31, 2022 at 14:59

2 Answers 2

3

Update:

dfs= df.groupby(['Name', 'Village', 'Survey_no']).agg('sum')
dfs = dfs.reset_index(level=0).set_index([dfs.groupby(['Village', 'Survey_no']).cumcount() + 1], append=True)\
         .unstack().sort_index(level=1, axis=1)
dfs.columns = [f'{i}_{j}' for i, j in dfs.columns]
dfs = dfs.assign(Total_Land_Area=dfs.filter(like='Land_Area').sum(axis=1))
dfs

Output:

                     Land_Area_1    Name_1  Land_Area_2    Name_2  Land_Area_3    Name_3  Total_Land_Area
Village   Survey_no                                                                                      
Village_1 26                0.73  Farmer_1         1.20  Farmer_2         0.92  Farmer_3             2.85
          151               0.23  Farmer_4         0.35  Farmer_5          NaN       NaN             0.58

Try this:

cnt = df.groupby(['Village', 'Survey_no']).cumcount()+1
dfs= df.groupby(['Village', 'Survey_no', cnt]).agg({'Name':'first',
                                              'Land_Area':'sum'})\
  .unstack()\
  .sort_index(level=1, axis=1)

dfs = dfs.assign(Total_Land_Area=dfs.filter(like='Land_Area').sum(axis=1))
dfs.columns = [f'{i}_{j}' if j else f'{i}' for i, j in dfs.columns]
dfs

Output:

                     Land_Area_1    Name_1  ...    Name_7 Total_Land_Area
Village   Survey_no                         ...                          
Village_1 26                0.33  Farmer_1  ...  Farmer_3            2.85
          151               0.23  Farmer_4  ...       NaN            0.58

[2 rows x 15 columns]
Sign up to request clarification or add additional context in comments.

1 Comment

Sir, in Our output, for Survey_no:26, need to be shown Farmer_1, Farmer_2 and Farmer_3 for single time, and Sum of the land area accordingly to the farmers.
1

Experience and strength was enough only for this. How to join bbb to aaa, overcomplicated solutions come to mind. Which I don't like.

bbb = df.groupby(['Name'])['Land_Area'].aggregate(['sum'])
aaa = df.groupby(['Village', 'Survey_no']).aggregate({'Land_Area': 'sum', 'Name': 'nunique'}).reset_index()
aaa = aaa.rename(columns={"Name": "No.of Farmers"})

Output bbb

           sum
Name          
Farmer_1  0.73
Farmer_2  1.20
Farmer_3  0.92
Farmer_4  0.23
Farmer_5  0.35

Output aaa

     Village  Survey_no  Land_Area  No.of Farmers
0  Village_1         26       2.85              3
1  Village_1        151       0.58              2

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.