I want to concatenate two values from the same column into a single value in that column. Here is my CSV file:

Date,Region,TemperatureMax,TemperatureMin,PrecipitationMax,PrecipitationMin
01/01/2016,Champagne Ardenne,12,6,2.5,0.3
02/01/2016,Champagne Ardenne,13,9,3.9,0.6
03/01/2016,Champagne Ardenne,14,7,22.5,12.5
01/01/2016,Bourgogne,9,5,0.1,0
02/01/2016,Bourgogne,11,8,16.3,4.2
03/01/2016,Bourgogne,10,5,12.2,6.3
01/01/2016,Pays de la Loire,12,6,2.5,0.3
02/01/2016,Pays de la Loire,13,9,3.9,0.6
03/01/2016,Pays de la Loire,14,7,22.5,12.5

I want a single Bourgogne Champagne Ardenne region instead of two separate ones, with the average of TemperatureMax, TemperatureMin, PrecipitationMax and PrecipitationMin calculated for each date:

01/01/2016,Bourgogne Champagne Ardenne,10.5,5.5,1.3,0.15
02/01/2016,Bourgogne Champagne Ardenne,12,8.5,10.1,2.4
03/01/2016,Bourgogne Champagne Ardenne,12,6,17.35,9.4
01/01/2016,Pays de la Loire,12,6,2.5,0.3
02/01/2016,Pays de la Loire,13,9,3.9,0.6
03/01/2016,Pays de la Loire,14,7,22.5,12.5
  • How would you know that Bourgogne comes first, and Champagne Ardenne second? Or do you have only these two in your dataset? Commented May 12, 2017 at 9:58
  • @IanS You can base it on the date. Commented May 12, 2017 at 9:59
  • But groupby + concatenate could return Champagne Ardenne Bourgogne, would that be correct? Commented May 12, 2017 at 10:01
  • @IanS I have other values in my dataset, and I want exactly Bourgogne Champagne Ardenne because I want to join it with another dataset that contains Bourgogne Champagne Ardenne. Commented May 12, 2017 at 10:04
  • Then you'll have to be more specific about how to do the grouping. Add other regions to your example, and explain how they should be grouped. Commented May 12, 2017 at 10:20

2 Answers


A more general solution is to first replace the region names using a dict, then use groupby and aggregate with mean:

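# Map both source regions to the combined label
d = {'Champagne Ardenne': 'Bourgogne Champagne Ardenne',
     'Bourgogne': 'Bourgogne Champagne Ardenne'}

df['Region'] = df['Region'].replace(d)

# Average the numeric columns per date and region, keeping the original row order
df1 = df.groupby(['Date', 'Region'], as_index=False, sort=False).mean()
print(df1)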
         Date                       Region  TemperatureMax  TemperatureMin  \
0  01/01/2016  Bourgogne Champagne Ardenne            10.5             5.5   
1  02/01/2016  Bourgogne Champagne Ardenne            12.0             8.5   
2  03/01/2016  Bourgogne Champagne Ardenne            12.0             6.0   
3  01/01/2016             Pays de la Loire            12.0             6.0   
4  02/01/2016             Pays de la Loire            13.0             9.0   
5  03/01/2016             Pays de la Loire            14.0             7.0   

   PrecipitationMax  PrecipitationMin  
0              1.30              0.15  
1             10.10              2.40  
2             17.35              9.40  
3              2.50              0.30  
4              3.90              0.60  
5             22.50             12.50  
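
For anyone who wants to try this end to end, here is a minimal self-contained sketch that starts from the CSV text in the question and writes the result back out as CSV. The names `csv_text` and `out_csv` are only illustrative, and reading from an in-memory string is an assumption made so the example runs without a file on disk.

```python
import io

import pandas as pd

# Sample data copied from the question
csv_text = """Date,Region,TemperatureMax,TemperatureMin,PrecipitationMax,PrecipitationMin
01/01/2016,Champagne Ardenne,12,6,2.5,0.3
02/01/2016,Champagne Ardenne,13,9,3.9,0.6
03/01/2016,Champagne Ardenne,14,7,22.5,12.5
01/01/2016,Bourgogne,9,5,0.1,0
02/01/2016,Bourgogne,11,8,16.3,4.2
03/01/2016,Bourgogne,10,5,12.2,6.3
01/01/2016,Pays de la Loire,12,6,2.5,0.3
02/01/2016,Pays de la Loire,13,9,3.9,0.6
03/01/2016,Pays de la Loire,14,7,22.5,12.5
"""

df = pd.read_csv(io.StringIO(csv_text))

# Both source regions get the same combined label; other regions are left alone
d = {'Champagne Ardenne': 'Bourgogne Champagne Ardenne',
     'Bourgogne': 'Bourgogne Champagne Ardenne'}
df['Region'] = df['Region'].replace(d)

# Rows that now share (Date, Region) are averaged into one row
df1 = df.groupby(['Date', 'Region'], as_index=False, sort=False).mean()

# Serialize without the index column, matching the layout of the input
out_csv = df1.to_csv(index=False)
print(out_csv)
```

Extending the dict `d` with more entries is enough to merge further groups of regions under other labels.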

1 Comment

Glad I could help!

Use groupby's agg method:

df.groupby('Date').agg({
    'Region': lambda g: g.sort_values().str.cat(sep=' '),
    'TemperatureMax': 'mean',
    'TemperatureMin': 'mean',
    'PrecipitationMax': 'mean',
    'PrecipitationMin': 'mean'
})

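Note that this concatenates the regions in alphabetical order. Because it groups by Date alone, every region that shares a date is combined, which in the sample data includes Pays de la Loire.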
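
To leave the remaining regions untouched, the aggregation can be restricted to the rows being merged and the rest appended afterwards. The following is only a sketch under that assumption: `to_merge`, `mask`, `merged` and `result` are hypothetical names, and the list of regions to combine has to be supplied by hand.

```python
import io

import pandas as pd

# Sample data copied from the question
csv_text = """Date,Region,TemperatureMax,TemperatureMin,PrecipitationMax,PrecipitationMin
01/01/2016,Champagne Ardenne,12,6,2.5,0.3
02/01/2016,Champagne Ardenne,13,9,3.9,0.6
03/01/2016,Champagne Ardenne,14,7,22.5,12.5
01/01/2016,Bourgogne,9,5,0.1,0
02/01/2016,Bourgogne,11,8,16.3,4.2
03/01/2016,Bourgogne,10,5,12.2,6.3
01/01/2016,Pays de la Loire,12,6,2.5,0.3
02/01/2016,Pays de la Loire,13,9,3.9,0.6
03/01/2016,Pays de la Loire,14,7,22.5,12.5
"""

df = pd.read_csv(io.StringIO(csv_text))

# Hypothetical list of the regions that should be combined
to_merge = ['Bourgogne', 'Champagne Ardenne']
mask = df['Region'].isin(to_merge)

# Aggregate only the selected rows: join the names alphabetically, average the rest
merged = (
    df[mask]
    .groupby('Date', sort=False)
    .agg({
        'Region': lambda s: ' '.join(sorted(s)),
        'TemperatureMax': 'mean',
        'TemperatureMin': 'mean',
        'PrecipitationMax': 'mean',
        'PrecipitationMin': 'mean',
    })
    .reset_index()
)

# Put the untouched regions back underneath the merged rows
result = pd.concat([merged, df[~mask]], ignore_index=True)
print(result)
```

The alphabetical join happens to produce "Bourgogne Champagne Ardenne" here; if the label needed for the later join does not follow alphabetical order, the dict-based replacement is the safer choice.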
