This is the dataframe I want to iterate through. The index is set for both countries and year.

                            ISO_code    gini    ECONOMIC FREEDOM    rank    quartile    1a_government_consumption
        countries   year                                                                                    
        Argentina   1980    ARG         40.8    4.25    80.0    4.0 6.911765
                    1995    ARG         48.9    6.95    37.0    2.0 8.058824
                    2000    ARG         51.1    7.34    37.0    2.0 6.877627
                    2001    ARG         53.3    6.84    56.0    2.0 6.752473
                    2002    ARG         53.8    6.28    79.0    3.0 6.905961
                    2003    ARG         50.7    6.16    86.0    3.0 7.264992
        Bolivia     1980    BOL         40.8    4.25    80.0    4.0 6.911765
                    1985    BOL         48.9    6.95    37.0    2.0 8.058824
                    1995    BOL         51.1    7.34    37.0    2.0 6.877627
                    2000    BOL         53.3    6.84    56.0    2.0 6.752473
                    2001    BOL         53.8    6.28    79.0    3.0 6.905961
                    2002    BOL         50.7    6.16    86.0    3.0 7.264992
        Chile       1985    CHI         40.8    4.25    80.0    4.0 6.911765
                    1990    CHI         48.9    6.95    37.0    2.0 8.058824
                    1995    CHI         51.1    7.34    37.0    2.0 6.877627
                    1999    CHI         53.3    6.84    56.0    2.0 6.752473
                    2002    CHI         53.8    6.28    79.0    3.0 6.905961
                    2003    CHI         50.7    6.16    86.0    3.0 7.264992

I would like to create a for loop that returns a dataframe like this one:

countries    change gini    change ef                                                                 
Argentina    +              +
Bolivia      -              +
Chile        -              -
  1. countries is simply the column with the country names from the previous dataframe.

  2. change gini should be based on the percentage difference between the first value of the gini column for each country and the most recent one. If the percentage change is positive, it should show a +; if it is negative, it should show a -.

  3. change ef follows the same logic as the change gini in the new dataframe, with the only difference that the values used for calculating the percentage change come from the ECONOMIC FREEDOM column in the original dataframe.

2 Answers

You can achieve this quite easily with groupby functions.
Unfortunately, the first and last values of the three countries in your dataset are identical, so the result shows the same two values three times.
(Perhaps there is something wrong with the sample data?)

First group the dataframe by countries and pick just the two columns of interest:

grpd = df.groupby('countries')[['gini', 'ECONOMIC FREEDOM']]

With this GroupBy object you can apply functions to the subsets of your data that are separated by the grouping feature, countries in your case.
For example, to get the last value in each group, just ask for

grpd.last()

           gini  ECONOMIC FREEDOM
countries                        
Argentina  50.7              6.16
Bolivia    50.7              6.16
Chile      50.7              6.16

or, correspondingly, for the first value in each group

grpd.first()

           gini  ECONOMIC FREEDOM
countries                        
Argentina  40.8              4.25
Bolivia    40.8              4.25
Chile      40.8              4.25

To calculate the percentage change of the last value with respect to the first, you could therefore simply write

(grpd.last() - grpd.first()) / grpd.first()

                gini  ECONOMIC FREEDOM
countries                             
Argentina  0.242647         0.449411
Bolivia    0.242647         0.449411
Chile      0.242647         0.449411
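
For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. It assumes the frame is indexed by countries and year as in the question. Argentina's two rows are copied from the question; Bolivia's values are invented purely so that the two countries give different results.

```python
import pandas as pd

# Argentina: first and last rows from the question.
# Bolivia: hypothetical values, made up for illustration only.
raw = pd.DataFrame({
    "countries": ["Argentina", "Argentina", "Bolivia", "Bolivia"],
    "year": [1980, 2003, 1980, 2002],
    "gini": [40.8, 50.7, 52.0, 46.8],
    "ECONOMIC FREEDOM": [4.25, 6.16, 5.00, 6.50],
})

# Sort by (countries, year) so "first" and "last" mean earliest and latest year.
df = raw.set_index(["countries", "year"]).sort_index()

# Group on the index level and keep only the two columns of interest.
grpd = df.groupby(level="countries")[["gini", "ECONOMIC FREEDOM"]]

# Relative change of the last value with respect to the first.
df_change = (grpd.last() - grpd.first()) / grpd.first()
print(df_change)
```

Note that first() and last() return the first and last non-null value per column, so a missing value at either end of a group is skipped rather than propagated.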

EDIT:
the output can also be formatted, e.g. like:

df_change = (grpd.last() - grpd.first()) / grpd.first()

df_change.applymap(lambda x: '{:+.1%}'.format(x))

             gini ECONOMIC FREEDOM
countries                         
Argentina  +24.3%           +44.9%
Bolivia    +24.3%           +44.9%
Chile      +24.3%           +44.9%

EDIT2:
for signs only:

df_change.applymap(lambda x: ['-', ' ', '+'][np.sign(x).astype(int)+1])

          gini ECONOMIC FREEDOM
countries                      
Argentina    +                +
Bolivia      +                +
Chile        +                +
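
If you would rather avoid the element-wise lambda, the same sign mapping can probably be done on the whole array at once. The sketch below uses np.select as a swapped-in alternative, not the approach shown above; the df_change values here are hypothetical stand-ins for the real percentage changes.

```python
import numpy as np
import pandas as pd

# Hypothetical relative changes, invented for illustration only.
df_change = pd.DataFrame(
    {"gini": [0.24, -0.10, 0.0], "ECONOMIC FREEDOM": [0.45, 0.30, -0.02]},
    index=pd.Index(["Argentina", "Bolivia", "Chile"], name="countries"),
)

values = df_change.to_numpy()

# '+' where the change is positive, '-' where negative, ' ' where exactly zero.
symbols = np.select([values > 0, values < 0], ["+", "-"], default=" ")

# Wrap the result back into a frame with the original labels.
signs = pd.DataFrame(symbols, index=df_change.index, columns=df_change.columns)
print(signs)
```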

Create empty lists and append desired values from countries, gini, and ECONOMIC FREEDOM columns for each country.

    countries = []
    gini = []
    efw = []
    for i in new_df.index.levels[0]:
        countries.append(i)
        country = new_df.loc[i]
        country = country.reset_index()
        x = country.iloc[0].tolist()
        y = country.iloc[-1].tolist()
        change_g = (((y[2] / x[2]) - 1) * 100)
        change_e = (((y[3] / x[3]) - 1) * 100)
        gini.append(change_g)
        efw.append(change_e)

Then do a for loop. For each number you append a + or a -.

g = []
e = []
for n in gini:
    # "+" for an increase, "-" otherwise
    if n > 0:
        g.append("+")
    else:
        g.append("-")

for f in efw:
    if f > 0:
        e.append("+")
    else:
        e.append("-")

Then create a dataframe with the lists countries, g, and e.

tuples = list(zip(countries,g,e))
changes = pd.DataFrame(tuples, columns=['Country','Change in Gini', "Change in Economic Freedom"])
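
One caveat with the loop: the positional lookups x[2] and x[3] depend on the column order after reset_index. A label-based variant might look like the sketch below. It assumes new_df is indexed by countries and year and that all values are positive, so the sign of the percentage change equals the sign of last minus first. The helper name sign and Bolivia's values are made up for illustration; Argentina's values come from the question.

```python
import pandas as pd

# Argentina: first and last rows from the question. Bolivia: hypothetical values.
new_df = pd.DataFrame(
    {"gini": [40.8, 50.7, 52.0, 46.8], "ECONOMIC FREEDOM": [4.25, 6.16, 5.00, 6.50]},
    index=pd.MultiIndex.from_tuples(
        [("Argentina", 1980), ("Argentina", 2003), ("Bolivia", 1980), ("Bolivia", 2002)],
        names=["countries", "year"],
    ),
)

def sign(first, last):
    """Hypothetical helper: '+' for an increase, '-' otherwise."""
    return "+" if last > first else "-"

rows = []
for name in new_df.index.levels[0]:
    country = new_df.loc[name]  # rows for one country, indexed by year
    rows.append((
        name,
        sign(country["gini"].iloc[0], country["gini"].iloc[-1]),
        sign(country["ECONOMIC FREEDOM"].iloc[0], country["ECONOMIC FREEDOM"].iloc[-1]),
    ))

changes = pd.DataFrame(
    rows, columns=["Country", "Change in Gini", "Change in Economic Freedom"]
)
print(changes)
```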

1 Comment

This reads more like general purpose Python and not pandas-style Python. Consider groupby and vectorized (non-loop) processing.
