
I am new to Python and would appreciate some help. I have a .csv file with multiple rows. One column holds the Country, another the id, and two more hold the latitude and longitude. I would like to build a new data frame with one row per unique combination of country, latitude and longitude, with all the matching ids combined. To make it easier, here is an example input df:

Country id  longitude   latitude
Angola  Pable   17.470  -12.245
Angola  Juan    17.470  -12.245
Albania Dimitri 20.032  41.141
Albania Dinko   20.032  41.141
United States   John    -112.599    45.705
United States   Paul    -112.599    45.705
United States   David   -112.599    45.705

I have tried:

df1 = df.groupby('Country').apply(lambda x: ','.join(x.id))

But it is not working.

The output I'm looking for is:

Country id  longitude   latitude
Angola  Pable, Juan 17.470  -12.245
Albania Dimitri, Dinko  20.032  41.141
United States   John, Paul, David   -112.599    45.705

I need this output as a pandas DataFrame, which I will use to plot a map with plotly in Python. Any ideas? Thank you in advance.

1 Answer

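# Join the ids within each country; keep the first longitude/latitude,
# since they are identical for every row of a country in the example.
print(
    df.groupby("Country")
    .agg({"id": ", ".join, "longitude": "first", "latitude": "first"})
    .reset_index()
)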

Prints:

         Country                 id  longitude  latitude
0        Albania     Dimitri, Dinko     20.032    41.141
1         Angola        Pable, Juan     17.470   -12.245
2  United States  John, Paul, David   -112.599    45.705
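
Note that "first" keeps only one coordinate pair per country, which is fine for the sample data but would silently drop rows if a country ever had more than one location. As a minimal sketch of an alternative, assuming the sample data from the question (rebuilt by hand here rather than read from the .csv), you could group on all three key columns instead. The `result` name and `sort=False` (to keep the input order) are illustrative choices, not part of the original answer:

```python
import pandas as pd

# Sample data copied from the question (assumption: the real .csv has these columns).
df = pd.DataFrame(
    {
        "Country": ["Angola", "Angola", "Albania", "Albania",
                    "United States", "United States", "United States"],
        "id": ["Pable", "Juan", "Dimitri", "Dinko", "John", "Paul", "David"],
        "longitude": [17.470, 17.470, 20.032, 20.032, -112.599, -112.599, -112.599],
        "latitude": [-12.245, -12.245, 41.141, 41.141, 45.705, 45.705, 45.705],
    }
)

# Group on every column that identifies a location, so distinct
# coordinates within one country would stay on separate rows.
result = (
    df.groupby(["Country", "longitude", "latitude"], as_index=False, sort=False)
    .agg(id=("id", ", ".join))
)

# Restore the column order used in the question.
result = result[["Country", "id", "longitude", "latitude"]]
print(result)
```

With `sort=False` the rows come out in the order the countries first appear in the input, which matches the desired output in the question.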