1

I have a data in a csv file here is a sample:

firstnb,secondnb,distance
901,19011,459.73618164837535
901,19017,492.5540450352788
901,19018,458.489289271722
903,13019,167.46632044684435
903,13020,353.16001204909657

the desired output:

901,19011,19017,19018
903,13019,13020

As you can see in the output I want to take firstnb column (901/903)

and put beside each one the secondnb I believe you can understand from the desired output better than my explanation :D

What I tried so far is the following:

import pandas as pd
import csv
df = pd.read_csv('test.csv')
    with open('neighborList.csv','w',newline='') as file:
        writer = csv.writer(file)
        secondStation = []
        for row in range(len(df)):
            firstStation = df['firstnb'][row]
            for x in range(len(df)):
                if firstStation == df['firstnb'][x]:
                    secondStation.append(df['secondnb'][x])
                    # line = firstStation ,secondStation
                    # writer.writerow(line)        
            print(firstStation,secondStation)
            secondStation = []

my code output this :

901 [19011, 19017, 19018]
901 [19011, 19017, 19018]
901 [19011, 19017, 19018]
903 [13019, 13020]
903 [13019, 13020]

3 Answers 3

2

Pandas has a built in function to do this, called groupby:

df = pd.read_csv(YOUR_CSV_FILE) 
df_grouped = list(df.groupby(df['firstnb'])) # group by first column

# chain keys and values into merged list
for key, values in df_grouped:
    print([key] + values['secondnb'].tolist())

Here I just print the sublists; you can save them into a new csv in any format you'd like (strings, ints, etc)

Sign up to request clarification or add additional context in comments.

Comments

2

First, I grouped the data by firstnb, creating a list of the values in secondnb using the aggregate function.

df[['firstnb','secondnb']].groupby('firstnb').aggregate(func=list).to_dict()

By turning this into a dict, we get:

{'secondnb': {901: [19011, 19017, 19018], 903: [13019, 13020]}}

I'm not entirely clear on what the final output should be (plain strings, lists, …), but from here on, it's easy to produce whatever you'd like.

For example, a list of lists:

intermediate = df[['firstnb','secondnb']].groupby('firstnb').aggregate(func=list).to_dict()

[[k] + v for k,v in intermediate['secondnb'].items()]

Result:

[[901, 19011, 19017, 19018], [903, 13019, 13020]]

Comments

1
def toList(a):
  res = []
  for r in a:
    res.append(r)
  return res
df.groupby('firstnb').agg(toList)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.