Read in a specific way from a csv file with pandas python

Question

I have data in a CSV file. Here is a sample:

firstnb,secondnb,distance
901,19011,459.73618164837535
901,19017,492.5540450352788
901,19018,458.489289271722
903,13019,167.46632044684435
903,13020,353.16001204909657

The desired output:

901,19011,19017,19018
903,13019,13020

As you can see in the output, I want to take each value in the firstnb column (901/903) and put all of its secondnb values beside it. I believe the desired output explains it better than I can :D

What I tried so far is the following:

import pandas as pd
import csv

df = pd.read_csv('test.csv')
with open('neighborList.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    secondStation = []
    for row in range(len(df)):
        firstStation = df['firstnb'][row]
        for x in range(len(df)):
            if firstStation == df['firstnb'][x]:
                secondStation.append(df['secondnb'][x])
                # line = firstStation ,secondStation
                # writer.writerow(line)
        print(firstStation, secondStation)
        secondStation = []

My code outputs this:

901 [19011, 19017, 19018]
901 [19011, 19017, 19018]
901 [19011, 19017, 19018]
903 [13019, 13020]
903 [13019, 13020]

Stefan · Accepted Answer · 2022-03-21 15:16:49Z

2

pandas has a built-in method for this, called groupby:

import pandas as pd

df = pd.read_csv(YOUR_CSV_FILE)  # YOUR_CSV_FILE is a placeholder for your file path
df_grouped = list(df.groupby('firstnb'))  # group by first column

# chain keys and values into merged list
for key, values in df_grouped:
    print([key] + values['secondnb'].tolist())

Here I just print the sublists; you can save them to a new CSV file in any format you'd like (strings, ints, etc.).
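For the saving step, here is a minimal sketch of one way it could look, assuming the output should have one comma-separated line per firstnb value as in the question. The sample data is inlined so the snippet runs on its own, and the output file name neighborList.csv is borrowed from the question's attempt; adjust both to your setup.

```python
import csv
import io

import pandas as pd

# Sample rows from the question, inlined instead of reading test.csv
sample = """firstnb,secondnb,distance
901,19011,459.73618164837535
901,19017,492.5540450352788
901,19018,458.489289271722
903,13019,167.46632044684435
903,13020,353.16001204909657
"""
df = pd.read_csv(io.StringIO(sample))

# One row per group: the firstnb key followed by all of its secondnb values
rows = [
    [int(key)] + group['secondnb'].tolist()
    for key, group in df.groupby('firstnb')
]

# Rows have different lengths, which csv.writer handles without padding
with open('neighborList.csv', 'w', newline='') as f:
    csv.writer(f).writerows(rows)
```

Because each group can have a different number of neighbours, the resulting file is ragged (rows of unequal length), so it is easier to read back with the csv module than with pd.read_csv.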

DocZerø · Accepted Answer · 2022-03-21 15:18:26Z

2

First, I grouped the data by firstnb, creating a list of the values in secondnb using the aggregate function.

df[['firstnb','secondnb']].groupby('firstnb').aggregate(func=list).to_dict()

By turning this into a dict, we get:

{'secondnb': {901: [19011, 19017, 19018], 903: [13019, 13020]}}

I'm not entirely clear on what the final output should be (plain strings, lists, …), but from here on, it's easy to produce whatever you'd like.

For example, a list of lists:

intermediate = df[['firstnb','secondnb']].groupby('firstnb').aggregate(func=list).to_dict()

[[k] + v for k,v in intermediate['secondnb'].items()]

Result:

[[901, 19011, 19017, 19018], [903, 13019, 13020]]
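A possible variation, not part of the original answer: the extra 'secondnb' level in the dict appears because the aggregation runs on a DataFrame. Selecting the column after groupby yields a Series instead, so the keys map straight to the lists. The sketch below inlines the question's sample data so it can run on its own.

```python
import io

import pandas as pd

# Sample rows from the question, inlined instead of reading a file
sample = """firstnb,secondnb,distance
901,19011,459.73618164837535
901,19017,492.5540450352788
901,19018,458.489289271722
903,13019,167.46632044684435
903,13020,353.16001204909657
"""
df = pd.read_csv(io.StringIO(sample))

# Selecting 'secondnb' after groupby gives a Series indexed by firstnb,
# where each value is the list of secondnb entries for that key
neighbors = df.groupby('firstnb')['secondnb'].agg(list)

# Same list-of-lists shape as above, without the intermediate nested dict
result = [[k] + v for k, v in neighbors.items()]
print(result)
```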

faire · Accepted Answer · 2022-03-21 15:19:48Z

1

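# Collect the values of one group into a plain Python list
def toList(a):
  res = []
  for r in a:
    res.append(r)
  return res

# Applied to every non-key column, so both secondnb and distance become lists
df.groupby('firstnb').agg(toList)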
