How to split row of data into columns separated by comma in python

Question

I have a text file in the following format that I am trying to convert into rows and columns:

red,red,blue
blue,red,blue
blue,blue,red

Once the conversion is complete, I want to store the above in a rows variable:

row[0] # should return 'red red blue'
row[0][2] # should return 'blue'

So far I have gotten as far as:

file = open('myfile.txt')
for row in file:
    # do something here

But I'm not sure what to do next. Can someone help? Thanks in advance!

Use the csv module. You've tagged this with numpy; are you trying to do this with pure Python or numpy?
– roganjosh, Commented Oct 14, 2017 at 18:11

jezrael · Accepted Answer · 2017-10-14 18:37:37Z

1. numpy solution (because of the numpy tag):

Use numpy.genfromtxt to read the file into a numpy array:

import numpy as np
arr = np.genfromtxt('file.txt',dtype='str',delimiter=',')
print (arr)
[['red' 'red' 'blue']
 ['blue' 'red' 'blue']
 ['blue' 'blue' 'red']]

print (arr[0])
['red' 'red' 'blue']

print (arr[0][2])
blue

2. pandas solution:

Use read_csv to load the file into a DataFrame, and loc to select values:

import pandas as pd

df = pd.read_csv('file.txt', header=None)
print (df)
      0     1      2
0   red   red   blue
1  blue   red   blue
2  blue  blue    red

#select first row to Series
print (df.loc[0])
0     red
1     red
2    blue
Name: 0, dtype: object

#select value by index and column
print (df.loc[0, 2])
blue

3. Pure Python solutions:

If you want nested lists, use a list comprehension:

with open('file.txt') as f:
    data = [line.rstrip('\r\n').split(',') for line in f]
print (data)

[['red', 'red', 'blue'], ['blue', 'red', 'blue'], ['blue', 'blue', 'red']]

Or with module csv:

import csv

with open("file.txt", newline='') as f:
    data = list(csv.reader(f, delimiter=','))
print (data)

[['red', 'red', 'blue'], ['blue', 'red', 'blue'], ['blue', 'blue', 'red']]

print (data[0])
['red', 'red', 'blue']

print (data[0][2])
blue
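
For anyone who wants to try the csv approach without creating a file by hand, here is a minimal self-contained sketch. The helper name read_rows and the temporary file are illustrative assumptions, not part of the answer above. It also strips stray whitespace around fields and skips blank lines, which is an assumption about the input rather than something the question requires.

```python
import csv
import os
import tempfile


def read_rows(path):
    """Return the file's comma-separated lines as a list of lists of strings."""
    with open(path, newline="") as handle:
        # Strip stray whitespace around each field and skip blank lines.
        return [[field.strip() for field in row]
                for row in csv.reader(handle) if row]


# Write the sample data from the question to a temporary file.
sample = "red,red,blue\nblue,red,blue\nblue,blue,red\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)

rows = read_rows(tmp.name)
os.remove(tmp.name)

print(rows[0])     # ['red', 'red', 'blue']
print(rows[0][2])  # blue
```

Opening the file in a with block closes it as soon as the rows are read, and newline="" is what the csv documentation recommends when passing a file object to csv.reader.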

Stack · Accepted Answer · 2017-10-14 18:13:30Z

5

Solution without any external modules:

output = []

with open('file.txt', 'r') as reading:
    file_input = reading.read().splitlines()

for row in file_input:
    output.append(row.split(','))

print(output)
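
The same loop can be written as a list comprehension. The sketch below is one possible variant, with a hypothetical helper name split_rows and the sample text inlined instead of read from file.txt. It also shows one thing to watch for when splitting file contents: if the text ends with a newline, str.split('\n') leaves an empty last item, while str.splitlines() does not.

```python
def split_rows(text):
    """Split comma-separated text into a list of rows, ignoring blank lines."""
    return [line.split(",") for line in text.splitlines() if line.strip()]


# Sample data from the question, including a trailing newline.
text = "red,red,blue\nblue,red,blue\nblue,blue,red\n"

output = split_rows(text)
print(output[0])    # ['red', 'red', 'blue']
print(len(output))  # 3

# Splitting on '\n' instead keeps an empty string after the final newline,
# which turns into a spurious [''] row.
naive = [row.split(",") for row in text.split("\n")]
print(naive[-1])    # ['']
```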


10 Comments

roganjosh Over a year ago

Why wouldn't you use the csv module? It's part of core python and anyway, the question is tagged with numpy.

Elazar Over a year ago

Just a nit, in case someone is going to copy-paste this: output can be initialized just before the loop, and even better, using a list comprehension.

Stack Over a year ago

Yes, you can optimize the code; this is just a simple, quick solution. You can suggest edits if you want.

Stack Over a year ago

Yes, you can. I was referring to numpy and pandas, @roganjosh.

roganjosh Over a year ago

I don't get what you're saying. Ok, pandas and numpy are separate, but if you're doing this in pure python you could still use the csv module.


RomanPerekhrest · Accepted Answer · 2017-10-14 18:15:18Z

1

An alternative solution with the pandas module, which is well suited to processing CSV files:

import pandas as pd

df = pd.read_csv('file.txt', header=None).T

print(df[0].tolist())    # ['red', 'red', 'blue']
print(df[0][2])          # blue
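
The .T in the snippet above transposes the DataFrame, so selecting column 0 with df[0] returns what was row 0 in the file. As a rough illustration of what transposing does, here is the same idea in plain Python using zip. The variable names are illustrative, and this is a sketch of the concept, not of how pandas implements it.

```python
rows = [
    ["red", "red", "blue"],
    ["blue", "red", "blue"],
    ["blue", "blue", "red"],
]

# zip(*rows) groups the i-th item of every row, so it yields the columns.
columns = [list(col) for col in zip(*rows)]
print(columns[0])        # ['red', 'blue', 'blue']

# Transposing a second time gives back the original rows.
restored = [list(row) for row in zip(*columns)]
print(restored == rows)  # True
```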

