4

I have a text file in the following format that I am trying to convert into rows and columns:

red,red,blue
blue,red,blue 
blue,blue,red

Once the conversion is complete, I want to store the above in a rows variable:

row[0] # should return 'red red blue'
row[0][2] # should return 'blue'

So far I have gotten as far as:

file = open('myfile.txt')
for row in file:
    # do something here

But I'm not sure what to do next. Can someone help? Thanks in advance!

2
  • Use the csv module. You've tagged this with numpy; are you trying to do this with pure python or numpy? Commented Oct 14, 2017 at 18:11
  • Take a look at numpy genfromtxt() Commented Oct 14, 2017 at 18:13

3 Answers

6

1. numpy solution (because of the numpy tag):

Use numpy.genfromtxt to get a numpy array:

import numpy as np
arr = np.genfromtxt('file.txt',dtype='str',delimiter=',')
print (arr)
[['red' 'red' 'blue']
 ['blue' 'red' 'blue']
 ['blue' 'blue' 'red']]

print (arr[0])
['red' 'red' 'blue']

print (arr[0][2])
blue

2. pandas solution:

Use read_csv to get a DataFrame, and loc to select values:

import pandas as pd

df = pd.read_csv('file.txt', header=None)
print (df)
      0     1      2
0   red   red   blue
1  blue   red   blue
2  blue  blue    red

# select the first row as a Series
print (df.loc[0])
0     red
1     red
2    blue
Name: 0, dtype: object

# select a single value by row label and column label
print (df.loc[0, 2])
blue

3. pure Python solutions:

If you want nested lists, use a list comprehension:

# the with block closes the file once the list is built
with open('file.txt') as f:
    data = [line.rstrip('\r\n').split(',') for line in f]
print (data)

[['red', 'red', 'blue'], ['blue', 'red', 'blue'], ['blue', 'blue', 'red']]

Or with the csv module:

import csv

# newline='' is what the csv docs recommend when opening files for csv.reader
with open('file.txt', newline='') as f:
    data = list(csv.reader(f, delimiter=','))
print (data)

[['red', 'red', 'blue'], ['blue', 'red', 'blue'], ['blue', 'blue', 'red']]

print (data[0])
['red', 'red', 'blue']

print (data[0][2])
blue
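
The question's first expected result is written as the string 'red red blue', while the approaches above return a list for each row. A minimal sketch of getting both forms, assuming an in-memory io.StringIO stand-in for the file so the example runs without touching disk (the names sample, first_row, first_row_text and cell are illustrative only):

```python
import csv
import io

# In-memory stand-in for the question's file (an assumption made so the
# example is self-contained).
sample = io.StringIO("red,red,blue\nblue,red,blue\nblue,blue,red\n")

# csv.reader yields one list of strings per line.
data = list(csv.reader(sample))

first_row = data[0]                    # list form of the first row
first_row_text = " ".join(first_row)   # string form shown in the question
cell = data[0][2]                      # third cell of the first row

print(first_row)
print(first_row_text)
print(cell)
```

Keeping the rows as lists is what makes data[0][2] return a whole cell; indexing the joined string with [2] would return a single character instead.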

5

Solution without any external modules:

output = []

with open('file.txt', 'r') as reading:
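    # splitlines() avoids the empty last element that split('\n') leaves
    # when the file ends with a newline
    file_input = reading.read().splitlines()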

for row in file_input:
    output.append(row.split(','))

print(output)
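
A more compact variant of the same idea could use a list comprehension. This is only a sketch: read_rows is a hypothetical helper name, and stripping whitespace from each cell and skipping blank lines are assumptions about what is wanted, not something the question specifies. The temporary file exists only to make the example runnable on its own.

```python
import os
import tempfile


def read_rows(path):
    """Return the file's lines as a list of lists of stripped cells."""
    with open(path) as handle:
        return [
            [cell.strip() for cell in line.split(",")]
            for line in handle
            if line.strip()  # skip blank lines (assumed to be unwanted)
        ]


# Write a throwaway copy of the sample data, including a trailing space
# and a blank final line, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.txt")
    with open(path, "w") as handle:
        handle.write("red,red,blue\nblue,red,blue \nblue,blue,red\n\n")
    rows = read_rows(path)

print(rows)
```

Iterating over the file object reads one line at a time, so the whole file does not need to be loaded with read() first.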

10 Comments

  • Why wouldn't you use the csv module? It's part of core Python and, anyway, the question is tagged with numpy.
  • Just a nit, in case someone is going to copy-paste this: output can be initialized just before the loop or, even better, built with a list comprehension.
  • Yes, you can optimize the code; this was just a quick, simple solution. You can suggest edits if you want.
  • Yes, you can. I was referring to numpy and pandas, @roganjosh.
  • I don't get what you're saying. OK, pandas and numpy are separate, but if you're doing this in pure Python you could still use the csv module.
1

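Alternative solution with the pandas module, which is well suited to processing CSV files. Transposing with .T turns each row of the file into a column, so df[0] is the first row: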

import pandas as pd

df = pd.read_csv('file.txt', header=None).T

print(df[0].tolist())    # ['red', 'red', 'blue']
print(df[0][2])          # blue
