Python text file strings into columns in spreadsheet

Question

I'm new to Python, and this is probably really easy, but I can't get my head around it at all.

I have a text file with a number of rows following this format:

 nothing doing    nothing[0]    doing[0] 
 hello world      hello[0]        world[2]

There are only spaces between the strings, no markers.

I'd like to extract these strings into an Excel file in the following format, so that each 'set' of strings is in a separate column.

           |        1      |       2        |       3
    ------------------------------------------------------
      1    | nothing doing |   nothing[0]   |  doing[0] 
    ------------------------------------------------------
      2    | hello world   |   hello[0]     |  world[2]

I've been looking at answers on here, but they don't quite answer this question.

Is the text file exactly like that? Are there tabs in between, like nothing doing\tnothing[0]\tdoing[0]? How do you differentiate between the first col with a space and the other two cols?
– dawg, Commented Jan 23, 2014 at 18:50
The text file is exactly like this. There are spaces between each set of strings. No markers.
– user3220585, Commented Jan 23, 2014 at 19:22
Your desired output file doesn't seem to have any commas (or any other fixed delimiter, like semicolons or tabs), but does seem to have vertical alignments. IOW, it doesn't look much like a csv file. Is that exactly the format you want? If so, you can remove csv from the question, because neither the input nor the output is csv.
– DSM, Commented Jan 23, 2014 at 20:03
I just want them separated with commas, or in separate columns when opened in Excel.
– user3220585, Commented Jan 23, 2014 at 22:13
You can create Excel spreadsheets using the python-excel package directly in Python. I can post an answer on how if you'd like.
– wnnmaw, Commented Jan 24, 2014 at 17:13

wnnmaw · Accepted Answer · 2014-01-26 18:33:17Z

3

Alright, here's how you'd write to an actual Excel file. Note that my method of splitting isn't as complicated as others because this is mostly about writing to Excel. You'll need the python-excel package to do this.

>>> data = []
>>> with open("data.txt") as f:
...     for line in f:
...         data.append([word for word in line.split("  ") if word])
...
>>> print(data)
[['nothing doing', 'nothing[0]', 'doing[0]\n'], ['hello world', 'hello[0]', 'world[2]']]
>>>
>>> import xlwt
>>> wb = xlwt.Workbook()
>>> sheet = wb.add_sheet("New Sheet")
>>> for row_index in range(len(data)):
...     for col_index in range(len(data[row_index])):
...         sheet.write(row_index, col_index, data[row_index][col_index])
...
>>> wb.save("newSheet.xls")
>>>

This produces a workbook with one sheet called "New Sheet" that looks like this:

[Image: sample output]

Hopefully this helps
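
For readers on Python 3, here is a minimal sketch of the same idea, not the code from the answer above. It assumes columns are separated by two or more whitespace characters, as in the sample rows, and strips each line first so the trailing newline does not end up in the last cell. The names `COLUMN_GAP`, `parse_rows` and `save_xls` are made up for illustration; `save_xls` needs the xlwt package and is defined but not called here.

```python
import re

# Assumption: two or more whitespace characters mark a column boundary,
# while a single space stays inside a value such as "nothing doing".
COLUMN_GAP = re.compile(r"\s{2,}")


def parse_rows(lines):
    """Split each non-blank line into a list of column values."""
    rows = []
    for line in lines:
        stripped = line.strip()  # drops the trailing "\n" and edge spaces
        if stripped:
            rows.append(COLUMN_GAP.split(stripped))
    return rows


def save_xls(rows, path):
    """Write the rows to an .xls workbook; requires the xlwt package."""
    import xlwt  # imported here so parse_rows works without xlwt installed

    workbook = xlwt.Workbook()
    sheet = workbook.add_sheet("New Sheet")
    for row_index, row in enumerate(rows):
        for col_index, value in enumerate(row):
            sheet.write(row_index, col_index, value)
    workbook.save(path)


# Sample lines copied from the question, including the stray edge spaces.
sample = [
    " nothing doing    nothing[0]    doing[0] \n",
    " hello world      hello[0]        world[2]\n",
]
rows = parse_rows(sample)
print(rows)
```

With a real file, something like `save_xls(parse_rows(open("data.txt")), "newSheet.xls")` would be the equivalent of the session above. If every row still lands in a single cell, the separators in the file are probably not runs of two or more spaces.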

answered Jan 26, 2014 at 18:33

wnnmaw


13 Comments

user3220585 Over a year ago

you mention the print data, there are over 600 rows to print!

wnnmaw Over a year ago

Then don't print it! :P I included that here so you can get a better idea of what I'm doing. It's not necessary for this to work.

user3220585 Over a year ago

removed the print because that was just stupid of me :P but again, this just produces all of the row in one excel cell :/

wnnmaw Over a year ago

@user3220585 Did you make any changes aside from removing print?

user3220585 Over a year ago

no changes at all, other than renaming the text file to my own


kch · Accepted Answer · 2014-01-23 19:45:24Z

0

You could use numpy to read the txt file and csv to write it as a csv file. The csv package, among others, lets you choose the delimiter you prefer.

import numpy
import csv

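# Caveat: loadtxt splits on any whitespace by default, so a value such as
# "nothing doing" is read as two separate fields.
data = numpy.loadtxt('txtfile.txt', dtype=str)

# newline='' stops the csv module from writing blank lines on Windows (Python 3).
with open('csvfile.csv', 'w', newline='') as fobj:
    csvwriter = csv.writer(fobj, delimiter=',')
    for row in data:
        csvwriter.writerow(row)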
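
To illustrate the point about choosing a delimiter, here is a small sketch using only the standard library. The rows are hard-coded so that it runs without numpy or an input file, and `to_delimited` is a hypothetical helper name, not part of the csv module.

```python
import csv
import io

# Rows as they would look once the text file has been split into columns.
rows = [
    ["nothing doing", "nothing[0]", "doing[0]"],
    ["hello world", "hello[0]", "world[2]"],
]


def to_delimited(rows, delimiter):
    """Serialise rows to delimited text using the given separator."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


comma_text = to_delimited(rows, ",")
semicolon_text = to_delimited(rows, ";")
print(comma_text)
print(semicolon_text)
```

A semicolon can be the more convenient choice where Excel is configured to use the comma as the decimal separator, though which delimiter Excel expects depends on the regional settings of the machine.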

answered Jan 23, 2014 at 19:45

kch


2 Comments

user3220585 Over a year ago

numpy library. I've never heard of this. I presume it needs to be downloaded?

kch Over a year ago

It depends on the Python distribution you use whether it is already installed or you need to install it. Python(x,y) includes numpy as far as I know.

DSM · Accepted Answer · 2014-01-24 16:22:09Z

0

Sometimes people who use mostly Excel get confused about the difference between how Excel displays its sheets and the csv representation in a file. Here, even though @martineau gave you exactly what you showed you wanted, I think what you're actually going to want is something more like:

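import csv
import re

# Python 3: open the CSV in text mode with newline="" (Python 2 used "wb").
with open("infile.txt") as fp_in, open("outfile.csv", "w", newline="") as fp_out:
    writer = csv.writer(fp_out)
    for line in fp_in:
        # Split on runs of two or more whitespace characters.
        row = re.split(r"\s\s+", line.strip())
        writer.writerow(row)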

which will turn

$ cat infile.txt 
nothing doing    nothing[0]    doing[0] 
hello world      hello[0]        world[2]

into

$ cat outfile.csv 
nothing doing,nothing[0],doing[0]
hello world,hello[0],world[2]
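
With several hundred rows it can help to check that every line really splits into the expected number of columns before writing anything. The following is a rough diagnostic sketch that reuses the same two-or-more-whitespace rule; `find_bad_rows` and the second sample line (with only one space after "world") are invented for illustration.

```python
import re


def find_bad_rows(lines, expected_columns=3):
    """Return (line_number, fields) for rows that don't split as expected."""
    bad = []
    for number, line in enumerate(lines, 1):
        fields = re.split(r"\s\s+", line.strip())
        if len(fields) != expected_columns:
            bad.append((number, fields))
    return bad


sample = [
    "nothing doing    nothing[0]    doing[0] \n",
    "hello world hello[0]  world[2]\n",  # hypothetical: single space after "world"
]
problems = find_bad_rows(sample)
print(problems)
```

Any line reported here has a separator that is not a run of two or more whitespace characters, so a different splitting rule would be needed for it.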

answered Jan 24, 2014 at 16:22

DSM


2 Comments

Brian Schlenker Over a year ago

as long as there are guaranteed to be more than one space between columns

DSM Over a year ago

@BrianSchlenker: if that's not guaranteed, we'd have to come up with another rule to separate column from column, and that would require knowing more about the values themselves.

martineau · Accepted Answer · 2014-01-24 17:07:32Z

0

The following assumes that each "column" is separated by two or more space characters in a row and that they will never contain a comma in their content.

import csv
import re

splitting_pattern = re.compile(r" {2,}")  # two or more spaces in a row
input_filepath = 'text_file_strings.txt'
output_filepath = 'output.csv'

# Python 3: text mode with newline='' for the csv module (Python 2 used 'wb').
with open(input_filepath, 'rt') as inf, open(output_filepath, 'w', newline='') as outf:
    writer = csv.writer(outf, dialect='excel')
    writer.writerow([''] + list(range(1, 4)))  # header row
    for i, line in enumerate(inf, 1):
        line = splitting_pattern.sub(',', line.strip())
        writer.writerow([i] + line.split(','))

Contents of output.csv file created:

,1,2,3
1,nothing doing,nothing[0],doing[0]
2,hello world,hello[0],world[2]
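
If a value might contain a comma, one possible variation (a sketch, not the method used in this answer) is to split directly on the pattern and let csv.writer quote fields where needed. It writes to an in-memory buffer so it runs without any files; `rows_to_csv` and the sample value "hello, world" are made up for illustration.

```python
import csv
import io
import re

splitting_pattern = re.compile(r" {2,}")  # two or more spaces in a row


def rows_to_csv(lines):
    """Build CSV text with a header row and a leading row-number column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, dialect="excel", lineterminator="\n")
    writer.writerow([""] + list(range(1, 4)))  # header row
    for i, line in enumerate(lines, 1):
        # Splitting on the pattern itself keeps any commas inside a value;
        # csv.writer then quotes that field automatically.
        writer.writerow([i] + splitting_pattern.split(line.strip()))
    return buffer.getvalue()


sample = ["hello, world      hello[0]        world[2]\n"]  # hypothetical data
csv_text = rows_to_csv(sample)
print(csv_text)
```

Reading the result back with csv.reader returns "hello, world" as a single field, which is also how a spreadsheet application would normally interpret the quoted value.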

edited Jan 24, 2014 at 17:07

answered Jan 23, 2014 at 22:33

martineau


2 Comments

user3220585 Over a year ago

this is currently exporting each row into one cell and not all the row is being displayed?

martineau Over a year ago

With the additional information you've added to your question, my updated answer should correct those problems.
