I have a CSV file of interview transcripts exported from an h5 file. When I read the rows into python, the output looks something like this:
line[0]=['title,date,responses']
line[1]=['[\'Transcript 1 title\'],"[\' July 7, 1997\']","[ '\nms. vogel: i look at all sectors of insurance, although to date i\nhaven\'t really focused on the reinsurers and the brokers.\n']']
line[2]=['[\'Transcript 2 title\'],"[\' July 8, 1997\']","[ '\nmr. tozzi: i formed cambridge in 1981. we are top-down sector managers,\nconstantly searching for non-consensus companies and industries.\n']']
etc...
I'd like to extract the text from the "responses" column ONLY into separate .txt files for every row in the CSV file, saving the .txt files into a specified directory and naming them as "t1.txt", "t2.txt", etc. according to the row number. The CSV file has roughly 30K rows.
Drawing from what I've already been able to find online, this is the code I have so far:
import csv
with open("twst.csv", "r") as f:
reader = csv.reader(f)
rownumber = 0
for row in reader:
g=open("t"+str(rownumber)+".txt","w")
g.write(row)
rownumber = rownumber + 1
g.close()
My biggest problem is that this pulls all columns from the row into the .txt file, but I only want the text from the "responses" column. Once I have that, I know I can loop through the various rows in the file (right now, what I have set up is just to test the first row), but I haven't found any guidance on pulling specific columns in the python documentation. I'm also not familiar enough with python to figure out the code on my own.
Thanks in advance for the help!