0

I have a text file that has numerous lines. I want to extract certain lines and write them to a CSV file. However, I want to write particular lines to the same row in the CSV file. For example, my text file is like this:

Name= Sarah F
Location= Baltimore MD
Name= Bob M
Location= Sacramento CA
Name= Tom M NY
Location= Brooklyn NY
Name= Anne F
Location= Morristown NJ

The CSV file I want to generate should contain each person's name, their sex, and the city and state they reside in:

Sarah,F,Baltimore,MD
Bob,M,Sacramento,CA
Tom,M,Brooklyn,NY
Anne,F,Morristown,NJ

When I use writer.writerows([list]), the name and sex are written to one row and the city and state to the next:

Sarah,F
Baltimore,MD
Bob,M
Sacramento,CA
Tom,M
Brooklyn,NY
Anne,F
Morristown,NJ

When I try to append the city and state to the [name, sex] list, they overwrite the original list instead of being appended.

Here is my code to do this:

import csv

file = open("file_to_use.txt", 'r')
csv_file = open("file_to_write.csv", 'wb')
writer = csv.writer(csv_file)

Row_lines =[]

for line in file: 

    if line.startswith("Name="):
        name_line = line.replace(" ", ",")
        name_line = name_line.strip("\n")

        Row_lines.append(name_line)

    if line.startswith("Location="):
        loc_line = line.replace(" ", ",")
        loc_line = loc_line.strip("\n")                

        Row_lines.append(loc_line)

    writer.writerows(Row_lines)

csv_file.close()

I know some of my logic is in the wrong place, but I can't figure out where.

1
  • Are Name= and Location= lines always alternating in the input file? Are there any other lines in the input file or only those two types? Commented Feb 28, 2016 at 16:15

4 Answers

2

There are two parts to your task. The first is joining the rows; you can use zip for that:

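with open(inputfile) as propsfile:
    # take the text after "=", split it on whitespace and keep the first
    # two fields (this drops the stray "NY" in "Name= Tom M NY")
    data = [row.split("=", 1)[1].split()[:2] for row in propsfile if "=" in row]

# join two at a time: even items are names, odd items are locations
data_tuples = zip(data[::2], data[1::2])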

The second is writing the rows; you can use the csv module for that:

import csv
# Python 3: the csv module expects files opened with newline=''
with open(outputfile, 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows([name+location for name, location in data_tuples])

Now we have the data in outputfile:

Sarah,F,Baltimore,MD
Bob,M,Sacramento,CA
...
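
For a self-contained check of the pairing step, here is a minimal sketch that runs the same zip idea on the sample lines from the question, using in-memory strings instead of files. The names sample and pair_rows are made up for this illustration, the [:2] slice (which drops the stray "NY" after "Tom M") is an assumption about what the asker wants, and the sketch assumes Name= and Location= lines strictly alternate.

```python
import csv
import io

# Sample input copied from the question.
sample = """Name= Sarah F
Location= Baltimore MD
Name= Bob M
Location= Sacramento CA
Name= Tom M NY
Location= Brooklyn NY
Name= Anne F
Location= Morristown NJ
"""

def pair_rows(lines):
    # Keep the first two whitespace-separated fields after "=".
    data = [ln.split("=", 1)[1].split()[:2] for ln in lines if "=" in ln]
    # Even-indexed items are names, odd-indexed items are locations.
    return [name + loc for name, loc in zip(data[::2], data[1::2])]

rows = pair_rows(sample.splitlines())

# Write to an in-memory buffer so the sketch needs no files on disk.
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(rows)
print(out.getvalue())
```

If the input can contain two Name= lines in a row, or a Location= line without a preceding name, the slicing with [::2] and [1::2] will pair the wrong lines, so this approach only fits strictly alternating input.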

1

You are adding two separate items to Row_lines for what should be a single CSV row. Add only one item to Row_lines per output row.
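
One way to do that, as a rough sketch: hold the name fields back until the matching location line arrives, then add the combined row. The function name build_rows and the variable pending are hypothetical, and the input is an in-memory list rather than the asker's file.

```python
import csv
import io

lines = [
    "Name= Sarah F",
    "Location= Baltimore MD",
    "Name= Bob M",
    "Location= Sacramento CA",
]

def build_rows(lines):
    rows = []
    pending = None  # name fields waiting for their location line
    for line in lines:
        key, _, value = line.partition("=")
        fields = value.split()
        if key == "Name":
            pending = fields
        elif key == "Location" and pending is not None:
            rows.append(pending + fields)  # one complete CSV row
            pending = None
    return rows

rows = build_rows(lines)

buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerows(rows)
print(buf.getvalue())
```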

1

Each time you call Row_lines.append(), you are adding a new item to the list. Each item in the list is written as a separate line when you call writer.writerows(Row_lines).
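
To see the one-item-per-line behaviour in isolation, the following sketch writes the same four values first as two items and then as a single item. It writes to an in-memory buffer, and the helper name to_csv is made up for this example.

```python
import csv
import io

def to_csv(rows):
    # Write the rows to an in-memory buffer and return the CSV text.
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()

# Two items in the list: two lines in the output.
separate = to_csv([["Sarah", "F"], ["Baltimore", "MD"]])

# One item holding all four fields: one line in the output.
combined = to_csv([["Sarah", "F", "Baltimore", "MD"]])

print(separate)
print(combined)
```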

Each time you encounter a name line, you should start building a new row from that line, but don't add it to the Row_lines list yet. Each time you encounter a location line, you should append its fields to the name fields, creating a complete row which you can now add to the Row_lines list. Each row should be a list of fields rather than a single string, because csv.writer treats a string as a sequence of one-character fields.

And instead of calling writerows() on each iteration of the loop, you should call it once after you have compiled the full list of rows.

import csv

file = open("file_to_use.txt", 'r')
# Python 3: open in text mode with newline='' ('wb' only works on Python 2)
csv_file = open("file_to_write.csv", 'w', newline='')
writer = csv.writer(csv_file)

Row_lines = []

for line in file:

    if line.startswith("Name="):
        # start building the new row: drop the "Name=" label and split
        # the rest into fields (first two only, so the stray "NY" in
        # "Name= Tom M NY" is ignored)
        current_row = line.split("=", 1)[1].split()[:2]

    if line.startswith("Location="):
        # append the extra fields to the current row
        current_row = current_row + line.split("=", 1)[1].split()

        # add the completed row to the output list
        Row_lines.append(current_row)

# call this after you have added
# all rows, not after each one
writer.writerows(Row_lines)

file.close()
csv_file.close()

0

Here is some code that does not use any libraries, not even the csv module.

Since your lines are not necessarily consistent (e.g. "Name= Tom M NY", where NY should probably not be there), this code looks at the first two data entries following "Name=" or "Location=" and ignores any subsequent entries (like "NY" in the example above).

# Opening the file to use
input_file = open(r"C:\Temp\file_to_use.txt", 'r')

# Creating an empty CSV file
output_file = open(r"C:\Temp\output.csv", 'w')

# Go through the text file and check whether each line holds name or location information.
# When a location line is reached, the saved name and sex are written to the CSV file
# together with the city and state.
for line in input_file:
    line = line.split("=")
    if line[0] == "Name":
        name, sex = line[1].strip().split()[:2]
    elif line[0] == "Location":
        city, state = line[1].strip().split()[:2]
        output_file.write('%s,%s,%s,%s\n' % (name, sex, city, state))

# Closes the opened files
input_file.close()
output_file.close()
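
The effect of the [:2] slice can be checked in isolation. This is only a small sketch; first_two is a hypothetical helper and not part of the code above.

```python
def first_two(line):
    # Text after "=", split on whitespace, first two entries only.
    return line.split("=", 1)[1].strip().split()[:2]

print(first_two("Name= Tom M NY"))
print(first_two("Location= Brooklyn NY"))
print(first_two("Location= New York NY"))
```

Note that the same slice also cuts a multi-word city such as "New York NY" down to its first two words, so it only suits input where every city is a single word.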
