I am new to data processing with the csv module. I have an input file (Input Data Set) and I am using this code:

import csv
path1 = "C:\\Users\\apple\\Downloads\\Challenge\\raw\\charity.a.data"
csv_file_path = "C:\\Users\\apple\\Downloads\\Challenge\\raw\\output.csv.bak"

with open(path1, 'r') as in_file:
    in_file.__next__()
    stripped = (line.strip() for line in in_file)
    lines = (line.split(":$%:") for line in stripped if line)
    with open(csv_file_path, 'w') as out_file:
        writer = csv.writer(out_file)
        writer.writerow(('id', 'donor_id','last_name','first_name','year','city','state','postal_code','gift_amount'))
        writer.writerows(lines)

Current Output File

Is it possible to remove the colon (:) from the first and last columns of the CSV file? I want the output to look like Expected OUTPUT (after removing the colons). Please help.

  • So you want us to do this for you? Do you have any code you've tried? Commented Apr 1, 2017 at 12:17
  • Just a notice. Keep in mind that the gift_amount column has commas (,) in the values, which means your dataset has to be tab (or something else other than comma) separated. As @Artagel said, please provide some code of what you have done so far. Commented Apr 1, 2017 at 12:35
  • My initial input is a text file, and the format is :id:$%:donor_id:$%:last_name:$%:first_name:$%:year:$%:city:$%:state:$%:postal_code:$%:gift_amount:$ :1:$%:10763:$%:Aaron and Shirley Family Foundation:$%:Aaron:$%:2017:$%:New York:$%:NY:$%:10065:$%:380.00: which is converted into a CSV file. Commented Apr 1, 2017 at 12:51
  • Using the above code I converted the text file into a CSV file, but I am not able to remove the colon in the first column and the last column. Commented Apr 1, 2017 at 13:01
  • Yes it's possible, read docs.python.org/3/library/csv.html Commented Apr 1, 2017 at 13:24

1 Answer

If you just want to remove the ':' from the first and last columns, this should work. Keep in mind that your dataset should be tab-separated (or use some delimiter other than a comma) before you read it because, as I commented on your question, some values in your dataset contain commas.

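path1 = '/path/input.csv'
path2 = '/path/output.csv'

with open(path1, 'r') as in_file, open(path2, 'w') as out_file:
    rows = iter(in_file.readlines())
    # Copy the header line through unchanged.
    out_file.write(next(rows))

    for row in rows:
        # row still ends with '\n': [1:] drops the leading ':',
        # [:-2] drops the trailing ':' together with the newline.
        out_file.write(row[1:][:-2] + '\n')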
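
The slice above assumes every row ends with ':' followed by a newline. As a rough sketch of a more defensive variant, the hypothetical helper `trim_row` below (not part of the original answer) removes one leading and one trailing ':' only when they are actually present, so a last line without a newline is not cut short. The sample row is made up for illustration.

```python
def trim_row(row):
    """Drop one leading ':' and one trailing ':' from a line, if present."""
    text = row.rstrip("\r\n")      # remove the line ending first
    if text.startswith(":"):
        text = text[1:]            # leading ':' of the first column
    if text.endswith(":"):
        text = text[:-1]           # trailing ':' of the last column
    return text


# Hypothetical row, shaped like the CSV produced by the question's code.
sample = ":1,10763,Aaron,2017,380.00:\n"
print(trim_row(sample))
```

Inside the loop you would then write `trim_row(row) + '\n'` instead of the fixed slice.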

Update

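Now that you have posted your code, I added a small change so the whole process starts from the initial file. The idea is the same: trim the extra characters from both ends of each line before splitting it. So instead of line.strip() you should have line.strip()[1:][:-2]. Note that strip() has already removed the newline at this point, so [:-2] drops the last two characters of the line; if your lines end with a single ':', use line.strip()[1:-1] instead.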

import csv
path1 = "C:\\Users\\apple\\Downloads\\Challenge\\raw\\charity.a.data"
csv_file_path = "C:\\Users\\apple\\Downloads\\Challenge\\raw\\output.csv.bak"

with open(path1, 'r') as in_file:
    next(in_file)  # skip the original header line
    # [1:] drops the leading ':'; [:-2] drops the last two characters.
    stripped = (line.strip()[1:][:-2] for line in in_file)
    lines = (line.split(":$%:") for line in stripped if line)
    # newline='' is what the csv docs recommend for files passed to csv.writer.
    with open(csv_file_path, 'w', newline='') as out_file:
        writer = csv.writer(out_file)
        writer.writerow(('id', 'donor_id','last_name','first_name','year','city','state','postal_code','gift_amount'))
        writer.writerows(lines)
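
For reference, here is a self-contained sketch of the same conversion that you can run without touching any files. The names `parse_line` and `convert` are my own, not from the code above, and the second data row is invented to show that csv.writer quotes a value containing a comma. The first data row follows the format given in the question's comments. Treat it as one possible approach rather than the definitive one.

```python
import csv
import io

FIELDS = ('id', 'donor_id', 'last_name', 'first_name', 'year',
          'city', 'state', 'postal_code', 'gift_amount')
DELIMITER = ":$%:"


def parse_line(line):
    """Split one raw line into fields, dropping the outer ':' markers."""
    text = line.strip()
    if text.startswith(":"):
        text = text[1:]
    if text.endswith(":"):
        text = text[:-1]
    return text.split(DELIMITER)


def convert(in_file, out_file):
    """Skip the header line of in_file and write the remaining rows as CSV."""
    next(in_file, None)  # skip the original header line
    writer = csv.writer(out_file)
    writer.writerow(FIELDS)
    for line in in_file:
        if line.strip():  # ignore blank lines
            writer.writerow(parse_line(line))


raw = (
    ":id:$%:donor_id:$%:last_name:$%:first_name:$%:year:$%:city:$%:state"
    ":$%:postal_code:$%:gift_amount:$\n"
    ":1:$%:10763:$%:Aaron and Shirley Family Foundation:$%:Aaron:$%:2017"
    ":$%:New York:$%:NY:$%:10065:$%:380.00:\n"
    # Invented row: the comma in the amount ends up quoted in the output.
    ":2:$%:10764:$%:Example Fund:$%:Pat:$%:2017:$%:Boston:$%:MA:$%:02108"
    ":$%:1,380.00:\n"
)

out = io.StringIO()
convert(io.StringIO(raw), out)
print(out.getvalue())
```

To use it on real files, pass `open(path1, 'r')` and `open(csv_file_path, 'w', newline='')` in place of the StringIO objects.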

2 Comments

This code will ONLY work with the .csv file you create after doing your processing. I will add another solution to my answer that uses your code to do the whole process from the beginning.
I'm glad it worked! Please accept my solution if it answers your question :)
