1

I have a .csv file and I am trying to erase some rows/lines that have no usable information. I want to delete lines that do not have a value in a certain column. I am kinda new to programming and I could not find a way to do this. Is this possible?

I tried to delete a line if it did not have a certain number in it but that did not work as wel.

f = open('C:myfile.csv', 'rb')
lines = f.readlines()
f.close()

filename = 'myfile.csv'

f = open(filename, 'wb')
for line in lines:
    if line != "1":
        f.write(line)

f.close()

here are some sample rows:

0,593   0,250984    -20,523384  -25,406271
0,594   0,250984        
0,595   0,250984        
0,596   0,250984        
0,597   0,250984    -15,793088  -21,286336
0,598   0,250984        
0,599   0,908811        
0,6     0,893612        
0,601   0,784814    -12,130922  -11,825742
0,602   0,909238        
0,603   0,25309     
0,604   0,38435     
0,605   0,602954    -8,316167   -3,43328
0,606   0,642628        
0,607   0,39201     
0,608   0,384289        
0,609   0,251656    -11,825742  -5,874723

So I want to delete the rows when there is no number in the third and fourth column.

3
  • 1
    Have you tried coding this yet? Can you share your code and explain what is currently not working out for you? Commented Mar 24, 2016 at 13:53
  • can you show us some sample rows? you tagged your question with python, any sample python code and/or research? Commented Mar 24, 2016 at 13:54
  • Please don't put code in comments — edit your question and add it. Commented Mar 24, 2016 at 13:59

2 Answers 2

1

You can use Python's csv library to help you do this. Your data appears to be tab delimited, as such the following script should work:

import csv

with open('input.csv', 'rb') as f_input, open('output.csv', 'wb') as f_output:
    csv_output = csv.writer(f_output, delimiter = '\t')

    for row in csv.reader(f_input, delimiter = '\t'):
        if len(row[2]) and len(row[3]):
            csv_output.writerow(row)

Giving you an output.csv file containing:

0,593   0,250984    -20,523384  -25,406271
0,597   0,250984    -15,793088  -21,286336
0,601   0,784814    -12,130922  -11,825742
0,605   0,602954    -8,316167   -3,43328
0,609   0,251656    -11,825742  -5,874723

Note, each of your rows appears to have 4 columns (your data has tabs for these missing entries), because of this, it is not enough to simply test the length is 4. You need to test the contents of the two cells.

Sign up to request clarification or add additional context in comments.

Comments

0
import csv

fn_in = 'test.csv'
fn_out = 'outfile.csv'

with open(fn_in, 'r') as inp, open(fn_out, 'w') as out:
    writer = csv.writer(out)
    for row in csv.reader(inp):
        if len(row)==4:
            writer.writerow(row)

1 Comment

Please explain your code to make your answer more helpful for others.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.