I have the following Python code:

import os
from os import listdir

def find_csv_filenames( path_to_dir, suffix=".csv" ):
    filenames = listdir(path_to_dir)
    return [ filename for filename in filenames if filename.endswith( suffix ) ]
    # the call below always raises the error
filenames = find_csv_filenames('C:\casperjs\project\teleservices\csv')
for name in filenames:
    print name

I get this error:

filenames = find_csv_filenames('C:\casperjs\project\teleservices\csv')
Error message: `TabError: inconsistent use of tabs and spaces in indentation`

What I need: I want to read all the CSV files and convert them from ANSI encoding to UTF-8, but the code above only lists the name of each CSV file. What is wrong with it?

  • Please format your code and post the full error message. Commented Dec 12, 2013 at 8:12
  • OK, thanks. I have now added the error message. Commented Dec 12, 2013 at 8:22
  • You should fix the indentation first. Commented Dec 12, 2013 at 8:30
  • Is the error gone after formatting? Commented Dec 12, 2013 at 8:51
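
To make the indentation advice concrete, here is a minimal sketch, assuming Python 3 (the question's own code uses the Python 2 print statement). The helper name has_tab_error and the sample snippets are hypothetical and are not the asker's file. It also checks a second thing worth watching in the question's path literal: "\t" inside a normal string is read as a tab character, which a raw string avoids.

```python
# Python 3 sketch; has_tab_error and the sample strings are illustrative only.

def has_tab_error(source):
    """Return True if compiling `source` raises TabError."""
    try:
        compile(source, "<snippet>", "exec")
    except TabError:
        return True
    return False

# One line indented with a tab, the next with eight spaces: the nesting is ambiguous.
mixed = "if True:\n\tx = 1\n        y = 2\n"
# The same block indented with four spaces throughout.
spaces_only = "if True:\n    x = 1\n    y = 2\n"

print(has_tab_error(mixed))
print(has_tab_error(spaces_only))

# In a normal literal "\t" becomes a tab; in a raw string it stays backslash + "t".
plain = "project\teleservices"
raw = r"project\teleservices"
print("\t" in plain, "\t" in raw)
```

Converting every indent in the file to spaces and writing Windows paths as raw strings (or with forward slashes) addresses both points.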

3 Answers

The code below prints each line of every CSV file, converted from a source encoding (cp866 in my case) to UTF-8:

import os
from os import listdir

def find_csv_filenames(path_to_dir, suffix=".csv" ):
    path_to_dir = os.path.normpath(path_to_dir)
    filenames = listdir(path_to_dir)
    # Keep regular files (not directories) whose name ends with the suffix
    fp = lambda f: not os.path.isdir(os.path.join(path_to_dir, f)) and f.endswith(suffix)
    return [os.path.join(path_to_dir, fname) for fname in filenames if fp(fname)]

def convert_files(files, ascii, to="utf-8"):
    for name in files:
        print "Convert {0} from {1} to {2}".format(name, ascii, to)
        with open(name) as f:
            for line in f:
                # Decode with the source encoding, then re-encode to the target
                print unicode(line, ascii).encode(to)

csv_files = find_csv_filenames('/path/to/csv/dir', ".csv")
convert_files(csv_files, "cp866")  # cp866 is my source encoding; replace it with yours.
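
The code above only prints the converted lines. If the goal is a UTF-8 copy on disk, something along these lines might work. This is a sketch rather than a tested drop-in: convert_file and its parameter names are hypothetical, and the source encoding has to be supplied by the caller because "ANSI" depends on the Windows code page (cp1252 is common on Western European systems, which is an assumption about the asker's machine). io.open is used because it accepts an encoding argument on both Python 2.6+ and Python 3.

```python
import io

def convert_file(src_path, dst_path, source_encoding, target_encoding="utf-8"):
    """Re-encode one text file, reading and writing line by line."""
    # newline="" leaves the original line endings untouched on both sides
    with io.open(src_path, "r", encoding=source_encoding, newline="") as src, \
         io.open(dst_path, "w", encoding=target_encoding, newline="") as dst:
        for line in src:
            # `line` is decoded text; writing it re-encodes to the target encoding
            dst.write(line)
```

Calling convert_file(name, name + ".utf8", "cp1252") for each name returned by find_csv_filenames would leave the original files untouched.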

Refer to the documentation: http://docs.python.org/2/howto/unicode.html

If you have a unicode string, say s, that you want to encode in a specific encoding, call s.encode() with the encoding name, for example s.encode("utf-8").
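
As a small illustration of that round trip, the sketch below decodes bytes from one encoding and encodes them to another. The sample bytes and the choice of cp1252 as the "ANSI" code page are assumptions made for the example.

```python
# Bytes as they might be stored in a cp1252 ("ANSI") file; the sample is illustrative.
raw = b"caf\xe9"

text = raw.decode("cp1252")   # bytes -> text, using the source encoding
utf8 = text.encode("utf-8")   # text -> bytes, using the target encoding

print(repr(utf8))
```

The decode step is what needs the original encoding; encode() only determines the output bytes.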

Your code only lists the CSV files. It doesn't do anything with them. If you need to read them, you can use the csv module. If you need to handle the encoding, you can do something like this:

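import csv, codecs

def safe_csv_reader(the_file, encoding, dialect=csv.excel, **kwargs):
    # The Python 2 csv module yields byte strings; decode each cell to unicode
    csv_reader = csv.reader(the_file, dialect=dialect, **kwargs)
    for row in csv_reader:
        yield [codecs.decode(cell, encoding) for cell in row]

# 'file.csv' is a placeholder name; Python 2's csv module expects binary mode
with open('file.csv', 'rb') as csv_file:
    reader = safe_csv_reader(csv_file, "utf-8", delimiter=',')
    for row in reader:
        print row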
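
On Python 3 the csv module works on text rather than bytes, so the decoding can happen when the file is opened. A minimal sketch under that assumption follows; read_csv_rows is a hypothetical name, and the encoding must still be supplied by the caller.

```python
import csv
import io

def read_csv_rows(path, encoding, **kwargs):
    """Yield each row of the CSV file at `path` as a list of text strings."""
    # newline="" is what the csv module documentation recommends for file objects
    with io.open(path, "r", encoding=encoding, newline="") as csv_file:
        for row in csv.reader(csv_file, **kwargs):
            yield row
```

Extra keyword arguments such as delimiter=',' are passed straight through to csv.reader.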
