How to replace number in csv with a string with python

Question

I am trying to fix the first row of a CSV file. If a column name in the header starts with anything other than a-z, NUM has to be prepended. The following code fixes the special characters in each column of the first row, but the [!a-z] replacement somehow has no effect.

path = ('test.csv')

for fname in glob.glob(path):

    with open(fname, newline='') as f:
        reader = csv.reader(f)
        header = next(reader) 
        header = [column.replace ('-','_') for column in header]
        header = [column.replace ('[!a-z]','NUM') for column in header]

What am I doing wrong? Please provide suggestions. Thanks.

str.replace does not take regex patterns. You want re.sub instead.
– Moses Koledoye, Commented Oct 24, 2017 at 17:30
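
To illustrate the comment above, here is a minimal sketch of the difference. The sample value "2hello" is made up for the example; note also that a regex character class is negated with ^ rather than !.

```python
import re

column = "2hello"  # made-up sample column name

# str.replace searches for the literal text "[!a-z]", which is not in the string,
# so nothing changes.
literal = column.replace("[!a-z]", "NUM")

# re.sub treats the first argument as a regex; "^" inside the brackets negates
# the class, so every character outside a-z is replaced.
regex = re.sub(r"[^a-z]", "NUM", column)

print(literal)  # 2hello
print(regex)    # NUMhello
```

Without an anchor, this pattern replaces every non a-z character in the string, not only the first one.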

utengr · Accepted Answer · 2017-10-24 18:31:54Z

1

You can do it like this.

# csv file:
# 2HELLO,?WORLD
# 1,2

import csv
with open("test.csv", newline='') as f:
    reader = csv.reader(f)
    header = next(reader)
    print("Original header", header)
    # Replace the first character with "NUM" when it is not a letter; other columns stay unchanged.
    header = ["NUM" + col[1:] if not col[0].isalpha() else col for col in header]
    print("Modified header", header)

Output:

Original header ['2HELLO', '?WORLD']
Modified header ['NUMHELLO', 'NUMWORLD']

The above list comprehension is equivalent to the following for loop:

for indx in range(len(header)):
    if not header[indx][0].isalpha():
        header[indx] = "NUM" + header[indx][1::]

If you want to replace only numbers, then use the following:

if header[indx][0].isdigit():

If your requirements change, you can adapt the condition using other string functions: https://docs.python.org/2/library/string.html
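
For completeness, here is a rough sketch of how the fixed header could be written back out with the data rows. The helper names fix_column and rewrite_header and the output file name test_fixed.csv are assumptions made for this example, not part of the original answer; the demo builds its own sample file in a temporary directory.

```python
import csv
import os
import tempfile


def fix_column(name):
    """Hypothetical helper: replace a leading non-letter with "NUM"."""
    if name and not name[0].isalpha():
        return "NUM" + name[1:]
    return name


def rewrite_header(src_path, dst_path):
    """Hypothetical helper: copy src_path to dst_path, fixing only the header row."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader)
        writer.writerow([fix_column(name) for name in header])
        writer.writerows(reader)  # data rows are copied unchanged


# Small demonstration on a sample file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "test.csv")
    dst_path = os.path.join(tmp, "test_fixed.csv")
    with open(src_path, "w", newline="") as f:
        f.write("2HELLO,?WORLD,name\r\n1,2,3\r\n")
    rewrite_header(src_path, dst_path)
    with open(dst_path, newline="") as f:
        result = list(csv.reader(f))

print(result)  # [['NUMHELLO', 'NUMWORLD', 'name'], ['1', '2', '3']]
```

Writing to a separate file avoids reading and overwriting the same file at once. Keep in mind that isalpha() also accepts uppercase and non-ASCII letters, which is broader than a-z.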

edited Oct 24, 2017 at 18:31

answered Oct 24, 2017 at 18:10

utengr


Stas Christiansen · Accepted Answer · 2017-10-24 17:41:13Z

0

I believe you would want to replace the 'column.replace' portion with something along these lines:

re.sub(r'[!a-z]', 'NUM', column)

For specifics, see the full documentation: https://docs.python.org/2/library/re.html and https://www.regular-expressions.info/python.html

answered Oct 24, 2017 at 17:41

Stas Christiansen


3 Comments

user8651755 Over a year ago

And the ! needs to be replaced with ^.

subash707 Over a year ago

Well, I replaced it with header = re.sub('[^a-z]', 'NUM', str(header)), but another issue is that it splits each column name and puts the pieces into separate columns.

user8651755 Over a year ago

You would need to do something like this to make the re.sub() approach work: re.sub(r'^([^a-z])', r'NUM\1', header)
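
A minimal sketch of how that pattern might be applied, assuming header is the list returned by next(reader). The sample names are made up. The substitution runs on each column separately, because str(header) would turn the whole list into one string.

```python
import re

header = ["2hello", "?world", "name"]  # made-up sample header

# "^" outside the brackets anchors the match to the start of the name;
# "\1" puts the matched first character back after the "NUM" prefix.
fixed = [re.sub(r"^([^a-z])", r"NUM\1", column) for column in header]

print(fixed)  # ['NUM2hello', 'NUM?world', 'name']
```

Since the class only covers lowercase a-z, a name starting with an uppercase letter would also get the prefix.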

user8651755 · Accepted Answer · 2017-10-24 17:58:49Z

0

Since you said you want to prepend 'NUM', you could do something like this (which could be more efficient, but this shows the basic idea).

import string

column = '123'

if column[0] not in string.ascii_lowercase:
    column = 'NUM' + column

# column is now 'NUM123'
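
The same check could be applied to a whole header row along these lines. This is only a sketch: prepend_num is a hypothetical helper name, the sample header is invented, and the extra truthiness test is there so that an empty column name does not raise an IndexError.

```python
import string


def prepend_num(column):
    """Hypothetical helper: prepend "NUM" unless the name starts with a-z."""
    if column and column[0] not in string.ascii_lowercase:
        return "NUM" + column
    return column


header = ["123", "name", "_id", ""]  # invented sample header
fixed = [prepend_num(column) for column in header]

print(fixed)  # ['NUM123', 'name', 'NUM_id', '']
```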

answered Oct 24, 2017 at 17:58

user8651755
