I want to remove commas from a column named size. CSV looks like below:

number   name    size
   1      Car     9,32,123
   2      Bike    1,00,000
   3      Truck   10,32,111

I want the output as below:

number   name    size
   1      Car     932123
   2      Bike    100000
   3      Truck   1032111

I am using Python 3 and the pandas module to handle this CSV. I tried the replace method, but I don't get the desired output.

Snippet from my code:

import pandas as pd

df = pd.read_csv("file.csv")

# df.replace(",","")
# df['size'] = df['size'].replace(to_replace = "," , value = "")
# df['size'] = df['size'].replace(",", "")

df['size'] = df['size'].replace({",", ""})
print(df['size'])  # expecting to see 'size' column without comma

I don't see any error or exception. The last line, print(df['size']), simply displays the values as they are, i.e., with commas.

  • Are you sure that sample you showed represents actual file content? Commented Jun 9, 2021 at 14:09
  • @OlvinRoght it is just a sample. Commented Jun 9, 2021 at 14:11

3 Answers

With replace, you need regex=True. Without it, replace matches a cell's entire value, so it would only change cells that consist of exactly ",":

>>> df["size"] = df["size"].replace(",", "", regex=True)
>>> df

   number   name     size
0       1    Car   932123
1       2   Bike   100000
2       3  Truck  1032111
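
To see the difference side by side, here is a minimal sketch. It assumes a small hand-built frame standing in for file.csv, with the values copied from the question. Note that the cleaned values are still strings; converting them to numbers is a separate step.

```python
import pandas as pd

# Hand-built stand-in for file.csv (assumed data, copied from the question).
df = pd.DataFrame({"size": ["9,32,123", "1,00,000", "10,32,111"]})

# Default behaviour: whole-cell match. No cell is exactly ",",
# so the column comes back unchanged.
unchanged = df["size"].replace(",", "")

# regex=True: the pattern is searched for inside each string.
cleaned = df["size"].replace(",", "", regex=True)

print(unchanged.tolist())  # ['9,32,123', '1,00,000', '10,32,111']
print(cleaned.tolist())    # ['932123', '100000', '1032111']
```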

I am using python3 and Pandas module for handling this csv

Note that pandas.read_csv has an optional thousands argument. If the commas are used as thousands separators, you can set thousands="," so the numbers are parsed when the file is read. Consider the following example:

import io
import pandas as pd
some_csv = io.StringIO('value\n"1"\n"1,000"\n"1,000,000"\n')
df = pd.read_csv(some_csv, thousands=",")
print(df)

output

     value
0        1
1     1000
2  1000000

For brevity I used io.StringIO; the same effect can be achieved by passing the name of a file with the same content as the first argument to pd.read_csv.
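
The same approach can be sketched with the values from the question. This assumes the real file is comma-separated and that the size values are quoted, so their embedded commas are not read as field separators; the question's sample does not confirm either point.

```python
import io
import pandas as pd

# Assumed file content: comma-separated, size values quoted.
csv_text = (
    "number,name,size\n"
    '1,Car,"9,32,123"\n'
    '2,Bike,"1,00,000"\n'
    '3,Truck,"10,32,111"\n'
)

# thousands="," tells the parser to drop the commas inside numeric fields.
df = pd.read_csv(io.StringIO(csv_text), thousands=",")

print(df["size"].tolist())  # [932123, 100000, 1032111]
```

The column arrives as integers, so no later replace or astype step is needed.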

1 Comment

by far the easiest solution!

Try str.replace instead:

df['size'] = df['size'].str.replace(',', '')

Optionally, convert to int with astype:

df['size'] = df['size'].str.replace(',', '').astype(int)
   number   name     size
0       1    Car   932123
1       2   Bike   100000
2       3  Truck  1032111

Sample Frame Used:

df = pd.DataFrame({'number': [1, 2, 3], 'name': ['Car', 'Bike', 'Truck'],
                   'size': ['9,32,123', '1,00,000', '10,32,111']})
   number   name       size
0       1    Car   9,32,123
1       2   Bike   1,00,000
2       3  Truck  10,32,111
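
One caveat, sketched below under assumptions: astype(int) fails if the column contains missing values. The missing entry here is hypothetical and not part of the question's data. Passing regex=False makes the literal match explicit, and pd.to_numeric is swapped in for astype so the missing entry survives the conversion.

```python
import pandas as pd

# Assumed data: two values from the question plus one hypothetical missing entry.
s = pd.Series(["9,32,123", "1,00,000", None])

# regex=False treats "," as a literal string rather than a pattern.
stripped = s.str.replace(",", "", regex=False)

# to_numeric converts the digit strings and leaves the missing entry missing.
numbers = pd.to_numeric(stripped)
print(numbers)
```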
