How to format numbers without comma in csv using python? [duplicate]

Question

I am given a CSV file that contains numbers ranging from 800 to 3000. The problem is that numbers greater than a thousand contain a comma, e.g. 1,227 or 1,074 or 2,403. When I try to calculate their mean, variance or standard deviation using scipy or numpy, I get the error: ValueError: could not convert string to float: '1,227'. How can I convert them to numbers so that I can do calculations on them? The CSV file must not be changed, as it is a read-only file.

You haven't shown any code. There are loads of ways to do this, depending on your actual approach when reading the csv.
– roganjosh, Commented Oct 7, 2017 at 18:35
This isn't a formatting issue but rather a reading issue: how to load a csv into an array. stackoverflow.com/questions/6633523/… has replace and locale solutions.
– hpaulj, Commented Oct 7, 2017 at 19:21
How about writing a new version of the file without commas? tr -d ',' < originalFile.csv > noCommas.csv
– Mark Setchell, Commented Oct 7, 2017 at 21:14
This is what I am trying to do: my_string=[val[2] for val in csvfile] my_float=[float(my_string.replace(',', '')) for i in my_string)] Here my_string is a list of strings, i.e. the numbers with commas. I am trying to convert it to my_float, where I expected replace to work. Since my_string is a list of strings rather than a single string, this code does not work.
– Said Akbar, Commented Oct 7, 2017 at 23:04

Said Akbar · Accepted Answer · 2017-10-07 23:24:08Z

1

Thanks, guys! I fixed it by using the replace function. hpaulj's link was useful.

my_string=[val[2] for val in csvtext]
my_string=[x.replace(',', '') for x in my_string]
my_float=[float(i) for i in my_string]

In this code, the first line loads the strings from the third CSV column into my_string, the second line removes the commas, and the third line converts the cleaned strings to floats that can be used in calculations. There is no need to edit the file or create a new one; a simple list manipulation does the job.
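For readers who want something runnable end to end, here is a minimal sketch of the same three steps using the standard csv module. The sample data, the header row, and the column position (index 2, as in val[2]) are assumptions made for illustration; the original file layout was never shown. Note that fields such as 1,227 have to be quoted in the file for a comma-delimited parser to keep them in one column.

```python
import csv
import io
import statistics

# Hypothetical sample standing in for the read-only file. The third column
# (index 2) holds the numbers, mirroring val[2] in the answer above.
sample = io.StringIO(
    'id,name,value\n'
    '1,a,"1,227"\n'
    '2,b,"1,074"\n'
    '3,c,"2,403"\n'
    '4,d,900\n'
)

reader = csv.reader(sample)
next(reader)  # skip the (assumed) header row

# Strip the thousands separator from each field, then convert to float.
values = [float(row[2].replace(',', '')) for row in reader]

mean = statistics.mean(values)
stdev = statistics.pstdev(values)  # population standard deviation
print(values, mean, stdev)
```

With a real file, io.StringIO(...) would be replaced by open(path, newline=''). The resulting list of floats can also be passed directly to numpy functions such as numpy.mean or numpy.std.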

answered Oct 7, 2017 at 23:24

Said Akbar


Bart Van Loon · Accepted Answer · 2017-10-07 18:40:57Z

0

This really is a locale issue, but a simple solution is to call replace on the string before converting it:

a = '1,274'
float(a.replace(',',''))  # 1274.0

Another way is to use pandas to read the csv file. Its read_csv function has a thousands argument.
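As a rough sketch of the pandas route, the snippet below passes thousands=',' to read_csv so that the comma-grouped fields are parsed as numbers directly. The sample data and the column name value are made up for illustration, and the sketch assumes the grouped numbers are quoted in the file.

```python
import io

import pandas as pd

# Hypothetical sample data; the column names are assumptions.
sample = io.StringIO(
    'id,name,value\n'
    '1,a,"1,227"\n'
    '2,b,"1,074"\n'
    '3,c,"2,403"\n'
    '4,d,900\n'
)

# thousands=',' tells the parser to treat ',' inside a field as a
# thousands separator rather than part of a non-numeric string.
df = pd.read_csv(sample, thousands=',')

print(df['value'].mean())
print(df['value'].std())  # sample standard deviation (ddof=1) by default
```

The file itself is only read, never modified, which fits the read-only constraint in the question.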

If you do know something about the locale, then it's probably best to use the locale.atof() function.
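A possible sketch of the locale approach follows. The helper name parse_localized and the locale name en_US.UTF-8 are assumptions; that locale has to be installed on the system, otherwise setlocale raises locale.Error. The helper restores the previous numeric locale afterwards, since setlocale changes process-wide state.

```python
import locale


def parse_localized(text, locale_name='en_US.UTF-8'):
    """Parse a number string using the grouping rules of locale_name.

    parse_localized is a hypothetical helper, not part of the locale module.
    """
    previous = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    try:
        # Raises locale.Error if the named locale is not available.
        locale.setlocale(locale.LC_NUMERIC, locale_name)
        return locale.atof(text)
    finally:
        # Put the numeric locale back the way it was.
        locale.setlocale(locale.LC_NUMERIC, previous)
```

For example, parse_localized('1,227') should give 1227.0 when the en_US locale is available. Compared with replace, this respects locales where the roles of ',' and '.' are swapped, at the cost of depending on what is installed on the machine.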

edited Oct 7, 2017 at 18:40

answered Oct 7, 2017 at 18:35

Bart Van Loon


4 Comments

roganjosh Over a year ago

Not if you use numpy to read in the CSV, or even the base CSV module. You need clarification from OP to hope to answer this.

Bart Van Loon Over a year ago

I agree. The question isn't very clear. However, the ValueError message does indicate that he is dealing with numbers as strings.

roganjosh Over a year ago

Then don't shoot for an answer. Ask for clarification first. Rep gain is secondary to providing something that's useful.

hpaulj Over a year ago

I found an old SO question that gives essentially these two answers. But if pandas is available, then I'd use that.
