cannot convert string to numbers in pandas.read_excel

Question

Issue

I have an Excel file in German format. It looks like this:

I want to read the first column into pandas as numbers using the following code:

import pandas as pd
import numpy as np
tmp = pd.read_excel("test.xlsx", dtype = {"col1": np.float64})

It gives me the error

ValueError: Unable to convert column col1 to type <class 'numpy.float64'>

The issue is in the Excel file. If I manually change col1 to number format, the problem goes away. See this new Excel file:

Approach

I can first read col1 into pandas as object, then replace "," with ".", and finally convert the strings to float.
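A minimal sketch of the three steps just described. The small DataFrame is a made-up stand-in for what read_excel would return with col1 read as object, so the sample values are assumptions and not the contents of test.xlsx.

```python
import pandas as pd

# Stand-in for the result of reading col1 as text; values are illustrative only.
tmp = pd.DataFrame({"col1": ["1,5", "2,25", "10,0"]})

# Swap the decimal comma for a point (literal match, not a regex),
# then convert the resulting strings to float.
tmp["col1"] = tmp["col1"].str.replace(",", ".", regex=False).astype(float)
```

With the real file, the first step would presumably be pd.read_excel("test.xlsx", dtype={"col1": object}) instead of the hand-built DataFrame.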

However

The approach is tedious. How can I solve this problem more efficiently?

Perhaps this might help you: pandas.pydata.org/pandas-docs/stable/reference/api/…
– sophocles, Commented Apr 15, 2021 at 10:30

norie · Accepted Answer · 2021-04-15 10:33:42Z

2

Unfortunately, at the time of writing, read_excel has no option to tell pandas which decimal separator is being used.

What you could do though is create a function to do the conversion and pass it to read_excel as part of the converters argument.

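import pandas as pd

def fix_decimal(num):
    # Convert a value that uses a comma as decimal separator to float.
    if not isinstance(num, str):
        return num  # cell is already numeric, nothing to replace
    # Empty strings become 0.
    return float(num.replace(',', '.')) if num else 0

# Apply the converter to the first column (index 0).
tmp = pd.read_excel("test.xlsx", converters={0: fix_decimal})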
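A converter passed this way may receive cells that are already numeric, empty, or padded with whitespace, depending on how the sheet was saved. The sketch below is one possible variant that tolerates those cases. The name to_float and the choice of NaN for empty cells are assumptions for illustration, not part of the original answer.

```python
def to_float(value, default=float("nan")):
    """Hypothetical helper: parse a cell that may hold a decimal-comma string."""
    if isinstance(value, (int, float)):
        return float(value)      # cell was stored as a number in Excel
    text = str(value).strip()
    if not text:
        return default           # empty cell: return the placeholder value
    # Only the decimal comma is handled; thousands separators are not.
    return float(text.replace(",", "."))

# Usage, assuming the file name and column position from the question:
# tmp = pd.read_excel("test.xlsx", converters={0: to_float})
```

Returning NaN rather than 0 keeps missing values distinguishable from real zeros. As far as I know, pandas 1.4 and later also accept a decimal argument in read_excel (for example decimal=","), which applies to columns stored as text. It is worth checking the documentation for the installed version before relying on it.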

