I have a 21840x39 data frame. A few of my columns are numerically valued and I want to make sure they are all in the same data type (which I want to be a float).

Instead of naming all the columns out and converting them:

df[['A', 'B', 'C', '...']] = df[['A', 'B', 'C', '...']].astype(float)

Can I write a for loop that lets me say something like "convert column 18 through column 35 to float"?

I know how to do one column: df['A'] = df['A'].astype(float)

But how can I do multiple columns? I tried with list slicing within a loop but couldn't get it right.

2 Answers

5

The first idea is to convert the selected columns by position. Python counts from 0 and the end of a slice is excluded, so for the 18th through 35th columns use:

df.iloc[:, 17:35] = df.iloc[:, 17:35].astype(float)

If that does not change the dtypes (possibly a bug; in some pandas versions assignment through iloc keeps the existing column dtype), use this instead:

df = df.astype(dict.fromkeys(df.columns[17:35], float))

Sample: convert the 8th through 15th columns:

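import numpy as np
import pandas as pd

# Build a small frame of digit strings so the dtype change is visible
np.random.seed(2020)
df = pd.DataFrame(np.random.randint(10, size=(3, 18)),
                  columns=list('abcdefghijklmnopqr')).astype(str)
print(df)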
   a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r
0  0  8  3  6  3  3  7  8  0  0  8  9  3  7  2  3  6  5
1  0  4  8  6  4  1  1  5  9  5  6  6  6  5  4  6  4  2
2  3  4  7  1  4  9  3  2  0  9  1  2  7  1  0  2  8  8

df = df.astype(dict.fromkeys(df.columns[7:15], float))
print(df)
   a  b  c  d  e  f  g    h    i    j    k    l    m    n    o  p  q  r
0  0  8  3  6  3  3  7  8.0  0.0  0.0  8.0  9.0  3.0  7.0  2.0  3  6  5
1  0  4  8  6  4  1  1  5.0  9.0  5.0  6.0  6.0  6.0  5.0  4.0  6  4  2
2  3  4  7  1  4  9  3  2.0  0.0  9.0  1.0  2.0  7.0  1.0  0.0  2  8  8
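
The printed frame only hints at the change through the trailing `.0`. To confirm it, one option is to list the columns whose dtype is now float. This is a minimal sketch that reuses the sample frame above; the names `target`, `converted` and `float_columns` are illustrative only.

```python
import numpy as np
import pandas as pd

np.random.seed(2020)
df = pd.DataFrame(np.random.randint(10, size=(3, 18)),
                  columns=list('abcdefghijklmnopqr')).astype(str)

# Positions 7..14 (0-based) are the 8th through 15th columns, 'h' to 'o'.
target = df.columns[7:15]
converted = df.astype(dict.fromkeys(target, float))

# Collect the names of the columns that now hold float64 values.
float_columns = converted.select_dtypes(include='float64').columns.tolist()
print(float_columns)
```

If the list matches the columns you meant to convert, the slice bounds were right.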
1 Comment

I think it should be 36, since iloc slices up to but not including the end position.

1

I tweaked @jezrael's code, since selecting the range by column name is (I feel) a good option.

import pandas as pd
import numpy as np

np.random.seed(2020)
df = pd.DataFrame(np.random.randint(10, size=(3, 18)),
                  columns=list('abcdefghijklmnopqr')).astype(str)

print(df)

columns = list(df.columns)

#change the first and last column names below as required
df = df.astype(dict.fromkeys(
    df.columns[columns.index('h'):(columns.index('o')+1)], float))

print(df)
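
As a possible variation on the snippet above, label-based slicing with `.loc` includes both endpoints, so the `index(...) + 1` arithmetic can be dropped. This is a sketch, assuming the column labels are unique; `selected` is an illustrative name.

```python
import numpy as np
import pandas as pd

np.random.seed(2020)
df = pd.DataFrame(np.random.randint(10, size=(3, 18)),
                  columns=list('abcdefghijklmnopqr')).astype(str)

# A label slice with .loc includes both 'h' and 'o'.
selected = df.loc[:, 'h':'o'].columns

# Cast only the selected columns; the others keep their dtype.
df = df.astype(dict.fromkeys(selected, float))
print(df.dtypes)
```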

I'm leaving the original answer below, but note: never loop in pandas if a vectorized alternative exists.

If I had a dataframe and wanted to change columns 'col3' to 'col5' (human readable names) to floats I could...

import pandas as pd

df = pd.read_csv('dummy_data.csv')

df

columns = list(df.columns)

#change the first and last column names below as required
start_column = columns.index('col3')
end_column   = columns.index('col5')

# convert every column whose position lies between the two names (inclusive)
for index, col in enumerate(columns):
    if start_column <= index <= end_column:
        df[col] = df[col].astype(float)
df

...by just changing the column names. Perhaps it's easier to work in column names and 'from this one' and 'to that one' (inclusive).
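
The same 'from this one' to 'that one' idea can be written without a loop. The sketch below wraps it in a hypothetical helper, `columns_to_float`, and uses a small made-up frame as a stand-in for `dummy_data.csv`, whose contents are not shown here.

```python
import pandas as pd

def columns_to_float(df, first, last):
    """Return a copy of df with columns first..last (inclusive) cast to float."""
    # get_loc returns an integer position when the label is unique.
    start = df.columns.get_loc(first)
    stop = df.columns.get_loc(last) + 1
    return df.astype({name: float for name in df.columns[start:stop]})

# Made-up stand-in data; the real file's columns may differ.
df = pd.DataFrame({
    'col1': ['a', 'b'],
    'col2': ['x', 'y'],
    'col3': ['1', '2'],
    'col4': ['3.5', '4.5'],
    'col5': ['5', '6'],
    'col6': ['p', 'q'],
})

result = columns_to_float(df, 'col3', 'col5')
print(result.dtypes)
```

The helper returns a new frame and leaves the original untouched, which keeps the conversion easy to undo while experimenting.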
