I have a large CSV file with more than 200 columns. Some of the columns are strings, some varchar, some integers and some floats.

When I simply read the CSV file into a pandas DataFrame, pandas correctly detects which columns are numeric. However, it emits the warning telling me to specify the dtype option or set low_memory=False.

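import numpy as np
import pandas as pd

df = pd.read_csv('myfile.csv')
# np.number already covers every integer and float width
df_not_num = df.select_dtypes(exclude=[np.number, 'bool'])
print(len(df.columns))  # total number of columns
>>> 200
print(len(df_not_num.columns))  # number of non-numeric columns
>>> 10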

Then I tried specifying a single dtype, dtype='unicode', but this turns all my columns into objects. It is too much manual work to specify a dtype for each column name when reading the CSV into a DataFrame.

df = pd.read_csv('myfile.csv', dtype='unicode')
df_not_num = df.select_dtypes(exclude=[np.number, 'bool'])
print(len(df.columns))  # total number of columns
>>> 200
print(len(df_not_num.columns))  # every column is now non-numeric
>>> 200

So the only way to avoid the low-memory warning seems to be specifying a dtype. But how do I specify that different columns have different dtypes without manually listing the dtype of each of the 200 columns?

  • Just specifying "mixed types" won't help read_csv. You either have to specify particular types for some columns by passing a dict, e.g. {'a': np.float64, 'b': np.int32}, or specify a single dtype, which pandas will try to apply to all columns, or specify none. Also, there is no "varchar" type in Python. Commented Feb 27, 2017 at 23:42
  • Possible duplicate of Pandas read_csv low_memory and dtype options Commented Feb 27, 2017 at 23:44

1 Answer

You can read just the header row of the CSV (nrows=0 parses no data rows) to get the list of column names:

col_names = pd.read_csv('file.csv', nrows=0).columns.tolist()

Then transform it into a dictionary dtypes_dict={col_name: dtype} based on the conditions you need.
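
As a rough illustration of that step, here is a minimal sketch of building the dictionary from the column names. The sample data, the column names, and the rule "these named columns are strings, everything else is float" are all assumptions made up for the example; substitute whatever conditions fit your file.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for 'file.csv'.
csv_text = "user_id,name,score,visits\n001,alice,1.5,3\n002,bob,2.0,4\n"

# nrows=0 parses only the header line, so this stays cheap for a huge file.
col_names = pd.read_csv(io.StringIO(csv_text), nrows=0).columns.tolist()

# Hypothetical rule: a short list of string columns, float64 for the rest.
string_cols = {"user_id", "name"}
dtypes_dict = {
    col: (str if col in string_cols else "float64") for col in col_names
}

df = pd.read_csv(io.StringIO(csv_text), dtype=dtypes_dict)
print(df.dtypes)
```

Only the exceptions need to be listed by hand, which is usually far fewer than 200 names. Reading an identifier column as str also keeps leading zeros such as "001" intact.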

Then use the dictionary of dtypes during reading:

pd.read_csv('file.csv', dtype=dtypes_dict)
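
If even listing the exceptions is too much work, one possible variation (not part of the answer above, just a sketch) is to let pandas infer the dtypes from a small sample of rows and reuse them for the full read. The sample data and the sample size are assumptions; a value outside the sample that does not fit the inferred type will still raise an error, so treat this as a starting point.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for 'file.csv'.
csv_text = "code,price,qty\nA1,1.5,3\nB2,2.0,4\nC3,3.5,5\n"

# Let pandas infer dtypes from the first few rows only (sample size is arbitrary).
sample = pd.read_csv(io.StringIO(csv_text), nrows=2)

dtypes_dict = {}
for col, dt in sample.dtypes.items():
    if pd.api.types.is_integer_dtype(dt) or pd.api.types.is_float_dtype(dt):
        # float64 also tolerates missing values that appear later in the file
        dtypes_dict[col] = "float64"
    else:
        # anything non-numeric in the sample is read as a string
        dtypes_dict[col] = str

# Full read with explicit dtypes, so pandas no longer has to guess per chunk.
df = pd.read_csv(io.StringIO(csv_text), dtype=dtypes_dict)
print(df.dtypes)
```

Mapping integer columns to float64 is a deliberately conservative choice here: it gives up the integer type in exchange for not failing when a later row has an empty cell.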