The application I use generates data in a DataFrame, which I need to save and reuse later on request.

It looks similar to this.

<class 'pandas.core.frame.DataFrame'>
             E         Gg        gnx2    J chs lwave J_ID
0    27.572025  82.308581    7.078391  3.0   1   [0]    1
1    46.387728  77.029548   58.112338  3.0   1   [0]    1
2    75.007554  82.087407    0.535442  3.0   1   [0]    1

Everything worked perfectly until I started using DataFrames that had been saved to separate files. When I try to use the data after loading it, I get data type errors for the columns that contain arrays. For example, lwave holds an array, and the information about its data type is lost when the DataFrame is saved to CSV.

#saving the data to csv
csv_filename = "ladder.csv"
ladder.to_csv(csv_filename)

So when I load the DataFrame later to use the data, I can't access the array elements as I should.

As I understand it, the data in this column is loaded as a string. After loading the data with read_csv, I get these data types:

Unnamed: 0      int64
E             float64
Gg            float64
gnx2          float64
J             float64
chs             int64
lwave          object
J_ID            int64
dtype: object

How can I resolve this issue? How can I correctly load the data with the correct data type or maybe explicitly assign a data type to a column after loading?

2 Answers

In the read_csv function, you can manually assign data types to the columns you load. Pass a dictionary that maps each column name to its preferred data type through the dtype parameter.

import numpy as np
import pandas as pd

# the keyword is dtype (singular), not dtypes
data_type_mapping = {'a': np.float64, 'b': np.int32, 'c': 'Int64'}
my_df = pd.read_csv('myfile.csv', dtype=data_type_mapping)

From pandas documentation:

Data type for data or columns. E.g. {'a': np.float64, 'b': np.int32, 'c': 'Int64'}. Use str or object together with suitable na_values settings to preserve and not interpret dtype. If converters are specified, they will be applied INSTEAD of dtype conversion.
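The last sentence of that quote points at a way to handle list-like columns such as lwave, since no dtype describes "a list per cell". Below is a minimal sketch, assuming the cells were written as list literals like [0] or [1.0, 1.0] (as in the question). The inline csv_text and its reduced set of columns are a made-up stand-in for ladder.csv.

```python
import io
import json

import pandas as pd

# Hypothetical stand-in for the contents of ladder.csv (fewer columns, for brevity).
csv_text = (
    "E,chs,lwave,J_ID\n"
    "27.572025,1,[0],1\n"
    '46.387728,1,"[1.0, 1.0]",1\n'
)

# converters maps a column name to a function that receives each raw cell as a string.
# json.loads turns the text "[1.0, 1.0]" into the Python list [1.0, 1.0].
ladder = pd.read_csv(io.StringIO(csv_text), converters={"lwave": json.loads})

print(ladder.dtypes)
print(type(ladder.loc[1, "lwave"]))
print(ladder.loc[1, "lwave"][0])
```

The lwave column still reports dtype object, but each cell is now a real list rather than a string, so indexing into it works. If the cells hold NumPy arrays rather than lists before saving, the CSV text has no commas (for example "[1. 1.]") and json.loads will not parse it; that case needs a different parser.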

11 Comments

Hi, thanks for your reply. It seems correct, but I don't know how to apply it in my case. The code I have tried is listed below.
I tried this, @StonedTensor: data_type_mapping = {... 'lwave': np.ndarray, 'J_ID': np.int64 }; filename = isotope_name + '_Emin_10_Emax_1000_2022.10.30_ladder.csv'; resonance_ladder = pd.read_csv(filename, dtype=data_type_mapping). But I get this error: TypeError: dtype '<class 'numpy.ndarray'>' not understood. I don't know which dtype I need to use.
What is an example of what lwave is supposed to look like? What is its type before you save the dataframe originally?
Sorry, I'm still new to Stack Overflow :) I don't know how to add code or markup in the comments. Here is an example of the data in the lwave column: [1.0, 1.0]
The data types before saving look like this: print(resonance_ladder.dtypes) E object ... chs object lwave object J_ID object dtype: object

The question was resolved by using json.loads, which parses each string in the lwave column (for example "[1.0, 1.0]") back into a Python list.

import json

# parse the lwave strings back into lists after loading the CSV
ladder_df['lwave'] = ladder_df.lwave.apply(json.loads)
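For reference, here is a small round-trip sketch of this approach: save a DataFrame with a list column, load it back, and restore the lists. The sample values are made up, and an in-memory buffer stands in for the CSV file. It assumes the column holds Python lists before saving, so that to_csv writes them as text that json.loads can parse.

```python
import io
import json

import pandas as pd

# Made-up sample resembling the ladder in the question.
ladder = pd.DataFrame({"E": [27.572025, 46.387728], "lwave": [[0], [1.0, 1.0]]})

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
ladder.to_csv(buffer)  # the index is written as an unnamed first column
buffer.seek(0)

# index_col=0 reads that first column back as the index,
# which avoids the extra "Unnamed: 0" column.
loaded = pd.read_csv(buffer, index_col=0)
print(type(loaded.loc[1, "lwave"]))  # a string at this point

# Parse every cell of lwave from text back into a list.
loaded["lwave"] = loaded["lwave"].apply(json.loads)
print(type(loaded.loc[1, "lwave"]))  # now a list
```

Passing index_col=0 is optional; the alternative is to save with ladder.to_csv(csv_filename, index=False) so the index is never written.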
