How to import a csv with multiple data into different dataframes?

Question

Here is the csv that i have as an input:

I would like to create 3 dataframes for each variable: BEL AIR - feeder_30 - MANGUIER, BEL_TC_30_TR3_MW and BEL AIR - feeder_30 - MTOA. It will have Date value as an index and Valeur as values.

I tried this :

data_adms = pd.read_csv(full_path,sep=';',index_col=False, header=6)

But it gets rid of the name of nom du point distant. And the timestamp is not the same between rows.

Does somebody have an idea how can I properly proceed?

Hi, is it safe to assume that the number of columns between each dataset (from 'source' to the next 'one' is the same? Also, the header you want to retain is on the sixth line, right? — Laurent
– Laurent, Commented Oct 19, 2021 at 4:52
Thanks for the edit ! Yes the number of columns between each dataset is the same. On the 7th line yes. But I will need to pass the name of each dataset (in cell B2, F2, ....) — Thomas LESIEUR
– Thomas LESIEUR, Commented Oct 20, 2021 at 9:43

Laurent · Accepted Answer · 2021-10-20 18:48:45Z

So, given the following csv file:

                     0                               1   ...          10           11
0                Source                PUISSANCE ACTIVE  ...         NaN          NaN
1  Nom du point distant  BEL AIR - feeder_30 - MANGUIER  ...         NaN          NaN
2          Propriétaire                Puissance active  ...         NaN          NaN
3                 Unité                              Kw  ...         NaN          NaN
4                 Phase                        L1 L2 L3  ...         NaN          NaN
5       Echantillonnage      Moyenne (Interval : 01:00)  ...         NaN          NaN
6                  Date                          Valeur  ...     Qualité  Indicateurs
7      01/09/2020 13:53                      5189,60325  ...  Discutable          NaN
8      02/09/2020 13:54                      5043,68066  ...  Discutable          NaN
9      03/09/2020 13:55                      4805,71191  ...  Discutable          NaN

You could get the names and col numbers like this:

raw_data = pd.read_csv(
    filepath_or_buffer="./file.csv", sep=";", header=None, engine="python"
)
data = {raw_data.loc[1, x]: x for x in range(1, raw_data.shape[1], 4)}
print(data)
# Outputs
{'BEL AIR - feeder_30 - MANGUIER': 1, 'BEL_TC_30_TR3_MW': 5, 'BEL AIR - feeder_30 - MTOA': 9}

Then, you could import dataframes and pair them with their names like this:

dfs = []
for name, col in data.items():
    df = pd.read_csv(
        filepath_or_buffer="./scripts/test.csv",
        sep=";",
        header=0,
        skiprows=6,
        usecols=[col - 1, col],
        engine="python",
    )
    df.columns = ["Date", "Valeur"]
    df.set_index("Date", inplace=True)
    df.columns.name = name
    dfs.append(df)

In which case:

for df in dfs:
    print(df)

# Outputs

BEL AIR - feeder_30 - MANGUIER      Valeur
Date
01/09/2020 13:53                5189,60325
02/09/2020 13:54                5043,68066
03/09/2020 13:55                4805,71191

BEL_TC_30_TR3_MW        Valeur
Date
01/09/2020 13:53   -47,3029671
02/09/2020 13:54  -5,829510403
03/09/2020 13:55    1,52590215

BEL AIR - feeder_30 - MTOA      Valeur    
Date                                      
01/09/2020 13:53            5189,60325    
02/09/2020 13:54            5043,68066    
03/09/2020 13:55            4805,71191

Sorry for the late reply, I just came back on the projet. It's exactly what I was looking for and works perfectly. Thank you very much !!

Collectives™ on Stack Overflow

How to import a csv with multiple data into different dataframes?

1 Answer 1

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related