So, given the following CSV file (shown here as pandas prints it when read with header=None):
0 1 ... 10 11
0 Source PUISSANCE ACTIVE ... NaN NaN
1 Nom du point distant BEL AIR - feeder_30 - MANGUIER ... NaN NaN
2 Propriétaire Puissance active ... NaN NaN
3 Unité Kw ... NaN NaN
4 Phase L1 L2 L3 ... NaN NaN
5 Echantillonnage Moyenne (Interval : 01:00) ... NaN NaN
6 Date Valeur ... Qualité Indicateurs
7 01/09/2020 13:53 5189,60325 ... Discutable NaN
8 02/09/2020 13:54 5043,68066 ... Discutable NaN
9 03/09/2020 13:55 4805,71191 ... Discutable NaN
You could get the point names and their column numbers like this:
import pandas as pd

raw_data = pd.read_csv(
    filepath_or_buffer="./file.csv", sep=";", header=None, engine="python"
)
# Point names sit in row 1, in every 4th column starting at column 1.
data = {raw_data.loc[1, x]: x for x in range(1, raw_data.shape[1], 4)}
print(data)
# Outputs
{'BEL AIR - feeder_30 - MANGUIER': 1, 'BEL_TC_30_TR3_MW': 5, 'BEL AIR - feeder_30 - MTOA': 9}
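If you want to try the mapping step without the original file, here is a minimal self-contained sketch. The in-memory sample is hypothetical: it only mimics the layout shown above (point names in row 1, four columns per point) with two points and one data row, and `sample_csv` and `name_to_col` are illustrative names.

```python
import io

import pandas as pd

# Hypothetical sample mimicking the export layout: each point takes 4 columns
# (Date, Valeur, Qualité, Indicateurs) and row 1 holds the point name.
sample_csv = "\n".join(
    [
        "Source;PUISSANCE ACTIVE;;;Source;PUISSANCE ACTIVE;;",
        "Nom du point distant;BEL AIR - feeder_30 - MANGUIER;;;"
        "Nom du point distant;BEL_TC_30_TR3_MW;;",
        "Date;Valeur;Qualité;Indicateurs;Date;Valeur;Qualité;Indicateurs",
        "01/09/2020 13:53;5189,60325;Discutable;;"
        "01/09/2020 13:53;-47,3029671;Discutable;",
    ]
)

raw_data = pd.read_csv(io.StringIO(sample_csv), sep=";", header=None, engine="python")

# Map each point name to the column holding its values (1, 5, 9, ...).
name_to_col = {raw_data.loc[1, x]: x for x in range(1, raw_data.shape[1], 4)}
print(name_to_col)
```

The step of 4 in `range` is tied to this particular layout; if an export has a different number of columns per point, that step would need to change.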
Then, you could load one dataframe per point and pair each with its name like this:
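dfs = []
for name, col in data.items():
    df = pd.read_csv(
        filepath_or_buffer="./file.csv",  # same file as above
        sep=";",
        header=0,
        skiprows=6,  # skip the metadata rows; the "Date"/"Valeur" row becomes the header
        usecols=[col - 1, col],  # the date column sits just left of the value column
        engine="python",
    )
    df.columns = ["Date", "Valeur"]
    df.set_index("Date", inplace=True)
    df.columns.name = name  # label the frame with its point name
    dfs.append(df)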
In which case:
for df in dfs:
    print(df)
# Outputs
BEL AIR - feeder_30 - MANGUIER Valeur
Date
01/09/2020 13:53 5189,60325
02/09/2020 13:54 5043,68066
03/09/2020 13:55 4805,71191
BEL_TC_30_TR3_MW Valeur
Date
01/09/2020 13:53 -47,3029671
02/09/2020 13:54 -5,829510403
03/09/2020 13:55 1,52590215
BEL AIR - feeder_30 - MTOA Valeur
Date
01/09/2020 13:53 5189,60325
02/09/2020 13:54 5043,68066
03/09/2020 13:55 4805,71191
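Note that the values above still show a decimal comma, which suggests they were read as strings rather than numbers. If you need numeric values and real timestamps, a minimal sketch (on a hypothetical two-column sample in the same format, and assuming the dates are day-first as they appear to be) could look like this:

```python
import io

import pandas as pd

# Hypothetical extract in the same format as one Date/Valeur pair.
sample = io.StringIO(
    "Date;Valeur\n"
    "01/09/2020 13:53;5189,60325\n"
    "02/09/2020 13:54;5043,68066\n"
)

# decimal="," tells pandas to treat the comma as the decimal separator.
df = pd.read_csv(sample, sep=";", decimal=",")

# Assumption: dates are day/month/year, so the format is given explicitly.
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y %H:%M")
df = df.set_index("Date")
print(df.dtypes)
```

The same `decimal=","` argument can be passed to the `pd.read_csv` call in the loop above.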