import pandas as pd

def prepare_dataframe(df):
    # Shorten the long source column names; rename() returns a new frame,
    # so the caller's DataFrame is left untouched.
    df = df.rename(columns={
        'Bomba Calor - Temperatura de Aire (°C)': 'temp_aire',
        'Bomba Calor - Temperatura Entrada (°C)': 'temp_entrada',
        'Bomba Calor - Temperatura Salida (°C)': 'temp_salida',
        'Bomba Calor - Estado Caldera 2 (estado)': 'estado_caldera2',
        'Bomba Calor - Estado Caldera 1 (estado)': 'estado_caldera1',
        'Bomba Calor - Estado Bomba de Calor (estado)': 'estado_bomba_calor'
    }, errors='ignore')

    # Build the timestamp from a single column, or from separate date and time columns
    if 'timestamp' in df.columns:
        df['timestamp'] = pd.to_datetime(df['timestamp'], format="%d-%m-%y %H:%M").astype("datetime64[s]")
    elif 'Fecha' in df.columns and 'Hora' in df.columns:
        df['timestamp'] = pd.to_datetime(df['Fecha'] + ' ' + df['Hora'], format="%d-%m-%y %H:%M").astype("datetime64[s]")

    # Use 'timestamp' as a sorted index so the data can be filtered by time
    if 'timestamp' in df.columns:
        df = df.set_index('timestamp').sort_index()

    # Defined outside the branches so it exists whichever one ran
    final_columns = [
        'temp_aire', 'temp_entrada', 'temp_salida',
        'estado_caldera2', 'estado_caldera1', 'estado_bomba_calor'
    ]
    existing_columns = [col for col in final_columns if col in df.columns]

    return df[existing_columns]

def load_data(source="csv", file_path="caldera_comepa.csv"):
    if source == "csv":
        # Load the raw CSV export
        df_raw = pd.read_csv(file_path)
    # Coming soon:
    # elif source == "elastic":
    else:
        # Fail loudly instead of silently returning None
        raise ValueError(f"Unsupported source: {source!r}")

    return prepare_dataframe(df_raw)

Hi everyone, I would like to know whether what I'm doing is good practice. I'm renaming the columns because their names are very long, and I'm setting the timestamp as the index so I can filter the data easily. You can check the CSV example below. As you can see, it's a wide-format CSV file, and I would like to know whether I can convert it to long format, since that way I can render a chart with the plotly library. Finally, after working with the CSV, I would like to migrate the data into Elasticsearch, so I want to write a function that reads from either a CSV file or Elasticsearch.

Fecha,Hora,Bomba Calor - Temperatura de Aire (°C),Bomba Calor - Temperatura Entrada (°C),Bomba Calor - Temperatura Salida (°C),Bomba Calor - Estado Caldera 2 (estado),Bomba Calor - Estado Caldera 1 (estado),Bomba Calor - Estado Bomba de Calor (estado)
04-10-25,00:01,22.2,63.4,63.4,0.0,0.0,0.0
04-10-25,00:11,21.9,61.8,61.7,0.0,0.0,0.0
04-10-25,00:21,21.7,60.3,60.3,0.0,0.0,0.0

2 Replies

Sure, you can convert wide format into long format using pandas.DataFrame.melt.
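To make that concrete, here is a minimal sketch of the wide-to-long conversion. It assumes the columns have already been renamed to the short names from the question and that the timestamp is the index; the `sensor` and `value` column names are just illustrative choices, not anything pandas requires.

```python
import pandas as pd

# Three rows in the same wide layout as the question's CSV (temperatures only)
wide = pd.DataFrame(
    {
        "temp_aire": [22.2, 21.9, 21.7],
        "temp_entrada": [63.4, 61.8, 60.3],
        "temp_salida": [63.4, 61.7, 60.3],
    },
    index=pd.to_datetime(
        ["04-10-25 00:01", "04-10-25 00:11", "04-10-25 00:21"],
        format="%d-%m-%y %H:%M",
    ).rename("timestamp"),
)

# melt() works on columns, so the index is moved back into a column first.
# 'timestamp' stays as the identifier; every other column is stacked into
# (sensor, value) pairs, giving one row per reading.
long_df = wide.reset_index().melt(
    id_vars="timestamp", var_name="sensor", value_name="value"
)
print(long_df)
```

With the data in this shape, something like `px.line(long_df, x="timestamp", y="value", color="sensor")` from plotly.express should draw one line per sensor.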

Migrating into Elasticsearch (ES) will require you to create records (a list of dictionaries), which you can then index into your cluster. You can either define a template for the field types or let ES infer them. ES has a Python client that makes it easy to bulk insert into or search the cluster.
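As a rough sketch of what that could look like: the snippet below turns a timestamp-indexed DataFrame into bulk actions and hands them to `elasticsearch.helpers.bulk`. The index name `bomba-calor`, the URL `http://localhost:9200` and the helper names `build_actions` and `index_dataframe` are assumptions made up for illustration; adjust them to your cluster.

```python
import pandas as pd

def build_actions(df, index_name):
    """Yield one bulk action (a plain dict) per DataFrame row."""
    records = df.reset_index().to_dict(orient="records")
    for record in records:
        # Convert the pandas Timestamp to an ISO 8601 string so it is JSON-serializable
        record["timestamp"] = record["timestamp"].isoformat()
        yield {"_index": index_name, "_source": record}

def index_dataframe(df, index_name="bomba-calor", url="http://localhost:9200"):
    """Bulk-index the DataFrame. Index name and URL are hypothetical defaults."""
    # Imported here so the rest of the sketch runs without the client installed
    from elasticsearch import Elasticsearch, helpers

    client = Elasticsearch(url)
    return helpers.bulk(client, build_actions(df, index_name))

# Small sample in the shape the question's prepare_dataframe() returns
sample = pd.DataFrame(
    {"temp_aire": [22.2, 21.9], "estado_bomba_calor": [0.0, 0.0]},
    index=pd.to_datetime(
        ["04-10-25 00:01", "04-10-25 00:11"], format="%d-%m-%y %H:%M"
    ).rename("timestamp"),
)
actions = list(build_actions(sample, "bomba-calor"))
print(actions[0])
```

Building the actions in a separate generator keeps the conversion testable without a running cluster; `index_dataframe(sample)` is the only part that needs ES to be reachable.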

Thanks. Can I turn the date and time into a timestamp manually, and set it as the index of the dataframe, and then render that into a plotly line, for example?
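That should work. A minimal sketch, assuming the `Fecha` and `Hora` columns and the `%d-%m-%y %H:%M` format from the sample CSV; the helper names `add_timestamp_index` and `plot_temperature` are hypothetical, and the plotting part needs plotly installed.

```python
import pandas as pd

def add_timestamp_index(df):
    """Combine 'Fecha' and 'Hora' into a sorted datetime index."""
    out = df.copy()
    out["timestamp"] = pd.to_datetime(
        out["Fecha"] + " " + out["Hora"], format="%d-%m-%y %H:%M"
    )
    return out.drop(columns=["Fecha", "Hora"]).set_index("timestamp").sort_index()

def plot_temperature(df, column="temp_aire"):
    """Return a plotly line figure of one column against the datetime index."""
    # Imported here so the timestamp part runs even without plotly installed
    import plotly.express as px

    return px.line(df, x=df.index, y=column)

# Same values as the first rows of the sample CSV
raw = pd.DataFrame(
    {
        "Fecha": ["04-10-25", "04-10-25", "04-10-25"],
        "Hora": ["00:01", "00:11", "00:21"],
        "temp_aire": [22.2, 21.9, 21.7],
    }
)
indexed = add_timestamp_index(raw)
print(indexed)
```

Calling `plot_temperature(indexed).show()` then renders the line chart. With a datetime index you can also slice by time first, for example `indexed.loc["2025-10-04"]`.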
