I have a CSV file that was downloaded from the InfluxDB UI, and I want to extract the useful data from it. A snippet of the downloaded file is as follows:

#group  FALSE   FALSE   TRUE    TRUE    FALSE   FALSE   TRUE    TRUE    TRUE    TRUE    TRUE
#datatype   string  long    dateTime:RFC3339    dateTime:RFC3339    dateTime:RFC3339    double  string  string  string  string  string
#default    mean                                        
    result  table   _start  _stop   _time   _value  _field  _measurement    smart_module    serial  type
        0   2023-03-31T08:12:40.697076925Z  2023-03-31T09:12:40.697076925Z  2023-03-31T08:20:00Z    0   sm_alarm    system_test 8   2.14301E+11 sm_extended
        0   2023-03-31T08:12:40.697076925Z  2023-03-31T09:12:40.697076925Z  2023-03-31T08:40:00Z    0   sm_alarm    system_test 8   2.14301E+11 sm_extended
        0   2023-03-31T08:12:40.697076925Z  2023-03-31T09:12:40.697076925Z  2023-03-31T09:00:00Z    0   sm_alarm    system_test 8   2.14301E+11 sm_extended
        0   2023-03-31T08:12:40.697076925Z  2023-03-31T09:12:40.697076925Z  2023-03-31T09:12:40.697076925Z  0   sm_alarm    system_test 8   2.14301E+11 sm_extended

I'd like to have the output CSV as follows:

_time                   sm_alarm  next_column next_column ....... ...........
2023-03-29T08:41:15Z    0

Please note that sm_alarm is only one of several fields; there are 9 others under _field.

I tried the following script, but it did not solve my problem.

import csv

# Specify the input and output file names
input_file = 'influx.csv'
output_file = 'output.csv'

try:
    # Open the input file for reading
    with open(input_file, 'r') as csv_file:
        # Create a CSV reader object
        csv_reader = csv.reader(csv_file)

        # Skip the first row (header)
        next(csv_reader)

        # Open the output file for writing
        with open(output_file, 'w', newline='') as output_csv:
            # Create a CSV writer object
            csv_writer = csv.writer(output_csv)

            # Write the header row
            csv_writer.writerow(['_time', '_field', '_value'])

            # Iterate over the input file and write the rows to the output file
            for row in csv_reader:
                # Check if the row is not empty
                if row:
                    # Split the fields
                    fields = row[0].split(',')

                    # Write the row to the output file
                    csv_writer.writerow(fields)

    print(f'{input_file} converted to {output_file} successfully!')

except FileNotFoundError:
    print(f'Error: File {input_file} not found.')

except Exception as e:
    print(f'Error: {e}')

Thank you.

    There are 11 headers and 10 values. It seems either the result or table value is missing from each row. Commented Mar 31, 2023 at 18:25

1 Answer


The format of your expected output is ambiguous.
But as a starting point, you can tidy up the file with read_csv like this:

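import pandas as pd

# The fourth line holds the column names; drop "result", which is empty in every data row
with open("influx.csv", "r") as csv_file:
    headers = csv_file.readlines()[3].strip().split()[1:]

# Skip the three annotation lines and the header line, then split on whitespace
# (raw string so that \s is not treated as an invalid escape sequence)
df = pd.read_csv("influx.csv", header=None, skiprows=4, sep=r"\s+",
                 engine="python", names=headers).iloc[:, 1:]  # drop the "table" column

#df.to_csv("output.csv", index=False, sep=",") # <- uncomment this line to write a real CSV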

Output :

print(df)

                           _start                           _stop                           _time  _value    _field _measurement  smart_module        serial         type
0  2023-03-31T08:12:40.697076925Z  2023-03-31T09:12:40.697076925Z            2023-03-31T08:20:00Z       0  sm_alarm  system_test             8  2.143010e+11  sm_extended
1  2023-03-31T08:12:40.697076925Z  2023-03-31T09:12:40.697076925Z            2023-03-31T08:40:00Z       0  sm_alarm  system_test             8  2.143010e+11  sm_extended
2  2023-03-31T08:12:40.697076925Z  2023-03-31T09:12:40.697076925Z            2023-03-31T09:00:00Z       0  sm_alarm  system_test             8  2.143010e+11  sm_extended
3  2023-03-31T08:12:40.697076925Z  2023-03-31T09:12:40.697076925Z  2023-03-31T09:12:40.697076925Z       0  sm_alarm  system_test             8  2.143010e+11  sm_extended
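
If the goal is one row per _time with one column per _field value (my reading of the question, not something it confirms), a pivot may be a reasonable next step. The sketch below assumes the real export is comma-separated annotated CSV with `#`-prefixed annotation lines, which is how InfluxDB normally writes it. The inline sample is shortened, and the second field `sm_temp` and its values are invented purely for illustration.

```python
import io
import pandas as pd

# Hypothetical, shortened sample in annotated-CSV layout.
# "sm_temp" and its values are made up; only sm_alarm appears in the question.
raw = """#group,false,false,true,true,false,false,true
#datatype,string,long,dateTime:RFC3339,dateTime:RFC3339,dateTime:RFC3339,double,string
#default,mean,,,,,,
,result,table,_start,_stop,_time,_value,_field
,,0,2023-03-31T08:12:40Z,2023-03-31T09:12:40Z,2023-03-31T08:20:00Z,0,sm_alarm
,,0,2023-03-31T08:12:40Z,2023-03-31T09:12:40Z,2023-03-31T08:40:00Z,0,sm_alarm
,,1,2023-03-31T08:12:40Z,2023-03-31T09:12:40Z,2023-03-31T08:20:00Z,21.5,sm_temp
,,1,2023-03-31T08:12:40Z,2023-03-31T09:12:40Z,2023-03-31T08:40:00Z,22.0,sm_temp
"""

# comment="#" skips the annotation lines, so the first remaining line is the header.
# Caveat: it would also cut off any data value that contains a "#".
df = pd.read_csv(io.StringIO(raw), comment="#")

# One row per _time, one column per distinct _field value.
# aggfunc="first" keeps the first value if a (_time, _field) pair repeats.
wide = (
    df.pivot_table(index="_time", columns="_field", values="_value", aggfunc="first")
      .reset_index()
)
wide.columns.name = None  # drop the leftover "_field" label on the column axis

print(wide)
# wide.to_csv("output.csv", index=False)  # write the result as a regular CSV
```

For the real file, replace `io.StringIO(raw)` with the path to the download. If it really is whitespace-separated as pasted in the question, the same `pivot_table` call can be applied to the `df` built above instead.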

If you share a clear expected output, I'll update my answer accordingly.
