0

I am processing a CSV file in python thats delimited by a comma (,).

Each column is a sampled parameter, for instance column 0 is time, sampled at once a second, column 1 is altitude sampled at 4 times a second, etc.

So columns will look like as below:

Column 0 -> ["Time", 0, " "," "," ",1] 
Column 1 -> ["Altitude", 100, 200, 300, 400]

I am trying to create a list for each column that captures its name and all its data. That way I can do calculations and organize my data into a new file automatically (the sampled data I am working with has substantial number of rows)

I want to do this for any file not just one, so the number of columns can vary.

Normally if every file was consistent I would do something like:

import csv
time =[]
alt = []
dct = {}
with open('test.csv',"r") as csvfile:
    csv_f = csv.reader(csvfile)
    for row in csv_f:
        header.append(row[0])
        alt.append(row[1]) #etc for all columns

I am pretty new in python. Is this a good way to tackle this, if not what is better methodology?

Thanks for your time

2

2 Answers 2

1

Pandas will probably work best for you. If you use csv_read from pandas, it will create a DataFrame based on the column. It's roughly a dictionary of lists.

You can also use the .tolist() functionality of pandas to convert it to a list if you want a list specifically.

import pandas as pd
data = pd.read_csv("soqn.csv")
dict_of_lists = {}

for column_name in data.columns:
    temp_list = data[column_name].tolist()
    dict_of_lists[column_name] = temp_list

print dict_of_lists

EDIT: 
dict_of_lists={column_name: data[column_name].tolist() for column_name in data.columns}
#This list comprehension might work faster.
Sign up to request clarification or add additional context in comments.

Comments

0

I think I made my problem more simpler and just focused on one column.

What I ultimately wanted to do was to interpolate to the highest sampling rate. So here is what I came up with... Please let me know if I can do anything more efficient. I used A LOT of searching on this site to help build this. Again I am new at Python (about 2-3 weeks but some former programming experience)

import csv
header = []
#initialize variables
loc_int = 0
loc_fin = 0
temp_i = 0
temp_f = 0

with open('test2.csv',"r") as csvfile: # open csv file
    csv_f = csv.reader(csvfile)
    for row in csv_f:
        header.append(row[0]) #make a list that consists of all content in column A

for x in range(0,len(header)-1): #go through entire column

    if header[x].isdigit() and header[x+1]=="": # find lower bound of sample to be interpolated
        loc_int = x
        temp_i = int(header[x])

    elif header[x+1].isdigit() and header[x]=="": # find upper bound of sample to be interpolated

        loc_fin = x
        temp_f = int(header[x+1])

    if temp_f>temp_i: #calculate interpolated values
        f_min_i = temp_f - temp_i
        interp = f_min_i/float((loc_fin+1)-loc_int)

        for y in range(loc_int, loc_fin+1):
            header[y] = temp_i + interp*(y-loc_int)

print header

with open("output.csv", 'wb') as g: #write to new file
    writer = csv.writer(g)
    for item in header:
        writer.writerow([item])

I couldn't figure out how to write my new list "header" with its interpolated values and replace it with column A of my old file , test2.csv.

Anywho thank you very much for looking...

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.