6

I have a text file with the format (date, time, resistance):

12/11/2013  13:20:38    28.321930E+3
...         ...             ...

I need to extract the resistance value (third column) every 6 seconds after the first data entry. To start, I wanted to import the text file using:

date, time, resistance = loadtxt('Thermometers.txt', unpack=True, usecols=[0,1,2])

However, I have barely begun my program and I already get the error:

ValueError: invalid literal for float(): 12/11/2013

-ALSO-

I am also not sure how to iterate through time, given that the date changes because it is an overnight data run. Elegant solutions to my problem(s) would be much appreciated.

7
  • Is there a problem in opening a text file object and doing readline and finally doing the_line.split()? Commented Nov 14, 2013 at 13:29
  • I need to extract data from a time interval, which also requires considering the date. Commented Nov 14, 2013 at 13:32
  • Show the definition of the loadTxt function, please. Commented Nov 14, 2013 at 13:39
  • numpy.loadtxt(fname, dtype=<type 'float'>, comments='#', delimiter=None, converters=None, skiprows=0, usecols=None, unpack=False, ndmin=0) Load data from a text file. Each row in the text file must have the same number of values. Commented Nov 14, 2013 at 13:53
  • Are the data taken at constant time intervals? Further, I think @Jack_of_All_Trades is on the right track - why not just read and split the line using standard Python read/string operations, instead of through numpy? Commented Nov 14, 2013 at 14:07

2 Answers

1

I think this code will do what you want. You also don't have to worry about the overnight run and the changing date, since each row's date and time are converted to a datetime object.

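    import datetime

    my_file = 'Thermometers.txt'  # path to the data file
    filtered_data = []

    with open(my_file, 'r') as my_data:
        for line in my_data:
            data_arr = line.split()
            dte = data_arr[0].split("/")  # read below as [month, day, year]
            tme = data_arr[1].split(":")  # [hour, minute, second]
            # datetime needs integers, so convert every field with int(...)
            new_date = datetime.datetime(int(dte[2]), int(dte[0]), int(dte[1]),
                                         int(tme[0]), int(tme[1]), int(tme[2]))

            if filtered_data == []:
                # always keep the first entry
                filtered_data.append(data_arr)
            else:
                # keep the row if it is exactly 6 seconds after the previous row
                if (new_date - old_date).total_seconds() == 6:
                    filtered_data.append(data_arr)

            old_date = new_date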

This gives you a list whose items are filtered as you described (every 6 seconds). If you just want the resistance values at 6-second intervals, a simple loop or a list comprehension like the one below will suffice:

R_in_six_sec_interval=[R[2] for R in filtered_data]
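
Note that the entries collected this way are still strings. As a minimal sketch (not part of the original answer), one row could be turned into a datetime and a float in one step with datetime.strptime. The helper name parse_row is hypothetical, and the month/day/year order is an assumption carried over from the loop above:

```python
from datetime import datetime


def parse_row(line):
    """Split one 'date time resistance' line into (datetime, float)."""
    date_str, time_str, resistance_str = line.split()
    # Assumption: dates are month/day/year, as in the loop above.
    stamp = datetime.strptime(date_str + " " + time_str, "%m/%d/%Y %H:%M:%S")
    # float() understands exponent notation such as 28.321930E+3
    return stamp, float(resistance_str)


stamp, resistance = parse_row("12/11/2013  13:20:38    28.321930E+3")
print(stamp, resistance)
```

If the file is actually day-first, swapping %m and %d in the format string should be the only change needed.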

4 Comments

Hmm, how about new_date=dateutil.parser.parse(' '.join(data_arr[:2]),dayfirst=True) to parse the date?
A few ideas... I think you need to convert the dte values to numbers to keep datetime happy. Also, wouldn't a conditional of >= 6 be more appropriate? Another idea is to split the line just once using re.compile(r'[/:\s]+'). If you initialize old_date to a really old value, you can drop the check for empty filtered_data.
@FMc: You are absolutely right. I remembered at first but forgot it later since it is since int(...). I just commented in my answer. I am being lazy now. I think OP gets the idea now.
@FMc: For >= 6, I think the OP wants exactly 6 seconds interval from the previous data.
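
The ideas from the comments above can be sketched roughly as follows: split each line once with a regular expression, start from a very old timestamp so the first row needs no special case, and test for >= 6 seconds. The function name sample_every and the sample rows are made up for illustration. This sketch also assumes month/day/year dates and compares each row against the last row that was kept, which is a different rule from the == 6 check against the previous row:

```python
import re
from datetime import datetime, timedelta

# One split on "/", ":" and whitespace yields all seven fields at once.
SPLIT = re.compile(r"[/:\s]+")


def sample_every(lines, interval_s=6):
    """Keep rows at least interval_s seconds after the last kept row."""
    kept = []
    last_kept = datetime.min  # "really old" start value, so the first row is kept
    for line in lines:
        fields = SPLIT.split(line.strip())
        if len(fields) != 7:
            continue  # skip blank or malformed lines
        month, day, year, hour, minute, second = map(int, fields[:6])
        stamp = datetime(year, month, day, hour, minute, second)
        if stamp - last_kept >= timedelta(seconds=interval_s):
            kept.append((stamp, float(fields[6])))
            last_kept = stamp
    return kept


# Made-up rows that cross midnight, to show the date change is handled.
rows = [
    "12/11/2013 23:59:54 1.0E+3",
    "12/11/2013 23:59:57 2.0E+3",
    "12/12/2013 00:00:00 3.0E+3",
    "12/12/2013 00:00:03 4.0E+3",
    "12/12/2013 00:00:07 5.0E+3",
]
resistances = [r for _, r in sample_every(rows)]
print(resistances)
```

Because whole datetime objects are subtracted, the rollover at midnight needs no extra handling.
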
0

You might want to have a look at this if you want to stick to numpy for other reasons.
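
For context, the ValueError in the question arises because loadtxt defaults to dtype=float and so tries to convert 12/11/2013 to a number. One possible way to stay with numpy, offered as a sketch and not necessarily what the linked page describes, is to load the columns as strings and convert them afterwards. The in-memory sample below stands in for Thermometers.txt, its second row is invented, and the month/day/year order is an assumption:

```python
import io
from datetime import datetime

import numpy as np

# Stand-in for the real file; the second row is made up for illustration.
sample = io.StringIO(
    "12/11/2013  13:20:38    28.321930E+3\n"
    "12/11/2013  13:20:44    28.321950E+3\n"
)

# dtype=str stops loadtxt from calling float() on the date and time columns.
dates, times, resistance = np.loadtxt(
    sample, dtype=str, usecols=(0, 1, 2), unpack=True
)

# Convert the columns once they are loaded.
resistance = resistance.astype(float)
stamps = [
    datetime.strptime(d + " " + t, "%m/%d/%Y %H:%M:%S")  # assumed month/day/year
    for d, t in zip(dates, times)
]
print(stamps[1] - stamps[0], resistance)
```

With the real data, the StringIO object would be replaced by the file name.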
