I use the following code to read data from a text file and insert additional lines between some of the existing ones. However, the script fails before it gets that far, with the error shown below.

The data in the text file was converted from a CSV file and is comma-separated.

import os
import re
import time
from datetime import datetime

file_list = []
file_counter = 1
for filename in os.listdir(os.getcwd()):
    file_list.append(filename)

for filename in file_list:
    if(filename=='q.py'):   continue
    file = open(filename,"r").read().split('\n')
    file_to_write = "file" +str(file_counter) +".txt"
    file_w = open(file_to_write,"w")
    file_w.write(file[0])
    file_w.write("\n")
    number_of_lines = sum(1 for _ in file)
    lis = []
    for i in range(number_of_lines):
        lis.append(file[i])
    modified_time = []
    for i in range(number_of_lines):
        line = lis[i].split(' ')

        if(line[0]=='name' or line[0]==''):
            continue;
        temp1 = ""
        temp1+=line[2][0]
        temp1+=line[2][1]
        temp1+=line[2][2]
        temp1+=line[2][3]
        temp1+=line[2][4]
        temp1+=line[2][5]
        temp1+='0'
        temp1+='0'
        line[2] = temp1
        try:
            modified_time.append(line[0]+' '+line[1]+' '+line[2]+' '+line[3]+' '+line[4]+' '+line[5])
        except:
            continue

    for i in range(len(modified_time)-1):
        line1 = modified_time[i].split(' ')
        if(line1[0]=='name' or line1[0]==''):
            continue;
        line2 = modified_time[i+1].split(' ')
        date1 = line1[1]
        date2 = line2[1]
        time1 = line1[2]
        time2 = line2[2]
        day1 = datetime.strptime(date1, '%Y/%m/%d').date()
        day2 = datetime.strptime(date2, '%Y/%m/%d').date()
        diff1 = (day2-day1).days*24*60


        format = '%H:%M:%S'
        startDateTime = datetime.strptime(time1, format)
        endDateTime = datetime.strptime(time2, format)

        diff2 = endDateTime-startDateTime
        diff2 = diff2.seconds/60

        diff =  diff1 + int(diff2)

        lis_written = line1[0] + ' ' + line1[1] + ' ' + line1[2] + ' ' + line1[3] + ' ' + line1[4] + ' ' + line1[5] + '\n';
        file_w.write(lis_written)
        format = '%Y/%m/%d %H:%M:%S'
        time_counter = datetime.strptime(line1[1]+' '+line1[2], format)


        from datetime import timedelta

        for i in range(diff-1):
            time_counter = time_counter +timedelta(0,60)
            time_value = str(time_counter)
            time_value = time_value.split(' ')
            giventime = time_value[1]
            givendate = time_value[0]

            temp_str = ""
            temp_str+=givendate[0]
            temp_str+=givendate[1]
            temp_str+=givendate[2]
            temp_str+=givendate[3]
            temp_str+='/'
            temp_str+=givendate[5]
            temp_str+=givendate[6]
            temp_str+='/'
            temp_str+=givendate[8]
            temp_str+=givendate[9]

            lis_written = line1[0] + ' ' + temp_str + ' ' + giventime + ' ' + str(-999) + ' ' + line1[4] + ' ' + line1[5] + '\n';
            file_w.write(lis_written)


    file_w.write(file[len(file)-1])
    file_w.write("\n")
    file_counter+=1

This is the error I get at line 33:

Traceback (most recent call last):
  File "C:\Users\bxr5813\Desktop\time series\run\New folder\New folder\q.py", line 33, in <module>
    temp1+=line[2][4]
IndexError: string index out of range

The format of the input text file is as follows:

Value Date Time MilliSecond
919 04/15/16 19:41:02 700682752
551 04/15/16 19:46:51 014109952
717 04/15/16 19:49:48 333956864
2679 04/15/16 19:52:49 8053952
2890 04/15/16 19:55:43 73351552
2897 04/15/16 19:58:38 257767936
1790 04/15/16 21:39:14 13785728
2953 04/15/16 21:42:10 801841152
2516 04/15/16 21:45:04 467205376
2530 04/15/16 21:47:58 688858368
2951 04/15/16 21:51:02 6165952
2954 04/15/16 21:53:56 48836992
2537 04/15/16 21:56:52 105879296
2523 04/15/16 21:59:45 920951808
2536 04/15/16 22:02:49 103219968
2727 04/15/16 22:05:43 708147456
2554 04/15/16 22:11:48 323045888
2703 04/15/16 22:14:46 932627712
2958 04/15/16 22:17:40 574788352
2683 04/15/16 22:20:34 7734976
2542 04/15/16 22:23:29 353888512
2536 04/15/16 22:29:15 787323136

2 Answers

1

Have you tried stepping through the code in a debugger? I think you should change if(line[0]=='name' or line[0]==''): to if(line[0]=='Value' or line[0]==''):. You get the IndexError because, on the first iteration, the code processes the header row and tries to read index 4 of 'Time', which has only four characters (indices 0 to 3).
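A minimal sketch of why the header row triggers the error, using the header and the first data row from the question. The helper name is_header_or_blank is hypothetical and only illustrates the suggested check.

```python
header = "Value Date Time MilliSecond".split(' ')
row = "919 04/15/16 19:41:02 700682752".split(' ')

# The third field of the header is 'Time', which has only 4 characters
# (valid indices 0..3), so index 4 does not exist.
try:
    header[2][4]
    header_error = None
except IndexError as exc:
    header_error = str(exc)

# The same lookup works on a data row, whose third field is '19:41:02'.
row_char = row[2][4]


def is_header_or_blank(fields):
    """Hypothetical helper: True for the header row or an empty line."""
    return fields[0] in ('Value', '')


print(header_error)
print(row_char)
print(is_header_or_blank(header), is_header_or_blank(row))
```

With a check like this in place, the header row is skipped before any character of its third field is read.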

Your code could also be simplified. For example, you can write

file_list = []
for filename in os.listdir(os.getcwd()):
    file_list.append(filename)

as file_list = [i for i in os.listdir(os.getcwd())], or simply file_list = os.listdir(os.getcwd()), because os.listdir already returns a list.

You can also simplify

temp1 = ""
temp1+=line[2][0]
temp1+=line[2][1]
temp1+=line[2][2]
temp1+=line[2][3]
temp1+=line[2][4]
temp1+=line[2][5]
temp1+='0'
temp1+='0'
line[2] = temp1

to line[2] = line[2][:6] + '00' (the slice [:6] keeps indices 0 through 5).
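A quick check of the slice form against the character-by-character copy, using one time value from the question's input. Copying indices 0 through 5 corresponds to the slice [:6]. The variable names here are only for illustration.

```python
sample = "19:41:02"  # a time field taken from the question's input

# Character-by-character copy of indices 0..5, as in the question.
temp1 = ""
for idx in range(6):
    temp1 += sample[idx]
temp1 += "00"

# Equivalent slice: [:6] keeps indices 0 through 5.
sliced = sample[:6] + "00"

# Unlike indexing, slicing a short string does not raise IndexError;
# it silently returns whatever characters exist.
short = "Time"[:6] + "00"

print(temp1, sliced, short)
```

Because slicing never raises on a short string, the header row still needs to be skipped explicitly; otherwise it would pass through as 'Time00'.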

line[0]+' '+line[1]+' '+line[2]+' '+line[3]+' '+line[4]+' '+line[5] + '\n' can be replaced with ' '.join(line) + '\n'. Note also that the line list appears to have 3 as its highest index, not 5, because each row of the input has only four fields.
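A small sketch comparing manual concatenation with str.join on the list of fields from one row of the question's input. It also shows that index 5 does not exist for a four-field row.

```python
line = "919 04/15/16 19:41:00 700682752".split(' ')

# Joining the whole list of fields works for any number of fields.
joined = ' '.join(line) + '\n'

# Manual concatenation of the four fields that actually exist.
manual = line[0] + ' ' + line[1] + ' ' + line[2] + ' ' + line[3] + '\n'

# A four-field row has indices 0..3, so line[5] raises IndexError.
try:
    line[5]
    has_sixth_field = True
except IndexError:
    has_sixth_field = False

print(joined == manual, len(line), has_sixth_field)
```

In the question's code, this IndexError is swallowed by the bare except, so rows with fewer than six fields are skipped without any message.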

It is also recommended to close any file you opened for reading or writing once you no longer need it, using file_used.close().
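One common way to make sure files get closed is the with statement, which closes them automatically when the block ends. The sketch below is an illustration under assumptions: the temporary directory and the file names input.txt and file1.txt stand in for the real data files.

```python
import os
import tempfile

# Assumption: a throwaway directory and file names stand in for the real data.
with tempfile.TemporaryDirectory() as tmp_dir:
    src = os.path.join(tmp_dir, "input.txt")
    dst = os.path.join(tmp_dir, "file1.txt")
    with open(src, "w") as f:
        f.write("Value Date Time MilliSecond\n919 04/15/16 19:41:02 700682752\n")

    # Both files are closed automatically when the block exits,
    # even if an exception is raised inside it.
    with open(src, "r") as file_r, open(dst, "w") as file_w:
        lines = file_r.read().split('\n')
        file_w.write(lines[0] + "\n")

    closed_after = file_r.closed and file_w.closed
    with open(dst) as f:
        copied = f.read()

print(closed_after)
print(repr(copied))
```

This avoids having to remember an explicit close() call on every code path.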

I haven't debugged the whole program, but I think you need to step through your code. I recommend Visual Studio Code for this; it's quite useful.

0

If you want to grab the last character of the third field (index 2) of the header row, it would be

temp1 += line[2][3]

Referencing line[x][4] asks for the fifth character of a string that has only four.

1 Comment

Actually, this did not help!
