
I have a CSV file with two columns: a date string in ISO 8601 format and a Unix timestamp in milliseconds. How do I use awk to produce output in the following format: col-1: the original ISO 8601 string; col-2: the timestamp from column 2 converted to ISO 8601; col-3: the difference between the two times (say, in ms)?

Example:

Input:

  2018-01-09T16:55:22.545+0000,1515508979185

Output:

  2018-01-09T16:55:22.545+0000,2018-01-09T14:42:59.185+0000,36743360
  • Not clear, please post more clear requirements of your question with more suitable examples in your post. Commented Mar 16, 2018 at 17:57
  • I'm not sure what is not clear about calculating a difference between two dates and normalizing them to the same ISO 8601 format. Could you be more specific about what is not clear? Commented Mar 16, 2018 at 18:12

2 Answers


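Gawk provides the functions needed to convert dates and times between formats: mktime() and strftime(), plus patsplit() for parsing. All three are gawk extensions and are not part of POSIX awk, so the command below needs gawk.

Consider the following command: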

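gawk -F, '{
      # a[1..6] = date and time fields, a[7] = milliseconds, a[8] = UTC offset digits
      patsplit($1, a, "[0-9]+");
      # mktime() reads its argument as local time; the local offset cancels in the subtraction
      time1 = mktime(sprintf("%d %d %d %d %d %d",
                   a[1], a[2], a[3], a[4], a[5], a[6]))*1000 + a[7];
      # utc-flag 1 renders column 2 in UTC, matching the +0000 offset of column 1
      time2 = mktime(strftime("%Y %m %d %H %M %S", $2/1000, 1))*1000 + $2 % 1000;
      isodate2 = strftime("%Y-%m-%dT%H:%M:%S", $2/1000, 1);
      printf "%s;%s.%03d;%s\n",
             $1,
             isodate2, $2 % 1000,
             time1 - time2}' csvfile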

It would produce

2018-01-09T16:55:22.545+0000;2018-01-09T14:42:59.185;7943360

Explanation

We use , as the field separator because the input is a CSV file. First we parse the 1st column, which is an ISO 8601 date: patsplit() extracts every run of digits from the string into an array a so that

  a[1] = YYYY, a[2] = mm, a[3] = dd,
  a[4] = HH, a[5] = MM, a[6] = SS, a[7] = milliseconds, a[8] = UTC offset digits
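
The same field extraction can be sketched without gawk. This is an illustration rather than the answer's method: it swaps patsplit() for the POSIX split() function, splitting on runs of non-digits, which gives the same array as long as the string starts with a digit. The helper name iso_fields is hypothetical.

```shell
# Hypothetical helper: print the numeric fields of an ISO 8601 string, one per line.
iso_fields() {
    printf '%s\n' "$1" | awk '{
        # Split on runs of non-digits; this relies on the string starting with a digit.
        n = split($0, a, /[^0-9]+/)
        for (i = 1; i <= n; i++) printf "a[%d] = %s\n", i, a[i]
    }'
}
```

Calling iso_fields '2018-01-09T16:55:22.545+0000' lists eight elements, from a[1] = 2018 through a[8] = 0000.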

We use the array a to convert the 1st column date into a timestamp in milliseconds and store the result in the time1 variable.

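mktime() interprets its argument as local time, so both times must be expressed in the same timezone before they are subtracted: the 2nd timestamp is rendered in the timezone of the 1st one and passed through mktime() as well, so that the local offset cancels out.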
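
The example input carries a +0000 offset, so no correction is needed there. For inputs with other offsets, one possible approach (an assumption, not something the answer implements) is to turn the trailing offset into milliseconds and subtract it from a timestamp computed as if the fields were UTC. A minimal sketch in portable awk, with a hypothetical helper name:

```shell
# Hypothetical helper: print the UTC offset of an ISO 8601 string in milliseconds.
iso_offset_ms() {
    printf '%s\n' "$1" | awk '{
        # Accept a trailing +HHMM, -HHMM, +HH:MM or -HH:MM; anything else counts as 0.
        if (match($0, /[+-][0-9][0-9]:?[0-9][0-9]$/)) {
            off = substr($0, RSTART + 1)
            gsub(/:/, "", off)
            ms = (substr(off, 1, 2) * 60 + substr(off, 3, 2)) * 60000
            if (substr($0, RSTART, 1) == "-") ms = -ms
        } else ms = 0
        printf "%d\n", ms
    }'
}
```

For example, an offset of +05:30 yields 19800000, so 16:55 local time at +05:30 corresponds to 11:25 UTC.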

Then we print the output line: the 1st column unchanged, then the 2nd column timestamp converted to an ISO 8601 date with strftime(), with its milliseconds ($2 % 1000) appended separately because strftime() only handles whole seconds, and finally the difference.

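The difference between time1 and time2 is not the same as in the original post. The two times are 2 h 12 min 23.360 s apart, which is 7943360 ms. The value in the question, 36743360, is larger by exactly 28800000 ms (8 hours), which suggests that a timezone offset crept into that calculation.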
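
The 7943360 ms figure can be cross-checked without gawk's time functions. The sketch below is not the answer's method: it replaces mktime() with the well-known days-from-civil date arithmetic in portable awk, and it assumes the ISO 8601 input is in UTC (+0000). The helper names iso_minus_epoch_ms and days_from_civil are hypothetical.

```shell
# Hypothetical helper: ms between a UTC ISO 8601 time ($1) and an epoch-ms value ($2).
iso_minus_epoch_ms() {
    printf '%s,%s\n' "$1" "$2" | awk -F, '
    # Days since 1970-01-01 for a Gregorian calendar date (valid for years >= 0).
    function days_from_civil(y, m, d,    era, yoe, doy) {
        y += 0; m += 0; d += 0          # force numeric values, so "01" compares as 1
        if (m <= 2) y--
        era = int(y / 400)
        yoe = y - era * 400
        doy = int((153 * (m > 2 ? m - 3 : m + 9) + 2) / 5) + d - 1
        return era * 146097 + yoe * 365 + int(yoe / 4) - int(yoe / 100) + doy - 719468
    }
    {
        split($1, a, /[^0-9]+/)
        # The offset digits in a[8] are ignored: the input is assumed to be +0000.
        t1 = ((days_from_civil(a[1], a[2], a[3]) * 24 + a[4]) * 60 + a[5]) * 60 + a[6]
        printf "%.0f\n", t1 * 1000 + a[7] - $2
    }'
}
```

Running iso_minus_epoch_ms '2018-01-09T16:55:22.545+0000' 1515508979185 prints 7943360, in agreement with the output shown above. The result is printed with %.0f because some awk implementations limit %d to 32-bit values.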


1 Comment

You should mention that's gawk-only for time functions.

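awk solution (it relies on GNU date for the conversions):

awk 'BEGIN{ FS=OFS="," }
     { 
         # GNU date: %s%3N prints the epoch time in milliseconds
         cmd1 = "date -d \"" $1 "\" +%s%3N";
         # keep the milliseconds of column 2 and print the result in UTC (-u)
         cmd2 = sprintf("date -u -d @%d.%03d +%%FT%%T.%%3N%%z", int($2/1000), $2 % 1000);
         cmd1 | getline d1; close(cmd1);
         cmd2 | getline d2; close(cmd2);
         print $1, d2, d1 - $2 
     }' file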
