3

I have a text stream that looks like this:

----------------------------------------
s123456789_9780
  heartbeat:test       @ 1344280205000000: '0'
  heartbeat:test       @ 1344272490000000: '0'

Those long numbers are timestamps in microseconds. I would like to run this output through some sort of pipe that will change those timestamps to a more human-understandable date.

I have a date command that can do that, given just the timestamp (with the following colon):

$ date --date=@$(echo 1344272490000000: | sed 's/.......$//') +%Y/%d/%m-%H:%M:%S
2012/06/08-10:01:30

I would like to end up with something like this:

----------------------------------------
s123456789_9780
  heartbeat:test       @ 2012/06/08-12:10:05: '0'
  heartbeat:test       @ 2012/06/08-10:01:30: '0'

I don't think sed will allow me to match the timestamp and replace it with the value of calling a shell function on it (although I'd love to be shown wrong). Perhaps awk can do it? I'm not very familiar with awk.

The other part that seems tricky to me is letting the lines that don't match through without modification.

I could of course write a Python program that would do this, but I'd rather keep this in shell if possible (this is generated inside a shell script, and I'd rather not have dependencies on outside files).

5 Answers

3

This might work for you (GNU sed):

sed '/@ /!b;s//&\n/;h;s/.*\n//;s#\(.\{10\}\)[^:]*\(:.*\)#date --date=@\1 +%Y/%d/%m-%H:%M:%S"\2"#e;H;g;s/\n.*\n//' file

Explanation:

  • /@ /!b bail out and just print any lines that don't contain an @ followed by a space
  • s//&\n/ insert a newline after the above pattern
  • h copy the pattern space (PS) to the hold space (HS)
  • s/.*\n// delete up to and including the @ followed by a space
  • s#\(.\{10\}\)[^:]*\(:.*\)#date --date=@\1 +%Y/%d/%m-%H:%M:%S"\2"#e from what remains in the PS, capture the first 10 characters (the seconds) as one back reference and everything from the : to the end of the string as another. Build a date command from them and, via the e flag, execute it and replace the PS with its output
  • H append the PS to the HS inserting a newline at the same time
  • g copy the HS into the PS
  • s/\n.*\n// remove the original section of the string

4 Comments

The e flag works with files, but not with functions. I am using GNU sed version 4.2.1.
@michelpm this probably only works with GNU sed within a linux/unix bash environment.
@potong as I said, I am using GNU sed version 4.2.1 and I tested on Ubuntu 13.04 with both bash and zsh. It DOES work with executable files, but not with functions. square () { echo $(($1 * $1)) }, seq () { for (( i = 0 ; i < $1 ; i++ )); do echo $i; done }, seq 10 | sed 's/[0-9]*/square &/ge' complains that function square doesn't exist. It does work if I turn the square function into square.sh and change sed to call the file instead. Any ideas?
@potong a inlined version works though: seq 10 | sed 's/[0-9]*/echo \$((& * &))/ge'
1

Bash with a little sed, preserving the whitespace of the input:

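while read -r; do
    parts=($REPLY)   # split on IFS only to inspect the fields; REPLY keeps its whitespace
    if [[ ${parts[0]} == "heartbeat:test" ]]; then
        # strip the trailing "000000:" to get whole seconds
        dateStr=$(date --date="@${parts[2]%000000:}" +%Y/%d/%m-%H:%M:%S)
        # keep the colon that follows the timestamp
        REPLY=$(echo "$REPLY" | sed "s#[0-9]\+000000:#$dateStr:#")
    fi
    printf "%s\n" "$REPLY"
done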

2 Comments

Whitespace preservation is awesome! I changed the sed expression to use # instead of / so I don't have to do all that escaping, but otherwise this works wonderfully! Just for my edification, the parts=($REPLY) line is just splitting $REPLY into an array using the IFS, right?
That's right. Good idea changing the delimiter for sed; I wish I'd remembered that was an option. I'll update the answer to use #.
1

How about:

while read s1 at tm s2
do 
    tm=${tm%000000:}
    echo $s1 $at $(date --date @$tm +%Y/%d/%m-%H:%M:%S)
done < yourfile

1 Comment

This barfs on the lines without the timestamps -- I like the simplicity, though.
1

I would also like to see a sed solution, but it is a bit beyond my sed-fu. As awk (gawk) supports strftime, it is fairly straightforward here:

awk '
/^ *heartbeat/ { 
  gsub(".{7}$", "", $3)
  $3 = strftime("%Y/%d/%m-%T", $3)
  print " ", $1, $3
}

$0 !~ /heartbeat/' file

Output:

s123456789_9780
heartbeat:test 2012/06/08-21:10:05
heartbeat:test 2012/06/08-19:01:30

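$3 is the microsecond field. gsub strips its last seven characters (six microsecond digits plus the trailing colon), leaving a timestamp in seconds. Versions of gawk before 4.0 only honor the {7} interval expression when run with --re-interval (or --posix).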

The $0 !~ /heartbeat/ pattern makes sure non-heartbeat lines are printed unchanged (a pattern with no action block defaults to { print }).

7 Comments

I don't think gsub is working -- I run this, and I get dates of 42600513/23/11-15:33:20, which is a little bit off. :-)
That's odd, I've added what I get to the answer. Which version of awk are you using? I've tested this with gawk.
I'm using GNU awk 3.1.6. If I remove the strftime line, then I'm getting the long timestamps.
Maybe try with the original format string you were using? %Y/%d/%m-%H:%M:%S.
That doesn't seem to work -- if I change the regexp from ".{7}$" to "000000:" it does work, though.
0

This does it mostly within bash using your date command:

#!/bin/bash
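# IFS= and -r keep leading whitespace and backslashes intact
while IFS= read -r a ; do
case "$a" in
*" @ "[0-9]*) pre=${a% @ *}
              a=${a#"$pre" @ }
              post=${a##*:}
              a=${a%??????:"$post"}
              # put the " @ " separator back so the rest of the line is unchanged
              echo "$pre @ $(date --date=@"$a" +%Y/%d/%m-%H:%M:%S):$post"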
              ;;
*)            echo "$a" ;;
esac
done <<.
----------------------------------------
s123456789_9780
  heartbeat:test       @ 1344280205000000: '0'
  heartbeat:test       @ 1344272490000000: '0'
.
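
A variant of the same idea that avoids starting a date process for every line. This is only a sketch, and it assumes bash 4.2 or later, where the printf builtin accepts a %(format)T conversion. The function name humanize_timestamps is made up for illustration; it reads from standard input, so it can sit at the end of a pipe inside the generating script.

```shell
#!/bin/bash
# humanize_timestamps is a hypothetical helper name (not from the answers above).
# Assumes bash >= 4.2 for printf '%(...)T'.
humanize_timestamps() {
    local line stamp
    # Groups: 1 = everything up to and including "@ ", 2 = whole seconds,
    # then six microsecond digits (dropped), 3 = ":" and the rest of the line.
    local re='^(.*@ )([0-9]+)[0-9]{6}(:.*)$'
    # IFS= and -r preserve leading whitespace and backslashes;
    # the || clause handles a final line without a trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line =~ $re ]]; then
            # Format the seconds with the builtin, in the current time zone.
            printf -v stamp '%(%Y/%d/%m-%H:%M:%S)T' "${BASH_REMATCH[2]}"
            line="${BASH_REMATCH[1]}${stamp}${BASH_REMATCH[3]}"
        fi
        # Lines that do not match pass through unchanged.
        printf '%s\n' "$line"
    done
}
```

Used as `some_command | humanize_timestamps`. The times are rendered in whatever zone TZ selects; with TZ=UTC the first sample line becomes `  heartbeat:test       @ 2012/06/08-19:10:05: '0'`.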
