
I'm a Perl rookie and don't know how to do this...

My input file:

random text 00:02 23
random text 00:04 25
random text 00:06 53
random text 00:07 56
random text 00:12 34
 ... etc until 23:59

I would like to have the following output:

00:00
00:01
00:02 23
00:03
00:04
00:05
00:06 53
00:07 56
00:08
00:09
00:10
00:11
00:12 34
... etc until 23:59

So I want an output file with a timestamp for every minute, plus the corresponding value if one is found in the input file. My input file starts at 00:00 and ends at 23:59.

My code so far:

 use warnings;
 use strict;

 my $found;
 my @event;
 my $count2;

 open (FILE, '<./input/input.txt');
 open (OUTPUT, '>./output/output.txt');


 while (<FILE>) {
     for ($count2 = 0; $count2 < 60; $count2++) {

         my ($line) = $_;

         if ($line =~ m|.*(00:$count2).*|) {
             $found = "$1 \n";
             push @event, $found;
         }

         if (@event) {
         }
         else {
             $found2 = "00:$count2,";
             push @event, $found2;
         }
     }
 }
 print OUTPUT (@event);

 close (FILE);
 close (OUTPUT);
  • You shouldn't reset $count2 to 0 on every line. It should be set to the next minute after the minute in the line you just read. And when the minute reaches 60 you need to increment the hour. Commented Oct 18, 2013 at 2:19
  • Instead of counting up to 60, count up to 24*60. Then convert the counter into hours and minutes by dividing by 60 and taking the modulus. Commented Oct 18, 2013 at 2:20
  • In your inner loop, when you find a match, you should end the loop with last. Commented Oct 18, 2013 at 2:22
  • Do you have seconds showing twice for a specific minute, e.g., 14:23 and 14:47? Commented Oct 18, 2013 at 3:30
  • Thanks Barmar, I'll test your suggestions Commented Oct 18, 2013 at 4:04
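
The single-counter idea from the comments above can be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name minute_to_hhmm is made up, and the loop prints just a few timestamps to keep the output short.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): convert a minute-of-day
# counter in the range 0 .. 1439 into a zero-padded "HH:MM" string.
sub minute_to_hhmm {
    my ($minute) = @_;
    # Integer division gives the hour, the modulus gives the minute.
    return sprintf '%02d:%02d', int( $minute / 60 ), $minute % 60;
}

# One counter covers the whole day, so no hour rollover logic is needed.
for my $minute ( 0 .. 24 * 60 - 1 ) {
    my $stamp = minute_to_hhmm($minute);
    # Print only the first three minutes and the last one for brevity.
    print "$stamp\n" if $minute < 3 || $minute == 24 * 60 - 1;
}
```

Each timestamp produced this way could then be compared against the time read from the current input line.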

2 Answers

Here's one approach to your task:

use strict;
use warnings;

my %hash;

open my $inFH, '<', './input/input.txt' or die $!;

while (<$inFH>) {
    my ( $hr_min, $sec ) = /(\d\d:\d\d)\s+(.+)$/;
    push @{ $hash{$hr_min} }, $sec;
}

close $inFH;

open my $outFH, '>', './output/output.txt' or die $!;

for my $hr ( 0 .. 23 ) {
    for my $min ( 0 .. 59 ) {
        my $hr_min = sprintf "%02d:%02d", $hr, $min;
        my $sec = defined $hash{$hr_min} ? " ${ $hash{$hr_min} }[-1]" : '';
        print $outFH "$hr_min$sec\n";
    }
}

close $outFH;

The first part reads your input file and uses a regex to grab the time at the end of each string. A hash of arrays (HoA) is built, with the HH:MM as the key and seconds in the array. For example:

09:14 => ['21','45']

This means that at 09:14 there were two seconds entries: one at 21 seconds and the other at 45 seconds. Since the times in the input file are in ascending order, the highest one in the array can be obtained with the [-1] subscript.
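
To make the hash-of-arrays idea concrete, here is a small sketch that builds the structure from in-memory lines instead of a file. The helper name build_hoa and the two sample lines are assumptions for illustration, and the `or next` guard for non-matching lines is an addition that the code above does not have.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): build a hash of arrays
# keyed by HH:MM from a list of input lines.
sub build_hoa {
    my (@lines) = @_;
    my %hash;
    for my $line (@lines) {
        # Skip any line that has no "HH:MM value" pair at its end.
        my ( $hr_min, $sec ) = $line =~ /(\d\d:\d\d)\s+(.+)$/ or next;
        push @{ $hash{$hr_min} }, $sec;
    }
    return \%hash;
}

# Made-up sample lines matching the 09:14 example.
my $hoa = build_hoa(
    'random text 09:14 21',
    'random text 09:14 45',
    'a line without a timestamp',
);

print "all entries: @{ $hoa->{'09:14'} }\n";    # both values, in input order
print "last entry:  $hoa->{'09:14'}[-1]\n";     # the most recent value
```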

Next, two loops are set up: the outer is (0..23) and the inner (0..59), and sprintf is used to format the HH:MM. When a key is found in the hash that corresponds to the current HH:MM in the loops, HH:MM and the last item in the array (the largest seconds value) are printed to the output file (e.g., 00:02 23). If there isn't a corresponding HH:MM in the hash, just the loop's HH:MM is printed (e.g., 00:03).

Sample output:

00:00
00:01
00:02 23
00:03
00:04 45
00:05
00:06 53
00:07 59
00:08
00:09
00:10
00:11
00:12 34
...
23:59

Hope this helps!


3 Comments

Excellent! This is what I was looking for, thanks a lot! Using a hash is indeed the way to go.
If you are always using the last value pushed onto the array, you might as well not use an array at all. Why save data that you're not going to use? Just use a plain assignment, and the last value you assign will be the one you want.
@TLP - Yes, excellent point! I greatly appreciate your code's use of nested captures (!) and the defined-or operator.

This is best done with a hash, as Kenosis has already shown. There are some simplifications/improvements that can be done, however.

  • By using assignment = we store the latest value for each time, because identical hash keys will overwrite each other.
  • The range operator .. can also increment strings, so that we can get a range of strings, like 00, 01, ... 59.
  • The defined-or operator // can be used as a more concise way to check if a key for a certain time is defined.
  • Using \d+ rather than .+ is much safer, as it prevents a line like hindsight is 20:20 at 01:23 45 from incorrectly matching 20:20.
  • We do not use hardcoded file names, instead using shell redirection and arguments.
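
A few of the points above can be checked in isolation with a short sketch. The helper names loose_time and strict_time are made up for this illustration, and the patterns are simplified versions of the ones used in the two answers. The defined-or operator needs Perl 5.10 or later.

```perl
use strict;
use warnings;

my $line = 'hindsight is 20:20 at 01:23 45';

# Loose pattern: .+ accepts any trailing text, so the leftmost HH:MM wins.
sub loose_time  { my ($s) = @_; return $s =~ /(\d\d:\d\d)\s+.+$/ ? $1 : undef }

# Strict pattern: only digits may follow the time, up to the end of the line.
sub strict_time { my ($s) = @_; return $s =~ /(\d{2}:\d{2})\s+\d+$/ ? $1 : undef }

print loose_time($line),  "\n";    # 20:20
print strict_time($line), "\n";    # 01:23

# The range operator on strings with a leading zero keeps the zero padding.
my @minutes = ( '00' .. '03' );
print "@minutes\n";                # 00 01 02 03

# Defined-or falls back to the bare timestamp when no value was stored.
my %t   = ( '00:02' => '00:02 23' );
my $out = $t{'00:03'} // '00:03';
print "$out\n";                    # 00:03
```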

In the example code below, I used a smaller range of numbers for demonstration purposes. I also used the DATA file handle so that this code can be copied, pasted, and tried out. To run it on a real file, change <DATA> to <> and run it like this:

perl script.pl input.txt > output.txt

Code:

use strict;
use warnings;
use feature 'say';

my %t;
while (<DATA>) {
    if (/((\d{2}:\d{2})\s+\d+)$/) {
        $t{$2} = $1;             # store most recent value
    }
}
for my $h ('00' .. '00') {          
    for my $m ('00' .. '12') {      
        my $time = "$h:$m";
        say $t{$time} // $time;  # say defined $t{$time} ? $t{$time} : $time;
    }
}
__DATA__
random text 00:02 23
random text 00:04 25
random text 00:06 53
random text 00:07 56
random text 00:12 34
random text 00:12 39

Output:

00:00
00:01
00:02 23
00:03
00:04 25
00:05
00:06 53
00:07 56
00:08
00:09
00:10
00:11
00:12 39
