0

So I'm trying to learn how to use regular expressions in Perl. I have a file, textfile.txt, and I want to access specific portions of it. Its first three lines are:

Jan    2016-01-01 Friday   12:00
Feb    2016-02-01 Monday   23:45
Mar    2016-03-01 Tuesday  15:30

What I want to do is put the month names (Jan, Feb, Mar) in one array and the dates (e.g. 2016-01-01) in a second array. My current script puts each entire line into a single array element. This is my code for filling the array so far:

while (<FILE>) {
    push(@newArray, $_);   # $_ holds the whole line, newline included
}
close FILE;

How would I go about putting only the date (2016-01-01) or the month name (Jan/Feb/Mar) from each line into the array, instead of putting the entire line into one array element?

3
  • 2
    Does it have to be regex? Because split would work quite nicely. Commented Feb 2, 2016 at 17:32
  • 2
    "So im trying to learn how to use regular expressions in Perl." One of the most important things to learn about regex is when to use it and when another tool would be more suitable. In this case, I agree with Sobrique that split would be better. Commented Feb 2, 2016 at 17:49
  • 2
    Regular expressions are not the solution to every problem. In fact, you might find you have n + 1 problems after using regular expressions. ;-) Commented Feb 2, 2016 at 17:50

2 Answers

3

I wouldn't use a regex but instead split:

#!/usr/bin/perl

use warnings;
use strict;

use Data::Dumper; 

my @month_words;
my @month_dates;
my %month_lookup;

while ( <DATA> ) {
   my ( $mon, $date, $day, $time ) = split; 
   push ( @month_words, $mon );
   push ( @month_dates, $date ); 
   $month_lookup{$mon} = $date; 
}

print Dumper \@month_words, \@month_dates, \%month_lookup;

__DATA__
Jan    2016-01-01 Friday   12:00
Feb    2016-02-01 Monday   23:45
Mar    2016-03-01 Tuesday  15:30

This prints the two arrays, and the hash:

$VAR1 = [
          'Jan',
          'Feb',
          'Mar'
        ];
$VAR2 = [
          '2016-01-01',
          '2016-02-01',
          '2016-03-01'
        ];
$VAR3 = {
          'Mar' => '2016-03-01',
          'Feb' => '2016-02-01',
          'Jan' => '2016-01-01'
        };

3 Comments

So split in this case just splits the entire line into individual pieces/words? I assume this approach is the way to go when every line has the same format, and that I should use regular expressions instead when the lines in the text file don't all share the same format?
@NinjaAnte split is good when you have regular delimited data (comma, tab, semicolon, etc.), which is what you appear to have here. It doesn't work when fields can contain the delimiter (e.g. a CSV where fields can contain commas or a fixed-width format where fields are separated by spaces but can also contain spaces), or when the data isn't delimited, etc.
split takes a delimiter and turns the string into an array based on that. The delimiter can be a regular expression (the 'default' is any whitespace, which is often useful). Regular expressions are good if you need to validate (or discard) the line, or have variable field separators. (E.g. a date that is "Monday 4 00:23:22" ).
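
To illustrate the point made in the comments above, here is a minimal sketch in which a regex validates the shape of each line and split then extracts the fields. The input lines, the variable names, and the exact validation pattern are assumptions made up for this example; they are not from the question's file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input: two well-formed lines, one comment line, one malformed line.
my @lines = (
    "Jan    2016-01-01 Friday   12:00",
    "# a comment line",
    "Feb    2016-02-01 Monday   23:45",
    "Mar    not-a-date Tuesday  15:30",
);

my (@months, @dates, @rejected);
for my $line (@lines) {
    # Validate the shape first: three letters, whitespace, then YYYY-MM-DD.
    if ($line =~ /^[A-Za-z]{3}\s+\d{4}-\d{2}-\d{2}\s/) {
        # The line is regular, so a plain whitespace split is enough.
        my ($mon, $date) = split ' ', $line;
        push @months, $mon;
        push @dates,  $date;
    }
    else {
        # Anything that does not look like a record is set aside.
        push @rejected, $line;
    }
}

print "months: @months\n";
print "dates: @dates\n";
print "rejected: ", scalar(@rejected), "\n";
```

The regex only decides whether a line is worth parsing; the field extraction itself stays with split, which keeps the pattern short.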
0

Use capture groups, written with parentheses (), to extract the parts you need from a matching regular expression:

#!/usr/bin/perl
use warnings;
use strict;

my (@months, @dates);
while (<DATA>) {
    if (my ($month, $date) = /^(...) \s+ ([0-9-]+)/x) {
        push @months, $month;
        push @dates, $date;
    }
}
print "@months\n@dates\n";

__DATA__
Jan    2016-01-01 Friday   12:00
Feb    2016-02-01 Monday   23:45
Mar    2016-03-01 Tuesday  15:30

If you want to only accept month names, you can change the first group from (...) to (A(?:pr|ug)|Dec|Feb|J(?:an|u[ln])|Ma[ry]|Nov|Oct|Sep).
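
As a rough sketch of that stricter variant, the alternation can be stored with qr// and interpolated into the same pattern. The name $month_re and the "Foo" test line are assumptions added for illustration, and the data is held in an array rather than read from a file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Alternation from the answer; it matches the twelve English month abbreviations.
my $month_re = qr/A(?:pr|ug)|Dec|Feb|J(?:an|u[ln])|Ma[ry]|Nov|Oct|Sep/;

# Hypothetical input: the second line starts with three letters that are not a month.
my @lines = (
    "Jan    2016-01-01 Friday   12:00",
    "Foo    2016-02-01 Monday   23:45",
    "Mar    2016-03-01 Tuesday  15:30",
);

my (@months, @dates);
for my $line (@lines) {
    # Same structure as the original pattern, with (...) replaced by ($month_re).
    if (my ($month, $date) = $line =~ /^($month_re) \s+ ([0-9-]+)/x) {
        push @months, $month;
        push @dates,  $date;
    }
}
print "@months\n@dates\n";
```

With this input the "Foo" line is skipped, so only Jan and Mar and their dates end up in the arrays.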

2 Comments

Wouldn't it be cleaner to simply split on whitespace? my ($month, $date, $weekday, $time) = split(/\s+/, $_);
OP requested a regex solution.
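
One detail worth knowing about the split(/\s+/, $_) form suggested in the comment above: it behaves differently from a bare split (or split ' ') when a line starts with whitespace. The sketch below uses a made-up line with leading spaces, which the sample data in the question does not have.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical line with leading whitespace.
my $line = "  Jan    2016-01-01 Friday   12:00";

# split /\s+/ treats the leading whitespace as a separator,
# so the first field is an empty string.
my @by_regex = split /\s+/, $line;

# split ' ' (the same behavior as split with no arguments)
# skips leading whitespace before splitting.
my @by_space = split ' ', $line;

printf "regex split: %d fields, first is '%s'\n", scalar(@by_regex), $by_regex[0];
printf "space split: %d fields, first is '%s'\n", scalar(@by_space), $by_space[0];
```

For the data shown in the question both forms give the same four fields; the difference only matters if lines can be indented.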
