modifying multiple elements in array perl

Question

Say I were to have a file with a name field and 3 date fields and wanted to reformat the dates. I could go like this:

while (<DATA>) {
  my @lines = split(/\|/); ##splitting DATA by '|'

  my @dates = split( /\/|[-]/, $lines[0] ); #splitting only the first element of array and performing modifications below.
  if ( $dates[2] =~ /^[0-1][0-9]$/gi ) { $dates[2] = $dates[2] + 2000 }
  elsif ( $dates[2] =~ /^[2-9][0-9]$/gi ) {
    $dates[2] = $dates[2] + 1900;
  }
  if ( $dates[1] =~ /^\d$/gi ) { $dates[1] = "0" . $dates[1] }
  if ( $dates[0] =~ /^\d$/gi ) { $dates[0] = "0" . $dates[0] }
  my $date = join "-", @dates[ 2, 0, 1 ]; #joining the dates to be in yyyy-mm-dd format.
  print $date, "\n"; #double check
  print $date, ",", ( join ",", @lines[ 1 .. $#lines ] ), "\n"; # print the new date followed by the remaining fields
}

Instead of having to split and join each of $lines[0] through $lines[2] separately, is there a way to perform the modifications on all of the desired fields at once?

__DATA__
12/23/2014|2/20/1995|3/25/1905|josh

Incidentally, you've got some cargo-cult stuff in there: /gi on regexes that match anchored digits, and brackets around the hyphen in your @dates split. And most of your elses don't seem to be doing anything useful at all.
– Jim Davis, Commented Mar 26, 2015 at 19:09
@JimDavis The elses were useless so I removed them. I am using anchors, however, to signify exactly one digit instead of matching any digit.
– JDE876, Commented Mar 26, 2015 at 19:19
A more appropriate tool may be sprintf (perldoc.perl.org/functions/sprintf.html), which normalizes the digits with leading zeros where required: $dates[0] = sprintf("%02d", $dates[0]); and ditch the if and regex. That said, the comment about /gi was that you need neither case insensitivity nor multiple matches when you're matching numerical digits with start and end anchors.
– Oesor, Commented Mar 26, 2015 at 19:25
@JDE876: The anchors are fine, but you don't need the /g switch to match one item, and the /i to ignore case doesn't affect a regex that only matches numbers. They look like inherited cruft.
– Jim Davis, Commented Mar 26, 2015 at 19:37
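
The sprintf suggestion from the comments can be sketched as below. This is only an illustration: normalize_date is a hypothetical helper name that does not appear in the thread, and the two-digit-year pivot (00-19 become 20xx, 20-99 become 19xx) is copied from the question's regexes.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): turn m/d/yy or m-d-yyyy
# into yyyy-mm-dd, letting sprintf do the zero padding.
sub normalize_date {
    my ($date) = @_;
    my ( $month, $day, $year ) = split /[\/-]/, $date;

    # Two-digit years: 00-19 become 20xx, 20-99 become 19xx,
    # the same pivot the question's code uses.
    if ( length($year) == 2 ) {
        $year += $year < 20 ? 2000 : 1900;
    }
    return sprintf '%04d-%02d-%02d', $year, $month, $day;
}

print normalize_date($_), "\n" for qw( 12/23/2014 2/20/95 3-5-05 );
```

No /g or /i flags are needed anywhere, and the per-field if statements disappear.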

Sinan Ünür · Accepted Answer · 2015-03-26 19:54:58Z

1

Your script gives nasty output with the input you provided. The output of the following seems much more logical to me:

#!/usr/bin/env perl

use strict;
use warnings;

while (my $line = <DATA>) {
    next unless $line =~ /\S/;
    chomp $line;    # otherwise the newline stays attached to the name field
    my ($name, @dates) = reverse split qr{\|}, $line;
    @dates = reverse map sprintf('%04d-%02d-%02d', (split qr{/})[2,0,1]), @dates;
    print join(',', @dates, $name), "\n";
}
__DATA__
12/23/2014|2/20/1995|3/25/1905|josh

Output:

2014-12-23,1995-02-20,1905-03-25,josh

If this is not the output you want, then describe the exact output you are trying to get.

A few points:

while (<DATA>) reads a single line from DATA into $_. Assign it to a meaningfully named variable to make your code easier to read.
Skip processing on empty lines.
Don't fall victim to LTS (leaning toothpick syndrome): /\|/ is much harder to read than qr{\|} or even qr{ \| }x.
The reverses make the code easier to read, but if you have a ton of fields, they may become a real bottleneck. In that case, pop and push.
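
The pop and push alternative mentioned in the last point might look roughly like this. It is a sketch, not the answer's own code: reformat_line is a made-up name, and it assumes the name is always the last field and every other field is an m/d/yyyy date.

```perl
use strict;
use warnings;

# Hypothetical helper: reformat every field except the trailing name,
# using pop/push instead of reversing the list twice.
sub reformat_line {
    my ($line) = @_;
    chomp $line;
    my @fields = split qr{\|}, $line;
    my $name   = pop @fields;    # the name is the last field
    my @dates  = map { sprintf( '%04d-%02d-%02d', ( split qr{/} )[ 2, 0, 1 ] ) } @fields;
    push @dates, $name;          # put the name back at the end
    return join ',', @dates;
}

print reformat_line("12/23/2014|2/20/1995|3/25/1905|josh\n"), "\n";
```

With this layout $dates[0] is the first date on the line, so there is no reversed indexing to keep in mind.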


3 Comments

JDE876 Over a year ago

How could I access one of the elements of @dates from there?

JDE876 Over a year ago

Yeah of course! I toyed around a little and since the array is reversed you access them in reverse, i.e. 0 is the last element as opposed to the first. Thanks! Your answer is the most compact and dynamic.

Sinan Ünür Over a year ago

But the elements of the array are transformed and put into correct order just on the following line, so you should not have to worry about that. If it bothers you, use pop and push. Thanks for accepting, btw.

David W. · Answer · 2015-03-26 20:52:28Z

You can use a regular expression to match the various pieces of your data line and do away with the multiple splits. A regular expression can also verify your line format. Do you have a 3 digit month? Do you have three dates? It's always a good idea to verify your input:

#! /usr/bin/env perl
#
use strict;
use warnings;
use feature qw(say);

my $date_re = qr(
        ^(?<month1>\d{1,2})/
        (?<day1>\d{1,2})/
        (?<year1>\d{2,4})
        \|                    # Separator between date1 and date2
        (?<month2>\d{1,2})/
        (?<day2>\d{1,2})/
        (?<year2>\d{2,4})
        \|                    # Separator between date2 and date3
        (?<month3>\d{1,2})/
        (?<day3>\d{1,2})/
        (?<year3>\d{2,4})
        \|                    # Separator between date3 and name
        (?<name>.*)
    )x;
while ( my $line = <DATA> ) {
    my @array;
    if ( not @array = $line =~ m/$date_re/ ) {
        say "Something's wrong";
    }
    else {
        say "First Date: Year = $+{year1}  Month = $+{month1}  Day = $+{day1}";
        say "Second Date: Year = $+{year2}  Month = $+{month2}  Day = $+{day2}";
        say "Third Date: Year = $+{year3}  Month = $+{month3}  Day = $+{day3}";
        say "Name = $+{name}";
    }
}

__DATA__
12/23/2014|2/20/1995|3/25/1905|josh

Running this program prints out:

First Date: Year = 2014  Month = 12  Day = 23
Second Date: Year = 1995  Month = 2  Day = 20
Third Date: Year = 1905  Month = 3  Day = 25
Name = josh

This is using some advanced features of regular expressions:

qr/.../ can be used to define regular expressions. Since you have slashes in the regular expression, I decided to use parentheses to delimit my regular expression, so it's qr(...).
The )x at the end means I can use white space to make my regular expression easier to understand. For example, I broke out each of the dates onto three lines (month, day, year).
(?<name>...) names a capture group, which makes it easier to refer back to that group. I can use the %+ hash to recall my capture groups. For example, (?<month1>\d{1,2}) says that I expect a one- or two-digit month. I store this in the capture group month1, and I can refer back to it with $+{month1}.

One of the nice things about using named capture groups is that it documents what you're attempting to capture.
{M,N} is a repeat count: the preceding element must match from M to N times. \d{1,2} means I'm expecting one or two digits.
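
To get from the named captures to the yyyy-mm-dd output the question asks for, the captured pieces can be handed to sprintf. The sketch below is an assumption-laden variation rather than the code above: it uses one pattern for a single date, applied to each field in turn, and iso_date and $one_date_re are hypothetical names. It assumes four-digit years and Perl 5.10 or later.

```perl
use 5.010;
use strict;
use warnings;

# One date, anchored at both ends so malformed fields are rejected.
my $one_date_re = qr{
    \A (?<month>\d{1,2}) / (?<day>\d{1,2}) / (?<year>\d{4}) \z
}x;

# Hypothetical helper: returns yyyy-mm-dd, or undef when the field
# does not look like m/d/yyyy.
sub iso_date {
    my ($field) = @_;
    return undef unless $field =~ $one_date_re;
    return sprintf '%04d-%02d-%02d', $+{year}, $+{month}, $+{day};
}

my @fields = split /\|/, '12/23/2014|2/20/1995|3/25/1905|josh';
my $name   = pop @fields;
my @iso    = map { iso_date($_) // 'INVALID' } @fields;
print join( ',', @iso, $name ), "\n";
```

Because the pattern is anchored, a field such as 2/20 is flagged instead of being silently reformatted.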

Toto · Answer · 2015-03-26 19:38:17Z

0

You have to keep the splitting of the fields and the join, but you may reduce the substitutions to:

$dates[2] =~ s/^([01]\d)$/20$1/;
$dates[2] =~ s/^([2-9]\d)$/19$1/;
$dates[1] =~ s/^(\d)$/0$1/;
$dates[0] =~ s/^(\d)$/0$1/;
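
One way to apply these four substitutions to several fields at once is to wrap them in a subroutine and loop over an array slice. This is a sketch under assumptions: fix_date is a hypothetical name, and the sample line is the one from Jim Davis's answer in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the four substitutions above.
sub fix_date {
    my ($field) = @_;
    my @dates = split /[\/-]/, $field;
    $dates[2] =~ s/^([01]\d)$/20$1/;     # 00-19 become 20xx
    $dates[2] =~ s/^([2-9]\d)$/19$1/;    # 20-99 become 19xx
    $dates[1] =~ s/^(\d)$/0$1/;          # zero-pad the day
    $dates[0] =~ s/^(\d)$/0$1/;          # zero-pad the month
    return join '-', @dates[ 2, 0, 1 ];
}

my $line  = '1/23/14|2/20/95|3/25/1905|josh';
my @lines = split /\|/, $line;

# foreach aliases $_ to each slice element, so this edits @lines in place.
$_ = fix_date($_) for @lines[ 0 .. 2 ];
print join( ',', @lines ), "\n";
```

The split and join still happen once per field, but they are written only once.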


Jim Davis · Answer · 2015-03-26 19:41:25Z

0

It's ugly, but it does all the fields in one pass as you requested.

while (<DATA>) {
    s/
      (?:^|\|)\K # start after a leading start-of-line or pipe
      (\d{1,2})
      [\/-]
      (\d{1,2})
      [\/-]
      (\d\d(?:\d\d)?)
      (?=\||\z) # look-ahead to see trailing pipe or end-of-string
     /
        sprintf('%04d-%02d-%02d',
            $3 <  20 ? $3 + 2000
          : $3 < 100 ? $3 + 1900
          : $3,
            $1,
            $2
        )
     /gex;
    print;

}

__DATA__
1/23/14|2/20/95|3/25/1905|josh
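
If the nested ternary inside the replacement is hard to follow, the year logic can be moved into its own subroutine. The following is a sketch of the same one-pass substitution, not a replacement for it: full_year and reformat_dates are hypothetical names, and \K requires Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a two-digit year with the same pivot as above.
sub full_year {
    my ($year) = @_;
    return $year < 20  ? $year + 2000
         : $year < 100 ? $year + 1900
         :               $year;
}

# Rewrites every m/d/yy or m-d-yyyy field in the line and returns the result.
sub reformat_dates {
    my ($line) = @_;
    $line =~ s{
        (?:^|\|)\K              # start of line or just after a pipe
        (\d{1,2}) [/-] (\d{1,2}) [/-] (\d\d(?:\d\d)?)
        (?=\||\z)               # followed by a pipe or end of string
    }{
        sprintf '%04d-%02d-%02d', full_year($3), $1, $2
    }gex;
    return $line;
}

print reformat_dates('1/23/14|2/20/95|3/25/1905|josh'), "\n";
```

As in the original, the pipe separators are left untouched; only the date fields change.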

