3

I have a file that contains parameters using this syntax:

RANGE {<value> | <value>-<value>} [ , ...]

where each <value> is a number.

For example, all of these are valid:

RANGE 34
RANGE 45, 234
RANGE 2-99
RANGE 3-7, 15, 16, 2, 54

How can I parse the values to an array in Perl?

For the last example, I want my array to contain 3, 4, 5, 6, 7, 15, 16, 2, 54. The ordering of elements does not matter.


The most basic way is to check for a - symbol to determine whether there is a range, parse the range using a loop, and then parse the rest of the elements:

my @arr;
open my $fh, "<", "file.txt" or die (...);
while (<$fh>) {
    if ($_ =~ /RANGE/) {
        if ($_ =~ /-/) { # parse the range
            < how do I parse the lower and upper limits? >
            for($lower..$upper) {
                $arr[++$#arr] = $_;
            }
        } else { # parse the first value
            < how do I parse the first value? >
        }

        # parse the rest of the values after the comma
        < how do I parse the values after the comma? >
    }
}
  • I need help parsing the numbers. One way I can think of is to use successive splits (on '-', ',' and ' '). Is there a better way (clean and elegant, perhaps using a regex)?

  • Also, comments/suggestions on the overall program structure are welcome.

2
  • opening your input file for writing is probably a bad first step ;) Commented Sep 21, 2010 at 19:47
  • Your code will also not run because of the <fh>, which needs to be changed to <$fh>. Commented Sep 21, 2010 at 20:22

7 Answers

5

Take a look at the Text::NumericList module from CPAN. It converts strings to arrays in a way similar to what you need:

use Text::NumericList;
my $list = Text::NumericList->new;

$list->set_string('1-3,5-7');
my @array = $list->get_array;     # Returns (1,2,3,5,6,7)

You can at least look at its source code for ideas.


4

I would suggest parsing the line into a separate variable, as $_ tends to get clobbered by other function calls. You can remove the trailing newline at the same time, with chomp.

while (<$fh>)
{
    chomp (my $line = $_);
    # ...
}

Next, you need to detect the 'RANGE' indicator, and extract the numbers that follow. If there is no such indicator, you can just skip to the next line:

next if $line !~ /^RANGE (.*)$/;

Now, you can start extracting the numbers, splitting on the comma delimiter:

my @ranges = split /, /, $1;

Now you can extract the dashes and translate those into ranges. This is the tricky part -- if the value has a dash in it, get the first and second numbers, and turn them into a range with the .. operator; otherwise, leave the number alone:

@ranges = map { /(\d+)-(\d+)/ ? ($1 .. $2) : $_ } @ranges;

Putting all that together, and combining expressions, gives us:

my @numbers;
while (<$fh>)
{
    chomp (my $line = $_);
    next if $line !~ /^RANGE (.*)$/;

    push @numbers, map { /(\d+)-(\d+)/ ? ($1 .. $2) : $_ } (split /, /, $1);
}
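
As a self-contained sketch of the same approach, the logic can be wrapped in a function and called on a single line. The function name expand_range_line and the looser /\s*,\s*/ split (which tolerates missing or extra spaces around the commas) are assumptions made for illustration, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: expand one "RANGE ..." line into a list of numbers.
# Returns an empty list for lines that do not start with "RANGE".
sub expand_range_line {
    my ($line) = @_;
    chomp $line;
    return unless $line =~ /^RANGE\s+(.*)$/;

    # Assumption: allow optional whitespace around the commas.
    # A token like "3-7" becomes 3..7; anything else is passed through as-is.
    return map { /^(\d+)-(\d+)$/ ? ($1 .. $2) : $_ } split /\s*,\s*/, $1;
}

my @numbers = expand_range_line("RANGE 3-7, 15, 16, 2, 54\n");
print "@numbers\n";    # prints: 3 4 5 6 7 15 16 2 54
```

Returning a list from the function keeps the file-reading loop down to a single push per line.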

1 Comment

map with this: { /(\d+)-(\d+)/ ? ($1 .. $2) : $_ }

3

What about this?

First split the line into elements separated by commas, and then check whether each element contains a '-' sign to create ranges:

if ($line =~ /RANGE ([\d\,\- ]+)/) {
    my $paramtxt = $1;
    my @elements = split(/\,/, $paramtxt);
    for my $element (@elements) {
        if ($element =~ /(\d+)\-(\d+)/) {
            # a range such as "3-7": expand it
            my $lower = $1;
            my $upper = $2;
            push @arr, $lower .. $upper;
        } elsif ($element =~ /(\d+)/) {
            # a single value such as "15"
            my $solo = $1;
            push @arr, $solo;
        }
    }
}


2

I like using Perl's range and || operators for a problem like this:

map {  my($x,$y)=split/-/; $x..$y||$x } split /\s*,\s*/;

If the token contains a -, the split/-/ statement sets both $x and $y, and the range from $x to $y is added to the map output. Otherwise only $x is set; because || binds more tightly than .., the expression is read as $x..($y||$x), which then yields just $x.
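
To see the expression in a complete program, here is a minimal sketch. It strips the RANGE keyword first, since the one-liner operates on whatever is passed to split. The sample line and the variable names $line, $list and @arr are assumptions for illustration.

```perl
use strict;
use warnings;

my $line = "RANGE 3-7, 15, 16, 2, 54";

# Strip the keyword so only the comma-separated tokens remain.
(my $list = $line) =~ s/^RANGE\s+//;

# For "3-7", split gives ($x, $y) = (3, 7) and the range is 3..7.
# For "15", $y is undefined, so ($y || $x) falls back to $x, giving 15..15.
my @arr = map { my ($x, $y) = split /-/; $x .. ($y || $x) } split /\s*,\s*/, $list;

print "@arr\n";    # prints: 3 4 5 6 7 15 16 2 54
```

One caveat of the || fallback: an upper bound of 0 is false, so it is treated as if it were missing.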


1

Filter duplicates with a hash:

#! /usr/bin/perl

use warnings;
use strict;

use 5.10.0;

my @tests = (
  "RANGE 34",
  "RANGE 45, 234",
  "RANGE 2-99",
  "RANGE 3-7, 15, 16, 2, 54",
);

for (@tests) {
  my %hits;
  @hits{$1 .. $2 // $1} = ()
    while /(\d+)(?:-(\d+))?/g;

  my @array = sort { $a <=> $b } keys %hits;
  print "@array\n";
}

Output:

34
45 234
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
2 3 4 5 6 7 15 16 54


1

Along the same lines as other answers:

#!/usr/bin/perl

use strict; use warnings;

my $number = '[0-9]+';
my $range  = "$number(?:-$number)?";
my $ranges = "$range(?:, $range)*";
my $pattern = qr/^RANGE ($ranges)$/;


while ( my $range = <DATA> ) {
    next unless $range =~ $pattern;
    my $expanded = expand_ranges($1);
    print "@$expanded\n\n";
}

sub expand_ranges {
    my ($ranges) = @_;
    my @terms = split /, /, $ranges;
    my @expanded;

    for my $term ( @terms ) {
        my ($lo, $hi) = split /-/, $term;
        push @expanded, defined( $hi ) ? $lo .. $hi : $lo .. $lo;
    }

    return \@expanded;
}


__DATA__
RANGE 34
RANGE 45, 234
RANGE 2-99
RANGE 3-7, 15, 16, 2, 54

Output:

34

45 234

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99

3 4 5 6 7 15 16 2 54


0

Here's my effort:

sub parse_range {
    my $str = shift;
    return unless $str =~ /^RANGE /g;

    my @array;
    while ($str =~ / \G \s* ( \d+ ) ( - ( \d+ ) ) ? \s* (?: , | $ ) /gxc) {
        push @array, $2 ? $1 .. $3 : $1;
    }

    return $str =~ /\G$/ ? @array : ();

}

It returns an empty list if the string parameter doesn't conform to the basic format you laid out.
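
A short usage sketch follows, repeating the function with added comments so that the example runs on its own. The two test strings are assumptions chosen to exercise both the valid and the invalid outcome.

```perl
use strict;
use warnings;

sub parse_range {
    my $str = shift;
    # With /g in scalar context, pos($str) is left just after "RANGE ".
    return unless $str =~ /^RANGE /g;

    my @array;
    # \G anchors each match where the previous one ended;
    # /c keeps pos($str) in place when a match finally fails.
    while ($str =~ / \G \s* ( \d+ ) ( - ( \d+ ) ) ? \s* (?: , | $ ) /gxc) {
        push @array, $2 ? $1 .. $3 : $1;
    }

    # The input is valid only if the whole string was consumed.
    return $str =~ /\G$/ ? @array : ();
}

my @ok  = parse_range("RANGE 3-7, 15, 16, 2, 54");
my @bad = parse_range("RANGE 3-7, abc");

print "ok:  @ok\n";                         # prints: ok:  3 4 5 6 7 15 16 2 54
print "bad: ", scalar(@bad), " elements\n"; # prints: bad: 0 elements
```

Note that a line with a malformed tail is rejected as a whole, even though its first token was a valid range.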
