I have a text file that looks like this:

    Parameter 0:
    Field 1           : 100
    Field 2           : 0
    Field 3           : 4

    Parameter 1:
    Field 1           : 873
    Field 2           : 23
    Field 3           : 89

I want to write a Perl script that parses this file and prints it in the following format:

     Parameter Field1 Field2 Field3
       0          100     0      4
       1          873     23     89

Can anyone help me with this? Any help will be greatly appreciated. I have tried the following so far:

my %hash = ();
my $file = "sample.txt";

open (my $fh, "<", $file) or die "Can't open the file $file: ";

while (my $line =<$fh>)
{
    chomp ($line);
    my($key) = split(" : ", $line);
    $hash{$key} = 1;
}

foreach my $key (sort keys %hash)
{
    print "$key\n";
}
  • @RenéNyffenegger: But I am getting an error in this code, and it's not giving exactly the output that I want. Commented Mar 19, 2014 at 6:54
  • It will certainly help if you edit your question and add the code that you pasted as a comment, as well as the error message that you received. Commented Mar 19, 2014 at 6:57

4 Answers

2

This Perl program does what you ask. It allows for any number of fields for each parameter (although there must be the same number of fields for every parameter) and takes the header labels for the fields from the data itself.

use strict;
use warnings;

my $file = 'sample.txt';

open my $fh, '<', $file or die qq{Can't open "$file" for input: $!};

my %data;
my @params;
my @fields;

while (<$fh>) {
  next unless /\S/;
  chomp;

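  my ($key, $val) = split /\s*:\s*/;
  # A line such as "Parameter 0:" has nothing after the colon, so $val is undefined there
  if (defined $val and $val =~ /\S/) {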
    push @fields, $key if @params == 1;
    push @{ $data{$params[-1]} }, $val if @params;
  }
  else {
    die qq{Unexpected parameter format "$key"} unless $key =~ /parameter\s+(\d+)/i;
    push @params, $1;
  }
}

my @headers = ('Parameter', @fields);
my @widths = map length, @headers;
my $format = join(' ', map "%${_}s", @widths) . "\n";

printf $format, @headers;
for my $param (@params) {
  printf $format, $param, @{ $data{$param} };
}

output

Parameter Field 1 Field 2 Field 3
        0     100       0       4
        1     873      23      89

0
use warnings; use strict;

my $file = "sample.txt";
open (my $fh, "<", $file) or die "Can't open the file $file: $!";

print "Parameter Field1 Field2 Field3\n";

while (my $line=<$fh>) {

  process_parameter($1) if $line =~ /Parameter (\d+):/;

}

sub process_parameter {

  my $parameter = shift;

  my ($field_1) = (<$fh> =~ /(\d+) *$/);
  my ($field_2) = (<$fh> =~ /(\d+) *$/);
  my ($field_3) = (<$fh> =~ /(\d+) *$/);

  printf "  %-2d         %-6d  %-6d %-6d\n", $parameter, $field_1, $field_2, $field_3;
}

2 Comments

I don't get the point in isolating four lines of code in a subroutine. Could that not just as well be put in the while loop?
@Borodin Technically yes.
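
For reference, a rough sketch of what folding the subroutine into the `while` loop could look like. Like the answer above, it assumes that every `Parameter` line is followed by exactly three field lines. The in-memory filehandle, the sample text and the `@rows` array are only there to keep the example self-contained; they are not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical sample input, read through an in-memory filehandle for the demo.
my $text = "Parameter 0:\nField 1 : 100\nField 2 : 0\nField 3 : 4\n";
open my $fh, '<', \$text or die "Can't open in-memory file: $!";

my @rows;
while (my $line = <$fh>) {
    next unless $line =~ /Parameter (\d+):/;
    my $parameter = $1;

    # Read the next three lines and take the trailing number from each.
    my @values;
    for (1 .. 3) {
        my $field_line = <$fh>;
        my ($value) = defined $field_line ? $field_line =~ /(\d+) *$/ : ();
        push @values, defined $value ? $value : 0;   # 0 if the line is missing or malformed
    }

    push @rows, sprintf "  %-2d         %-6d  %-6d %-6d", $parameter, @values;
}
close $fh;

print "Parameter Field1 Field2 Field3\n";
print "$_\n" for @rows;
```

Whether this reads better than the separate subroutine is mostly a matter of taste; the behaviour is meant to be the same.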
0
#!/usr/bin/perl

my %hash = ();
my %fields;

my $param;

while ( my $line = <DATA> ) {
    chomp $line;    # chomp's return value is not a safe loop test: a last line without a newline would be skipped
    if ( $line =~ /Parameter (\d+):/ ) {
        $param = $1;
    }
    next unless ( defined $param );

    if ( my ( $f, $v ) = $line =~ /(Field \d+)[\s\t]*: (\d+)/ ) {
        $hash{$param} ||= {};

        $hash{$param}->{$f} = $v;

        $fields{$f} ||= 1;
    }

}

my @fields = sort keys %fields;
print join( ',', 'Parameter', @fields ), "\n";

foreach my $param ( sort { $a <=> $b } keys %hash ) {
    print join( ',', $param, @{ $hash{$param} }{@fields} ), "\n";
}

__DATA__
Parameter 0:
Field 1           : 100
Field 2           : 0
Field 3           : 4

Parameter 1:
Field 1           : 873
Field 2           : 23
Field 3           : 89

1 Comment

Always use strict and use warnings. There is no need for a character class like [\s\t] as \s matches tabs as well as spaces (and linefeed, carriage return and formfeed). There is no need to initialise a variable to an empty hash before setting elements of that hash: Perl will autovivify a hash for you. The hash %fields is inappropriate since you aren't interested in the values of the elements; you should use an array of field names instead.
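
A minimal sketch of what the points in the comment above might look like in practice, assuming the same input layout as in the question. The sub name `parse_params` and the inline sample text are made up for illustration; the `%seen` hash is only used to avoid pushing the same field name twice.

```perl
use strict;
use warnings;

# Hypothetical helper: parse the text and return (\@fields, \%data).
sub parse_params {
    my ($text) = @_;
    my (%data, @fields, %seen, $param);

    for my $line (split /\n/, $text) {
        if ($line =~ /Parameter\s+(\d+)\s*:/) {
            $param = $1;
        }
        elsif (defined $param and $line =~ /(Field\s+\d+)\s*:\s*(\d+)/) {
            my ($field, $value) = ($1, $2);
            # \s already covers spaces and tabs, so [\s\t] is not needed.
            # No "$data{$param} ||= {}" either: Perl autovivifies the inner hash.
            $data{$param}{$field} = $value;
            # Keep the field names in an array, in order of first appearance.
            push @fields, $field unless $seen{$field}++;
        }
    }
    return (\@fields, \%data);
}

# Made-up sample; the second field line uses a tab before the colon.
my $sample = "Parameter 0:\nField 1 : 100\nField 2\t: 0\n\nParameter 1:\nField 1 : 873\nField 2 : 23\n";
my ($fields, $data) = parse_params($sample);

print join(',', 'Parameter', @$fields), "\n";
for my $p (sort { $a <=> $b } keys %$data) {
    # Hash slice; a field missing for this parameter prints as an empty string.
    print join(',', $p, map { $_ // '' } @{ $data->{$p} }{@$fields}), "\n";
}
```

With the sample above this should print a `Parameter,Field 1,Field 2` header followed by the rows `0,100,0` and `1,873,23`.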
0

Here is a way that accepts any number of fields for each parameter:

use strict;
use warnings;
use feature 'say';    # needed for say

my $par;
my %out;
my $max = 0;
while (<DATA>) {
    chomp;
    next if /^\s*$/;
    if (/Parameter\s*(\d+)/) {
        $par = $1;
        next;
    }
    # Skip any line that is not of the form "Field N : value"
    my ($k, $v) = /Field\s+(\d+)\s*:\s*(\d+)/ or next;
    $out{$par}[$k] = $v;
    $max = $k if $k > $max;
}
my $cols = 'Param';
$cols .= "\tField $_" for 1 .. $max;
say $cols;
# Sort parameters numerically so that 10 comes after 9
foreach my $par (sort { $a <=> $b } keys %out) {
    my $out = $par;
    $out .= "\t" . ($out{$par}[$_] // ' ') for 1 .. $max;
    say $out;
}

__DATA__
Parameter 0:
    Field 1           : 100
    Field 2           : 0
    Field 3           : 4
    Field 5 :18

    Parameter 1:
    Field 1           : 873
    Field 2           : 23
    Field 3           : 89
    Field 4     : 123

output:

Param   Field 1 Field 2 Field 3 Field 4 Field 5
0       100     0       4               18
1       873     23      89      123 
