Perl: Read columns and convert to array

Question

I am new to Perl and am trying to read a file with columns and build arrays from it. I have a file with the following columns.

file.txt
A    15
A    20
A    33 
B    20 
B    45
C    32
C    78

I want to create an array for each unique item in the first column, holding that item's values from the second column, e.g.:

@A = (15,20,33)
@B = (20,45)
@C = (32,78)

I tried the following code, but it only prints the two columns:

use strict;
use warnings;
my $filename = $ARGV[0];
open(FILE, $filename) or die "Could not open file '$filename' $!";
my %seen;
while (<FILE>)
{
    chomp;
    my $line = $_;
    my @elements = split (" ", $line);
    my $row_name = join "\t", @elements[0,1];
    print $row_name . "\n" if ! $seen{$row_name}++;

}
close FILE;

Thanks

Dave Cross · Accepted Answer · 2018-07-12 08:49:34Z

3

Firstly some general Perl advice. These days, we like to use lexical variables as filehandles and pass three arguments to open().

open(my $fh, '<', $filename) or die "Could not open file '$filename' $!";

And then...

while (<$fh>) { ... }
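As a minimal sketch of those two tips together, the script below opens a file with three-argument open() and a lexical filehandle, then reads it line by line. The temporary file and its three sample rows are assumptions added only so the example runs on its own; in the real script $filename would come from $ARGV[0].

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small sample file so the sketch is self-contained (illustrative data).
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
print {$tmp_fh} "A 15\nA 20\nB 45\n";
close $tmp_fh or die "Could not close '$filename': $!";

# Three-argument open: the mode ('<' for reading) is separate from the filename,
# and $fh is a lexical variable that is closed automatically when it goes out of scope.
open(my $fh, '<', $filename) or die "Could not open file '$filename': $!";

my @lines;
while (my $line = <$fh>) {
    chomp $line;
    push @lines, $line;
}
close $fh;

print "$_\n" for @lines;
```

Keeping the mode in its own argument means a filename that happens to start with a character such as > or | cannot change how the file is opened.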

But, given that you have your filename in $ARGV[0], another tip is to use an empty file input operator (<>) which will return data from the files named in @ARGV without you having to open them. So you can remove your open() line completely and replace the while with:

while (<>) { ... }

Second piece of advice - don't store this data in individual arrays. Far better to store it in a more complex data structure. I'd suggest a hash where the key is the letter and the value is an array containing all of the numbers matching that letter. This is surprisingly easy to build:

use strict;
use warnings;
use feature 'say';

my %data; # I'd give this a better name if I knew what your data was

while (<>) {
  chomp;
  my ($letter, $number) = split; # splits $_ on whitespace by default
  push @{ $data{$letter} }, $number;
}

# Walk the hash to see what we've got
for (sort keys %data) {
  say "$_ : @{ $data{$_} }";
}
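If you still need the values for one letter as a plain list, you can pull them out of the hash. The sketch below assumes the same data as file.txt, supplied through an in-memory filehandle so it runs without any arguments; the variable names @a_values, $count_b and $first_c are made up for illustration.

```perl
use strict;
use warnings;

# Same rows as file.txt, read from a string instead of a real file (assumption).
my $input = "A 15\nA 20\nA 33\nB 20\nB 45\nC 32\nC 78\n";
open(my $fh, '<', \$input) or die "Could not open in-memory input: $!";

my %data;
while (<$fh>) {
    chomp;
    my ($letter, $number) = split;
    push @{ $data{$letter} }, $number;
}
close $fh;

my @a_values = @{ $data{A} };          # copy of the list stored under key 'A'
my $count_b  = scalar @{ $data{B} };   # number of values stored under key 'B'
my $first_c  = $data{C}[0];            # first value stored under key 'C'

print "A: @a_values\n";                # prints "A: 15 20 33"
print "B has $count_b values\n";       # prints "B has 2 values"
print "First C value: $first_c\n";     # prints "First C value: 32"
```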


1 Comment

shawnhcorey Over a year ago

Using a name taken from the data as a variable name is called symbolic referencing, in case you want to research it. It is risky because an unexpected name in the data could clobber an existing Perl variable. Using a hash with the name as a key is much better.
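To make the difference concrete, here is a small sketch of both approaches side by side. The @rows sample data and the %by_name hash are assumptions for illustration. Note that the symbolic version only runs with strict refs switched off.

```perl
use strict;
use warnings;

our (@A, @B);   # package variables that the symbolic references resolve to
my @rows = ( [ A => 15 ], [ A => 20 ], [ B => 45 ] );   # illustrative data

# Symbolic references: a string from the data is used as a variable name.
{
    no strict 'refs';   # under strict refs this would be a fatal error
    for my $row (@rows) {
        my ($name, $value) = @$row;
        push @{$name}, $value;   # a name such as 'ARGV' would modify @ARGV instead
    }
}
print "symbolic: A=(@A) B=(@B)\n";

# Hash of arrays: the name is only ever a key, so no variable can be clobbered.
my %by_name;
for my $row (@rows) {
    my ($name, $value) = @$row;
    push @{ $by_name{$name} }, $value;
}
print "hash: A=(@{ $by_name{A} }) B=(@{ $by_name{B} })\n";
```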

Nick P · Accepted Answer · 2018-07-12 06:26:50Z

1

Change the loop to be something like:

while (my $line = <FILE>)
{
    chomp($line);
    my @elements = split (" ", $line);
    push(@{$seen{$elements[0]}}, $elements[1]);
}

This will create/append a list of each item as it is found, and result in a hash where the keys are the left items, and the values are lists of the right items. You can then process or reassign the values as you wish.


2 Comments

StarCoder17 Over a year ago

Thanks Nick. How can I store or print this hash to verify that it's taking only the unique values from column 1 and their respective values from column 2?

Nick P Over a year ago

@DanishSheikh Data::Dumper is the usual tool. Its Dumper function takes references, so load the module with use Data::Dumper; and then call something like print Dumper \%seen;
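A minimal sketch of that check, assuming %seen has already been filled as in the loop above (here it is populated by hand with the values from file.txt so the example runs on its own). The Sortkeys and Indent settings are optional and only make the dump easier to read.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # print hash keys in sorted order
$Data::Dumper::Indent   = 1;   # use a more compact indentation style

# Hand-filled stand-in for the hash built by the loop (assumption).
my %seen = (
    A => [ 15, 20, 33 ],
    B => [ 20, 45 ],
    C => [ 32, 78 ],
);

# Dumper takes a reference, hence the backslash in front of %seen.
my $dump = Dumper(\%seen);
print $dump;
```

Each key of the hash appears once in the dump, followed by the array of values collected for it, which is enough to confirm the grouping by eye.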
