I'm trying to write a script that uses the Text::CSV_XS library to separate columns of data from a (not terribly large) .csv into individual lists for later use. I have no problem getting individual columns, but I can't seem to iterate through a list of columns with a foreach loop.

#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS;
use 5.18.2;

my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
open my $fh, "<", "/users/whoever/test_csv.csv" or die "$!";

sub column_grabber {
        foreach my $column (@_) {
                my @data = map { $_->[$column] } @{$csv->getline_all ($fh)};
                return @data;
        }
}

my @column_numbers = (1,2,3,4);

my @collected_data = column_grabber(@column_numbers);

close $fh or die "$!";

Calling this subroutine for a list of columns gives me only the first column of the list as anticipated, but none of the following columns from the list. A bit of troubleshooting shows that @_ is seeing the entire list I pass.

If I omit the return statement, the foreach loop runs through all of the columns passed in @column_numbers, but I get no output from @data.

Is there some behavior in the loop I'm not accounting for? Perhaps it has something to do with the map() statement?

Edit / Solution

So after playing around with this for a while and rethinking things a bit, I've solved my problem.

  • First, opening and closing the filehandle from inside the loop seems to have cleared up a lot of headaches.
  • Second, it's a lot easier to just parse @column_numbers outside the subroutine and pass scalars to &column_grabber instead. This saves me from getting lost in a sea of references when I don't really need to worry about it for this small script.

So now my functioning script looks like this:

#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS;
use 5.18.2;

sub column_grabber {
    my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
    open my $fh, "<", "/users/whoever/test_csv.csv" or die "$!";

    my $column = shift @_;
    my @data = map { $_->[$column] } @{$csv->getline_all ($fh)};

    # Close before returning; a statement placed after return never runs.
    close $fh or die "$!";
    return @data;
}

my @column_numbers = (1,2,3,4);
foreach my $column(@column_numbers){
    my @collected_data = &column_grabber($column);
...
}

Thanks for the input and help from commenters.

  • The function returns from inside the loop, in the first iteration, so it never goes through columns other than the first. On the side, I'd advise to get into a habit of always unpacking arguments to a function, unless you have some ultimate need for speed. (Even then, if it comes down to that one should probably look at the design again.) Since the input from the caller is aliased in @_ there's a good chance for nasty bugs when it's used directly. Another good habit is to pass lists (arrays and hashes) by reference (unless they are merely short collections of independent variables). Commented May 19, 2018 at 1:48
  • Thanks for your help @zdim, I'll keep those pointers in mind going forward. Commented May 31, 2018 at 21:42
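
The points raised in the comment above (returning only after the loop has finished, unpacking @_, and passing the column list by reference) can be put together in a rough sketch. This is only one possible arrangement, not the asker's final script: the in-memory $sample data is made up for illustration, and the choice to pass the filehandle in and return an array reference is an assumption. Note that getline_all reads every remaining row, so it is called once, outside the loop; calling it once per column would leave nothing to read after the first iteration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical sample data standing in for test_csv.csv.
my $sample = "id,a,b,c,d\n1,10,20,30,40\n2,11,21,31,41\n";

# Takes a filehandle and a reference to an array of column indexes.
# Returns a reference to an array holding one array ref per requested column.
sub column_grabber {
    my ($fh, $columns) = @_;    # unpack the arguments instead of using @_ directly

    my $csv  = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });
    my $rows = $csv->getline_all($fh);    # read the whole file once

    my @collected;
    foreach my $column (@$columns) {
        # Pull this column's field out of every row.
        push @collected, [ map { $_->[$column] } @$rows ];
    }
    return \@collected;    # return only after every column has been handled
}

# An in-memory filehandle keeps the sketch self-contained.
open my $fh, "<", \$sample or die "$!";
my @column_numbers = (1, 2, 3, 4);
my $collected_data = column_grabber($fh, \@column_numbers);
close $fh or die "$!";

# One line of output per requested column (the header row is included).
print join(",", @$_), "\n" for @$collected_data;
```

Each element of $collected_data is itself an array reference, so a single column's values are available as @{ $collected_data->[0] }, @{ $collected_data->[1] }, and so on.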

1 Answer

Keep in mind that each element of @data (hereby renamed @rows or $rows) should be a reference to an array of the selected fields.

my @rows;
while ( my $row = $csv->getline($fh) ) {
   push @rows, [ @{ $row }[@column_numbers] ];
}

or

my $rows = $csv->getline_all($fh);
@$_ = @{ $_ }[@column_numbers] for @$rows;

or

my @rows = map { [ @{ $_ }[@column_numbers] ] } @{ $csv->getline_all($fh) };

2 Comments

Your code has been helping me out a bit, and I've been working with references for the first time. The only problem I'm having is dereferencing @rows from that first code example. I run into an error that tells me I can't use a string as an array reference. Should I be doing something differently to get my data back from @rows? Again, this is my first time working with references, so I might be missing something important.
Huh? It's impossible to dereference @rows. It's an array, not a reference. (It's an array of references, and each of those references points to an array with as many elements as @column_numbers.)
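
To make the structure described in that reply concrete, here is a small sketch of reading data back out of an array of array references. The contents of @rows are invented for illustration; in the answer's first example they would come from the CSV file. One possible cause of the "can't use a string as an array reference" error, offered only as a guess, is dereferencing one level too deep, for instance writing @{ $rows[0][1] }, which treats a field (a plain string) as if it were a reference.

```perl
use strict;
use warnings;

# Hypothetical data in the shape the answer's first example builds:
# an array whose elements are references to arrays of selected fields.
my @rows = (
    [ "10", "20", "30" ],
    [ "11", "21", "31" ],
);

# A single field: first row, second selected column.
my $field = $rows[0][1];    # shorthand for $rows[0]->[1]

# A whole row: dereference one element of @rows, not @rows itself.
my @first_row = @{ $rows[0] };

# Walk every row; $row is an array reference on each pass.
for my $row (@rows) {
    print join(",", @$row), "\n";
}

# One column across all rows.
my @second_column = map { $_->[1] } @rows;
```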
