So I'm trying to make a hash of arrays based on a regex inside a foreach.

I'm getting some file paths, and they are of the format:

longfilepath/name.action.gz

so basically there will be files with the same name but different actions, so I want to make a hash whose keys are names and whose values are arrays of actions. I'm apparently doing something wrong, as I keep getting this error when I run the code:

Not an ARRAY reference at ....the file I'm writing in

Which I don't get, since I'm checking to see if it's set, and if not, declaring it as an array. I'm still getting used to Perl, so I'm guessing my problem is something simple.

I should also say that I've verified my regex is generating both the 'name' and 'action' strings properly, so the problem is definitely in my foreach.

Thanks for your help. :)

My code is thus.

my %my_hash;
my $file_paths = glom("/this/is/mypath/*.*\.gz"); 
foreach my $path (@$bdr_paths){

    $path =~ m"\/([^\/\.]+)\.([^\.]+)\.gz";

    print STDERR "=>".Dumper($1)."\n\r";
    print STDERR "=>".Dumper($2)."\n\r";

    #add the entity type to a hash with the recipe as the key
    if($my_hash{$1})
    {
        push($my_hash{$1}, $2);
    }
    else
    {
        $my_hash{$1} = ($2);
    }

}

2 Answers

It’s glob, not glom. In glob expressions, the period is not a metacharacter, so the pattern can simply be glob '/this/is/mypath/*.gz'.

The whole point of using alternate regex delimiters is to avoid unnecessary escapes. The forward slash is not a regex metacharacter, only the default delimiter, so with another delimiter it needs no backslash. Inside character classes, many operators lose their special meaning, so there is no need to escape the period. Ergo m!/([^/.]+)\.([^.]+)\.gz!.
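As a quick check of that pattern, here is a minimal sketch. The sample path and the variable names $name and $action are made up for illustration; only the regex comes from the paragraph above.

```perl
use strict;
use warnings;

# Hypothetical sample path in the question's "name.action.gz" layout.
my $path = '/this/is/mypath/recipe.bake.gz';

# With m!...! as the delimiter, "/" needs no backslash, and inside
# [...] the period is already literal.
my ($name, $action);
if ($path =~ m!/([^/.]+)\.([^.]+)\.gz!) {
    ($name, $action) = ($1, $2);
    print "name=$name action=$action\n";   # prints "name=recipe action=bake"
}
```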

Don't append \n\r to your output. ① The Dumper function already appends a newline. ② If you are on an OS that expects a CRLF, use the :crlf PerlIO layer, which translates every \n to a CRLF on output. You can add layers via binmode STDOUT, ':crlf'. ③ If you are doing networking, it might be better to specify the exact bytes you want to emit, e.g. \x0D\x0A or \015\012 (CR followed by LF; note that \n\r has them the wrong way round). In that case, also remove all PerlIO layers.

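Using a reference as the first argument to push doesn't work on perls older than v5.14. That form was experimental and was removed again in v5.24, so dereferencing explicitly with @{ ... } is the portable choice.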

Don't manually check whether you populated a slot in your hash or not; if it is undef and used as an arrayref, an array reference is automatically created there. This is known as autovivification. Of course, this requires you to perform this dereference (and skip the short form for push).

In Perl, parens only sort out precedence, and create list context when used on the LHS of an assignment. They do not create arrays. To create an anonymous array reference, use brackets: [$var]. Using parens like you do is useless; $x = $y and $x = ($y) are absolutely identical.
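The difference between parens and brackets is easy to see with ref. This is only an illustrative sketch; the variable names $y, $plain and $aref are made up.

```perl
use strict;
use warnings;

my $y = 'bake';

my $plain = ($y);   # parens only group: $plain is the string 'bake'
my $aref  = [$y];   # brackets build an anonymous array and return a reference

# ref() returns an empty string for a non-reference.
print "plain: ", (ref($plain) || 'not a reference'), "\n";
print "aref:  ", ref($aref), "\n";   # prints "aref:  ARRAY"
```

Dereferencing $plain as an array under use strict is what produces the "Not an ARRAY reference" error.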

So you either want

push @{ $my_hash{$1} }, $2;

or

if ($my_hash{$1}) {
  push @{ $my_hash{$1} }, $2;
} else {
  $my_hash{$1} = [$2];
}
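To see autovivification do the work, here is a small self-contained sketch. The (name, action) pairs are invented sample data standing in for the regex captures.

```perl
use strict;
use warnings;

# Hypothetical (name, action) pairs standing in for $1 and $2.
my @pairs = (['recipe', 'bake'], ['recipe', 'fry'], ['menu', 'print']);

my %my_hash;
for my $pair (@pairs) {
    my ($name, $action) = @$pair;
    # No existence check: the undef slot becomes an array ref
    # the first time it is dereferenced as one.
    push @{ $my_hash{$name} }, $action;
}

for my $name (sort keys %my_hash) {
    print "$name: @{ $my_hash{$name} }\n";
}
# prints:
# menu: print
# recipe: bake fry
```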

Edit: Three things I overlooked.

If glob is used in scalar context, it turns into an iterator. This is usually unwanted, except in a loop of the form while(my $path = glob(...)) { ... }. Otherwise it is more difficult to make sure the iterator is exhausted. Rather, use glob in list context to get all matches at once: my @paths = glob(...).

Where does $bdr_paths come from? What is inside?

Always check that a regex actually matched. This can avoid subtle bugs, as the captures $1 etc. keep their value until the next successful match.
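Putting the list-context glob and the match check together might look like the sketch below. It is not the asker's real setup: a temporary directory with made-up file names stands in for /this/is/mypath so the example can run anywhere.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: a throwaway directory with invented file names replaces
# the real directory from the question.
my $dir = tempdir(CLEANUP => 1);
for my $file (qw(recipe.bake.gz recipe.fry.gz README)) {
    open my $fh, '>', "$dir/$file" or die "Cannot create $dir/$file: $!";
    close $fh;
}

my %my_hash;
my @paths = glob("$dir/*");   # list context: all matches at once
for my $path (@paths) {
    # Skip non-matching paths so stale $1/$2 from an earlier
    # successful match are never used.
    next unless $path =~ m!/([^/.]+)\.([^.]+)\.gz!;
    push @{ $my_hash{$1} }, $2;
}

print "$_: @{ $my_hash{$_} }\n" for sort keys %my_hash;
```

Here README is skipped by the next unless line instead of silently reusing the captures from the previous file.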

3 Comments

It's working now, thanks. Any idea why this foreach is adding one extra element to the hash with a key of hash => undef? There are only 2 files currently, and the loop only dumps the variables twice, so it doesn't make sense to me. I used your first suggestion and got rid of the if statements. Actually, this happens with both methods. I can filter it out, but I'd rather figure out why it happens in the first place.
Never mind. It was because in my original code (not posted) I was declaring my hash as %my_hash = {}
@Rooster That means you didn't have use strict; use warnings activated. Otherwise you would have gotten a warning “Reference found where even-sized list expected at line ...”.
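The stray key described in these comments can be reproduced with a short sketch. The hash names %wrong and %right and the warning handler are only there for illustration.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be inspected.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my %wrong = {};   # assigns ONE element, a hash reference, used as a key
my %right = ();   # an empty list gives a genuinely empty hash

print "wrong has ", scalar(keys %wrong), " key(s)\n";   # 1: a stringified HASH(0x...) key
print "right has ", scalar(keys %right), " key(s)\n";   # 0
print "warning: $warnings[0]" if @warnings;
```

The single key in %wrong is the stringified reference with an undef value, which matches the extra element seen in the dump.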

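When you say $my_hash{$1} = ($2); the right-hand side is evaluated in scalar context, because you are assigning to a scalar hash element. The parens do not build a list or an array there; with several comma-separated values, the comma operator simply returns the last one, and that is what gets stored in the hash.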

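use strict;
use warnings;
use Data::Dumper;

my %h;
$h{a} = ('foo');                # parens only group: stores the string 'foo'
$h{b} = ['bar'];                # brackets: stores an array reference
$h{c} = ('foo', 'bar', 'bat');  # warns under 'use warnings'; stores 'bat'
print Dumper(\%h);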

Gives

$VAR1 = {
          'c' => 'bat',
          'b' => [
                   'bar'
                 ],
          'a' => 'foo'
        };

You can see that 'foo' is stored as a plain string value and not as an array reference. To store an anonymous array ref instead, use $my_hash{$1} = [$2]; after that you can push onto it with push( @{ $my_hash{$1} }, $2);
