I have an array of hashes and I need to get unique values for the college_name from this data structure.

I have achieved this, but it looks like a long process.

use strict;
use warnings;

use Data::Dumper;
use List::MoreUtils qw(uniq);

my %col_hash    = ();

my $college_ids = [
  {
    'term'         => 'SPRING',
    'city_code'    => '530233',
    'college_id'   => '200',
    'college_name' => 'Arts',
    'course_name'  => 'Drawing',
  },
  {
    'term'         => 'SUMMER',
    'city_code'    => '534233',
    'college_id'   => '300',
    'college_name' => 'COMMERCE',
    'course_name'  => 'FINANCE',
  }
];

foreach my $elem (@$college_ids) {
  if (exists $col_hash{'college_name'}) {
    push(@{ $col_hash{'college_name'} }, $elem->{'college_name'});
  }
  else {
    $col_hash{'college_name'} = [$elem->{'college_name'}];
  }
}

my @unique_college_names = uniq @{ $col_hash{'college_name'} };
warn Dumper(" LONG METHOD  = ", @unique_college_names);

I have to do the same for term, college_name and city_code.

Is there an alternate method to achieve the same functionality?

3 Answers

Unlike most languages, Perl will allow you to push to a variable that is currently undefined. It will autovivify an array and set the variable to refer to it.

Here's a short program that demonstrates the feature:

my $list;
push @$list, qw/ a b c /;
print $list->[1];

output

b

So there is no need to pre-define $list with something like my $list = [].

That means you can reduce your for loop to just

for my $elem (@$college_ids) {
    push @{ $col_hash{college_name} }, $elem->{college_name};
}
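
As a minimal sketch of how the same autovivifying push can collect several fields in one pass, the loop below gathers term, college_name and city_code together and then de-duplicates each list. The trimmed sample data, the extra duplicate row and the %unique_for name are assumptions made for illustration; the %seen idiom stands in for uniq so that the sketch needs no non-core module.

```perl
use strict;
use warnings;

# Sample data trimmed from the question; the repeated 'Arts' row is an
# assumption added here so that de-duplication has something to do.
my $college_ids = [
  { term => 'SPRING', city_code => '530233', college_name => 'Arts' },
  { term => 'SUMMER', city_code => '534233', college_name => 'COMMERCE' },
  { term => 'SPRING', city_code => '530233', college_name => 'Arts' },
];

my %col_hash;
for my $elem (@$college_ids) {
  # push autovivifies $col_hash{$field} as an array reference on first use
  push @{ $col_hash{$_} }, $elem->{$_} for qw( term college_name city_code );
}

# De-duplicate each list with the %seen idiom, keeping first-seen order
my %unique_for;
for my $field (keys %col_hash) {
  my %seen;
  $unique_for{$field} = [ grep { !$seen{$_}++ } @{ $col_hash{$field} } ];
}

print "$_: @{ $unique_for{$_} }\n" for sort keys %unique_for;
```

No exists check is needed anywhere: the first push for each field creates the array.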

However, I think it is simplest to use a hash of hashes to keep track of the unique values for each category. This program uses autovivification again, this time to increment a hash element that may not exist yet. After the loop, each hash value is the number of occurrences of that value in its category, but in this case you are not interested in the counts: it is only necessary to list the (unique) keys of the hash for each category.

use strict;
use warnings;

my %col_hash;

my $college_ids = [
  {
    'term'         => 'SPRING',
    'city_code'    => '530233',
    'college_id'   => '200',
    'college_name' => 'Arts',
    'course_name'  => 'Drawing',
  },
  {
    'term'         => 'SUMMER',
    'city_code'    => '534233',
    'college_id'   => '300',
    'college_name' => 'COMMERCE',
    'course_name'  => 'FINANCE',
  }
];

my %unique;

for my $elem (@$college_ids) {
  while (my ($key, $val) = each %$elem) {
    ++$unique{$key}{$val};
  }
}

for my $field ( qw/ term college_name city_code / ) {
  print "$field\n";
  print "  $_\n" for sort keys %{ $unique{$field} };
  print "\n";
}

output

term
  SPRING
  SUMMER

college_name
  Arts
  COMMERCE

city_code
  530233
  534233
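
If the same lookup is needed in several places, the hash-of-hashes idea can be wrapped in a small subroutine. This is only a sketch: unique_values is a hypothetical helper name, the data is trimmed from the question, and skipping undefined fields is an assumption about what you would want for incomplete records.

```perl
use strict;
use warnings;

# unique_values is a hypothetical helper name, not part of any module.
# It returns a hash reference mapping each requested field to a sorted
# array reference of the distinct values seen for that field.
sub unique_values {
  my ($records, @fields) = @_;

  my %count;
  for my $rec (@$records) {
    for my $field (@fields) {
      next unless defined $rec->{$field};    # ignore records lacking the field
      ++$count{$field}{ $rec->{$field} };    # autovivifies both hash levels
    }
  }

  my %result;
  $result{$_} = [ sort keys %{ $count{$_} || {} } ] for @fields;
  return \%result;
}

# Sample data trimmed from the question
my $college_ids = [
  { term => 'SPRING', city_code => '530233', college_name => 'Arts' },
  { term => 'SUMMER', city_code => '534233', college_name => 'COMMERCE' },
];

my $unique = unique_values($college_ids, qw( term college_name city_code ));
print "college_name: @{ $unique->{college_name} }\n";
```

Passing the field list as arguments means only the fields you care about are counted, instead of every key in every record.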

Borodin's answer is nearly there, but it's best to avoid using each.

In this case, removing each also makes the code shorter:

use strict;
use warnings;

my $college_ids = [
  {
    'term'         => 'SPRING',
    'city_code'    => '530233',
    'college_id'   => '200',
    'college_name' => 'Arts',
    'course_name'  => 'Drawing',
  },
  {
    'term'         => 'SUMMER',
    'city_code'    => '534233',
    'college_id'   => '300',
    'college_name' => 'COMMERCE',
    'course_name'  => 'FINANCE',
  }
];

my %unique;
for my $elem (@$college_ids) {
  ++$unique{$_}{$elem->{$_}} for keys %$elem;
}

for my $field (qw(term college_name city_code)) {
  print "$field\n";
  print "  $_\n" for sort keys %{ $unique{$field} };
  print "\n";
}
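
One commonly cited reason for caution with each is that the iterator lives in the hash itself, so leaving a loop early affects the next loop over the same hash. The sketch below illustrates that behaviour with a made-up three-key hash; the variable names are assumptions for the example only.

```perl
use strict;
use warnings;

my %h = ( a => 1, b => 2, c => 3 );

# Leave an each loop early: the hash's internal iterator stays advanced.
while ( my ($k, $v) = each %h ) {
  last;
}

# A second each loop resumes from that position and misses one pair.
my $seen_after_break = 0;
while ( my ($k, $v) = each %h ) {
  $seen_after_break++;
}

# Break out early once more, then reset the iterator explicitly.
while ( my ($k, $v) = each %h ) {
  last;
}
keys %h;    # calling keys in void context resets the iterator

my $seen_after_reset = 0;
while ( my ($k, $v) = each %h ) {
  $seen_after_reset++;
}

print "after break: $seen_after_break, after reset: $seen_after_reset\n";
```

A for loop over keys %h builds its list up front, so it does not depend on this shared iterator state.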

2 Comments

Thanks for your input. I found one solution, which I have pasted above.
It is foolish to proscribe a construct just because it has problems in specific situations. each is very useful for iterating over an entire hash without the additional time and noise that it takes to extract the hash value corresponding to each key. The link that you offer calls out issues with modifying a hash within a loop that uses each. Note that a simple for loop has the same problem, and in most other languages as well as Perl. More important is that each isn't reentrant. So yes, care must be taken to avoid nesting loops that use each, and it should never be used in callable c

I did it with this one line. No loops.

my %uniq_colleges = map { $_->{'college_name'} => 1 } @$college_ids;

Later keys %uniq_colleges will give me the list of unique colleges.
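
The same map idea could be stretched to cover several fields by nesting one map inside another, as in the sketch below. The %unique name and the trimmed sample data are assumptions for illustration, and map still iterates over every record for every field.

```perl
use strict;
use warnings;

# Sample data trimmed from the question
my $college_ids = [
  { term => 'SPRING', city_code => '530233', college_name => 'Arts' },
  { term => 'SUMMER', city_code => '534233', college_name => 'COMMERCE' },
];

# Outer map: one entry per field. Inner map: that field's values as hash keys.
my %unique = map {
  my $field = $_;
  ( $field => { map { $_->{$field} => 1 } @$college_ids } );
} qw( term college_name city_code );

print "$_: @{[ sort keys %{ $unique{$_} } ]}\n" for sort keys %unique;
```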

Thanks

2 Comments

map is no less a looping structure than for. Also, you don't seem to be answering your own question, which asks about how to extract all unique values for a number of different keys.
@Borodin Thanks for the question. Since I am storing the college names as hash keys, and hash keys are unique, that solves the problem.
