0

I'm trying to use a foreach loop to loop through an array and then use a nested while loop to loop through each line of a text file to see if the array element matches a line of text; if so then I push data from that line into a new array to perform calculations.

The outer foreach loop appears to be working correctly (based on printed results with each array element) but the inner while loop is not looping (same data pushed into array each time).

Any advice?

The code is below

#! /usr/bin/perl -T

use CGI qw(:cgi-lib :standard);

print "Content-type: text/html\n\n";

my $input       = param('sequence');
my $meanexpfile = "final_expression_complete.txt";

open(FILE, $meanexpfile) or print "unable to open file";

my @meanmatches;

@regex = (split /\s/, $input);

foreach $regex (@regex) {

    while (my $line = <FILE>) {
        if ( $line =~ m/$regex\s(.+\n)/i ) {
            push(@meanmatches, $1);
        }
    }

    my $average                  = average(@meanmatches);
    my $std_dev                  = std_dev($average, @meanmatches);
    my $average_round            = sprintf("%0.4f", $average);
    my $stdev_round              = sprintf("%0.4f", $std_dev);
    my $coefficient_of_variation = $stdev_round / $average_round;
    my $cv_round                 = sprintf("%0.4f", $coefficient_of_variation);

    print font(
        { color => "blue" }, "<br><B>$regex average: $average_round   
&nbspStandard deviation: $stdev_round&nbspCoefficient of 
variation(Cv): $cv_round</B>"
    );
}

sub average {
    my (@values) = @_;
    my $count    = scalar @values;
    my $total    = 0;

    $total += $_ for @values;

    return $count ? $total / $count : 0;
}

sub std_dev {
    my ($average, @values) = @_;
    my $count       = scalar @values;
    my $std_dev_sum = 0;

    $std_dev_sum += ($_ - $average)**2 for @values;

    return $count ? sqrt($std_dev_sum / $count) : 0;
}
6
  • The way you calculate the averages is flawed, and your standard deviation is going to fail catastrophically. See Algorithms for calculating variance. While you are at it, you probably want the unbiased standard deviation estimator. Commented Mar 28, 2015 at 22:06
  • Also, where is this CGI script running? It seems to me anyone who can invoke this script can do interesting things with it. Commented Mar 28, 2015 at 22:56
  • I have laid out your Perl code so that it is more readable. It is essential that you do that yourself so that you can spot any scoping problems, and it is only polite to post readable code when you are hoping for help to debug it. For the same reasons you must always use strict and use warnings at the top of every Perl program that you write. They are invaluable aids to finding problems with your code, and without use strict there is little point in using my to declare anything. In this case, use strict reveals that you have failed to declare either @regex or $regex. Commented Mar 29, 2015 at 10:05
  • @SinanÜnür: Your blog post is fascinating. I have often used something similar to calculate rolling statistics, but had no thought that the accuracy would be improved. (I might have known that Don Knuth was involved.) But I think you overstate the importance of doing things "properly". As you say yourself, "One saving grace of the real world is the fact that a given variable is unlikely to contain values with such an extreme range" and I hope it would be apparent to anyone who had to write for such an unusual data set that special methods were necessary. Commented Mar 29, 2015 at 10:17
  • @Borodin Did you try the variance experiment at the end? That is small set of numbers that are close to each other. One must resort to somewhat contrived examples to illustrate the problem in one step, but in real life, you tend to do a lot more summations. It is important to calculate things stably. Commented Mar 29, 2015 at 13:02

2 Answers 2

1

Yes, my advice would be:

  • Turn on strict and warnings.
  • perltidy your code,
  • use 3 argument open: open ( my $inputfile, "<", 'final_expression.txt' );
  • die if it doesn't open - the rest of your program is irrelevant.
  • chomp $line
  • you are iterating your filehandle, but once you've done this you're at the end of file for the next iteration of the foreach loop so your while loops becomes a null operation. Simplistically, reading the file into an array my @lines = <FILE>; would fix this.

So with that in mind:

#!/usr/bin/perl -T

use strict;
use warnings;

use CGI qw(:cgi-lib :standard);
print "Content-type: text/html\n\n";

my $input       = param('sequence');
my $meanexpfile = "final_expression_complete.txt";
open( my $input_file, "<", $meanexpfile ) or die "unable to open file";
my @meanmatches;

my @regex = ( split /\s/, $input );
my @lines = <$input_file>;
chomp (@lines);
close($input_file) or warn $!;

foreach my $regex (@regex) {
    foreach my $line (@lines) {
        if ( $line =~ m/$regex\s(.+\n)/i ) {
            push( @meanmatches, $1 );
        }
    }

    my $average                  = average(@meanmatches);
    my $std_dev                  = std_dev( $average, @meanmatches );
    my $average_round            = sprintf( "%0.4f", $average );
    my $stdev_round              = sprintf( "%0.4f", $std_dev );
    my $coefficient_of_variation = $stdev_round / $average_round;
    my $cv_round = sprintf( "%0.4f", $coefficient_of_variation );

    print font(
        { color => "blue" }, "<br><B>$regex average: $average_round   
&nbspStandard deviation: $stdev_round&nbspCoefficient of 
variation(Cv): $cv_round</B>"
    );
}

sub average {
    my (@values) = @_;
    my $count    = scalar @values;
    my $total    = 0;
    $total += $_ for @values;

    return $count ? $total / $count : 0;
}

sub std_dev {
    my ( $average, @values ) = @_;

    my $count       = scalar @values;
    my $std_dev_sum = 0;
    $std_dev_sum += ( $_ - $average )**2 for @values;

    return $count ? sqrt( $std_dev_sum / $count ) : 0;
}
Sign up to request clarification or add additional context in comments.

1 Comment

Believe it or not, my first draft had that, but I lost it again :).
0

The problem here is that starting from the second iteration of foreach you are trying to read from already read file handle. You need to rewind to the beginning to read it again:

foreach $regex (@regex) {
    seek FILE, 0, 0;
    while ( my $line = <FILE> ) {

However that does not look very performant. Why read file several times at all, when you can read it once before the foreach starts, and then iterate through the list:

my @lines;
while (<FILE>) {
    push (@lines, $_);
}

foreach $regex (@regex) {
    foreach $line (@lines) {

Having the latter, you might also what to consider using grep instead of the while loop.

1 Comment

If all you're doing is pushing the lines onto an array, you can drop the while loop and simply do my @lines = <FILE>; Don't forget to chomp!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.