
I have a directory dir1 containing several hundred files, which are to be iteratively processed by a speech program called HRest. The program is supposed to take each file one by one, process it, and put the result in a new directory (dir2 for the first iteration) to be used in the next iteration. My problem is that I don't know whether the way I loop through the files in dir1, and the way I run the script (trainhmms.pl dir1 1), are correct.

If the files in dir1 are L1, L2, L3, ..., L500, I want HRest to be executed as

HRest -T 1 -I timedlabels_train.mlf -t -i 20 -l dir1/L1 -M dir2 -S train.scp

for the first file, and as

HRest -T 1 -I timedlabels_train.mlf -t -i 20 -l dir1/L2 -M dir2 -S train.scp

for the next file, and so on for all files. Then, in the next call of the script, I want it to change to

HRest -T 1 -I timedlabels_train.mlf -t -i 20 -l dir2/L1 -M dir3 -S train.scp

for the first file, and so on.

Here is the script for the first iteration:

#!/usr/bin/perl
use File::Slurp;

# Usage: trainhmms.pl dir1 1
# dir1:  Folder containing models after being initialised by HInit (L1,L2,..,L512)

$file = $ARGV[0];
$iter = $ARGV[1];


my @files = read_dir '/Users/negarolfati/Documents/Detection_rerun/AF_TIMIT/1_state//trainHMMs/dir1';

for my $file ( @files ) {


    $iter2 = $iter+1;
    $cmd = "HRest -T 1 -I timedlabels_train.mlf -t -i 20 -l '$dir[$iter]/$file' -M '$dir[$iter2]' -S train.scp ";

    system("$cmd");

}
  • It's not clear what you mean by $dir[$iter] and $dir[$iter2]. They access an array called @dir which doesn't exist. Commented Feb 1, 2015 at 14:43
  • By $dir[$iter], I mean the folder dir1 during the first iteration: I want to process all files in that folder and then store the processed files in dir2. Commented Feb 1, 2015 at 14:47
  • But what are those directories? Commented Feb 1, 2015 at 14:48
  • If the files in dir1 are L1, L2, L3, ..., L500, I want HRest to be executed as: HRest -T 1 -I timedlabels_train.mlf -t -i 20 -l dir1/L1 -M dir2 -S train.scp for the first file, and as HRest -T 1 -I timedlabels_train.mlf -t -i 20 -l dir1/L2 -M dir2 -S train.scp for the next file, and so on for all files. Then, in the next call of the script, I want it to change to HRest -T 1 -I timedlabels_train.mlf -t -i 20 -l dir2/L1 -M dir3 -S train.scp for the first file, and so on. Commented Feb 1, 2015 at 14:50
  • I hope it's clear now. Commented Feb 1, 2015 at 14:51

2 Answers

3

You can't just use readdir on a directory string. You have to opendir the string, then readdir from the directory handle that you get, and finally closedir the handle.

You must also remember that readdir returns directory names as well as file names, and the pseudo-directories . and .. too. To filter out just the files, you can use the -f test operator. And it is usually most convenient to chdir to the directory you are reading so that you don't have to append the path to each file name that readdir returns before you do the test.
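
As a minimal sketch of the filtering described above: it builds a throwaway directory with File::Temp, so the directory itself, the file names L1 and L2, and the subdirectory subdir are all stand-ins rather than anything from the real setup. It shows the -f test being applied to the full path, which is what you need when you have not used chdir.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use autodie;

use File::Temp 'tempdir';
use File::Spec::Functions 'catfile';

# Hypothetical scratch directory standing in for dir1
my $dir = tempdir(CLEANUP => 1);

# Create two plain files and one subdirectory to read back
for my $name (qw/ L1 L2 /) {
    open my $fh, '>', catfile($dir, $name);
    close $fh;
}
mkdir catfile($dir, 'subdir');

# readdir returns every entry: files, directories and the dot entries
opendir my $dh, $dir;
my @entries = readdir $dh;
closedir $dh;

# -f needs the full path because the working directory is not $dir
my @files = sort grep { -f catfile($dir, $_) } @entries;

print "all entries: ", scalar(@entries), "\n";
print "plain files: @files\n";
```

Only the entries that pass -f end up in @files, so the subdirectory and the dot entries never reach the loop that builds the command.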

I don't know what HRest is, but if your command line must be executed from a specific working directory (perhaps to access timedlabels_train.mlf and train.scp) then please say so, and I will remove the chdir statement.

Something like this should get you going. I have used autodie, which does automatic checks on file system operations. It saves having to check chdir and opendir explicitly each time with or die $!.

#!/usr/bin/perl

use strict;
use warnings;
use autodie;

use File::Spec::Functions 'catdir';

my ($file, $iter) = @ARGV;

my $root = '/Users/negarolfati/Documents/Detection_rerun/AF_TIMIT/1_state/trainHMMs';
my $dir1 = catdir $root, 'dir'.$iter;
my $dir2 = catdir $root, 'dir'.($iter+1);

chdir $dir1;

opendir my ($dh), '.';
my @files = grep -f, readdir $dh;
closedir $dh;

for my $file ( @files ) {

    my $cmd = "HRest -T 1 -I timedlabels_train.mlf -t -i 20 -l '$dir1/$file' -M '$dir2' -S train.scp";

    system($cmd);
}

Update

Here is an alternative version that avoids chdir so that the current working directory remains unchanged.

I have added the secondary loop that was in your bash script. I have also added a print statement so that you can see each command before it is executed.

To allow the system call to go ahead, just delete or comment out the next; statement that comes right after the print.

#!/usr/bin/perl

use strict;
use warnings;
use autodie;

use File::Spec::Functions qw/ catdir catfile /;

use IO::Handle;   # provides STDOUT->autoflush on Perl versions older than 5.14

STDOUT->autoflush;

my $root = '/Users/negarolfati/Documents/Detection_rerun/AF_TIMIT/1_state/trainHMMs';

for my $iter (1 .. 4) {

  my $dir1 = catdir $root, 'dir'.$iter;
  my $dir2 = catdir $root, 'dir'.($iter+1);

  opendir my ($dh), $dir1;

  while (my $node = readdir $dh) {
    my $file = catfile($dir1, $node);
    next unless -f $file;

    my $cmd = "HRest -T 1 -I timedlabels_train.mlf -t -i 20 -l '$file' -M '$dir2' -S train.scp";
    print $cmd, "\n";
    next;               # Remove for full functionality

    system($cmd);
  }

  closedir $dh;
}
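
One thing to be aware of: both versions above pass a single string to system, so the shell parses it, and a file name containing a single quote would break the quoting. Here is a rough sketch of the list form of system, which bypasses the shell. The helper build_hrest_args and the awkward file name are made up for illustration, and because I can't assume HRest is installed, the sketch runs the current Perl interpreter ($^X) as a stand-in that just echoes the arguments it receives.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the HRest command as a list of arguments,
# using the same options as the command string in the question
sub build_hrest_args {
    my ($file, $outdir) = @_;
    return (
        'HRest', '-T', 1, '-I', 'timedlabels_train.mlf',
        '-t', '-i', 20, '-l', $file, '-M', $outdir, '-S', 'train.scp',
    );
}

# A made-up file name with a quote and a space, to show no escaping is needed
my @cmd = build_hrest_args("dir1/it's L1", 'dir2');
print join(' ', @cmd), "\n";

# Stand-in for the real call: run this Perl interpreter instead of HRest
# and have it print each argument on its own line. The '--' stops perl
# from treating the HRest options as its own switches.
my @echo = ($^X, '-e', 'print "$_\n" for @ARGV', '--', @cmd[1 .. $#cmd]);
system(@echo) == 0
    or die "command failed: $?";
```

With the list form no shell is involved, so the single quotes around the paths are not needed. In the real loop the call would be system(build_hrest_args($file, $dir2)), assuming HRest is on your PATH.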

6 Comments

Thank you for such an impressive answer, Borodin! But when I try it, I get the following error: InitSource: Cannot open source file /Users/negarolfati/Documents/Detection_rerun/AF_TIMIT/1_state/trainHMMs/dir1//Users/negarolfati/Documents/Detection_rerun/AF_TIMIT/1_state/trainHMMs/dir1/L505. The $root path is repeated twice, but I cannot figure out what to change. Could you kindly help me?
Sorry, my mistake! It's working now, thank you so much for such a great help! :)
@Borodin, the question uses File::Slurp's read_dir, which does operate on a string containing a directory name and doesn't return . or ..
@Jim Davis, so you mean the way I used read_dir is also correct?
Yep. You still need to prefix the result with the directory name to get the full path, though
-1

You can do this:

my @files = <$path/*>;
foreach my $filename ( reverse(@files) ) {
...
}
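
To flesh that out a little, here is a minimal sketch under a few assumptions. It uses bsd_glob from the core File::Glob module rather than the angle-bracket form, because the built-in glob splits its pattern on whitespace and so misbehaves if $path contains a space. The scratch directory and the names L1, L2, L3 and subdir are stand-ins created only so the example runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use File::Glob 'bsd_glob';
use File::Temp 'tempdir';

# Hypothetical scratch directory standing in for $path
my $path = tempdir(CLEANUP => 1);

# Create three plain files and one subdirectory to match against
for my $name (qw/ L1 L2 L3 /) {
    open my $fh, '>', "$path/$name" or die "Cannot create $path/$name: $!";
    close $fh;
}
mkdir "$path/subdir" or die "Cannot create $path/subdir: $!";

# The glob returns full paths and does not match dot entries;
# the -f test then drops the subdirectory
my @files = grep { -f $_ } bsd_glob("$path/*");

print "$_\n" for @files;
```

Because the glob already returns full paths, each element of @files can go straight into the -l option without prefixing the directory name.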
