3

I have been learning Perl for the past two weeks and have been writing some Perl scripts for my school project. I need to parse a text file for multiple strings. I searched Perl forums and found some information. The function below parses a text file for one string and returns a result. However, I need the script to search the file for multiple strings.

use strict;
use warnings;


sub find_string {
    my ($file, $string) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    while (<$fh>) {
        return 1 if /\Q$string/;
    }
    die "Unable to find string: $string";
}

find_string('filename', 'string');

Now, for instance, suppose the file contains multiple strings, some with regular expression characters, as listed below:

"testing"
http://www.yahoo.com =1
http://www.google.com=2

I want the function to search for multiple strings, like this:

find_string('filename', 'string1','string2','string3');

Can somebody please explain how I should do that? It would be really helpful.

3 Answers

2

Going through this very quickly here:

Right now you pass the name of a file and one string. What if you passed multiple strings?

if ( find_string ( $file, @strings ) ) {
    print "Found a string!\n";
}
else {
    print "No string found\n";
}


..

sub find_string {
    my $file    = shift;
    my @strings = @_;
    #
    # Let's make the strings into a regular expression
    #
    my $reg_exp = join "|" ,@strings;   # Regex is $string1|$string2|$string3...

    open my $fh, "<", $file or die qq(Can't open file...);
    while ( my $line = <$fh> ) {
       chomp $line;
       if ( $line =~ $reg_exp ) {
           return 1;     # Found the string
       }
    }
    return 0;            # String not found
}

I am about to go into a meeting, so I haven't really even tested this, but the idea is there. A few things:

  • You want to handle characters in your strings that could be regular expression characters. You can use either the quotemeta command, or use \Q and \E before and after each string.
  • Think about adding use autodie to handle files that can't be opened. Then you don't have to check your open statement (like I did above).
  • There are limitations. This would be awful if you were searching for 1,000 different strings, but should be okay with a few.
  • Note how I use a scalar file handle ($fh). Instead of opening your file via the subroutine, I would pass in a scalar file handle. This would allow you to take care of an invalid file issue in your main program. That's the big advantage of scalar file handles: They can be easily passed to subroutines and stored in class objects.
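
As a rough sketch of the first point above, each string can be escaped with quotemeta before the strings are joined into one pattern. The helper name build_pattern is made up for illustration and is not part of the code in this answer; the sample line is taken from the question.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): escape every string,
# then join the escaped strings into a single alternation.
sub build_pattern {
    my @strings = @_;
    my $joined = join '|', map { quotemeta($_) } @strings;
    return qr/$joined/;
}

my $re = build_pattern('http://www.yahoo.com', 'fo+*o');

# The dots, plus, and star are now matched literally.
if ( 'http://www.yahoo.com =1' =~ $re ) {
    print "Found a string!\n";
}
else {
    print "No string found\n";
}
```

Without the escaping step, a string such as fo+*o would not even compile as a regular expression, and the dots in a URL would match any character.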

Tested Program

#! /usr/bin/env perl
#

use strict;
use warnings;
use autodie;
use feature qw(say);

use constant {
    INPUT_FILE =>       'test.txt',
};


open my $fh, "<", INPUT_FILE;

my @strings = qw(foo fo+*o bar fubar);

if ( find_string ( $fh, @strings ) ) {
    print "Found a string!\n";
}
else {
    print "No string found\n";
}

sub find_string {
    my $fh    = shift;          # The file handle
    my @strings = @_;           # A list of strings to look for

    #
    # We need to go through each string to make sure there's
    # no special re characters
    for my $string ( @strings ) {
        $string = quotemeta $string;
    }

    #
    # Let's join the strings into one big regular expression
    #
    my $reg_exp = join '|', @strings;   # Regex is $string1|$string2|$string3...
    $reg_exp = qr($reg_exp);            # This is now a regular expression

    while ( my $line = <$fh> ) {
        chomp $line;
        if ( $line =~ $reg_exp ) {
            return 1;     # Found the string
        }
    }
    return 0;            # String not found
}

  • autodie handles issues when I can't open a file. No need to check for it.
  • Notice I have three parameters in my open. This is the preferred way.
  • My file handle is $fh which allows me to pass it to my find_string subroutine. Open the file in the main program, and I can handle read errors there.
  • I loop through my @strings and use the quotemeta command to automatically escape special regular expression characters.
  • Note that when I change $string in my loop, it actually modifies the @strings array.
  • I use qr to create a regular expression.
  • My regular expression is /foo|fo\+\*o|bar|fubar/.
  • There are a few bugs. For example, the string fooburberry will match foo. Do you want that, or do you want your strings to match only as whole words?
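
If whole-word matching is what you want, one possible sketch is to wrap the alternation in \b word-boundary anchors. The name whole_word_pattern is hypothetical, and this assumes every search string starts and ends with a word character; \b behaves differently next to punctuation.

```perl
use strict;
use warnings;

# Hypothetical variant (name is an assumption): match a string only
# when it appears as a whole word.
sub whole_word_pattern {
    my @strings = @_;
    my $joined = join '|', map { quotemeta($_) } @strings;
    # (?:...) groups the alternatives so \b applies to all of them
    return qr/\b(?:$joined)\b/;
}

my $re = whole_word_pattern(qw(foo bar));

for my $line ( 'a foo here', 'fooburberry' ) {
    my $verdict = ( $line =~ $re ) ? 'match' : 'no match';
    print "$line: $verdict\n";
}
```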

3 Comments

Thanks for the response. Where do I need to specify the strings that I want to search for? I want to search for string1, string2, and string3. How do I make the function call? Pardon my ignorance; this is the first time I have taken a programming course.
You pass the strings to the function call like you originally did. Notice that find_string does my @strings = @_;. This takes everything left in @_ after the file name has been shifted off. You originally assumed that the @_ array would have two elements. I am assuming somewhere between two and infinity.
See the tested program in my answer
0

I'm happy to see use strict and use warnings in your script. Here is one basic way to do it.

use strict;
use warnings;


sub find_string {

    my ($file, $string1, $string2, $string3) = @_;

    my $found1 = 0;
    my $found2 = 0;
    my $found3 = 0;

    open my $fh, '<', $file or die "Cannot open $file: $!";
    while (<$fh>) {
        if ( /$string1/ ) {
            $found1 = 1;
        }
        if ( /$string2/ ) {
            $found2 = 1;
        }
        if ( /$string3/ ) {
            $found3 = 1;
        }
    }

    if ( $found1 == 1 and $found2 == 1 and $found3 == 1 ) {
        return 1;
    } else {
        return 0;
    }
}

my $result = find_string('filename', 'string1', 'string2', 'string3');

if ( $result == 1 ) {
    print "Found all three strings\n";
} else {
    print "Didn't find all three\n";
}
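
If the number of strings is not fixed at three, the same all-strings check can be sketched with a hash instead of one $found variable per string. This is only an illustration under a few assumptions: the name find_all_strings is made up, the function takes an already-opened file handle rather than a file name, and it uses index for a literal substring search instead of a regular expression.

```perl
use strict;
use warnings;

# Hypothetical generalization (name is an assumption): returns 1 only if
# every string occurs somewhere in the input, otherwise 0.
sub find_all_strings {
    my ($fh, @strings) = @_;
    my %missing = map { $_ => 1 } @strings;    # strings not seen yet

    while ( my $line = <$fh> ) {
        for my $string ( keys %missing ) {
            # index() searches for a literal substring, so no escaping is needed
            delete $missing{$string} if index( $line, $string ) >= 0;
        }
        last unless %missing;                  # everything found, stop early
    }
    return %missing ? 0 : 1;
}

# Demo on an in-memory file so the sketch runs without a real file on disk.
my $text = qq("testing"\nhttp://www.yahoo.com =1\nhttp://www.google.com=2\n);
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my $result = find_all_strings( $fh, 'testing', 'http://www.google.com' );
close $fh;

if ( $result == 1 ) {
    print "Found all strings\n";
} else {
    print "Didn't find all strings\n";
}
```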

2 Comments

Thanks for the reply. I need to find 5-10 strings. In that case I would need to create $found1 ... $foundN. Is there any other way to do it?
It would have been helpful if you just put that in your question to begin with. But yeah there are lots of ways to do it.
0

I think you can store the file content in an array first, then grep the array for each string.

use strict;
use warnings;

sub find_multi_string {
    my ($file, @strings) = @_; 
    # three-argument open, with an error check
    open my $fh, '<', $file or die "Cannot open $file: $!";
    #store the whole file in an array
    my @array = <$fh>;

    for my $string (@strings) {
        if (grep /\Q$string/, @array) {
            next;
        } else {
            die "Cannot find $string in $file";
        }   
    }   

    return 1;
}
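
Because this function dies when a string is missing, a caller that wants to keep running can wrap the call in eval. The sketch below is only an illustration: it repeats a compact copy of find_multi_string (my variation, with a checked three-argument open and \Q escaping) and writes a throwaway sample file with File::Temp so that it runs on its own. The sample contents come from the question.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Compact copy of find_multi_string, repeated here so the sketch is runnable.
sub find_multi_string {
    my ($file, @strings) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my @lines = <$fh>;    # store the whole file in an array
    close $fh;

    for my $string (@strings) {
        die "Cannot find $string in $file\n"
            unless grep { /\Q$string/ } @lines;
    }
    return 1;
}

# Write a small sample file (removed automatically when the program exits).
my ($out, $filename) = tempfile( UNLINK => 1 );
print {$out} qq("testing"\nhttp://www.yahoo.com =1\nhttp://www.google.com=2\n);
close $out;

# eval catches the die, so a missing string does not end the program.
my $ok = eval { find_multi_string( $filename, 'testing', 'http://www.yahoo.com' ) };
if ($ok) {
    print "Found every string\n";
}
else {
    print "Search failed: $@";
}
```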

1 Comment

How do I make the function call for this? find_multi_string('filename', 'string1','string2','string3'); Will this work?
