1

I am having a terrible time learning Perl regular expressions. I am trying to:

  • Replace all occurrences of a single # at the beginning of a line with: #####.
  • Replace all occurrences of a full line of # characters (ignoring leading or trailing spaces) with
    # ---------- #.

I know it's s/# but that's all I know and all I can find. Any suggestions?

1
  • Possible duplicate of stackoverflow.com/questions/1030787/… I think you also need to define what follows the # on lines that already contain one: is it a space or a letter? You do not want to replace lines that already start with ##### and end up with duplicated ##### characters. Commented May 2, 2013 at 13:54

2 Answers

4

The beginning of a line is matched by ^. Therefore, a line starting with a # is matched by

/^#/

If you want the # to be single, i.e. not followed by another #, you can add a negated character class:

/^#[^#]/

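The character class, however, consumes the character after the #, so that character would be replaced too, and a line consisting of a lone # would not match at all. To avoid this, we use a zero-width assertion (called negative look-ahead) that checks the next character without matching it: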

/^#(?!#)/

To add the replacement, just change it to

s/^#(?!#)/#####/
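
As a minimal sketch of how this substitution behaves on a few sample lines (the helper name expand_single_hash and the sample lines are made up for illustration, they are not part of the question):

```perl
use strict;
use warnings;

# Hypothetical helper: expand a single leading '#' to '#####'.
sub expand_single_hash {
    my ($line) = @_;
    # (?!#) asserts that the next character is not '#', without consuming it.
    $line =~ s/^#(?!#)/#####/;
    return $line;
}

# Illustrative sample lines only.
for my $line ('# comment', '## double', '#', 'code # trailing') {
    printf "%-16s => %s\n", $line, expand_single_hash($line);
}
```

Only the first and third lines should change: '## double' fails the look-ahead, and 'code # trailing' does not start with a #. A lone '#' is still expanded because the look-ahead also succeeds at the end of the string.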

The full line can be matched by the following regular expression:

/^#+$/

+ means "one or more". ^ anchors the start of the line as before, and $ anchors the end. We just have to ignore the leading and trailing spaces (* means "zero or more"):

/^ *#+ *$/

We do not want the spaces to be replaced, so we have to keep them. Parentheses create "capture groups" that are numbered from 1:

s/^( *)#+( *)$/$1# ---------- #$2/
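
A small sketch of the capture groups in action may help here (the helper name rule_line and the sample lines are hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper: turn a line made only of '#' into a ruler,
# keeping any leading and trailing spaces.
sub rule_line {
    my ($line) = @_;
    # $1 and $2 hold the spaces captured by the first and second group.
    $line =~ s/^( *)#+( *)$/$1# ---------- #$2/;
    return $line;
}

# Illustrative sample lines only; quotes make the padding visible.
for my $line ('#####', '  ###  ', '## not a rule') {
    printf "'%s' => '%s'\n", $line, rule_line($line);
}
```

The padded line keeps its two spaces on each side, and the last line is left alone because it contains characters other than # and spaces.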

3 Comments

Don't you need to pick up the second character and add it to the replacement (or it will get replaced)? s/^#([^#])|^#$/#####\1/
@KlasLindbäck: Indeed. I started with look-ahead but removed it later to simplify things :) So back to the original version.
/^#(?!#)/ is more correct for the first one than /^#(?=[^#])/
2

For your first replacement:

$line =~ s/^#/#####/;

The idea here is that you want any line that starts with a '#'. The '^' in the regex says that what follows must be at the beginning of the string.

And for your second replacement:

$line =~ s/^#+$/# ---------- #/;

This uses '^' again and '$'. The '$' at the end says that what comes before must go to the end of the string. '#+' says that there must be one or more '#' characters. So, in other words, the entire string must consist of '#'.

Here's a test script and run:

$ cat foo.pl
#! /usr/bin/perl

use strict;
use warnings;

my @lines = (
        "foo line",
        "# single comment",
        "another line",
        "#############",
        "# line",
        "############",
);

foreach my $line( @lines ){
        print "ORIGINAL:  $line\n";
        $line =~ s/^#/#####/;
        $line =~ s/^#+$/# ---------- #/;
        print "NEW:       $line\n";
        print "\n";
}

$ ./foo.pl
ORIGINAL:  foo line
NEW:       foo line

ORIGINAL:  # single comment
NEW:       ##### single comment

ORIGINAL:  another line
NEW:       another line

ORIGINAL:  #############
NEW:       # ---------- #

ORIGINAL:  # line
NEW:       ##### line

ORIGINAL:  ############
NEW:       # ---------- #
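
The script above handles one line per string. If the whole text sits in a single multi-line string instead, the same two patterns can probably be reused by adding the /m and /g modifiers. This is only a sketch under that assumption; the helper name beautify and the sample text are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: apply both substitutions to a multi-line string.
sub beautify {
    my ($text) = @_;
    # /m makes ^ and $ match at every line boundary; /g repeats per line.
    # Order matters: if the ruler replacement ran first, its output would
    # start with a single '#' and be expanded by the other substitution.
    $text =~ s/^#/#####/mg;
    $text =~ s/^#+$/# ---------- #/mg;
    return $text;
}

# Illustrative sample text only.
my $text = "foo line\n# single comment\n#############\n";
print beautify($text);
```

Without /m, ^ would only match at the very start of the string, so only the first line could ever be changed.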

2 Comments

Not sure if the assignment was supposed to be this hard, but ## Double comment shouldn't be replaced if I understand the question correctly. If this is supposed to be a "comment beautifier" then the person writing the question might not have considered the possibility of line starting with two hashes.
True. I think I mentally distinguished "single comment" from "lone comment" when I answered. Oh, natural language. I think @choroba sufficiently covered what's necessary, though.
