writing shell script

Question

I want to write a shell script that reads a file from standard input, removes every <html> string and every empty line, and writes the result to standard output. The file looks like this:

#some lines that do not contain <html> in here
<html>a<html>
<tr><html>b</html></tr>
#some lines that do not contain <html> in here
<html>c</html>

So, the output file should contain:

#some lines that do not contain <html> in here
a
<tr>b</html></tr>
#some lines that do not contain <html> in here
c</html>

I tried to write this shell script:

read INPUT #read file from std input
tr -d '[:blank:]'
grep "<html>" | sed -r 's/<html>//g'
echo $INPUT

However, this script isn't working at all. Any ideas? Thanks.

You might want to try this in Perl (or something other than a certain shell), if possible; check out the answer(s) on this other question.
– summea, Commented Mar 19, 2013 at 19:50
I guess I don't understand why you have multiple <html></html> pairs in one document, as well...
– summea, Commented Mar 19, 2013 at 19:56
I don't know either. It's just some random file that our teacher gave us.
– Hanna Gabby, Commented Mar 19, 2013 at 19:59

Fredrik Pihl · Accepted Answer · 2013-03-19 20:17:33Z

1

Pure bash:

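#!/bin/bash

# IFS= and -r keep leading whitespace and backslashes intact
while IFS= read -r line
do
    #pass comment lines through unchanged
    [[ "$line" = "#"* ]] && { printf '%s\n' "$line"; continue; }
    #ignore empty lines
    [[ -z "$line" ]] && continue
    #remove every <html> from the line
    printf '%s\n' "${line//<html>/}"
done < "$1"

Output:

$ ./replace.sh input
#some lines that do not contain <html> in here
a
<tr>b</html></tr>
#some lines that do not contain <html> in here
c</html>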

Pure sed:

sed '/^#/!s/<html>//g' input | sed '/^$/d'
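
Because the asker may only use grep and sed and wants to read from standard input, the two stages can also be chained without a file argument. The following is a minimal sketch under that assumption; the function name strip_html is made up for illustration and is not part of the answer above. It treats lines starting with # as comments and leaves them unchanged, which matches the expected output in the question.

```shell
#!/bin/sh
# strip_html is a hypothetical helper name used only for this sketch.
strip_html() {
    # grep '.' keeps only lines that contain at least one character,
    # which drops empty lines.
    # sed then removes every <html> on lines that do not start with '#'.
    grep '.' | sed '/^#/!s/<html>//g'
}

# Demo: feed a few sample lines (including an empty one) through stdin.
printf '%s\n' \
    '#some lines that do not contain <html> in here' \
    '<html>a<html>' \
    '' \
    '<tr><html>b</html></tr>' \
    '<html>c</html>' | strip_html
```

Saved as a script, the bare pipeline would be called as ./script.sh < input. Using grep '.' to drop empty lines is just one option; sed '/^$/d' does the same job.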

edited Mar 19, 2013 at 20:17

answered Mar 19, 2013 at 20:03

4 Comments

Hanna Gabby Over a year ago

What does [[ "$line" = "\#" ]] mean? And I can only use grep and sed.

Fredrik Pihl Over a year ago

see comments in source above

Hanna Gabby Over a year ago

So the first sed removes <html>, but what does the second sed do?

Fredrik Pihl Over a year ago

The second one deletes empty lines.

Zsolt Botykai · Accepted Answer · 2013-03-19 19:54:48Z

1

Awk can do it easily:

awk '/./ {gsub("<html>","");print}' INPUTFILE

The /./ pattern selects only lines that contain at least one character, so empty lines are discarded. On each selected line, gsub replaces every "<html>" with an empty string, and then the line is printed.
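
If lines beginning with # should pass through untouched, as the expected output in the question suggests, the same idea can be extended with one extra rule. This is only a sketch: the wrapper name strip_html_awk is hypothetical, and it assumes a comment is any line whose first character is #.

```shell
#!/bin/sh
# strip_html_awk is a hypothetical wrapper name used only for this sketch.
strip_html_awk() {
    awk '
        # Comment line: print it unchanged and skip the remaining rules.
        /^#/ { print; next }
        # Any other non-empty line: delete every <html>, then print.
        /./  { gsub(/<html>/, ""); print }
    '
}

# Demo: empty lines match neither rule, so they are never printed.
printf '%s\n' \
    '#some lines that do not contain <html> in here' \
    '' \
    '<tr><html>b</html></tr>' | strip_html_awk
```

The rule order matters here: the comment rule ends with next, so the gsub rule never sees comment lines.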

answered Mar 19, 2013 at 19:54

3 Comments

Fredrik Pihl Over a year ago

OP needs comments to be preserved

Hanna Gabby Over a year ago

I can only use grep and sed. But what does /./ mean? Does it mean the current directory?

Fredrik Pihl Over a year ago

@HannaGabby - /./ is a regular expression that matches any single character, so it selects every non-empty line.
