Bash, grep between two lines with specified string

Question

Example:

a43
test1
abc
cvb
bnm
test2
kfo

I need all lines between test1 and test2. Plain grep does not work in this case. Do you have any suggestions?

This answer might also be applicable: stackoverflow.com/a/48022994/2026975
– imriss, Commented Dec 29, 2017 at 16:55

Jotne · Accepted Answer · 2019-06-27 10:21:12Z

83

Print from test1 to test2 (Trigger lines included)

awk '/test1/{f=1} /test2/{f=0;print} f'
awk '/test1/{f=1} f; /test2/{f=0}' 
awk '/test1/,/test2/'

test1
abc
cvb
bnm
test2

Print the lines between test1 and test2 (Trigger lines excluded)

awk '/test1/{f=1;next} /test2/{f=0} f' 
awk '/test2/{f=0} f; /test1/{f=1}'

abc
cvb
bnm
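
Several comments ask what these commands mean and how to keep only one of the two trigger lines. As a rough sketch (the function names and the temporary file are made up for illustration and are not part of the answer): f is a flag that is set when a line matches test1 and cleared when a line matches test2, and a bare f is a pattern without an action, so awk prints the current line whenever f is non-zero. The order of the three rules decides which trigger lines are printed.

```shell
#!/usr/bin/env bash
# Sample input from the question, written to a temporary file.
sample=$(mktemp)
printf '%s\n' a43 test1 abc cvb bnm test2 kfo > "$sample"

# Include test1 but not test2:
# clear the flag on test2 first, set it on test1, then print if the flag is set.
include_start_only() {
  awk '/test2/{f=0} /test1/{f=1} f' "$1"
}

# Include test2 but not test1:
# print first (using the flag left by earlier lines), then update the flag.
include_end_only() {
  awk 'f; /test1/{f=1} /test2/{f=0}' "$1"
}

include_start_only "$sample"
echo "---"
include_end_only "$sample"
rm -f "$sample"
```

awk reads standard input when no file name is given, so a string held in a shell variable can be processed with something like printf '%s\n' "$text" | awk '/test1/{f=1;next} /test2/{f=0} f'.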

edited Jun 27, 2019 at 10:21

answered Mar 6, 2014 at 10:50

Jotne


9 Comments

user5059264 Over a year ago

Hello sir, do you know how I can use the same script in shell?

Mr. Developerdude Over a year ago

If my input is in a string, how would I use this to process that string?

Jotne Over a year ago

@LennartRolland See my post here about how to get data into awk from variables. stackoverflow.com/questions/19075671/…

Eddie Over a year ago

How would you modify this to include one trigger line, but not the other? i.e., include test1, but not test2

polynomial_donut Over a year ago

Would be better if you could explain what the commands here mean


devnull · Accepted Answer · 2014-03-06 10:19:58Z

60

You could use sed:

sed -n '/test1/,/test2/p' filename

In order to exclude the lines containing test1 and test2, say:

sed -n '/test1/,/test2/{/test1/b;/test2/b;p}' filename
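
For readers who want the semantics spelled out, here is a minimal annotated sketch, assuming GNU sed and the sample input from the question (the function names and the temporary file are illustrative only). -n turns off automatic printing, the address /test1/,/test2/ selects each range from a line matching test1 through the next line matching test2, and b with no label jumps to the end of the script, skipping p.

```shell
#!/usr/bin/env bash
# Sample input from the question, written to a temporary file.
sample=$(mktemp)
printf '%s\n' a43 test1 abc cvb bnm test2 kfo > "$sample"

# Range including both trigger lines: print every line inside the range.
sed_inclusive() {
  sed -n '/test1/,/test2/p' "$1"
}

# Range excluding the trigger lines: inside the range, skip (b) any line that
# matches either pattern and print (p) the rest.
sed_exclusive() {
  sed -n '/test1/,/test2/{/test1/b;/test2/b;p}' "$1"
}

# Shorter form: an empty regex // reuses the most recently used regex, which
# here is the range boundary that sed has just tested the line against.
sed_exclusive_short() {
  sed -n '/test1/,/test2/{//b;p}' "$1"
}

sed_exclusive "$sample"
rm -f "$sample"
```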

edited Mar 6, 2014 at 10:19

answered Mar 6, 2014 at 10:13

devnull


7 Comments

cp.engr Over a year ago

Or, if you are OK with shell expansion in your sed string, you can do start="test1"; end="test2"; sed -n "/$start/,/$end/{/$start/b;/$end/b;p}" filename. That way you only have to type each search pattern once.

123 Over a year ago

To exclude the lines it can be shortened to just sed -n '/test1/,/test2/{//b;p}'

wisbucky Over a year ago

What does //b mean? I just know b means branch, but I don't know about //

Rohan Ghige Over a year ago

Is it possible to print few lines before 'test1' string and few lines after 'test2' string? What modifications are required?

polynomial_donut Over a year ago

Would be better with semantics explained


philshem · Accepted Answer · 2014-03-06 10:18:55Z

16

If you can only use grep:

grep -A100000 test1 file.txt | grep -B100000 test2 > new.txt

grep -A and then a number gets the lines after the matching string, and grep -B gets the lines before the matching string. The number, 100000 in this case, has to be large enough to include all lines before and after.

If you don't want to include test1 and test2, then you can remove them afterwards by grep -v, which prints everything except the matching line(s):

egrep -v "test1|test2" new.txt > newer.txt

or everything in one line:

grep -A100000 test1 file.txt | grep -B100000 test2 | egrep -v "test1|test2" > new.txt

answered Mar 6, 2014 at 10:18

philshem


3 Comments

cp.engr Over a year ago

I didn't know about egrep, interesting, thanks. Details here. unix.stackexchange.com/questions/17949/…

Mr. Developerdude Over a year ago

just want to point out the obvious that this might fail if test1/test2 pairs occurs more than once in the input.

Aidin Over a year ago

@LennartRolland in which case, it takes the first occurrence of test1, and the first occurrence of test2 after that, which happened to be exactly what I wanted. :)

Avinash Raj · Accepted Answer · 2015-01-21 05:39:20Z

8

Yep, normal grep won't do this. But grep with the -P option can do the job.

$ grep -ozP '(?s)test1\n\K.*?(?=\ntest2)' file
abc
cvb
bnm

\K discards everything matched so far from the reported match, so test1 and the newline after it are not printed. The positive lookahead (?=\ntest2) asserts that the match must be followed by a newline and then the string test2, without including them in the output.
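
A comment below asks what the individual flags do. The following is a small annotated sketch, assuming GNU grep built with PCRE support (the -P option is not available in every grep, for example the one shipped with macOS); the function name and the temporary file are illustrative.

```shell
#!/usr/bin/env bash
# Sample input from the question, written to a temporary file.
sample=$(mktemp)
printf '%s\n' a43 test1 abc cvb bnm test2 kfo > "$sample"

# -P  interpret the pattern as a Perl-compatible regular expression
# -z  treat input and output as NUL-terminated records, so the whole file
#     (which contains no NUL bytes) is one record and a match can span newlines
# -o  print only the matched part instead of the whole record
# (?s) lets . match newlines; .*? is non-greedy, so it stops at the first test2.
grep_between() {
  # With -z each match is written with a trailing NUL; tr turns it into a newline.
  grep -ozP '(?s)test1\n\K.*?(?=\ntest2)' "$1" | tr '\0' '\n'
}

grep_between "$sample"
rm -f "$sample"
```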

answered Jan 21, 2015 at 5:39

Avinash Raj


2 Comments

user107172 Over a year ago

Which grep flavor is this? I'm on mac os and there is no -P. What meaning does -P have in the grep context above? It would also help to explain -oz if they are part of the solution. In mac os grep -o is --only-matching and -z is --decompress. The former might be relevant but the latter doesn't seem relevant.

Avinash Raj Over a year ago

-P stands for Perl-compatible regex, i.e. we can use Perl regex syntax in grep. Unfortunately, grep on Mac doesn't support this option.

pratpor · Accepted Answer · 2017-03-30 08:48:02Z

1

You can do something like this too. Let's say you have this file test.txt with the following content:

a43
test1
abc
cvb
bnm
test2
kfo

You can do

cat test.txt | grep -A10 test1 | grep -B10 test2

where -A<n> prints n lines after each match and -B<n> prints n lines before each match. You just have to make sure that n is greater than the number of lines expected between test1 and test2, or make it large enough to reach EOF.

Result:

test1
abc
cvb
bnm
test2

answered Mar 30, 2017 at 8:48

pratpor


1 Comment

aprodan Over a year ago

this is cool if you know number of lines, but is not so in real world when you need to get all lines between two markers. For example a list of files are packed into a tar, but you do not know which ones and you want full list. Then your solution will not work. Jotne's solution will work perfectly.

Tweeks · Accepted Answer · 2017-04-18 18:38:25Z

1

The answer by PratPor above:

cat test.txt | grep -A10 test1 | grep -B10 test2

is cool.. but if you don't know the file length:

cat test.txt | grep -A1000 test1 | grep -B1000 test2

Not deterministic, but not too bad. Does anyone have a better (more deterministic) approach?
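
One way to avoid guessing a context size while staying with grep is to look up the line numbers of the two markers and print exactly that range. This is only a sketch under the assumption that the first test1 and the first test2 after it are the ones wanted; the function name between_by_line_numbers and the temporary file are hypothetical.

```shell
#!/usr/bin/env bash
# Print the lines from the first match of start_re through the first
# following match of end_re (both trigger lines included).
between_by_line_numbers() {
  local start_re=$1 end_re=$2 file=$3 start offset
  # Line number of the first line matching the start pattern.
  start=$(grep -n -- "$start_re" "$file" | head -n 1 | cut -d: -f1)
  [ -n "$start" ] || return 1
  # Search only the lines after the start line; the number found is an offset.
  offset=$(tail -n "+$((start + 1))" "$file" | grep -n -- "$end_re" | head -n 1 | cut -d: -f1)
  [ -n "$offset" ] || return 1
  sed -n "${start},$((start + offset))p" "$file"
}

# Sample input from the question.
sample=$(mktemp)
printf '%s\n' a43 test1 abc cvb bnm test2 kfo > "$sample"
between_by_line_numbers test1 test2 "$sample"
rm -f "$sample"
```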

answered Apr 18, 2017 at 18:38

Tweeks


Comments

Dan · Accepted Answer · 2020-01-06 22:57:20Z

1

To make it deterministic without having to worry about the size of the file, use wc -l and cut to get the line count.

grep -A$(wc -l test.txt | cut -d" " -f1) test1 test.txt | grep -B$(wc -l test.txt | cut -d" " -f1) test2

To make it easier to read, assign it to a variable first.

fsize=$(wc -l test.txt | cut -d" " -f1); grep -A"$fsize" test1 test.txt | grep -B"$fsize" test2

answered Jan 6, 2020 at 22:57

Dan


Comments

Brad Parks · Accepted Answer · 2023-12-07 19:43:50Z

0

The following script wraps up this process. More details in this similar StackOverflow post

get_text.sh

#!/usr/bin/env bash
function show_help()
{
  ME=$(basename "$0")
  IT=$(cat <<EOF

  $ME: extracts lines in a file between two tags

  usage: $ME FILENAME {TAG_PREFIX|START_TAG} [END_TAG]

  examples:
    $ME 1.txt AA     => extracts lines in file 1.txt between AA_START and AA_END
    $ME 1.txt AA BB  => extracts lines in file 1.txt between AA and BB
EOF
)
  echo "$IT"
  echo
  exit
}

if [ "$1" == "help" ]
then
  show_help
fi
if [ -z "$2" ]
then
  show_help
fi

function doMain()
{
  FILENAME=$1
  # Quote the file name so paths containing spaces work, and exit non-zero on error
  if [ ! -f "$FILENAME" ]; then
      echo "File not found: $FILENAME"
      exit 1
  fi

  if [ -z "$3" ]
  then
    START_TAG=$2_START
    END_TAG=$2_END
  else
    START_TAG=$2
    END_TAG=$3
  fi

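  # Pass the tags to awk as variables instead of building a string for eval,
  # so spaces or quotes in the arguments cannot break the command
  awk -v start_re="$START_TAG" -v end_re="$END_TAG" \
    '$0 ~ start_re {f=1; next} $0 ~ end_re {f=0} f' "$FILENAME"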
}

doMain "$@"

edited Dec 7, 2023 at 19:43

answered Feb 10, 2016 at 17:34

Brad Parks


2 Comments

Ninja Over a year ago

I love this answer, small and strong!

Aidin Over a year ago

It just encouraged me to trust the awk answer!

Arav Stark · Accepted Answer · 2024-06-17 14:55:38Z

I wrote a script that also works when the search strings contain special characters.

#!/bin/bash
if  [ "$#" -ne 1 ]; then
    echo "Usage: $0 <logfile_name>"
    exit 1
fi

log_file="$1"

if [ ! -f "$log_file" ]; then
    echo "File not found: $log_file"
    exit 1
fi

# User Input
read -r -p "Enter 1st string: " str1   # -r keeps backslashes in the input literal
read -r -p "Enter 2nd string: " str2

# Escape special characters in str1 and str2 for sed
escaped_str1=$(sed 's/[^^]/[&]/g; s/\^/\\^/g' <<< "$str1")
escaped_str2=$(sed 's/[^^]/[&]/g; s/\^/\\^/g' <<< "$str2")

# Check if str1 and str2 exist in the file
if ! grep -q "$escaped_str1" "$log_file"; then
    echo -e "\e[31mString \"$str1\" not found in \"$log_file\"\e[0m"
    exit 1
fi

if ! grep -q "$escaped_str2" "$log_file"; then
    echo -e "\e[31mString \"$str2\" not found in \"$log_file\"\e[0m"
    exit 1
fi

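# Compare the line numbers of the first match of each string (not the strings
# themselves) and swap the patterns if str2 occurs before str1 in the file
line1=$(grep -n "$escaped_str1" "$log_file" | head -n 1 | cut -d: -f1)
line2=$(grep -n "$escaped_str2" "$log_file" | head -n 1 | cut -d: -f1)
if (( line2 < line1 )); then
    echo -e "\e[31mError: Second string \"$str2\" appears before first string \"$str1\" in the file $log_file\e[0m"
    echo -e "\e[32mSwapping strings: \"$str1\" and \"$str2\"\e[0m"
    tmp="$escaped_str1"
    escaped_str1="$escaped_str2"
    escaped_str2="$tmp"
fi

# Use sed to extract the lines between str1 and str2
sed -n "/$escaped_str1/,/$escaped_str2/p" "$log_file"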

This script takes a file as its argument, prompts the user for two strings, one after the other, and extracts the lines between them.

Note: the lines containing the two strings are also included in the output.
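
The escaping step in the script can be tried on its own. The sketch below uses the same sed expression as the script; the function name escape_for_bre is invented for illustration, and printf is used in place of the here-string. Wrapping each character in a bracket expression makes regex metacharacters such as . and * literal, and ^ is handled separately because [^] would start a negated bracket expression.

```shell
#!/usr/bin/env bash
# Wrap every character except ^ in [...], then backslash-escape each ^.
escape_for_bre() {
  printf '%s\n' "$1" | sed 's/[^^]/[&]/g; s/\^/\\^/g'
}

escape_for_bre 'a.b^'   # prints [a][.][b]\^

# The escaped pattern matches the literal text only: the . no longer matches x.
pattern=$(escape_for_bre 'a.b')
printf 'a.b\n' | grep -q "$pattern" && echo "literal a.b matches"
printf 'axb\n' | grep -q "$pattern" || echo "axb does not match"
```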
