Is there an option or command similar to grep's -m1 or awk's nextfile for sed, which would allow sed to immediately stop processing the current input file when a match is found (while continuing to process subsequent input files)?

For example:

find . -type f -exec sed -ns '/pattern/{<do stuff>; p; <next>}' {} +

where <next> would be a command to cease reading the current input file. The quit command (q) is not suitable, since it simply causes sed to exit (abandoning subsequent input files), and would therefore find at most one match per batch of input files.

  • How about using \; instead of + so that sed sees only one input file at a time? Commented Jan 21, 2020 at 4:54
  • @Sundeep: I had thought about that, but I prefer to use + for performance reasons. Commented Jan 21, 2020 at 17:35
  • In that case, perhaps use awk or perl, as they can skip the rest of the lines. If your input is ASCII, using LC_ALL=C will give you a speed boost. You could also use xargs to parallelize. And do at least try \; with q: it isn't always easy to predict speed without actually running the tests. Commented Jan 22, 2020 at 3:33
  • @Sundeep: Thanks, LC_ALL=C is a good point in terms of performance. Commented Jan 22, 2020 at 7:28
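
For comparison, here is a minimal sketch of the awk route suggested above, using the nextfile statement the question mentions. The function name and the directory argument are hypothetical, "pattern" is a placeholder, and it assumes an awk that implements nextfile (gawk and recent mawk do).

```shell
# Sketch: print the first matching line of each file, then skip to the next file.
# first_match_per_file and its argument are illustrative names, not from the question.
first_match_per_file() {
    # nextfile abandons the current input file but keeps processing the remaining ones,
    # so find's + batching still works.
    find "$1" -type f -exec awk '/pattern/ { print FILENAME ": " $0; nextfile }' {} +
}
```

Calling first_match_per_file . should print at most one line per file, prefixed with the file name.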

1 Answer

Example with search and replace:

GNU sed

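Applies the substitution only up to and including the first occurrence of pattern in each file (thanks to sed's -s option and find's +). sed still reads each file to the end; the address range only limits where the command runs.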

find . -type f -exec sed -ns '0,/pattern/s/pattern/replacement/p' "{}" +

BSD sed

BSD sed seems to lack the -s option, so I'm using Sundeep's suggestion.

sed quits after the first occurrence of pattern, and find then runs sed on the next file.

find . -type f -exec sed -n '/pattern/{s/pattern/replacement/p;q;}' "{}" \;
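
As a rough illustration of the one-sed-per-file approach, the sketch below wraps a match-then-quit script in a function. The function name and the directory argument are hypothetical, and "pattern" and "replacement" are placeholders. It avoids the 0,/re/ address (a GNU extension) and puts the commands in a block guarded by the pattern, so q only fires on the first matching line.

```shell
# Sketch: for each file, print the first substituted line and stop reading that file.
# replace_first_per_file and its argument are illustrative names only.
replace_first_per_file() {
    # \; runs one sed per file, so q ends only the current file's run.
    # The trailing ; before } keeps the block acceptable to BSD sed as well.
    find "$1" -type f -exec sed -n '/pattern/{s/pattern/replacement/p;q;}' {} \;
}
```

The trade-off is the one raised in the comments: one process per file instead of one per batch.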
  • Thanks. The address range (0,/pattern/) does work, though it requires searching for the pattern twice (in the general case, <do stuff> might not involve a substitution on the pattern). Commented Jan 21, 2020 at 17:59
  • It should work with commands other than substitution as well. Commented Jan 21, 2020 at 19:06
  • To see that it does not generalize, compare the output of seq 10 | sed -n '0,/9/ s/9/A/p' and seq 10 | sed -n '0,/9/ ='. Also, the address range only restricts execution of those commands, but processing will continue until EOF, which may be undesirable for large input files. Commented Jan 21, 2020 at 22:51
  • There's one more option that might speed it up. You can run find . -type f -print0 | xargs -0 -P XX -L YY sed … which runs at most XX seds in parallel, each with up to YY arguments from find. But it's possible that your <do stuff> isn't a good case for this. Commented Jan 22, 2020 at 13:32
  • Thanks, that's a good point as well. In this case, unmixed output is preferred, so parallel could be used. Commented Jan 22, 2020 at 17:51
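
The comparison from the comments can be run directly. This is a small sketch assuming GNU sed, since the 0,/re/ address is a GNU extension; the variable names are only for illustration. The third command is added for contrast and shows a block that runs once on the matching line and then quits.

```shell
# Assumes GNU sed for the 0,/9/ address range.
subst_out=$(seq 10 | sed -n '0,/9/ s/9/A/p')   # s succeeds only on line 9, so only "A" is printed
lineno_out=$(seq 10 | sed -n '0,/9/ =')        # = runs on every line in the range: 1 through 9
quit_out=$(seq 10 | sed -n '/9/{=;q;}')        # = runs once on the matching line, then sed exits

printf 'substitution:   %s\n' "$subst_out"
printf 'line numbers:   %s\n' "$(printf '%s' "$lineno_out" | tr '\n' ' ')"
printf 'match and quit: %s\n' "$quit_out"
```

The substitution hides the problem because s does nothing on non-matching lines, whereas a command like = acts on every line the range covers.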
