I created a file like this:

echo "test 1", Hello, foo, bar, "a better world", "test 2" > test.txt

and the result is this:

test 1
Hello
foo
bar
a better world
test 2

I need to remove all the text starting with the keyword "Hello" and ending with "world", including both keywords.

Something like this

test 1
test 2

I tried

$pattern='(?s)(?<=/Hello/\r?\n).*?(?=world)'
(Get-Content -Path .\test.txt -Raw) -replace $pattern, "" | Set-Content -Path .\test.txt

but nothing happened. What can I try?

  • It seems like you could do it with -replace '(?s)\s*Hello.*world' Commented Jan 11, 2023 at 16:36
  • @Leo Your post says "the text between the keywords"; please update your question to reflect what you actually want. Commented Jan 11, 2023 at 16:38
  • Nicely done, @Santiago - I suggest posting that as an answer (the only consideration worth mentioning is whether the .* should be greedy or not). Commented Jan 11, 2023 at 16:40
  • Thanks @mklement0, but I'm honestly still unclear on what OP wants. Commented Jan 11, 2023 at 16:40
  • @MathiasR.Jessen, sorry, I got confused: yes, my answer removes the keywords, because I believe that to be the OP's intent ("including both keywords"). Commented Jan 11, 2023 at 16:44

3 Answers

Assuming you want to remove the starting and ending keywords, you could use either (?s)\s*Hello.*world or (?s)\s*Hello.*?world, depending on whether you want .* to be greedy or lazy.

(Get-Content path\to\file.txt -Raw) -replace '(?s)\s*Hello.*world' |
    Set-Content path\to\result.txt

Use -creplace for case-sensitive matching of the keywords.
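
To make the greedy/lazy difference concrete, here is a minimal sketch on an in-memory string. The sample text with two Hello...world blocks and the variable names are assumptions for illustration; they are not from the question.

```powershell
# Hypothetical sample text containing two Hello...world blocks.
$text = "test 1`nHello`nfoo`nworld`nkeep me`nHello`nbar`nworld`ntest 2"

# Greedy: .* extends to the LAST "world", so "keep me" is removed as well.
$greedy = $text -replace '(?s)\s*Hello.*world'

# Lazy: .*? stops at the FIRST "world" after each "Hello", so "keep me" survives.
$lazy = $text -replace '(?s)\s*Hello.*?world'

# -creplace is case-sensitive: a lowercase "hello" does not match "Hello",
# so the text is returned unchanged.
$caseSensitive = $text -creplace '(?s)\s*hello.*world'
```

With a single Hello...world block, as in the question's file, both variants should give the same result.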

1 Comment

I also like adding the -NoNewline switch to the Set-Content command, just to avoid the extra empty line at the end of the file.
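
A small sketch of what that switch changes, assuming the string already ends with a newline (as text read with Get-Content -Raw typically does). The temporary file name is hypothetical.

```powershell
# Hypothetical temp file used only for this demonstration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'nonewline-demo.txt'
$text = "test 1`ntest 2`n"

# By default Set-Content appends a newline after the value it writes.
Set-Content -Path $path -Value $text
$withExtra = [System.IO.File]::ReadAllText($path)

# With -NoNewline the string is written exactly as given.
Set-Content -Path $path -Value $text -NoNewline
$exact = [System.IO.File]::ReadAllText($path)

Remove-Item -Path $path
```

Here $withExtra ends with one additional platform newline, while $exact is identical to $text.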

Leaving aside that there are extraneous / characters in your regex, reformulate it as follows (tip of the hat to Santiago Squarzon):

$pattern = '(?sm)^Hello\r?\n.*?world\r?\n'

(Get-Content -Path .\test.txt -Raw) -replace $pattern | 
  Set-Content -Path .\test.txt

This removes the line starting with Hello all the way through the (first) subsequent line that ends in world, including the next newline. This yields the desired output, as shown in your question.


As for what you tried:

Aside from the extraneous / chars., which prevent the pattern from matching at all, your primary problem is that you are using look-around assertions ((?<=...), (?=...)). What a look-around matches is not included in the overall match, so -replace does not remove it.
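
The following sketch shows both effects on an in-memory copy of the sample text. LF newlines and the variable names are assumptions; the patterns are the ones from the question and from this answer.

```powershell
# In-memory stand-in for the file content (LF newlines assumed).
$text = "test 1`nHello`nfoo`nbar`na better world`ntest 2`n"

# The original pattern requires the literal text "/Hello/" (slashes included),
# which never occurs, so nothing matches and the text comes back unchanged.
$original = $text -replace '(?s)(?<=/Hello/\r?\n).*?(?=world)', ''

# Without the slashes the pattern matches, but the look-arounds only assert:
# "Hello" and "world" are not part of the match, so both keywords remain.
$noSlashes = $text -replace '(?s)(?<=Hello\r?\n).*?(?=world)', ''

# Matching the keywords themselves removes them along with the text in between.
$fixed = $text -replace '(?sm)^Hello\r?\n.*?world\r?\n'
```

In this sketch $noSlashes still contains the Hello and world lines, whereas $fixed contains only the test 1 and test 2 lines.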

I think this is a duplicate of "How can I deleted lines from a certain position?" or of any of the other duplicates linked there:

'test1', 'Hello', 'foo', 'bar', 'world', 'test2' |SelectString -From '(?=Hello)' -To '(?<=world)'

1 Comment

Note that you're doing line-by-line processing, whereas the OP's attempt uses single-string, multi-line processing. I'm sure there are plenty of posts here that are variations of the same theme, though the specifics often warrant separate answers. Your custom SelectString function is a nice alternative approach, but, given that its name looks like Select-String, I suggest making it clear (here too) that a custom function is being used.
