I have an HTML file that contains two marker lines with text between them:

<!-- START -->
asdf
<!-- END -->

Anything can stand between those two markers, and the data changes, so it is not the same every time. Is there a way to erase all lines between the two markers?

I have tried this regex:

(?sm)<!-- START -->.*?(?=^<!-- END -->)

but the match always starts at the first line (the START marker itself) and not on the line below it.

Can someone help me start the match after the START marker and then delete what is matched?

  • Use a parser that understands HTML. A regex doesn't work with HTML. Try, say, Html Agility Pack. Commented Aug 14, 2020 at 8:26
  • But it stops at the right place; only the beginning is wrong. Commented Aug 14, 2020 at 8:28
  • It will not select the second line due to the lookahead (?=^<!-- END -->). You could try a capturing group and use the group in the replacement: (?sm)<!-- START -->\r?\n(.*?)\r?\n<!-- END --> (regex101.com/r/CGun4i/1), but HTML and regex are usually not a good combination. Commented Aug 14, 2020 at 8:35
  • Yes, I did that and it's working: $regex=@' (?ms)^(\s*<!-- OPC-ITEM-ENTRIES START -->\s*?\r?\n).*?\r?\n(\s*<!-- OPC-ITEM-ENTRIES END -->\s*) '@ $delete = (Get-Content -raw $file) -replace $regex, '$1$2' $delete |Set-Content C:\Users\marku\Desktop\GEA\Powershell\mdi-opc-items.html Commented Aug 14, 2020 at 8:46
  • @s0Nic Ah yes, I suggested it the other way around :-) Wiktor Stribiżew provided the right answer with the explanation. Commented Aug 14, 2020 at 14:45

1 Answer

The main issue here is that your pattern matches (and so consumes) the left-hand delimiter without capturing it, so the delimiter is part of whatever gets replaced.

To match and erase arbitrary content between two multi-character delimiters, you need to either put both delimiters inside lookarounds:

-replace '(?<=left_hand_delim).*?(?=right_hand_delim)'

Or, use capturing groups in the regex and backreferences in the replacement:

-replace '(left_hand_delim).*?(right_hand_delim)', '$1$2'
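
As a minimal sketch of how the two forms behave, here they are applied to an in-memory string. The sample text and the variable names (`$html`, `$viaLookarounds`, `$viaGroups`) are made up for illustration, and the short `<!-- START -->` / `<!-- END -->` markers from the question stand in for the real ones:

```powershell
# Hypothetical sample input; only the two marker lines come from the question.
$html = "<p>before</p>`n<!-- START -->`nasdf`nmore data`n<!-- END -->`n<p>after</p>"

# Form 1: both delimiters sit inside lookarounds, so only the content is matched.
# The \r?\n inside the lookbehind keeps the line break after the START marker.
$viaLookarounds = $html -replace '(?s)(?<=<!-- START -->\r?\n).*?(?=<!-- END -->)', ''

# Form 2: both delimiters are captured and written back via $1 and $2.
# Single quotes stop PowerShell from expanding $1 and $2 as variables.
$viaGroups = $html -replace '(?s)(<!-- START -->\r?\n).*?(<!-- END -->)', '$1$2'

$viaLookarounds
```

Both variables should end up holding the same text, with the two marker lines directly following each other.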

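In your case, you may use

$regex='(?ms)(?<=^\s*<!-- OPC-ITEM-ENTRIES START -->\s*).*?(?=\s*<!-- OPC-ITEM-ENTRIES END -->)'
# Both markers are inside lookarounds, so there are no groups to restore: replace with an empty string
(Get-Content -raw $file) -replace $regex, '' | Set-Content $outfile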

See regex demo 1 and regex demo #2 (see Context tab).

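You must use the -Raw option so that the file contents are read as one multi-line string. Without it, Get-Content returns an array of lines and -replace is applied to each line separately, so a match can never span lines. You also need the s (singleline) flag to let . match any character, including newlines.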
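
The effect of -Raw can be checked with a small experiment. This is only a sketch: the temporary file, the shortened markers, and the variable names (`$perLine`, `$whole`) are assumptions for illustration, not part of the original script.

```powershell
# Hypothetical temp file standing in for the real HTML file.
$file = (New-TemporaryFile).FullName
Set-Content -Path $file -Value '<!-- START -->', 'asdf', '<!-- END -->'

$regex = '(?s)(?<=<!-- START -->\r?\n).*?(?=<!-- END -->)'

# Without -Raw: an array of three strings. -replace runs on each one separately,
# and no single line contains both markers, so nothing is removed.
$perLine = (Get-Content -Path $file) -replace $regex, ''

# With -Raw: one multi-line string, so the pattern can span the line breaks.
$whole = (Get-Content -Path $file -Raw) -replace $regex, ''

Remove-Item -Path $file
```

Here `$perLine` should still contain the `asdf` line, while `$whole` should contain only the two marker lines.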
