I have a number of HTML files and am trying to remove a specific block of text from them using PowerShell. The block appears in every table:

      <tr>
        <td colspan="3">
          <div id="reportbody">*TEXT*<a target="_blank" href=*LINK*</a></div>
        </td>
      </tr>

I can run -replace on the third line to stop the text and link from displaying, but that leaves a blank row in each table. I have tried something similar to this post, but I have no unique start and end markers. Any help is greatly appreciated.

1 Answer

One way:

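# Match the whole row plus the whitespace around it, so no blank row is left behind.
# Joining the pieces with \s* avoids depending on literal line endings in the pattern.
# \*TEXT\* and \*LINK\* are the literal placeholders from the question.
$regex = @(
    '\s*<tr>'
    '<td colspan="3">'
    '<div id="reportbody">\*TEXT\*<a target="_blank" href=\*LINK\*</a></div>'
    '</td>'
    '</tr>\s*'
) -join '\s*'

# -Raw reads the file as one string; -replace with no replacement deletes the match.
(Get-Content ./file.htm -Raw) -replace $regex |
    Set-Content ./newfile.htm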
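
The pattern above matches the literal *TEXT* and *LINK* placeholders and processes a single file, while the question mentions a number of HTML files whose text and link presumably vary. As a rough sketch only, the same idea could be generalised as below. The function names `Remove-ReportRow` and `Update-ReportFile`, the `./reports` folder, and the use of `.*?` for the div content are all assumptions made for illustration, not part of the original answer.

```powershell
# Hypothetical helper: removes the reportbody row from an HTML string.
function Remove-ReportRow {
    param([Parameter(Mandatory)][string]$Html)

    # (?s) lets . match newlines in case the div content wraps;
    # .*? is lazy, so the match ends at the nearest </div></td></tr>.
    $rowPattern = @(
        '(?s)\s*<tr>'
        '<td colspan="3">'
        '<div id="reportbody">.*?</div>'
        '</td>'
        '</tr>'
    ) -join '\s*'

    # No replacement operand: the matched row is simply deleted.
    $Html -replace $rowPattern
}

# Hypothetical helper: applies Remove-ReportRow to every *.htm file in a folder,
# rewriting a file in place only when something was actually removed.
function Update-ReportFile {
    param([Parameter(Mandatory)][string]$Path)

    Get-ChildItem -Path $Path -Filter '*.htm' | ForEach-Object {
        $html    = Get-Content -LiteralPath $_.FullName -Raw
        $cleaned = Remove-ReportRow -Html $html
        if ($cleaned -cne $html) {
            Set-Content -LiteralPath $_.FullName -Value $cleaned -NoNewline
        }
    }
}

# Example call (assumed folder name):
# Update-ReportFile -Path './reports'
```

This overwrites files in place, so it is worth trying on a copy first. If the div can contain nested div elements, the lazy match may run past the intended row, and Set-Content may write a different encoding than the original file unless -Encoding is specified.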
1 Comment

Excellent, many thanks mjolinor. I couldn't get my head round the multi-line regex.