PowerShell to remove a block of text from an HTML file (or variable)

Question

I have a number of HTML files and am trying to remove a specific block of text from them using PowerShell. This block appears in every table.

      <tr>
        <td colspan="3">
          <div id="reportbody">*TEXT*<a target="_blank" href=*LINK*</a></div>
        </td>
      </tr>

I can run -replace on the third line to stop the text and link from displaying, but that leaves a blank row in each table. I have tried something similar to this post, but I have no unique start/finish markers. Any help is greatly appreciated.

mjolinor · Accepted Answer · 2015-01-27 13:21:11Z

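One way is to describe the whole row, surrounding whitespace included, in a here-string regex and replace each match with nothing: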

$regex = 
@'
(?ms)\s*<tr>\s*
\s*<td colspan="3">\s*
\s*<div id="reportbody">\*TEXT\*<a target="_blank" href=\*LINK\*</a></div>\s*
\s*</td>\s*
\s*</tr>\s*
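'@

# -Raw (PowerShell 3.0+) reads the file as one string so the pattern can span lines;
# -replace with no replacement operand deletes every match.
(Get-Content ./file.htm -Raw) -replace $regex |
  Set-Content ./newfile.htm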
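
The same approach should also work on HTML that is already held in a variable, which the question title mentions. The following is a minimal sketch, not the answer's exact pattern: it assumes the real text and link differ between files (the question's *TEXT* and *LINK* being placeholders) and that neither contains angle brackets. The sample HTML, the example.com URL, and the names $html and $cleaned are made up for illustration.

```powershell
# Sample table held in a variable; the row text and link are made-up stand-ins.
$html = @'
<table>
  <tr>
    <td>Keep me</td>
  </tr>
  <tr>
    <td colspan="3">
      <div id="reportbody">Some text <a target="_blank" href="http://example.com">link</a></div>
    </td>
  </tr>
</table>
'@

# \s* between tags absorbs newlines and indentation, so the pattern does not
# depend on the line endings of the input. [^<]* and [^>]* stand in for the
# variable text and link attributes (assumption: neither contains < or >).
$regex = '\s*<tr>\s*<td colspan="3">\s*<div id="reportbody">[^<]*<a [^>]*>[^<]*</a></div>\s*</td>\s*</tr>'

# With no replacement operand, -replace removes every match.
$cleaned = $html -replace $regex
$cleaned
```

Only the row containing the reportbody div is matched, so the first row is left in place. The result in $cleaned can be piped to Set-Content in the same way as in the answer above.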

1 Comment

user3300840 Over a year ago

Excellent, many thanks mjolinor. I couldn't get my head round the multi-line regex.
