i have some html source code containing customer data that needs to be cleaned of html tags before being deployed with a line-joining string split.
i want to be able to target specific types of information. for example, a customer has a list of categories on their page, and each 'category' sits inside an easily distinguishable tag:
<span _ngcontent-jal-c67="" class="category-name">Cryptocurrency</span>
would it be possible to remove everything else that is not nested inside a similar html tag?
let's say, for example, i want everything that occurs inside of <span *>*</span>, so that every non-<span></span> tag and its contents would be removed. the contents of all the <span ***>***</span> would stay, without the tag.
is that something i could do in powershell?
let's avoid paste.exe and cygwin-type stuff. i'm looking for a standard, native windows approach (cmd or powershell).
again, i want to remove all tags.
the only contents i keep should be those found inside a specific tag, like <span _ngcontent-jal-c68="" class="category-name">Shopping</span>
everything that fits the <span *>*</span> profile
leave only the contents. no tag.
from: <span _ngcontent-jal-c32="" class="category-name">Home and Graden</span>
to: Home and Graden
i'm really looking for an answer on how to do this in powershell without needing to install anything or make any significant changes to the OS (windows 10)
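to make the goal concrete, here is a rough sketch of the behaviour i'm after, using only the built-in .NET regex class. it assumes the spans are simple and not nested inside each other (a regex is not a real html parser). the function name Get-SpanText and the sample html around the spans are made up for illustration.

```powershell
# Get-SpanText is a hypothetical helper name, not an existing cmdlet.
function Get-SpanText {
    param(
        [Parameter(Mandatory)]
        [string]$Html
    )

    # (?is)        : ignore case, and let '.' match newlines too
    # <span\b[^>]*>: an opening span tag with any attributes
    # (.*?)        : lazily capture everything up to the nearest </span>
    $pattern = '(?is)<span\b[^>]*>(.*?)</span>'

    foreach ($m in [regex]::Matches($Html, $pattern)) {
        # drop any tags left inside the captured text, then trim whitespace
        ($m.Groups[1].Value -replace '<[^>]+>', '').Trim()
    }
}

# made-up sample input built around the span tags quoted above
$sample = @'
<div class="profile"><h2>Customer</h2>
<span _ngcontent-jal-c67="" class="category-name">Cryptocurrency</span>
<p>other stuff that should be removed</p>
<span _ngcontent-jal-c68="" class="category-name">Shopping</span>
</div>
'@

# emits one string per span: Cryptocurrency, then Shopping
Get-SpanText -Html $sample
```

for a real page the input could presumably be read with something like Get-SpanText -Html (Get-Content -Raw .\customers.html), where customers.html is a placeholder file name.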