What's the easiest way to convert XML from UTF16 to a UTF8 encoded file?
3 Answers
This may not be optimal, but it works: load the XML and write it back out to a file. The XML declaration is lost in the process, so it has to be re-added.
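# Re-encode every XML file in the current directory as UTF-8.
$files = Get-ChildItem "*.xml"
foreach ($file in $files)
{
    $doc = New-Object System.Xml.XmlDocument
    $doc.PreserveWhitespace = $true
    # Pass the full path: .NET resolves relative paths against the process
    # working directory, which can differ from the PowerShell location.
    $doc.Load($file.FullName)
    # OuterXml of the root element drops the original declaration, so prepend a UTF-8 one.
    $xml = '<?xml version="1.0" encoding="utf-8"?>' + $doc.DocumentElement.OuterXml
    $newFile = $file.Name + ".new"
    Set-Content -Encoding UTF8 $newFile $xml
}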
Well, I guess the easiest way is to just not care about whether the file is XML or not and simply convert:
Get-Content file.foo -Encoding Unicode | Set-Content -Encoding UTF8 newfile.foo
This will only work for XML when there is no
<?xml version="1.0" encoding="UTF-16"?>
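line, because the declaration is copied through unchanged and would then contradict the file's actual UTF-8 encoding.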
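If the file does carry such a declaration, one option is to patch the encoding name in the same pass. Here is a minimal sketch, assuming the input is little-endian UTF-16; the function name Convert-Utf16TextToUtf8 and its parameters are made up for illustration and are not part of the answer above:

```powershell
# Hypothetical helper (name and parameters are illustrative only).
function Convert-Utf16TextToUtf8 {
    param(
        [Parameter(Mandatory = $true)] [string] $Path,
        [Parameter(Mandatory = $true)] [string] $Destination
    )
    # .NET file APIs ignore the PowerShell location, so turn both paths into absolute ones.
    $src = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($Path)
    $dst = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($Destination)

    # "Unicode" in .NET means UTF-16 little endian; a leading BOM is consumed on read.
    $text = [System.IO.File]::ReadAllText($src, [System.Text.Encoding]::Unicode)

    # Rewrite only the encoding name inside a leading XML declaration, if there is one.
    $text = $text -replace '^(<\?xml[^>]*?encoding=["''])[^"'']+', '${1}utf-8'

    # UTF8Encoding($false) writes UTF-8 without a byte order mark.
    [System.IO.File]::WriteAllText($dst, $text, (New-Object System.Text.UTF8Encoding $false))
}
```

Usage would be along the lines of `Convert-Utf16TextToUtf8 -Path file.foo -Destination newfile.foo`. Files without a declaration pass through with only the encoding changed.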
7 Comments
Jaykul
If you want to do it without creating a new file, you can wrap the Get-Content in parentheses: (Get-Content File.foo) | Set-Content -Encoding UTF8 File.foo
stormwild
How do you do this for files in a directory and subdirectories?
Joey
gci -rec -fi * | %{(gc $_ -enc unicode) | set-content -enc utf8 $_.fullname}. Fairly straightforward, actually.
Tim Friesen
@Joey, a small correction on your powershell script...
gci -rec -fi * | %{(gc $_.fullname -enc unicode) | set-content -enc utf8 $_.fullname}
Joey
No need to use FullName there. Get-Content knows how to deal with a FileInfo.
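One caveat for the recursive one-liners above: with `-fi *` they read every file as UTF-16, including files that are already UTF-8. A more defensive sketch follows, assuming the UTF-16 files start with the little-endian byte order mark (FF FE). Test-Utf16LeBom and Convert-Utf16TreeToUtf8 are hypothetical names, and the declaration caveat from the answer still applies because this is a plain text conversion:

```powershell
# Hypothetical helper: $true when the file starts with the UTF-16 LE BOM (FF FE).
# Note: a UTF-32 LE BOM (FF FE 00 00) would also pass this simple check.
function Test-Utf16LeBom {
    param([string] $Path)
    $stream = [System.IO.File]::OpenRead($Path)
    try {
        $buffer = New-Object byte[] 2
        $read = $stream.Read($buffer, 0, 2)
        return ($read -eq 2 -and $buffer[0] -eq 0xFF -and $buffer[1] -eq 0xFE)
    }
    finally {
        $stream.Dispose()
    }
}

# Hypothetical wrapper: converts matching files under $Root in place.
function Convert-Utf16TreeToUtf8 {
    param([string] $Root = '.', [string] $Filter = '*.xml')
    Get-ChildItem -Path $Root -Recurse -Filter $Filter |
        Where-Object { -not $_.PSIsContainer -and (Test-Utf16LeBom $_.FullName) } |
        ForEach-Object {
            # Read the whole file first, then overwrite it as UTF-8 without a BOM.
            $text = [System.IO.File]::ReadAllText($_.FullName, [System.Text.Encoding]::Unicode)
            [System.IO.File]::WriteAllText($_.FullName, $text, (New-Object System.Text.UTF8Encoding $false))
        }
}
```

Calling `Convert-Utf16TreeToUtf8 -Root .` would then leave files without the BOM untouched.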
Try this solution that uses an XmlWriter:
$encoding = "UTF-8" # most encodings should work
$files = get-ChildItem "*.xml"
foreach ( $file in $files )
{
[xml] $xmlDoc = get-content $file
$xmlDoc.xml = $($xmlDoc.CreateXmlDeclaration("1.0",$encoding,"")).Value
$xmlDoc.save($file.FullName)
}
You may want to look at the XmlDocument documentation for more details on CreateXmlDeclaration.
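The snippet above never creates an XmlWriter itself; it relies on XmlDocument.Save(path) taking the output encoding from the declaration. For comparison, here is a minimal sketch that drives an explicit XmlWriter. Save-XmlAsUtf8 and its parameters are hypothetical names, and the choice of UTF-8 without a byte order mark is an assumption:

```powershell
# Hypothetical helper (name and parameters are illustrative only).
function Save-XmlAsUtf8 {
    param(
        [Parameter(Mandatory = $true)] [string] $Path,
        [Parameter(Mandatory = $true)] [string] $Destination
    )
    # .NET file APIs ignore the PowerShell location, so turn both paths into absolute ones.
    $src = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($Path)
    $dst = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($Destination)

    $doc = New-Object System.Xml.XmlDocument
    $doc.PreserveWhitespace = $true
    $doc.Load($src)   # the parser detects UTF-16 from the BOM or the declaration

    $settings = New-Object System.Xml.XmlWriterSettings
    # The writer's encoding decides both the bytes and the declaration it emits.
    $settings.Encoding = New-Object System.Text.UTF8Encoding $false

    $writer = [System.Xml.XmlWriter]::Create($dst, $settings)
    try {
        $doc.Save($writer)
    }
    finally {
        $writer.Dispose()
    }
}
```

Writing to a separate destination, for example `Save-XmlAsUtf8 -Path in.xml -Destination out.xml`, also covers the copy-while-converting case.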
2 Comments
gimpf
Thank you very much for caring to provide such a brief, and technically better, answer to such an old question!
LMA1980
I had to get it done and I found this solution even before seeing this question. I felt it was normal to offer it. With little effort someone can even use it to copy while converting the encoding of the files. Regards.