What's the easiest way to convert XML from UTF16 to a UTF8 encoded file?

3 Answers

16

This may not be optimal, but it works. Simply load the XML and write it back out to a file. The XML declaration is lost in the process, though, so it has to be re-added.

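# Re-save every XML file in the current directory as UTF-8.
$files = Get-ChildItem "*.xml"
foreach ( $file in $files )
{
    # XmlDocument.Load works out the source encoding (e.g. UTF-16) from the BOM or declaration.
    $doc = New-Object System.Xml.XmlDocument
    $doc.PreserveWhitespace = $true
    $doc.Load( $file.FullName )

    # OuterXml of the root element omits the XML declaration, so prepend a UTF-8 one.
    $xml = '<?xml version="1.0" encoding="utf-8"?>' + $doc.DocumentElement.OuterXml

    # Write the result to <name>.new in the current directory.
    $newFile = $file.Name + ".new"
    Set-Content -Encoding UTF8 $newFile $xml
}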
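
As an alternative sketch (not part of the answer above): saving through an XmlWriter whose settings request UTF-8 should avoid having to re-add the declaration by hand, because the writer emits its own. The function name Convert-XmlToUtf8 and its parameters are made up for illustration. Pass full paths, since .NET resolves relative paths against the process directory rather than the PowerShell location.

```powershell
# Hypothetical helper; name and parameters are illustrative, not from the thread.
function Convert-XmlToUtf8 {
    param(
        [Parameter(Mandatory = $true)] [string] $Path,         # full path of the source file
        [Parameter(Mandatory = $true)] [string] $Destination   # full path of the file to write
    )

    $doc = New-Object System.Xml.XmlDocument
    $doc.PreserveWhitespace = $true
    $doc.Load($Path)

    # Ask the writer for UTF-8 without a byte order mark.
    $settings = New-Object System.Xml.XmlWriterSettings
    $settings.Encoding = New-Object System.Text.UTF8Encoding($false)

    $writer = [System.Xml.XmlWriter]::Create($Destination, $settings)
    try {
        # The writer produces the XML declaration using its own encoding.
        $doc.Save($writer)
    }
    finally {
        $writer.Close()
    }
}
```

It could be called per file, for example: Get-ChildItem *.xml | ForEach-Object { Convert-XmlToUtf8 -Path $_.FullName -Destination ($_.FullName + '.new') }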

3 Comments

Shouldn't you explicitly set the encoding to save somewhere?
If I knew how, I would. It appears to be the default though.
@Exotic Hadron: no, unless it's valid XML too.
16

Well, I guess the easiest way is to just not care about whether the file is XML or not and simply convert:

Get-Content file.foo -Encoding Unicode | Set-Content -Encoding UTF8 newfile.foo

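This will only work for XML when there is no

<?xml version="1.0" encoding="UTF-16"?>

line, because the declaration is copied through unchanged and would then contradict the file's actual UTF-8 encoding.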
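
If the declaration is present, one possible workaround is to patch its encoding value while converting. The sketch below is an assumption of mine rather than something from this answer: it treats the file as plain text and uses a regular expression instead of an XML parser, and the function name Convert-Utf16XmlTextToUtf8 is hypothetical.

```powershell
# Hypothetical helper; name and parameters are illustrative only.
function Convert-Utf16XmlTextToUtf8 {
    param(
        [Parameter(Mandatory = $true)] [string] $Path,         # full path of the UTF-16 source
        [Parameter(Mandatory = $true)] [string] $Destination   # full path of the UTF-8 output
    )

    # Read the file as UTF-16; a byte order mark, if present, is honored and stripped.
    $text = [System.IO.File]::ReadAllText($Path, [System.Text.Encoding]::Unicode)

    # Match a leading XML declaration up to the opening quote of its encoding value,
    # then replace only the value itself. Files without a declaration are left as-is.
    $pattern = '^(\s*<\?xml\b[^>]*?\bencoding\s*=\s*["''])[^"'']*'
    $text = $text -replace $pattern, '${1}utf-8'

    # Write UTF-8 without a byte order mark.
    $utf8 = New-Object System.Text.UTF8Encoding($false)
    [System.IO.File]::WriteAllText($Destination, $text, $utf8)
}
```

Because this never parses the XML, it will not notice a malformed document; it only keeps the declaration consistent with the bytes on disk.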

7 Comments

If you want to do it without creating a new file, you can wrap the Get-Content call in parentheses: (Get-Content File.foo) | Set-Content -Encoding UTF8 File.foo
How do you do this for files in a directory and subdirectories?
gci -rec -fi * | %{(gc $_ -enc unicode) | set-content -enc utf8 $_.fullname}. Fairly straightforward, actually.
@Joey, a small correction to your PowerShell script... gci -rec -fi * | %{(gc $_.fullname -enc unicode) | set-content -enc utf8 $_.fullname}
No need to use FullName there. Get-Content knows how to deal with a FileInfo.
9

Try this solution, which uses an XmlDocument to rewrite the XML declaration before saving:

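$encoding = "UTF-8" # most encodings should work
$files = Get-ChildItem "*.xml"
foreach ( $file in $files )
{
    # Read the file and parse its text as XML.
    [xml] $xmlDoc = Get-Content $file.FullName
    # Overwrite the existing XML declaration so that it names the new encoding.
    $xmlDoc.xml = $($xmlDoc.CreateXmlDeclaration("1.0",$encoding,"")).Value
    # Save() encodes the file according to the declaration.
    $xmlDoc.Save($file.FullName)
}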

See the XmlDocument documentation for more details on CreateXmlDeclaration.
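
The loop above assigns to the document's existing declaration, so it presumably needs one to be there. A minimal sketch that also covers documents without a declaration might look like the following; the function name Set-XmlDeclarationEncoding is hypothetical and not part of any library.

```powershell
# Hypothetical helper; name and parameters are illustrative only.
function Set-XmlDeclarationEncoding {
    param(
        [Parameter(Mandatory = $true)] [System.Xml.XmlDocument] $Document,
        [string] $Encoding = 'utf-8'
    )

    $first = $Document.FirstChild
    if ($first -is [System.Xml.XmlDeclaration]) {
        # A declaration exists: only change the encoding it names.
        $first.Encoding = $Encoding
    }
    else {
        # No declaration: create one and put it at the very top of the document.
        $declaration = $Document.CreateXmlDeclaration('1.0', $Encoding, $null)
        [void] $Document.InsertBefore($declaration, $first)
    }
}
```

After calling it, $Document.Save($file.FullName) can be used as in the loop above.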

2 Comments

Thank you very much for caring to provide such a brief, and technically better, answer to such an old question!
I had to get it done and I found this solution even before seeing this question. I felt it was normal to offer it. With little effort someone can even use it to copy while converting the encoding of the files. Regards.
