I am getting grey hair over this. I need to encode strings as UTF-8 in PowerShell. My reference code is in Java (and works as intended within the larger application), so I need to reproduce what it does.
In Java, I do:
private static final char[] HEX_ARRAY = "0123456789ABCDEF".toCharArray();

public static String bytesToHex(byte[] bytes) {
    char[] hexChars = new char[bytes.length * 2];
    for (int j = 0; j < bytes.length; j++) {
        int v = bytes[j] & 0xFF;
        hexChars[j * 2] = HEX_ARRAY[v >>> 4];
        hexChars[j * 2 + 1] = HEX_ARRAY[v & 0x0F];
    }
    return new String(hexChars);
}

public static void main(String[] args) throws Exception {
    System.out.println(bytesToHex("aöß".getBytes("UTF8")));
}
which outputs 61C3B6C39F.
In PowerShell, I do:
Write-Output $(([System.Text.UTF8Encoding]::New($false, $true).getBytes("aöß") | ForEach-Object ToString X2) -join '')
which outputs 61C383C2B6C383C5B8.
Why are they different? How can I make the PowerShell encoding match the Java one?
I would be very grateful for any insights!
Best eDude
EDIT: Ok, now I am more confused. When running the above command in the PowerShell 5.1 console, it works as expected. When putting it into a script file and executing that, it does not.
EDIT 2: More info: if the script file is saved in UTF-8 encoding, the error appears. If it is saved in another encoding (e.g. Notepad++'s ANSI), it works. Why does the encoding of the script file change the behavior of the script itself? How can I prevent this and get consistent results?
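One explanation that fits both outputs: the UTF-8 script file has no BOM, so Windows PowerShell 5.1 decodes it with the system ANSI code page, and the string literal is already garbled before getBytes runs. The Java sketch below reproduces that effect under the assumption that the ANSI code page is Windows-1252. The class name EncodingMismatchDemo and the helper misreadAsWindows1252 are made up for illustration, and the literal is written with Unicode escapes so the result does not depend on how the Java source file itself is saved.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingMismatchDemo {
    private static final char[] HEX_ARRAY = "0123456789ABCDEF".toCharArray();

    // Same hex conversion as in the question.
    static String bytesToHex(byte[] bytes) {
        char[] hexChars = new char[bytes.length * 2];
        for (int j = 0; j < bytes.length; j++) {
            int v = bytes[j] & 0xFF;
            hexChars[j * 2] = HEX_ARRAY[v >>> 4];
            hexChars[j * 2 + 1] = HEX_ARRAY[v & 0x0F];
        }
        return new String(hexChars);
    }

    // Hypothetical helper: take the UTF-8 bytes as they would sit in the
    // script file and decode them as Windows-1252 (assumed ANSI code page).
    static String misreadAsWindows1252(String text) {
        byte[] utf8OnDisk = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8OnDisk, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        String literal = "a\u00F6\u00DF"; // "aöß"

        // Correctly decoded literal, encoded once as UTF-8.
        System.out.println(bytesToHex(literal.getBytes(StandardCharsets.UTF_8)));
        // prints 61C3B6C39F

        // Misread literal, then encoded as UTF-8 again (double encoding).
        String misread = misreadAsWindows1252(literal);
        System.out.println(bytesToHex(misread.getBytes(StandardCharsets.UTF_8)));
        // prints 61C383C2B6C383C5B8
    }
}
```

The second line matches the output of the script file byte for byte, which suggests the problem is in how the file is read rather than in UTF8Encoding. If that is the cause, saving the script as UTF-8 with a BOM should make Windows PowerShell 5.1 decode the literal correctly; PowerShell 7 treats BOM-less files as UTF-8 by default.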
I get 61c3b6c39f in PowerShell 5.1, 7.1 and 7.2. Which version are you using?