As you are already aware, the Visual Basic Editor is a non-Unicode app and uses a system-locale-dependent 8-bit character encoding (ANSI) for the source code. The string literals in the source code are then transcoded from this locale-specific non-Unicode codepage to UTF-16 at some point before or during compilation.
As I can not reproduce the behavior you are describing
...it also shows as Greek for the user in which there is no issue.
I can only offer a workaround in this answer, because as far as I know (and as per my testing) the thing you describe as issue is the expected behavior:
correctString = "Empfänger" 'Expected for users with e.g. German system-locale
incorrectString = "Empfδnger" 'Expected for users with Greek system-locale
If you use string literals in code (for which there are definitely many legitimate use cases) only characters from the ASCII codepoint range will behave consistently across all system locales.
You can therefore work around this problem by using some functions from my library VBA-StringTools.
The functions you need are the following:
'Replaces all occurences of unicode characters outside the codePoint range
'defined by maxNonEncodedCharCode with literals of the following formattings:
' \uXXXX for characters inside the basic multilingual plane
' \uXXXXXXXX for characters outside the basic multilingual plane
'Where:
' Xes are the digits of the codepoint in hexadecimal. (X = 0-9 or A-F)
Public Function EscapeUnicode(ByRef str As String, _
Optional ByVal maxNonEncodedCharCode As Long = &HFF) _
As String
Dim codepoint As Long
Dim i As Long
Dim j As Long: j = 1
Dim result() As String: ReDim result(1 To Len(str))
For i = 1 To Len(str)
codepoint = AscW(Mid$(str, i, 1)) And &HFFFF&
If codepoint >= &HD800& Then codepoint = AscU(Mid$(str, i, 2))
If codepoint > &HFFFF& Then 'Outside BMP
result(j) = "\u" & "00" & Right$("0" & Hex(codepoint), 6)
i = i + 1
ElseIf codepoint > maxNonEncodedCharCode Then 'BMP
result(j) = "\u" & Right$("00" & Hex(codepoint), 4)
Else
result(j) = Mid$(str, i, 1)
End If
j = j + 1
Next i
EscapeUnicode = Join(result, "")
End Function
#If Mac = 0 Then
'Replaces all occurences of unicode literals
'Accepts the following formattings `escapeFormat`:
' efPython = 1 … \uXXXX \u00XXXXXX (4 or 8 hex digits, 8 for chars outside BMP)
' efRust = 2 … \u{XXXX} \U{XXXXXX} (1 to 6 hex digits)
' efUPlus = 4 … u+XXXX u+XXXXXX (4 or 6 hex digits)
' efMarkup = 8 … &#ddddddd; (1 to 7 decimal digits)
'Where:
' - prefixes \u is case insensitive
' - Xes are the digits of the codepoint in hexadecimal. (X = 0-9 or A-F/a-f)
'Example:
' - "abcd au+0062\U0063xy\u{64}", efAll returns "abcd abcxyd"
'Notes:
' - Avoid u+XXXX syntax if string contains literals without delimiters as it
' can be misinterpreted if adjacent to text starting with 0-9 or a-f.
' - This function can be slow for very long input strings with many
' different literals
Private Function UnescapeUnicode(ByRef str As String, _
Optional ByVal allowSingleSurrogates As Boolean = False) _
As String
Const PATTERN_UNICODE_LITERALS As String = _
"\\u00[01][0-9a-f]{5}|\\u[0-9a-f]{4}|" & _
"\\u{[0-9a-f]{1,6}}|u\+[0-9a-f]{4,6}|&#\d{1,7};"
With CreateObject("VBScript.RegExp")
.Global = True
.MultiLine = True
.IgnoreCase = True
.Pattern = PATTERN_UNICODE_LITERALS
Dim mc As Object: Set mc = .Execute(str)
End With
Dim match As Variant
Dim codepoint As Long
Dim dupeCheck As Collection
Set dupeCheck = New Collection
Dim isDuplicate As Boolean
For Each match In mc
Dim mv As String: mv = match.value
On Error Resume Next
dupeCheck.Add 1, mv
isDuplicate = Err.number <> 0
On Error GoTo 0
If Not isDuplicate Then
If Left$(mv, 1) = "&" Then
codepoint = CLng(Mid$(mv, 3, Len(mv) - 3))
Else
If Mid$(mv, 3, 1) = "{" Then
codepoint = CLng("&H" & Mid$(mv, 4, Len(mv) - 4))
Else
codepoint = CLng("&H" & Mid$(mv, 3))
End If
End If
If codepoint < &H110000 Then
If codepoint < &HD800& Or codepoint >= &HE000& _
Or allowSingleSurrogates Then _
str = Replace(str, mv, ChrU(codepoint))
End If
End If
Next match
UnescapeUnicode = str
End Function
#End If
'Returns the given unicode codepoint as standard VBA UTF-16LE string
Public Function ChrU(ByVal codepoint As Long, _
Optional ByVal allowSingleSurrogates As Boolean = False) As String
Const methodName As String = "ChrU"
If codepoint < &H8000 Then Err.Raise 5, methodName, "Codepoint < -32768"
If codepoint < 0 Then codepoint = codepoint And &HFFFF& 'Incase of uInt input
If codepoint < &HD800& Then
ChrU = ChrW$(codepoint)
ElseIf codepoint < &HE000& And Not allowSingleSurrogates Then
Err.Raise 5, methodName, "Range reserved for surrogate pairs"
ElseIf codepoint < &H10000 Then
ChrU = ChrW$(codepoint)
ElseIf codepoint < &H110000 Then
codepoint = codepoint - &H10000
ChrU = ChrW$(&HD800& Or (codepoint \ &H400&)) & _
ChrW$(&HDC00& Or (codepoint And &H3FF&))
Else
Err.Raise 5, methodName, "Codepoint outside of valid Unicode range."
End If
End Function
'Returns a given characters unicode codepoint as long.
'Note: One unicode character can consist of two VBA "characters", a so-called
' "surrogate pair" (input string of length 2, so Len(char) = 2!)
Public Function AscU(ByRef char As String) As Long
AscU = AscW(char) And &HFFFF&
If Len(char) > 1 Then
Dim lo As Long: lo = AscW(Mid$(char, 2, 1)) And &HFFFF&
If &HDC00& > lo Or lo > &HDFFF& Then Exit Function
AscU = (AscU - &HD800&) * &H400& + (lo - &HDC00&) + &H10000
End If
End Function
When developing the macro, you use the EscapeUnicode function to replace all Unicode characters outside the ASCII range with escape sequences. For the string in your question, you would put the following line into the immediate window:
?EscapeUnicode("Empfänger", 127) '127 is the last character in the ASCII range
which will return the escaped string Empf\u00E4nger.
In your actual code, you now use this string together with the UnescapeUnicode function:
correctString = UnescapeUnicode("Empf\u00E4nger")
The variable will now contain the correct string when inserted into a cell on the worksheet, regardless of system-locale.
Maybe this workaround is more suitable for you than creating a hidden sheet to read the string literals from.
Note that the function UnescapeUnicode provided in this answer only works on Windows because it uses regular expressions. The function from the GitHub repository supports MacOS and performs much better, so I'd recommend just adding the LibStringTools module to your project.