
I am writing a program that deals with letters from a foreign alphabet. The program takes as input the Unicode number of a character, written in hexadecimal. For example, 062A is the number assigned in Unicode to one such character.

I first ask the user to input a number that corresponds to a specific letter, e.g. 062A. I am now trying to turn that number into a 16-bit integer that Python can decode, so that I can print the character back to the user.

example:

for \u0394

print(bytes([0x94, 0x03]).decode('utf-16'))

However, when I use

int('062A', '16')

I receive this error:

ValueError: invalid literal for int() with base 10: '062A'

I know it is because I am using A in the string; however, that is the Unicode number for the symbol. Can anyone help me?

  • The base parameter shouldn't be a string but an integer: int('062A', 16) Commented Jun 28, 2020 at 21:58
  • I don't understand; what is the intended relationship between 0x062A and \u0394? Commented Jun 28, 2020 at 21:58
  • Are you trying to encode '062A' with 'utf-16'? Commented Jun 28, 2020 at 22:06
  • I tested int('062A', '16'), and got the same error as @KarlKnechtel (TypeError: 'str' object cannot be interpreted as an integer). Please ensure that your post contains the entire, correct, error output. Commented Jun 28, 2020 at 22:07
  • The problem I had was mostly the incorrect use of the parameter. I'm still learning how to do basic things, so I made a beginner's mistake. There was no relationship at all between the \u0394 example and 0x062A. This is my first Stack Overflow post, sorry for the mistakes. I'll do better next time, and thank you all. Commented Jun 28, 2020 at 22:54

1 Answer

however when I am using int('062A', '16'), I receive this error: ValueError: invalid literal for int() with base 10: '062A'

No, you aren't:

>>> int('062A', '16')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object cannot be interpreted as an integer

It's exactly as it says. The problem is not the '062A', but the '16'. The base should be specified directly as an integer, not a string:

>>> int('062A', 16)
1578
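As a minimal sketch of the point above (the helper name parse_code_point is hypothetical, not something from the question), the base is passed as the integer 16. Note that calling int('062A') with no base at all is what raises the ValueError quoted in the question, so that message may have come from an earlier attempt:

```python
def parse_code_point(hex_text):
    """Parse a hexadecimal string such as '062A' into an integer."""
    # The base must be the integer 16, not the string '16'.
    return int(hex_text, 16)

print(parse_code_point('062A'))    # 1578
print(parse_code_point('0x062A'))  # 1578; an optional 0x prefix is accepted with base 16

# Without a base, int() assumes base 10 and rejects the letter A.
try:
    int('062A')
except ValueError as error:
    print(error)  # invalid literal for int() with base 10: '062A'
```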

If you want the character at that Unicode code point, then converting through bytes and UTF-16 is too much work. Just ask for it directly using chr, for example:

>>> chr(int('0394', 16))
'Δ'
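Putting the two steps together, a rough sketch of the conversion the question describes might look like this. The function name char_from_hex is an assumption for illustration, and the sketch prints the official Unicode character names (via the standard unicodedata module) so that the output is readable even on a terminal that cannot render the letters themselves:

```python
import unicodedata

def char_from_hex(hex_text):
    """Return the character whose Unicode code point is given in hexadecimal."""
    # strip() tolerates stray whitespace from user input; int() raises
    # ValueError for non-hex text, and chr() raises ValueError for numbers
    # outside the Unicode range.
    return chr(int(hex_text.strip(), 16))

for hex_text in ('0394', '062A'):
    character = char_from_hex(hex_text)
    print(hex_text, unicodedata.name(character))
# 0394 GREEK CAPITAL LETTER DELTA
# 062A ARABIC LETTER TEH
```

In a real program the argument would come from input(), and print(char_from_hex(...)) shows the letter itself when the terminal supports it.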

3 Comments

You may or may not be able to display Arabic characters on your terminal. Python can't do anything about that. The simplest, cross-platform way to make sure your strings contain the right text, is to write them out to a text file and view them in a Unicode-aware text editor.
I'm still a newbie when it comes to doing this. You figured it out. I also thought that converting my stuff into bytes and UTF-16 was a lot of work, and it dissuaded me from wanting to work on the project because it felt too hard. Fortunately, after reading your comment and answer, I've been able to make some more progress. Thank you.
For what it's worth: to convert from the integer to bytes, use the .to_bytes method of the integer - you need to tell it how many bytes to use, and the endianness. There will, I am sure, eventually be a project where you do need this :)
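For readers curious about the two suggestions in these comments, here is a hedged sketch. The helper name code_point_to_utf16_bytes and the file name unicode_check.txt are made up for illustration. It assumes a code point that fits in a single 16-bit UTF-16 unit (up to U+FFFF, excluding surrogates); anything larger needs a surrogate pair, which this sketch does not handle.

```python
import os
import tempfile

def code_point_to_utf16_bytes(code_point, byteorder='little'):
    """Pack a code point into two bytes, i.e. one UTF-16 code unit."""
    # Assumption: only non-surrogate code points up to U+FFFF are supported.
    if not 0 <= code_point <= 0xFFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError('code point does not fit in one UTF-16 code unit')
    # to_bytes needs the number of bytes and the endianness.
    return code_point.to_bytes(2, byteorder)

raw = code_point_to_utf16_bytes(int('0394', 16))
print(raw)  # b'\x94\x03'

# Naming the byte order explicitly avoids relying on the platform default.
text = raw.decode('utf-16-le')

# Writing to a UTF-8 text file lets a Unicode-aware editor display the result,
# even if the terminal cannot.
path = os.path.join(tempfile.gettempdir(), 'unicode_check.txt')
with open(path, 'w', encoding='utf-8') as handle:
    handle.write(chr(0x062A) + '\n')
```

The decoded text equals chr(0x0394), which is why the answer's chr approach is the shorter route for simply displaying a character.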
