3

Is comparing two characters (that is one character str) in Python (3.x if that matters) well defined? or do I have to make an explicit conversion?

In other words, is:

'a' > 'b'

the same as:

ord('a') > ord('b')

3
  • A glance over the source code shows string comparisons use memcmp. That is pretty well-defined. It indeed uses the literal character codes. (And while typing this, a question arises. What else would you expect?) Commented Feb 27, 2020 at 13:12
  • 'a' > 'b' is a lexicographic comparison, and ord('a') > ord('b') is a numerical comparison. I'm unsure if this makes a difference when only 1 character is used. Commented Feb 27, 2020 at 13:16
  • @usr2564301 I just wasn't sure because python has no built in char type Commented Feb 27, 2020 at 13:17

1 Answer 1

5

When not sure, check the docs:

Strings (instances of str) compare lexicographically using the numerical Unicode code points (the result of the built-in function ord()) of their characters.

So yes, the behavior is well defined.

Sign up to request clarification or add additional context in comments.

2 Comments

BTW, that might not be what one expects using collating rules of a given natural language - for example, in Polish letter ń precedes letter ó, however in Python 'ń'<'ó' is False, because ord('ń') is 324 and ord('ó') is 243. Similarly 'å' < 'ä' is False even though in Swedish å precedes ä.
@Bloto: that's because of two unrelated reasons. First, Unicode order is "historical" rather than logical. Existing Unicode points cannot be redefined once they are assigned. But, probably more important, such sorting rules differ per language. That's why a proper comparison needs a locale.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.