Regex is not a good solution here.
Validate if a UTF-8 string is an integer:
try:
    int(val)
    is_int = True
except ValueError:
    is_int = False
Validate if a UTF-8 string is a float: same as above, but with float() in place of int().
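For illustration, the float check can be wrapped in a small helper. This is only a minimal sketch: the name is_float is hypothetical, and keep in mind that float() also accepts strings such as 'nan', 'inf' and '1e5', which may or may not be what you want to allow.

```python
def is_float(val):
    """Hypothetical helper: True if float() can parse val."""
    try:
        float(val)
        return True
    except ValueError:
        return False

print(is_float("3.14"))  # True
print(is_float("abc"))   # False
print(is_float("nan"))   # True: float() accepts the special value 'nan'
```

If special values like 'nan' should be rejected, you would need an extra check on top of this.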
Validate if a UTF-8 string has a length of 1 to 255:
is_of_appropriate_length = 1 <= len(val) <= 255
Validate if a UTF-8 string is a valid date: this is not trivial, because dates can be written in many formats. If you know the expected format, you can use time.strptime() like this:
# Validate that the date is in the YYYY-MM-DD format.
import time
try:
    time.strptime(val, '%Y-%m-%d')
    is_in_valid_format = True
except ValueError:
    is_in_valid_format = False
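The same check can be packaged as a reusable function. This is a sketch under assumptions: is_valid_date and its fmt parameter are hypothetical names, not part of any library. One useful property is that time.strptime() rejects not only strings in the wrong format but also dates that do not exist on the calendar.

```python
import time

def is_valid_date(val, fmt='%Y-%m-%d'):
    """Hypothetical helper: True if val parses as a date in the given format."""
    try:
        time.strptime(val, fmt)
        return True
    except ValueError:
        return False

print(is_valid_date('2023-02-28'))  # True
print(is_valid_date('2023-02-30'))  # False: February has no 30th day
print(is_valid_date('28/02/2023'))  # False: does not match the format
```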
EDIT: Another thing to note. Since you specifically mention UTF-8 strings, it makes sense to decode them into Unicode first, so that checks such as len() count characters rather than bytes. This is done with:
my_unicode_string = my_utf8_string.decode('utf8')
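To show why decoding first matters for the length check, here is a small sketch. The helper name decode_utf8 is hypothetical, and the sample string is made up for the example. It also treats input that is not valid UTF-8 as a validation failure instead of letting the exception propagate.

```python
def decode_utf8(raw_bytes):
    """Hypothetical helper: return the decoded text, or None if not valid UTF-8."""
    try:
        return raw_bytes.decode('utf8')
    except UnicodeDecodeError:
        return None

raw = u'h\u00e9llo'.encode('utf8')  # 'héllo' encoded as UTF-8 bytes
text = decode_utf8(raw)

byte_length = len(raw)   # 6, because the accented letter takes two bytes
char_length = len(text)  # 5 characters

is_of_appropriate_length = text is not None and 1 <= len(text) <= 255
```

Whether the 1-255 limit is meant in bytes or in characters depends on where the value ends up (for example, a database column), so pick the one that matches your constraint.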
It is interesting to note that when converting a Unicode string to an integer with int(), for example, you are not limited to the "Western Arabic" numerals used in most of the world. int(u'١٧') and int(u'१७') both return 17, even though they are written in Eastern Arabic and Devanagari numerals respectively.
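This is easy to check yourself. The snippet below is a small sketch that writes the two strings with escape sequences so it does not depend on the source file's encoding; the variable names are only for illustration.

```python
eastern_arabic = u'\u0661\u0667'  # the digits one and seven in Eastern Arabic numerals
devanagari = u'\u0967\u096d'      # the digits one and seven in Devanagari numerals

print(int(eastern_arabic))  # 17
print(int(devanagari))      # 17
```

If you only want to accept ASCII digits, the int() check alone is therefore not strict enough and needs an additional test on the characters.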