Python - Check if String Contains Only Defined Characters Using Regex
Given a string, the task is to check whether it contains only a specific set of allowed characters. For example:
Allowed characters: a-z
Input: "hello" -> Valid
Input: "hi!" -> Invalid
Allowed characters: 0-9
Input: "657" -> Valid
Input: "72A" -> Invalid
Let’s explore different regex-based methods to perform this check in Python.
Using re.fullmatch()
fullmatch() checks if the entire string is made up of only the allowed characters. If even one invalid character appears, the match fails.
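import re
s = "hello123"
# One or more ASCII letters or digits
pattern = r"[a-zA-Z0-9]+"
# fullmatch() returns a match object only if the whole string matches
if re.fullmatch(pattern, s):
    print("Valid string")
else:
    print("Invalid string")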
Output
Valid string
Explanation:
- [a-zA-Z0-9]+ matches one or more ASCII letters or digits, so an empty string is rejected as well
- fullmatch() succeeds only if every character matches the pattern
- If the whole string fits the pattern -> Valid
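As a minimal sketch, the same check can be wrapped in a reusable function and run against the examples from the introduction. The name only_allowed and its allowed parameter are hypothetical, chosen only for illustration. The function assumes that allowed is the body of a regex character class such as "a-z" or "0-9".

```python
import re

def only_allowed(s, allowed):
    """Return True if s is non-empty and made up only of characters in `allowed`.

    `allowed` is assumed to be the body of a regex character class,
    e.g. "a-z" or "0-9". Hypothetical helper, not part of the re module.
    """
    return re.fullmatch(f"[{allowed}]+", s) is not None

print(only_allowed("hello", "a-z"))  # True
print(only_allowed("hi!", "a-z"))    # False
print(only_allowed("657", "0-9"))    # True
print(only_allowed("72A", "0-9"))    # False
print(only_allowed("", "a-z"))       # False: + needs at least one character
```

If an empty string should count as valid, replacing + with * in the pattern would be one way to allow it.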
Using re.match() with ^ and $
match() anchors the pattern at the start of the string but not at the end, so we add $ to make sure the match runs to the end of the string. The leading ^ is redundant with match(), but it makes the intent explicit.
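import re
s = "Python_3"
# Letters, digits and underscore, from start (^) to end ($)
pattern = r"^[A-Za-z0-9_]+$"
# match() returns None if the pattern does not match from the start
if re.match(pattern, s):
    print("Valid string")
else:
    print("Invalid string")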
Output
Valid string
Explanation:
- ^ anchors the match at the start of the string and $ at the end (or just before a single trailing newline)
- Pattern restricts the characters to letters, digits and underscore
- If any disallowed character appears -> no match -> Invalid
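One edge case is worth checking: $ also matches just before a newline at the very end of the string, so a string with a trailing newline can still pass. The short sketch below compares $ with the stricter \Z anchor and with fullmatch(). The input string is an assumed example, not taken from the text above.

```python
import re

s = "Python_3\n"  # assumed example input with a trailing newline

# $ matches at the end of the string or just before a final newline
dollar = re.match(r"^[A-Za-z0-9_]+$", s) is not None

# \Z matches only at the very end of the string
strict = re.match(r"[A-Za-z0-9_]+\Z", s) is not None

# fullmatch() requires the whole string, newline included, to match
full = re.fullmatch(r"[A-Za-z0-9_]+", s) is not None

print(dollar, strict, full)  # True False False
```

When a trailing newline must be rejected, \Z or fullmatch() is the safer choice.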
Using re.search() to detect invalid characters
Instead of matching the allowed characters, we use re.search() to look for any character that is not allowed. If one is found, the string is invalid.
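import re
s = "Hello-World"
# Negated class: any character that is not an ASCII letter
invalid = r"[^A-Za-z]"
# search() scans the whole string and stops at the first invalid character
if re.search(invalid, s):
    print("Invalid string")
else:
    print("Valid string")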
Output
Invalid string
Explanation:
- [^A-Za-z] matches any character not in the allowed set
- If search() finds such a character, the string is invalid
- If nothing is found -> Valid
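When the allowed set includes characters such as - or ], typing them straight into a character class is easy to get wrong. The following sketch builds the negated class from a literal string of allowed characters with re.escape(). The helper make_validator is hypothetical and assumes a non-empty allowed_chars string.

```python
import re

def make_validator(allowed_chars):
    """Build a checker from a literal, non-empty string of allowed characters.

    Hypothetical helper: re.escape() protects characters such as - ] ^
    that have a special meaning inside a character class.
    """
    invalid = re.compile(f"[^{re.escape(allowed_chars)}]")
    # Valid means that no character outside the allowed set was found
    return lambda s: invalid.search(s) is None

is_valid = make_validator("abc-_.")
print(is_valid("a-b_c."))  # True
print(is_valid("a+b"))     # False
```

Because this approach only looks for invalid characters, an empty string counts as valid. Add a separate length check if that is not wanted.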
Using re.findall() to list invalid characters
findall() collects every invalid character in a list. If the list is empty, the string contains only allowed characters.
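import re
s = "abcXYZ!@#"
# Collect every character that is not an ASCII letter or digit
invalid_chars = re.findall(r"[^A-Za-z0-9]", s)
# An empty list is falsy, so this branch runs only when something was found
if invalid_chars:
    print("Invalid string:", invalid_chars)
else:
    print("Valid string")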
Output
Invalid string: ['!', '@', '#']
Explanation:
- [^A-Za-z0-9] finds characters outside the allowed set
- findall() returns a list of invalid characters
- If the list is empty, the string is fully valid
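If the position of each invalid character matters, for example to build an error message, a closely related variant swaps findall() for re.finditer(), which yields match objects instead of plain strings. This is a sketch of that alternative using the same input as above.

```python
import re

s = "abcXYZ!@#"

# finditer() yields match objects, so each invalid character
# can be paired with its index in the string
positions = [(m.start(), m.group()) for m in re.finditer(r"[^A-Za-z0-9]", s)]

print(positions)  # [(6, '!'), (7, '@'), (8, '#')]
```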