14

How can I search for, say, a sequence of 10 isprint characters in a given string in Python?

With GNU grep, I would simply do grep [[:print:]]{10}

0

1 Answer 1

12

Since POSIX is not supported by Python re module, you have to emulate it with the help of character class.

You can use the one from the regular-expressions.info and add a limiting quantifier {10}:

[\x20-\x7E]{10}

See demo

Alternatively, you can use Matthew Barnett regex module that claims to support POSIX character classes (POSIX character classes are supported.).

Sign up to request clarification or add additional context in comments.

5 Comments

This character class worked for me in Python 3 [`~!@#$%^&*()_=+\[\]{}\\\|;:\"\'<>.,/?] when using inside the re.sub() method
@Iota, that [`~!@#$%^&*()_=+\[\]{}\\\|;:\"\'<>.,/?] only matches ASCII punctuation, it has nothing to do with the concept of "printable chars". So, if you were to use a POSIX character class, it would be [[:punct:]]. To match punctuation in Python, you can use [^\w\s], although there are better and more precise patterns.
My mistake! I misread the [[:print]] class as [[:punct]]. Appreciate your correction.
The regex in the answer will not match Unicode non-ASCII characters like grep (GNU grep) does.
@pabouk-Ukrainestaystrong Then see the bottom of the answer. Just install the PyPi regex module (pip install regex in the console/terminal) and then use import regex and pattern = regex.compile(r'[[:print:]]{10}').

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.