How to search a Unicode string in Python

Question

I am searching for some simple Cyrillic patterns in strings using Python. The pattern I am using looks like /[а-я]+/[а-я]+. When I search for the pattern with this code:

import re
re.search('/[а-я]+/[а-я]+', '/бцршб/бйцбйц')

It does not find anything. But when I write it like this:

import re
re.search(u'/[а-я]+/[а-я]+', u'/бцршб/бйцбйц')

It works. However, in my case the pattern and the text are predefined in a database, so I couldn't find a way to convert them to Unicode strings. What is the solution in this case? Any help would be appreciated.

What do you mean by "predefined in storage"? Please post a complete, short program that demonstrates the problem you are having.
– Robᵩ, Commented Jul 28, 2015 at 1:06
@jwodder You can try using decode() on strings; it would give you AttributeError: 'str' object has no attribute 'decode'.
– Anand S Kumar, Commented Jul 28, 2015 at 1:08
@Anand: actually, the behavior you describe is Python 3's, where "str"s are already unicode objects.
– jsbueno, Commented Jul 28, 2015 at 1:12
Thank you, guys. It works when decoding the strings. So the code is like: import re; pattern = '/[а-я]+/[а-я]+'.decode('utf-8'); text = '/йцбйц/бйцбц'.decode('utf-8'); re.search(pattern, text)
– Odgiiv, Commented Jul 28, 2015 at 1:14
Oh OK, correct. I just tried in Python 2.x; decode() is what the OP needs.
– Anand S Kumar, Commented Jul 28, 2015 at 1:15

Odgiiv · Accepted Answer · 2015-07-28 03:37:06Z

Thank you, guys. It works when the strings are decoded first. The code looks like this:

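# -*- coding: utf-8 -*-
# Python 2: decode the UTF-8 byte strings into unicode objects before matching.
import re
pattern = '/[а-я]+/[а-я]+'.decode('utf-8')
text = '/йцбйц/бйцбц'.decode('utf-8')
re.search(pattern, text)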
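
For readers on Python 3, where str is already Unicode and has no decode() method (as noted in the comments), here is a minimal sketch of the same idea. It assumes the database driver hands the pattern and text back as UTF-8 encoded bytes; the names raw_pattern, raw_text and to_text are made up for illustration and are not from the original post.

```python
import re

# Hypothetical stand-ins for values read from a database as raw UTF-8 bytes.
raw_pattern = '/[а-я]+/[а-я]+'.encode('utf-8')
raw_text = '/бцршб/бйцбйц'.encode('utf-8')


def to_text(value, encoding='utf-8'):
    """Return value as a Unicode string, decoding only when it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Decode both sides so the character class [а-я] matches whole Cyrillic
# characters rather than individual bytes of their UTF-8 encoding.
pattern = to_text(raw_pattern)
text = to_text(raw_text)

match = re.search(pattern, text)
print(match.group(0) if match else None)
```

If the driver already returns str objects, to_text passes them through unchanged, so the helper can be applied without checking first. The encoding is an assumption; use whatever encoding the database actually stores.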
