Apply a list of regex patterns to a list in Python

Question

I have a DataFrame in which the txt column contains a list. I want to clean the txt column using the function clean_text().

import re
import pandas

data = {'value':['abc.txt', 'cda.txt'], 'txt':['[''2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart'']',
                                               '[''2019/02/01-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart'']']}
df = pandas.DataFrame(data=data)

def clean_text(text):
    """
    :param text:  it is the plain text
    :return: cleaned text
    """
    patterns = [r"^{53}",
                r"[A-Za-z]+[\d]+[\w]*|[\d]+[A-Za-z]+[\w]*",
                r"[-=/':,?${}\[\]-_()>.~" ";+]"]

    for p in patterns:
        text = re.sub(p, '', text)

    return text

My Solution:

df['txt'] = df['txt'].apply(clean_text)

But I am getting the error below:

sre_constants.error: nothing to repeat at position 1

Possible duplicate of Regex sre_constants.error: bad character range
– sophros, Commented Feb 10, 2019 at 19:59

blhsing · Accepted Answer · 2019-02-10 19:54:44Z

10

^{53} is not a valid regular expression, because the quantifier {53} must be preceded by a character or a pattern that can be repeated. If you mean to match a string that is at least 53 characters long (with re.sub, this removes its first 53 characters), you can use the following pattern instead:

^.{53}
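As a minimal sketch of the difference, assuming the goal is to strip the leading date part from one of the question's log lines: the names sample, fixed_pattern and stripped below are only illustrative and are not part of the original code.

```python
import re

# Sample line taken from the question (without the surrounding brackets).
sample = "2019/01/31-11:56:23.288258 1886     7F0ED4CDC704     asfasnfs: remove datepart"

# The original pattern fails at compile time: "{53}" has nothing to repeat.
try:
    re.compile(r"^{53}")
    compile_error = None
except re.error as exc:
    compile_error = str(exc)

# With "." before the quantifier, the pattern matches the first 53 characters
# (any characters except a newline), and sub() replaces them with "".
fixed_pattern = re.compile(r"^.{53}")
stripped = fixed_pattern.sub("", sample)

print(compile_error)
print(repr(stripped))
```

Note that the values in the question's DataFrame start with "[", which counts toward the 53 characters, so the number may need adjusting there.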


1 Comment

user15051990 Over a year ago

Thanks for the answer. I have updated the question; now I get an AttributeError.

sophros · Accepted Answer · 2019-02-10 19:57:47Z

3

The culprit is the first pattern in the list, r"^{53}". It reads: ^ matches the beginning of the string, and then {53} repeats the previous character or group 53 times. But the only thing before {53} is ^, an anchor that cannot be repeated. Add the character or group that you want to match 53 repetitions of. Or, if you want to match the sequence {53} verbatim, escape it, e.g. using re.escape.
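A rough sketch of how one might confirm which pattern is the culprit and try both suggestions. The helper find_invalid_patterns and the example group (?:ab){3} are hypothetical, made up for illustration; the third pattern is written as a single literal, which is what the two adjacent string literals in the question concatenate to.

```python
import re

# Patterns from the question.
patterns = [
    r"^{53}",
    r"[A-Za-z]+[\d]+[\w]*|[\d]+[A-Za-z]+[\w]*",
    r"[-=/':,?${}\[\]-_()>.~;+]",
]

def find_invalid_patterns(candidates):
    """Return (pattern, error message) pairs for patterns that fail to compile."""
    invalid = []
    for pattern in candidates:
        try:
            re.compile(pattern)
        except re.error as exc:
            invalid.append((pattern, str(exc)))
    return invalid

invalid = find_invalid_patterns(patterns)
print(invalid)

# Option 1: give the quantifier something to repeat, here a non-capturing group.
repeated_group = re.compile(r"^(?:ab){3}")

# Option 2: match the text "{53}" literally by escaping it.
literal = re.compile("^" + re.escape("{53}"))
```

Compiling each pattern on its own, as above, narrows the error down to a single entry instead of the whole loop in clean_text().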


5 Comments

user15051990 Over a year ago

Thanks for the answer. I have updated the question; now I get an AttributeError.

sophros Over a year ago

This should really be another question. How can a reader of the question make any sense of the answers if you change the crucial elements of the question?

sophros Over a year ago

And before you do that, please revert the change first so that the answers make sense with the question.

user15051990 Over a year ago

I have posted it as a different question: stackoverflow.com/questions/54620550/…. Can you please help me solve it?

sophros Over a year ago

I have already done that, although I believe you should reward the effort already made in answering this question.
