
I have a string that contains two things: the names of the petitioners and the name of the advocate.

I want to separate the petitioner names from the advocate names.

All petitioner names start with a number, as in "1)", and the advocate name starts with "Advocate-".

1) RAM PRASAD\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0Advocate- ADITYA PRASAD MISHRA, A.P. MISHRA

This is another string.

1) KALAICHELVI

Advocate - NOTICE ORDER R1 ONLY, -------------------------------, R1 - TAPAL RETURNED, NOT KNOWN

2) KALIMUTHU 3) RAMACHANDRA GOAUNER 4) SETHU AMMAL 5) SOMU GOUNDER 6) SOMASUNDAR A GOUNDER 7) KARUNANITHI 8) LALAITHAMMAL 9) JEGANNATHA GOUNDER

I tried re.split(r'[ ]\xa0\xa0(?=[0-9]+\b)', s), but it only works when the advocate name isn't present. How do I do this?

  • Are these in a file? Do they repeat? Commented Sep 2, 2019 at 8:04
  • Yes, they are in my DB, and the advocate name may occur below a petitioner's name. @BurhanKhalid Commented Sep 2, 2019 at 8:05

1 Answer


If you want to find two distinct things and plan to use regular expressions, it is almost always a good idea to use two distinct expressions instead of one. For example:

import re

petitioner_re = re.compile(r"\d+\) ([A-Z ]+)")        # matches petitioners
advocate_re = re.compile(r"Advocate\s*-\s*([^\n]+)")  # matches advocates, written as "Advocate-" or "Advocate -"

Given your example input, you can apply re.finditer for petitioners and re.search for advocates:

content = """
1) KALAICHELVI

Advocate - NOTICE ORDER R1 ONLY, -------------------------------, R1 - TAPAL RETURNED, NOT KNOWN

2) KALIMUTHU 3) RAMACHANDRA GOAUNER 4) SETHU AMMAL 5) SOMU GOUNDER 6) SOMASUNDAR A GOUNDER 7) KARUNANITHI 8) LALAITHAMMAL 9) JEGANNATHA GOUNDER
"""

petitioners = [p.group(1).strip() for p in petitioner_re.finditer(content)]
advocate = advocate_re.search(content)

This gives the following result:

print(petitioners)
['KALAICHELVI', 'KALIMUTHU', 'RAMACHANDRA GOAUNER', 'SETHU AMMAL',
 'SOMU GOUNDER', 'SOMASUNDAR A GOUNDER', 'KARUNANITHI', 'LALAITHAMMAL', 
 'JEGANNATHA GOUNDER']
print(advocate.group(1))  # re.search returns a match object (or None), so read the captured group
NOTICE ORDER R1 ONLY, -------------------------------, R1 - TAPAL RETURNED, NOT KNOWN

If you have multiple advocates per entry and want to find all of them, they'll need to be fetched with re.finditer as well.
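As a minimal sketch of that last point, the two patterns can be wrapped in a small helper that collects every match of each kind. The name split_entry is hypothetical, and the advocate pattern is assumed to tolerate both "Advocate-" and "Advocate -", since both spellings appear in the question's samples.

```python
import re

# Petitioners: a number, a closing parenthesis, a space, then uppercase words.
petitioner_re = re.compile(r"\d+\) ([A-Z ]+)")
# Advocates: "Advocate", an optionally spaced hyphen, then the rest of the line.
advocate_re = re.compile(r"Advocate\s*-\s*([^\n]+)")


def split_entry(text):
    """Return (petitioners, advocates) as two lists of strings (hypothetical helper)."""
    petitioners = [m.group(1).strip() for m in petitioner_re.finditer(text)]
    advocates = [m.group(1).strip() for m in advocate_re.finditer(text)]
    return petitioners, advocates


first = "1) RAM PRASAD\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0Advocate- ADITYA PRASAD MISHRA, A.P. MISHRA"
print(split_entry(first))
# (['RAM PRASAD'], ['ADITYA PRASAD MISHRA, A.P. MISHRA'])
```

Note that the petitioner pattern stops at the first character that is not an uppercase letter or a plain space, which is why the non-breaking spaces (\xa0) in the first sample end the name cleanly. If a name were followed by ordinary spaces and then "Advocate", the leading "A" would be captured too, so the pattern may need tightening for such data.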
