I want to extract the names and numbers from a given string and save them into two lists.

    str = 'Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs .'

I want to achieve:

    name = ['Dhoni','Kohli','Rohit','Dhawan']
    values = ['100','150','50','250']

I tried to use a negative lookahead but did not succeed. My approach is to match a word, then a number, then a word again. Maybe this approach is wrong. How can this be achieved?

What I tried:

    pattern = r'^[A-Za-z]+\s(?!)[a-z]'
    print(re.findall(pattern,str))

3 Answers

You might use 2 capturing groups instead:

    \b([A-Z][a-z]+)\s+scored\s+(\d+)\b


    import re

    pattern = r"\b([A-Z][a-z]+)\s+scored\s+(\d+)\b"
    # Named `text` rather than `str` to avoid shadowing the built-in type.
    text = "Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs ."

    name = []
    values = []
    # Group 1 is the name, group 2 is the number that follows "scored".
    for match in re.finditer(pattern, text):
        name.append(match.group(1))
        values.append(match.group(2))

    print(name)
    print(values)

Output

    ['Dhoni', 'Kohli', 'Rohit', 'Dhawan']
    ['100', '150', '50', '250']

4 Comments

How can I use a generic word pattern such as \w+ or [A-Za-z]+ instead of "scored"?
@HavishaaSharma Like this: \b([A-Z][a-z]+)\s+\w+\s+(\d+)\b regex101.com/r/aMeMux/1
Do we use () for grouping? I mean, how were the values grouped and turned into lists?
Yes, the () are used for grouping. The loop over the finditer matches then takes the group 1 and group 2 values and appends them to the two lists.
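
To make the comments above concrete, here is a minimal sketch of the generic-verb pattern together with `re.findall`, which returns one tuple per match when the pattern has two groups. The variable names (`text`, `pairs`) are illustrative only, and the sketch assumes every sentence has the shape "Name verb number".

```python
import re

text = ("Dhoni scored 100 runs and Kohli scored 150 runs."
        "Rohit scored 50 runs and Dhawan scored 250 runs .")

# Group 1 captures a capitalized name, \w+ stands in for any single verb,
# and group 2 captures the number that follows the verb.
pattern = re.compile(r"\b([A-Z][a-z]+)\s+\w+\s+(\d+)\b")

# With two groups in the pattern, findall returns a list of (name, number) tuples.
pairs = pattern.findall(text)

# Split the tuples into the two lists asked for in the question.
names = [name for name, _ in pairs]
values = [number for _, number in pairs]

print(names)
print(values)
```

Because `\w+` accepts any word, this version would also match a capitalized word followed by some other word and a number, so it is only as reliable as the assumed sentence shape.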

The text seems to follow the pattern `name scored value`.

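    >>> import re
    >>> s = 'Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs .'
    >>> res = re.findall(r'(\w+)\s*scored\s*(\d+)', s)
    >>> names, values = zip(*res)
    >>> names
    ('Dhoni', 'Kohli', 'Rohit', 'Dhawan')
    >>> values
    ('100', '150', '50', '250')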
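
One caveat with `zip(*res)`: if nothing matches, `res` is empty and unpacking into two names raises a `ValueError`. A possible guard is sketched below; the helper name `split_pairs` is hypothetical, not part of the answer above, and it returns lists rather than tuples to match the output the question asks for.

```python
import re

def split_pairs(text):
    """Return (names, values) as two lists; both are empty when nothing matches."""
    res = re.findall(r"(\w+)\s*scored\s*(\d+)", text)
    if not res:
        # zip(*[]) yields nothing, so unpacking it into two names would fail.
        return [], []
    names, values = zip(*res)
    return list(names), list(values)

names, values = split_pairs("Dhoni scored 100 runs and Kohli scored 150 runs.")
empty_names, empty_values = split_pairs("no scores here")

print(names, values)
print(empty_names, empty_values)
```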

This code extracts each **Name** and **Number** from the given string into two lists, then stores them in a dictionary as key-value pairs.

    import re

    x = 'Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs.'

    # Names and numbers are found independently and paired by position, so this
    # assumes the only capitalized words in the text are the names.
    names = re.findall(r'[A-Z][a-z]*', x)
    values = re.findall(r'[0-9]+', x)
    dicts = {}
    for i in range(len(names)):
        dicts[names[i]] = values[i]
    print(dicts)  # print once, after the loop has filled the dictionary

    # Input: Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs.
    # Output: {'Dhoni': '100', 'Kohli': '150', 'Rohit': '50', 'Dhawan': '250'}

    # Input: A has 5000 rupees and B has 15000 rupees.C has 85000 rupees and D has 50000 rupees .
    # Output: {'A': '5000', 'B': '15000', 'C': '85000', 'D': '50000'}
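
Pairing two independent `findall` results by position can drift as soon as the text contains another capitalized word. The sketch below uses a made-up input (the leading "Yesterday" is my own addition) to show the misalignment, and one possible alternative that captures each name together with its number in a single pattern. It assumes the sentence shape "Name verb number".

```python
import re

# Hypothetical input: "Yesterday" is capitalized but is not a player name.
text = "Yesterday Dhoni scored 100 runs and Kohli scored 150 runs."

# Independent searches: three capitalized words but only two numbers,
# so pairing by position would attach '100' to 'Yesterday'.
loose_names = re.findall(r"[A-Z][a-z]*", text)
loose_values = re.findall(r"[0-9]+", text)

# One pattern with two groups keeps each name with the number that follows it;
# dict() accepts the resulting list of (name, number) tuples directly.
paired = dict(re.findall(r"\b([A-Z][a-z]*)\s+\w+\s+(\d+)\b", text))

print(loose_names)
print(loose_values)
print(paired)
```

As with any dictionary, a name that appears twice keeps only its last value, so the two-list form may be preferable when names can repeat.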

1 Comment

Welcome to SO! Please add a short description of what your code is doing.
