0

I have a list of strings containing names and numbers, like:

["mike5","john","sara2","bob","nick6"]

and I want to create a tuple (name, age) from each string, like this:

[('mike', 5), ('john', 0), ('sara', 2), ('bob', 0), ('nick', 6)]

So if a string doesn't contain a number, the age is 0.

What is the simplest way to do it?

I tried to use:

temp = re.compile("([a-zA-Z]+)([0-9]+)")
res = temp.match(type).group()

but it fails

Comment: "but it fails" is not a meaningful description of the error. (Commented Mar 29, 2021 at 17:31)

4 Answers

1

You can use the regex ([a-z]+)(\d+)? to capture the name and an optional number, and call .groups(0) so that an unmatched number group defaults to 0 (see match.groups()):

import re

def split_vals(word):
    # groups(0) substitutes 0 for the optional number group when it did not match
    name, number = re.search(r"([a-z]+)(\d+)?", word).groups(0)
    return name, int(number)

values = ["mike5", "john", "sara2", "bob", "nick6"]
values = [split_vals(value) for value in values]
# [('mike', 5), ('john', 0), ('sara', 2), ('bob', 0), ('nick', 6)]
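
To make the role of the default argument concrete, here is a small sketch comparing groups() with and without it. It reuses the pattern from this answer; the variable name m is only for illustration.

```python
import re

m = re.search(r"([a-z]+)(\d+)?", "john")

# Without a default, a group that did not participate in the match is None.
print(m.groups())    # ('john', None)

# Passing 0 replaces None with the integer 0, which int() accepts unchanged.
print(m.groups(0))   # ('john', 0)
```

This is why int(number) does not fail for names that have no trailing digits.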

0

It fails because your pattern requires at least one digit, so match() returns None for a name without one:

temp.match('john') is None
True

You need to change your regex to:

# The * means 0 or more. Otherwise, you've required a number to be present
temp = re.compile("([a-zA-Z]+)([0-9]*)")
temp.match('john')
<re.Match object; span=(0, 4), match='john'>

Lastly, if you want tuples, use groups(), not group() (here x is your list of strings):

[temp.match(item).groups() for item in x]
[('mike', '5'), ('john', ''), ('sara', '2'), ('bob', ''), ('nick', '6')]
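
The result above still holds the ages as strings, with an empty string where no number was found. A minimal sketch of one way to get the (name, age) tuples from the question, assuming every item starts with at least one letter; to_name_age and items are hypothetical names, not part of the answer:

```python
import re

# Same pattern as above: letters followed by zero or more digits.
pattern = re.compile(r"([a-zA-Z]+)([0-9]*)")

def to_name_age(item):
    # to_name_age is a hypothetical helper name used only for this sketch.
    name, digits = pattern.match(item).groups()
    # An empty digits string means no number was present, so default to 0.
    return name, int(digits) if digits else 0

items = ["mike5", "john", "sara2", "bob", "nick6"]
result = [to_name_age(item) for item in items]
print(result)  # [('mike', 5), ('john', 0), ('sara', 2), ('bob', 0), ('nick', 6)]
```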


0

A couple of things:

The regex is correct up to [0-9]+, which requires one or more digits. Not all of your strings contain a digit (john, for example), so I would suggest using *, which matches zero or more digits.

Calling pattern.match(string) on a compiled pattern is valid and equivalent to re.match(pattern, string), so the call syntax is not the problem. The error comes from calling .group() on the None that match() returns when a string has no digits.

In addition, using groups() instead of group() returns a tuple of all the captured groups in your regex (again, see below).
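
As a quick illustration of the difference between group() and groups(), here is a short sketch on a single string. The variable names are only for this example.

```python
import re

pattern = re.compile(r"([a-zA-Z]+)([0-9]*)")
m = pattern.match("mike5")

print(m.group())    # the whole match: 'mike5'
print(m.group(1))   # the first captured group: 'mike'
print(m.groups())   # all captured groups as a tuple: ('mike', '5')
```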

Using a loop to iterate over your items and an if statement you should be able to achieve your desired result:

import re

lst = ["mike5", "john", "sara2", "bob", "nick6"]
pattern = re.compile("([a-zA-Z]+)([0-9]*)")
name_age = []
for value in lst:
    name, age = re.match(pattern, value).groups()
    # An empty string means no digits were found; otherwise convert to int
    age = int(age) if age else 0
    name_age.append((name, age))
print(name_age)


0
import re

inArr = ["mike5","john","sara2","bob","nick6"]
outArr = []

for item in inArr:
    # \d* captures every trailing digit (\d? would keep only the first one)
    regexResult = re.search(r'([a-z]+)(\d*)', item, re.IGNORECASE)
    if regexResult:
        name = regexResult.group(1)
        age = regexResult.group(2) or 0
        outArr.append((name, int(age)))

print(outArr) # [('mike', 5), ('john', 0), ('sara', 2), ('bob', 0), ('nick', 6)]
