extracting values from a string Python

Question

I am working on a bot application and need to extract the values from a message string and assign each one to a variable. The message string can take different forms, such as:

message = 'name="Raj",lastname="Paul",gender="male", age=23'
message = 'name="Raj",lastname="Paul",age=23'
message = 'name="Raj",lastname="Paul",gender="male"'

The data the user provides can contain all of the values, but sometimes the age or gender field is missing.

Where I am stuck is that I am not sure how to check whether age is present in the message text. If it is, I want to extract the value that corresponds to age. If age is not in the message, it should be ignored.

It is possible to check each word in a loop and extract the string, but that becomes quite lengthy. Please let me know if there is an easier way.

For example:

if age is present in message then get the value of age,
if lastname is present in message then get the value of lastname,
if gender is present in message then get the value of gender,
if name is present in message then get the value of name

If you just want to see if age is in message you can do if 'age' in message:
– Chrispresso, Commented May 9, 2019 at 17:56
To be a bit safer, use if message.startswith('age=') or ',age=' in message:. This way you won't get false positives on things like lastname="Sager"
– John Gordon, Commented May 9, 2019 at 17:57
@Chrispresso, I want to check whether each field, such as age, is present and then extract its value. The same goes for lastname and the other fields.
– wanderors, Commented May 9, 2019 at 18:00

Austin · Accepted Answer · 2019-05-09 18:03:36Z

1

Use regex:

(?:[, ])age=(\d+)

which captures the digits that follow 'age=' when it is preceded by a comma or a space, so it does not match inside a longer key.

Code:

import re

message = 'name="Raj",lastname="Paul",gender="male", age=23'
m = re.search(r'(?:[, ])age=(\d+)', message)
if m:
    print(m.group(1))

# 23
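
The same idea can be extended to any field rather than only age. The following is a minimal sketch, not part of the original answer: get_field is a hypothetical helper name, and it assumes values are either double-quoted strings or unquoted runs without commas. It also allows the key to appear at the very start of the string.

```python
import re

def get_field(message, key):
    """Return the value stored under key, or None when the key is absent.

    Hypothetical helper; assumes values are double-quoted or comma-free.
    """
    # (?:^|[, ]) requires the key to start the string or follow a comma/space,
    # so 'age' is not found inside 'page' and 'name' is not found inside 'lastname'.
    pattern = r'(?:^|[, ])' + re.escape(key) + r'=(?:"([^"]*)"|([^,]*))'
    m = re.search(pattern, message)
    if m is None:
        return None
    # Group 1 holds a quoted value, group 2 an unquoted one.
    return m.group(1) if m.group(1) is not None else m.group(2).strip()

message = 'name="Raj",lastname="Paul",age=23'
for key in ('name', 'lastname', 'gender', 'age'):
    value = get_field(message, key)
    if value is not None:  # missing fields such as gender are skipped
        print(key, value)
```

Returning None for a missing field lets the caller ignore it with a single check, which is the behaviour the question asks for.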

edited May 9, 2019 at 18:03

answered May 9, 2019 at 17:56


2 Comments

Open AI - Opting Out Over a year ago

what about page=5?

wanderors Over a year ago

@PeterWood, it's more that I need the value of every field: if age is missing, ignore it, and if it is present, get it. The same goes for lastname and the other fields.

Mark · Accepted Answer · 2019-05-09 18:03:40Z

1

If you just want to test for age, you can search the string. If you want to use this for other things in addition to checking the age, you can split the message into a dictionary.

message = 'name="Raj",lastname="Paul",gender="male", age=23'
pairs = [pair.replace('"', '').strip() for pair in message.split(',')]
d = dict([p.split('=') for p in pairs])

'age' in d # True
d['name'] # 'Raj'
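
Once the message is a dictionary, optional fields can be read with dict.get, which returns a default instead of raising KeyError. This is a small sketch under the assumption that values never contain commas; parse_message is a hypothetical name, and str.partition is used here so that an '=' inside a value would not break the split.

```python
def parse_message(message):
    """Turn 'key=value' pairs separated by commas into a dict (hypothetical helper)."""
    result = {}
    for pair in message.split(','):
        key, sep, value = pair.partition('=')  # split on the first '=' only
        if not sep:
            continue  # skip fragments that have no '='
        result[key.strip()] = value.strip().strip('"')
    return result

d = parse_message('name="Raj",lastname="Paul",gender="male"')
age = d.get('age')            # None, because age is missing from this message
gender = d.get('gender', '')  # 'male'
```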

answered May 9, 2019 at 18:03


Chrispresso · Accepted Answer · 2019-05-09 18:10:29Z

One thing you can do is use a regular expression and extract individual portions.

For instance, if your message is message = 'name="Raj",lastname="Paul",gender="male", age=23', you can use the regular expression (?P<var>.*?)=(?P<out>.*?), which captures each key and its value up to the next comma.

Here is what I would do:

import re
message = 'name="Raj",lastname="Paul",gender="male", age=23'
message += ',' # Add a comma for the regex
findall = re.findall(r'(?P<var>.*?)=(?P<out>.*?),', message) # Note the additional comma
extracted = {k.strip(): v.strip() for k,v in findall}
if 'age' in extracted:
    print(extracted['age']) # prints 23

extracted then would be a map that looks like this: {'name': '"Raj"', 'lastname': '"Paul"', 'gender': '"male"', 'age': '23'}. You can get rid of the double quotes if you really want and convert age to an int from there.
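
The cleanup mentioned above could look like the following sketch. It starts from the extracted dictionary shown in this answer; the name cleaned is introduced here for illustration, and the int conversion assumes age always holds digits.

```python
# The dictionary produced by the regex above, repeated so the example stands alone.
extracted = {'name': '"Raj"', 'lastname': '"Paul"', 'gender': '"male"', 'age': '23'}

# Remove the surrounding double quotes from every value.
cleaned = {k: v.strip('"') for k, v in extracted.items()}

# Convert age only when the field is present (raises ValueError if it is not numeric).
if 'age' in cleaned:
    cleaned['age'] = int(cleaned['age'])

print(cleaned)
# {'name': 'Raj', 'lastname': 'Paul', 'gender': 'male', 'age': 23}
```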

To get all the fields present you could do:

for field in extracted:
    print(field, extracted[field])

# Prints:
# name "Raj"
# lastname "Paul"
# gender "male"
# age 23

amrtw09 · Accepted Answer · 2019-05-09 18:29:30Z

1

message = 'name="Raj",lastname="Paul",gender="male", age=23'

new_msg = message.replace('"', '').replace(' ', '').split(',')  # 2nd replace to delete the extra space before age

msg_dict = dict([x.split('=') for x in new_msg])

print(msg_dict)

This code prints the dictionary shown below. You can loop through each message, and every value will end up under the right key.

{'name': 'Raj', 'lastname': 'Paul', 'gender': 'male', 'age': '23'}
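
Looping over several messages could be sketched as follows. The helper name to_dict is hypothetical, and the approach inherits one assumption from the code above: removing every space is only safe when the values themselves contain no spaces.

```python
messages = [
    'name="Raj",lastname="Paul",gender="male", age=23',
    'name="Raj",lastname="Paul",age=23',
    'name="Raj",lastname="Paul",gender="male"',
]

def to_dict(message):
    """Apply the same replace/split steps to one message (hypothetical helper)."""
    parts = message.replace('"', '').replace(' ', '').split(',')
    return dict(part.split('=') for part in parts)

parsed = [to_dict(m) for m in messages]
for d in parsed:
    # Only the fields that were actually present end up as keys.
    print(d)
```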

answered May 9, 2019 at 18:29


ju.arroyom · Accepted Answer · 2019-05-09 18:22:17Z

0

This is another possibility:

message1 = 'name="Raj",lastname="Paul",gender="male", age=23'

message2 = 'name="Raj",lastname="Paul",age=23'

message3 = 'name="Raj",lastname="Paul",gender="male"'

messages = [message1, message2, message3]

splits = [m.split(",") for m in messages]

def flatten(lst):
    temp = []
    for l in lst:
        val1, val2 = l.split("=")
        val1 = val1.strip()
        val2 = val2.strip()
        temp.append(val1)
        temp.append(val2)
    return temp

clean = [flatten(s) for s in splits]

final = [x for x in clean if 'age' in x]

final

This would keep those messages that contain 'age'
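
A variation that looks at keys only may be a little safer, since a flattened list also contains the values, and a value that happens to equal 'age' would count as a match. This is a sketch, with keys_of as a hypothetical helper name.

```python
def keys_of(message):
    """Return the set of field names in a message (hypothetical helper)."""
    return {pair.split('=', 1)[0].strip() for pair in message.split(',')}

messages = [
    'name="Raj",lastname="Paul",gender="male", age=23',
    'name="Raj",lastname="Paul",age=23',
    'name="Raj",lastname="Paul",gender="male"',
]

# Keep only the messages that have an age field.
with_age = [m for m in messages if 'age' in keys_of(m)]
```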

answered May 9, 2019 at 18:22

