0

I'm working on a bot application and need to extract values from a message string and assign them to variables. The message string can take several forms, such as:

message = 'name="Raj",lastname="Paul",gender="male", age=23'
message = 'name="Raj",lastname="Paul",age=23'
message = 'name="Raj",lastname="Paul",gender="male"'

The data the user provides can contain all of the fields, or the age or gender field may be missing.

Where I am stuck: I am not sure how to check whether age is present in the message text. If it is, I want to extract the corresponding value; if it is not, I want to ignore it.

It is possible to check each word in a loop and extract the string, but that becomes quite lengthy. Please let me know if there is an easier way.

For example:

if age is present in the message, get the value of age
if lastname is present in the message, get the value of lastname
if gender is present in the message, get the value of gender
if name is present in the message, get the value of name
3
  • If you just want to see if age is in message you can do if 'age' in message: Commented May 9, 2019 at 17:56
  • To be a bit safer, use if message.startswith('age=') or ',age=' in message:. This way you won't get false positives on things like lastname="Sager" Commented May 9, 2019 at 17:57
  • @Chrispresso, I want to check whether each field is present and, if so, extract its value: if age is present, extract the value of age; if lastname is present, extract its value; and so on. Commented May 9, 2019 at 18:00

5 Answers

1

Use regex:

(?:[, ])age=(\d+)

which captures the digits following 'age=' when it is preceded by a comma or a space.

Code:

import re

message = 'name="Raj",lastname="Paul",gender="male", age=23'
m = re.search(r'(?:[, ])age=(\d+)', message)
if m:
    print(m.group(1))

# 23
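
To cover the page=5 case raised in the comments and to pick up every field in one pass, one possible generalization anchors each key to the start of the string or to a comma. This is only a sketch and not part of the original answer: FIELD_RE and parse_fields are hypothetical names, and it assumes every value is either a double-quoted string or a bare integer.

```python
import re

# Hypothetical pattern: a key at the start of the string or right after a comma,
# followed by either a double-quoted string or a run of digits.
FIELD_RE = re.compile(r'(?:^|,)\s*(\w+)=(?:"([^"]*)"|(\d+))')

def parse_fields(message):
    """Return a dict holding only the fields that appear in message."""
    fields = {}
    for key, quoted, number in FIELD_RE.findall(message):
        # Unmatched groups come back as empty strings from findall.
        fields[key] = int(number) if number else quoted
    return fields

message = 'name="Raj",lastname="Paul",gender="male", age=23'
fields = parse_fields(message)
if 'age' in fields:
    print(fields['age'])
```

Because the key is matched as a whole word after a comma, a field named page is stored under 'page' and is never mistaken for age.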

2 Comments

what about page=5?
@PeterWood, it's more that I need to get the value of every field: if age is not there, ignore it, and if it is present, get it. If lastname is not there, ignore it; otherwise get the value of lastname, and so on.
1

If you just want to test for age, you can search the string. If you also want to use the other fields, you can split the message into a dictionary.

message = 'name="Raj",lastname="Paul",gender="male", age=23'
pairs = [pair.replace('"', '').strip() for pair in message.split(',')]
d = dict([p.split('=') for p in pairs])

'age' in d # True
d['name'] # 'Raj'
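
Once the fields are in a dictionary, missing ones can be handled with dict.get, which returns None instead of raising KeyError. The following is a rough sketch along the same lines; to_dict is a hypothetical helper name, and it assumes that values never contain commas.

```python
def to_dict(message):
    """Split 'key=value' pairs on commas; assumes values contain no commas."""
    result = {}
    for pair in message.split(','):
        # maxsplit=1 keeps any '=' inside the value intact.
        key, value = pair.split('=', 1)
        result[key.strip()] = value.strip().strip('"')
    return result

messages = [
    'name="Raj",lastname="Paul",gender="male", age=23',
    'name="Raj",lastname="Paul",age=23',
    'name="Raj",lastname="Paul",gender="male"',
]

for message in messages:
    d = to_dict(message)
    age = d.get('age')        # None when the field is missing
    gender = d.get('gender')  # None when the field is missing
    print(d['name'], d['lastname'], gender, age)
```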

1

One thing you can do is use a regular expression and extract individual portions.

For instance, given message = 'name="Raj",lastname="Paul",gender="male", age=23', you can use the regular expression (?P<var>.*?)=(?P<out>.*?), which captures each key and its value up to the next comma.

Here is what I would do:

import re
message = 'name="Raj",lastname="Paul",gender="male", age=23'
message += ',' # Add a comma for the regex
findall = re.findall(r'(?P<var>.*?)=(?P<out>.*?),', message) # Note the additional comma
extracted = {k.strip(): v.strip() for k,v in findall}
if 'age' in extracted:
    print(extracted['age']) # prints 23

extracted is then a dictionary that looks like this: {'name': '"Raj"', 'lastname': '"Paul"', 'gender': '"male"', 'age': '23'}. From there you can strip the double quotes and convert age to an int if you want.

To get all the fields present you could do:

for field in extracted:
    print(field, extracted[field])

# Prints:
# name "Raj"
# lastname "Paul"
# gender "male"
# age 23
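
The cleanup mentioned above (dropping the double quotes and converting age to an int) could look roughly like this. The cleaned name is introduced here only for illustration, and the sketch assumes age, when present, is always a plain integer.

```python
# The dictionary produced by the regex approach, with quotes still attached.
extracted = {'name': '"Raj"', 'lastname': '"Paul"', 'gender': '"male"', 'age': '23'}

# Strip surrounding double quotes from every value.
cleaned = {key: value.strip('"') for key, value in extracted.items()}

# Convert age only when the field is present.
if 'age' in cleaned:
    cleaned['age'] = int(cleaned['age'])

print(cleaned)
```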

1
message = 'name="Raj",lastname="Paul",gender="male", age=23'

new_msg = message.replace('"', '').replace(' ', '').split(',')  # 2nd replace deletes the extra space before age (and any spaces inside values)

msg_dict = dict([x.split('=') for x in new_msg])

print(msg_dict)

This code prints the following dictionary. You can loop over the messages, and each one yields a dictionary containing only the fields that are present in it.

{'name': 'Raj', 'lastname': 'Paul', 'gender': 'male', 'age': '23'}

0

This is another possibility:

message1 = 'name="Raj",lastname="Paul",gender="male", age=23'

message2 = 'name="Raj",lastname="Paul",age=23'

message3 = 'name="Raj",lastname="Paul",gender="male"'

messages = [message1, message2, message3]

splits = [m.split(",") for m in messages]

def flatten(lst):
    temp = []
    for l in lst:
        val1, val2 = l.split("=")
        val1 = val1.strip()
        val2 = val2.strip()
        temp.append(val1)
        temp.append(val2)
    return temp

clean = [flatten(s) for s in splits]

final = [x for x in clean if 'age' in x]

final

This keeps only those messages that contain 'age', each as a flat list of alternating keys and values.
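
Since each flattened message alternates keys and values, it can be paired back up into a dictionary so that the age value itself is easy to read. This is a small sketch rather than part of the answer above: records, with_age and ages are hypothetical names, and the clean list is written out by hand to match what the code above produces.

```python
# Flat [key, value, key, value, ...] lists, as built by flatten() above.
clean = [
    ['name', '"Raj"', 'lastname', '"Paul"', 'gender', '"male"', 'age', '23'],
    ['name', '"Raj"', 'lastname', '"Paul"', 'age', '23'],
    ['name', '"Raj"', 'lastname', '"Paul"', 'gender', '"male"'],
]

# Even positions are keys, odd positions are values.
records = [dict(zip(flat[::2], flat[1::2])) for flat in clean]

# Checking the dict tests keys only, so a value can never match by accident.
with_age = [r for r in records if 'age' in r]
ages = [int(r['age']) for r in with_age]
print(ages)
```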
