I have a string, produced by a machine learning algorithm, that generally spans multiple lines. At the beginning and at the end there can be some lines containing nothing but whitespace, and in between there should be 2 lines, each containing a word followed by some numbers and (sometimes) other characters.

Something like this:


first_word  3 5 7 @  4
second_word 4 5 67| 5 [


I need to extract the 2 words and the numeric characters.

I can eliminate the empty lines by doing something like:

lines_list = initial_string.split("\n")
for line in lines_list:
    if len(line) > 0 and not line.isspace():
        print(line)

but now I was wondering:

  1. if there is a more robust, general way
  2. how to parse each of the remaining 2 central lines, by extracting the words and digits (and discard the other characters mixed in between the digits, if there are any)

I imagine regular expressions could be useful, but I have never really used them, so I'm struggling a bit at the moment.

  • What exact output do you expect? Commented Oct 26, 2021 at 8:56

1 Answer

I would use re.findall here:

import re

inp = '''first_word  3 5 7 @  4
second_word 4 5 67| 5 ['''
matches = re.findall(r'\w+', inp)
print(matches)  # ['first_word', '3', '5', '7', '4', 'second_word', '4', '5', '67', '5']

If you want to process each line separately, split the input on line breaks (CR?LF) and use the same approach:

import re

inp = '''first_word  3 5 7 @  4
second_word 4 5 67| 5 ['''
lines = inp.splitlines()  # handles both \n and \r\n line endings
for line in lines:
    matches = re.findall(r'\w+', line)
    print(matches)

This prints:

['first_word', '3', '5', '7', '4']
['second_word', '4', '5', '67', '5']
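
If the input also has blank or whitespace-only lines around the two data lines, and you want the word kept apart from the numbers, something along these lines might work. This is only a sketch: `parse_lines` is a hypothetical helper name, and it assumes the first run of word characters on each line is the word and every later run of digits is a number.

```python
import re

def parse_lines(text):
    """Return one (word, numbers) tuple per line that contains a word."""
    parsed = []
    for line in text.splitlines():
        # Optional leading whitespace, then the word, then the rest of the line.
        match = re.match(r'\s*(\w+)(.*)', line)
        if match is None:
            # Empty or whitespace-only line (no word characters at the start).
            continue
        word, rest = match.groups()
        # Every run of digits after the word; other characters are ignored.
        numbers = [int(n) for n in re.findall(r'\d+', rest)]
        parsed.append((word, numbers))
    return parsed

inp = "\n   \nfirst_word  3 5 7 @  4\nsecond_word 4 5 67| 5 [\n\n"
print(parse_lines(inp))
# [('first_word', [3, 5, 7, 4]), ('second_word', [4, 5, 67, 5])]
```

Note that a line whose first non-whitespace character is not a word character is skipped entirely here, which may or may not be what you want for your real data.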

1 Comment

Your answer worked perfectly for the case I posted, but I had to modify the question since I acquired some new info on the string I'm parsing. Could you please take another look now? Or, I can accept the previous question and ask about the new points in another question
