How can I separate characters inside a string in python?

Question

I have data in a txt file and I need to separate a sentence from a value. Every line of the file has the form <Sentence> <number>. I need to read the sentence and the value into two different columns, but the sentences can contain digits, dots, and any other characters, since they are just arbitrary sentences. The numeric value, however, is always at the end of the line. For example:

This coffee is bad. -1

How can I do this in Python?

If the format is ...anything here... then ". ##", that will be fairly simple. But the end of the sentence is the key. Is it always a "." followed by space(s)?
– Shavk with a Hoon, Commented Jun 28, 2022 at 20:47
No, it's not always a dot followed by spaces. Sometimes the dot is forgotten, sometimes there are three dots, sometimes it's a comma, or anything else you might write, such as a parenthesis. The only thing that's always true is that the value is at the end of the line, separated from the sentence by 3 spaces.
– NicodemoXIII, Commented Jun 29, 2022 at 8:44

Shabble · Accepted Answer · 2022-06-28 20:49:14Z

1

If the line always follows the format <sentence or arbitrary text><space><number><end>, then something like:

sent, _, num = input_str.rpartition(' ')
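As a minimal sketch of how that one-liner might be applied to whole lines: `split_line` and the sample lines below are hypothetical, and the sketch assumes the trailing value is always an integer. Because `rpartition(' ')` cuts only at the last space, any extra separator spaces stay attached to the sentence and are trimmed afterwards.

```python
def split_line(line):
    """Split '<sentence> <number>' at the last space (hypothetical helper)."""
    # Drop the trailing newline/whitespace first so the number is the last token.
    sentence, _, number = line.rstrip().rpartition(' ')
    # Only the last space is consumed by rpartition; trim any leftover
    # separator spaces from the end of the sentence.
    return sentence.rstrip(), int(number)


# Made-up sample lines, including one with three spaces before the value.
lines = ["This coffee is bad.   -1", "I paid 3.50 for it... 2"]
rows = [split_line(line) for line in lines]
print(rows)
```

If the values can be non-integers, `int(number)` would need to be swapped for `float(number)` or left as a string.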


mozway · Accepted Answer · 2022-06-28 20:58:47Z

0

Here is a solution using pandas to load the file as a DataFrame with a regex separator:

import pandas as pd

df = pd.read_csv('file.csv', sep=r'\s(?=\S+$)', engine='python',
                 header=None, names=['sentence', 'value'])

Output:

              sentence  value
0  This coffee is bad.     -1
1        other example    123

You can then easily convert to lists:

df.to_dict('list')

Output:

{'sentence': ['This coffee is bad.', 'other example'],
 'value': [-1, 123]}

Used text input:

This coffee is bad. -1
other example 123

edited Jun 28, 2022 at 20:58

answered Jun 28, 2022 at 20:53

2 Comments

NicodemoXIII Over a year ago

This worked smoothly. Just to understand, how am I supposed to know that this separator works for multiple spaces?

mozway Over a year ago

I don't understand the question. This regex only works for two columns as described: a sentence followed by a single "word" or number at the end.
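To make the multiple-spaces question concrete, here is a small illustration using `re.split` directly, outside pandas. With `engine='python'`, pandas interprets a separator longer than one character as a regular expression, so the behaviour should be comparable, but that mapping is an assumption of this sketch. The `\s+` variant is not from the answer; it is shown only for comparison.

```python
import re

one_space = "This coffee is bad. -1"
three_spaces = "This coffee is bad.   -1"

# The answer's separator: one whitespace character that is followed only
# by non-whitespace characters up to the end of the line.
single = r'\s(?=\S+$)'
# Assumed variant (not from the answer): consume the whole whitespace run.
greedy = r'\s+(?=\S+$)'

print(re.split(single, one_space))
print(re.split(single, three_spaces))
print(re.split(greedy, three_spaces))
```

With three spaces, the original pattern still splits into two fields, but only the last space acts as the separator, so the sentence keeps two trailing spaces. The `\s+` variant removes the whole run.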

bekirbakar · Accepted Answer · 2022-06-28 21:10:53Z

0

There are many ways to do it.

A quick-and-dirty solution is as follows:

Run a regex pattern to extract the groups of digits, then take the last one as the second column.
Remove that match from the line and use what remains as the first column.

This code should give you an idea.

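import re

sample = "This coffee 5656 is bad. -134 -454"

# Find every integer, including an optional minus sign; the last one is the value.
result = re.findall(r'-?[0-9]+', sample)
second_column = result[-1]

# Cut at the last occurrence of the value only, then trim the leftover spaces.
first_column = sample.rpartition(second_column)[0].rstrip()

print(f'First Column: {first_column}')
print(f'Second Column: {second_column}')

Output

First Column: This coffee 5656 is bad. -134
Second Column: -454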
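Another way to approach the same idea, sketched here under the assumption that the value is always an optionally signed integer at the very end of the line, is to anchor a single pattern to the end of the line so that numbers inside the sentence are never touched. `LINE_RE` and `parse` are hypothetical names used only for this illustration.

```python
import re

# Sentence (as short as possible), then whitespace, then an optionally
# signed integer, then optional trailing whitespace up to the end.
LINE_RE = re.compile(r'^(.*?)\s+(-?\d+)\s*$')


def parse(line):
    """Return (sentence, value), or None if the line has no trailing integer."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


print(parse("This coffee 5656 is bad. -134 -454"))
print(parse("This coffee is bad.   -1"))
```

Returning `None` for lines that do not match is a design choice of this sketch; raising an exception may be preferable if every line is expected to be well formed.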
