4

I'm new to Python and would like to know how to tokenize strings based on a specified delimiter. For example, I would like to turn the string "brother's" into ["brother", "\s"], or the string "red/blue" into ["red", "blue"]. What would be the most appropriate way to do this? Thanks.

I would start with pydoc str and work from there. Commented Feb 5, 2014 at 6:17

3 Answers

2

You would use the split method:

>>> 'red/blue'.split('/')
['red', 'blue']
>>> "brother's".split("'")
['brother', 's']

1 Comment

Thanks. How about if I had something like "brothers'" with the single quote after "brothers" and I want it to be ['brother', '\'s']?
1

What you're looking for is the split method, which you call on a str object. For instance:

>>> brotherstring = "brother's"
>>> brotherstring.split("'")
['brother', 's']
>>> redbluestring = "red/blue"
>>> redbluestring.split("/")
['red', 'blue']

There are a few variants on split that behave differently: rsplit splits starting from the right-hand end of the string, and partition splits only once and also returns the separator. Read the documentation for str to find the one that works best for your purpose.
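As a rough illustration of how those variants differ, here is a minimal sketch. The string "red/green/blue" and the variable names are made up for this example and are not from the question.

```python
# Hypothetical sample string, chosen only to show the differences.
colors = "red/green/blue"

# split breaks the string at every occurrence of the delimiter.
all_parts = colors.split("/")          # ['red', 'green', 'blue']

# The optional maxsplit argument limits the number of splits, from the left.
first_split = colors.split("/", 1)     # ['red', 'green/blue']

# rsplit does the same, but counts the splits from the right.
last_split = colors.rsplit("/", 1)     # ['red/green', 'blue']

# partition splits once and returns a 3-tuple that includes the separator.
head, sep, tail = "brother's".partition("'")   # ('brother', "'", 's')

print(all_parts, first_split, last_split, (head, sep, tail))
```

Note that partition always returns three items. If the separator is not found, the second and third items are empty strings.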


1

Try this.

>>> strr = "brother's"
>>> strr.replace("'","\\'").split("\\")
['brother', "'s"]

>>> strrr = "red/blue"
>>> strrr.split('/')
['red', 'blue']

2 Comments

This is a great answer. It shows how to preserve punctuation, in the case that your punctuation is not your delimiter. Can reconstruct the original later, or clean further if the apostrophe is really unwanted.
@VISQL Thanks for the appreciation.
