
How do you replace only the single whitespace characters between words with '_' in Python?

For example:

Input:

09     Web Problem       Any problem has to do with the dept. web sites
12     SW Help           Questions about installed SW (hotline support)

Output:

09     Web_Problem       Any_problem_has_to_do_with_the_dept._web_sites
12     SW_Help           Questions_about_installed_SW_(hotline_support)

thanks!

  • s.replace(' ', '_') would be the answer, but you seem to have a requirement not to put underscores around numbers. Commented Feb 4, 2012 at 4:47
  • No, now that the question is formatted correctly (and given that the text says "single white space"), the requirement is to replace only white-space characters that occur on their own. In other words, don't touch white-space groups of more than one character. Commented Feb 4, 2012 at 4:59

3 Answers

3

You can use regular expressions to do this:

>>> import re
>>> x = '09     Web Problem       Any problem has to do with the dept. web sites'
>>> print(re.sub(r'([^\s])\s([^\s])', r'\1_\2', x))
09     Web_Problem       Any_problem_has_to_do_with_the_dept._web_sites

The search pattern is (1) any non-white-space character, followed by (2) a single white-space character, followed by (3) another non-white-space character.

Numbers 1 and 3 are captured so that they can be used in the replacement pattern. Number 2 is ignored and we put an underscore in instead.

This leaves the multi-white-space areas alone and simply changes the singly-occurring white-space characters into underscores, which is what I think you were asking for.
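One edge case worth noting: because the pattern above consumes the character on each side of the space, it can miss a space when a word is only one character long (for example, 'a b c' becomes 'a_b c'). A minimal sketch of a variant using lookaround assertions, which inspect the neighbouring characters without consuming them; the function name is hypothetical and only for illustration:

```python
import re

def underscore_single_spaces(text):
    """Replace each lone white-space character that sits between two
    non-white-space characters with an underscore."""
    # (?<=\S) requires a non-space before, (?=\S) requires one after;
    # neither assertion consumes a character, so adjacent matches work.
    return re.sub(r'(?<=\S)\s(?=\S)', '_', text)

print(underscore_single_spaces('a b c'))
```

Runs of two or more white-space characters are still left alone, since no character inside such a run has non-white-space on both sides.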


1 Comment

And thank you for explaining the regex. Would you mind recommending some good sites or books for learning regex as well?
1

If you are trying to maintain the space between the first number and text then:

Updated:

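import re

# yourstring is a single input line, e.g. one line read from the file.
# Groups: (1) number, (2) spaces, (3) subject, (4) two or more spaces, (5) description
match = re.match(r"^([0-9]+)( +)(.*?)( {2,})(.*)", yourstring)
output = match.group(1) + match.group(2) + match.group(3).replace(' ', '_') + match.group(4) + match.group(5).replace(' ', '_')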
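Note that re.match returns None when a line does not fit the pattern, in which case calling match.group would raise an AttributeError. A minimal sketch of the same idea wrapped in a function that leaves such lines untouched; the names LINE_PATTERN and underscore_columns are hypothetical, and the pattern assumes the three-column layout shown in the question:

```python
import re

# Assumed layout: number, spaces, subject, two or more spaces, description.
LINE_PATTERN = re.compile(r"^([0-9]+)( +)(.*?)( {2,})(.*)")

def underscore_columns(line):
    match = LINE_PATTERN.match(line)
    if match is None:
        # Lines that do not fit the assumed layout are returned unchanged.
        return line
    number, gap1, subject, gap2, description = match.groups()
    converted = (number + gap1 + subject.replace(' ', '_')
                 + gap2 + description.replace(' ', '_'))
    # Keep anything the pattern did not cover, such as a trailing newline.
    return converted + line[match.end():]
```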

Comments

0

To read the file in, you'll want to use the open() function along with a loop (a for loop would make good sense) to read each line.

To break the line into pieces, you can use the nifty string slice syntax. See http://docs.python.org/tutorial/introduction.html#strings for some examples on slices.

To do the actual replacement of spaces with _, the replace method is what you want.

'abc def'.replace(' ', '_')

See http://docs.python.org/library/stdtypes.html#string-methods for more useful string methods.
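Putting those three pieces together, here is a rough sketch rather than a finished solution. It assumes the columns in the sample input start at fixed character offsets 0, 7 and 25 (counted from the example in the question), and the function names and the path argument are hypothetical. Each slice is stripped of its trailing padding before replace is applied, so the spacing between columns is preserved:

```python
def underscore_fields(line, starts=(0, 7, 25)):
    """Replace spaces inside each fixed-width column, keeping the padding."""
    bounds = list(starts) + [len(line)]
    pieces = []
    for begin, end in zip(bounds, bounds[1:]):
        field = line[begin:end]              # string slice for one column
        text = field.rstrip(' ')             # column content without padding
        padding = field[len(text):]          # trailing spaces, kept as-is
        pieces.append(text.replace(' ', '_') + padding)
    return ''.join(pieces)

def convert_file(path):
    # Read the file line by line and print each converted line.
    with open(path) as handle:
        for line in handle:
            print(underscore_fields(line.rstrip('\n')))
```

If the real file uses different column positions, the starts tuple would need to be adjusted to match.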

Since you are just getting started with Python, I highly recommend the following tutorial: http://learnpythonthehardway.org/ Work through the whole thing and you should have a solid foundation to build on.

4 Comments

The caveat to this is that he would need to make sure the space between the number, subject and text was a tab and not a space.
If you also need to break your input file up, this is starting to look a lot like homework. Take a look at open() to read the file in and string slices to break each line apart. See docs.python.org/tutorial/introduction.html#strings for some examples on slices.
Thanks for your quick response! I had actually formatted my example incorrectly at the time you answered my question. Could you take another look?
Yeah, I'll add the information on open and string slices to my answer above. Keep in mind that I'm trying to give you some direction now rather than the exact code to cut and paste.
