22

In C++, what's an easy way to turn this std::string:

\t\tHELLO WORLD\r\nHELLO\t\nWORLD     \t

Into:

HELLOWORLDHELLOWORLD
  • @tomislav-maric I don't think it's a duplicate of that post, the OP there was working with a cin stream, and thus using iostream functions. Commented Jan 9, 2013 at 10:29
  • similar but not exact duplicate, so not voting to close. Commented Jan 9, 2013 at 10:29
  • @CashCow I checked it again.. you are right, sorry about that. Commented Jan 9, 2013 at 10:36
  • See also Remove spaces from std::string in C++ Commented Feb 26, 2014 at 2:17

6 Answers

36

Simple combination of std::remove_if and std::string::erase.

A version that is not totally safe (passing a negative char value to ::isspace is undefined behavior):

s.erase( std::remove_if( s.begin(), s.end(), ::isspace ), s.end() );

For a safer version, replace ::isspace with

std::bind( std::isspace<char>, std::placeholders::_1, std::locale::classic() )

(Include the relevant headers: <algorithm>, <cctype>, <functional>, <locale>, and <string>.)

For a version that works with other character types, replace <char> with <ElementType>, or whatever your templated character type is. You can of course also replace the locale with a different one. If you do, take care to avoid the inefficiency of recreating the locale facet repeatedly.

In C++11 you can make the safer version into a lambda with:

[]( char ch ) { return std::isspace<char>( ch, std::locale::classic() ); }
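
Put together, the locale-based variant might look like the sketch below. The function names strip_whitespace and example_usage are made up for illustration and do not appear in the answer; the sketch assumes C++11 and takes the string by value so the caller's copy is left untouched.

```cpp
#include <algorithm>
#include <cassert>
#include <locale>
#include <string>

// Hypothetical helper name. Removes every character that the classic "C"
// locale classifies as whitespace, using the erase-remove idiom.
std::string strip_whitespace(std::string s) {
    // Grab the locale once instead of once per character.
    const std::locale& loc = std::locale::classic();
    s.erase(std::remove_if(s.begin(), s.end(),
                           [&loc](char ch) { return std::isspace(ch, loc); }),
            s.end());
    return s;
}

// Usage sketch with the string from the question.
void example_usage() {
    const std::string input = "\t\tHELLO WORLD\r\nHELLO\t\nWORLD     \t";
    assert(strip_whitespace(input) == "HELLOWORLDHELLOWORLD");
}
```

The two-argument std::isspace comes from <locale>, so the char is never converted to int on the way in.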

18 Comments

@chris ::isspace includes the new line as well: cplusplus.com/reference/cctype/isspace
isspace has UB for all characters except those in the basic something something. C99 §7.4/1.
C++98 delegates the behaviour of the C standard library to C89, and C++11 delegates the behaviour of the C standard library to C99.
My apologies. I got slightly confused about the true nature of the problem :) I knew using isspace was wrong, but I got confused as to the why. The why is related to isspace taking an int and to char being signed. Here is a small program that explains the issue stacked-crooked.com/view?id=817f92f4a2482e5da0b7533285e53edb.
(And note how this is not about multibyte encodings; any byte with a value higher than 0x7F in the source, regardless of encoding will trigger this issue; even single byte encodings like Latin-1 or Windows-1252 will cause it. Only 7-bit encodings like ASCII work fine)
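
For readers who want to keep the <cctype> function, a common workaround for the signed-char problem described in these comments is to convert each character to unsigned char first. The sketch below assumes C++11; remove_spaces_cctype and example_usage are hypothetical names, and the result for bytes above 0x7F depends on the current C locale (the default "C" locale treats none of them as whitespace).

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper name. The cast guarantees std::isspace receives a
// value in the range of unsigned char, never a negative int.
void remove_spaces_cctype(std::string& s) {
    s.erase(std::remove_if(s.begin(), s.end(),
                           [](char ch) {
                               return std::isspace(static_cast<unsigned char>(ch)) != 0;
                           }),
            s.end());
}

// Usage sketch: 0xE9 is a Latin-1 byte above 0x7F; it is passed through
// unchanged under the default "C" locale.
void example_usage() {
    std::string s = "caf\xE9 \t au\nlait";
    remove_spaces_cctype(s);
    assert(s == "caf\xE9" "aulait");
}
```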
13

In C++03:

struct RemoveDelimiter
{
  bool operator()(char c) const
  {
    return (c =='\r' || c =='\t' || c == ' ' || c == '\n');
  }
};

std::string s("\t\tHELLO WORLD\r\nHELLO\t\nWORLD     \t");
s.erase( std::remove_if( s.begin(), s.end(), RemoveDelimiter()), s.end());

Or use a C++11 lambda:

s.erase(std::remove_if( s.begin(), s.end(), 
     [](char c){ return (c =='\r' || c =='\t' || c == ' ' || c == '\n');}), s.end() );

P.S. This uses the erase-remove idiom.


4

In C++11, with <regex>:

std::string input = "\t\tHELLO WORLD\r\nHELLO\t\nWORLD     \t";

auto rs = std::regex_replace(input,std::regex("\\s+"), "");

std::cout << rs << std::endl;

/tmp ❮❮❮ ./play

HELLOWORLDHELLOWORLD
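
As a self-contained sketch of the same approach: the pattern \s+ matches each run of whitespace, and std::regex_replace substitutes the empty string for every match. The names strip_with_regex and example_usage are hypothetical; making the regex static is a design choice here so that it is compiled once rather than on every call.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper name. Returns a copy of input with every run of
// whitespace removed.
std::string strip_with_regex(const std::string& input) {
    static const std::regex whitespace("\\s+");
    return std::regex_replace(input, whitespace, "");
}

// Usage sketch with the string from the question.
void example_usage() {
    const std::string input = "\t\tHELLO WORLD\r\nHELLO\t\nWORLD     \t";
    assert(strip_with_regex(input) == "HELLOWORLDHELLOWORLD");
}
```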


4

In C++11 you can use a lambda rather than using std::bind:

str.erase(
    std::remove_if(str.begin(), str.end(), 
        [](char c) -> bool
        { 
            return std::isspace<char>(c, std::locale::classic()); 
        }), 
    str.end());


3

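You could use Boost.Algorithm's erase_all. Note that the call below removes only the space character; tabs and newlines would each need their own call.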

#include <boost/algorithm/string/erase.hpp>
#include <iostream>
#include <string>

int main()
{
    std::string s = "Hello World!";
    // or the more expensive one-liner in case your string is const
    // std::cout << boost::algorithm::erase_all_copy(s, " ") << "\n";
    boost::algorithm::erase_all(s, " "); 
    std::cout << s << "\n";
}

NOTE: as mentioned in the comments, trim_copy (or its cousins trim_left_copy and trim_right_copy) only removes whitespace from the beginning and end of a string.

1 Comment

I saw some solutions that used Boost, but I'm not after a trim function, trimming I believe is doing something like XX___XX_ -> XX_XX whereas I want the final solution to be XXXX.
2

Stepping through it character by character and using string::erase() should work fine.

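void removeWhitespace(std::string& str) {
    // Only advance when nothing was erased, so the character that slides
    // into position i after an erase is checked too.
    for (size_t i = 0; i < str.length(); ) {
        if (str[i] == ' ' || str[i] == '\n' || str[i] == '\t' || str[i] == '\r') {
            str.erase(i, 1);
        } else {
            ++i;
        }
    }
}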
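
Each call to erase(i, 1) shifts the rest of the string down by one, so a loop like the one above can take quadratic time on long inputs. A possible single-pass alternative, sketched with separate read and write positions, is shown below. The names removeWhitespaceLinear and example_usage are hypothetical, and the sketch treats ' ', '\n', '\t' and '\r' as whitespace.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical name. Copies each kept character down to the write position,
// then truncates the string once at the end.
void removeWhitespaceLinear(std::string& str) {
    std::size_t write = 0;
    for (std::size_t read = 0; read < str.size(); ++read) {
        const char c = str[read];
        if (c != ' ' && c != '\n' && c != '\t' && c != '\r') {
            str[write++] = c;
        }
    }
    str.resize(write);
}

// Usage sketch: adjacent spaces are handled because every character is
// visited exactly once.
void example_usage() {
    std::string s = "a  b \t\r\nc";
    removeWhitespaceLinear(s);
    assert(s == "abc");
}
```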

2 Comments

Doesn't work when there are adjacent space characters. The first one is erased, moving the second one down to position i. Then you go around the loop, increment i, and never check the second one.
You're right. Fixed it.
