Remove empty lines from a string in Ruby

Question

I've gone through other similar questions, and they don't seem to explain my problem.

My output right now looks like this. I would like to remove the empty lines from the string in Ruby:

#    

CIRRUS LADIES NIGHT with DJ ROHIT

4th of JULY Party ft. DJ JASMEET @ I-Bar

Submerge Deep @ Pebble | Brute Force (Tuhin Mehta) | DJ Arpan (Opening)

Champagne Showers - DJs Panic & Nyth @ Blue Waves

THURSDAY PAST AND PRESENT @ Hint

and I want my output to be like this,

CIRRUS LADIES NIGHT with DJ ROHIT
4th of JULY Party ft. DJ JASMEET @ I-Bar
Submerge Deep @ Pebble | Brute Force (Tuhin Mehta) | DJ Arpan (Opening)
Champagne Showers - DJs Panic & Nyth @ Blue Waves
THURSDAY PAST AND PRESENT @ Hint

I've tried gsub(/^$\n/, ''), gsub(/\n/, ''), squeeze("\n") and delete!("\n"), all to no avail.

Also, I forgot to mention that my string starts with a blank line; the # above denotes a blank line before the first line, in case that changes anything.

Here is my string.inspect output, as requested. The content of the string has changed, though the issue is still the same.

string.inspect :

"\n\n\t\t\t\t\t\t\t\t\t"
"Tricky Tuesdays with DJ John @ Blend"
"\n\n\t\t\t\t\t\t\t\t\t"
"Bladder Buster Challenge with DJ Sean @ Star Rock"
"\n\n\t\t\t\t\t\t\t\t\t"
"Classic Rock Tuesday @ 10D - Chennai"
"\n\n\t\t\t\t\t\t\t\t\t"
"Vodka Night with DJ John @ Blend"
"\n\n\t\t\t\t\t\t\t\t\t"
"\"BOLLYWOOD WEDNESDAYS\" with DJ D Nash @ Candy Club"
"\n\n\t\t\t\t\t\t\t\t\t"
"RE - LAUNCH WEDNESDAY LADIES NIGHT @ ZODIAC"
"\n\n\t\t\t\t\t\t\t\t\t"
"Ladies Night @ 10 D - Chennai"
"\n\n\t\t\t\t\t\t\t\t\t"
"Wednesday Mayhem @ Dublin"
"\n\n\t\t\t\t\t\t\t\t\t"

Can you replace "\n\n" --> "\n"? Or even better "\n+" --> "\n"?
– nhahtdh, Commented Jul 2, 2012 at 14:19
Yeah, I've tried gsub("\n+","") and gsub(/\n\n/,"\n"); they don't work.
– arvindravi, Commented Jul 2, 2012 at 14:27
@arvindravi Please post the result of a .inspect on your string.
– Phrogz, Commented Jul 2, 2012 at 15:35
The .inspect result you posted doesn't look like the usual result of inspecting a single string. Can you post the result of calling .class on your object?
– ebeland, Commented Jul 3, 2012 at 14:19
@ebeland Thank you for pointing it out. Sorry for the trouble, everyone. I can't believe I overlooked that they are different strings! My bad!
– arvindravi, Commented Jul 3, 2012 at 15:35
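
The comments above point at the real cause: the object was a collection of separate strings, not one string, so newline substitutions had nothing to collapse. A minimal sketch of handling that case, assuming the scraper yields an Array of String fragments shaped like the .inspect output (the fragments variable and the tidy_fragments helper are hypothetical names, not from the original post):

```ruby
# Hypothetical input: separate strings, as the .inspect output suggests.
fragments = [
  "\n\n\t\t\t\t\t\t\t\t\t",
  "Tricky Tuesdays with DJ John @ Blend",
  "\n\n\t\t\t\t\t\t\t\t\t",
  "Vodka Night with DJ John @ Blend",
  "\n\n\t\t\t\t\t\t\t\t\t"
]

# Strip each fragment, drop the ones that were only whitespace,
# and join the survivors with a single newline.
def tidy_fragments(fragments)
  fragments.map(&:strip).reject(&:empty?).join("\n")
end

puts tidy_fragments(fragments)
```

If the object turns out to be something other than an Array of Strings (for example a node set from an HTML parser), the same idea should apply after converting each element to a String first.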

Matheus Moreira · Accepted Answer · 2012-07-02 15:01:34Z

5

Here's my solution:

text.gsub(/\n+|\r+/, "\n").squeeze("\n").strip
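
To see what each step of that one-liner contributes, here is a small sketch on a made-up sample string (the sample text is an assumption for illustration only):

```ruby
# Sample input mixing Unix and Windows line endings (made up for illustration).
text = "\n\nfoo\n\n\nbar\r\n\r\nbaz\n"

# 1. gsub replaces each run of "\n" and each run of "\r" with one "\n".
#    For "\r\n\r\n" input this still leaves several "\n" in a row.
# 2. squeeze("\n") collapses those leftover runs into a single "\n".
# 3. strip removes leading and trailing whitespace, blank lines included.
cleaned = text.gsub(/\n+|\r+/, "\n").squeeze("\n").strip

p cleaned
```

Note that strip also removes any leading indentation on the first line and trailing spaces on the last one, which may or may not be wanted.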

answered Jul 2, 2012 at 15:01

Tom De Leu · Accepted Answer · 2012-07-03 19:03:41Z

3

This removes all consecutive empty lines:

result = s.squeeze("\r\n").gsub(/(\r\n)+/, "\r\n")

Or, as a command-line option without Ruby:

grep -v "^$" <file>
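
Back on the Ruby one-liner: it relies on squeeze treating its argument as a set of characters rather than a sequence. A small sketch of that behavior (the collapse_blank_lines wrapper is a hypothetical name added for illustration):

```ruby
# squeeze("\r\n") collapses runs of "\r" and runs of "\n" independently;
# it does not touch "\r\n\r\n", because no character repeats back to back.
# The gsub then collapses repeated "\r\n" pairs.
def collapse_blank_lines(s)
  s.squeeze("\r\n").gsub(/(\r\n)+/, "\r\n")
end

p collapse_blank_lines("a\n\n\nb")        # Unix line endings
p collapse_blank_lines("a\r\r\rb")        # classic Mac line endings
p collapse_blank_lines("a\r\n\r\n\r\nb")  # Windows line endings
```

Each call should leave exactly one line ending between a and b, in the original style.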

edited Jul 3, 2012 at 19:03

answered Jul 2, 2012 at 15:02

3 Comments

Phrogz Over a year ago

"a\r\n\r\n\r\nb".squeeze("\r\n").gsub("\r\n\r\n","\r\n") #=> "a\r\n\r\nb" This is because gsub does not rescan the text it has just substituted when it looks for the next match. You'd need something like result = s.squeeze(...).tap{ |s2| :go while s2.gsub!("\r\n\r\n","\r\n") }

Tom De Leu Over a year ago

ok interesting, did not know that. So you actually meant my solution does not handle Windows line endings.

Phrogz Over a year ago

Whoops! Yes, logic inversion :)

eds · Accepted Answer · 2013-02-28 04:39:21Z

2

First of all, your code removes all newlines, not just the blank ones - that doesn't sound like what you want.

Second, operating systems have historically disagreed on how to represent newlines: old Macs used \r, Linux and OS X use \n, and Windows uses the combination \r\n. So you really want to replace each run of consecutive \r and \n characters (which indicates a blank line in between) with a single \n.
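
A minimal sketch of that idea, using a character class that covers both \r and \n (the normalize_newlines name is hypothetical):

```ruby
# Replace any run of "\r" and/or "\n" characters with a single "\n".
# This also converts Windows and classic Mac endings to Unix ones.
def normalize_newlines(text)
  text.gsub(/[\r\n]+/, "\n")
end

p normalize_newlines("a\n\n\nb")      # Unix
p normalize_newlines("a\r\rb")        # classic Mac
p normalize_newlines("a\r\n\r\nb")    # Windows
```

Two caveats under this sketch: lines that contain only spaces or tabs are not treated as blank, and a string that starts with blank lines still keeps one leading "\n".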

edited Feb 28, 2013 at 4:39

answered Jul 2, 2012 at 14:34

5 Comments

arvindravi Over a year ago

OS: Linux (Fedora 17). I tried gsub /\r+/,"\r" and gsub /\n+/,"\n" and still no luck.

Matheus Moreira Over a year ago

Aren't line endings automatically converted to \n when the file is opened in text mode?

arvindravi Over a year ago

@MatheusMoreira It's not a file. I'm building a scraper that generates that string, which changes according to the page, so I'm just trying to get rid of those empty/blank lines.

Matheus Moreira Over a year ago

@arvindravi, are you sure you're dealing with \n line breaks, and not something like <br /> tags?

eds Over a year ago

@arvindravi Windows uses \r\n for each newline, I believe, so the pattern has to accommodate any of these possibilities: gsub /[\r\n]+/,"\n" would be fine, as it covers all three ways newlines are represented.

Victor Moroz · Accepted Answer · 2012-07-02 15:54:16Z

1

.split(/\n/).reject{ |l| l.chomp.empty? }.join("\n")

for Unix style only:

.split(/\n/).reject(&:empty?).join("\n")

removes whitespace lines too (Unix, Rails method):

.split(/\n/).reject(&:blank?).join("\n")
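
Outside Rails, blank? is not available on String. A rough plain-Ruby stand-in, assuming "blank" means nothing is left after stripping whitespace (the remove_blank_lines_by_split name and the sample text are made up for illustration):

```ruby
# Split on "\n", drop lines that are empty or whitespace-only,
# and join the remaining lines back together.
def remove_blank_lines_by_split(text)
  text.split(/\n/).reject { |line| line.strip.empty? }.join("\n")
end

sample = "\nCIRRUS LADIES NIGHT with DJ ROHIT\n\n \t\n4th of JULY Party ft. DJ JASMEET @ I-Bar\n\n"
puts remove_blank_lines_by_split(sample)
```

Because join only puts "\n" between lines, the result has no trailing newline even if the input had one.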

edited Jul 2, 2012 at 15:54

answered Jul 2, 2012 at 15:46

Phrogz · Accepted Answer · 2012-07-02 16:03:25Z

Here's a single regex that removes all blank lines, including those at the start or end of the file, including lines that contain only spaces or tabs, and allowing for all three forms of line ending markers (\r\n, \n, and \r):

def remove_blank_lines( str, line_ending="\n" )
  str.gsub(/(?<=\A|#{line_ending})[ \t]*(?:#{line_ending}|\z)/,'')
end

Tested:

[ "\r\n", "\n", "\r" ].each do |marker|
  puts '='*70, "Lines ending with: #{marker.inspect}", '='*70
  [ "", " ", "\t", " \t", "\t " ].each do |whitespace|
    0.upto(2) do |lines|
      blank_lines = "#{whitespace}#{marker*lines}"
      s = "#{marker*lines}a#{marker*lines}b#{blank_lines}c#{blank_lines}"
      tight = remove_blank_lines(s, marker)
      puts "%43s -> %s" % [s.inspect, tight.inspect]
    end
  end
end

#=> ======================================================================
#=> Lines ending with: "\r\n"
#=> ======================================================================
#=>                                       "abc" -> "abc"
#=>                       "\r\na\r\nb\r\nc\r\n" -> "a\r\nb\r\nc\r\n"
#=>       "\r\n\r\na\r\n\r\nb\r\n\r\nc\r\n\r\n" -> "a\r\nb\r\nc\r\n"
#=>                                     "ab c " -> "ab c "
#=>                     "\r\na\r\nb \r\nc \r\n" -> "a\r\nb \r\nc \r\n"
#=>     "\r\n\r\na\r\n\r\nb \r\n\r\nc \r\n\r\n" -> "a\r\nb \r\nc \r\n"
#=>                                   "ab\tc\t" -> "ab\tc\t"
#=>                   "\r\na\r\nb\t\r\nc\t\r\n" -> "a\r\nb\t\r\nc\t\r\n"
#=>   "\r\n\r\na\r\n\r\nb\t\r\n\r\nc\t\r\n\r\n" -> "a\r\nb\t\r\nc\t\r\n"
#=>                                 "ab \tc \t" -> "ab \tc \t"
#=>                 "\r\na\r\nb \t\r\nc \t\r\n" -> "a\r\nb \t\r\nc \t\r\n"
#=> "\r\n\r\na\r\n\r\nb \t\r\n\r\nc \t\r\n\r\n" -> "a\r\nb \t\r\nc \t\r\n"
#=>                                 "ab\t c\t " -> "ab\t c\t "
#=>                 "\r\na\r\nb\t \r\nc\t \r\n" -> "a\r\nb\t \r\nc\t \r\n"
#=> "\r\n\r\na\r\n\r\nb\t \r\n\r\nc\t \r\n\r\n" -> "a\r\nb\t \r\nc\t \r\n"
#=> ======================================================================
#=> Lines ending with: "\n"
#=> ======================================================================
#=>                                       "abc" -> "abc"
#=>                               "\na\nb\nc\n" -> "a\nb\nc\n"
#=>                       "\n\na\n\nb\n\nc\n\n" -> "a\nb\nc\n"
#=>                                     "ab c " -> "ab c "
#=>                             "\na\nb \nc \n" -> "a\nb \nc \n"
#=>                     "\n\na\n\nb \n\nc \n\n" -> "a\nb \nc \n"
#=>                                   "ab\tc\t" -> "ab\tc\t"
#=>                           "\na\nb\t\nc\t\n" -> "a\nb\t\nc\t\n"
#=>                   "\n\na\n\nb\t\n\nc\t\n\n" -> "a\nb\t\nc\t\n"
#=>                                 "ab \tc \t" -> "ab \tc \t"
#=>                         "\na\nb \t\nc \t\n" -> "a\nb \t\nc \t\n"
#=>                 "\n\na\n\nb \t\n\nc \t\n\n" -> "a\nb \t\nc \t\n"
#=>                                 "ab\t c\t " -> "ab\t c\t "
#=>                         "\na\nb\t \nc\t \n" -> "a\nb\t \nc\t \n"
#=>                 "\n\na\n\nb\t \n\nc\t \n\n" -> "a\nb\t \nc\t \n"
#=> ======================================================================
#=> Lines ending with: "\r"
#=> ======================================================================
#=>                                       "abc" -> "abc"
#=>                               "\ra\rb\rc\r" -> "a\rb\rc\r"
#=>                       "\r\ra\r\rb\r\rc\r\r" -> "a\rb\rc\r"
#=>                                     "ab c " -> "ab c "
#=>                             "\ra\rb \rc \r" -> "a\rb \rc \r"
#=>                     "\r\ra\r\rb \r\rc \r\r" -> "a\rb \rc \r"
#=>                                   "ab\tc\t" -> "ab\tc\t"
#=>                           "\ra\rb\t\rc\t\r" -> "a\rb\t\rc\t\r"
#=>                   "\r\ra\r\rb\t\r\rc\t\r\r" -> "a\rb\t\rc\t\r"
#=>                                 "ab \tc \t" -> "ab \tc \t"
#=>                         "\ra\rb \t\rc \t\r" -> "a\rb \t\rc \t\r"
#=>                 "\r\ra\r\rb \t\r\rc \t\r\r" -> "a\rb \t\rc \t\r"
#=>                                 "ab\t c\t " -> "ab\t c\t "
#=>                         "\ra\rb\t \rc\t \r" -> "a\rb\t \rc\t \r"
#=>                 "\r\ra\r\rb\t \r\rc\t \r\r" -> "a\rb\t \rc\t \r"

stema · Accepted Answer · 2012-07-02 14:27:13Z

0

Try

/^\n/

and replace with the empty string.

Are you sure your newline character is only \n? If not, try

/^\r?\n/

to allow also the linebreak sequence \r\n.
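
A quick sketch of how that second pattern behaves (drop_empty_lines is a hypothetical wrapper name). In Ruby, ^ matches at the start of every line, not only at the start of the string, so no extra flag is needed:

```ruby
# Remove every line that consists of nothing but its line break
# ("\n" or "\r\n"). Lines holding spaces or tabs are left alone.
def drop_empty_lines(text)
  text.gsub(/^\r?\n/, "")
end

p drop_empty_lines("\nfoo\n\n\nbar\n")   # leading and repeated blank lines
p drop_empty_lines("a\r\n\r\nb")         # Windows line endings
p drop_empty_lines("a\n \nb")            # whitespace-only line is kept
```

If whitespace-only lines should go too, one possible variation (an assumption, not part of the answer) is /^[ \t]*\r?\n/.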

answered Jul 2, 2012 at 14:27

Phrogz · Accepted Answer · 2012-07-02 15:34:45Z

Here's an ugly hack based on @Tom's answer:

result = s.squeeze("\r\n").tap{ |s2| :go while s2.gsub!("\r\n\r\n","\r\n") }

It supports DOS (\r\n), Unix (\n), and MacOS 9- (\r) line breaks. Tested:

[ "\r\n", "\n", "\r" ].each do |marker|
  1.upto(5) do |lines|
    s = "a#{marker*lines}b"
    tight = s.squeeze("\r\n").tap{ |s2| :go while s2.gsub!("\r\n\r\n","\r\n") }
    puts "%24s -> %s" % [s.inspect, tight.inspect]
  end
end
#=>                 "a\r\nb" -> "a\r\nb"
#=>             "a\r\n\r\nb" -> "a\r\nb"
#=>         "a\r\n\r\n\r\nb" -> "a\r\nb"
#=>     "a\r\n\r\n\r\n\r\nb" -> "a\r\nb"
#=> "a\r\n\r\n\r\n\r\n\r\nb" -> "a\r\nb"
#=>                   "a\nb" -> "a\nb"
#=>                 "a\n\nb" -> "a\nb"
#=>               "a\n\n\nb" -> "a\nb"
#=>             "a\n\n\n\nb" -> "a\nb"
#=>           "a\n\n\n\n\nb" -> "a\nb"
#=>                   "a\rb" -> "a\rb"
#=>                 "a\r\rb" -> "a\rb"
#=>               "a\r\r\rb" -> "a\rb"
#=>             "a\r\r\r\rb" -> "a\rb"
#=>           "a\r\r\r\r\rb" -> "a\rb"

Note that this assumes that your blank lines are truly blank and do not have any whitespace on them. If that is not the case, you could do a pre-pass with s.gsub(/^[ \t]+$/,'')
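
A sketch of the pre-pass combined with the hack (squeeze_blank_lines is a hypothetical name). One caveat worth flagging: in Ruby, ^ and $ only treat \n as a line boundary, so as written the pre-pass is only expected to help with Unix line endings:

```ruby
def squeeze_blank_lines(s)
  # Pre-pass: empty out lines that hold only spaces or tabs.
  emptied = s.gsub(/^[ \t]+$/, "")
  # Then collapse runs of line endings, as in the answer above.
  emptied.squeeze("\r\n").tap { |s2| :go while s2.gsub!("\r\n\r\n", "\r\n") }
end

p squeeze_blank_lines("a\n \t\n\nb")   # whitespace-only line, Unix endings
p squeeze_blank_lines("a\r\n \r\nb")   # same idea with Windows endings
```

The first call should collapse to a single line break between a and b. The second is left unchanged, because $ does not match before "\r".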

George Shaw · Accepted Answer · 2015-07-27 20:11:54Z

0

This will do it: .gsub(/(\n\s*\n)+/, "\n")

and replace \n in the regex with [\n\r] if needed.
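
A small sketch of the \n-based pattern in action (collapse_blank_runs is a hypothetical wrapper name). Since \s matches spaces, tabs, \r and \n, whitespace-only lines between two line breaks are swallowed as well:

```ruby
# Collapse any stretch of blank or whitespace-only lines into one "\n".
def collapse_blank_runs(text)
  text.gsub(/(\n\s*\n)+/, "\n")
end

p collapse_blank_runs("a\n\n\nb\n  \n\nc")  # empty and whitespace-only lines
p collapse_blank_runs("a\n\n  b")           # indentation of the next line survives
p collapse_blank_runs("\n\na")              # leading blank lines
```

The last call still leaves one leading "\n", so a string that starts with blank lines may need an extra lstrip or similar, depending on what is wanted.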

answered Jul 27, 2015 at 20:11
