3

I've gone through other similar questions and they don't seem to explain my problem.

My output right now looks like this. I would like to remove the empty lines from the string in Ruby:

#    

CIRRUS LADIES NIGHT with DJ ROHIT

4th of JULY Party ft. DJ JASMEET @ I-Bar

Submerge Deep @ Pebble | Brute Force (Tuhin Mehta) | DJ Arpan (Opening)

Champagne Showers - DJs Panic & Nyth @ Blue Waves

THURSDAY PAST AND PRESENT @ Hint

I want my output to look like this:

CIRRUS LADIES NIGHT with DJ ROHIT
4th of JULY Party ft. DJ JASMEET @ I-Bar
Submerge Deep @ Pebble | Brute Force (Tuhin Mehta) | DJ Arpan (Opening)
Champagne Showers - DJs Panic & Nyth @ Blue Waves
THURSDAY PAST AND PRESENT @ Hint

I've tried gsub(/^$\n/, ''), gsub(/\n/, ''), squeeze("\n") and delete!("\n"), all to no avail.

Also, I forgot to mention that my string starts with a blank line. The # above denotes that blank line before the first line, in case that changes anything.

Here is the inspect output of my string, as requested. The content of the string has changed, but the issue is still the same.

string.inspect :

"\n\n\t\t\t\t\t\t\t\t\t"
"Tricky Tuesdays with DJ John @ Blend"
"\n\n\t\t\t\t\t\t\t\t\t"
"Bladder Buster Challenge with DJ Sean @ Star Rock"
"\n\n\t\t\t\t\t\t\t\t\t"
"Classic Rock Tuesday @ 10D - Chennai"
"\n\n\t\t\t\t\t\t\t\t\t"
"Vodka Night with DJ John @ Blend"
"\n\n\t\t\t\t\t\t\t\t\t"
"\"BOLLYWOOD WEDNESDAYS\" with DJ D Nash @ Candy Club"
"\n\n\t\t\t\t\t\t\t\t\t"
"RE - LAUNCH WEDNESDAY LADIES NIGHT @ ZODIAC"
"\n\n\t\t\t\t\t\t\t\t\t"
"Ladies Night @ 10 D - Chennai"
"\n\n\t\t\t\t\t\t\t\t\t"
"Wednesday Mayhem @ Dublin"
"\n\n\t\t\t\t\t\t\t\t\t"
5
  • 3
    Can you replace "\n\n" --> "\n"? Or even better "\n+" --> "\n"? Commented Jul 2, 2012 at 14:19
  • Yeah, I've tried gsub("\n+","") and gsub(/\n\n/,"\n"); they don't work. Commented Jul 2, 2012 at 14:27
  • @arvindravi Please post the result of a .inspect on your string. Commented Jul 2, 2012 at 15:35
  • The .inspect result you posted doesn't look like the usual result of inspecting a single string. Can you post the result of calling .class on your object? Commented Jul 3, 2012 at 14:19
  • @ebeland Thank you for pointing it out. Sorry for the trouble, everyone; I've been so ignorant and stupid. I can't believe I overlooked that they are different strings! My bad! Commented Jul 3, 2012 at 15:35
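
The last two comments suggest the object was a collection of separate strings rather than one string. A minimal sketch of cleaning such a collection, assuming it is an Array (the `fragments` and `titles` names and the shortened sample data are hypothetical, modeled on the .inspect output above):

```ruby
# Hypothetical data shaped like the .inspect output in the question:
# an Array of separate strings, some of which contain only whitespace.
fragments = [
  "\n\n\t\t\t\t\t\t\t\t\t",
  "Tricky Tuesdays with DJ John @ Blend",
  "\n\n\t\t\t\t\t\t\t\t\t",
  "Vodka Night with DJ John @ Blend",
  "\n\n\t\t\t\t\t\t\t\t\t"
]

# Trim each fragment, then drop the ones that end up empty.
titles = fragments.map(&:strip).reject(&:empty?)

puts titles.join("\n")
```

If the object turns out to be something other than an Array (which is why .class was requested), the same map/reject idea should still apply to any Enumerable of strings.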

8 Answers 8

5

Here's my solution:

text.gsub(/\n+|\r+/, "\n").squeeze("\n").strip
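
A quick sketch of how that one-liner behaves on a single string with mixed line endings (the sample strings are made up for illustration). Note that a line holding only spaces or tabs is left alone, so this assumes the blank lines are truly empty:

```ruby
# Made-up sample with leading blank lines and mixed \r\n / \n endings.
text = "\n\nA\r\n\r\nB\n\n\nC\n"

tight = text.gsub(/\n+|\r+/, "\n").squeeze("\n").strip
p tight  # => "A\nB\nC"

# A whitespace-only line is not treated as blank by this approach.
with_spaces = "A\n \nB"
unchanged = with_spaces.gsub(/\n+|\r+/, "\n").squeeze("\n").strip
p unchanged  # => "A\n \nB"
```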

3

This removes all consecutive empty lines:

result = s.squeeze("\r\n").gsub(/(\r\n)+/, "\r\n")

or a commandline option without Ruby:

grep -v "^$" <file>

3 Comments

"a\r\n\r\n\r\nb".squeeze("\r\n").gsub("\r\n\r\n","\r\n") #=> "a\r\n\r\nb" This is because gsub does not include the replacement results when searching again. You'd need something like result = s.squeeze(...).tap{ |s2| :go while s2.gsub!("\r\n\r\n","\r\n") }
ok interesting, did not know that. So you actually meant my solution does not handle Windows line endings.
Whoops! Yes, logic inversion :)
2

First of all, your code removes all newlines, not just the blank ones - that doesn't sound like what you want.

Second, operating systems have historically disagreed on how to represent newlines: old Macs used \r, Linux and OS X use \n, and Windows uses the combination \r\n. So you really want to replace each run of consecutive \r and \n characters (which indicates a blank line in there) with a single \n.
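
As a rough illustration of that idea, the character class [\r\n]+ collapses any run of line-ending characters into one \n, whichever convention produced them (the sample strings below are invented for the demonstration):

```ruby
# Invented samples: Windows, Unix, and old Mac line endings with blank lines.
samples = ["a\r\n\r\nb", "a\n\n\nb", "a\r\rb"]

# Collapse every run of \r and/or \n into a single \n.
normalized = samples.map { |s| s.gsub(/[\r\n]+/, "\n") }

samples.zip(normalized).each do |before, after|
  puts "#{before.inspect} -> #{after.inspect}"
end
```

This also converts every line ending to \n, which may or may not be what you want if the original convention has to be preserved.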

5 Comments

OS - Linux - Fedora 17 I tried gsub /\r+/,"\r" and gsub /\n+/,"\n" and still no luck.
Aren't line endings automatically converted to \n when the file is opened in text mode?
@MatheusMoreira It's not a file. I'm building a scraper that generates that string, which changes according to the page, so I'm just trying to get rid of those empty/blank lines there.
@arvindravi, are you sure you're dealing with \n line breaks, and not something like <br /> tags?
@arvindravi Windows uses \r\n for each newline, I believe, so the pattern has to accommodate any of these possibilities: gsub /[\r\n]+/,"\n" would be fine, as it covers all three ways newlines are represented.
1

.split(/\n/).reject{ |l| l.chomp.empty? }.join("\n")

for Unix style only:

.split(/\n/).reject(&:empty?).join("\n")

removes whitespace lines too (Unix, Rails method):

.split(/\n/).reject(&:blank?).join("\n")
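
Since blank? comes from Rails (ActiveSupport), here is a small sketch of a plain-Ruby equivalent that uses strip.empty? instead; the sample string is made up:

```ruby
# Made-up sample: leading blank lines plus a line containing only tabs.
text = "\n\nA\n\t\t\nB\n"

# Split on newlines, drop lines that are empty once whitespace is stripped.
tight = text.split(/\n/).reject { |line| line.strip.empty? }.join("\n")

p tight  # => "A\nB"
```

Because strip also removes \r, the same expression should drop blank lines in \r\n text too, although the kept lines would still carry their trailing \r.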


1

Here's a single regex that removes all blank lines, including those at the start or end of the file, including lines that contain only spaces or tabs, and allowing for all three forms of line ending markers (\r\n, \n, and \r):

def remove_blank_lines( str, line_ending="\n" )
  str.gsub(/(?<=\A|#{line_ending})[ \t]*(?:#{line_ending}|\z)/,'')
end

Tested:

[ "\r\n", "\n", "\r" ].each do |marker|
  puts '='*70, "Lines ending with: #{marker.inspect}", '='*70
  [ "", " ", "\t", " \t", "\t " ].each do |whitespace|
    0.upto(2) do |lines|
      blank_lines = "#{whitespace}#{marker*lines}"
      s = "#{marker*lines}a#{marker*lines}b#{blank_lines}c#{blank_lines}"
      tight = remove_blank_lines(s, marker)
      puts "%43s -> %s" % [s.inspect, tight.inspect]
    end
  end
end

#=> ======================================================================
#=> Lines ending with: "\r\n"
#=> ======================================================================
#=>                                       "abc" -> "abc"
#=>                       "\r\na\r\nb\r\nc\r\n" -> "a\r\nb\r\nc\r\n"
#=>       "\r\n\r\na\r\n\r\nb\r\n\r\nc\r\n\r\n" -> "a\r\nb\r\nc\r\n"
#=>                                     "ab c " -> "ab c "
#=>                     "\r\na\r\nb \r\nc \r\n" -> "a\r\nb \r\nc \r\n"
#=>     "\r\n\r\na\r\n\r\nb \r\n\r\nc \r\n\r\n" -> "a\r\nb \r\nc \r\n"
#=>                                   "ab\tc\t" -> "ab\tc\t"
#=>                   "\r\na\r\nb\t\r\nc\t\r\n" -> "a\r\nb\t\r\nc\t\r\n"
#=>   "\r\n\r\na\r\n\r\nb\t\r\n\r\nc\t\r\n\r\n" -> "a\r\nb\t\r\nc\t\r\n"
#=>                                 "ab \tc \t" -> "ab \tc \t"
#=>                 "\r\na\r\nb \t\r\nc \t\r\n" -> "a\r\nb \t\r\nc \t\r\n"
#=> "\r\n\r\na\r\n\r\nb \t\r\n\r\nc \t\r\n\r\n" -> "a\r\nb \t\r\nc \t\r\n"
#=>                                 "ab\t c\t " -> "ab\t c\t "
#=>                 "\r\na\r\nb\t \r\nc\t \r\n" -> "a\r\nb\t \r\nc\t \r\n"
#=> "\r\n\r\na\r\n\r\nb\t \r\n\r\nc\t \r\n\r\n" -> "a\r\nb\t \r\nc\t \r\n"
#=> ======================================================================
#=> Lines ending with: "\n"
#=> ======================================================================
#=>                                       "abc" -> "abc"
#=>                               "\na\nb\nc\n" -> "a\nb\nc\n"
#=>                       "\n\na\n\nb\n\nc\n\n" -> "a\nb\nc\n"
#=>                                     "ab c " -> "ab c "
#=>                             "\na\nb \nc \n" -> "a\nb \nc \n"
#=>                     "\n\na\n\nb \n\nc \n\n" -> "a\nb \nc \n"
#=>                                   "ab\tc\t" -> "ab\tc\t"
#=>                           "\na\nb\t\nc\t\n" -> "a\nb\t\nc\t\n"
#=>                   "\n\na\n\nb\t\n\nc\t\n\n" -> "a\nb\t\nc\t\n"
#=>                                 "ab \tc \t" -> "ab \tc \t"
#=>                         "\na\nb \t\nc \t\n" -> "a\nb \t\nc \t\n"
#=>                 "\n\na\n\nb \t\n\nc \t\n\n" -> "a\nb \t\nc \t\n"
#=>                                 "ab\t c\t " -> "ab\t c\t "
#=>                         "\na\nb\t \nc\t \n" -> "a\nb\t \nc\t \n"
#=>                 "\n\na\n\nb\t \n\nc\t \n\n" -> "a\nb\t \nc\t \n"
#=> ======================================================================
#=> Lines ending with: "\r"
#=> ======================================================================
#=>                                       "abc" -> "abc"
#=>                               "\ra\rb\rc\r" -> "a\rb\rc\r"
#=>                       "\r\ra\r\rb\r\rc\r\r" -> "a\rb\rc\r"
#=>                                     "ab c " -> "ab c "
#=>                             "\ra\rb \rc \r" -> "a\rb \rc \r"
#=>                     "\r\ra\r\rb \r\rc \r\r" -> "a\rb \rc \r"
#=>                                   "ab\tc\t" -> "ab\tc\t"
#=>                           "\ra\rb\t\rc\t\r" -> "a\rb\t\rc\t\r"
#=>                   "\r\ra\r\rb\t\r\rc\t\r\r" -> "a\rb\t\rc\t\r"
#=>                                 "ab \tc \t" -> "ab \tc \t"
#=>                         "\ra\rb \t\rc \t\r" -> "a\rb \t\rc \t\r"
#=>                 "\r\ra\r\rb \t\r\rc \t\r\r" -> "a\rb \t\rc \t\r"
#=>                                 "ab\t c\t " -> "ab\t c\t "
#=>                         "\ra\rb\t \rc\t \r" -> "a\rb\t \rc\t \r"
#=>                 "\r\ra\r\rb\t \r\rc\t \r\r" -> "a\rb\t \rc\t \r"


0

Try

/^\n/

and replace with the empty string.

Are you sure your newline character is only \n? If not, try

/^\r?\n/

to allow also the linebreak sequence \r\n.
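
For illustration, a short sketch of the second pattern applied with gsub. In Ruby, ^ matches at the start of every line, so each line that consists of nothing but a line break is removed (the sample strings are invented):

```ruby
# Invented samples with Unix and Windows line endings.
unix    = "\n\nA\n\nB\n"
windows = "\r\nA\r\n\r\nB\r\n"

# Remove every line that is empty apart from its line break.
unix_tight    = unix.gsub(/^\r?\n/, '')
windows_tight = windows.gsub(/^\r?\n/, '')

p unix_tight     # => "A\nB\n"
p windows_tight  # => "A\r\nB\r\n"
```

Unlike the approaches that collapse runs of newlines, this keeps the original line-ending style of the remaining lines.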


0

Here's an ugly hack based on @Tom's answer:

result = s.squeeze("\r\n").tap{ |s2| :go while s2.gsub!("\r\n\r\n","\r\n") }

It supports DOS (\r\n), Unix (\n), and MacOS 9- (\r) line breaks. Tested:

[ "\r\n", "\n", "\r" ].each do |marker|
  1.upto(5) do |lines|
    s = "a#{marker*lines}b"
    tight = s.squeeze("\r\n").tap{ |s2| :go while s2.gsub!("\r\n\r\n","\r\n") }
    puts "%24s -> %s" % [s.inspect, tight.inspect]
  end
end
#=>                 "a\r\nb" -> "a\r\nb"
#=>             "a\r\n\r\nb" -> "a\r\nb"
#=>         "a\r\n\r\n\r\nb" -> "a\r\nb"
#=>     "a\r\n\r\n\r\n\r\nb" -> "a\r\nb"
#=> "a\r\n\r\n\r\n\r\n\r\nb" -> "a\r\nb"
#=>                   "a\nb" -> "a\nb"
#=>                 "a\n\nb" -> "a\nb"
#=>               "a\n\n\nb" -> "a\nb"
#=>             "a\n\n\n\nb" -> "a\nb"
#=>           "a\n\n\n\n\nb" -> "a\nb"
#=>                   "a\rb" -> "a\rb"
#=>                 "a\r\rb" -> "a\rb"
#=>               "a\r\r\rb" -> "a\rb"
#=>             "a\r\r\r\rb" -> "a\rb"
#=>           "a\r\r\r\r\rb" -> "a\rb"

Note that this assumes that your blank lines are truly blank and do not have any whitespace on them. If they do contain spaces or tabs, you could first do a pre-pass of s.gsub(/^[ \t]+$/,'')
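
A small sketch of that pre-pass combined with the hack above, assuming Unix (\n) line endings; the sample string and the `no_ws` name are made up:

```ruby
# Made-up sample: one "blank" line that actually holds a space and a tab.
s = "a\n \t\n\nb"

# Pre-pass: empty out lines that contain only spaces or tabs.
no_ws = s.gsub(/^[ \t]+$/, '')

# Then collapse the now-empty lines as in the answer above.
tight = no_ws.squeeze("\r\n").tap { |s2| :go while s2.gsub!("\r\n\r\n", "\r\n") }

p no_ws  # => "a\n\n\nb"
p tight  # => "a\nb"
```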


0

This will do it: .gsub(/(\n\s*\n)+/, "\n")

and replace \n in the regex with [\r\n] if needed (inside a character class no | is required; it would be matched literally).
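
A brief sketch of what this pattern does, using made-up input. Because \s also matches spaces, tabs and newlines, whitespace-only lines are collapsed as well, but blank lines at the very start of the string are reduced to a single leading newline rather than removed:

```ruby
# Made-up sample: empty lines and a whitespace-only line between entries.
text = "A\n\n\nB\n \t\nC"
tight = text.gsub(/(\n\s*\n)+/, "\n")
p tight  # => "A\nB\nC"

# Leading blank lines leave one newline behind, so a strip or
# sub(/\A\n/, '') afterwards may still be needed.
leading = "\n\nA".gsub(/(\n\s*\n)+/, "\n")
p leading  # => "\nA"
```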
