So I need the output of my program to look like:

ababa
ab ba 
 xxxxxxxxxxxxxxxxxxx
that is it followed by a lot of spaces .
 no dot at the end
The largest run of consecutive whitespace characters was 47.

But what I am getting is:

ababa

ab ba

xxxxxxxxxxxxxxxxxxx
that is it followed by a lot of spaces .
no dot at the end
The longest run of consecutive whitespace characters was 47.

When looking further into the code I wrote, I found with the print(c) statement that this happens:

['ababa', '', 'ab           ba ', '', '                                      xxxxxxxxxxxxxxxxxxx', 'that is it followed by a lot of spaces                         .', '                                               no dot at the end']

Between some of the lines there is an empty string, '', which is probably why my print statement doesn't produce the output I want.

How would I remove them? I've tried using different list functions but I keep getting syntax errors.

This is the code I made:

a = '''ababa

    ab           ba 

                                      xxxxxxxxxxxxxxxxxxx
that is it followed by a lot of spaces                         .
                                               no dot at the end'''


c = a.splitlines()
print(c)

#d = c.remove(" ") #this part doesnt work
#print(d)

for row in c:
    print(' '.join(row.split()))

last_char = ""
current_seq_len = 0
max_seq_len = 0

for d in a:
    if d == last_char:
        current_seq_len += 1
        if current_seq_len > max_seq_len:
            max_seq_len = current_seq_len
    else:
        current_seq_len = 1
        last_char = d
    #this part just needs to count the whitespace

print("The longest run of consecutive whitespace characters was",str(max_seq_len)+".")
  • What kind of logic creates " xxxxxxxx" out of "      xxxxxxxx" ?? Commented Sep 20, 2013 at 11:44
  • Side note: the remove method modifies the list in place and returns None. So you should not write d = c.remove('') but simply c.remove(''), after which c has one fewer empty string. To remove all empty strings via remove, do: for _ in range(c.count('')): c.remove(''). (By the way, the empty string is '', i.e. quote-quote, without any space. In your case you were removing a single-space string, ' ' (quote-space-quote), and you probably got a ValueError.) Commented Sep 20, 2013 at 11:51
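
The behaviour described in the comment above can be checked with a small sketch. The list rows and the names result and missing are made up for illustration and are not from the question:

```python
# Hypothetical list shaped like the output of splitlines() in the question.
rows = ['ababa', '', 'ab ba', '', 'xx']

# list.remove mutates the list in place and returns None,
# so assigning its result is not useful.
result = rows.remove('')

# Remove every remaining empty string, one remove call per occurrence.
for _ in range(rows.count('')):
    rows.remove('')

# Removing a value that is not in the list raises ValueError.
# Here ' ' (a single space) is not an element of rows.
try:
    rows.remove(' ')
    missing = False
except ValueError:
    missing = True

print(result, rows, missing)
```

Calling remove in a loop rescans the list each time, so for long lists a list comprehension is usually the simpler choice.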

3 Answers

2

Regex time:

import re

print(re.sub(r"([\n ])\1*", r"\1", a))
#>>> ababa
#>>>  ab ba 
#>>>  xxxxxxxxxxxxxxxxxxx
#>>> that is it followed by a lot of spaces .
#>>>  no dot at the end

re.sub(matcher, replacement, target_string)

The matcher is r"([\n ])\1*", which means:

([\n ]) → match either "\n" or " " and put it in a group (#1)
\1*     → match whatever group #1 matched, 0 or more times

And the replacement is just

\1 → group #1

You can get the longest whitespace sequence with

max(len(match.group()) for match in re.finditer(r"([\n ])\1*", a))

This uses the same matcher, but collects the length of each match and takes the maximum with max.
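
As a quick check of both calls, here is a minimal sketch on a shortened sample string. The string sample and the helper names collapse_runs and longest_run are assumptions made for illustration. They are not part of the question or of the re module:

```python
import re

# Hypothetical shortened input; not the string from the question.
sample = "ab   ba\n\n\n     xx"

# One space or newline, followed by repeats of that same character.
RUN = re.compile(r"([\n ])\1*")

def collapse_runs(text):
    """Replace each run of identical spaces or newlines with a single one."""
    return RUN.sub(r"\1", text)

def longest_run(text):
    """Length of the longest run of identical spaces or newlines (0 if none)."""
    return max((len(m.group()) for m in RUN.finditer(text)), default=0)

collapsed = collapse_runs(sample)
longest = longest_run(sample)
print(repr(collapsed))
print(longest)
```

Because of the backreference, a run that mixes spaces and newlines is counted as several separate runs. If mixed runs should count as one, a pattern such as r"\s+" might be closer to what is wanted, though it would also match tabs and other whitespace.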

2

From what I can tell, your easiest solution would be to use a list comprehension:

c = [item for item in a.splitlines() if item != '']

If you wish to make it slightly more robust by also removing strings that only contain whitespace such as ' ', then you can alter it as follows:

c = [item for item in a.splitlines() if item.strip() != '']

You can then join the list back together as follows:

output = '\n'.join(c)
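
To see the difference between the two filters, here is a small sketch. The string text and the names non_empty and non_blank are made up for illustration and are not from the question:

```python
# Hypothetical sample text; not the string from the question.
text = "first\n\n  second  \n   \nthird"

lines = text.splitlines()

# Drops only truly empty strings; the whitespace-only line survives.
non_empty = [item for item in lines if item != '']

# Drops empty strings and whitespace-only strings.
non_blank = [item for item in lines if item.strip() != '']

# Rebuild a single string from the kept lines.
output = '\n'.join(non_blank)

print(non_empty)
print(non_blank)
print(repr(output))
```

Note that strip() is used only for the test here, so the kept lines retain their original leading and trailing spaces.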

2 Comments

if item.strip() is enough. No need to add != "".
While that's true, I prefer to use the explicit form for readability's sake.
1

This can be easily solved with the built-in filter function:

c = filter(None, a.splitlines())
# or, more explicit
c = filter(lambda x: x != "", a.splitlines())

The first variant keeps every element returned by a.splitlines() that does not evaluate to False, which drops the empty string. The second variant creates a small anonymous function (using lambda) that returns False when a given element is the empty string, which is more explicit than the first variant. In Python 2, filter returns a list; in Python 3 it returns a lazy iterator, so wrap the call in list() if you need a list.
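
A minimal sketch of both variants under Python 3, where the result of filter is wrapped in list() to get an actual list. The string text and the variable names are assumptions made for illustration:

```python
# Hypothetical sample text; not the string from the question.
text = "one\n\ntwo\n\nthree"

# filter(None, ...) keeps only truthy elements; '' is falsy and is dropped.
kept_by_truthiness = list(filter(None, text.splitlines()))

# Explicit variant: keep elements that are not the empty string.
kept_explicit = list(filter(lambda x: x != "", text.splitlines()))

print(kept_by_truthiness)
print(kept_explicit)
```

Both variants give the same result for a list of strings. They would differ only if the list held other falsy values, which splitlines() never produces.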

Another option would be to use a list comprehension that achieves the same thing:

c = [string for string in a.splitlines() if string]
# or, more explicit
c = [string for string in a.splitlines() if string != ""]

4 Comments

This would work. However, if one of the items in the list contains only whitespace, such as ' ', it would not be filtered out.
@MichaelAquilina If a string contains whitespace then it is not an empty string. To check whether a string is empty or whitespace-only, simply use lambda x: x.strip(). Called without arguments, strip() removes all leading and trailing whitespace, resulting in an empty string if the string is whitespace-only.
@Bakuriu this is in fact the approach I suggested in my answer.
But from what I could gather from the OP's question he was only dealing with true empty strings (""), that's why I didn't include strip here.
