
I need help removing the leading whitespace from every non-empty line of my output in Python 3. I want to do this so I can convert the string into a list. Currently, when I run my program, it outputs this:

Belgian Waffles
    $5.95
    Two of our famous Belgian Waffles with plenty of real maple syrup
    650


    Strawberry Belgian Waffles


    (extra output)


    Homestyle Breakfast
    $6.95
    Two eggs, bacon or sausage, toast, and our ever-popular hash browns
    950
(end of output)

The result I'm trying to get is this:

Belgian Waffles
$5.95
Two of our famous Belgian Waffles with plenty of real maple syrup
650


Strawberry Belgian Waffles


(extra output)


Homestyle Breakfast
$6.95
Two eggs, bacon or sausage, toast, and our ever-popular hash browns
950
(end of output)

This is the current code that I have right now:

import os
import re


def get_filename():
    print("Enter the name of the file: ")
    filename = input()
    return filename


def read_file(filename):
    if os.path.exists(filename):
        with open(filename, "r") as file:
            full_text = file.read()
            return full_text
    else:
        print("This file does not exist")
    
    
def get_tags(full_text):
    tags = re.findall('<.*?>', full_text)
    for tag in tags:
        full_text = full_text.replace(tag, '')
    return tags


def get_text(text):
    tags = re.findall('<.*?>', text)
    for tag in tags:
        text = text.replace(tag, '')
    text = text.strip()
    return text


def display_output(text):
    print(text)
    
    
def main():
    filename = get_filename()
    full_text = read_file(filename)

    tags = get_tags(full_text)
    text = get_text(full_text)

    display_output(text)


main()

Any help or suggestions would be appreciated.

  • You should provide the expected output for clarity. Also, it looks like you are using regexes to parse a markup language; be aware that there are specialized libraries that do this efficiently and robustly. Commented Jul 12, 2022 at 6:24
  • Ok, I just edited the question to show my desired output. Commented Jul 12, 2022 at 16:32

1 Answer

You can either use a regular expression or use the function below, which iterates over every character and checks its ASCII value. If the value belongs to a letter or a digit, or appears in the list of extra allowed codes, the character is appended to the final string. You can look up further values in an ASCII table and extend the list as your requirements grow. Note that newlines (ASCII 10) and symbols such as $ and . are dropped unless you add their codes to the list.

def removeGarbageCharacter(thisString):
    FinalString = ""
    # ASCII codes of extra characters to keep: double quote, single quote, comma, space
    ASCII_of_Other_char = [34, 39, 44, 32]
    for thisChar in thisString:
        asciiVal = ord(thisChar)
        # Keep A-Z (65-90), a-z (97-122), 0-9 (48-57), and the extra characters above
        if (65 <= asciiVal <= 90) or (97 <= asciiVal <= 122) or (48 <= asciiVal <= 57) or asciiVal in ASCII_of_Other_char:
            FinalString += thisChar
    return FinalString
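
Since the goal in the question is only to drop the indentation on each line while keeping the line breaks, a simpler option may be to strip every line individually. The following is a minimal sketch, assuming the tag-free text is already held in a string. The function names strip_each_line and strip_leading_whitespace_regex and the sample string are made up for illustration and are not part of the original code.

```python
import re


def strip_each_line(text):
    """Strip leading and trailing whitespace from every line, keeping blank lines."""
    return "\n".join(line.strip() for line in text.splitlines())


def strip_leading_whitespace_regex(text):
    """Regex variant: remove spaces and tabs at the start of each line only."""
    return re.sub(r"^[ \t]+", "", text, flags=re.MULTILINE)


# Hypothetical sample, shortened from the output shown in the question.
sample = "Belgian Waffles\n    $5.95\n    650\n\n\n    Strawberry Belgian Waffles"

cleaned = strip_each_line(sample)
print(cleaned)

# Dropping the empty lines gives a list of the remaining values.
items = [line for line in cleaned.splitlines() if line]
print(items)
```

In get_text from the question, something like strip_each_line(text) could stand in for the single text.strip() call, which only trims the two ends of the whole string.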

1 Comment

Feel free to mark this answer correct if it helped :)
