3

I'm trying to figure out a simple way to sort words from a file, but the newline characters ("\n") are always included when I print the words. How could I improve this code to make it work properly? I'm using Python 2.7. Thanks in advance.

def sorting(self):
    filename = ("food.txt")
    file_handle = open(filename, "r")
    for word in file_handle:
        word = word.split()
        print sorted(file_handle)
    file_handle.close()

5 Answers

3

You actually have two problems here.


The big one is that print sorted(file_handle) reads and sorts the whole rest of the file and prints that out. You're doing that once per line. So, what happens is that you read the first line, split it, ignore the result, sort and print all the lines after the first, and then you're done.
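To see the effect in isolation, here is a minimal sketch that swaps the real file for an in-memory io.StringIO (an assumption, so that it runs without food.txt; the sample words and the name first_pass are made up). It should behave the same on Python 2.7 and 3.

```python
import io

def first_pass(handle):
    """Mimic the loop from the question and record what each pass sees."""
    passes = []
    for line in handle:
        # sorted() drains every remaining line from the same iterator,
        # so the for loop has nothing left after this first pass.
        passes.append((line, sorted(handle)))
    return passes

# Hypothetical stand-in for food.txt
sample = io.StringIO(u"two\none\nthree\n")
result = first_pass(sample)
```

Here result holds a single entry: the first line, plus the sorted remainder of the file with the newlines still attached.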

What you want to do is accumulate all the words as you go along, then sort and print that. Like this:

def sorting(self):
    filename = ("food.txt")
    file_handle = open(filename, "r")
    words = []
    for line in file_handle:
        words += line.split()
    file_handle.close()
    print sorted(words)

Or, if you want to print the sorted list one line at a time, instead of as a giant list, change the last line to:

print '\n'.join(sorted(words))

For the second, more minor problem, the one you asked about, you just need to strip off the newlines. So, change the words += line.split() line to this:

words += line.strip().split()

However, if you had solved the first problem, you wouldn't even have noticed this one. If you have a line like "one two three\n", and you call split() on it, you will get back ["one", "two", "three"], with no \n to worry about. So, you don't actually even need to solve this one.
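A quick check of that claim, using the sample string from the paragraph above (the variable names are only for illustration):

```python
line = "one two three\n"

# split() with no arguments splits on any run of whitespace, including
# the trailing newline, and never yields empty strings.
plain = line.split()

# Stripping first makes no difference to the result.
stripped_first = line.strip().split()
```

Both plain and stripped_first end up as the same three-word list.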


While we're at it, there are a few other improvements you could make here:

  • Use a with statement to close the file instead of doing it manually.
  • Make this function return the list of words (so you can do various different things with it, instead of just printing it and returning nothing).
  • Take the filename as a parameter instead of hardcoding it (for similar flexibility).
  • Maybe turn the loop into a comprehension—but that would require an extra "flattening" step, so I'm not sure it's worth it.
  • If you don't want duplicate words, use a set rather than a list.
  • Depending on the use case, you often want to use rstrip() or rstrip('\n') to remove just the trailing newline, while leaving, say, paragraph indentation tabs or spaces. If you're looking for individual words, however, you probably don't want that.
  • You might want to filter out and/or split on non-alphabetical characters, so you don't get "that." as a word. Doing even this basic kind of natural-language processing is non-trivial, so I won't show an example here. (For example, you probably want "John's" to be a word, you may or may not want "jack-o-lantern" to be one word instead of three; you almost certainly don't want "two-three" to be one word…)
  • The self parameter is only needed in methods of classes. This doesn't appear to be in any class. (If it is, it's not doing anything with self, so there's no visible reason for it to be in a class. You might have some reason which would be visible in your larger program, of course.)

So, anyway:

def sorting(filename):
    words = []
    with open(filename) as file_handle:
        for line in file_handle:
            words += line.split()
    return sorted(words)

print '\n'.join(sorting('food.txt'))
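If you do want the comprehension and the set from the list of suggestions, one possible variant looks like this. It is only a sketch: unique_sorted_words is a made-up name, and whether dropping duplicates is right depends on your use case.

```python
def unique_sorted_words(filename):
    """Return the distinct words in the file, sorted."""
    with open(filename) as file_handle:
        # The nested comprehension is the "flattening" step: one loop over
        # the lines, one over the words in each line. Building a set
        # drops duplicate words.
        words = {word for line in file_handle for word in line.split()}
    return sorted(words)
```

It is called the same way as sorting(filename) above and works on Python 2.7 and 3.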

2

Basically all you have to do is strip that newline (and all other whitespace because you probably don't want it):

def sorting(self):
    filename = ("food.txt")
    file_handle = open(filename, "r")
    for line in file_handle:
        word = line.strip().split()
        print sorted(file_handle)
    file_handle.close()

Otherwise you can just remove the last character with line[:-1].split()
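One caveat with slicing, shown as a small sketch with made-up strings: line[:-1] removes the last character whatever it is, so the final line of a file that does not end with a newline loses a letter. rstrip("\n") only removes newlines.

```python
normal_line = "two\n"
last_line = "three"    # final line of a file with no trailing newline

# Slicing always drops one character, even when it is not a newline.
sliced = last_line[:-1]

# rstrip("\n") leaves the line alone when there is no newline to remove.
trimmed = last_line.rstrip("\n")
```

On normal_line both approaches give the same result; they only differ on a line like last_line.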

6 Comments

It's probably a bit more Pythonic to use the context manager with statement to handle the file as well.
Depending on the use case, you often want to use rstrip() or rstrip('\n') to remove just the trailing newline, while leaving, say, paragraph indentation tabs or spaces. But it sounds like in the OP's use case there's no reason for that, and this is fine.
I've tried all those possible changes, but it still returns the spaces: ['\n', 'five\n', 'four\n', 'one\n', 'three\n', 'two\n']
@user205820: Your function doesn't return anything (except the default None that a function returns if it doesn't return anything else), so it can't be returning that. It prints that, for the reason I explained in my answer. You have two problems, and this answer only solves the one you asked about, not the other (more serious) one you didn't.
@user205820: And actually, if you've solved the other problem, you wouldn't need to solve this one, while if you solve this one, that still won't help until you solve the other one too. So really, you asked the wrong question, which is why Jochen's correct answer to your question doesn't actually help you.
0

Use .strip(). It will remove white space by default. You can also add other characters (like "\n") to strip as well. This will leave just the words.
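A small sketch of how the optional argument behaves (the sample string is made up): characters passed to strip() replace the default whitespace set rather than extending it.

```python
raw = "  apple\t\n"

# No argument: all leading and trailing whitespace is removed.
default_strip = raw.strip()

# Explicit "\n": only newlines are removed, so the leading spaces
# and the trailing tab are left in place.
newline_only = raw.strip("\n")
```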

1 Comment

"\n" is already included in the default whitespace used by strip(); you don't need to add it. (And if you do add it, you need to look up all the other default characters and add those in too, because you're no longer getting the defaults anymore.)
0

Try this:

def sorting(self):
    words = []
    with open("food.txt") as f:
        for line in f:
            words.extend(line.split())
    return sorted(words, key=lambda word: word.lower())
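The key argument is what makes this sort case-insensitive. A short sketch with made-up words shows the difference from a plain sorted() call, which orders uppercase letters before lowercase ones:

```python
words = ["banana", "apple", "Cherry"]

# Default ordering compares character codes, so "C" sorts before "a".
case_sensitive = sorted(words)

# Comparing lowercased copies ignores case; the original strings are kept.
case_insensitive = sorted(words, key=lambda word: word.lower())
```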


-1

To avoid printing the newlines, just put a trailing comma at the end:

print sorted(file_handle),

In your code, I don't see you sorting the whole file, only individual lines. Use a list to collect all the words and, after reading the whole file, sort them all.
