append word from line when reading txt file python

Question

I am trying to write a program that reads a text file and builds a list of lists, with one list of words per line.

However, I am only able to append each whole line, not the individual words. Any help with this problem would be appreciated.

text = open("file.txt","r")

for line in text.readlines():
    sentence = line.strip()
    list.append(sentence)

    print list 
text.close()

Example text

I am here
to do something

and I wanted it to append it like this

[['I','am','here'],['to','do','something']]

Thanks in advance.

Are the lines in a particular format? For example, are they space-delimited, or could they use all sorts of punctuation? Are there punctuation marks that need to be removed? Basically, clearing up what an individual word means is important for this problem.
– pseudoramble, Commented Oct 16, 2011 at 23:57
You shouldn't use list as a variable name. I think you mean text.close() instead of file.close().
– John La Rooy, Commented Oct 17, 2011 at 0:38

John La Rooy · Accepted Answer · 2011-10-17 00:45:16Z

1

>>> with open("file.txt","r") as f:
...     map(str.split, f)
... 
[['i', 'am', 'here'], ['to', 'do', 'something']]
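
On Python 3, map returns a lazy iterator rather than a list, so the session above would display a map object instead of the nested list. A minimal sketch of the same idea for Python 3 follows; io.StringIO is used here only as a stand-in for the open file (an assumption, so the snippet runs without file.txt).

```python
import io

# Stand-in for open("file.txt"); iterating it yields one line at a time,
# just like a real text file object.
f = io.StringIO("I am here\nto do something\n")

# str.split with no arguments splits on whitespace and drops the newline.
# list() forces the lazy map object into an actual list.
word_lists = list(map(str.split, f))

print(word_lists)
```

With a real file, the same call would sit inside the with block: list(map(str.split, f)).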

answered Oct 17, 2011 at 0:45


Brendan · Accepted Answer · 2011-10-17 00:19:32Z

1

Each line in the example is just a string, so something like,

...
    PUNCTUATION = ',.?!"\''
    words = [w.strip(PUNCTUATION) for w in line.split() if w.strip(PUNCTUATION)]
    list.append(words)
...

would probably be fine as a first approximation, although it may not cover every edge case in the way that you want (e.g. hyphenated words, words not separated by whitespace, words that have a trailing apostrophe, etc.).

The conditional is to avoid blank entries.
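
To make the fragment above concrete, here is a small self-contained sketch. The helper name words_in_line and the sample lines are hypothetical, chosen only for illustration; the PUNCTUATION string is the one from the answer.

```python
PUNCTUATION = ',.?!"\''

def words_in_line(line):
    """Return the words in one line with leading/trailing punctuation removed."""
    stripped = (w.strip(PUNCTUATION) for w in line.split())
    # Tokens that were nothing but punctuation strip down to '', so drop them.
    return [w for w in stripped if w]

# Hypothetical sample lines standing in for the contents of the file.
lines = ["I am here,", 'to do "something" !']

word_lists = [words_in_line(line) for line in lines]
print(word_lists)
```

Note that strip only removes punctuation at the ends of each token, so an apostrophe inside a word such as isn't is left alone.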

edited Oct 17, 2011 at 0:19

answered Oct 17, 2011 at 0:00


Russell Dias · Accepted Answer · 2011-10-17 00:21:31Z

1

Where exactly are you getting the y variable?

In the most basic sense (because you have not quite specified what to do with punctuation) you can split each line into a list of words using line.split(' '), which splits on every space. If you have other delimiters you can substitute that in, instead of the space. Assign the above split to a var if need be and append it to your list.
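
One detail worth knowing when choosing the delimiter: split(' ') and split() with no argument behave differently on repeated spaces and trailing newlines. A quick sketch, using a made-up sample line:

```python
line = "to  do something\n"  # note the double space and the trailing newline

# split(' ') splits on every single space, so two spaces in a row produce an
# empty string, and the newline stays attached to the last word.
by_space = line.split(' ')

# split() with no argument splits on runs of any whitespace and ignores
# leading and trailing whitespace, including the newline.
by_whitespace = line.split()

print(by_space)
print(by_whitespace)
```

If the file is plain space-separated text, split() is usually the more forgiving choice; split(' ') is appropriate when empty fields are meaningful.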

@Brendan has provided a good solution to strip basic punctuation. Alternatively, you could use a simple regex, re.findall(r'\w+', text), where text is the contents of the file as a string, to find all the words in it.
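
A rough sketch of the regex approach: re.findall needs a string (for example the result of reading the file), not the file object itself. The sample text below is hypothetical. Applied to the whole text it gives one flat list; applied per line it gives the nested shape asked for in the question.

```python
import re

# Hypothetical file contents, as would be returned by reading the file.
text = "I am here\nto do something, isn't it?"

# One flat list of every word in the text.
all_words = re.findall(r'\w+', text)

# One list per line, matching the nested structure from the question.
per_line = [re.findall(r'\w+', line) for line in text.splitlines()]

print(all_words)
print(per_line)
```

Because \w does not match an apostrophe, a contraction such as isn't comes out as two tokens, isn and t.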

Yet another way is to take advantage of Python's string module, and string.punctuation in particular:

import string
''.join(ch for ch in line if ch not in string.punctuation).split()

edited Oct 17, 2011 at 0:21

answered Oct 17, 2011 at 0:01


1 Comment

user998316 Over a year ago

sorry, i fixed the y variable

Nate · Accepted Answer · 2011-10-17 02:17:58Z

1

Something like this would cover a large number of cases, and could be tailored to the symbols you use:

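import re

somelist = []  # collects one list of words per line
text = open("file.txt","r")

for line in text.readlines():
    sentence = line.strip()
    # Replace everything except letters and apostrophes with a space, then split
    words = re.sub(" +"," ",re.sub("[^A-Za-z']"," ",sentence)).split()
    somelist.append(words)

print(somelist)
text.close()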

This keeps only uppercase and lowercase ASCII letters and apostrophes (for the sake of contractions); every other character, including digits, is treated as a separator.
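
To see what that character class keeps and drops, here is a small sketch applied to a made-up sentence. The function name letters_and_apostrophes is hypothetical, and the regex is the inner substitution from the answer (the outer one that collapses spaces is not needed before split()).

```python
import re

def letters_and_apostrophes(sentence):
    # Replace every character that is not an ASCII letter or an apostrophe
    # with a space, then split on whitespace.
    return re.sub(r"[^A-Za-z']", " ", sentence).split()

words = letters_and_apostrophes("Don't stop -- it's 42 o'clock!")
print(words)
```

Digits and non-ASCII letters are removed as well, so a word like café would come out as caf; widen the character class if that matters for your text.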

edited Oct 17, 2011 at 2:17

answered Oct 17, 2011 at 0:07


Acorn · Accepted Answer · 2011-10-17 00:05:06Z

0

word_groups = []

# The with block closes the file automatically
with open("file.txt", "r") as text:
    for line in text:
        words = line.strip().split(' ')
        word_groups.append(words)

print(word_groups)

answered Oct 17, 2011 at 0:05


johnsyweb · Accepted Answer · 2011-10-17 01:22:06Z

0

It looks like you're just missing a call to str.split(). Here's a simple one-line list comprehension that does what you have asked for:

>>> [line.split() for line in open('file.txt')]
[['i', 'am', 'here'], ['to', 'do', 'something']]
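
The one-liner above leaves closing the file to the interpreter. A possible variant that closes it explicitly with a with block is sketched below; the temporary file is created only so the snippet is self-contained, and its contents are the example text from the question.

```python
import os
import tempfile

# Create a small sample file so the sketch runs without an existing file.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("I am here\nto do something\n")
    path = tmp.name

# Same comprehension, but the with block closes the file as soon as it is done.
with open(path) as f:
    word_lists = [line.split() for line in f]

os.remove(path)  # clean up the sample file
print(word_lists)
```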

edited Oct 17, 2011 at 1:22

answered Oct 17, 2011 at 1:16
