I have a .txt file with some strings like this:

word_1
word_2
word_3
....
word_n
word_n-1

I would like to read them and place them into a list, in order to do something like this:

my_words = set(['word_1',...,'word_n-1'])

This is what I tried:

with open('/path/of/the/.txt') as f:
   lis = set([int(line.split()[0]) for line in f])
   print lis

But I get this error:

    lis = set([int(line.split()[0]) for line in f])
ValueError: invalid literal for int() with base 10: '\xc3\xa9l'

What would be a better way to do this, and how can I deal with the encoding of this external .txt file?

  • Are those words in byte format? Commented Feb 17, 2015 at 1:04
  • You can't convert a "word" into an int unless your words are numeric ("1", "34", etc.). Commented Feb 17, 2015 at 1:06
  • Remove the int call and it should be OK. Commented Feb 17, 2015 at 1:08
  • Thanks for the help, guys. @GLHF the words are just in a txt file (UTF-8). Commented Feb 17, 2015 at 1:08

1 Answer

I think you need something like this:

with open('file.txt') as f:
    lis = set(line.strip() for line in f)
    print lis

The result is:

set(['word_3', 'word_2', 'word_1', 'word_21', 'word_123'])
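
If the file is UTF-8, as the asker says in the comments, one option is to decode it while reading. The following is only a sketch under that assumption: the helper name `read_words` and the sample file are hypothetical, and unlike the snippet above it also skips blank lines. `io.open` accepts an `encoding` argument on both Python 2 and Python 3.

```python
import io
import os
import tempfile


def read_words(path, encoding="utf-8"):
    """Return the set of non-empty, stripped lines in a text file."""
    # io.open decodes each line to text, so an accented letter is
    # one character rather than a pair of raw bytes.
    with io.open(path, encoding=encoding) as f:
        return set(line.strip() for line in f if line.strip())


# Hypothetical sample file, written here only so the sketch is runnable.
sample_path = os.path.join(tempfile.mkdtemp(), "words.txt")
with io.open(sample_path, "w", encoding="utf-8") as f:
    f.write(u"manifest\u00f3\nellas\nestuvo\nagreg\u00f3\n")

my_words = read_words(sample_path)
print(len(my_words))
print(u"agreg\u00f3" in my_words)
```

With decoded text, membership tests against words typed as Unicode strings should behave as expected, which is usually what matters when the set is used as a lookup.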

5 Comments

Thanks for the help! This works. Any idea how to fix encoding issues? For example, when I run this I get some encoding problems: 'manifest\xc3\xb3', 'ellas', 'estuvo', 'agreg\xc3\xb3'. They're Spanish words.
You probably need to look at the encoding. For example, something like here.
That is not an encoding problem; that is the repr output. If you print the string itself, it will look like manifestó.
So the encoding of the words is still the same? This is happening only because I am printing the results in the PyCharm terminal?
@johndoe Maybe it's a new problem. A new question could be more appropriate. If you decide to create one, linking to the file of interest or an example of it would be useful.
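
The remark about repr output can be checked with a short sketch. It is written for Python 3, where the raw bytes have to be spelled out explicitly, and the variable names are illustrative only.

```python
# UTF-8 bytes for the word from the comment, as a Python 2 str would hold them.
raw = b"manifest\xc3\xb3"

# The repr shows escaped bytes; this is the form that appears
# when a whole set or list is printed.
print(repr(raw))

# Decoding with the matching codec gives the readable text.
text = raw.decode("utf-8")
print(text)

# Same word either way: the accented letter is two bytes but one character.
print(len(raw), len(text))
```

The escapes are just how the container displays its elements; the stored data is unchanged.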
