I am trying to convert a .txt file to a regular Python list. I have done this before, but only with manually constructed files. This time the .txt file was produced by another Python script that wrote a list into it. I am not sure why Python treats the two formats differently.

Here is what I mean:

The first .txt looked like:

(Let's call it x.txt)

I like dogs
Go home 
This is the greatest Ice Cream ever

Now if I do:

f = open('x.txt', encoding = "utf8")

z = f.readlines()

print(z)

I get (trailing newline characters omitted):

['I like dogs','Go home','This is the greatest Ice Cream ever']

This is exactly what I want ^

My current .txt file looks like:

(Let's call it y.txt)

['I like dogs','Go home','This is the greatest Ice Cream ever']

Now if I do:

f = open('y.txt', encoding = "utf8")

z = f.readlines()

print(z)

I get a bizarre output that looks like:

['[\'I like dogs. \', \'Go home\', \'This is the greatest Ice Cream ever\',]]

I thought double brackets only really existed in Pandas. Where am I going wrong here? How can I get output in a regular list format?

Note: To provide some context, I am trying to feed this list into some text cleaning script. When I try to feed that second output into it, I don't get an error, but it turns the list of strings into one long string in a list like: ['IlikedogsGohomeThisisthegreatestIceCreamever']

  • When you save ['I like dogs','Go home','This is the greatest Ice Cream ever'] to the text file, it is written as its string representation. When you later call readlines(), that whole line comes back as one string, and readlines() wraps it in a list. Commented Mar 27, 2019 at 6:19
  • @pistol2myhead I figured something like that was the problem. Do you know a way around it? Commented Mar 28, 2019 at 4:44
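
To make the point in the comments concrete, here is a minimal sketch of the round trip. It assumes the producing script wrote the list with something like `f.write(str(items))`, which the question does not confirm. The temporary file location is hypothetical.

```python
import os
import tempfile

items = ['I like dogs', 'Go home', 'This is the greatest Ice Cream ever']

# Assumption: the other script wrote the list's string representation.
path = os.path.join(tempfile.mkdtemp(), 'y.txt')  # hypothetical location
with open(path, 'w', encoding='utf-8') as f:
    f.write(str(items))

with open(path, encoding='utf-8') as f:
    lines = f.readlines()

# readlines() returns one string per line of the file, so the whole
# bracketed text comes back as a single string element.
print(len(lines))               # 1
print(lines[0] == str(items))   # True
```

The outer brackets belong to the list that readlines() returns. The inner brackets are just characters inside the one string it contains.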

3 Answers

1

If your 'y.txt' file contains ['I like dogs', 'Go home', 'This is the greatest Ice Cream ever'] as plain text, and you want to read it back into a list assigned to a variable, try this:

from ast import literal_eval
with open('y.txt', 'r', encoding = 'utf-8') as f:
    b = f.readlines()
    print(b)    # OUTPUT - ["['I like dogs','Go home','This is the greatest Ice Cream ever']"]
    l = literal_eval(b[0])
    print(l)    # OUTPUT - ['I like dogs', 'Go home', 'This is the greatest Ice Cream ever']

There is one restriction on the above code: it works only if the text file contains a single list. If 'y.txt' contains multiple lists, one per line, try this:

from ast import literal_eval
with open('y.txt', 'r', encoding = 'utf-8') as f:
    b = f.readlines()
    # Parse each non-empty line as its own list literal
    l = [literal_eval(k.strip()) for k in b if k.strip()]

0

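The list can be extracted directly from y.txt by splitting the line on single quotes and keeping every second piece. This relies on none of the strings containing an apostrophe: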

>>> with open('y.txt', 'r') as file:
...     line = file.readlines()[0].split("'")[1::2]
... 
>>> line
['I like dogs', 'Go home', 'This is the greatest Ice Cream ever']

2 Comments

Same as above: Unfortunately I get a "'charmap' codec can't decode byte 0x9d in position 8090: character maps to <undefined>" error here. There are non alphabetic characters in my real data.
Worked perfectly fine on my machine for both versions of python.
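
The 'charmap' error reported in these comments typically appears when a file is opened without an explicit encoding on a system whose default is a legacy code page such as cp1252, where byte 0x9d has no mapping. The following is a minimal sketch, assuming the real file is UTF-8 (the question opens it that way). The sample data and temporary path are hypothetical.

```python
import os
import tempfile
from ast import literal_eval

# Hypothetical sample with curly quotes; the closing quote (U+201D)
# is encoded in UTF-8 as the bytes E2 80 9D.
items = ['I like dogs', 'He said \u201cGo home\u201d']
path = os.path.join(tempfile.mkdtemp(), 'y.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write(repr(items))

# Decoding the same bytes as cp1252 fails, because 0x9d is undefined there.
with open(path, 'rb') as f:
    raw = f.read()
try:
    raw.decode('cp1252')
    decoded_as_cp1252 = True
except UnicodeDecodeError:
    decoded_as_cp1252 = False

# Passing encoding='utf-8' explicitly avoids relying on the platform default.
with open(path, encoding='utf-8') as f:
    result = literal_eval(f.read())
```

Whether this matches the commenter's situation depends on how their file was actually encoded, which the thread does not state.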
0

If the file has only one line that contains your list as a string, and it is the first line, I suggest trying this:

with open('y.txt', 'r', encoding="utf-8") as fil:
    lis = eval(fil.readlines()[0])

Now you should be able to use the list, lis.

Let me know if that worked.
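
Because `eval` executes whatever expression the line contains, a more cautious variant for files you do not fully control is `ast.literal_eval`, which accepts only Python literals. This small sketch uses hypothetical input strings to show the difference:

```python
from ast import literal_eval

good = "['I like dogs', 'Go home']"
bad = "__import__('os').getcwd()"   # hypothetical non-literal content

parsed = literal_eval(good)          # parses the list literal

try:
    literal_eval(bad)                # function calls are not literals
    rejected = False
except ValueError:
    rejected = True

print(parsed)     # ['I like dogs', 'Go home']
print(rejected)   # True
```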

1 Comment

Unfortunately I get a "'charmap' codec can't decode byte 0x9d in position 8090: character maps to <undefined>" error here. There are non alphabetic characters in my real data.
