0

I have a movie dataset that looks like this:

1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,Jumanji (1995),Adventure|Children|Fantasy
3,Grumpier Old Men (1995),Comedy|Romance
4,Waiting to Exhale (1995),Comedy|Drama|Romance
5,Father of the Bride Part II (1995),Comedy
6,Heat (1995),Action|Crime|Thriller
7,Sabrina (1995),Comedy|Romance
8,Tom and Huck (1995),Adventure|Children

I want to extract only the last part of each line (the genres, e.g. Adventure|Animation|Children|Comedy|Fantasy) and store the genres in a list, e.g. ['Adventure', 'Animation', 'Children', 'Comedy', 'Fantasy']. However, I am stuck at the slicing step: line[:-1] doesn't give me the last field. I am using Python 2.7.

with open(path + 'movie.csv') as f:
    for line in f:
        print line[:-1]

2 Answers

2
with open(path + 'movie.csv') as f:
    for line in f:
        print line.rstrip('\n').split(',')[-1].split('|')
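
To see why slicing the raw line does not isolate the genres, it may help to run the steps on a single line. This is only a sketch: the sample string is copied from the question, and the variable names are illustrative. Note that `[:-1]` is a slice (everything except the last element), while `[-1]` is an index (the last element only).

```python
line = "1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy\n"

# Slicing the string drops only its last character (the newline here).
without_newline = line[:-1]

# Splitting on ',' yields the fields; index -1 picks the last field.
last_field = without_newline.split(',')[-1]

# Splitting that field on '|' gives the list of genres.
genres = last_field.split('|')
print(genres)
```

A plain split on ',' also breaks up any title that contains a comma. The last field is unaffected as long as the genres themselves contain no commas.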

2

Your slice returns every character of the line except the last one (the trailing newline), because each line is read as a single string and is not split into fields. Read the file with the csv module instead, which splits each line on the ',' delimiter for you, then split the last field on '|'.

import csv
with open(path + 'movie.csv') as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        print(row[-1].split('|'))
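
Since the question asks for the genres to be stored in a list rather than printed, the same idea could be wrapped in a small helper. This is a minimal sketch: `extract_genres` is a hypothetical name, and the sample lines stand in for the file so the snippet runs on its own.

```python
import csv

def extract_genres(lines):
    """Return one list of genres per row (hypothetical helper)."""
    all_genres = []
    for row in csv.reader(lines, delimiter=','):
        if row:  # skip blank lines
            all_genres.append(row[-1].split('|'))
    return all_genres

# Sample lines copied from the question; a file object works the same way.
sample = [
    "1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy\n",
    "3,Grumpier Old Men (1995),Comedy|Romance\n",
]
genres_per_movie = extract_genres(sample)
print(genres_per_movie)
```

With the real file, the open file object can be passed in place of `sample`, e.g. `extract_genres(f)` inside the `with open(path + 'movie.csv') as f:` block.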
