I have data in the following format (in a CSV file):

 id, review
 1, the service was great!
 1, staff was friendly.
 2, nice location
 2, but the place was not clean
 2, the motel was okay
 3, i wouldn't stay there next time
 3, do not stay there

I would like to change the data to the following format:

 1, the service was great! staff was friendly. 
 2, nice location but the place was not clean the motel was okay
 3, i wouldn't stay there next time do not stay there

Any help would be appreciated.

  • What have you done so far? What is the matching criteria since the last line does not start with 1 but is appended to the lines before? Commented Sep 3, 2015 at 19:09
  • Take a look at itertools.groupby. Commented Sep 3, 2015 at 19:10
  • @albert I corrected the output. Commented Sep 3, 2015 at 19:21
  • @tobias_k I corrected the typo. thanks Commented Sep 3, 2015 at 19:43

1 Answer

You can use itertools.groupby for grouping consecutive entries that have the same number.

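import csv
import itertools
import operator

# newline="" is what the csv module recommends when opening files
with open("test.csv", newline="") as f:
    reader = csv.reader(f, delimiter=",")
    next(reader)  # skip header line
    # groupby merges consecutive rows that share the same id (column 0)
    for key, group in itertools.groupby(reader, key=operator.itemgetter(0)):
        print(key, ' '.join(g[1] for g in group))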

Output:

1  the service was great!  staff was friendly.
2  nice location  but the place was not clean  the motel was okay
3  i wouldn't stay there next time  do not stay there
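
Because itertools.groupby only merges consecutive rows, the approach above depends on rows with the same id sitting next to each other, as they do in the sample. If that might not hold, one alternative is to collect the reviews in a dictionary keyed by id. The following is a minimal sketch of that idea, not part of the original answer; the shuffled SAMPLE data and the merge_reviews name are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in which rows for the same id are NOT adjacent.
SAMPLE = """id, review
1, the service was great!
2, nice location
1, staff was friendly.
2, but the place was not clean
"""


def merge_reviews(lines):
    """Join all reviews per id, regardless of row order."""
    reader = csv.reader(lines, delimiter=",")
    next(reader)  # skip header line
    merged = defaultdict(list)
    for row in reader:
        if not row:  # ignore blank lines
            continue
        # strip() drops the space that follows the comma in this sample
        merged[row[0]].append(row[1].strip())
    return {row_id: " ".join(reviews) for row_id, reviews in merged.items()}


result = merge_reviews(io.StringIO(SAMPLE))
for row_id, text in result.items():
    print(row_id, text)
```

The dictionary keeps ids in the order they first appear, so no sorting is needed before grouping.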

Note: The code for reading the file assumes that it's an actual CSV file with a comma (,) delimiter:

id, review
1, the service was great!
...
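
In this layout every review starts with a space after the comma, which is why the output above contains double spaces. As a hedged variation, passing skipinitialspace=True to csv.reader drops that space, and the result can then be formatted as "id, text" as the question asks. The sketch below reads from an in-memory copy of the question's sample rather than test.csv; the merged_lines name is hypothetical.

```python
import csv
import io
import itertools
import operator

# In-memory copy of the question's sample; a real script would open the file.
SAMPLE = """id, review
1, the service was great!
1, staff was friendly.
2, nice location
2, but the place was not clean
2, the motel was okay
3, i wouldn't stay there next time
3, do not stay there
"""


def merged_lines(lines):
    """Yield one 'id, joined reviews' string per run of equal ids."""
    # skipinitialspace=True ignores the space right after each comma
    reader = csv.reader(lines, delimiter=",", skipinitialspace=True)
    next(reader)  # skip header line
    for key, group in itertools.groupby(reader, key=operator.itemgetter(0)):
        yield "{}, {}".format(key, " ".join(row[1] for row in group))


output = list(merged_lines(io.StringIO(SAMPLE)))
print("\n".join(output))
```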

1 Comment

This is exactly what I was looking for.