My code:

import csv
import operator


first_csv_file = open('/Users/jawadmrahman/Downloads/account-cleanup-3 array/example.csv', 'r+')
csv_sort = csv.reader(first_csv_file, delimiter=',')
sort = sorted(csv_sort, key=operator.itemgetter(0))
sorted_csv_file = open('new_sorted2.csv', 'w+', newline='')
write = csv.writer(sorted_csv_file)
for eachline in sort:
    print (eachline)
    write.writerows(eachline)

I have an example CSV file: [image: example CSV file]

I want to sort by the first column and get the results in this fashion:

1,9
2,17
3,4
7,10

With the code posted above, this is what I am getting now: [image: current output]

How do I fix this?

  • 1
    Is , supposed to represent a decimal point in this context? Commented Jan 7, 2022 at 18:33
  • 2
    pandas package is the most comprehensive and well supported package for manipulating tabular data such as CSVs. Read, sort, and save should be about 3 lines of code in Pandas. See stackoverflow.com/questions/37787698/… and stackoverflow.com/questions/14365542/… Commented Jan 7, 2022 at 18:35
  • 2
    eachline is itself a list and thus write.writerows(eachline) is producing two rows for every eachline. Try write.writerow(eachline). While you are at it, I encourage you to look at what the with keyword used with open() does for you. It will clean up your code substantially. Commented Jan 7, 2022 at 19:07
  • 1
    Please do not include images of data. Please edit your question and include your input CSV and desired output CSV as text. Commented Jan 7, 2022 at 19:12
  • 1
    @JonSG, thank you! Commented Jan 11, 2022 at 16:09

1 Answer

As JonSG pointed out in the comments to your original post, you're calling writerows() (plural) on a single row, eachline.

Change that last line to write.writerow(eachline) and you'll be good.
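For reference, here is a minimal sketch of the whole script with that one-word fix applied, also using the `with` keyword JonSG suggested so the files get closed automatically. The function name `sort_csv_by_first_column` and the file names in the usage comment are placeholders of mine, not anything from your post. Note that, like your original, it sorts the first column as text, so `'10'` would sort before `'2'`.

```python
import csv
import operator


def sort_csv_by_first_column(src_path, dst_path):
    """Read src_path, sort its rows by the first column, write them to dst_path."""
    # newline='' is what the csv module recommends for both reading and writing.
    with open(src_path, newline='') as src:
        rows = sorted(csv.reader(src), key=operator.itemgetter(0))

    with open(dst_path, 'w', newline='') as dst:
        writer = csv.writer(dst)
        for row in rows:
            writer.writerow(row)  # singular: one list becomes one CSV row

    return rows


# Hypothetical usage (file names are placeholders):
# sort_csv_by_first_column('example.csv', 'new_sorted2.csv')
```

If the first column should be compared numerically, one option is to swap the key for something like `key=lambda row: int(row[0])`, assuming every value in that column parses as an integer.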

Looking at the problem in depth

writerows() expects a list of rows, where each row is itself a list of values. The outer list contains the rows; each inner list holds the cells of that row, one per column:

sort = [
  ['1', '9'],
  ['2', '17'],
  ['3', '4'],
  ['7', '10'],
]

writer.writerows(sort)

will produce the sorted CSV with two columns and four rows that you expect (and your print statement shows).
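If you want to check this without touching any files, a small sketch like the following writes into an in-memory buffer instead. The use of `io.StringIO` and the variable names are my own choices for illustration.

```python
import csv
import io

rows = [
    ['1', '9'],
    ['2', '17'],
    ['3', '4'],
    ['7', '10'],
]

# An in-memory text buffer stands in for the output file.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)  # plural: the whole list of rows at once

correct_output = buffer.getvalue()
print(correct_output)
```

Each inner list ends up on its own line, with its values separated by commas.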

When you call writerows() with a single row:

for eachline in sort:
    writer.writerows(eachline)

you get some really weird output:

  • it interprets eachline as the outer list containing a number of rows, which means...

  • it interprets each item in eachline as a row having individual columns...

  • and each item in eachline is a string, which is itself a Python sequence, so writerows() iterates over each character in the string, treating each character as its own column...

    ['1','9'] is seen as two single-column rows, ['1'] and ['9']:

    1
    9
    

    ['2', '17'] is seen as the single-column row ['2'] and the double-column row ['1', '7']:

    2
    1,7
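The character-by-character splitting described above can be reproduced with a short sketch that writes to an in-memory buffer. The `io.StringIO` buffer and the variable names are assumptions for illustration; only the loop mirrors the code from the question.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer)

for eachline in [['1', '9'], ['2', '17']]:
    # Bug reproduced on purpose: writerows() treats every string in
    # eachline as a row, and every character of that string as a column.
    writer.writerows(eachline)

buggy_lines = buffer.getvalue().splitlines()
print(buggy_lines)
```

The value `'17'` comes out as the two-column row `1,7`, which matches the output shown above.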
    
1 Comment

Ah I understand. This works, thank you!
