2

I am trying to sort a text file by its 4th column, which contains over 1000 numbers. I can isolate the number column fine, but I am unable to sort it in ascending order. Here is what I believed was correct, but I keep getting the following error:

'str' object has no attribute 'sort'

Any advice would be great!

file = open("MyFile.txt")

column = []  

for line in file:
    column = line[1:].split("\t")[3]

    print (column.sort())
0

4 Answers

7

If I'm right, you're trying to sort the rows, using the 4th column as an index, no?

sorted(open("MyFile.txt").readlines(), key=lambda line: int(line.split('\t')[3]))

This should give you the lines, sorted by the integer value of their 4th tab-separated column.
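
A slightly longer sketch of the same idea, assuming the file is tab-separated and every line holds an integer in its 4th column. The helper name sort_lines_by_column and the sample rows are made up for illustration; they are not from the question.

```python
def sort_lines_by_column(lines, column=3, sep="\t"):
    """Return the lines ordered by the integer in the given 0-based column."""
    return sorted(lines, key=lambda line: int(line.rstrip("\n").split(sep)[column]))


# Made-up sample rows standing in for the contents of MyFile.txt
sample = [
    "a\tb\tc\t30\n",
    "d\te\tf\t4\n",
    "g\th\ti\t100\n",
]

for line in sort_lines_by_column(sample):
    print(line, end="")
```

With a real file, passing the file object opened in a with block (with open("MyFile.txt") as f: rows = sort_lines_by_column(f)) also closes the file when done. Converting with int() matters: compared as strings, "100" would sort before "30" and "4".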

4

line.split() returns a string, as does reading a line from a file. You cannot sort a string because it is immutable. You can say:

for line in file:
    column.append(float(line[1:].split("\t")[3]))

column.sort()

1 Comment

"line.split() returns a string": no, line.split() returns a list.
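
To make the comment concrete, here is a small check (the sample line is made up). The error in the question appears to come from indexing the split result with [3], which yields a single str, and str has no sort method.

```python
line = "a\tb\tc\t42\n"  # made-up sample line

fields = line.split("\t")  # a list of strings
fourth = fields[3]         # a single string, "42\n"

print(type(fields).__name__)    # list
print(type(fourth).__name__)    # str
print(hasattr(fields, "sort"))  # True: lists can be sorted in place
print(hasattr(fourth, "sort"))  # False: this is what the error message reports
```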
1

Since you say that the file contains numbers separated by the tab character, you could use the csv module to process it. I use the key 'statistic' as an example, since csv files often have a header line whose names can be used as keys. If there is no header line, use the fieldnames parameter to set the column names. If you would rather work with column indexes, use csv.reader and the index (in your case 3) instead.

import csv
# Python 3: open in text mode with newline='' (Python 2 used 'rb')
ifile = open('file.csv', newline='')
infile = csv.DictReader(ifile, delimiter='\t')
# If the first line does not contain the header then specify the header
# by passing fieldnames=[...] to DictReader
try:
  sortedlist = sorted(infile, key=lambda d: float(d['statistic']))
except ValueError:
  # First line was the header, go back and skip it
  ifile.seek(0)
  next(ifile)  # ifile.next() exists only in Python 2
  sortedlist = sorted(infile, key=lambda d: float(d['statistic']))
ifile.close()

# now process sortedlist and build an output file to write using csv.DictWriter()
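
A minimal sketch of that last step, assuming the rows have two columns named 'name' and 'statistic' (both names, and the sample rows, are made up here). It writes to an in-memory buffer so it runs without any file on disk.

```python
import csv
import io

# Made-up rows standing in for sortedlist
sortedlist = [
    {"name": "b", "statistic": "1.5"},
    {"name": "a", "statistic": "2.0"},
]

# io.StringIO keeps the sketch self-contained; with a real file use
# open('sorted.csv', 'w', newline='') instead
out = io.StringIO()
writer = csv.DictWriter(
    out,
    fieldnames=["name", "statistic"],
    delimiter="\t",
    lineterminator="\n",
)
writer.writeheader()           # first line: the column names
writer.writerows(sortedlist)   # one tab-separated line per dict

print(out.getvalue())
```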

0

Try this code:

file = open("a")
column = []

for line in file:
    column.append(int(line.split("\t")[3]))

column.sort()
print(column)

file.close()

What changed:

  1. line.split("\t") returns a list of strings, so with column.append(int(line.split("\t")[3])) we select the fourth element of this list, convert it to an integer, and add that integer to our list (column).
  2. print(column.sort()) would print the return value of the sort method, which is None, so we sort the list first and print it afterwards. Another solution is to use the sorted function: print(sorted(column)). The difference is that sort works in place and returns None, while sorted returns a new list.
  3. We close the file we opened, so the file handle is not leaked.
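
The difference described in point 2 can be seen with a small made-up list:

```python
column = [30, 4, 100]

result = column.sort()      # sorts the list in place
print(result)               # None
print(column)               # [4, 30, 100]

original = [30, 4, 100]
ordered = sorted(original)  # builds and returns a new list
print(ordered)              # [4, 30, 100]
print(original)             # [30, 4, 100], left unchanged
```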
