I am writing a program to sort the names of amino acids by their energy values within a particular company.

I have extracted the relevant data into the following numpy array and tried this:

In[37]: Data = np.array([
 ['ASN 205', -9.64164],
 ['LEU 206', -8.985774],
 ['ASN 207', -7.314434],
 ['PRO 208', -4.105338],
 ['ASN 209', -2.092342],
 ['GLY 210', -2.101412],
 ['LYS 211', -2.483852],
 ['ARG 212', -24.20364],
 ['SER 213', -1.181002],
 ['VAL 214', 0.057618]])
In[38]: ind3 = np.lexsort((Data[:,0],Data[:,1]))
In[39]: Result = Data[ind3]
In[40]: Result
Out[40]: 
array([['SER 213', '-1.181002'],
       ['ASN 209', '-2.092342'],
       ['GLY 210', '-2.101412'],
       ['LYS 211', '-2.483852'],
       ['ARG 212', '-24.20364'],
       ['PRO 208', '-4.105338'],
       ['ASN 207', '-7.314434'],
       ['LEU 206', '-8.985774'],
       ['ASN 205', '-9.64164'],
       ['VAL 214', '0.057618']], 
      dtype='|S9')

The problem is that the float values are sorted lexicographically, as strings. I want them ordered by their numeric value, meaning -24.20364 first, then ... -2.483852.

How do I do this?

  • The documentation for the regular numpy.sort says this: "order : list, optional. When a is a structured array, this argument specifies which fields to compare first, second, and so on." Commented Jan 28, 2015 at 12:59
  • "Please reply as soon as possible" Why? Commented Jan 28, 2015 at 12:59
  • Your Data array was converted to an all-string array (dtype='|S9') as soon as it was created. A string is not a good format for floats. Think of a better data structure, such as a dict. Commented Jan 28, 2015 at 13:03
  • After rearranging my data, I solved this problem using a crude method shown by my friend: TransposedData = numpy.transpose(Data), then Result = TransposedData[:, np.argsort(TransposedData[1].astype(float))]. Done. Commented Jan 28, 2015 at 16:09
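
The workaround in the last comment boils down to casting the energy column to float before calling argsort. A minimal sketch of that idea, assuming the data is still the all-string (N, 2) array shown in Out[40]; the shortened four-row `data` and the lowercase variable names are only for illustration:

```python
import numpy as np

# All-string array, i.e. what the mixed str/float input from the question
# ends up as (the energies are stored as text).
data = np.array([
    ["ASN 205", "-9.64164"],
    ["LYS 211", "-2.483852"],
    ["ARG 212", "-24.20364"],
    ["VAL 214", "0.057618"],
])

# Cast only the energy column back to float so argsort compares numbers,
# not characters.
order = np.argsort(data[:, 1].astype(float))
result = data[order]

print(result)
```

This leaves the array as strings and only uses the float view for ordering, so no transpose is needed.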

1 Answer

Explanation: np.array converts all the passed elements to the single most general type that can hold every one of them, so your floats are converted to strings as soon as the array is created. You can instead pass tuples together with an explicit structured data type, as follows:

import numpy as np

Data = np.array([
 ('ASN 205', -9.64164),
 ('LEU 206', -8.985774),
 ('ASN 207', -7.314434),
 ('PRO 208', -4.105338),
 ('ASN 209', -2.092342),
 ('GLY 210', -2.101412),
 ('LYS 211', -2.483852),
 ('ARG 212', -24.20364),
 ('SER 213', -1.181002),
 ('VAL 214', 0.057618)], dtype=[('f', '|S9'), ('g', float)])
ind3 = np.lexsort((Data['f'], Data['g']))
Result = Data[ind3]

output:

array([('ARG 212', -24.20364), ('ASN 205', -9.64164),
       ('LEU 206', -8.985774), ('ASN 207', -7.314434),
       ('PRO 208', -4.105338), ('LYS 211', -2.483852),
       ('GLY 210', -2.101412), ('ASN 209', -2.092342),
       ('SER 213', -1.181002), ('VAL 214', 0.057618)], 
      dtype=[('f', 'S9'), ('g', '<f8')])
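
Once the energies live in a float field, lexsort is not the only option. As a hedged sketch of two alternatives, argsort on the float field alone or np.sort with the `order` argument quoted in the comments under the question; the shortened `data` array and the 'U9' string type (used here so the names are ordinary str on Python 3) are assumptions, not part of the original answer:

```python
import numpy as np

data = np.array([
    ("ASN 205", -9.64164),
    ("LYS 211", -2.483852),
    ("ARG 212", -24.20364),
    ("VAL 214", 0.057618),
], dtype=[("f", "U9"), ("g", float)])

# Option 1: sort indices from the float field only.
by_energy = data[np.argsort(data["g"])]

# Option 2: let np.sort compare field 'g' first and 'f' as the tie-breaker.
also_sorted = np.sort(data, order=["g", "f"])

print(by_energy)
```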

3 Comments

I am a newbie. I am reading this data from a file, so all values are automatically converted to strings. I get a tuple with this structure: (('HIE 203', '-1.889138'), ('TYR 204', '-2.148216'), ('ASN 205', '-9.64164'), ('LEU 206', '-8.985774'), ('ASN 207', '-7.314434'), ('PRO 208', '-4.105338'), ('ASN 209', '-2.092342'), ('GLY 210', '-2.101412'), ('LYS 211', '-2.483852'), ('ARG 212', '-24.20364'), ('SER 213', '-1.181002'), ('VAL 214', '0.057618')). Now how do I convert those strings to floats? Sorry for the silly question.
In case you are using numpy's loadtxt, you can just pass it dtype=[('f', '|S9'), ('g', float)] as the data type argument.
Nice idea. But the file I am reading this data from contains 18 columns, 2 of them strings and the rest floats. Is there any way to define the data type for the remaining 16 columns in one go? One more thing: is it possible to skip reading some columns? BTW, I am using numpy.genfromtxt.
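
One possible way to handle both points from the last comment is to build the dtype list in a loop and to select columns with genfromtxt's `usecols` argument. This is only a sketch under assumptions: the comma-separated layout, the inline `text` stand-in for the real file, and the field names (`residue`, `chain`, `v0`, ...) are all hypothetical, and only 3 float columns are used instead of 16 to keep it short:

```python
import io
import numpy as np

# Hypothetical stand-in for the real file: 2 string columns, then floats.
text = """ASN 205,A,-9.64164,1.0,0.5
ARG 212,A,-24.20364,2.0,0.25
VAL 214,B,0.057618,3.0,0.75
"""

n_float = 3  # would be 16 for the file described in the comment
# Build the dtype list instead of typing every float field by hand.
full_dtype = [("residue", "U9"), ("chain", "U1")] + [
    ("v%d" % i, float) for i in range(n_float)
]

# Read only columns 0, 2 and 4; the others are skipped via usecols.
keep = (0, 2, 4)
dtype = [full_dtype[i] for i in keep]

table = np.genfromtxt(io.StringIO(text), delimiter=",", dtype=dtype,
                      usecols=keep, encoding="utf-8")

# The energy field is already float, so a numeric sort works directly.
by_energy = np.sort(table, order="v0")
print(by_energy)
```

For a real file, the `io.StringIO(text)` argument would be replaced by the file name, and the delimiter would need to match the file's actual format.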
