1

I have a Python script which outputs lots of data; a sample is below. The first of the 4 fields always consists of two letters, one digit, a slash and one or two digits.

Gi3/2 --.--.--.-- 0024.e89b.c10e Dell Inc.  
Gi5/4 --.--.--.-- 0030.c1cd.f038 HEWLETTPACKARD   
Gi4/3 --.--.--.-- 0020.ac00.6703 INTERFLEX DATENSYSTEME GMBH  
Gi3/7 --.--.--.-- 0009.4392.34f2 Cisco Systems  
Gi6/6 --.--.--.-- 001c.2333.bd5a Dell Inc  
Gi3/16 --.--.--.-- 0009.7c92.7af2 Cisco Systems  
Gi5/12 --.--.--.-- 0020.ac00.3fb0 INTERFLEX DATENSYSTEME GMBH  
Gi4/5 --.--.--.-- 0009.4392.6db2 Cisco Systems  
Gi4/6 --.--.--.-- 000b.cd39.c7c8 Hewlett Packard  
Gi6/4 --.--.--.-- 0021.70d7.8d33 Dell Inc  
Gi6/14 --.--.--.-- 0009.7c91.fa71 Cisco Systems  

What would be the best way to sort this correctly on the first field, so that this sample would read:

Gi3/2   --.--.--.-- 0024.e89b.c10e  Dell Inc.  
Gi3/7   --.--.--.-- 0009.4392.34f2  Cisco Systems  
Gi3/16  --.--.--.-- 0009.7c92.7af2  Cisco Systems  
Gi4/3   --.--.--.-- 0020.ac00.6703  INTERFLEX DATENSYSTEME GMBH  
Gi4/5   --.--.--.-- 0009.4392.6db2  Cisco Systems  
Gi4/6   --.--.--.-- 000b.cd39.c7c8  Hewlett Packard  
Gi5/4   --.--.--.-- 0030.c1cd.f038  HEWLETT PACKARD  
Gi5/12  --.--.--.-- 0020.ac00.3fb0  INTERFLEX DATENSYSTEME GMBH  
Gi6/4   --.--.--.-- 0021.70d7.8d33  Dell Inc  
Gi6/6   --.--.--.-- 001c.2333.bd5a  Dell Inc  
Gi6/14  --.--.--.-- 0009.7c91.fa71  Cisco Systems  

My efforts have been very messy, and resulted in numbers such as 12 coming before 5!

As ever, many thanks for your patience.

3
  • p.s. IP addresses in field 2 removed for privacy, these would be unique values for each line Commented Jun 28, 2009 at 7:35
  • If your script is generating the output, it would probably make sense to sort the data /before/ it gets in this string form. Commented Jun 28, 2009 at 8:07
  • Thanks all, I shifted the output into a list, then passed it to the function suggested by yairchu. The only change I made was to create another list, i.e. list2=sorted(list1, key=lineKey) then print it. Not very Pythonic but does what I need. I agree I could have done this earlier in the script. Commented Jun 28, 2009 at 8:21

4 Answers

5
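def lineKey(line):
    # First space-delimited field, e.g. "Gi3/16"
    keyStr, rest = line.split(' ', 1)
    # "Gi3" stays a string; the port number is compared as an integer
    a, b = keyStr.split('/', 1)
    return (a, int(b))

# sorted() returns a new list, so keep the result
sorted_lines = sorted(lines, key=lineKey)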
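
For context on why a plain sort misorders these lines: strings compare character by character, so "Gi5/12" sorts before "Gi5/4" because "1" is less than "4". A minimal sketch contrasting the default ordering with the tuple key; the three-line list is an illustrative subset of the question's data, and the key function is repeated so the snippet runs on its own.

```python
def lineKey(line):
    # First space-delimited field, e.g. "Gi5/12"
    keyStr, rest = line.split(' ', 1)
    # Split into the "Gi5" prefix and the port number after the slash
    a, b = keyStr.split('/', 1)
    return (a, int(b))

# Illustrative subset of the sample data from the question
lines = [
    "Gi5/12 --.--.--.-- 0020.ac00.3fb0 INTERFLEX DATENSYSTEME GMBH",
    "Gi5/4 --.--.--.-- 0030.c1cd.f038 HEWLETTPACKARD",
    "Gi3/16 --.--.--.-- 0009.7c92.7af2 Cisco Systems",
]

plain = sorted(lines)                # text comparison: "Gi5/12" lands before "Gi5/4"
by_key = sorted(lines, key=lineKey)  # numeric comparison on the port number

for line in by_key:
    print(line)
```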


4

To sort, split each line so that you get a two-element tuple: the part before the / and, as an integer, the number after it. Each line is then sorted on something like ('Gi6', 12). See the example below:

s="""Gi3/2 --.--.--.-- 0024.e89b.c10e Dell Inc.  
Gi5/4 --.--.--.-- 0030.c1cd.f038 HEWLETTPACKARD   
Gi4/3 --.--.--.-- 0020.ac00.6703 INTERFLEX DATENSYSTEME GMBH  
Gi3/7 --.--.--.-- 0009.4392.34f2 Cisco Systems  
Gi6/6 --.--.--.-- 001c.2333.bd5a Dell Inc  
Gi3/16 --.--.--.-- 0009.7c92.7af2 Cisco Systems  
Gi5/12 --.--.--.-- 0020.ac00.3fb0 INTERFLEX DATENSYSTEME GMBH  
Gi4/5 --.--.--.-- 0009.4392.6db2 Cisco Systems  
Gi4/6 --.--.--.-- 000b.cd39.c7c8 Hewlett Packard  
Gi6/4 --.--.--.-- 0021.70d7.8d33 Dell Inc  
Gi6/14 --.--.--.-- 0009.7c91.fa71 Cisco Systems"""

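lines = s.split("\n")

def sortKey(l):
    # Split on the first "/" only, in case a later field contains one
    a, b = l.split("/", 1)
    # The port number is the first whitespace-delimited token after the slash
    b = int(b.split()[0])
    return (a, b)

lines.sort(key=sortKey)

for l in lines:
    print(l)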
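
The key function above trusts that every line has the expected shape. A slightly stricter sketch, assuming the field format described in the question (two letters, one digit, a slash, one or two digits): a regular expression extracts the parts, compares both numbers numerically, and raises on anything unexpected. The names PORT_RE and port_key are illustrative and not part of the answer.

```python
import re

# Two letters, one digit, "/", one or two digits, then whitespace
PORT_RE = re.compile(r"^([A-Za-z]{2})(\d)/(\d{1,2})\s")

def port_key(line):
    m = PORT_RE.match(line)
    if m is None:
        raise ValueError("unexpected line format: %r" % line)
    prefix, slot, port = m.groups()
    # Compare the prefix as text, the slot and port as integers
    return (prefix, int(slot), int(port))

# Illustrative subset of the sample data from the question
sample = [
    "Gi6/14 --.--.--.-- 0009.7c91.fa71 Cisco Systems",
    "Gi3/7 --.--.--.-- 0009.4392.34f2 Cisco Systems",
    "Gi6/4 --.--.--.-- 0021.70d7.8d33 Dell Inc",
    "Gi3/16 --.--.--.-- 0009.7c92.7af2 Cisco Systems",
]

ordered = sorted(sample, key=port_key)
for line in ordered:
    print(line)
```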


1

You can define a cmp() comparison function for .sort([cmp[, key[, reverse]]]) calls (Python 2 only; the cmp argument was removed in Python 3):

The sort() method takes optional arguments for controlling the comparisons.

cmp specifies a custom comparison function of two arguments (list items) which should return a negative, zero or positive number depending on whether the first argument is considered smaller than, equal to, or larger than the second argument: cmp=lambda x,y: cmp(x.lower(), y.lower()). The default value is None.

In the cmp() function, retrieve the numeric key and use int(field) to ensure numeric (not textual) comparison.

Alternately, a key() function can be defined (thanks, @Anurag Uniyal):

key specifies a function of one argument that is used to extract a comparison key from each list element: (e.g. key=str.lower). The default value is None.
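
A sketch of the comparison-function approach described above. Because the cmp argument exists only in Python 2, this version wraps the comparison function with functools.cmp_to_key and passes it as key, which also works in Python 3. The helper names port_parts and compare_lines are illustrative.

```python
from functools import cmp_to_key

def port_parts(line):
    # "Gi3/16 ..." -> ("Gi3", 16)
    field = line.split(None, 1)[0]
    prefix, port = field.split("/", 1)
    return prefix, int(port)

def compare_lines(x, y):
    # Negative, zero, or positive, as a cmp-style function is expected to return
    a, b = port_parts(x), port_parts(y)
    return (a > b) - (a < b)

# Illustrative subset of the sample data from the question
lines = [
    "Gi5/12 --.--.--.-- 0020.ac00.3fb0 INTERFLEX DATENSYSTEME GMBH",
    "Gi4/5 --.--.--.-- 0009.4392.6db2 Cisco Systems",
    "Gi5/4 --.--.--.-- 0030.c1cd.f038 HEWLETTPACKARD",
]

lines.sort(key=cmp_to_key(compare_lines))
for line in lines:
    print(line)
```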

1 Comment

key would be better instead of cmp
0

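If you are working in a Unix environment, you can pipe the output through the sort command, for example sort -t/ -k1,1 -k2,2n, which splits each line on the slash and compares the second field numerically.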

Another possibility is to use some kind of bucket sort in your Python script, which should be a lot faster.
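
One way to read the bucket-sort suggestion, offered as a sketch rather than a measured improvement: since the slot is a single digit and the port has at most two digits (per the question), each line can be dropped into one of 1000 buckets indexed by slot * 100 + port, and the buckets read back in order. This assumes every line shares the same two-letter prefix, as in the sample; bucket_sort is an illustrative name.

```python
def bucket_sort(lines):
    # One bucket per (slot, port) pair: slot is 0-9, port is 0-99
    buckets = [[] for _ in range(10 * 100)]
    for line in lines:
        field = line.split(None, 1)[0]   # e.g. "Gi5/12"
        name, port = field.split("/", 1)
        slot = int(name[2:])             # digit after the two-letter prefix
        buckets[slot * 100 + int(port)].append(line)
    # Reading the buckets in index order yields the sorted lines
    return [line for bucket in buckets for line in bucket]

# Illustrative subset of the sample data from the question
sample = [
    "Gi6/14 --.--.--.-- 0009.7c91.fa71 Cisco Systems",
    "Gi3/2 --.--.--.-- 0024.e89b.c10e Dell Inc.",
    "Gi6/4 --.--.--.-- 0021.70d7.8d33 Dell Inc",
    "Gi3/16 --.--.--.-- 0009.7c92.7af2 Cisco Systems",
]

for line in bucket_sort(sample):
    print(line)
```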
