16

Is there an elegant way to exploit the correct spacing of a printed numpy.array to get a 2D array with proper labels that align correctly? For example, given an array with 4 rows and 5 columns, how can I pass the array together with appropriately sized lists of row and column labels to generate output that looks like this?

      A   B   C   D   E
Z [[ 85  86  87  88  89]
Y  [ 90 191 192  93  94]
X  [ 95  96  97  98  99]
W  [100 101 102 103 104]]

If I naively try:

import numpy
x = numpy.array([[85, 86, 87, 88, 89],
                 [90, 191, 192, 93, 94],
                 [95, 96, 97, 98, 99],
                 [100, 101, 102, 103, 104]])

row_labels = ['Z', 'Y', 'X', 'W']

print("     A   B   C   D   E")
# enumerate yields (index, row), in that order
for row_index, row in enumerate(x):
    print('%s %s' % (row_labels[row_index], row))

I get:

      A   B   C   D   E
Z  [85  86  87  88  89]
Y  [90 191 192  93  94]
X  [95  96  97  98  99]
W  [100 101 102 103 104]

Is there any way I can get things to line up intelligently? I am definitely open to using any other library if there is a better way to solve my problem.

4 Answers

26

You can use an IPython notebook with pandas for that. Enter your original example in the notebook:

import numpy
x = numpy.array([[85, 86, 87, 88, 89], 
                 [90, 191, 192, 93, 94], 
                 [95, 96, 97, 98, 99], 
                 [100,101,102,103,104]])

row_labels = ['Z', 'Y', 'X', 'W']
column_labels = ['A', 'B', 'C', 'D', 'E']

Then create a DataFrame:

import pandas
df = pandas.DataFrame(x, columns=column_labels, index=row_labels)

And then view it by evaluating df in a cell; the notebook renders it as a table with the row and column labels.
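Outside a notebook, the same DataFrame can be printed as aligned plain text. This is a short sketch, assuming pandas and numpy are installed; for a frame this small, DataFrame.to_string() returns the same text that print(df) would display.

```python
import numpy
import pandas

x = numpy.array([[85, 86, 87, 88, 89],
                 [90, 191, 192, 93, 94],
                 [95, 96, 97, 98, 99],
                 [100, 101, 102, 103, 104]])

row_labels = ['Z', 'Y', 'X', 'W']
column_labels = ['A', 'B', 'C', 'D', 'E']

df = pandas.DataFrame(x, columns=column_labels, index=row_labels)

# to_string() renders the whole frame as text, one line per row,
# with the column labels on the first line.
text = df.to_string()
print(text)
```

The output has no brackets, but each column is right-aligned under its label.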


10

Assuming all matrix numbers have at most 3 digits, you could replace the last part with this:

print("     A   B   C   D   E")
for row_label, row in zip(row_labels, x):
    print('%s [%s]' % (row_label, ' '.join('%3s' % i for i in row)))

Which outputs:

     A   B   C   D   E
Z [ 85  86  87  88  89]
Y [ 90 191 192  93  94]
X [ 95  96  97  98  99]
W [100 101 102 103 104]

Formatting with '%3s' right-aligns the value in a field at least 3 characters wide, padding on the left with spaces. (A leading 0, as in '%03s', has no effect for strings.) Use '%4s' for width 4, and so on. The full format string syntax is explained in the Python documentation.
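If the widest entry is not known in advance, the field width can be derived from the data instead of being hard-coded. The following is a minimal sketch under that assumption; format_labeled is a hypothetical helper name, not part of NumPy, and it accepts a nested list or any iterable of rows.

```python
def format_labeled(rows, row_labels, col_labels):
    """Return rows as an aligned text table with row and column labels."""
    rows = [list(r) for r in rows]
    # One shared width: the longest cell or column label.
    width = max(len(str(v)) for r in rows for v in r)
    width = max([width] + [len(str(c)) for c in col_labels])
    label_width = max(len(str(l)) for l in row_labels)
    # The header is indented past the row label, the space, and the '['.
    header = ' ' * (label_width + 2) + ' '.join(str(c).rjust(width) for c in col_labels)
    lines = [header]
    for label, row in zip(row_labels, rows):
        cells = ' '.join(str(v).rjust(width) for v in row)
        lines.append('%s [%s]' % (str(label).ljust(label_width), cells))
    return '\n'.join(lines)


data = [[85, 86, 87, 88, 89],
        [90, 191, 192, 93, 94],
        [95, 96, 97, 98, 99],
        [100, 101, 102, 103, 104]]
table = format_labeled(data, ['Z', 'Y', 'X', 'W'], ['A', 'B', 'C', 'D', 'E'])
print(table)
```

Returning the string rather than printing it inside the helper makes the result easy to log or write to a file.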


6

Here's a way to leverage the array printing functions. I probably wouldn't use it, but it comes pretty close to meeting your requirements!

import numpy as np

a = np.random.rand(5, 4)
x = np.array('col1 col2 col3 col4'.split())
y = np.array('row1 row2 row3 row4 row5'.split())
b = np.zeros((6, 5), object)  # one extra row and column for the labels
b[1:, 1:] = a
b[0, 1:] = x
b[1:, 0] = y
b[0, 0] = ''
printer = np.vectorize(lambda item: '{0:5}'.format(item))
print(printer(b).astype(object))

[[     col1 col2 col3 col4]
 [row1 0.95 0.71 0.03 0.56]
 [row2 0.56 0.46 0.35 0.90]
 [row3 0.24 0.08 0.29 0.40]
 [row4 0.90 0.44 0.69 0.48]
 [row5 0.27 0.10 0.62 0.04]]
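Another way to lean on NumPy's own formatting is to take the text that str() produces for the array and prefix each line with its label. This is only a sketch: label_array is a hypothetical helper, and it assumes a 2D integer array small enough that NumPy neither summarizes it with "..." nor wraps its rows, and column labels no wider than the widest number.

```python
import numpy as np


def label_array(x, row_labels, col_labels):
    """Prefix the lines of str(x) with row labels and add a column header."""
    body = str(x).splitlines()  # NumPy has already aligned the columns
    label_width = max(len(l) for l in row_labels)
    # For integer arrays, every printed element is padded to the widest entry.
    elem_width = max(len(str(v)) for v in x.flat)
    # Indent past the label, one space, and the two-character '[[' or ' [' prefix.
    header = ' ' * (label_width + 3) + ' '.join(c.rjust(elem_width) for c in col_labels)
    rows = [l.ljust(label_width) + ' ' + line for l, line in zip(row_labels, body)]
    return '\n'.join([header] + rows)


x = np.array([[85, 86, 87, 88, 89],
              [90, 191, 192, 93, 94],
              [95, 96, 97, 98, 99],
              [100, 101, 102, 103, 104]])
result = label_array(x, ['Z', 'Y', 'X', 'W'], ['A', 'B', 'C', 'D', 'E'])
print(result)
```

This keeps the nested brackets of the printed array, as in the layout the question asks for.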


2

This code is essentially an implementation of scoffey's answer above, but it doesn't have the three-character limit and is a bit more flexible. Here's my code:

    def format__1(digits, num):
        # Right-align str(num) in a field of `digits` characters.
        if digits < len(str(num)):
            raise Exception("digits<len(str(num))")
        return ' ' * (digits - len(str(num))) + str(num)

    def printmat(arr, row_labels=(), col_labels=()):
        # Print a 2D numpy array or nested list, optionally with labels.
        row_labels, col_labels = list(row_labels), list(col_labels)
        cells = [item for row in arr for item in row]  # flatten the rows
        # The maximum number of chars required to display any cell or column label.
        max_chars = max(len(str(item)) for item in cells + col_labels)
        if not row_labels and not col_labels:
            for row in arr:
                print('[%s]' % ' '.join(format__1(max_chars, i) for i in row))
        elif row_labels and col_labels:
            rw = max(len(str(item)) for item in row_labels)  # max char width of row_labels
            print('%s %s' % (' ' * (rw + 1), ' '.join(format__1(max_chars, i) for i in col_labels)))
            for row_label, row in zip(row_labels, arr):
                print('%s [%s]' % (format__1(rw, row_label), ' '.join(format__1(max_chars, i) for i in row)))
        else:
            raise Exception("This case is not implemented...either both row_labels and col_labels must be given or neither.")

running

    import numpy
    x = numpy.array([[85, 86, 87, 88, 89],
                     [90, 191, 192, 93, 94],
                     [95, 96, 97, 98, 99],
                     [100,101,102,103,104]])
    row_labels = ['Z', 'Y', 'X', 'W']
    column_labels = ['A', 'B', 'C', 'D', 'E']
    printmat(x,row_labels=row_labels, col_labels=column_labels)

gives

         A   B   C   D   E
    Z [ 85  86  87  88  89]
    Y [ 90 191 192  93  94]
    X [ 95  96  97  98  99]
    W [100 101 102 103 104]

This would also be the output if x were a nested Python list instead of a numpy array.
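A variation on the same idea sizes each column separately, so a single wide value does not widen every column. This is a rough sketch rather than a drop-in replacement: format_columns is a hypothetical name, the brackets are dropped for brevity, and labels are assumed to be strings.

```python
def format_columns(rows, row_labels, col_labels):
    """Right-align each column to its own widest entry (label included)."""
    rows = [[str(v) for v in row] for row in rows]
    # zip(*rows) transposes the rows, so each tuple holds one column's cells.
    widths = [max(len(c), *(len(v) for v in col))
              for c, col in zip(col_labels, zip(*rows))]
    label_width = max(len(l) for l in row_labels)
    header = ' ' * label_width + ' ' + ' '.join(
        c.rjust(w) for c, w in zip(col_labels, widths))
    lines = [header]
    for label, row in zip(row_labels, rows):
        cells = ' '.join(v.rjust(w) for v, w in zip(row, widths))
        lines.append(label.ljust(label_width) + ' ' + cells)
    return '\n'.join(lines)


table = format_columns([[1, 2000, 3], [40, 5, 600]], ['r1', 'r2'], ['a', 'b', 'c'])
print(table)
```

Because the helper only iterates over rows and calls str() on each cell, it should work for nested lists and 2D numpy arrays alike.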
