This is a follow-up to how to access a given column in string formatting.

Instead of

import numpy as np
import pandas as pd

np.random.seed(1234)
df = pd.DataFrame(np.random.randint(7, size=(2, 2)), columns=['a', 'b'])
c = df.iloc[0, :]  # get the 0-th row
print("Here is {one[a]} and {two}".format(one=c, two=c['b']))  # OK (see linked question)

I'd like to be able to refer to the column in a nested argument but it doesn't work:

print("Here is {one[{col}]} and {two}".format(col='a', one=c, two=c['b'])) # Problem: KeyError: '{col}'

I expected this to work, but it doesn't. Any hints?

1
  • I don't understand the downvote but anyway .. Commented Feb 26, 2016 at 7:30

2 Answers

2

As I posted in my reply to the linked question, that is not how you should be referencing your data.

Per the docs:

str.format(*args, **kwargs)

Perform a string formatting operation. The string on which this method is called can contain literal text or replacement fields delimited by braces {}. Each replacement field contains either the numeric index of a positional argument, or the name of a keyword argument. Returns a copy of the string where each replacement field is replaced with the string value of the corresponding argument.

>>> "The sum of 1 + 2 is {0}".format(1+2)
'The sum of 1 + 2 is 3'

It is best to be explicit in your code about the data.

>>> print("Here is {one} and {two}".format(one=c['a'], two=c['b']))
Here is 3 and 6

or...

col1 = 'a'
col2 = 'b'
>>> print("Here is {one} and {two}".format(one=c[col1], two=c[col2]))
Here is 3 and 6

even better...

col1 = 'a'
col2 = 'b'

n = 0  # Get the first row.
one, two = df.loc[n, [col1, col2]]  # .ix was removed in pandas 1.0; use .loc
>>> print("Here is {one} and {two}".format(one=one, two=two))
Here is 3 and 6

Because a Series implements __getattr__, you can also look up its values from inside the format string itself, either with dot notation or with dictionary-style brackets.

row = df.loc[n]

>>> print("Here is {row.a} and {row.b}".format(row=row))
Here is 3 and 6

It is safer, though, to access data with brackets in case a column name clashes with an existing Series attribute or method.

df['sum'] = df.sum(axis=1)
row = df.loc[n]  # re-select the row so it includes the new 'sum' column

# Safe method.
>>> print("{row[a]} and {row[b]} make {row[sum]}".format(row=row))
3 and 6 make 9

# Unsafe method.
>>> print("{row.a} and {row.b} make {row.sum}".format(row=row))
3 and 6 make <bound method Series.sum of a      3
b      6
sum    9
Name: 0, dtype: int64>
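To see why the dot form picks up the method, here is a minimal sketch that needs no pandas. The `Row` class is a hypothetical stand-in, not the real Series implementation; it only mimics the relevant behaviour, namely that `__getattr__` is consulted only after normal attribute lookup fails, so a real method named `sum` always wins over a label named `sum`.

```python
class Row:
    """Hypothetical stand-in for a pandas Series (illustration only)."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        # Bracket access always goes to the stored labels.
        return self._data[key]

    def __getattr__(self, name):
        # Called only when normal attribute lookup has already failed.
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name)

    def sum(self):
        # A real method that shadows the 'sum' label for dot access.
        return sum(self._data.values())


row = Row({'a': 3, 'b': 6, 'sum': 9})

by_item = "{row[sum]}".format(row=row)  # looks up the label 'sum'
by_attr = "{row.sum}".format(row=row)   # finds the bound method instead
plain = "{row.a}".format(row=row)       # no clash, falls back to the label

print(by_item)
print(by_attr)
print(plain)
```

Bracket access gives the stored value, while dot access returns the bound method's repr, which is the same kind of output as in the unsafe example above.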

3 Comments

I thought that it (the ability to access data fields) was a feature of the new-style formatting?
Hi Alexander. I agree with you. The mixed format is really ugly even if it answers the question. As a side note: the 'even better' you propose is not always the best way as sometimes I just need to display small parts of a very lengthy row.
Really depends on your use case and readability. You want to make things as simple as possible for others to understand.
1

I think you got that error because str.format does not expand a replacement field nested inside another field's name. You could mix % and format for that case:

print(("Here is {one[%(col)s]} and {two}" % {'col':'a'}).format(one=c, two=c['b']))

Here is 3 and 6
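As a possible alternative that stays with str.format only, the template can be built in two passes: the first pass fills in the column name, and doubled braces keep the other fields intact for the second pass. This is a sketch under assumptions: a plain dict (`row`) stands in for the Series `c` so that it runs without pandas, and the variable names are illustrative.

```python
# A plain dict stands in for the row `c` (assumption, so no pandas is needed).
row = {'a': 3, 'b': 6}
col = 'a'

# Pass 1: substitute the column name. '{{' and '}}' are escaped braces,
# so they survive this pass as literal '{' and '}'.
template = "Here is {{one[{col}]}} and {{two}}".format(col=col)

# Pass 2: the template is now an ordinary format string.
result = template.format(one=row, two=row['b'])

print(template)
print(result)
```

The doubled braces are easy to get wrong in longer templates, so resolving the value first (`one=row[col]`) is usually easier to read.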

1 Comment

Your mix formatting style works. From the doc, this kind of nested replacement should have worked while only using format ... Sorry, I'm wrong. After careful reading, only the format specifier can get nested replacement fields!!
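To illustrate the point in that comment, here is a small sketch of the one place where nesting is allowed, the format specifier, together with the plain workaround of resolving the key before formatting. The dict `row` and the `width` value are illustrative assumptions, not part of the original question.

```python
row = {'a': 3, 'b': 6}  # plain dict standing in for the row (assumption)
col = 'a'
width = 6

# Nested replacement fields are allowed inside the format spec (after ':').
padded = "{value:>{width}}".format(value=row[col], width=width)

# For a dynamic key, look the value up first and pass it in directly.
message = "Here is {one} and {two}".format(one=row[col], two=row['b'])

print(repr(padded))
print(message)
```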
