Using a DataFrame with columns as named arguments to str.format()

Question

I have a DataFrame like:

import pandas as pd
df = pd.DataFrame({'author':["Melville","Hemingway","Faulkner"],
                   'title':["Moby Dick","The Sun Also Rises","The Sound and the Fury"],
                   'subject':["whaling","bullfighting","a messed-up family"]
                   })

I know that I can do this:

# produces desired output                   
("Some guy " + df['author'] + " wrote a book called " + 
   df['title'] + " that uses " + df['subject'] + 
   " as a metaphor for the human condition.")

but is it possible to write this more clearly using str.format(), something along the lines of:

# returns KeyError:'author'
["Some guy {author} wrote a book called {title} that uses "
   "{subject} as a metaphor for the human condition.".format(x) 
      for x in df.itertuples(index=False)]

Asish M. · Accepted Answer · 2016-08-24 19:48:23Z

3

>>> ["Some guy {author} wrote a book called {title} that uses "
   "{subject} as a metaphor for the human condition.".format(**x._asdict())
      for x in df.itertuples(index=False)]

['Some guy Melville wrote a book called Moby Dick that uses whaling as a metaphor for the human condition.', 'Some guy Hemingway wrote a book called The Sun Also Rises that uses bullfighting as a metaphor for the human condition.', 'Some guy Faulkner wrote a book called The Sound and the Fury that uses a messed-up family as a metaphor for the human condition.']
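To see why the attempt in the question raises KeyError while the `**x._asdict()` version works, here is a minimal sketch. The shortened `template` and the one-row frame are illustrative assumptions, not part of the original post. Passing the row as a single positional argument leaves the named placeholders with nothing to look up, whereas unpacking the dict supplies them as keyword arguments.

```python
import pandas as pd

# One-row frame and shortened template, used only for illustration.
df = pd.DataFrame({"author": ["Melville"],
                   "title": ["Moby Dick"],
                   "subject": ["whaling"]})
template = "{author} wrote {title} about {subject}."

# Take the first row as a namedtuple.
row = next(df.itertuples(index=False))

# Positional argument: {author} is looked up among keyword arguments,
# none were given, so str.format raises KeyError.
try:
    template.format(row)
    raised_key_error = False
except KeyError:
    raised_key_error = True

# Keyword arguments: _asdict() maps field names to values, ** unpacks them.
result = template.format(**row._asdict())
print(raised_key_error, result)
```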

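Note that _asdict() starts with an underscore, but it is a documented namedtuple method; the underscore only prevents clashes with field names. The more practical caveat is that itertuples() renames columns to positional names when they are not valid Python identifiers, so the named placeholders only work for identifier-like column names.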

You could do this instead:

>>> ["Some guy {} wrote a book called {} that uses "
   "{} as a metaphor for the human condition.".format(*x)
      for x in df.values]
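One caveat with positional placeholders is that the result depends on the column order of the frame, which older pandas versions did not guarantee to match the order of the dict literal. A hedged sketch that pins the order by selecting the columns explicitly (the `columns` list and `template` name are my own additions, not from the answer):

```python
import pandas as pd

df = pd.DataFrame({'author': ["Melville", "Hemingway", "Faulkner"],
                   'title': ["Moby Dick", "The Sun Also Rises", "The Sound and the Fury"],
                   'subject': ["whaling", "bullfighting", "a messed-up family"]})

# Select the columns in the order the placeholders expect,
# so the output does not depend on how the frame was built.
columns = ["author", "title", "subject"]
template = ("Some guy {} wrote a book called {} that uses "
            "{} as a metaphor for the human condition.")

sentences = [template.format(*row) for row in df[columns].values]
print(sentences[0])
```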

edited Aug 24, 2016 at 19:48

answered Aug 24, 2016 at 19:39

1 Comment

C8H10N4O2 Over a year ago

Got it, so the * does the tuple part for me. Brilliant, thanks -- not sure why someone downvoted us

C8H10N4O2 · Accepted Answer · 2016-08-24 20:30:11Z

You could also use DataFrame.iterrows() like this:

["The book {title} by {author} uses "
   "{subject} as a metaphor for the human condition.".format(**x) 
     for i, x in df.iterrows()]

Which is nice if you want to:

use named arguments, so the order of use doesn't have to match the order of columns (like above)
avoid calling the underscore-prefixed _asdict()
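Another way to get both properties, offered here as a sketch and not taken from either answer, is to convert the frame to a list of plain dicts with `DataFrame.to_dict(orient="records")` and unpack each one. The `template` and `records` names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'author': ["Melville", "Hemingway", "Faulkner"],
                   'title': ["Moby Dick", "The Sun Also Rises", "The Sound and the Fury"],
                   'subject': ["whaling", "bullfighting", "a messed-up family"]})

template = ("The book {title} by {author} uses "
            "{subject} as a metaphor for the human condition.")

# Each record is a plain dict keyed by column name, so placeholder
# order is independent of column order and no namedtuple is involved.
records = df.to_dict(orient="records")
sentences = [template.format(**record) for record in records]
print(sentences[1])
```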

Timing: the fastest appears to be the second solution in Asish M.'s answer (formatting over df.values), even if we heed the warning about caching and take the slowest run.

# example
%%timeit
("Some guy " + df['author'] + " wrote a book called " +
   df['title'] + " that uses " + df['subject'] +
   " as a metaphor for the human condition.")
# 1000 loops, best of 3: 883 µs per loop

%%timeit
["Some guy {author} wrote a book called {title} that uses "
   "{subject} as a metaphor for the human condition.".format(**x._asdict())
      for x in df.itertuples(index=False)]
# 1000 loops, best of 3: 962 µs per loop

%%timeit
["Some guy {} wrote a book called {} that uses "
   "{} as a metaphor for the human condition.".format(*x)
      for x in df.values]
# The slowest run took 5.90 times longer than the fastest. This could mean that an intermediate result is being cached.
# 10000 loops, best of 3: 18.9 µs per loop

%%timeit
from collections import OrderedDict
["The book {title} by {author} uses "
   "{subject} as a metaphor for the human condition.".format(**x)
      for x in [OrderedDict(row) for i, row in df.iterrows()]]
# 1000 loops, best of 3: 308 µs per loop

%%timeit
["The book {title} by {author} uses "
   "{subject} as a metaphor for the human condition.".format(**x)
      for i, x in df.iterrows()]
# 1000 loops, best of 3: 413 µs per loop

Why the next-to-last is faster than the last is beyond me.
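Timings on a three-row frame are dominated by fixed overhead, so it may be worth repeating the comparison on more rows and outside IPython. Below is a rough harness using the standard `timeit` module; the frame size, the `candidates` dict, and the repeat counts are arbitrary assumptions. It makes no claim about which approach wins on your machine.

```python
import timeit
import pandas as pd

base = pd.DataFrame({'author': ["Melville", "Hemingway", "Faulkner"],
                     'title': ["Moby Dick", "The Sun Also Rises", "The Sound and the Fury"],
                     'subject': ["whaling", "bullfighting", "a messed-up family"]})
# Hypothetical larger frame: the example rows repeated 100 times.
big = pd.concat([base] * 100, ignore_index=True)

named = ("The book {title} by {author} uses "
         "{subject} as a metaphor for the human condition.")
positional = ("Some guy {} wrote a book called {} that uses "
              "{} as a metaphor for the human condition.")

candidates = {
    "concat": lambda: ("Some guy " + big['author'] + " wrote a book called " +
                       big['title'] + " that uses " + big['subject'] +
                       " as a metaphor for the human condition."),
    "itertuples": lambda: [named.format(**x._asdict())
                           for x in big.itertuples(index=False)],
    "values": lambda: [positional.format(*x)
                       for x in big[['author', 'title', 'subject']].values],
    "iterrows": lambda: [named.format(**x) for _, x in big.iterrows()],
}

# Best of 3 repeats, 5 calls each, reported as seconds per call.
timings = {name: min(timeit.repeat(fn, number=5, repeat=3)) / 5
           for name, fn in candidates.items()}

for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print("{:>10}: {:.3f} ms per call".format(name, seconds * 1e3))
```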
