You could also use DataFrame.iterrows() like this:
["The book {title} by {author} uses "
"{subject} as a metaphor for the human condition.".format(**x)
for i, x in df.iterrows()]
Which is nice if you want to:
- use named arguments, so the order of use doesn't have to match the order of columns (like above)
- avoid calling an underscore-prefixed method like _asdict() (it is documented namedtuple API, but it looks internal)
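For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The DataFrame contents are made up for illustration; only the column names (author, title, subject) are taken from the templates above.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer above.
df = pd.DataFrame({
    "author": ["Herman Melville", "Franz Kafka"],
    "title": ["Moby-Dick", "The Metamorphosis"],
    "subject": ["a whale", "an insect"],
})

template = ("The book {title} by {author} uses "
            "{subject} as a metaphor for the human condition.")

# iterrows() yields (index, Series) pairs. Unpacking the Series with **
# passes each column label as a keyword argument, so the placeholders
# can appear in any order regardless of the column order.
sentences = [template.format(**row) for _, row in df.iterrows()]

for s in sentences:
    print(s)
```

Note that the template names title before author even though the columns are stored the other way round, which is the point of the first bullet.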
Timing: the fastest appears to be M. Klugerford's second solution, even if we heed the warning about caching and take the slowest run: 5.90 times the best time of 18.9 µs is still only about 112 µs per loop.
# example
%%timeit
("Some guy " + df['author'] + " wrote a book called " +
df['title'] + " that uses " + df['subject'] +
" as a metaphor for the human condition.")
# 1000 loops, best of 3: 883 µs per loop
%%timeit
["Some guy {author} wrote a book called {title} that uses "
"{subject} as a metaphor for the human condition.".format(**x._asdict())
for x in df.itertuples(index=False)]
# 1000 loops, best of 3: 962 µs per loop
%%timeit
["Some guy {} wrote a book called {} that uses "
"{} as a metaphor for the human condition.".format(*x)
for x in df.values]
# The slowest run took 5.90 times longer than the fastest. This could mean that an intermediate result is being cached.
# 10000 loops, best of 3: 18.9 µs per loop
%%timeit
from collections import OrderedDict
["The book {title} by {author} uses "
"{subject} as a metaphor for the human condition.".format(**x)
for x in [OrderedDict(row) for i, row in df.iterrows()]]
# 1000 loops, best of 3: 308 µs per loop
%%timeit
["The book {title} by {author} uses "
"{subject} as a metaphor for the human condition.".format(**x)
for i, x in df.iterrows()]
# 1000 loops, best of 3: 413 µs per loop
Why the next-to-last is faster than the last is beyond me.
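If you want to re-run the comparison outside IPython, something like the following sketch with the standard timeit module should work. The frame contents, its size, the function names and the loop counts are all my own assumptions, so expect absolute numbers that differ from the ones above.

```python
import timeit

import pandas as pd

# Hypothetical frame: contents and size are assumptions, not the original data.
df = pd.DataFrame({
    "author": ["Author %d" % i for i in range(5)],
    "title": ["Title %d" % i for i in range(5)],
    "subject": ["subject %d" % i for i in range(5)],
})


def with_values():
    # Positional placeholders: relies on the column order author, title, subject.
    return ["Some guy {} wrote a book called {} that uses "
            "{} as a metaphor for the human condition.".format(*x)
            for x in df.values]


def with_iterrows():
    # Named placeholders: independent of the column order.
    return ["The book {title} by {author} uses "
            "{subject} as a metaphor for the human condition.".format(**x)
            for i, x in df.iterrows()]


number = 100  # calls per timing batch (arbitrary choice)
results = {}
for fn in (with_values, with_iterrows):
    # repeat() returns the total seconds for each batch; keep the best batch.
    best = min(timeit.repeat(fn, number=number, repeat=3))
    results[fn.__name__] = best / number
    print("%s: %.1f µs per loop" % (fn.__name__, results[fn.__name__] * 1e6))
```

Taking the minimum over several batches mirrors the "best of 3" that %%timeit reports.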