Say I have a dataframe with string values that contains some HTML

import pandas as pd

my_dict = {"a":{"content":"""
<p>list of important things</p>
<ul>
<li>c</li>
<li>d</li>
</ul>
"""}}

df = pd.DataFrame.from_dict(my_dict,orient='index')

The resulting dataframe looks as expected.

I'd like to export the dataframe as HTML such that my HTML string works inside the table cells.

What I've tried

I'm aware of DataFrame.to_html(escape=False), which produces:

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>content</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>a</th>
      <td>\n<p>list of important things</p>\n<ul>\n<li>c</li>\n<li>d</li>\n</ul>\n</td>
    </tr>
  </tbody>
</table>

which renders incorrectly, because that HTML contains a literal \\n rather than a real newline. I think the method has taken the repr of the string value when inserting it into the HTML conversion of the dataframe.

I know I could get away with replacing the escaped \\n with \n again in the generated HTML, after which the table renders as it should.

But I'd like to know if there is some way to tell pandas to insert the literal string values of the dataframe into the HTML, not the repr ones. I don't understand half of the kwargs for .to_html(), so I don't know if that's possible.

1 Answer

I'd like to export the dataframe as HTML such that my HTML string works inside the table cells.

If so, consider replacing \n with the HTML line break tag, i.e. <br>, if you want a visible line break in its place. Otherwise, you can just replace it with an empty string.

df['content'] = df['content'].str.replace('\n', '<br>')
df.to_html('html.html', escape=False)

And if you don't want to modify the dataframe itself, you can let pandas handle it by passing the replacement as a formatter:

df.to_html('html.html', 
           formatters = {'content': lambda k: k.replace('\n', '<br>')}, 
           escape=False)

And if you want to get rid of the newlines completely, you can replace them with an empty string, either in the dataframe itself or through a formatter:

df.to_html('html.html', 
           formatters = {'content': lambda k: k.replace('\n', '')}, 
           escape=False)
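
For completeness, here is a minimal self-contained sketch of the formatter approach that returns the HTML as a string instead of writing a file. It rebuilds the dataframe from the question. The helper name strip_newlines is made up for illustration, and the sketch assumes that calling to_html without a path returns the markup as a string.

```python
import pandas as pd

# Same data as in the question, written with explicit \n characters.
my_dict = {
    "a": {
        "content": "\n<p>list of important things</p>\n<ul>\n<li>c</li>\n<li>d</li>\n</ul>\n"
    }
}
df = pd.DataFrame.from_dict(my_dict, orient="index")


def strip_newlines(value):
    # Hypothetical helper: drop real newlines before pandas renders the cell,
    # so nothing is left for it to escape into a literal backslash-n.
    return str(value).replace("\n", "")


# No path argument, so the HTML comes back as a string.
# escape=False keeps the <p>/<ul> tags as markup instead of &lt;p&gt; etc.
html = df.to_html(formatters={"content": strip_newlines}, escape=False)
print(html)
```

The formatter only affects the rendered output, so the values stored in df keep their newlines.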

1 Comment

Wow, thank you! I could not find an explanation of how to use the formatters kwarg, but I think I understand it now
