One of my DataFrame elements contains text in HTML format: This is <a href="https://www.google.com">google</a> and this is <a href="https://www.yahoo.com">yahoo</a>

I want to save this DataFrame to an Excel file.

Can the Excel file show the string as "This is google and this is yahoo", with the two URLs in one cell?

Thanks

1 Answer

You can do something like this:

import re
import pandas as pd

df = pd.DataFrame({"text": ['This is <a href="https://www.google.com">google</a> and this is <a href="https://www.yahoo.com">yahoo</a>']})

# Collect every href target into a list, one list per row.
df["links"] = df["text"].apply(lambda x: re.findall(r'<a href="(.+?)"', x))
# Replace each <a ...>label</a> with just its label.
df["text"] = df["text"].str.replace(r"<a.+?>(.+?)</a>", r"\1", regex=True)
print(df)
#                               text                                            links
#0  This is google and this is yahoo  [https://www.google.com, https://www.yahoo.com]

2 Comments

Followed by df.to_excel() probably
Thank you, Anwarvic. However, the effect I want is this: when I export df with df.to_excel, that cell in Excel should contain two clickable URLs.
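
As far as I know, Excel attaches at most one hyperlink to a cell, so two separately clickable URLs in a single cell are probably not achievable. A possible workaround, sketched below, is to keep the plain text in one column and put each link in its own cell as a HYPERLINK formula. The helper to_formulas, the link_N column names, and the links.xlsx file name are made up for illustration, and the sketch assumes the Excel writer stores strings starting with "=" as formulas (openpyxl and xlsxwriter normally do).

```python
import re
import pandas as pd

html = ('This is <a href="https://www.google.com">google</a> '
        'and this is <a href="https://www.yahoo.com">yahoo</a>')
df = pd.DataFrame({"text": [html]})

# Capture (url, label) pairs from every anchor tag.
ANCHOR = re.compile(r'<a href="(.+?)".*?>(.+?)</a>')

def to_formulas(cell):
    # Hypothetical helper: build one =HYPERLINK(...) formula per anchor.
    return [f'=HYPERLINK("{url}", "{label}")' for url, label in ANCHOR.findall(cell)]

# Spread the formulas over columns link_1, link_2, ... (one link per cell).
link_cols = pd.DataFrame(df["text"].apply(to_formulas).tolist(), index=df.index)
link_cols.columns = [f"link_{i + 1}" for i in range(link_cols.shape[1])]

# Plain text with the tags stripped, followed by the link columns.
plain = df["text"].apply(lambda s: ANCHOR.sub(r"\2", s))
out = pd.concat([plain, link_cols], axis=1)

# Assumption: an Excel engine such as openpyxl or xlsxwriter is installed.
# out.to_excel("links.xlsx", index=False)
```

Rows with fewer links than the widest row simply get empty cells in the extra columns.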
