I am looking for a way, preferably in Python (PHP or even an online tool would also be OK), to convert a string like

"Wählen"

into a string like

"W&auml;hlen"

i.e. replacing each ISO 8859-1 character/symbol with its HTML entity.

4 Answers

3
echo htmlentities('Wählen', ENT_NOQUOTES, 'utf-8');

^ PHP

PS: Choose the flags argument based on where the encoded string will appear:

// ENT_NOQUOTES (value 0): leaves both double and single quotes as they are
echo htmlentities('"Wählen"', ENT_NOQUOTES, 'utf-8');
// ENT_QUOTES: encodes both double and single quotes
echo htmlentities('"Wählen"', ENT_QUOTES, 'utf-8');

1 Comment

I am astonished that PHP seems to provide a better solution (i.e. a simple one-liner) than Python. I like simple solutions.

2

Something like this

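$html = "Wählen";
// Assumes $html is ISO-8859-1 encoded; pass 'UTF-8' instead if your source is UTF-8.
// Note: the 'HTML-ENTITIES' target of mb_convert_encoding is deprecated as of PHP 8.2.
$html = mb_convert_encoding($html, 'HTML-ENTITIES', 'ISO-8859-1');
// OR  $html = htmlentities($html, ENT_COMPAT, 'ISO-8859-1');
// Escape the result so the entity text itself is visible when shown in a browser.
echo $new = htmlspecialchars($html, ENT_QUOTES);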

2

For Python 3:

>>> import html.entities
>>> reventities = {k:'&'+v+';' for v,k in html.entities.entitydefs.items()}
>>> "".join(reventities.get(i, i) for i in "Wählen")
'W&auml;hlen'

Another (probably faster) way

>>> toentity = {k: '&'+v+';' for k,v in html.entities.codepoint2name.items()}
>>> "Wählen".translate(toentity)
'W&auml;hlen'
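
Note that both tables above also turn `&`, `<`, `>` and `"` into entities, because those code points are in `codepoint2name`. If the string already contains markup you want to keep, a possible variant is sketched below. It assumes that only non-ASCII characters should be replaced and that characters without a named entity should fall back to numeric references. The names `NON_ASCII_ENTITIES` and `encode_named_entities` are made up for this example.

```python
import html.entities

# Map only non-ASCII code points, so &, <, > and " pass through unchanged.
NON_ASCII_ENTITIES = {
    cp: "&{};".format(name)
    for cp, name in html.entities.codepoint2name.items()
    if cp > 127
}


def encode_named_entities(text):
    """Replace non-ASCII characters with named entities where one exists.

    Characters with no entry in codepoint2name fall back to numeric
    character references via the xmlcharrefreplace error handler.
    """
    named = text.translate(NON_ASCII_ENTITIES)
    return named.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(encode_named_entities('<b>"Wählen"</b>'))  # <b>"W&auml;hlen"</b>
```

The result is plain ASCII, so it can be written to a file in any ASCII-compatible encoding.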

1

Python 2:

# -*- coding: utf-8 -*-
from htmlentitydefs import codepoint2name

def uni_to_html(s):
    new_s = ""
    for c in s:
        try:
            new_s += '&{};'.format(codepoint2name[ord(c)])
        except KeyError:
            new_s += c
    return new_s

print uni_to_html(u"Wählen")  # W&auml;hlen
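
On Python 3 the `htmlentitydefs` module is `html.entities` and `print` is a function. A rough port of the function above, as a sketch rather than a drop-in replacement, might look like this (it collects the pieces in a list and uses `dict.get` instead of `try`/`except`, which is a stylistic choice, not a requirement):

```python
from html.entities import codepoint2name  # Python 3 home of htmlentitydefs


def uni_to_html(s):
    """Replace each character that has a named HTML entity with that entity."""
    parts = []
    for c in s:
        name = codepoint2name.get(ord(c))
        # Keep the character as-is when no named entity is defined for it.
        parts.append("&{};".format(name) if name else c)
    return "".join(parts)


print(uni_to_html("Wählen"))  # W&auml;hlen
```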
