string = "Deepika Padukone, Esha Gupta or Yami Gautam - Who's looks hotter and sexier? Vote! - It's ... Deepika Padukone, Esha Gupta or Yami Gautam…. Deepika Padukone, Esha Gupta or Yami Gautam ... Tag: Deepika Padukone, Esha Gupta, Kalki Koechlin, Rang De Basanti, Soha Ali Khan, Yami  ... Amitabh Bachchan and Deepika Padukone to be seen in Shoojit Sircar's Piku ..."

fp = open("test.txt", "w+");

fp.write("%s" %string);

After running the above code I got the following error:

File "encode_error.py", line 1

SyntaxError: Non-ASCII character '\xe2' in file encode_error.py on line 1, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details
  • Did you read the link provided? Commented Jun 24, 2014 at 10:26
  • @hd1: Really? In a source file, then run with Python 2? Don't just paste that into an interactive Python session. Commented Jun 24, 2014 at 10:31
  • +1 for Deepika Padukone. Commented Jun 24, 2014 at 10:55

2 Answers


You have a U+2026 HORIZONTAL ELLIPSIS character in your string definition:

... Deepika Padukone, Esha Gupta or Yami Gautam…. ...
                                               ^

Python 2 requires that you declare the source code encoding if you use any non-ASCII characters in your source; without a declaration it assumes ASCII. (Python 3 assumes UTF-8 by default.)

Your options are to:

  • Declare the encoding, as specified in the linked PEP 263. It is a comment that must be the first or second line of your source file.

    What you set it to depends on your code editor. If you are saving files encoded as UTF-8, then the comment looks something like:

    # coding: utf-8
    

    but the format is flexible. You can spell it encoding too, for example, and use = instead of :.

  • Replace the horizontal ellipsis with three dots, as used in the rest of the string

  • Replace the codepoint with \xhh escape sequences to represent encoded data. U+2026 encoded to UTF-8 is \xe2\x80\xa6.
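
The byte sequence in the last option can be checked in an interpreter. A minimal sketch, written with escapes only so that the example source itself stays pure ASCII (the variable names are illustrative):

```python
# Pure-ASCII source: the ellipsis is written as an escape, so this file
# needs no encoding declaration.
ellipsis = u"\u2026"                 # U+2026 HORIZONTAL ELLIPSIS

# Encoding to UTF-8 yields three bytes; the first one, 0xe2, is the byte
# named in the SyntaxError message.
encoded = ellipsis.encode("utf-8")
assert encoded == b"\xe2\x80\xa6"

# Round trip: decoding the bytes gives back the single code point.
decoded = encoded.decode("utf-8")
assert decoded == ellipsis
```

Note that a \xhh sequence in a plain byte string stores encoded bytes, whereas \u2026 in a Unicode string stores the code point itself; which one you want depends on whether the rest of the program handles bytes or text.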

4 Comments

How do I declare the source code encoding in my source? I am a newbie to Python.
The string posted above was taken from a JSON object and I need to extract the string, so it may not be possible to replace the ellipsis with three dots.
@user3770743: Why not load the JSON data from a file or HTTP response with the json module then?
@user3770743: If you're reading it from a JSON object, you'll run into an overlapping but different set of problems than if you try to embed the string into source code.
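
The suggestion in the comments above could look roughly like the following. This is only a sketch: the JSON payload and its "title" key are made up for illustration, and the file is written to a temporary directory rather than the asker's test.txt location.

```python
import io
import json
import os
import tempfile

# Hypothetical JSON text. The ellipsis appears as the JSON escape \u2026,
# so this source file contains no non-ASCII characters.
raw = u'{"title": "Yami Gautam\\u2026. Deepika Padukone"}'

data = json.loads(raw)
title = data["title"]          # Unicode text containing U+2026

# Write with an explicit encoding instead of relying on a default.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with io.open(path, "w", encoding="utf-8") as fp:
    fp.write(title)

# Read it back to confirm that the text survived the round trip.
with io.open(path, "r", encoding="utf-8") as fp:
    round_trip = fp.read()
```

Because the string never appears as a literal in the source, no encoding declaration is involved; the remaining concern is choosing an encoding when writing the file.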

Add # coding: utf-8 to the top of your file.

# coding: utf-8
string = "Deepika Padukone, Esha Gupta or Yami Gautam - Who's looks hotter and sexier? Vote! - It's ... Deepika Padukone, Esha Gupta or Yami Gautam…. Deepika Padukone$

# The with block closes the file even if the write fails.
with open("test.txt", "w+") as fp:
    fp.write(string)

Explanation:

This kind of error is typically caused by standard ASCII characters, such as the apostrophe ('), being replaced by non-ASCII look-alikes, such as the typographic quotation mark (‘), during copying. It happens quite often when you copy text from a PDF file. The difference is subtle to the eye but significant as far as Python is concerned: the apostrophe is a legal string delimiter, but the typographic quotation mark is not.

Technically, it is not illegal to use any characters we want. We just have to tell Python which encoding the file uses so that it knows how to interpret these non-ASCII characters. Adding # coding: utf-8 to the top of the file tells Python that the file is encoded as UTF-8, so the file must actually be saved as UTF-8 by your editor.

UTF-8 is an encoding format to represent the characters in the Unicode set. It is used very widely on the web. Unicode is the industry standard for representing and handling text on many different platforms including the web, enterprise software, printing etc. UTF-8 is one of the more popular ways used for encoding this character set.
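
To see which characters in a piece of text would trigger the error, one option is to scan for anything outside the ASCII range. A minimal sketch; find_non_ascii is a hypothetical helper written for this example, not part of any library:

```python
import unicodedata


def find_non_ascii(text):
    """Return (line, column, character, name) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), 1):
        for col, ch in enumerate(line, 1):
            if ord(ch) > 127:
                name = unicodedata.name(ch, "UNKNOWN")
                hits.append((line_no, col, ch, name))
    return hits


# Shortened stand-in for the source in the question; the ellipsis is
# written as an escape so that this example stays ASCII.
sample = u'string = "Yami Gautam\u2026. Deepika"\nfp = open("test.txt", "w+")'
hits = find_non_ascii(sample)

for line_no, col, ch, name in hits:
    print("line %d, column %d: U+%04X %s" % (line_no, col, ord(ch), name))
```

To check a real file, read it as text with the encoding you believe it was saved in and pass the contents to the helper.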

1 Comment

Sure, but it has to be explicitly said. You can create a horizontal ellipsis in other encodings too.
