How to compile Python file with encoding declaration from an InputStream or bytes using Jython?

Asked 4 years, 11 months ago

Modified 4 years, 11 months ago

Viewed 442 times

In Java I want to check some Python 2 files for syntax errors, so using Jython seemed like a good choice. In theory this should be easy, as indicated in another answer. As I'm reading from a file, I use a Reader. I would really prefer to use an InputStream.

Reader reader = openReaderToPythonFile();
new org.python.util.PythonInterpreter().compile(reader);

The only compile() overloads on PythonInterpreter take either a String or a Reader. This means that whatever I feed it has already been decoded into a Unicode string; it is no longer raw bytes.

The problem is that I want to check an existing Python file that has lines at the top indicating an encoding of UTF-8, following PEP 263. (This is because Python 2 source files are considered ASCII by default.) It looks something like this:

#!/usr/bin/python
# -*- coding: utf-8 -*-
…

Even if I manually read the file (correctly) as UTF-8, when I pass the string (or Reader instance) to PythonInterpreter to compile, I get this error:

encoding declaration in Unicode string

In other words, PythonInterpreter is saying, "This file has an encoding declaration, but I can't respect it because you've already converted the bytes to a string before I had a chance to analyze them". But PythonInterpreter doesn't seem to provide a way to pass it the raw bytes or (preferably) an InputStream.

How can I compile a Python file with Jython if the file contains an encoding declaration? If that's not possible, as a workaround is it possible for Jython to ignore the encoding declaration and trust that I've correctly converted the bytes to a String or Reader?
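For the second option (having Jython ignore the declaration), one possible sketch is to rewrite the declaration in the already-decoded string so that it no longer matches the PEP 263 pattern, while leaving every line in place so that line numbers in reported syntax errors stay valid. CodingLineMasker and maskDeclaration are hypothetical names, not part of Jython; the pattern is the one given in PEP 263, with an extra capture group around the word "coding". Whether Jython then accepts the string without raising the error is an assumption that has not been verified here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Jython). */
final class CodingLineMasker {

    // PEP 263 pattern, with an added capture group around "coding"
    // so the word can be located and rewritten.
    private static final Pattern CODING =
            Pattern.compile("^[ \\t\\f]*#.*?(coding)[:=][ \\t]*([-_.a-zA-Z0-9]+)");

    /**
     * Returns the source with any encoding declaration on the first two
     * lines altered ("coding" becomes "c0ding"). The number of lines and
     * the length of the text are unchanged.
     */
    static String maskDeclaration(String source) {
        StringBuilder out = new StringBuilder(source.length());
        int pos = 0;
        // PEP 263 only looks at the first two lines of the file.
        for (int line = 0; line < 2 && pos < source.length(); line++) {
            int eol = source.indexOf('\n', pos);
            int end = (eol < 0) ? source.length() : eol + 1;
            String text = source.substring(pos, end);
            Matcher m = CODING.matcher(text);
            // Repeat in case the same line contains more than one match.
            while (m.find()) {
                text = text.substring(0, m.start(1)) + "c0ding" + text.substring(m.end(1));
                m = CODING.matcher(text);
            }
            out.append(text);
            pos = end;
        }
        // Everything after the second line is copied untouched.
        out.append(source, pos, source.length());
        return out.toString();
    }
}
```

The result could then be wrapped in a java.io.StringReader and handed to PythonInterpreter.compile(Reader) as in the snippet earlier. The sketch only recognizes '\n'-terminated lines and does not handle a leading byte order mark.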

asked Dec 14, 2020 at 18:00

Garret Wilson


Have you considered skipping the first two lines and passing the rest of the file content to the interpreter?

– jonrsharpe, commented Dec 14, 2020 at 18:06

@jonrsharpe that would be a last-ditch workaround, and surely there must be a workaround better than that. If I for example were to have hundreds of source files, I would have to first read and parse the source files to see if they have this header (because some of them do and some of them don't), and the header has variations according to PEP 263. But why should I need to parse the source file before handing it to Jython? The whole point of Jython is that Jython is supposed to parse the source file, not me.

– Garret Wilson, commented Dec 14, 2020 at 18:09
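
As a rough illustration of the header check discussed in the comments above, the variations allowed by PEP 263 are all covered by the single regular expression the PEP itself gives, applied to the first two lines of the file. The sketch below reads those lines from an InputStream and reports the declared encoding name, if any. Pep263Sniffer and declaredEncoding are hypothetical names, and the code uses only the Java standard library, not Jython.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Jython). */
final class Pep263Sniffer {

    // Regular expression given in PEP 263 for the declaration comment.
    private static final Pattern CODING =
            Pattern.compile("^[ \\t\\f]*#.*?coding[:=][ \\t]*([-_.a-zA-Z0-9]+)");

    /**
     * Returns the encoding name declared on the first or second line,
     * or an empty Optional if neither line carries a declaration.
     * The stream is consumed (and buffered) by this call.
     */
    static Optional<String> declaredEncoding(InputStream in) {
        // ISO-8859-1 maps each byte to exactly one char, so this decoding
        // cannot fail; the declaration itself is plain ASCII.
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.ISO_8859_1));
        try {
            for (int i = 0; i < 2; i++) {
                String line = reader.readLine();
                if (line == null) {
                    break; // fewer than two lines in the file
                }
                Matcher m = CODING.matcher(line);
                if (m.find()) {
                    return Optional.of(m.group(1));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Optional.empty();
    }
}
```

Because the call consumes the stream, the file would need to be reopened (or the bytes buffered beforehand) before decoding the full source with the detected charset. A UTF-8 byte order mark at the start of the file is not handled by this sketch.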
