
Python cross-version byte-code decompiler

decompyle3

A native Python cross-version decompiler and fragment decompiler. A reworking of uncompyle6.

I gave a talk on this at BlackHat Asia 2024.

Introduction

decompyle3 translates Python bytecode back into equivalent Python source code. It accepts bytecodes from Python version 3.7 on.
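To make "bytecode" concrete, here is a small sketch that uses only the standard library's dis module (not decompyle3 itself) to list the instructions CPython generates for a tiny function. The function name add is made up for illustration, and the exact opcode names vary between Python versions.

```python
import dis


def add(a, b):  # hypothetical example function
    return a + b


# Collect the opcode names the running interpreter compiled add() into.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)
```

A decompiler works in the opposite direction: given an instruction list like this one, it has to reconstruct return a + b.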

For decompilation of older Python bytecode, see uncompyle6.

Why this?

Uncompyle6 is awesome, but it has a fundamental problem in the way it handles control flow. In the early days of Python, when there was little optimization and code was generated in a very template-oriented way, figuring out control flow structures could be done by simply looking at code patterns.

Over the years, more code optimization, specifically around handling jumps, has made it harder to support detecting control flow strictly from code patterns. This was noticed as far back as Python 2.4 (2004), but since this is a difficult problem, so far it hasn’t been tackled in a satisfactory way.

The initial attempt to fix this problem was to add markers to the instruction stream, at first a single COME_FROM instruction, and then use those markers in pattern detection.

Over the years, I’ve extended that to be more specific, so COME_FROM_LOOP and COME_FROM_WITH were added. I also added checks at grammar-reduce time to try to make sure jumps match their supposed COME_FROM targets.
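The idea behind COME_FROM markers can be sketched with the standard dis module. This is an illustration only, not the code uncompyle6 or decompyle3 uses: the function annotate_come_from, the sample function, and the tuple layout are assumptions made for the example. Before every instruction that is the target of a jump, it emits a COME_FROM entry recording the offset of the jumping instruction.

```python
import dis
from collections import defaultdict


def annotate_come_from(func):
    """Return (name, offset) pairs with COME_FROM markers before jump targets.

    Hypothetical helper for illustration; not part of any decompiler's API.
    """
    instructions = list(dis.get_instructions(func))
    jump_opcodes = set(dis.hasjrel) | set(dis.hasjabs)

    # Map each jump target offset to the offsets of the jumps that reach it.
    sources = defaultdict(list)
    for ins in instructions:
        if ins.opcode in jump_opcodes:
            sources[ins.argval].append(ins.offset)

    annotated = []
    for ins in instructions:
        for src in sorted(sources.get(ins.offset, [])):
            annotated.append(("COME_FROM", src))
        annotated.append((ins.opname, ins.offset))
    return annotated


def sample(x):  # hypothetical function containing one conditional jump
    if x:
        return 1
    return 2


for name, offset in annotate_come_from(sample):
    print(name, offset)
```

With markers like these in the stream, a grammar rule can require that a COME_FROM appears where an if body ends, instead of guessing from the surrounding opcodes alone.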

However, all of this is complicated, not robust, has greatly slowed down deparsing and is not really tenable.

In this project, we began rewriting and refactoring the grammar.

However, even this isn’t enough. Control flow needs to be addressed by using dominators and reverse-dominators, which the python-control-flow project can give.
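As a rough illustration of what dominators provide, the sketch below computes them with the textbook iterative algorithm over a hand-written control-flow graph. It does not use the python-control-flow API; the function names and the block labels (entry, then, else, join) are hypothetical. A block D dominates a block N if every path from the entry to N passes through D. Reverse dominators are the same computation on the graph with its edges reversed, starting from the exit.

```python
def compute_dominators(cfg, entry):
    """Iteratively compute the dominator set of every node.

    cfg maps each node to a list of its successors; every node must be a key.
    """
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in preds[n]]
            new = set.intersection(*pred_doms) if pred_doms else set()
            new = new | {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def reverse_edges(cfg):
    """Return the graph with every edge flipped."""
    rev = {n: [] for n in cfg}
    for n, succs in cfg.items():
        for s in succs:
            rev[s].append(n)
    return rev


# An if/else "diamond": both branches rejoin at a common block.
diamond = {
    "entry": ["then", "else"],
    "then": ["join"],
    "else": ["join"],
    "join": [],
}

dominators = compute_dominators(diamond, "entry")
reverse_dominators = compute_dominators(reverse_edges(diamond), "join")
```

Here join is dominated only by entry and itself, since neither branch lies on every path, while entry is reverse-dominated by join. That pairing is the kind of structural fact that identifies an if/else that rejoins, independent of the exact jump opcodes used.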

This I am finally, slowly, doing in yet another non-public project. It is a lot of work. Funding in the form of sponsorship, while greatly appreciated, isn’t commensurate with the amount of effort, and currently I have a full-time job. So it may take time before it is available publicly, if at all.

Requirements

The code here can be run on Python versions 3.7 or 3.8. The bytecode files it can read have been tested on Python bytecodes from versions 3.7 and 3.8.

Installation

You can install from PyPI using the name decompyle3:

$ pip install decompyle3

To install from source code, this project uses setup.py, so it follows the standard Python routine:

$ pip install -e .  # set up to run from source tree

or:

$ python setup.py install # may need sudo

A GNU Makefile is also provided, so make install (possibly as root or sudo) will do the steps above.

Running Tests

make check

A GNU makefile has been added to smooth over setting up and running the right command, and running tests from fastest to slowest.

If you have remake installed, you can see the list of all tasks, including tests, via remake --tasks.

Usage

Run

$ decompyle3 *compiled-python-file-pyc-or-pyo*

For usage help:

$ decompyle3 -h

Verification

If you want Python syntax verification of the correctness of the decompilation process, add the --syntax-verify option. However, since Python syntax changes between versions, you should use this option only if the bytecode was produced by the same Python version as the interpreter that will be checking the syntax.
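One way to check whether a bytecode file matches the running interpreter is to compare its magic number, the first four bytes of a .pyc file, against importlib.util.MAGIC_NUMBER. The sketch below is an assumption about how you might do this yourself, not something decompyle3 exposes; the helper name pyc_matches_running_interpreter and the hello.py file are made up for the example.

```python
import importlib.util
import os
import py_compile
import tempfile


def pyc_matches_running_interpreter(pyc_path):
    """Return True if the .pyc magic number equals this interpreter's.

    Hypothetical helper for illustration.
    """
    with open(pyc_path, "rb") as f:
        magic = f.read(4)
    return magic == importlib.util.MAGIC_NUMBER


# Compile a throwaway source file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "hello.py")
    with open(src, "w") as f:
        f.write("print('hello')\n")
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "hello.pyc"))
    matches = pyc_matches_running_interpreter(pyc)

print(matches)
```

If the check fails for a file you want to decompile, syntax verification under the current interpreter may report errors that come from the version mismatch and not from the decompilation.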

You can also cross-compare the results with another Python decompiler like unpyc37. Since the two work differently, bugs here often aren’t present there, and vice versa.

There is an interesting class of these programs that is readily available to give stronger verification: those programs that, when run, test themselves. Our test suite includes these.
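What "programs that test themselves" means can be sketched as follows. The program text and the helper runs_clean are hypothetical and not taken from the test suite. The idea is that if both the original source and the decompiled source run without an assertion failure, the decompilation preserved at least the behavior that the assertions exercise.

```python
# A tiny self-checking program: running it is the test.
SELF_CHECKING_SOURCE = '''
def gcd(a, b):
    while b:
        a, b = b, a % b
    return a

assert gcd(12, 18) == 6
assert gcd(7, 13) == 1
'''


def runs_clean(source_text):
    """Execute the program text and report whether its assertions all hold."""
    try:
        exec(compile(source_text, "<self-check>", "exec"), {})
    except AssertionError:
        return False
    return True


print(runs_clean(SELF_CHECKING_SOURCE))
```

In practice you would call a check like this twice, once on the original source and once on the text the decompiler produced from its bytecode, and compare the outcomes.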

And Python comes with another set of programs like this: its test suite for the standard library. We have some code in test/stdlib to facilitate this kind of checking too.

Known Bugs/Restrictions

We support only released versions, not candidate versions. Note however that the magic of a released version is usually the same as the last candidate version prior to release.

We also don’t handle PJOrion or otherwise obfuscated code. For PJOrion try: PJOrion Deobfuscator to unscramble the bytecode to get valid bytecode before trying this tool; pydecipher might help with that.

This program can’t decompile Microsoft Windows EXE files created by Py2EXE, although we can probably decompile the code after you extract the bytecode properly. Pydeinstaller may help with unpacking PyInstaller bundles.

Handling pathologically long lists of expressions or statements is slow. We don’t handle Cython or MicroPython, which don’t use CPython bytecode.

There are numerous bugs in decompilation. And that’s true for every other CPython decompiler I have encountered, even the ones that claimed to be “perfect” on some particular version like 2.4.

As Python progresses, decompilation also gets harder because the compilation is more sophisticated and the language itself is more sophisticated. I suspect that there will be fewer ad-hoc attempts like unpyc37 (which is based on a 3.3 decompiler) simply because it is harder to do so. The good news, at least from my standpoint, is that I think I understand what’s needed to address the problems in a more robust way. But right now, until such time as the project is better funded, I do not intend to make any serious effort to support Python versions 3.8 or 3.9, including bugs that might come in. I imagine at some point I may be interested in it.

You can easily find bugs by running the tests against the standard test suite that Python uses to check itself. At any given time, there are dozens of known problems that are pretty well isolated and that could be solved if one were to put in the time to do so. The problem is that there aren’t that many people who have been working on bug fixing.

You may run across a bug that you want to report. Please do so. But be aware that it might not get my attention for a while. If you sponsor or support the project in some way, I’ll prioritize your issues above the queue of other things I might be doing instead. In rare situations, I can do a hand decompilation of bytecode for a fee. However, this is expensive, usually beyond what most people are willing to spend.

See Also
