I need to get the source code of all files in a commit. I am currently using PyDriller, which works well, but for performance reasons I need to switch to GitPython. I have tried this:

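import io
from git import Repo

repo = Repo('path to repo')
commit = repo.commit('my hash')
# target_file: a blob from this commit's tree (its definition is not shown here)
with io.BytesIO(target_file.data_stream.read()) as f:
    print(f.read().decode('utf-8'))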

But I get this error:

Traceback (most recent call last):
  File "D:\Programmi\Python36\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "D:\Programmi\Python36\lib\threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "D:/Workspaces/PythonProjects/fixing-commit/crop_data_preparing_gitpython.py", line 82, in get_commit_data_gitpython
    print(f.read().decode('utf-8'))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9f in position 18: invalid start byte

I thought this might be an encoding problem, but changing the encoding from utf-8 to latin-1 doesn't help either.

Is there another strategy that would help me get the code for those files using GitPython?
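
On the encoding point: byte 0x9f is a continuation byte in UTF-8, so it can never start a character, which is what "invalid start byte" reports. One possible reading is that some files in the commit are binary or were saved in a legacy encoding. The sketch below assumes that reading. The helper name `decode_source`, the NUL-byte test for binary content, and the choice of cp1252 as a fallback are all illustrative assumptions, not something taken from the question.

```python
from typing import Optional

# Assumed fallback order; adjust to the encodings your repositories really use.
FALLBACK_ENCODINGS = ("utf-8", "cp1252")


def decode_source(raw: bytes) -> Optional[str]:
    """Return the text of a file, or None if the content looks binary."""
    # Heuristic: text files almost never contain NUL bytes.
    if b"\x00" in raw:
        return None
    for encoding in FALLBACK_ENCODINGS:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: keep the valid UTF-8 parts and mark undecodable bytes
    # with U+FFFD instead of raising.
    return raw.decode("utf-8", errors="replace")
```

With this helper, the bytes read from `target_file.data_stream` could be passed to `decode_source` instead of calling `.decode('utf-8')` directly, and a `None` result would mean the file is skipped as binary.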

  • PyDriller uses GitPython, so with a little searching, I think you can find happiness. Commented Jun 18, 2019 at 11:12
  • For a relative path path/to/foo.bar, try repo.git.show('%s:%s' % (commit.hexsha, 'path/to/foo.bar')). Commented Jun 18, 2019 at 11:39
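
As an alternative to calling `repo.git.show` once per path, the commit's tree can be walked directly. This is a minimal sketch, assuming that "all files in a commit" means every file in the tree at that commit rather than only the files the commit changed. The function name `iter_commit_sources` is hypothetical, and binary files are skipped with a simple NUL-byte heuristic.

```python
from typing import Iterator, Tuple


def iter_commit_sources(commit) -> Iterator[Tuple[str, str]]:
    """Yield (path, text) for every text file in the commit's tree."""
    for item in commit.tree.traverse():
        # traverse() also yields sub-trees and submodules; keep only files.
        if item.type != "blob":
            continue
        raw = item.data_stream.read()
        # Heuristic: a NUL byte suggests a binary file, so skip it.
        if b"\x00" in raw:
            continue
        # errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
        yield item.path, raw.decode("utf-8", errors="replace")


# Usage (needs GitPython; path and hash are the placeholders from the question):
#   from git import Repo
#   commit = Repo('path to repo').commit('my hash')
#   for path, text in iter_commit_sources(commit):
#       print(path, len(text))
```

Decoding with `errors="replace"` trades fidelity for robustness: a file in another encoding still comes back as text, but its non-UTF-8 bytes appear as replacement characters.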
