I need to get the source code of every file in a commit. I am currently using PyDriller, and it works well, but for performance reasons I need to use GitPython. I have tried this solution:
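import io
from git import Repo

repo = Repo('path to repo')
commit = repo.commit('my hash')
target_file = commit.tree / 'path/to/file'  # placeholder: a blob from the commit's tree
with io.BytesIO(target_file.data_stream.read()) as f:
    print(f.read().decode('utf-8'))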
But I get this error:
Traceback (most recent call last):
  File "D:\Programmi\Python36\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "D:\Programmi\Python36\lib\threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "D:/Workspaces/PythonProjects/fixing-commit/crop_data_preparing_gitpython.py", line 82, in get_commit_data_gitpython
    print(f.read().decode('utf-8'))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9f in position 18: invalid start byte
I thought this might be an encoding problem, but even changing the encoding from utf-8 to latin-1 doesn't help.
Is there another strategy that would help me get the code for those files using GitPython?
PyDriller uses GitPython, so with a little searching, I think you can find happiness:
repo.git.show('%s:%s' % (commit.hexsha, 'path/to/foo.bar'))
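If the goal is every file in the commit rather than one known path, a rough sketch along these lines might help. It walks the commit's tree with GitPython and decodes each blob tolerantly. The byte 0x9f in the traceback suggests at least one file is not valid UTF-8, possibly a binary file, but that is a guess. The names `decode_source` and `iter_commit_sources` are made up for this example, and the NUL-byte check is only a heuristic for spotting binary content.

```python
from typing import Iterator, Optional, Tuple


def decode_source(data: bytes, encoding: str = "utf-8") -> Optional[str]:
    """Return the text of a blob, or None if the bytes look binary."""
    # Heuristic (an assumption, not a guarantee): a NUL byte near the start
    # of the data is treated as a sign of a binary file.
    if b"\0" in data[:8000]:
        return None
    # Bytes that are invalid in the chosen encoding become U+FFFD
    # instead of raising UnicodeDecodeError.
    return data.decode(encoding, errors="replace")


def iter_commit_sources(repo_path: str, rev: str) -> Iterator[Tuple[str, Optional[str]]]:
    """Yield (path, text) for every file in the tree of commit `rev`."""
    # GitPython is imported here so decode_source can be used without it.
    from git import Repo

    repo = Repo(repo_path)
    commit = repo.commit(rev)
    for item in commit.tree.traverse():
        if item.type != "blob":  # skip sub-trees and submodules
            continue
        yield item.path, decode_source(item.data_stream.read())
```

Looping over `iter_commit_sources('path to repo', 'my hash')` gives one `(path, text)` pair per file, with `text` set to `None` for anything that looks binary, so those files can be skipped instead of crashing the thread.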