I tried to format git log output as JSON but failed miserably.

I used this command for the formatting, and I don't think this is where my problem lies, but hey you never know.

These are my functions.

import subprocess

def call_git_log():
    format: str = '{%n  "commit": "%H",%n  "abbreviated_commit": "%h",%n  "tree": "%T",%n  "abbreviated_tree": "%t",%n  "parent": "%P",%n  "abbreviated_parent": "%p",%n  "refs": "%D",%n  "encoding": "%e",%n  "subject": "%s",%n  "sanitized_subject_line": "%f",%n  "body": "%b",%n  "commit_notes": "%N",%n  "verification_flag": "%G?",%n  "signer": "%GS",%n  "signer_key": "%GK",%n  "author": {%n    "name": "%aN",%n    "email": "%aE",%n    "date": "%aD"%n  },%n  "commiter": {%n    "name": "%cN",%n    "email": "%cE",%n    "date": "%cD"%n  }%n},'
    output = subprocess.Popen(["git", "log", f"--pretty=format:{format}"], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = output.communicate()
    return stdout.decode("utf-8")

output = call_git_log()
with open("output/test.json", "w") as file:
    print(output, file=file)

As a result I get this file, which is not valid JSON. Why is that, and what is wrong? output/test.json:

{
  "commit": "4099117e564e7106b7ee7e315e3e8b8458a8fdce",
  "abbreviated_commit": "4099117",
  "tree": "6b1eb2fbf81de876d14781ffa82b5ee5db973af6",
  "abbreviated_tree": "6b1eb2f",
  "parent": "37445d7254f726801ce5ed067b8ee8eb523b8b99",
  "abbreviated_parent": "37445d7",
  "refs": "HEAD -> master, master/master",
  "encoding": "",
  "subject": "ue04-plots - A.3 fertig",
  "sanitized_subject_line": "ue04-plots-A.3-fertig",
  "body": "",
  "commit_notes": "",
  "verification_flag": "N",
  "signer": "",
  "signer_key": "",
  "author": {
    "name": "Lorenz Bauer",
    "email": "",
    "date": "Wed, 14 Dec 2022 00:50:33 +0100"
  },
  "commiter": {
    "name": "Lorenz Bauer",
    "email": "",
    "date": "Wed, 14 Dec 2022 00:50:33 +0100"
  }
},
{
  "commit": "37445d7254f726801ce5ed067b8ee8eb523b8b99",
  "abbreviated_commit": "37445d7",
  "tree": "caa7df1bd70b5fd2319e903331c2a96d80f08152",
  "abbreviated_tree": "caa7df1",
  "parent": "cb484ec66468c5bbac1f78a8ed87852202207701",
  "abbreviated_parent": "cb484ec",
  "refs": "",
  "encoding": "",
  "subject": "ue04-plots - arrows",
  "sanitized_subject_line": "ue04-plots-arrows",
  "body": "",
  "commit_notes": "",
  "verification_flag": "N",
  "signer": "",
  "signer_key": "",
  "author": {
    "name": "Lorenz Bauer",
    "email": "",
    "date": "Wed, 14 Dec 2022 00:48:45 +0100"
  },
  "commiter": {
    "name": "Lorenz Bauer",
    "email": "",
    "date": "Wed, 14 Dec 2022 00:48:45 +0100"
  }
},
{
  "commit": "cb484ec66468c5bbac1f78a8ed87852202207701",
  "abbreviated_commit": "cb484ec",
  "tree": "73e2e71396290d9627b9301451ca5a1bb7ba6df4",
  "abbreviated_tree": "73e2e71",
  "parent": "becd22ff715defbe00e064181ee71266e3d1db45",
  "abbreviated_parent": "becd22f",
  "refs": "",
  "encoding": "",
  "subject": "ue04-plots - titel",
  "sanitized_subject_line": "ue04-plots-titel",
  "body": "",
  "commit_notes": "",
  "verification_flag": "N",
  "signer": "",
  "signer_key": "",
  "author": {
    "name": "Lorenz Bauer",
    "email": "",
    "date": "Wed, 14 Dec 2022 00:33:59 +0100"
  },
  "commiter": {
    "name": "Lorenz Bauer",
    "email": "",
    "date": "Wed, 14 Dec 2022 00:33:59 +0100"
  }
},

What do I have to change to make this a valid JSON document that json.loads() can process?

  • Generally, it's a good idea to use someone else's already-debugged Git Python library rather than trying to hand-parse git log output. Get the data you want, put it into a data structure, and use an existing Python JSON library to generate the JSON. Commented Dec 14, 2022 at 3:45

1 Answer

It looks like you are turning git log output into a JSON file, handing that file to a JSON parser, and getting an error there?

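Yes, your output is not valid JSON. It is a comma-separated sequence of objects, so to be an array it needs an opening [ at the beginning and a closing ] at the end, and the trailing comma after the last object has to go.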

See https://stackoverflow.com/a/4600561/9035237 and https://gist.github.com/textarcana/1306223 for a post-processing example. All the code at the link you mentioned does this too.

If you are using Python, you can do:

output = "[" + output + "]"
output = output.replace("},]", "}]")
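A slightly more defensive variant of the same idea, as a sketch: it strips trailing whitespace before removing the last comma, so it also works when the text is read back from a file that print() ended with a newline. The helper name wrap_as_array and the sample string are made up for illustration.

```python
import json

def wrap_as_array(raw: str) -> str:
    """Turn '{...},{...},' into '[{...},{...}]'."""
    # Remove surrounding whitespace first, then the comma after the last object.
    body = raw.strip().rstrip(",")
    return "[" + body + "]"

# Shortened stand-in for the git log output shown in the question.
sample = '{"commit": "4099117"},\n{"commit": "37445d7"},\n'
commits = json.loads(wrap_as_array(sample))
print(len(commits))
```

This only repairs the array syntax. It does nothing about quotes or line breaks inside the fields.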

However, your format still has problems: JSON does not allow a raw line break inside a string, and a " in any field will break the output. Both are likely to show up in a commit message, so your format should change.
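One possible way to change the format, sketched under assumptions: instead of writing JSON syntax into the format string, let git separate fields and records with control bytes, split the text in Python, and leave all escaping to json.dumps. The separators %x1f and %x1e are my choice and assume those bytes never occur in commit data. The names FIELDS, call_git_log_records and parse_git_log are illustrative, and only three of your fields are shown.

```python
import json
import subprocess

# (key, git placeholder) pairs; extend with the other fields as needed.
FIELDS = [("commit", "%H"), ("subject", "%s"), ("body", "%b")]
FIELD_SEP = "\x1f"   # ASCII unit separator, assumed absent from commit data
RECORD_SEP = "\x1e"  # ASCII record separator, same assumption

def call_git_log_records() -> str:
    # %x1f and %x1e tell git to emit the separator bytes itself.
    fmt = "%x1f".join(placeholder for _, placeholder in FIELDS) + "%x1e"
    result = subprocess.run(
        ["git", "log", f"--pretty=format:{fmt}"],
        capture_output=True, check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")

def parse_git_log(raw: str) -> list:
    keys = [key for key, _ in FIELDS]
    commits = []
    for record in raw.split(RECORD_SEP):
        record = record.lstrip("\n")  # newline git inserts between commits
        if not record:
            continue
        commits.append(dict(zip(keys, record.split(FIELD_SEP))))
    return commits

# Synthetic input with a quote and a line break, so no repository is needed.
sample = 'abc\x1fsay "hi"\x1fline 1\nline 2\n\x1e\ndef\x1ffix\x1f\x1e'
text = json.dumps(parse_git_log(sample), indent=2)
print(text)
```

With a real repository you would pass call_git_log_records() to parse_git_log() instead of the sample. Because json.dumps produces the text, quotes and line breaks in commit messages are escaped for you and json.loads() reads the result back.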

As https://gist.github.com/varemenos/e95c2e098e657c7688fd?permalink_comment_id=3260906#gistcomment-3260906 suggests, you can use a hack: pick a string that will probably not occur in any field, for example ^^^^, and use it in the format as a temporary quote placeholder. Then escape the special characters (for example, a line break becomes the two characters \n, and " becomes \"), and only at the very end replace ^^^^ with ". Don't pretty-print the JSON at this step; leave that to a JSON formatter.
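The placeholder hack could look roughly like this in Python. This is a sketch, not the code from the gist: the function name placeholder_to_json and the sample record are invented, it handles one record at a time (splitting the git output into records is assumed to have happened already), and it only escapes backslashes, quotes, and the common whitespace control characters.

```python
import json

PLACEHOLDER = "^^^^"  # assumed never to occur in any commit field

def placeholder_to_json(record: str) -> str:
    # Escape backslashes first so the escapes added below are not doubled.
    escaped = record.replace("\\", "\\\\")
    # Real quotes inside fields become \" ...
    escaped = escaped.replace('"', '\\"')
    # ... and raw line breaks and tabs become their JSON escapes.
    escaped = escaped.replace("\n", "\\n").replace("\r", "\\r").replace("\t", "\\t")
    # Only now turn the placeholders into the structural quotes.
    return escaped.replace(PLACEHOLDER, '"')

# One record as git would print it for a format such as
# '{^^^^subject^^^^: ^^^^%s^^^^, ^^^^body^^^^: ^^^^%b^^^^}'
record = '{^^^^subject^^^^: ^^^^say "hi"^^^^, ^^^^body^^^^: ^^^^line 1\nline 2^^^^}'
commit = json.loads(placeholder_to_json(record))
print(commit["subject"])
```

The order matters: if ^^^^ were replaced before the escaping step, the structural quotes would be escaped as well and the result would no longer parse.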
