1

I am very new to Python, I need to merge two json file with multiple json object based on "Id".

File1.json

{"id": 1, "name": "Ault", "class": 8, "email": "[email protected]"}
{"id": 2, "name": "john", "class": 8, "email": "[email protected]"}
{"id": 3, "name": "josh", "class": 8, "email": "[email protected]"}
{"id": 4, "name": "emma", "class": 8, "email": "[email protected]"}

File2.json

{"id": 4, "math": "A", "class": 8, "physics": "D"}
{"id": 2, "math": "B", "class": 8, "physics": "C"}
{"id": 3, "math": "A", "class": 8, "physics": "A"}
{"id": 1, "math": "C", "class": 8, "physics": "B"}

I have tried both json.loads(jsonObj) and json.load(path). Both throw errors. I know both files are not a valid json as a whole (combined), but each line in a file is a valid json. I want to read line by line and merge both.

1
  • list1 = [json.loads(line) for line in open(path)] and then you can post your attempt at merging (hint: plenty of questions on merging dicts here on SO) Commented Jul 9, 2021 at 9:34

2 Answers 2

3

You can read line by line then parse

# asuming

# File1.json

# {"id": 1, "name": "Ault", "class": 8, "email": "[email protected]"}
# {"id": 2, "name": "john", "class": 8, "email": "[email protected]"}
# {"id": 3, "name": "josh", "class": 8, "email": "[email protected]"}
# {"id": 4, "name": "emma", "class": 8, "email": "[email protected]"}

# File2.json

# {"id": 4, "math": "A", "class": 8, "physics": "D"}
# {"id": 2, "math": "B", "class": 8, "physics": "C"}
# {"id": 3, "math": "A", "class": 8, "physics": "A"}
# {"id": 1, "math": "C", "class": 8, "physics": "B"}


import json
merged = {}

with open('File1.json') as f:
    for line in f:
        jsonified = json.loads(line)
        merged[jsonified['id']] = jsonified
        
with open('File2.json') as f:
    for line in f:
        jsonified = json.loads(line)
        merged[jsonified['id']].update(jsonified) # asuming both file has same ids otherwise use try catch

merged = list(merged.values())
print(merged)
[{'id': 1,
  'name': 'Ault',
  'class': 8,
  'email': '[email protected]',
  'math': 'C',
  'physics': 'B'},
 {'id': 2,
  'name': 'john',
  'class': 8,
  'email': '[email protected]',
  'math': 'B',
  'physics': 'C'},
 {'id': 3,
  'name': 'josh',
  'class': 8,
  'email': '[email protected]',
  'math': 'A',
  'physics': 'A'},
 {'id': 4,
  'name': 'emma',
  'class': 8,
  'email': '[email protected]',
  'math': 'A',
  'physics': 'D'}]
Sign up to request clarification or add additional context in comments.

Comments

1

Here is my take on this using pandas.

import pandas as pd
import os

os.chdir(os.getcwd())
file_path_1 = 'file1.json'
file_path_2 = 'file2.json'

df1 = pd.read_json(file_path_1, lines=True)
df2 = pd.read_json(file_path_2, lines=True)

df = df1.merge(df2, on='id')

print(df)

Output:

   id  name  class_x              email math  class_y physics
0   1  Ault        8  [email protected]    C        8       B
1   2  john        8  [email protected]    B        8       C
2   3  josh        8  [email protected]    A        8       A
3   4  emma        8  [email protected]    A        8       D

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.