6

I am creating a ML model that will use a JSON file to understand the pattern and response format. As I have my data in excel format I converted it to JSON in python.

Here is the code:

import xlrd
from collections import OrderedDict
import simplejson as json
# Open the workbook and select the first worksheet
wb = xlrd.open_workbook('D:\\android\\testdata2.xlsx')
sh = wb.sheet_by_index(0)
# List to hold dictionaries
data_list = []
# Iterate through each row in worksheet and fetch values into dict
for rownum in range(1, sh.nrows):
    data = OrderedDict()
    row_values = sh.row_values(rownum)
    data['pattern'] = row_values[0]
    data['response'] = row_values[1]
    data_list.append(data)
# Serialize the list of dicts to JSON
j = json.dumps(data_list)
# Write to file
with open('data1.json', 'w') as f:
    f.write(j)

I am the getting the output as:

[{
    "pattern": "WALLSTENT NON COUVERTE ",
    "response": "ENDOPROTHESE STENT  VASCULAIRE "
}, {
    "pattern": "PRIMEADVANCED SURSCAN MRI ",
    "response": "NEUROSTIMULATEUR NERF VAGUE GAUCHE "
}, {
    "pattern": "AVASTIN  FLACON DE",
    "response": "BEVACIZUMAB"
}, {
    "pattern": "PERJETA SOLUTION A DILUER POUR PERFUSION",
    "response": "BRENTUXIMAB VEDOTIN"
}]

The desired output I am looking for is like this:

{
    "intents": [{
        "pattern": ["WALLSTENT, NON, COUVERTE "],
        "response": ["ENDOPROTHESE STENT  VASCULAIRE] "
    }, {
        "pattern": ["PRIMEADVANCED ,SURSCAN ,MRI"] ,
        "response": ["NEUROSTIMULATEUR NERF VAGUE GAUCHE "]
    }, {
        "pattern": ["AVASTIN , FLACON ,DE"],
        "response": ["BEVACIZUMAB"]
    }, {
        "pattern": ["PERJETA, SOLUTION, A, DILUER, POUR ,PERFUSION"],
        "response": ["BRENTUXIMAB VEDOTIN"]
    }]
}

What modification can I do in my function to get the output I am looking for.

2 Answers 2

2

That should do it:

import xlrd
from collections import OrderedDict
import simplejson as json
# Open the workbook and select the first worksheet
wb = xlrd.open_workbook('D:\\android\\testdata2.xlsx')
sh = wb.sheet_by_index(0)
# List to hold dictionaries
data_list = []
# Iterate through each row in worksheet and fetch values into dict
for rownum in range(1, sh.nrows):
    data = OrderedDict()
    row_values = sh.row_values(rownum)
    data['pattern'] = row_values[0]
    data['response'] = row_values[1]
    data_list.append(data)
data_list = {'intents': data_list} # Added line
# Serialize the list of dicts to JSON
j = json.dumps(data_list)
# Write to file
with open('data1.json', 'w') as f:
    f.write(j)

Note the added data_list = {'intents': data_list}.

Sign up to request clarification or add additional context in comments.

3 Comments

i added the line and it gives this error as i run the program. "list indices must be integers or slices, not str"
Apologies, I fixed it.
Note that in the python shell there needs to be a new line between data_list.append(data) and data_list = {'intents': data_list}
2

Give a shot to pyexcel_xlsx library in python. I have used this for converting xlsx to json. Sweet and simple one. And fast also as compared to other python libraries.

Sample code:

from pyexcel_xlsx import get_data;
import time;
import json;

data = get_data("D:\\android\\testdata2.xlsx")
sheetName = "Table A";

data_list = []
# Iterate through each row and append in above list
for i in range(0, len(data[sheetName])):
    data_list.append({
        'pattern' : data[sheetName][i][0],
        'response' : data[sheetName][i][1]
    })
data_list = {'intents': data_list} # Converting to required object
j = json.dumps(data_list)
# Write to file
with open('data1.json', 'w') as f:
    f.write(j)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.