1

I get the following error when I try to create a DataFrame from my JSON/dict object. I'm fairly new to Python and doing this as a learning exercise, so all help is appreciated.

What I want is 1 row of values with 100 columns, but the error tells me that, the way I've set this up, it's expecting 100 values for each of 100 columns.

I can make this work if I don't pass columns=json_input.keys(), but then the result defaults to 100 rows and 1 column instead of 1 row and 100 columns.

import pandas as pd

json_input = {"x0": "9.521496806", "x1": "wed", "x2": "-5.087588682", "x3": "-17.21471427", "x4": "-2.486421073", "x5": "35.48653879", "x6": "-1.20495816", "x7": "-23.7174973", "x8": "105.7946327", "x9": "-5.951938559", "x10": "5.214871257", "x11": "0.303798139", "x12": "$296.43 ", "x13": "-0.194132881", "x14": "-2.188915191", "x15": "15.8504554", "x16": "-7.419140411", "x17": "6.931577729", "x18": "-33.76908811", "x19": "-1.932735617", "x20": "0.066503478", "x21": "0.014625357", "x22": "-2.826542568", "x23": "-9.51560375", "x24": "27.31797115", "x25": "-4.210150941", "x26": "-13.45071138", "x27": "17.51376958", "x28": "0.14235993", "x29": "6.49488499", "x30": "8.922856241", "x31": "1.264469019", "x32": "-14.22456453", "x33": "-22.51356894", "x34": "2.042085808", "x35": "7.996513763", "x36": "15.62250736", "x37": "-36.34086747", "x38": "2.665399772", "x39": "-1.354001761", "x40": "33.71068143", "x41": "11.74949803", "x42": "-2.793416547", "x43": "71.4392679", "x44": "-3.57085601", "x45": "-10.61019691", "x46": "63.36622572", "x47": "1.084953519", "x48": "0.965175942", "x49": "15.41097088", "x50": "38.02325393", "x51": "-4.601041878", "x52": "9.544564428", "x53": "5.171864325", "x54": "Aug", "x55": "3.238851899", "x56": "-1.444656373", "x57": "-24.85405723", "x58": "-0.127639937", "x59": "14.69515683", "x60": "-3.577237241", "x61": "12.67485", "x62": "-26.60833996", "x63": "22.3566647", "x64": "0.187033314", "x65": "-20.08925727", "x66": "0.3013055", "x67": "9.782791255", "x68": "-0.590871745", "x69": "-27.03617115", "x70": "0.178891203", "x71": "9.297257064", "x72": "-0.687360237", "x73": "23.1353161", "x74": "-1.692361883", "x75": "6.007302227", "x76": "-0.05636968", "x77": "20.23959571", "x78": "4.889493523", "x79": "0.02%", "x80": "20.11423271", "x81": "22.31711274", "x82": "asia", "x83": "-0.072090104", "x84": "volkswagon", "x85": "-0.14252212", "x86": "0.464293542", "x87": "-0.974325314", "x88": "7.131219017", "x89": "-2.506897555", "x90": "-0.069832619", 
"x91": "-11.84213839", "x92": "0.09761061", "x93": "15.27673142", "x94": "-1.927285625", "x95": "8.008175145", "x96": "0.659805361", "x97": "2.216918955", "x98": "-18.64465705", "x99\n": "-1.926577376"}

data_input = pd.DataFrame.from_dict(data=json_input, orient='index', columns=json_input.keys())

print(data_input)
C:\dev\anaconda3\lib\site-packages\sklearn\externals\joblib\__init__.py:15: DeprecationWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.
  warnings.warn(msg, category=DeprecationWarning)
Traceback (most recent call last):
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\internals\managers.py", line 1651, in create_block_manager_from_blocks
    placement=slice(0, len(axes[0])))]
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\internals\blocks.py", line 3095, in make_block
    return klass(values, ndim=ndim, placement=placement)
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\internals\blocks.py", line 2631, in __init__
    placement=placement)
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\internals\blocks.py", line 87, in __init__
    '{mgr}'.format(val=len(self.values), mgr=len(self.mgr_locs)))
ValueError: Wrong number of items passed 1, placement implies 100

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pandastest.py", line 18, in <module>
    data_input = pd.DataFrame.from_dict(data=json_input, orient='index', columns=json_input.keys())
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\frame.py", line 1138, in from_dict
    return cls(data, index=index, columns=columns, dtype=dtype)
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\frame.py", line 451, in __init__
    copy=copy)
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\internals\construction.py", line 167, in init_ndarray
    return create_block_manager_from_blocks([values], [columns, index])
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\internals\managers.py", line 1660, in create_block_manager_from_blocks
    construction_error(tot_items, blocks[0].shape[1:], axes, e)
  File "C:\dev\anaconda3\lib\site-packages\pandas\core\internals\managers.py", line 1691, in construction_error
    passed, implied))
ValueError: Shape of passed values is (100, 1), indices imply (100, 100)
2
  • Hi mike, in the data you gave, "x0" is a column and "9.52..." is the value of the first row for that column, right? Commented Sep 30, 2019 at 20:06
  • correct @Drago96 Commented Sep 30, 2019 at 20:22

2 Answers

2

Hi mike, you could add a transpose to the end of your line:

pd.DataFrame.from_dict(data=json_input, orient='index').T

This will give you the shape you are looking for.
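
As a minimal sketch of why the transpose works, here is the same call on a shortened, three-key version of the dictionary (the shortened data and the names `tall` and `wide` are only for illustration):

```python
import pandas as pd

# Shortened stand-in for the 100-key dictionary from the question.
json_input = {"x0": "9.521496806", "x1": "wed", "x2": "-5.087588682"}

# orient='index' turns every key into a row label: one row per key, one column.
tall = pd.DataFrame.from_dict(data=json_input, orient='index')

# .T swaps rows and columns, so the keys become the column labels.
wide = tall.T

print(tall.shape)
print(wide.shape)
print(list(wide.columns))
```

The values are still strings after this step, so numeric columns would need a separate conversion.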


1 Comment

That was it! Fixed the issue, without having to convert the JSON dict.
1

It gives you 100 rows because you're passing orient='index', which turns each dictionary key into a row label. If you have control over your data structure, you can use the simpler pd.DataFrame.from_dict(data=json_input) with the input formatted like this:

{
    "column1": [value],
    "column2": [value],
    ...
}

Or in your case:

{
    "x0": ["9.521496806"],
    "x1": ["wed"],
    ...
}
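
If the input cannot be changed at the source, one way to get that shape is to wrap each value in a one-element list before building the frame. This is a sketch on a shortened dictionary; the name `formatted` is just illustrative:

```python
import pandas as pd

# Shortened stand-in for the dictionary from the question.
json_input = {"x0": "9.521496806", "x1": "wed", "x2": "-5.087588682"}

# Wrap every scalar in a list so each key maps to a column of length 1.
formatted = {key: [value] for key, value in json_input.items()}

# With the default orient='columns', keys become columns and list items become rows.
df = pd.DataFrame.from_dict(data=formatted)

print(df.shape)
print(df)
```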

3 Comments

OK. It comes as JSON since the API sends the data like `{"column1": "value1"}`. So maybe I'll just convert each value to an array/list type.
Yes. In your case, I tried this and it worked: formatted = {x: [json_input[x]] for x in json_input}
As it turns out, I found another way to resolve my situation. One issue is that I can receive either a list of dicts or just one dict, and I need to find the right way to create the DataFrame in both cases. Wrapping the single dict in a list solves the issue: json_list = [single_object]
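
The last comment's approach could look roughly like the sketch below. The helper name `to_frame` and the sample payload are hypothetical; it assumes the API returns either one flat dict or a list of such dicts:

```python
import pandas as pd

def to_frame(payload):
    """Hypothetical helper: build a DataFrame from one dict or a list of dicts."""
    # A single dict is wrapped in a list so it is treated as one record (one row).
    records = payload if isinstance(payload, list) else [payload]
    # A list of dicts yields one row per dict, with the keys as column labels.
    return pd.DataFrame(records)

# Made-up sample payload for illustration only.
single_object = {"x0": "9.521496806", "x1": "wed"}

one_row = to_frame(single_object)
two_rows = to_frame([single_object, single_object])

print(one_row.shape)
print(two_rows.shape)
```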
