
I know I can convert a pandas object such as a Series or DataFrame to JSON as follows:

import numpy as np
import pandas as pd

series1 = pd.Series(np.random.randn(5), name='something')
jsonSeries1 = series1.to_json()  # {"0":0.0548079371,"1":-0.9072821424,"2":1.3865642993,"3":-1.0609052074,"4":-3.3513341839}

However, what should I do when that Series is encapsulated inside another data structure, say a dictionary, as follows:

seriesmap = {"key1":pd.Series(np.random.randn(5), name='something')}

How do I convert the above map to JSON like this:

{"key1":{"0":0.0548079371,"1":-0.9072821424,"2":1.3865642993,"3":-1.0609052074,"4":-3.3513341839}}

simplejson does not work:

 jsonObj = simplejson.dumps(seriesmap)

gives

Traceback (most recent call last):
  File "C:\..\py2.py", line 86, in <module>
    jsonObj = json.dumps(seriesmap)
  File "C:\Mahesh\Program Files\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\simplejson\__init__.py", line 380, in dumps
    return _default_encoder.encode(obj)
  File "C:\Mahesh\Program Files\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\simplejson\encoder.py", line 275, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "C:\Mahesh\Program Files\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\simplejson\encoder.py", line 357, in iterencode
    return _iterencode(o, 0)
  File "C:\Mahesh\Program Files\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\simplejson\encoder.py", line 252, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: 0   -0.038824
1   -0.047297
2   -0.887672
3   -1.510238
4    0.900217
Name: something, dtype: float64 is not JSON serializable

To generalize this even further, I want to convert an arbitrary object to JSON. The object may be a simple int or string, or a complex type such as a tuple, list, or dictionary containing pandas objects along with other types. In a dictionary, the pandas object may lie at arbitrary depth as some key's value. I want to safely convert such a structure to valid JSON. Is it possible?
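For illustration, the rough direction I have in mind is something like the following minimal sketch: use the default hook of json.dumps to convert any pandas or NumPy object the encoder encounters (the function name pandas_aware_default is just mine):

import json
import numpy as np
import pandas as pd

def pandas_aware_default(obj):
    # fallback used by json.dumps for objects it cannot encode itself
    if isinstance(obj, (pd.Series, pd.DataFrame)):
        # round-trip through pandas' own encoder to get plain dicts
        return json.loads(obj.to_json())
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    if isinstance(obj, np.generic):  # numpy.float64, numpy.int64, ...
        return obj.item()
    raise TypeError(repr(obj) + " is not JSON serializable")

seriesmap = {"key1": pd.Series(np.random.randn(5), name='something')}
print(json.dumps(seriesmap, default=pandas_aware_default))
# e.g. {"key1": {"0": 0.0548079371, "1": -0.9072821424, ...}}

The default hook is only called for leaves that json.dumps cannot handle itself, so arbitrarily deep nesting of dicts, lists and tuples around the pandas objects is walked by the standard encoder.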

Update

I just tried putting a DataFrame as the value of one of the keys of a dictionary and converting that dictionary to JSON by wrapping it in another DataFrame (as suggested in the answer below). But it seems that this does not work:

import pandas as pd

d = {'one' : pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
    'two' : pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}

df = pd.DataFrame(d)

mapDict = {"key1":df}
print(pd.DataFrame(mapDict).to_json())

This gave:

Traceback (most recent call last):
  File "C:\Mahesh\repos\JavaPython\JavaPython\bin\py2.py", line 80, in <module>
    print(pd.DataFrame(mapDict).to_json())
  File "C:\Mahesh\Program Files\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\pandas\core\frame.py", line 224, in __init__
    mgr = self._init_dict(data, index, columns, dtype=dtype)
  File "C:\Mahesh\Program Files\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\pandas\core\frame.py", line 360, in _init_dict
    return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "C:\Mahesh\Program Files\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\pandas\core\frame.py", line 5231, in _arrays_to_mgr
    index = extract_index(arrays)
  File "C:\Mahesh\Program Files\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\pandas\core\frame.py", line 5270, in extract_index
    raise ValueError('If using all scalar values, you must pass'
ValueError: If using all scalar values, you must pass an index
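A workaround that appears to handle this particular dictionary-of-DataFrames case (only a sketch; it does not solve the general arbitrary-nesting problem) is to serialize each DataFrame separately and reassemble the outer dictionary:

import json
import pandas as pd

d = {'one': pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
     'two': pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
mapDict = {"key1": pd.DataFrame(d)}

# serialize each DataFrame on its own, then re-assemble the outer dict
print(json.dumps({k: json.loads(v.to_json()) for k, v in mapDict.items()}))
# {"key1": {"one": {"a": 1.0, "b": 2.0, "c": 3.0, "d": null}, "two": {...}}}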

2 Answers


Call pd.DataFrame on seriesmap, then use to_json:

pd.DataFrame(seriesmap).to_json()

'{"key1":{"0":0.8513342674,"1":-1.3357052602,"2":0.2102391775,"3":-0.5957492995,"4":0.2356552588}}'

5 Comments

The underlying issue is that the json module does not know how to serialize numpy.float64, something that the pandas to_json method fixes.
So wrapping absolutely anything in a DataFrame will work? I mean, I may have a DataFrame itself instead of that Series. Also, the pandas object may be nested multiple times at any depth inside any complex data structure (tuple, dictionary, list). Or I may have a tuple, list, or dictionary as other keys, say key2. Will all of that work?
@Mahesha999 No! Not absolutely anything will work. You suggested a Series in the dict and this should work. If there were DataFrames in there, I'd suggest a different strategy.
Sorry, I said "series is encapsulated inside other datastructure, say dictionary". By the way, what is that other strategy?
Actually, I want to be able to serialize/deserialize arbitrary Python types. I prefer JSON. Python native types (int, string, tuple, list, dictionary) can be serialized/deserialized easily with simplejson, but things get messy when pandas objects get involved...

So far, there is no single utility that can serialize or de-serialize nested Python structures containing pandas objects. Even PyArrow (from the Apache Arrow project) cannot handle complex numbers. So if you want that capability, you need to write your own code.

I have recently developed a library (https://github.com/xuancong84/pandas-serializer) that can serialize/de-serialize almost everything in Python. You can try that and report to me what cannot be identically de-serialized. Thanks! -:)

