I'm trying to create a new column that stores all the date information for each row as a list. My function works on a single row, but it raises an error when I apply it to the entire data table. Can anyone help? Thanks.

The function:

def res(dr):
    return [dr["Current Date"],dr["End Date"],dr["Begin Date"]]

The data table:

Listed Code  Current Date  Frequency   Price  Residual      Coupon    End Date  Begin Date
        696    1997-06-30          1  113.49       100  112.558174  2006-06-13  1996-06-14
        696    1997-05-31          1  113.49       100  112.558174  2006-06-13  1996-06-14

The function returns a list when called on a single row:

res(bond_info.iloc[0,:])
[Timestamp('1997-06-30 00:00:00'),Timestamp('2006-06-13 00:00:00'),Timestamp('1996-06-14 00:00:00')]

But it raises an error when applied to the entire data table:

bond_info.apply(res,axis=1)

ValueError                                Traceback (most recent call last)
F:\Anaconda3\lib\site-packages\pandas\core\internals.py in create_block_manager_from_arrays(arrays, names, axes)
   4309         blocks = form_blocks(arrays, names, axes)
-> 4310         mgr = BlockManager(blocks, axes)
   4311         mgr._consolidate_inplace()

F:\Anaconda3\lib\site-packages\pandas\core\internals.py in __init__(self, blocks, axes, do_integrity_check, fastpath)
   2794         if do_integrity_check:
-> 2795             self._verify_integrity()
   2796 

F:\Anaconda3\lib\site-packages\pandas\core\internals.py in _verify_integrity(self)
   3005             if block._verify_integrity and block.shape[1:] != mgr_shape[1:]:
-> 3006                 construction_error(tot_items, block.shape[1:], self.axes)
   3007         if len(self.items) != tot_items:

F:\Anaconda3\lib\site-packages\pandas\core\internals.py in construction_error(tot_items, block_shape, axes, e)
   4279     raise ValueError("Shape of passed values is {0}, indices imply {1}".format(
-> 4280         passed, implied))
   4281 

ValueError: Shape of passed values is (2, 3), indices imply (2, 8)

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-104-e9d749798573> in <module>()
----> 1 bond_info.apply(res,axis=1)

F:\Anaconda3\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   4358                         f, axis,
   4359                         reduce=reduce,
-> 4360                         ignore_failures=ignore_failures)
   4361             else:
   4362                 return self._apply_broadcast(f, axis)

F:\Anaconda3\lib\site-packages\pandas\core\frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
   4471                 index = None
   4472 
-> 4473             result = self._constructor(data=results, index=index)
   4474             result.columns = res_index
   4475 

F:\Anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
    273                                  dtype=dtype, copy=copy)
    274         elif isinstance(data, dict):
--> 275             mgr = self._init_dict(data, index, columns, dtype=dtype)
    276         elif isinstance(data, ma.MaskedArray):
    277             import numpy.ma.mrecords as mrecords

F:\Anaconda3\lib\site-packages\pandas\core\frame.py in _init_dict(self, data, index, columns, dtype)
    409             arrays = [data[k] for k in keys]
    410 
--> 411         return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
    412 
    413     def _init_ndarray(self, values, index, columns, dtype=None, copy=False):

F:\Anaconda3\lib\site-packages\pandas\core\frame.py in _arrays_to_mgr(arrays, arr_names, index, columns, dtype)
   5602     axes = [_ensure_index(columns), _ensure_index(index)]
   5603 
-> 5604     return create_block_manager_from_arrays(arrays, arr_names, axes)
   5605 
   5606 

F:\Anaconda3\lib\site-packages\pandas\core\internals.py in create_block_manager_from_arrays(arrays, names, axes)
   4312         return mgr
   4313     except ValueError as e:
-> 4314         construction_error(len(arrays), arrays[0].shape, axes, e)
   4315 
   4316 

F:\Anaconda3\lib\site-packages\pandas\core\internals.py in construction_error(tot_items, block_shape, axes, e)
   4278         raise ValueError("Empty data passed with indices specified.")
   4279     raise ValueError("Shape of passed values is {0}, indices imply {1}".format(
-> 4280         passed, implied))
   4281 
   4282 

ValueError: Shape of passed values is (2, 3), indices imply (2, 8)
1 Answer

Option 1
Use filter + tolist. You don't need apply here.

df.filter(regex='.*Date$').values.tolist()

[['1997-06-30', '2006-06-13', '1996-06-14'],
 ['1997-05-31', '2006-06-13', '1996-06-14']]

Option 2
Alternatively, using str.endswith + loc:

df.loc[:, df.columns.str.endswith('Date')].values.tolist()


[['1997-06-30', '2006-06-13', '1996-06-14'],
 ['1997-05-31', '2006-06-13', '1996-06-14']]

Option 3
Select the columns directly by name:

df[['Current Date', 'End Date', 'Begin Date']].values.tolist()

[['1997-06-30', '2006-06-13', '1996-06-14'],
 ['1997-05-31', '2006-06-13', '1996-06-14']]
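To store the result as a new column, as the question asks, something like the following sketch may work. The small frame and the column name "Dates" are assumptions made for illustration. Note that if the date columns hold real datetime64 values rather than strings, .values.tolist() may return raw integers instead of Timestamp objects, so this sketch iterates over the columns instead.

```python
import pandas as pd

# Hypothetical reconstruction of part of the table from the question.
df = pd.DataFrame({
    "Listed Code": [696, 696],
    "Current Date": pd.to_datetime(["1997-06-30", "1997-05-31"]),
    "End Date": pd.to_datetime(["2006-06-13", "2006-06-13"]),
    "Begin Date": pd.to_datetime(["1996-06-14", "1996-06-14"]),
})

date_cols = ["Current Date", "End Date", "Begin Date"]

# Iterating a datetime column yields Timestamp objects, so zipping the
# columns together gives one tuple of Timestamps per row.
rows = zip(*(df[col] for col in date_cols))

# "Dates" is an assumed name for the new column; each cell holds a list.
df["Dates"] = [list(row) for row in rows]

print(df["Dates"].iloc[0])
```

Each cell of the new column then holds a three-element list in the same order as date_cols.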

1 Comment

Thank you! This helps me a lot. There are other ways to implement this as well, but I am still wondering what is wrong with the apply call. I am trying to figure it out.
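On the apply question: the message "Shape of passed values is (2, 3), indices imply (2, 8)" suggests that this pandas version tried to assemble the per-row lists into a DataFrame that reuses the original 8 column labels, and the 3-element lists did not fit. A hedged sketch, assuming pandas 0.23 or later (where apply accepts a result_type parameter), shows how the same function is expected to behave there. The small frame below is a hypothetical stand-in for bond_info.

```python
import pandas as pd

# Hypothetical stand-in for bond_info with a subset of the columns.
bond_info = pd.DataFrame({
    "Listed Code": [696, 696],
    "Current Date": pd.to_datetime(["1997-06-30", "1997-05-31"]),
    "End Date": pd.to_datetime(["2006-06-13", "2006-06-13"]),
    "Begin Date": pd.to_datetime(["1996-06-14", "1996-06-14"]),
})

def res(dr):
    return [dr["Current Date"], dr["End Date"], dr["Begin Date"]]

# Assumption: pandas >= 0.23. With the default result_type, a list
# returned per row is kept as a single object, giving a Series of lists.
dates = bond_info.apply(res, axis=1)

# result_type="expand" instead spreads each list across new columns.
expanded = bond_info.apply(res, axis=1, result_type="expand")

print(type(dates).__name__, expanded.shape)
```

On older versions such as the one in the traceback, avoiding apply altogether, as in the answer above, sidesteps the problem.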
