I have a dataframe in pandas that looks like this:
   100  200  300  400
0    1    1    0    1
1    1    1    1    0
What I want to do is select specific columns from this dataframe. But when I try the following code (df_matrix being the dataframe shown above):
intermediary_df = df_matrix["100"]
It does not work, and from what I can tell that is because the column label is an integer. I tried to force it with str(100), but that gave the same error as before:
File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
TypeError: an integer is required
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "A:\python project\venv\lib\site-packages\pandas\core\indexes\base.py", line 3078, in get_loc
return self._engine.get_loc(key)
File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 164, in pandas._libs.index.IndexEngine.get_loc
KeyError: '100'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
TypeError: an integer is required
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "A:/python project/testing/testing4.py", line 42, in <module>
intermediary_df = df_matrix["100"]
File "A:\python project\venv\lib\site-packages\pandas\core\frame.py", line 2688, in __getitem__
return self._getitem_column(key)
File "A:\python project\venv\lib\site-packages\pandas\core\frame.py", line 2695, in _getitem_column
return self._get_item_cache(key)
File "A:\python project\venv\lib\site-packages\pandas\core\generic.py", line 2489, in _get_item_cache
values = self._data.get(item)
File "A:\python project\venv\lib\site-packages\pandas\core\internals.py", line 4115, in get
loc = self.items.get_loc(item)
File "A:\python project\venv\lib\site-packages\pandas\core\indexes\base.py", line 3080, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 164, in pandas._libs.index.IndexEngine.get_loc
KeyError: '100'
Does anyone know how to get around this? Thanks!
EDIT 1:
After trying intermediary_df = df_matrix[100], it worked as expected. By the way, if someone else is facing this problem and wants to select multiple columns at the same time, you can use:
intermediary_df = df_matrix[[100, 300]]
and the output will be:
   100  300
0    1    0
1    1    1
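For anyone who wants to reproduce this, here is a minimal sketch. It assumes the dataframe was built with integer column labels; the construction of df_matrix below is a guess, since the code that created it is not shown in the question.

```python
import pandas as pd

# Assumed reconstruction of df_matrix: the column labels are ints, not strings.
df_matrix = pd.DataFrame(
    [[1, 1, 0, 1], [1, 1, 1, 0]],
    columns=[100, 200, 300, 400],
)

# An integer key matches the integer label and returns a Series.
single = df_matrix[100]

# A list of integer keys returns a DataFrame with just those columns.
multi = df_matrix[[100, 300]]

# A string key does not match any integer label, so pandas raises KeyError.
try:
    df_matrix["100"]
    string_lookup_failed = False
except KeyError:
    string_lookup_failed = True

print(multi)
```

The key has to have the same type as the label: 100 and "100" are different keys as far as the column index is concerned.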
intermediary_df = df_matrix[100]?
Int64Index([100, 200, 300, 400], dtype='int64')
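An integer-typed column index like the one printed above is the usual sign that string keys will fail. The sketch below shows one way to check the label type and, if string keys are preferred, to convert the labels. The df_matrix construction is again an assumption, and df_str is a hypothetical name used only for illustration.

```python
import pandas as pd

# Assumed reconstruction of df_matrix with integer column labels.
df_matrix = pd.DataFrame(
    [[1, 1, 0, 1], [1, 1, 1, 0]],
    columns=[100, 200, 300, 400],
)

# Check whether the column labels are stored as integers.
has_int_labels = pd.api.types.is_integer_dtype(df_matrix.columns)

# Optionally convert every label to a string; rename returns a new DataFrame
# and leaves df_matrix unchanged.
df_str = df_matrix.rename(columns=str)

# With string labels, the original df_matrix["100"]-style lookup works.
intermediary_df = df_str["100"]
```

Whether to convert the labels or simply use integer keys is a matter of preference; using df_matrix[100] directly avoids the extra copy.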