Here is a hypothetical scenario with two MultiIndex DataFrames in pandas. Trying to merge them results in an error. Do I have to call reset_index() on either DataFrame to make this work?
import numpy as np
import pandas as pd

arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
# Same tuples, but the two indexes use different level names
index1 = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
index2 = pd.MultiIndex.from_tuples(tuples, names=['third', 'fourth'])
s1 = pd.DataFrame(np.random.randn(8), index=index1, columns=['s1'])
s2 = pd.DataFrame(np.random.randn(8), index=index2, columns=['s2'])
Attempted merges:
s1.merge(s2, how='left', left_index=True, right_index=True)
- Editor's note: The error for this one was most likely ValueError: cannot join with no overlapping index names. Tested with pandas 2.2.3.
s1.merge(s2, how='left', left_on=['first', 'second'], right_on=['third', 'fourth'])
- Editor's note: It's not clear what error occurred here. If you know, please add it.
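The error message quoted for the first attempt suggests that the problem is the index level names rather than the MultiIndex itself. As a minimal sketch, assuming the levels correspond positionally (third to first, fourth to second) and that a fixed seed is acceptable for reproducibility, renaming the levels of s2 lets the index-on-index merge run without reset_index(). The name s2_aligned is only illustrative and does not come from the question.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # fixed seed, added only so the sketch is reproducible

arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index1 = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
index2 = pd.MultiIndex.from_tuples(tuples, names=['third', 'fourth'])
s1 = pd.DataFrame(np.random.randn(8), index=index1, columns=['s1'])
s2 = pd.DataFrame(np.random.randn(8), index=index2, columns=['s2'])

# Assumption: 'third' corresponds to 'first' and 'fourth' to 'second'.
# rename_axis returns a new DataFrame whose index levels carry the new names.
s2_aligned = s2.rename_axis(index=['first', 'second'])

# With overlapping level names, the index-on-index merge no longer raises.
merged = s1.merge(s2_aligned, how='left', left_index=True, right_index=True)
print(merged)
```

Whether renaming is appropriate depends on whether the two sets of levels really describe the same keys; if they do not, this sketch does not apply.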
... RangeIndex, which is probably not what you want. I'm using pandas 2.2.3. For the first one I get ValueError: cannot join with no overlapping index names.
... np.random.seed(0). I might take the initiative and do this myself, and for the answers too. For reference see How to make good reproducible pandas examples. On that note, please also add your expected output.
Instead of df1.merge(df2, how='left', left_index=True, right_index=True), you might as well just use df1.join(df2).