I have two dataframes, df1 and df2, and I am joining them on two differently named columns (Date in df1, Quarter in df2). For some reason, the join produces many duplicated rows. How can I avoid this? I am using an outer join.
Data
df1
ID Date
a 1/1/2022
a 1/1/2022
b 2/1/2022
b 2/1/2022
b 2/1/2022
df2
Quarter State
1/1/2022 ny
4/3/2023 ca
6/1/2024 ca
7/1/2021 wa
Desired
ID Date Quarter State
a 1/1/2022 1/1/2022 ny
a 1/1/2022 na na
b 2/1/2022 na na
b 2/1/2022 na na
b 2/1/2022 na na
Doing
import pandas as pd
join = pd.merge(df1, df2, left_on='Date', right_on='Quarter', how='outer')
However, the output has many more rows than I started with. I thought a left join would solve this, but I am still getting duplicates. I am still researching this. Any suggestions are appreciated.
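For reference, here is a minimal sketch that rebuilds the sample data and tries one possible workaround. It assumes the dates are stored as plain strings and that each df2 row should be matched to at most one df1 row with the same date, as in the desired output. The helper column names occ_left and occ_right are made up for illustration. The idea is to number repeated dates on each side with groupby().cumcount() and include that counter in the join keys, so a repeated date in df1 is not paired with the same df2 row more than once.

```python
import pandas as pd

# Sample data from the question (dates kept as strings: an assumption).
df1 = pd.DataFrame({
    "ID": ["a", "a", "b", "b", "b"],
    "Date": ["1/1/2022", "1/1/2022", "2/1/2022", "2/1/2022", "2/1/2022"],
})
df2 = pd.DataFrame({
    "Quarter": ["1/1/2022", "4/3/2023", "6/1/2024", "7/1/2021"],
    "State": ["ny", "ca", "ca", "wa"],
})

# The original outer merge: it keeps unmatched rows from BOTH frames,
# and every repeated Date in df1 is paired with the matching Quarter.
outer = pd.merge(df1, df2, left_on="Date", right_on="Quarter", how="outer")

# Number each repeat of a key (0, 1, 2, ...) on both sides.
# occ_left / occ_right are hypothetical helper names.
df1_keyed = df1.assign(occ_left=df1.groupby("Date").cumcount())
df2_keyed = df2.assign(occ_right=df2.groupby("Quarter").cumcount())

# Left merge on (date, occurrence): only the n-th repeat on the left
# can match the n-th repeat on the right, so no row is multiplied.
result = pd.merge(
    df1_keyed,
    df2_keyed,
    left_on=["Date", "occ_left"],
    right_on=["Quarter", "occ_right"],
    how="left",
).drop(columns=["occ_left", "occ_right"])

print(len(outer), len(result))
print(result)
```

With the sample data, the left merge keeps one row per df1 row, and only the first 1/1/2022 row picks up a Quarter and State. If the real df2 contains repeated Quarter values, that repetition is the usual reason a plain merge returns more rows than df1 has.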