I have a data frame, MyData, with a CategoryCodes value in every row. Many rows share the same CategoryCodes value, and there are a few hundred unique codes. I need to assign the category name to each row by looking it up in a reference data frame, ReferenceDf. I tried the syntax below, but the output has several times as many rows as MyData. The output should have the same number of rows as MyData. Where am I going wrong?
library(sqldf)
Combineddf <- sqldf("select * from MyData left join ReferenceDf using (CategoryCodes)")
Reference Data:
   CategoryCodes       Class
5         120500       Tools
6         166300 Spare Parts
7         280200 Spare Parts
8         280200 Spare Parts
9         295200 Spare Parts
10        165000 Spare Parts
MyData (over 30 columns):
     X   Z CategoryCodes Y
5   OW  EA        120300 S
6  ANB  EA        120500 S
7  ANB FOT        120300 S
8  ANB  EA        120500 S
9  ANB  EA        120300 S
10 MIS  EA        120500 S
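
One thing that may be worth checking: a left join returns one output row per matching reference row, so any code that appears more than once in ReferenceDf (280200 does in the sample above) would multiply the matching rows of MyData. The following is only a rough sketch in base R on a cut-down copy of the sample data. The names dup_codes and RefUnique are made up for illustration, and merge() is swapped in for sqldf so the snippet runs without extra packages.

```r
# Cut-down copies of the two tables shown above (illustrative only).
ReferenceDf <- data.frame(
  CategoryCodes = c(120500, 166300, 280200, 280200, 295200, 165000),
  Class = c("Tools", "Spare Parts", "Spare Parts",
            "Spare Parts", "Spare Parts", "Spare Parts")
)
MyData <- data.frame(
  X = c("OW", "ANB", "ANB", "ANB", "ANB", "MIS"),
  Z = c("EA", "EA", "FOT", "EA", "EA", "EA"),
  CategoryCodes = c(120300, 120500, 120300, 120500, 120300, 120500),
  Y = "S"
)

# Codes that occur more than once in the reference table.
dup_codes <- unique(ReferenceDf$CategoryCodes[duplicated(ReferenceDf$CategoryCodes)])

# Keep the first row for each code (assumes duplicates carry the same Class),
# then do the left join against the de-duplicated reference.
RefUnique <- ReferenceDf[!duplicated(ReferenceDf$CategoryCodes), ]
Combineddf <- merge(MyData, RefUnique, by = "CategoryCodes", all.x = TRUE)

# The row counts should now match.
nrow(Combineddf) == nrow(MyData)
```

Note that merge() sorts the result by the key column by default, so the row order may differ from MyData. Codes with no match in the reference (120300 here) get NA for Class.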