I have a dataframe with 2 columns GL and GLDESC and want to add a 3rd column called KIND based on some data that is inside of column GLDESC.
DF:
GL GLDESC
1 515100 Payroll-ISL
2 515900 Payroll-ICA
3 532300 Bulk Gas
4 551000 Supply AB
5 551000 Supply XPTO
6 551100 Supply AB
7 551300 Intern
For each row of the data table:
If
GLDESCcontains the wordPayrollanywhere in the string then I wantKINDto bePayroll.If GLDESC contains the word
Supplyanywhere in the string then I wantKINDto beSupply.In all other cases I want
KINDto beOther.
Then, I found this:
DF$KIND <- ifelse(grepl("supply", DF$GLDESC, ignore.case = T), "Supply",
ifelse(grepl("payroll", DF$GLDESC, ignore.case = T), "Payroll", "Other"))
But with that, I have everything that matches Supply, for example, classified. However, as in DF lines 4 and 5, the same GL has two Supply, which for me is unnecessary. In fact, I need only one type of GLDESC to be matched if for the same GL the string is repeated.
Edit: I can not delet any row. I want to have this as output:
GL GLDESC KIND
A Supply1 Supply
A Supply2 N/A
A Supply3 N/A
A Supply4 N/A
A Supply5 N/A
A Supply6 N/A
A Payroll1 Payroll
B Supply2 Supply
B Payroll Payroll