We have a string column in our database with values for sports teams. The names of these teams are occasionally prefixed with the team's ranking, like so: (13) Miami (FL). Here the 13 is Miami's rank, and the (FL) means this is Miami of Florida, not Miami of Ohio (Miami (OH)).
We need to clean up this string, removing (13) and keeping only Miami (FL). So far we've used gsub and tried the following:
> gsub("\\s*\\([^\\)]+\\)", "", "(13) Miami (FL)")
[1] " Miami"
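This incorrectly removes the (FL) suffix as well, because the pattern matches every parenthesized group and not just the leading one. It also leaves the leading whitespace in place, since \\s* only consumes whitespace that comes before a parenthesis, not the space after (13).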
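One direction that might work is to anchor the pattern to the start of the string, so that only a leading rank is removed. The following is a minimal sketch, assuming the rank is always digits in parentheses at the very beginning of the name; the `teams` vector is a hypothetical sample and `cleaned` is just an illustrative name.

```r
# Hypothetical sample: two ranked names and one without a rank prefix
teams <- c("(13) Miami (FL)", "(1) Clemson", "North Texas")

# ^ anchors the match to the start of the string, so a trailing "(FL)" is untouched.
# [0-9]+ restricts the match to a numeric rank; \\s* also consumes the space after it.
# sub() is enough here because at most one leading match is expected per string.
cleaned <- sub("^\\([0-9]+\\)\\s*", "", teams)

print(cleaned)
# [1] "Miami (FL)"  "Clemson"     "North Texas"
```

Names without a rank prefix pass through unchanged, because the anchored pattern simply does not match them.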
Edit
Here are a few additional school names to show the kind of data we're working with. Note that not every school has the (##) prefix:
c("North Texas", "Southern Methodist", "Texas-El Paso",
"Brigham Young", "Winner", "(12) Miami (FL)", "Appalachian State",
"Arkansas State", "Army", "(1) Clemson",
"(14) Georgia Southern")