I have a very large "work request" data set that I need to clean up. The data set has some consistent elements: each entry contains a work request number of a set length. The format of that number changes about halfway through the data set, but the change is predictable. One issue is that the delimiters are inconsistent: sometimes there are multiple delimiters, sometimes none, sometimes there is text in front of the number, and so on. I pulled a sample of the values that I am dealing with and separated them manually to show the desired result.
+----+--------------------------------+------------+--------+----------------------+
|    | A                              | B          | C      | D                    |
+----+--------------------------------+------------+--------+----------------------+
| 1  | Work Request                   | Cell 1     | Cell 2 | Cell 3               |
| 2  | 2097947.A                      | 2097947    | A      |                      |
| 3  | 2590082.A/4900 REPLACE DXAC    | 2590082    | A      | 4900 REPLACE DXAC    |
| 4  | 2679314.C                      | 2679314    | C      |                      |
| 5  | 2864142B/DEMOLISH STRUCTURES   | 2864142    | B      | DEMOLISH STRUCTURES  |
| 6  | 3173618                        | 3173618    |        |                      |
| 7  | 3251628/4800 REPLACE ASPHALT   | 3251628    |        | 4800 REPLACE ASPHALT |
| 8  | 4109066A                       | 4109066    | A      |                      |
| 9  | 4374312D                       | 4374312    | D      |                      |
| 10 | 4465402, Building 4100         | 4465402    |        | Building 4100        |
| 11 | 4881715 DESIGN                 | 4881715    |        | DESIGN               |
| 12 | 4998608\                       | 4998608    |        |                      |
| 13 | ADMIN                          | ADMIN      |        |                      |
| 14 | PGM MGMT                       | PGM MGMT   |        |                      |
| 15 | FWR # 4958989 /Bldg 4000       | 4958989    |        | Bldg 4000            |
| 16 | NICC FEDISR000744416/4000 UPS  | R000744416 |        | 4000 UPS             |
| 17 | R000451086/4300 MODS TO RM5006 | R000451086 |        | 4300 MODS TO RM5006  |
+----+--------------------------------+------------+--------+----------------------+
As you can see, some of the variations are predictable and some are user input errors. In some cases the 7-digit work request number is followed by a single character, usually separated by a "." but sometimes with no separator at all, as in A8 and A9. Sometimes there is a delimiter ("/", a space, or ","), but this isn't consistent. I am currently working with a VBA function that manages to strip out the numbers for some entries but fails when an entry contains no numbers or extra numbers. Eventually the work request numbers were changed to add an "R00" prefix; this is the "new" number, and over half of the data uses it in some form.
The VBA that I am using:
Option Explicit
Public Function Strip(ByVal x As String, LeaveNums As Boolean) As Variant
    Dim y As String, z As String, n As Long
    For n = 1 To Len(x)
        y = Mid(x, n, 1)
        If LeaveNums = False Then
            If y Like "[A-Za-z ]" Then z = z & y 'False keeps letters and spaces only
        Else
            If y Like "[0-9. ]" Then z = z & y 'True keeps digits, decimal points and spaces
        End If
    Next n
    Strip = Trim(z)
End Function
I call it from the worksheet like this:
=NUMBERVALUE(Strip(A1,TRUE))
=Strip(A1,FALSE)
This works in some places but not others. It also doesn't separate out columns C and D. The most important part is stripping out the work request number, as shown in column B.
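One direction that might fit the rules described above is a regular expression instead of a character-by-character filter. The following is only a sketch, assuming the number is always either "R" plus 9 digits or a bare 7-digit number, and that the suffix is a single letter not followed by another letter or digit. `SplitWR` and its `Part` argument are hypothetical names, not part of the existing `Strip` function, and it has not been run against the full data set.

```vba
Option Explicit

' Hypothetical helper: Part 1 = work request number, 2 = suffix letter, 3 = description.
Public Function SplitWR(ByVal txt As String, ByVal Part As Long) As String
    Dim re As Object, m As Object, rest As String

    Set re = CreateObject("VBScript.RegExp")   ' late binding, no library reference needed
    re.IgnoreCase = False                      ' assumption: the "R" prefix is always upper case
    re.Global = False                          ' only the first (leftmost) match is used

    ' New-style "R" + 9 digits, or old-style 7 digits, then an optional single
    ' suffix letter (with or without a ".") that is not the start of a longer word.
    re.Pattern = "(R\d{9}|\d{7})(?:\.?([A-Za-z])(?![A-Za-z0-9]))?"

    If Not re.Test(txt) Then
        ' No recognizable number (e.g. "ADMIN"): return the whole text as Part 1.
        If Part = 1 Then SplitWR = Trim$(txt)
        Exit Function
    End If

    Set m = re.Execute(txt)(0)
    Select Case Part
        Case 1
            SplitWR = m.SubMatches(0)
        Case 2
            SplitWR = m.SubMatches(1)          ' empty when there is no suffix letter
        Case 3
            ' Everything after the match, minus any leading delimiters.
            rest = Mid$(txt, m.FirstIndex + m.Length + 1)
            re.Pattern = "^[\s/\\,.]+"
            SplitWR = Trim$(re.Replace(rest, ""))
    End Select
End Function
```

On the worksheet this would be called as =SplitWR(A2,1), =SplitWR(A2,2) and =SplitWR(A2,3) for columns B, C and D. The "R" alternative is listed first so that the digits inside a new-style number are not picked up as an old-style 7-digit number. All three parts come back as text, and the description keeps whatever case the input had.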
Thanks for any help.
