I have a text file that I'm trying to process with VBScript. It looks like this:

111 ,   ,       ,Yes    ,Yes
222 ,   ,       ,Yes    ,Yes
333 ,   ,       ,Yes    ,Yes
444 ,   ,       ,Yes    ,Yes
555 ,   ,       ,Yes    ,Yes
666 ,   ,       ,Yes    ,Yes

What I want is to remove the carriage returns and tabs, commas and 'yes' (or the regex "\t,\t,\t\t,Yes\t,Yes") to give this output:

('111','222','333','444','555','666')

I'm using this code:

Const ForReading = 1
Const ForWriting = 2

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile(filePath, ForReading)

strText = objFile.ReadAll
objFile.Close
'chr(010) = line feed chr(013) = carriage return
strNewText = Replace(strText, "\t,\t,\t\t,Yes\t,Yes" & chr(013) & chr(010), "','") 

Set objFile = objFSO.OpenTextFile(filePath, ForWriting)
objFile.WriteLine strNewText
objFile.Close

This isn't giving the desired output, however. If I take the "\t,\t,\t\t,Yes\t,Yes" & part out of the Replace call, it removes the carriage returns, which is fine, but I also need the commas, tabs and 'yes' removed, and I need a (' at the start and a ') at the end. I'm guessing the problem is the way I've used the regex, but I haven't used much VBScript, so I'm not sure.

2 Answers

Instead of hunting down what you don't want, it's easier and less error-prone to concentrate on what you do want:

  Dim sExp   : sExp   = "('111','222','333','444','555','666')"
  Dim aLines : aLines = Array( _
      "111 ,   ,       ,Yes    ,Yes" _
    , "222 ,   ,       ,Yes    ,Yes" _
    , "333 ,   ,       ,Yes    ,Yes" _
    , "444 ,   ,       ,Yes    ,Yes" _
    , "555 ,   ,       ,Yes    ,Yes" _
    , "666 ,   ,       ,Yes    ,Yes" _
  )     
  Dim sAll : sAll = Join( aLines, vbCrLf )
  WScript.Echo sAll
  Dim reCut : Set reCut = New RegExp
  reCut.Global    = True
  reCut.MultiLine = True
  reCut.Pattern   = "^\d+"
  Dim oMTS : Set oMTS = reCut.Execute( sAll )
  If 0 = oMTS.Count Then
     WScript.Echo "Bingo A!"
  Else
     ReDim aNums( oMTS.Count - 1 )
     Dim nI
     For nI = 0 To UBound( aNums )
         aNums( nI ) = oMTS( nI ).Value
     Next
     Dim sRes : sRes = "('" & Join( aNums, "','" ) & "')"    
     If sRes = sExp Then
        WScript.Echo "QED:", sRes
     Else   
        WScript.Echo "Bingo B!"
     End If
  End If

output:

111 ,   ,       ,Yes    ,Yes
222 ,   ,       ,Yes    ,Yes
333 ,   ,       ,Yes    ,Yes
444 ,   ,       ,Yes    ,Yes
555 ,   ,       ,Yes    ,Yes
666 ,   ,       ,Yes    ,Yes
QED: ('111','222','333','444','555','666')

Annotations:

I use an array to build my string to process (sAll). Your string (strText) comes from a file. So:

  Dim sAll : sAll = Join( aLines, vbCrLf )
  ==>
  Dim sAll : sAll = objFile.ReadAll

The string is parsed by a RegExp (reCut). Its pattern ^\d+ looks for a sequence (+) of digits (\d) at the start (^) of a line (not of the whole string; that's why the MultiLine attribute is set to True). The result of .Execute is a Match collection (oMTS), containing Matches.
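To see what the MultiLine attribute changes, here is a minimal sketch that runs the same pattern twice over a small made-up string. The names sDemo and reDemo and the sample data are assumptions for illustration only; it is meant to be run with cscript.

```vbscript
' Three lines joined with CR+LF, like the text read from the file.
Dim sDemo : sDemo = "111 ,x" & vbCrLf & "222 ,y" & vbCrLf & "333 ,z"

Dim reDemo : Set reDemo = New RegExp
reDemo.Global  = True
reDemo.Pattern = "^\d+"

' MultiLine off: ^ anchors only at the start of the whole string,
' so only "111" is found (1 match).
reDemo.MultiLine = False
WScript.Echo "MultiLine off:", reDemo.Execute( sDemo ).Count

' MultiLine on: ^ also anchors after every line break,
' so each line contributes a match (3 matches).
reDemo.MultiLine = True
WScript.Echo "MultiLine on:", reDemo.Execute( sDemo ).Count
```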

To make the concatenation of the expected result easier, the values of the Matches are copied to an array (aNums).

The expression "('" & Join( aNums, "','" ) & "')" joins the array's elements with the separator ','. To complete the result, we just need a suitable head, (', and a suitable tail, ').
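Putting the annotations together with the file handling from the question, the whole script might look roughly like the sketch below. It assumes the result should overwrite the input file, as in the question's code; the value given to filePath is a hypothetical placeholder, and the message in the Else branch is my own wording.

```vbscript
Const ForReading = 1
Const ForWriting = 2

' Hypothetical path, for illustration only; use your own file here.
Dim filePath : filePath = "C:\temp\numbers.txt"

Dim objFSO  : Set objFSO  = CreateObject("Scripting.FileSystemObject")
Dim objFile : Set objFile = objFSO.OpenTextFile(filePath, ForReading)
Dim sAll    : sAll = objFile.ReadAll
objFile.Close

' Pick up the run of digits at the start of every line.
Dim reCut : Set reCut = New RegExp
reCut.Global    = True
reCut.MultiLine = True
reCut.Pattern   = "^\d+"

Dim oMTS : Set oMTS = reCut.Execute( sAll )
Dim aNums(), nI, sRes
If 0 = oMTS.Count Then
   WScript.Echo "No leading numbers found in", filePath
Else
   ' Copy the match values to an array so Join can build the list.
   ReDim aNums( oMTS.Count - 1 )
   For nI = 0 To UBound( aNums )
       aNums( nI ) = oMTS( nI ).Value
   Next
   sRes = "('" & Join( aNums, "','" ) & "')"

   ' Overwrite the original file with the single result line.
   Set objFile = objFSO.OpenTextFile(filePath, ForWriting)
   objFile.WriteLine sRes
   objFile.Close
End If
```

Writing sRes directly is all that is needed; there is no further Replace step.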

4 Comments

QED!!! :) Fantastic way of doing it, I don't understand the code at all though - can you add another line in showing where I read my text file into the array?
I'm getting a wrong number of arguments error when I try to write the output? using this line of code: strNewText = Replace(sAll, sRes)
Replace takes 3 arguments (see your own code). But I don't see any reason for this operation - if you want to write the result to a file, just do "objFile.WriteLine sRes".
all sorted, I was being a tool I just needed: objFile.WriteLine sRes

Try this

(.*?)(?:\s*,){3}Yes\s*,Yes\r?

You need to take care of the line breaks; on Regexr, \r was fine. I put the line break into the regex and made it optional with the trailing ?. Otherwise the last row will not be replaced if it does not end with a line break.

and replace it with

'$1',

Here you will get an additional comma at the end. At the moment I am not sure how to handle this.

$1 is the content of the first capturing group; in your case, it should hold the part before the first comma.

See it here on Regexr
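Note that VBScript's built-in Replace function does plain string replacement and does not interpret regex syntax; a pattern like this has to go through a RegExp object and its .Replace method. The following is a minimal sketch of that, not a tested drop-in. It makes three assumptions that are not part of the answer: the trailing \r? is swapped for (?:\r?\n)? so that a full CR+LF is consumed, the trailing comma is trimmed with Right/Left, and the parentheses are added by hand. The name reRow and the inline sample data are illustrative.

```vbscript
' Sample input built inline; in the real script this would be objFile.ReadAll.
Dim strText : strText = Join( Array( _
      "111 ,   ,       ,Yes    ,Yes" _
    , "222 ,   ,       ,Yes    ,Yes" _
    , "333 ,   ,       ,Yes    ,Yes" _
  ), vbCrLf )

Dim reRow : Set reRow = New RegExp
reRow.Global  = True
' Assumption: (?:\r?\n)? replaces the answer's \r? so the whole line break is removed.
reRow.Pattern = "(.*?)(?:\s*,){3}Yes\s*,Yes(?:\r?\n)?"

' Each row becomes '<first field>', with a comma after every item.
Dim strNewText : strNewText = reRow.Replace( strText, "'$1'," )

' Trim the one surplus comma left after the last item.
If Right( strNewText, 1 ) = "," Then
   strNewText = Left( strNewText, Len( strNewText ) - 1 )
End If

strNewText = "(" & strNewText & ")"
WScript.Echo strNewText   ' ('111','222','333')
```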

2 Comments

Thanks for the reply. strNewText = Replace(strText, "(.*?)(?:\s*,){3}Yes\s*,Yes\r?", "','") doesn't appear to make any changes to the file?
@4rd2, you have to replace with '$1',, but if it does nothing it has not matched. Try replacing the \r with \r\n or \n to match the proper line break.
