5

I have a text file with following text

161624.406 : Send:[sometext1]
161624.437 : Send:[sometext2]
161624.468 : Send:[sometext3]
161624.499 : Send:[sometext4]
161624.531 : Send:[sometext5]

I want to keep only the sometext part in that file. Desired output is

sometext1
sometext2
sometext3
sometext4
sometext5

I am using the following code in Excel-VBA

Public Sub testa()
    a = "C:\Users\pankaj.jaju\Desktop\test.log"

    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objTxtFile = objFSO.OpenTextFile(a, 1)
    strText = objTxtFile.ReadAll
    objTxtFile.Close
    Set objTxtFile = Nothing


    Set objRegEx = CreateObject("VBScript.RegExp")
    With objRegEx
        .Global = True
        .MultiLine = True
        .Pattern = "\[([^]]+)\]"
        Set objRegMC = .Execute(strText)
        b = objRegMC(0).SubMatches(0)
    End With
    Set objRegEx = Nothing

    Debug.Print b
End Sub

The problem is the output is displayed as sometext1 only. How do I ReplaceAll in the text file and save the file with the desired text only.

3
  • I assume it is ok to essentially delete the file and recreate it or do you need it to operate on the same file? Commented May 20, 2014 at 17:45
  • That is not a critical thing ... I can do either Commented May 20, 2014 at 17:46
  • @pnuts - Yes. But my requirement is to change the file itself as I am going to use that file in some other application too which is not Excel based and that app needs the file with stripped text only. Commented May 20, 2014 at 17:51

4 Answers 4

3

The regex.Replace method should do the trick.

Separate your pattern into groups like this: "(.*?)(\[)([^]]+)(\])(.*?)"

And now you can replace your input string with the matching group which is group three in this case: objRegEx.Replace(strText, "$3")


Here is a helpful link to different examples of Regex within Excel.

Sign up to request clarification or add additional context in comments.

Comments

2

I know you've already got your answer. But for a simple program like this, why even utilize regular expressions? Here's a 4 line script that can do the same thing - yes it has been tested.

Dim a, text: a = "C:\testenv\test.log"
text = split(CreateObject("Scripting.FileSystemObject").OpenTextFile(a, 1).ReadAll, vbcrlf)
for i=0 to ubound(text) : text(i) = replace(split(text(i), "[")(1), "]", "") : next
CreateObject("Scripting.FileSystemObject").OpenTextFile(a, 2).Write(Join(text, vbcrlf))

4 Comments

Personally I would expand this to be more readable so its easier to understand, maintain and learn from. Other than that, +1 Nice simple solution for the text format given by OP.
@PortlandRunner Thanks man! :) I must admit, I've found some rather unusual formats to manipulate text due to the fact that Regular Expressions can make my brain throb.
@Rich - Nice one .... but I wanted to use regular expression as my actual problem is bit complex. I posted my test code and data to get some insights from SO community.
@PankajJaju Makes sense. Glad you got it resolved. Thanks for the +1's gents!
0

The regex obviously only grabbing the first match of the string. I'm not proficient enough with regex to come up with a regex only solution that takes into account \n

A simple workaround would be to use objTxtFile.ReadLine instead of ReadAll

Read each string in one by one, apply regex and get output.

Public Sub testa()
    a = "C:\Users\pankaj.jaju\Desktop\test.log"

    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objTxtFile = objFSO.OpenTextFile(a, 1)

    Do Until (objTxtFile.AtEndOfStream) 'Loop till EOF
        strText = objTxtFile.ReadLine 'Single line read instead of ReadAll

        Set objRegex = CreateObject("VBScript.RegExp")
        With objRegex
            .Global = True
            .MultiLine = True
            .Pattern = "\[([^]]+)\]"
            Set objRegMC = .Execute(strText)
            b = objRegMC(0).SubMatches(0)
        End With
        Set objRegex = Nothing

        Debug.Print b 'replace this with the code to output the string where you want etc
    Loop
    objTxtFile.Close
End Sub

7 Comments

Thanks for this but I have found a faster way to do it.
Try setting your text line to this: objRegEx.Replace(strText, "") and then you wont need objRegMC. Here is a link that has an example of replacing the original excel cell using regex which is similar.
@PortlandRunner - Using Replace, I ended up replacing the required substring.
@PortlandRunner - Actually I used your idea and I think it can be done. I just need to change my patter to .Pattern = "(.*?)(([^]]+)\])(.*?)" and objRegEx.Replace(strText, "$2"). The only problem now is to remove the square brackets.
@PankajJaju I see now what you meant. Can you move the square bracket into a new grouping ([)
|
0

OK ... found a way to do it (Thanks MSDN ... sometimes the docs are useful :-)). Sharing the answer

Public Sub testa()
    a = "C:\Users\pankaj.jaju\Desktop\test.log"

    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objTxtFile = objFSO.OpenTextFile(a, 1)
    strText = objTxtFile.ReadAll
    objTxtFile.Close
    Set objTxtFile = Nothing

    b = ""
    Set objRegEx = CreateObject("VBScript.RegExp")
    With objRegEx
        .Global = True
        .MultiLine = True
        .Pattern = "\[([^]]+)\]"
        Set objMatches = .Execute(strText)

        For Each Match In objMatches
            For Each SubMatch In Match.SubMatches
                b = b & SubMatch & vbCrLf
            Next SubMatch
        Next Match

    End With
    Set objRegEx = Nothing

    Set objTxtFile = objFSO.OpenTextFile(a, 2)
    objTxtFile.Write b
    objTxtFile.Close
    Set objTxtFile = Nothing
End Sub


EDIT - Based on @PortlandRunner's suggestion

Public Sub testa()
a = "C:\Users\pankaj.jaju\Desktop\test.log"

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTxtFile = objFSO.OpenTextFile(a, 1)
strText = objTxtFile.ReadAll
objTxtFile.Close
Set objTxtFile = Nothing

b = ""
Set objRegEx = CreateObject("VBScript.RegExp")
With objRegEx
    .Global = True
    .MultiLine = True
    .Pattern = "(.*?)(\[)([^]]+)(\])(.*?)"
    b = objRegEx.Replace(strText, "$3")
End With
Set objRegEx = Nothing

Set objTxtFile = objFSO.OpenTextFile(a, 2)
objTxtFile.Write b
objTxtFile.Close
Set objTxtFile = Nothing
End Sub

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.