2

I have one text file that contains around 100K lines. Now I would like to search a string from the text file. If that string is present then I want to get the line number at which it's present. At the end I need all the occurrence of that string with line numbers from the text file.

* Ordinary Method Tried * We can read the whole text file line by line. Keep a counter variable that increases after every read. If I found my string then I will return the Counter Variable. The limitation of this method is, I have to traverse through all the 100K lines one by one to search the string. This will decrease the performance.

* Quick Method (HELP REQUIRED)* Is there any way that will directly take me to the line where my searchstring is present and if found I can return the line number where it's present.

* Example *

Consider below data is present in text file. (say only 5 lines are present)

enter image description here

Now I would like to search a string say "Pune". Now after search, it should return me Line number where string "pune" is present. Here in this case it's present in line 2. I should get "2" as an output. I would like to search all the occurrence of "pune" with their line numbers

3
  • Have you tried both suggestions you made? with about 100k lines, was there a notable performance issue on the first method? does the second method actually supply all occurences? etc... Commented Nov 11, 2013 at 14:13
  • @Amber - I don't know how to achieve second/advanced method.that's why I have posted this thread over here. Obviously there will be a performance difference. Commented Nov 11, 2013 at 14:20
  • @Solution Seeker- Have you got your solution? I have same scenario now. Commented Jan 6, 2017 at 9:02

3 Answers 3

1

I used a spin off of Me How's code example to go through a list of ~10,000 files searching for a string. Plus, since my html files have the potential to contain the string on several lines, and I wanted a staggered output, I changed it up a bit and added the cell insertion piece. I'm just learning, but this did exactly what I needed and I hope it can help others.

Public Sub ReadTxtFile()

    Dim start As Date
    start = Now

    Dim oFSO As Object
    Set oFSO = CreateObject("Scripting.FileSystemObject")

    Dim oFS As Object

    Dim filePath As String

    Dim a, b, c, d, e As Integer
    a = 2
    b = 2
    c = 3
    d = 2
    e = 1

    Dim arr() As String

    Do While Cells(d, e) <> vbNullString

            filePath = Cells(d, e)

            ReDim arr(5000) As String
            Dim i As Long
            i = 0

            If oFSO.FileExists(filePath) Then

                On Error GoTo Err

                Set oFS = oFSO.OpenTextFile(filePath)
                Do While Not oFS.AtEndOfStream
                    arr(i) = oFS.ReadLine
                    i = i + 1
                Loop
                oFS.Close
            Else
                MsgBox "The file path is invalid.", vbCritical, vbNullString
                Exit Sub
            End If

            For i = LBound(arr) To UBound(arr)
                If InStr(1, arr(i), "Clipboard", vbTextCompare) Then
                    Debug.Print i + 1, arr(i)
                    Cells(a + 1, b - 1).Select
                    Selection.Insert Shift:=xlDown
                    Cells(a, b).Value = i + 1
                    Cells(a, c).Value = arr(i)
                    a = a + 1
                    d = d + 1
                End If
            Next
            a = a + 1
            d = d + 1
    Loop

    Debug.Print DateDiff("s", start, Now)

    Exit Sub

Err:
    MsgBox "Error while reading the file.", vbCritical, vbNullString
    oFS.Close
    Exit Sub

End Sub
Sign up to request clarification or add additional context in comments.

Comments

0

the following fragment could be repalaced like:

 Dim arr() As String
    Dim i As Long
    i = 0

    If oFSO.FileExists(filePath) Then

        On Error GoTo Err

        Set oFS = oFSO.OpenTextFile(filePath)
        Do While Not oFS.AtEndOfStream
        ReDim Preserve arr(0 To i)
            arr(i) = oFS.ReadLine                        'to save line's content to array
            'If Len(oFSfile.ReadLine) = 0 Then Exit Do   'to get number of lines only
            i = i + 1
        Loop
        oFS.Close
    Else
        MsgBox "The file path is invalid.", vbCritical, vbNullString
        Exit Sub
    End If

Comments

0

Here's another method that should work fairly quickly. It uses the shell to execute the FINDSTR command. If you find the cmd box flickers, do an internet search for how to disable it. There are two options provided: one will return the line number followed by a colon and the text of the line containing the keyword. The other will just return the line number.

Not sure what you want to do with the results, so I just have them in a message box.


Option Explicit
'Set reference to Windows Script Host Object Model
Sub FindStrings()
    Const FindStr As String = "Pune"
    Const FN As String = "C:\users\ron\desktop\LineNumTest.txt"
    Dim WSH As WshShell
    Dim StdOut As Object
    Dim S As String

Set WSH = New WshShell

Set StdOut = WSH.Exec("cmd /c findstr /N " & FindStr & Space(1) & FN).StdOut
Do Until StdOut.AtEndOfStream
    S = S & vbCrLf & StdOut.ReadLine
    'If you want ONLY the line number, then
    'S = S & vbCrLf & Split(StdOut.ReadLine, ":")(0)
Loop
S = Mid(S, 2)

MsgBox (S)

End Sub

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.