
I wrote the function below to find a pattern in a text:

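#include <stdbool.h>
#include <string.h>

/* Returns true if patt occurs anywhere in text (both must be NUL-terminated). */
bool match(char* patt, char* text){
    /* Measure both strings once instead of calling strlen on every iteration. */
    int pattLen = (int) strlen(patt), textLen = (int) strlen(text);
    int textLoc = 0, pattLoc = 0, textStart = 0;

    while(textLoc < textLen && pattLoc < pattLen){
        if( patt[pattLoc] == text[textLoc] ){
            /* Characters agree: advance in both strings. */
            textLoc = textLoc + 1;
            pattLoc = pattLoc + 1;
        }
        else{
            /* Mismatch: retry the pattern one position further into the text. */
            textStart = textStart + 1;
            textLoc = textStart;
            pattLoc = 0;
        }
    }

    /* The whole pattern was consumed only if a full match was found. */
    return pattLoc >= pattLen;
}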

As shown, the function takes two parameters of type char*. I would like to use this function to find a pattern in a binary file. What do you suggest?

  • By binary file, are you mapping the file into memory yourself or are you loading it and then injecting a DLL into it? Commented Mar 10, 2012 at 12:42
  • @Mike Kwan Yes, I am going to map the file into memory with CreateFileMapping, etc. Commented Mar 10, 2012 at 16:14
  • Consider what would happen if your binary file contained no null bytes. You can't treat a binary stream of data as null-terminated char strings. Commented Mar 10, 2012 at 22:39

3 Answers


There is no right or wrong here. The only change I would make is to use a buffer/size approach instead of strings: pass each buffer together with an explicit length rather than relying on a terminating null byte.
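As a rough illustration of the buffer/size approach, here is a minimal sketch that keeps the same naive search but takes explicit lengths, so null bytes inside the data are treated like any other byte. The name match_bytes, its parameter order, and the choice to treat an empty pattern as a match are assumptions made for this example, not part of the original code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Length-based variant of match(): neither buffer needs a terminating '\0'.
   match_bytes is a hypothetical name chosen for this sketch. */
bool match_bytes(const unsigned char *patt, size_t patt_len,
                 const unsigned char *text, size_t text_len)
{
    assert(patt != NULL && text != NULL);

    if (patt_len == 0)
        return true;    /* assumption: an empty pattern always matches */
    if (patt_len > text_len)
        return false;   /* the pattern cannot fit in the text */

    /* Try every start position at which the whole pattern still fits. */
    for (size_t start = 0; start + patt_len <= text_len; start++) {
        size_t i = 0;
        while (i < patt_len && text[start + i] == patt[i])
            i++;
        if (i == patt_len)
            return true;    /* every pattern byte matched */
    }
    return false;
}
```

With a memory-mapped file, text would be the base address of the view and text_len the size of the mapped region.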

You should also consider how you would like to read the file. Are you going to read the entire file into memory, or are you going to read it in sections?

If you are going to read it in sections, always save the last part of each section (the size of your search pattern; strictly, one byte fewer is enough) and prepend it to your next section. This way, matches that straddle the cut between two sections will be found as well.
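The section-by-section idea could be sketched roughly as follows. The function name stream_contains, the CHUNK_SIZE of 4096, and the restriction that the pattern is non-empty and no longer than one chunk are all assumptions of this example. It carries the last patt_len - 1 bytes of each section over to the front of the buffer before reading the next one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define CHUNK_SIZE 4096   /* arbitrary section size chosen for this sketch */

/* Returns true if patt occurs in the rest of the stream fp.
   Assumes 0 < patt_len <= CHUNK_SIZE; other lengths are rejected. */
bool stream_contains(FILE *fp, const unsigned char *patt, size_t patt_len)
{
    unsigned char buf[2 * CHUNK_SIZE];
    size_t carried = 0;   /* bytes kept from the previous section */

    assert(fp != NULL && patt != NULL);
    if (patt_len == 0 || patt_len > CHUNK_SIZE)
        return false;

    for (;;) {
        /* Read the next section right behind the carried-over tail. */
        size_t got = fread(buf + carried, 1, CHUNK_SIZE, fp);
        size_t avail = carried + got;

        /* Naive scan of everything currently in the buffer. */
        for (size_t i = 0; i + patt_len <= avail; i++)
            if (memcmp(buf + i, patt, patt_len) == 0)
                return true;

        if (got < CHUNK_SIZE)
            return false;   /* end of file (or read error): nothing left */

        /* Keep the last patt_len - 1 bytes so that a match straddling
           the section boundary is still seen in the next round. */
        carried = patt_len - 1;
        memmove(buf, buf + avail - carried, carried);
    }
}
```

The file should be opened in binary mode, for example with fopen(path, "rb"), so that no byte values are translated while reading.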


It seems to me that you tried to implement the popular strstr function on your own. But that will not help you, since you want to find a binary pattern. The function you should use in that case is called memmem.

2 Comments

Is it for linux? I use Windows.
You're right, the memmem function is not available everywhere. But there is a good and simple implementation from the git project, called gitmemmem. You can use that instead of doing it yourself.
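For platforms without memmem, a small stand-in with the same kind of interface (pointer plus length for both haystack and needle, returning a pointer to the first occurrence or NULL) might look like the sketch below. The name find_bytes is hypothetical, this is not the gitmemmem code, and returning the haystack for an empty needle is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical memmem-style helper: returns a pointer to the first
   occurrence of needle in haystack, or NULL if there is none. */
const void *find_bytes(const void *haystack, size_t hay_len,
                       const void *needle, size_t needle_len)
{
    const unsigned char *h = haystack;
    const unsigned char *n = needle;

    assert(haystack != NULL && needle != NULL);
    if (needle_len == 0)
        return haystack;   /* assumption: empty needle matches at the start */
    if (needle_len > hay_len)
        return NULL;

    /* Last position at which the whole needle still fits. */
    const unsigned char *last = h + (hay_len - needle_len);

    while (h <= last) {
        /* Jump to the next candidate: a byte equal to the needle's first byte. */
        const unsigned char *p = memchr(h, n[0], (size_t)(last - h) + 1);
        if (p == NULL)
            return NULL;
        if (memcmp(p, n, needle_len) == 0)
            return p;
        h = p + 1;
    }
    return NULL;
}
```

Subtracting the haystack pointer from a non-NULL result gives the byte offset of the match within the buffer.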

You sound more like you are looking for the best way to find patterns in files. If so, there is a very good document for single and multiple pattern checking:

Given a pattern P = a1a2...an, find all occurrences of P in a text T = b1b2...bm.

Extension to multipattern cases: Given a set of patterns, P1, P2, ..., Pl, find all occurrences of P in a text T = b1b2...bm.

You can check this document for a simple explanation and this one for more detailed and different implementations/codes.

1 Comment

@Adban sorry. Updated links now.
