4

I am trying to echo the names/paths of the files listed in logfile.txt. To do that, I use a regex to match everything before the first occurrence of : and output it. I am reading logfile.txt line by line:

<?php

$logfile = fopen("logfile.txt", "r");

if ($logfile) {
    while (($line = fgets($logfile)) !== false) {
        if (preg_match_all("/[^:]*/", $line, $matched)) {
            foreach ($matched as $val) {
                foreach ($val as $read) {
                    echo '<pre>'. $read . '</pre>';
                }
            }
        }
    }

    fclose($logfile);
} else {
    die("Unable to open file.");
}

?>

However, I get the entire contents of the file instead. The desired output would be:

/home/user/public_html/an-ordinary-shell.php
/home/user/public_html/content/execution-after-redirect.html
/home/user/public_html/paypal-gateway.html

Here is the content of logfile.txt:

-------------------------------------------------------------------------------

/home/user/public_html/an-ordinary-shell.php: Php.Trojan.PCT4-1 FOUND
/home/user/public_html/content/execution-after-redirect.html: {LDB}VT-malware33.UNOFFICIAL FOUND
/home/user/public_html/paypal-gateway.html: Html.Exploit.CVE.2015_6073

Extra question: How do I skip reading the first two lines (namely the dashes and the empty line)?

1
  • Use preg_match instead of preg_match_all. (Commented Aug 3, 2016 at 18:08)

2 Answers

3

Here you go:

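<?php
# in real use, load the whole log as one string
# (file() would return an array of lines, which cannot be used as the subject here):
# $data = file_get_contents("logfile.txt");

# inline sample data for this specific demo
$data = <<< DATA
-------------------------------------------------------------------------------

/home/user/public_html/an-ordinary-shell.php: Php.Trojan.PCT4-1 FOUND
/home/user/public_html/content/execution-after-redirect.html: {LDB}VT-malware33.UNOFFICIAL FOUND
/home/user/public_html/paypal-gateway.html: Html.Exploit.CVE.2015_6073
DATA;

$regex = '~^(/[^:]+):~m';
# ^       - anchor the match to the beginning of a line
# /       - a literal slash
# [^:]+   - one or more characters that are NOT a colon
# (...)   - capture group 1: the path, without the trailing colon
# m       - multiline mode, so that ^ matches at the start of every line

preg_match_all($regex, $data, $files);
# $files[0] holds the full matches (with the colon), $files[1] the captured paths
print_r($files);
?>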


It even skips your first two lines, because neither of them starts with a slash; see a demo on ideone.com.
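As a rough sketch of how the file paths could then be pulled out of the result: the variable name $paths and the foreach loop below are illustrative additions, not part of the original demo, and the sample data is copied from the question.

```php
<?php
// Sample data copied from the question; in real use it would come from
// file_get_contents("logfile.txt").
$data = <<<DATA
-------------------------------------------------------------------------------

/home/user/public_html/an-ordinary-shell.php: Php.Trojan.PCT4-1 FOUND
/home/user/public_html/content/execution-after-redirect.html: {LDB}VT-malware33.UNOFFICIAL FOUND
/home/user/public_html/paypal-gateway.html: Html.Exploit.CVE.2015_6073
DATA;

// Group 1 captures the path; the colon stays outside the group.
preg_match_all('~^(/[^:]+):~m', $data, $files);

// $files[0] holds the full matches including the colon,
// $files[1] holds only what group 1 captured.
$paths = $files[1];

foreach ($paths as $path) {
    echo $path, PHP_EOL;
}
```

Only $files[1] is needed when the colon should not be part of the output.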


6 Comments

I would've favored this answer but it outputs two arrays. The first one includes the : and the second starts without a slash before the names of the files.
Do you need the slashes? Updated the answer, loop over $files[1] to have your files. Additionally updated the ideone.com demo.
Let's say I do. Just because they are present in the file and I might need them for specific cases in the end. But I would also like to have just one array.
If you can be sure that there is always a colon (:) in every line with a slash at the beginning, you could as well use: $regex = '~^/[^:]+~m'; (no parentheses anymore) - bear in mind though that it will match much more if there's no colon (anything not a colon includes newline characters as well).
What purpose did the parentheses serve? And I accepted your answer, thank you.
3

preg_match_all returns every occurrence of the pattern. Because [^:]* can also match an empty string, the first line yields four matches, none of which contains a colon:

/home/user/public_html/an-ordinary-shell.php
an empty string
Php.Trojan.PCT4-1 FOUND
another empty string

To obtain a single result, use preg_match; for a task this simple, though, explode should suffice.
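Both options could look roughly like the sketch below. The helper names pathBeforeColon and pathViaRegex are made up for illustration, and the nullable return types assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns the part of a log line before the first
// colon, or null when the line contains no colon at all.
function pathBeforeColon(string $line): ?string
{
    if (strpos($line, ':') === false) {
        return null;
    }
    // The limit of 2 makes explode() split at the first colon only.
    return explode(':', $line, 2)[0];
}

// Same idea with preg_match(), which stops after the first match.
// The lookahead (?=:) requires a colon without including it in the match.
function pathViaRegex(string $line): ?string
{
    return preg_match('/^[^:]+(?=:)/', $line, $m) === 1 ? $m[0] : null;
}

$line = "/home/user/public_html/an-ordinary-shell.php: Php.Trojan.PCT4-1 FOUND\n";
echo pathBeforeColon($line), PHP_EOL;
echo pathViaRegex($line), PHP_EOL;
```

Both helpers return null for the line of dashes and for the empty line, so those can be filtered out with a simple null check.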

To skip lines you don't want, you can for example build a generator function that gives only the good lines. You can also use a stream filter.
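A minimal sketch of the generator approach, assuming that a "good" line is one that starts with a slash. The function name pathLines and that filtering rule are assumptions, and the iterable type hint needs PHP 7.1 or later.

```php
<?php
// Hypothetical generator: yields only the lines that start with a slash,
// which skips the line of dashes and the empty line.
function pathLines(iterable $lines): Generator
{
    foreach ($lines as $line) {
        if (strncmp($line, '/', 1) === 0) {
            // Strip the trailing line break before handing the line on.
            yield rtrim($line, "\r\n");
        }
    }
}

// Sample lines taken from the question's logfile.txt.
$sample = [
    "-------------------------------------------------------------------------------\n",
    "\n",
    "/home/user/public_html/an-ordinary-shell.php: Php.Trojan.PCT4-1 FOUND\n",
    "/home/user/public_html/paypal-gateway.html: Html.Exploit.CVE.2015_6073\n",
];

$paths = [];
foreach (pathLines($sample) as $line) {
    // Keep everything before the first colon.
    $paths[] = explode(':', $line, 2)[0];
}
print_r($paths);
```

For a real log, the array returned by file("logfile.txt") could be passed in place of $sample.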

1 Comment

Silly me. I forgot the difference between preg_match_all and preg_match. Did it with preg_match and explode. Thank you for your time. I would've accepted your answer as well but I can only accept one.
