I have a text file containing a directory listing that I would like to turn into a PHP array. I thought about splitting on spaces, but the number of spaces between items varies, and the spaces inside the directory names would be a problem.

The text file has a very rigid structure that looks like this:

04/17/2013  09:49 PM    <DIR>          This is directory 1 (1994)
03/11/2013  06:48 PM    <DIR>          Director 2 (1951)
04/15/2013  08:34 PM    <DIR>          This is going to be number 3 (2000)
08/17/2012  09:50 PM    <DIR>          Four (1998)
10/17/2011  05:12 PM    <DIR>          And lastly 5 (1986)

I only need to keep the folder date (not the time), the complete name of the directory (as one entry), and the year in parentheses. Thanks in advance!

3 Answers

Sure, use preg_split:

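<?php
$str = "04/17/2013  09:49 PM    <DIR>          This is directory 1 (1994)
03/11/2013  06:48 PM    <DIR>          Director 2 (1951)
04/15/2013  08:34 PM    <DIR>          This is going to be number 3 (2000)
08/17/2012  09:50 PM    <DIR>          Four (1998)
10/17/2011  05:12 PM    <DIR>          And lastly 5 (1986)";

// Split one line into fields. Delimiters are runs of two or more whitespace
// characters, or the trailing "(year)" group; PREG_SPLIT_DELIM_CAPTURE keeps
// the captured four-digit year as its own element.
function sp($x) {
    return preg_split("/\s\s+|\s*\((\d{4}).*\)/", $x, 0, PREG_SPLIT_DELIM_CAPTURE);
}

// One array entry per line of the listing.
$array = preg_split("/\n/", $str);
$processed = array_map('sp', $array);

print_r($processed);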

This will create an array of arrays: each line becomes an array with one string per field. For instance, $processed[0][3] will contain This is directory 1

Keep in mind that this code assumes the separators between fields are two or more whitespace characters; a single space is treated as part of the field. (You will probably need to adjust that to suit your data.)

Edit: I added a part to capture the year as a separate element of the array, so $processed[0][4] now holds 1994. (You don't need the parentheses, right?)

See it working with this change here: http://codepad.org/in973ijV
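
Since the question only asks for the date, the name, and the year, the split rows can be reduced to those three fields. The following is a minimal sketch, assuming each row has the layout produced by the preg_split call above (date at index 0, name at index 3, year at index 4). The helper name keepWantedFields is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: reduce one split row to the three wanted fields.
// Assumes the row layout produced by the preg_split call above:
// [0] date, [1] time, [2] "<DIR>", [3] name, [4] year.
function keepWantedFields(array $row): array
{
    return [
        'date' => $row[0],
        'name' => $row[3],
        'year' => $row[4],
    ];
}

// Sample input taken from the question.
$str = "04/17/2013  09:49 PM    <DIR>          This is directory 1 (1994)\n"
     . "03/11/2013  06:48 PM    <DIR>          Director 2 (1951)";

$result = [];
foreach (preg_split('/\n/', $str) as $line) {
    $row = preg_split('/\s\s+|\s*\((\d{4}).*\)/', $line, 0, PREG_SPLIT_DELIM_CAPTURE);
    // Skip lines that did not split into the expected fields.
    if (count($row) >= 5) {
        $result[] = keepWantedFields($row);
    }
}

print_r($result);
```

The count check is a guard for blank or malformed lines; whether such lines should be skipped or reported depends on your data.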

1 Comment

This is great, but I still need the (year) to be a separate array entry. I'm sure that's easy to modify though.

Why not forget about the text file and use scandir instead?

http://php.net/manual/en/function.scandir.php

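$mydir = "/home/folder/";
$scan = scandir($mydir);

// Skip the "." and ".." entries by name rather than by position,
// since they are not guaranteed to be the first two results.
foreach ($scan as $entry) {
    if ($entry === '.' || $entry === '..') {
        continue;
    }
    echo $entry;
    echo "<hr>";
}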

1 Comment

Because the directory I need to scan is not on a server running PHP. The file is created locally and then uploaded to a server which DOES have PHP. Otherwise that would be a rather easy solution!

The simplest (most readable) pattern is:

$pattern = '~^(?<date>\S+).*<DIR>\s+(?<name>.*) \((?<year>\d{4})\)$~m';
preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);

foreach ($matches as $match) {
    printf("<br>date: %s, name: %s, year: %s",
           $match['date'], $match['name'], $match['year']);
}

But you can optimize it a little by being more explicit:

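// \h (horizontal whitespace) keeps every match on a single line, and the
// name stops before the whitespace that precedes the final "(year)".
$pattern = '~^(?<date>\S++)'                          . '\h++(?:\S++\h++){3}'
         . '(?<name>(?>[^(\s]++|\h++(?!\(\d{4}\)\h*+$)|\((?!\d{4}\)\h*+$))+)' . '\h++\('
         . '(?<year>\d{4})'                           . '\)\h*+$~m';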
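
To end up with the array the question asks for, the matches from the first, simpler pattern can be collected into one entry per directory. This is a minimal sketch, not a definitive implementation. It assumes the listing may use Windows line endings (it looks like `dir` output), so they are normalized first; the sample string stands in for the contents of the uploaded file, and the variable name $dirs is only illustrative.

```php
<?php
// Sample listing from the question, written with "\r\n" line endings to
// mimic a file created on Windows (an assumption about the real file).
$subject = "04/17/2013  09:49 PM    <DIR>          This is directory 1 (1994)\r\n"
         . "03/11/2013  06:48 PM    <DIR>          Director 2 (1951)\r\n"
         . "04/15/2013  08:34 PM    <DIR>          This is going to be number 3 (2000)";

// Normalize line endings so that `$` in multiline mode matches at the end
// of every line; a trailing "\r" would otherwise prevent the match.
$subject = str_replace("\r\n", "\n", $subject);

$pattern = '~^(?<date>\S+).*<DIR>\s+(?<name>.*) \((?<year>\d{4})\)$~m';
preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);

// Keep only the three named fields for each directory.
$dirs = [];
foreach ($matches as $match) {
    $dirs[] = [
        'date' => $match['date'],
        'name' => $match['name'],
        'year' => $match['year'],
    ];
}

print_r($dirs);
```

Lines that do not follow the expected layout simply produce no match, so they are left out of $dirs.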
