I need to find every date in a potentially large SQL dump file (~2-3 MB) and replace it with the same date shifted forward by a given number of days. My company uses this SQL dump to deploy demos of a particular piece of software, and the dates must be shifted so that they fit the period in which the demo will be used.

This is a small extract to serve as an example:

INSERT INTO ordini (id, id_fornitore, data, oggetti_ordine, data_consegna, controllo, data_consegna_prevista, resp_controllo, DDT, nr_DDT, iknow_iddocu, spedizione, pagamento) VALUES (10, 204, '2011-11-29', 'Palline gialle##Palline rosse##Palline verdi##§§1000##200##360##§§12 €##10 €##11.5 €##', '2012-12-29', 0, '2011-12-05', 0, '', '', 0, 'A mano', '30 gg.'), (13, 204, '2011-11-30', 'Palline verdi##§§12##§§5.750##', '2013-04-23', 0, '1970-01-01', 0, '', '', 0, '', ''), (14, 204, '2011-11-30', 'Palline verdi##Palline rosse##§§12##22##§§5.750##5.750##', '2013-02-22', 0, '2011-12-31', 0, '', '', 0, 'A mano', 'Ri.Ba. 30 gg.');

As you can see, all the dates in the file are in MySQL's YYYY-MM-DD date format, like '2013-03-12'. Occasionally a date is followed by a time; the time is irrelevant to my needs and should be left unchanged.

I eventually put together this simple script:

<?php
$push_period = "30";

print "<h1>Parsing file...</h1>";
print "<h2>Pushing dates ahead of {$push_period} days.</h2>";

$file=implode("\n",file('db.sql'));
print($file);

preg_match_all('@(\d\d\d\d)-(\d\d)-(\d\d)@', $file, $match, PREG_OFFSET_CAPTURE);

print "<br /><br />";

print "<table border=1 align='center'>";
print "<th colspan='3'><b>Dates moved {$push_period} days ahead</b></th>";

$array_new_dates = array();

foreach ($match[0] as $occurrence) {

    print "<tr><td>";
    print "<pre>";
    print_r($occurrence);
    print "</pre>";
    print "</td><td width='40' align='center'>";
    print "=>";
    print "</td><td>";
    print "<pre>";

    $temp_array = array();
    $modified_value = date('Y-m-d', strtotime($occurrence[0] . " +".$push_period." days"));
    $temp_array[0] = $modified_value;
    $temp_array[1] = $occurrence[1];
    $array_new_dates[] = $temp_array;
    print_r($temp_array);
    print "</pre>";
    print "</td></tr>";

    $file = substr_replace($file, $modified_value, $occurrence[1], 10);
}

print "</table>";

print($file);
$file = str_replace("\n", "", $file);

$fp=fopen('updated_db.sql','w');

// Dumping updated file
fwrite($fp,$file,strlen($file));
?>

Now, my problem is that when I run this script on large files, I predictably get this error:

Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 64 bytes) in /srv/www/htdocs/mysoftware_dev/date_replacer.php on line 10

I therefore need to process the input in chunks. The problem is that if I split the input file into fixed-size blocks, I might cut a date in half, and that date would then not be shifted. What would be a good approach to this problem (apart from manually pre-splitting the input file into several smaller files)? Thank you for any help.

2 Comments

  • Try using fopen/fgets. This will allow you to read the file line by line, which will avoid excessive memory usage and almost guarantee you won't get half-cut entries. Commented May 12, 2014 at 15:50
  • fgets is the way to go if you can get the SQL dump with one insert on each line. If it's generated via mysqldump, use --extended-insert=FALSE. Commented May 12, 2014 at 16:05

2 Answers

This can be much simpler with a preg_replace_callback() call, which allows you to use a callback function to do logic on your match:

$string = ''; // Data from file

$string = preg_replace_callback(
    '/\d{4}-\d{2}-\d{2}/',
    function($matches) {
        $date = new DateTime(reset($matches));
        $oneDay = new DateInterval('P1D');
        $date->add($oneDay);
        return $date->format('Y-m-d');
    },
    $string
);

Notice how I tweaked your regex: I used {} to specify how many digits to match and removed the capturing groups. We use PHP's DateTime class, add() a DateInterval to the value, and then return the date with format() in the original Y-m-d layout. The example adds a single day ('P1D'); for a 30-day shift the interval would be 'P30D'.

I'd also take @NietTheDarkAbsol's advice and take a look at fgets() if you still have memory issues. However, my cleaned up logic will reduce memory usage (since you won't be storing all of the matches in a variable and looping through them one by one).
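As a rough illustration of combining the two suggestions (fgets() plus preg_replace_callback()), here is a minimal sketch. The function names shift_dates_in_line() and shift_dates_in_file() are hypothetical, it assumes PHP 5.3+ for the closure, and the checkdate() guard (an addition not discussed above) simply leaves impossible values such as MySQL's '0000-00-00' untouched.

```php
<?php
// Hypothetical helper: shifts every valid YYYY-MM-DD date in $text by $days days.
function shift_dates_in_line($text, $days)
{
    return preg_replace_callback(
        '/(\d{4})-(\d{2})-(\d{2})/',
        function ($m) use ($days) {
            // Leave impossible dates such as '0000-00-00' unchanged.
            if (!checkdate((int) $m[2], (int) $m[3], (int) $m[1])) {
                return $m[0];
            }
            // Fixed time zone so the result does not depend on server settings.
            $date = new DateTime($m[0], new DateTimeZone('UTC'));
            $date->modify(sprintf('%+d days', $days));
            return $date->format('Y-m-d');
        },
        $text
    );
}

// Hypothetical helper: copies $inPath to $outPath one line at a time,
// so only the current line is held in memory.
function shift_dates_in_file($inPath, $outPath, $days)
{
    $in = fopen($inPath, 'r');
    $out = fopen($outPath, 'w');
    if ($in === false || $out === false) {
        throw new RuntimeException('Could not open input or output file');
    }
    while (($line = fgets($in)) !== false) {
        fwrite($out, shift_dates_in_line($line, $days));
    }
    fclose($in);
    fclose($out);
}
```

A call such as shift_dates_in_file('db.sql', 'updated_db.sql', 30) would mirror the file names used in the question. Memory use is bounded by the longest line, so a dump written with extended inserts (one very long line per table) benefits less from this approach.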

2 Comments

I started from your answer and modified my original script. Unfortunately, for legacy reasons we're stuck with PHP 5.2.5, so no DateInterval() for me; I had to stick to date() and strtotime(). I also had to move the function outside the preg_replace_callback block, as it was giving me an error. Thanks to @NietTheDarkAbsol as well, as his tip was very helpful. Thanks a lot, your assistance is much appreciated!
No problem, date() and strtotime() work just as well; I just prefer to use OOP when I can :) Not sure why you were getting an error with preg_replace_callback(), maybe it has to do with PHP 5.2's implementation. Glad you got it resolved.
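For readers in the same situation: anonymous functions and DateInterval were both introduced in PHP 5.3, which is a likely explanation for the error on 5.2 (an assumption, since the error text is not shown). A minimal sketch of a 5.2-compatible variant follows, using a method callback together with date() and strtotime(). The class name DateShifter and its methods are hypothetical, and the checkdate() guard is an addition that leaves values such as '0000-00-00' alone.

```php
<?php
// Hypothetical class: stores the offset so the callback needs no closure.
class DateShifter
{
    private $days;

    public function __construct($days)
    {
        $this->days = (int) $days;
    }

    // Callback for preg_replace_callback(); $matches[0] is the matched date.
    public function shiftMatch($matches)
    {
        list($year, $month, $day) = explode('-', $matches[0]);
        // Leave impossible dates such as '0000-00-00' unchanged.
        if (!checkdate((int) $month, (int) $day, (int) $year)) {
            return $matches[0];
        }
        $timestamp = strtotime($matches[0] . sprintf(' %+d days', $this->days));
        if ($timestamp === false) {
            return $matches[0];
        }
        return date('Y-m-d', $timestamp);
    }

    // Shifts every YYYY-MM-DD date found in $text.
    public function shiftAll($text)
    {
        return preg_replace_callback(
            '/\d{4}-\d{2}-\d{2}/',
            array($this, 'shiftMatch'),
            $text
        );
    }
}
```

Usage would look like `$shifter = new DateShifter(30); $line = $shifter->shiftAll($line);` inside whatever loop reads the file.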

I suppose that you will run this script on request, not on a daily basis. Maybe the easiest solution is to increase your memory limit, which is currently 128 MB.

I haven't tested whether your script does what it should, but try increasing the memory limit with this in the PHP script:

ini_set("memory_limit","512M");

or with this in php.ini:

memory_limit = 512M

Also take a look at this

1 Comment

Unfortunately this is not an option, as demos may be deployed on different servers, not just our own.
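When the memory limit cannot be raised and the dump cannot be regenerated with one insert per line, the fixed-size-block idea from the question can still work if the cut point is chosen carefully. The sketch below is one possible approach, not something proposed in the answers: it reads fixed-size chunks and holds back any trailing run of digits and hyphens until the next chunk arrives. Because a date consists only of those characters, a cut placed right after any other character can never split a date. The function names are hypothetical and PHP 5.3+ is assumed for the closure.

```php
<?php
// Hypothetical helper: shifts every valid YYYY-MM-DD date in $text by $days days.
function shift_dates_in_text($text, $days)
{
    return preg_replace_callback(
        '/(\d{4})-(\d{2})-(\d{2})/',
        function ($m) use ($days) {
            // Leave impossible dates such as '0000-00-00' unchanged.
            if (!checkdate((int) $m[2], (int) $m[3], (int) $m[1])) {
                return $m[0];
            }
            $date = new DateTime($m[0], new DateTimeZone('UTC'));
            $date->modify(sprintf('%+d days', $days));
            return $date->format('Y-m-d');
        },
        $text
    );
}

// Hypothetical helper: processes the open stream $in in fixed-size chunks
// and writes the result to the open stream $out.
function shift_dates_in_stream($in, $out, $days, $chunkSize = 65536)
{
    $carry = '';
    while (($chunk = fread($in, $chunkSize)) !== false && $chunk !== '') {
        $buffer = $carry . $chunk;
        // Everything up to the last non-date character is safe to process now.
        $head = rtrim($buffer, '0123456789-');
        // The trailing digits/hyphens may be the start of a date: keep them.
        $carry = (string) substr($buffer, strlen($head));
        fwrite($out, shift_dates_in_text($head, $days));
    }
    // Flush whatever was held back at the end of the input.
    fwrite($out, shift_dates_in_text($carry, $days));
}
```

Memory use stays close to the chunk size regardless of how long the lines in the dump are, unless the file contains an unusually long unbroken run of digits and hyphens.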
