5

This is my first question on SO, though I've searched substantially; I apologize if this has already been touched on.

The question/issue has to do with PHP's serialize() functionality. I am using serialization to store objects in a database. For example:

class Something {
  public $text = "Hello World";
}

class First {

  public $MySomething;

  public function __construct() {
    $this->MySomething = new Something();
  }
}

$first_obj = new First();
$string_to_store = serialize($first_obj);

echo $string_to_store;

//  Result:  O:5:"First":1:{s:11:"MySomething";O:9:"Something":1:{s:4:"text";s:11:"Hello World";}}

Now, later on in the project life, I want to modify my class, First, to have a new property: $SomethingElse that will also correspond to a Something object.

The question is: when I unserialize my old/existing objects into the new version of my First class, it seems that the only way to initialize the new property (SomethingElse) is to check for it in the __wakeup() method. In that case, every new property has to be handled there as well. Is this correct? The properties would need to be given their initial values just as in the constructor, which ultimately duplicates the code.

I find that if I initialize the variable when declaring it, then it gets picked up by unserialize(). For example, if I changed the Something class to:

class Something {
  public $text = "Hello World";
  public $new_text = "I would be in the unserialized old version.";
}

...

$obj = unserialize('O:5:"First":1:{s:11:"MySomething";O:9:"Something":1:{s:4:"text";s:11:"Hello World";}}');

print_r($obj);

//  Result:  First Object ( [MySomething] => Something Object ( [text] => Hello World [new_text] => I would be in the unserialized old version. ) ) 

But you cannot initialize a new property to an object when declaring it; it has to be done in the constructor (and in __wakeup()?).

I hope I explained this well enough. I want to know if there's some programming pattern around this that I am missing, or if duplicating initialization code (or referencing an init method) in __wakeup() is typical, or if I simply need to be prepared to migrate old objects to new versions via migration scripts.

Thanks.


Update: In thinking about what was said by the commenters so far, I thought I'd post the updated First class with an init() method:

class Something {
  public $text = "Hello World2";
  public $new_text = "I would be in the unserialized old version.2";
}

class First {

  public $MySomething;
  public $SomethingElse;

  public function __construct() {
    $this->init();
  }

  public function __wakeup() {
    $this->init();
  }

  private function init() {
    if (!isset($this->MySomething)) {
      $this->MySomething = new Something();
    }
    if (!isset($this->SomethingElse)) {
      $this->SomethingElse = new Something();
    }
  }
}

$new_obj = unserialize('O:5:"First":1:{s:11:"MySomething";O:9:"Something":1:{s:4:"text";s:11:"Hello World";}}');

print_r($new_obj);

//  Result:  First Object ( [MySomething] => Something Object ( [text] => Hello World [new_text] => I would be in the unserialized old version.2 ) [SomethingElse] => Something Object ( [text] => Hello World2 [new_text] => I would be in the unserialized old version.2 ) ) 

So really I'm not sure, because that seems like a workable pattern to me. As classes gain new properties they take their default values upon first restoration.

5
  • I don't implement __wakeup much, but if what you are saying is true then I would just make a protected init method and call it from both __wakeup and __construct, thus consolidating the code. That said, I feel like having concrete migration scripts is a good way to go regardless... Commented Feb 27, 2013 at 1:18
  • fair enough, thanks. see my thoughts about migration scripts below in response to hek2mgl's answer. Commented Feb 27, 2013 at 1:37
  • @DanL Of course it will work. But I still think it will get messy after a few updates. And the problem with the renamed class remains (renaming classes is a common refactoring step). But if you manage to hide that complexity behind a somewhat 'intelligent' framework, it could get more interesting. Commented Feb 27, 2013 at 2:28
  • Still thinking this a very good first question! :) Commented Feb 27, 2013 at 2:35
  • haha thanks... I've been programming for quite a while, just never very socially :) Commented Feb 27, 2013 at 2:43

2 Answers

2

Good question! It cannot be answered in general. And I would say it's not just related to serialize(). When you have a SQL database and your code changes, it will not work with old data either. That's a common problem of version management with data(bases).

When integrating data from an older software version into a newer one, you'll usually have the problem that old data has to be translated into a newer format. That's true even for config files, etc.

In such cases it is usual to write a script that translates the old data into the new format. I did this for a couple of years at work when creating upgrade packages for a PHP 'firmware' of a server product. :) Most package managers on Linux distributions do the same.

Note: If you want to be safe against data loss between upgrades, you'll have to take care during development and keep the 'upgradability' of data in mind.
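A minimal sketch of what such a translation script could look like for the classes in the question. The $rows array is a hypothetical stand-in for the database table, and migrate_row() is a name made up for illustration; a real script would read and write rows through whatever database layer the project uses.

```php
<?php
// New versions of the classes from the question.
class Something {
    public $text = "Hello World";
}

class First {
    public $MySomething;
    public $SomethingElse; // property added in the new version
}

// Hypothetical stand-in for database rows: id => serialized payload.
$rows = array(
    1 => 'O:5:"First":1:{s:11:"MySomething";O:9:"Something":1:{s:4:"text";s:11:"Hello World";}}',
);

// Upgrade one stored payload to the current format.
function migrate_row($payload) {
    $obj = unserialize($payload);
    if ($obj === false) {
        throw new RuntimeException('Could not unserialize payload');
    }
    // Old payloads lack SomethingElse, so it is still null here.
    if (!isset($obj->SomethingElse)) {
        $obj->SomethingElse = new Something();
    }
    return serialize($obj);
}

// Rewrite every row once; the application code then only sees the new format.
foreach ($rows as $id => $payload) {
    $rows[$id] = migrate_row($payload);
}
```

Running the script a second time leaves the rows unchanged, because already-migrated objects pass the isset() check.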


Update: I think serialized data can make the data update process even worse. What if you serialize a class and then rename it? It will be hard to retrieve the data. I had never thought about this, but it sounds like a problem for version upgrades.
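One possible way to soften the rename problem, sketched here under the assumption that First was renamed to a hypothetical class called Container: register the old name with class_alias() before unserializing. Without the alias, unserialize() would return a __PHP_Incomplete_Class object for the old payload.

```php
<?php
class Something {
    public $text = "Hello World";
}

// Hypothetical new name for the class that used to be called First.
class Container {
    public $MySomething;
}

// Map the old class name onto the new class so old payloads still resolve.
class_alias('Container', 'First');

$old = 'O:5:"First":1:{s:11:"MySomething";O:9:"Something":1:{s:4:"text";s:11:"Hello World";}}';
$obj = unserialize($old);

// The restored object is an instance of the real class, so
// serializing it again stores the new class name.
$new = serialize($obj);
echo get_class($obj), "\n";
echo $new, "\n";
```

This only hides the rename; every old name has to stay registered until all stored payloads have been rewritten.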


7 Comments

I'm using this to try to get around version management with databases. The idea is that if I can save serialized objects attached to an ID, I can persist my objects via serialization, and when I want to change my classes, I only have to update the PHP objects instead of having to update two places: PHP class AND DB schema. My concern about migration scripts is that this software will be a SaaS platform, which will end up containing a large and growing amount of data. As the data grows, I fear it will become untenable to migrate all of it if I'm routinely changing/refactoring my classes.
You see, it will get even worse. :) That's a general problem regardless of the format the data is stored in. I think it's not good for code quality when the actual code has to deal with older data formats. Translation should be done by a '3rd party script'.
@DanL Thought about this. I think serialized data can make the data update process even worse. What if you serialize a class and then rename it? It will be hard to retrieve the data.
@DanL (Maybe off topic) Do you know about tools like mysqldiff.org ? This eases SQL updates.
Thanks for the suggestion for mysqldiff. I wasn't aware of it. The more I'm thinking about it as well, the more I am agreeing with you. Tracking/processing changes and migrating data through versions seems like the best way to go. And I should rethink serializing objects.
0

Last time I had a class with changing properties, I stored all the data in an associative array called $classVar. If you do that, then every variable you add is covered by one simple serialize() call, no matter how many you add to the array.

When it comes to usage, just check whether the variable is initialized and, if not, set the default. You could even serialize a class version variable to handle more complicated cases, such as variables that are no longer used or variables that need conversion.
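A rough sketch of that idea, with assumed names: the Settings class, its keys ('name', 'title', 'tags') and the version numbers are all invented for illustration. All state lives in $classVar, and __wakeup() uses the stored version to convert old data and fill in defaults.

```php
<?php
// Hypothetical class: all state lives in one array, including a version number.
class Settings {
    const CURRENT_VERSION = 2;

    public $classVar = array();

    public function __construct() {
        $this->classVar = self::defaults();
    }

    private static function defaults() {
        return array(
            'version' => self::CURRENT_VERSION,
            'title'   => 'Untitled',
            'tags'    => array(),
        );
    }

    public function __wakeup() {
        $version = isset($this->classVar['version']) ? $this->classVar['version'] : 1;
        if ($version < 2 && isset($this->classVar['name'])) {
            // Assumed change: version 1 stored the title under 'name'.
            $this->classVar['title'] = $this->classVar['name'];
            unset($this->classVar['name']);
        }
        // Array union keeps existing keys and adds defaults for missing ones.
        $this->classVar = $this->classVar + self::defaults();
        $this->classVar['version'] = self::CURRENT_VERSION;
    }

    public function get($key) {
        return $this->classVar[$key];
    }
}

// A payload as version 1 of the class would have stored it.
$old = 'O:8:"Settings":1:{s:8:"classVar";a:2:{s:7:"version";i:1;s:4:"name";s:4:"Demo";}}';
$obj = unserialize($old);
```

After unserializing, the old 'name' value is available as 'title', the new 'tags' key has its default, and the stored version has been bumped.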
