Tuesday, July 17, 2012

PHP, Objects, and Zvals

PHP, for all its faults, is a powerful and versatile language.  Its syntax is fairly simple for what it allows you to do, and the ability to write extensions for it allows it to do pretty much anything that it can't do on its own.

That said, there is next to no documentation for writing an extension.  If you want to do something more complicated than the few examples that exist on the Web, well, you're going to be in for hours and hours of pain and suffering.  Sadly, just creating an extension that compiles, installs, and doesn't crash PHP is an achievement.

Today, we're going to briefly cover working with PHP "objects" within an extension, and how to avoid memory leaks, the one thing that you certainly do not want to introduce into your scripting language (other than segfaults).

Objects and Arrays

Objects and arrays in PHP behave fairly similarly.  For example, you can "foreach" through the properties of an object just like you can "foreach" through the elements in an [associative] array.  In fact, the functions to get the properties of an object or the elements of an array (from a "zval" pointer) each return a "HashTable" pointer.

To get the property HashTable for an object, you call:
HashTable* properties = Z_OBJPROP_P( someZval );  // "someZval" is a "zval*" holding an object

For an array, you call:
HashTable* elements = Z_ARRVAL_P( someZval );  // "someZval" is a "zval*" holding an array

Creating Objects and Arrays

For me, one of the main reasons for creating a PHP extension in the first place is to speed up some task that was too slow for PHP on its own (for example, creating large structures from some input).  In particular, I like to create objects for things that should be treated as objects.

Fortunately, creating objects and arrays in PHP is pretty easy (if not pretty).

Your first step is to declare, allocate, and then initialize a new "zval".
zval* shinyNewZval = NULL;
ALLOC_INIT_ZVAL( shinyNewZval );

For an object, call:
object_init( shinyNewZval );

For an array, call:
array_init( shinyNewZval );

Adding Properties

Adding properties (or associative elements) is straightforward after that.

Here are some functions that do exactly what they say they do.  They add a new property (of a particular type) to the object.
add_property_long( your "zval*" here, property name, integer value );
add_property_double( your "zval*" here, property name, floating-point value );
add_property_bool( your "zval*" here, property name, boolean value );

For example:
add_property_long( shinyNewZval, "accountId", 9001 );
add_property_double( shinyNewZval, "balance", 501.40 );
add_property_bool( shinyNewZval, "isActive", 1 );  // 1 for true, 0 for false

For associative arrays, just replace "property" with "assoc" and you have essentially the same functions, but for arrays.
add_assoc_long( your "zval*" here, property name, integer value );
add_assoc_double( your "zval*" here, property name, floating-point value );
add_assoc_bool( your "zval*" here, property name, boolean value );

For example:
add_assoc_long( shinyNewZval, "accountId", 9001 );
add_assoc_double( shinyNewZval, "balance", 501.40 );
add_assoc_bool( shinyNewZval, "isActive", 1 );  // 1 for true, 0 for false

Now, what if we wanted to add a zval as a property?  For example, we may want to add a zero-indexed array as one of the properties.  Well, here are the two functions to do so:
add_property_zval( your "zval*" here, property name, property "zval*" );
add_assoc_zval( your "zval*" here, property name, property "zval*" );

Logically, the two functions would work just about identically.  One would add a property to an object; the other would do the same to an associative array.

Wrong!

The object version (and only the object version) also increments the reference counter on the "zval".  Why is that important?  When a new "zval" is created, its reference counter is set to "1"; that is, you have a reference to it, since you made it, which makes perfect sense.  When all references to a "zval" are gone (that is, when the reference counter hits zero), the "zval" is actually freed, allowing its memory to be reclaimed.  After "add_property_zval", though, the counter sits at "2": one reference for you and one for the object.  When the object is eventually destroyed, it gives up its reference, the counter drops back to "1", and nothing ever releases that last reference.  The "zval" is never freed, and you have a memory leak.

For an integer or something similarly small, a leak like this would usually go unnoticed in the short term.  However, for giant structures, that memory can add up fast.  Your script could easily run out of memory and quit, or you could cause the box to start swapping (which may or may not be bad for your platform, but for mine, swapping is considered the onset of death).

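So, how do we fix this?  Simple!  Just decrement the reference counter that "add_property_zval" so rudely incremented on its own.  The function for this is "zval_ptr_dtor", where "dtor" is PHP's shorthand for "destructor".  Since the Zend API is in C, you're essentially saying, "Please destroy my 'zval'".  Despite the name, the function only gives up your reference: the "zval" is actually destroyed only if the counter reaches zero.  Here, the counter drops from "2" back to "1", leaving the object as the sole owner.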
zval_ptr_dtor( address of your "zval*" here );

To conclude our example, here is the code for the object version:
// Note: this will increment the reference count on "someOtherZval".
add_property_zval( shinyNewZval, "transactions", someOtherZval );
// So, we need to decrement that reference count afterward.
zval_ptr_dtor( &someOtherZval );

And here is the code for the associative array version.  Note that the reference count is not incremented by the array version of the function, so we do not need to do anything special here.
add_assoc_zval( shinyNewZval, "transactions", someOtherZval );
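
To make the difference between the two functions concrete, here is a toy model of the reference counting in plain C.  It does not use the Zend API at all: "toy_val" and every "toy_" function are made-up stand-ins that only mimic the counting behavior described above, and nothing is really allocated (a "freed" flag stands in for the actual free).  Treat it as a sketch of the bookkeeping, not as how PHP implements it.

```c
#include <stdbool.h>

/* Toy stand-in for a zval: only the reference counter is modelled. */
typedef struct {
    int  refcount;
    bool freed;     /* set when the counter reaches zero */
} toy_val;

/* Like creating a fresh zval: the creator holds the first reference. */
static toy_val toy_new(void) {
    toy_val v = { 1, false };
    return v;
}

/* Mimics zval_ptr_dtor: drop one reference, "free" on reaching zero. */
static void toy_release(toy_val *v) {
    if (--v->refcount == 0) {
        v->freed = true;  /* a real implementation would free memory here */
    }
}

/* Object-style add: the container takes its OWN reference. */
static void toy_add_property(toy_val *child) { child->refcount++; }

/* Array-style add: the container takes over the caller's reference. */
static void toy_add_assoc(toy_val *child) { (void)child; }

/* Destroying the container drops the one reference the container holds. */
static void toy_destroy_container(toy_val *child) { toy_release(child); }

/* Each scenario returns true if the child is still alive (leaked)
   after the container has been destroyed. */
bool leaks_object_without_release(void) {
    toy_val child = toy_new();       /* refcount 1 (ours) */
    toy_add_property(&child);        /* refcount 2 (ours + container's) */
    toy_destroy_container(&child);   /* refcount 1, never reaches zero */
    return !child.freed;
}

bool leaks_object_with_release(void) {
    toy_val child = toy_new();       /* refcount 1 */
    toy_add_property(&child);        /* refcount 2 */
    toy_release(&child);             /* give up our reference: refcount 1 */
    toy_destroy_container(&child);   /* refcount 0, freed */
    return !child.freed;
}

bool leaks_array(void) {
    toy_val child = toy_new();       /* refcount 1 */
    toy_add_assoc(&child);           /* still 1, now owned by the container */
    toy_destroy_container(&child);   /* refcount 0, freed */
    return !child.freed;
}
```

Only the first scenario leaks.  The second one mirrors the "add_property_zval" followed by "zval_ptr_dtor" pattern, and the third mirrors "add_assoc_zval", where no extra release is needed.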