Monday, September 24, 2012

Net-SNMP and read_all_mibs

For better or for worse, Net-SNMP is the SNMP implementation for use with C/C++ (and Linux).  My feelings on Net-SNMP are a topic for another time, however.

Here, we'll be focused on using the "read_all_mibs" function to load the MIBs that have been deployed and how to interpret the result of that function.

Our major use cases revolve around:

  • Finding information about an OID (known by its numeric form); and
  • Showing the contents of the OID tree (from some arbitrary location).

Initializing Net-SNMP

If your MIBs are in a non-standard directory, or if you just want to take charge and tell Net-SNMP what to do, you'll need to specify the MIB directory before calling "init_snmp".  For example:
netsnmp_set_mib_directory( "/usr/local/snmp/mibs" );

Once you've done that, your next step is to initialize Net-SNMP.  You'll need to give it some arbitrary text (which is used for logging purposes, I believe).  This should normally be the name of your application or something otherwise unique.
init_snmp( "sense-codons" );

"init_snmp" will then load all of the MIBs in the directory that you specified, and it will spew errors all over the screen if your MIBs are in any way not perfect.  And trust me, if you put some proprietary MIBs in that directory, you'll see what I'm talking about.  The point is that the "init_snmp" function might take a minute or two to run; once that function returns, you can rest assured that the MIBs have been loaded and, as best as it could, Net-SNMP has organized them into an OID tree for you to use.

Root nodes

The top-level structure of the OID tree looks like this:
  • ccitt(0)
  • iso(1)
  • joint-iso-ccitt(2)
This much will be understood by Net-SNMP even if there are no MIBs present.

Since this is understood to be a tree, I would have expected there to be an uppermost, root node in the tree, but that's not the case with Net-SNMP.  Rather, all of the top-level items are the roots of their own trees, and there is no root node.

Getting a root node

To obtain the first (that is, lowest-numbered) root node (remember, there are multiple root nodes), you call "read_all_mibs".  This returns a "struct tree*", which has the following useful fields:
  • label, a "char*" that is the name of the node.
  • subid, an unsigned integer that is the numeric ID of the node.
  • next_peer, a "struct tree*" pointer to the next sibling of the node.  This will be NULL if there is no sibling node.
  • child_list, a "struct tree*" pointer to the first child of the node.  This will be NULL if there is no child node.
With a basic set of MIBs, you should get back a node with subid=0, label=ccitt and a sibling of subid=1, label=iso.

Trusting node IDs

This whole setup seems pretty straightforward; you'd think that the usual tree-navigation strategies would work.  However, there is just one more item that I need to mention: node IDs are not technically unique.

Due to the magic and wonder of SNMP, it often is the case that two manufacturers (or even one manufacturer at two times) will redefine a node.  That is, the same node ID will have different names ("label" in the structure).

As far as I can tell, Net-SNMP will always store both names; it will choose one of them to be authoritative, and the others will be more or less shortcuts to the "real" name.  These are stored in the tree as siblings.

So, if you are looking for node ".0.0.8.341.1", under node ".0.0.8.341" there might be multiple "1" nodes.  Once you find the first subid=1 node, you should keep iterating through all of the "next_peer" nodes that have subid=1.  The last such node is the node that you actually one.

For example, assume that you've found the first such node; we'll call it "node".  This little one-line for-loop will update "node" to the last node with that node ID.  This will work even if it is the only such node.
int currentSubid = node->subid;
for( ; node && node->next_peer && node->next_peer->subid == currentSubid; node = node->next_peer );

Wednesday, August 1, 2012

init, setrlimit, and ZeroMQ

Linux has the notion of "resource limits" per process.  This conceptually cool because you could cap potentially dangerous processes with regard to memory size, number of threads, real-time priority, and a bunch of other things (including number of open files, which we'll come to shortly).  Practically, this protects a box from crippling itself under the weight of a malicious (or simply buggy) process.

Resource limits come with two notions of limit: a "hard" limit and a "soft" limit.  The hard limit is the limit set by "the system" and acts as a true upper bound on the particular resource.  The soft limit is a limit that may be changed by the process--but only up until the hard limit.

This way, a process may be able to create a vast number of threads, but it must specifically increase its own limit in order to do so.  It's a tiny little check to make you really think about what you're doing, since you have to manually increase the resource limit within the process.

In the shell: ulimit

When in a shell, the "ulimit" command can be used to change these limits from that moment forward.  This is often used to allow any processes spawned by the shell to have a nonzero (usually infinite) sized core file, which can be used to debug a program that crashes.

In the code: setrlimit

In C/C++, you can use "setrlimit" to change the limits.

How hard is "hard"?

In addition to changing the soft limit, you can change the hard limit if you are root.  Let me say that one more time: you can change hard limits if you are root.

Why is this important?  When the "init" process starts, it sets the hard limit for the maximum number of files open at any one time to 1024.  I ran into this limit years ago, and I tried to change the soft limit, but to no avail.  I couldn't get my process (which must run out of "init") to set this limit higher than 1024, not even as root.  It wasn't until recently that I realized that I didn't try changing the hard limit as root.  I naively thought that a limit documented as "hard" would mean that nothing could change it.  Nope.  One quick call and that number was anything I wanted.

So, why was that important?  We've recently begun to use ZeroMQ in our code at work, and it's pretty cool.  Without getting into too many details, ZeroMQ is a a message-passing system that abstracts communication into "messages" that are sent and received from "sockets".  In theory these sockets have nothing to do with Linux sockets; they are magical mailboxes for magical postmen that deliver magical messages.  In practice, they are super clean wrappers around TCP, UDP, and Unix sockets.

What did I have that needed a thousand network connections?  Nothing.  That's where things got interesting.  ZeroMQ's documentation leads you to believe that you should give up on managing global task lists and mutexes and signaling in favor of its notion of a simple message queue and workers that pull from it.  It wants you to focus on the communication aspect of what you're doing rather than getting lost in mutex hell.  As icing on the cake, ZeroMQ provides "in-process" sockets that don't hit the networking stack at all and can only be used within a process.  Perfect for eliminating task lists and custom thread pool code.

What they don't tell you is that every single ZeroMQ "socket" internally uses the "socketpair" call to generate two real sockets for some internal signalling mechanism--in addition to any networking sockets that are needed.  So, if you really and truly adopted their paradigm, throwing in-process sockets everywhere to simplify all of your code, you would easily run out of file descriptors on a process that spawned from "init".

Another limit?

After allowing my process an unlimited number of open files, I found that ZeroMQ was throwing exceptions about "too many open files", even when I was only around 1,000 open files.  This made no sense at all, since my process limit was now over 65,000.  So what was going on?

It turns out that ZeroMQ internally has a maximum number of files that it wants to respect.  That's great, except that that limit can only be changed in "config.hpp", a random file in the source code of ZeroMQ.  In order to increase that limit to acceptable levels, I had to manually patch ZeroMQ's source code, recompile it, and redistribute it to my equipment.

Summary

  • Resource limits place limits on how much a process may do.
  • "root" can arbitrarily change these limits, including the unchangeable "hard" limits.
  • "nofiles" is the limit for the maximum number of open files.  "no" here represents an abbreviation of the word "number" (for whatever reason that those four extra letters couldn't be trusted).
  • To set the limit in C/C++, use "setrlimit" with RLIMIT_NOFILES.
  • ZeroMQ creates two sockets for every ZeroMQ "socket" that it creates.
  • "max_sockets" is ZeroMQ's own constant (defined in "config.hpp") for the number of sockets that it will allow itself to open.

Tuesday, July 17, 2012

PHP, Objects, and Zvals

PHP, for all its faults, is a powerful and versatile language.  Its syntax is fairly simple for what it allows you to, and the ability to write extensions for it allows it do pretty much anything that it can't do on its own.

However, that being said, there is next to no documentation for writing an extension.  If you want to do something more complicated than the few examples that exist on the Web, well, you're going to be in for hours and hours of pain and suffering.  Just creating an extension that compiles, installs, and doesn't crash PHP is sadly an achievement.

Today, we're going to briefly cover working with PHP "objects" within an extension and memory leaks, the one thing that you certainly do not want to introduce into your scripting language (other than segfaults).

Objects and Arrays

Objects and arrays in PHP behave fairly similarly.  For example, you can "foreach" through the properties of an object just like you can "foreach" through the elements in an [associative] array.  In fact, the functions to get the properties of an object or the elements of an array (from a "zval" pointer) each return a "HashTable" pointer.

To get the propery HashTable for an object, you call:
HashTable* your hash table = Z_OBJPROP_P( your "zval*" here );

For an array, you call:
HashTable* your hash table = Z_ARRVAL_P( your "zval*" here );

Creating Objects and Arrays

For me, one of the main reasons of creating a PHP extension in the first place is to speed up some task that was too slow for PHP on its own (for example, creating large structures from some input).  In particular, I like to create objects for things that should be treated as objects.

Fortunately, creating objects and arrays in PHP is pretty easy (if not pretty).

Your first step is to declare, allocate, and then initialize a new "zval".
zval* shinyNewZval = NULL;
ALLOC_INIT_ZVAL( shinyNewZval );

For an object, call:
object_init( shinyNewZval );

For an array, call:
array_init( shinyNewZval );

Adding Properties

Adding properties (or assciative elements) is straightforward after that.

Here are some functions that do exactly what they say they do.  They add a new property (of a particular type) to the object.
add_property_long( your "zval*" here, property name, integer value );
add_property_double( your "zval*" here, property name, floating-point value );
add_property_bool( your "zval*" here, property name, boolean value );

For example:
add_property_long( shinyNewZval, "accountId", 9001 );
add_property_double( shinyNewZval, "balance", 501.40 );
add_property_bool( shinyNewZval, "isActive", true );

For associative arrays, just replace "property" with "assoc" and you have essentially the same functions, but for arrays.
add_assoc_long( your "zval*" here, property name, integer value );
add_assoc_double( your "zval*" here, property name, floating-point value );
add_assoc_bool( your "zval*" here, property name, boolean value );

For example:
add_assoc_long( shinyNewZval, "accountId", 9001 );
add_assoc_double( shinyNewZval, "balance", 501.40 );
add_assoc_bool( shinyNewZval, "isActive", true );

Now, what if we wanted to add a zval as a property?  For example, we may want to add a zero-indexed array as one of the properties.  Well, here are the two functions to do so:
add_property_zval( your "zval*" here, property name, property "zval*" );
add_assoc_zval( your "zval*" here, property name, property "zval*" );

Logically, the two functions would work just about identically.  One would add a property to an object; the other would do the same to an associative array.

Wrong!

The object version (and only the object version) also increments the reference counter on the "zval".  Why is that important?  When a new "zval" is created, its reference counter is set to "1"; that is, you have a reference to it, since you made it. Which makes perfect sense.  When all references to a "zval" are gone (that is, when the reference counter hits zero), then the "zval" is actually freed, allowing its memory to be reclaimed.

For an integer or something similarly small, this would usually go unnoticed in the short term.  However, for giant structures, that memory can add up fast.  Your script could easily run out of memory and quit, or you could cause the box to start swapping (which may or may not be bad for your platform, but for mine, swapping is considered the onset of death).

So, how do we fix this?  Simple!  Just decrement the reference counter that "add_property_zval" so rudely incremented on its own.  The function for this is "zval_ptr_dtor", where "dtor" is PHP's shorthand for "destructor".  Since the Zend API is in C, you're essentially saying, "Please destroy my 'zval'".
zval_ptr_dtor( address of your "zval*" here );

To conclude our example, here is the code for the object version:
// Note: this will increment the reference count on "someOtherZval".
add_property_zval( shinyNewZval, "transactions", someOtherZval );
// So, we need to decrement that reference count afterward.
zval_ptr_dtor( &someOtherZval );

And here is the code for the associative array version.  Note that the reference count is not incremented by the array version of the function, so we do not need to do anything special here.
add_assoc_zval( shinyNewZval, "transactions", someOtherZval );

Monday, January 2, 2012

GLib, GHashTable, and g_direct_hash

One of the things that I love about scripting languages (such as PHP and bash) is the built-in concept of a "map."  A map is a data structure that allows a mapping of a key to some arbitrary value.  Other words for this structure are "dictionary," "associative array," and "hash."  The key must be unique (hence the term "dictionary:" a given key can have only one entry in the dictionary).  It is also usually implemented via hashing (hence the term "hash").  In many languages, the means of manipulating a map are the same as those for manipulating an array (hence the term "associative array").

Maps come in two varieties: unidirectional and bidirectional.  A unidirectional map (arguably the most common kind) is a typical "key  value" mapping; the keys must be unique, but the values may overlap.  A bidirectional map allows you to go the other way as well; you can get the key for a given value (a "key ↔ value").

For this article, I'm assuming that you have a basic knowledge of maps.  To brush up, or for more information, you can visit one of these sites:


For a sweet comparison of mapping support in various languages, you should visit:


Maps in C: GLib's GHashTable
Today, we'll be concerned with "C," since that's the language that my company uses for most of its work.

In C, there is no native map support.  In order to use maps, you will need to either use a library that provides them or make your own.  It is now the year 2012, and if you are planning on making your own mapping library, then you are either (1) crazy, (2) interested in learning about how maps work, or (3) really smart and have either a very unique take on maps or a very unique situation to optimize for.  While there are a few map implementations available in C, I'll be focusing on GLib's "GHashTable," since GLib provides numerous data structures (among other things).

A GHashTable is a unidirectional map, implemented via a hash table.  It does just about everything that you could ever want from a map in C; you can find the documentation for it here:


Since this is C, so you have no template magic to handle types.  Without creating multiple different implementations, a generic mapping implementation must use "void*" pointers to arbitrary values (such as integers, doubles, structures, etc.), and this is what a GHashTable does.  Remember, you must police yourself in terms of data types for the map; GLib cannot do it for you.

To see GLib's GHashTable implementation, you can go here:


The key parts of the structure can be summed up by:
gpointer* keys;
guint* hashes;
gpointer* values;
A "gpointer" is essentially "void*," so both "keys" and "values" are arrays of "void*" pointers, exactly what you'd expect for a general mapping implementation in C.  "hashes" is an array of unsigned integers ("guint").  These three arrays have the same number of elements, and an index in one is an index in the others.

(Note that the actual implementation is not as simple as I'm going to explain.)  If you want to map the text "some key" to the text "some value," then "some key" would (naturally) be the key, and "some value" would likewise be the value.  A GHashTable is a hash, which means that the key is first run through a hashing function that returns an integer (hopefully unique) for the key.  That hash value is stored in "hashes" array at an index that has had some more math magic applied to it (conceptually, the hashes array would be a 1:1 mapping with itself (index 3 would be equal to 3), but GHashTable does some additional work to keep the table from clustering and such); the pointer to the string "some key" is stored in that index in "keys;" and the string "some value" is stored at that index in "values."

GHashTable does not copy the content of the data that you give it; it just sticks the pointer in its arrays.  This means that you have to allocate and free the data for the GHashTable.  So, before you add a value to a GHashTable, you have to make sure that it is allocated in such a way that it will not go out of scope and cause all kinds of dereferencing trouble later.  The logical way to do this is to "malloc" everything that goes into the GHashTable and "free" it when you're done.  There are some conveniences that GHashTable provides to make this easier, but we'll get into that later.

In order for GHashTable to know how to hash the key and compare it to other keys, it needs to know two things: (1) the hash function, and (2) the equality function.  Since two keys can hash to the same value, a further (or "final") step in retrieving or updating values is to do a comparison on the key at the hashed location to determine if it is identical.  If not, GHashTable has some built-in logic around looking in other places for that same key.

Trivial cases: g_int_hash, g_int64_hash, g_double_hash, and g_str_hash
GLib provides four "common" hash and equality functions:
  1. g_int_hash (equality function: "g_int_equal"): This allows you to use a 32-bit integer as a key (remember, the value is 100% up to you).
  2. g_int64_hash (equality function: "g_int64_equal"): This allows you to use a 64-bit integer as a key.
  3. g_double_hash (equality function: "g_double_equal"): This allows you to use a double-precision number as a key.
  4. g_str_hash (equality function: "g_str_equal"): This allows you to use a string as a key.
All four of these functions will dereference the key given to them, which means that you must pass a pointer to the data, and that pointer must remain valid for the entirety of its time in the hash.

For example, this is the implementation of "g_int_equal:"
gboolean g_int_equal( gconstpointer v1, gconstpointer v2 ) {
   return *((const gint*) v1) == *((const gint*) v2);
}

The first thing that this function does is cast the "void*" pointer to an integer pointer, and then it dereferences it.  For strings, g_str_equal calls "strcmp" (which will dereference the pointers).

When using a GHashTable in real life, it is usually a really good idea to use the "g_hash_table_new_full" function, which lets you specify a function to call (i.e., a "destructor") when a key should be removed (and also one for the value).  This saves you the trouble of manually looping through the GHashTable to free up memory.

Here is a complete example of how to create a mapping of an integer to its textual, English name.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <glib.h>

/** This prints out a key/value combination, for use in "g_hash_foreach".
  **/
void printKeyValue( gpointer key, gpointer value, gpointer userData ) {
   int realKey = *( (int*)key );
   char* realValue = (char*)value;

   printf( "%d => %s\n", realKey, realValue );
   return;
}

/** The main function.
  **/
int main( int argc, char** argv ) {
   //! This is our map.
   GHashTable* map = NULL;
   //! This is the pointer to our key data.
   int* key = NULL;
   //! This is the pointer to our value data.
   char* value = NULL;

   // Create the GHashTable.
   map = g_hash_table_new_full(
      g_int_hash, g_int_equal, //< This is an integer hash.
      free, //< Call "free" on the key (made with "malloc").
      free //< Call "free" on the value (made with "strdup").
   );

   // Insert some values.
   // 7 => seven
   key = malloc( sizeof(*key) );
   *key = 7;
   value = strdup( "seven" );
   g_hash_table_insert( map, key, value );

   // 10 => ten
   key = malloc( sizeof(*key) );
   *key = 10;
   value = strdup( "ten" );
   g_hash_table_insert( map, key, value );

   // 9001 => over nine-thousand!
   key = malloc( sizeof(*key) );
   *key = 9001;
   value = strdup( "over nine-thousand!" );
   g_hash_table_insert( map, key, value );

   // Print out the contents of the map.
   g_hash_table_foreach( map, printKeyValue, NULL );

   // Destroy the map.
   g_hash_table_destroy( map );
   map = NULL;

   return 0;
}

The output is naturally:
7 => seven
10 => ten
9001 => over nine-thousand!

The important thing to note is that no memory is leaked.  A GHashTable created with "g_hash_table_new_full" will do a good job of cleaning up after itself when destroyed (or when removing or duplicating keys).

What would happen if we didn't pass "key" the pointer to "g_hash_table_insert," but instead just passed the integer that we wanted to use?  We would immediately segfault because "g_int_hash" would dereference "7" (our first integer) and find that 0x0...07 is not anywhere in our process space and crash.

As you can see, GHashTable is pretty cool, but it gets ugly when it comes to allocating memory for the keys and values.  It makes sense to allocate memory for the value (since I used a string in this case), but we're not used to allocating memory for integers.  However, all four of the "trivial" hash types provided by GLib will work this way.

Down and dirty case: g_direct_hash
So, now that we have an example under our belts and we understand how GHashTable is using our data, we can look at an "advanced" mechanism for hashing, "g_direct_hash."

Going back to the implementation of GHashTable, we can see that "keys" and "values" are arrays of "gpointer"s.  What this really means is that we can stick anything we want in "keys" and "values" as long as it can fit within "sizeof( gpointer )".  On a 32-bit system, that is 4 bytes; on a 64-bit system, that is 8 bytes.  So, in reality, we could stick an actual integer into that "gpointer" array as long as we can tell GHashTable not to dereference it.

GHashTable provides exactly this mechanism via "g_direct_hash" (and "g_direct_equal," its equality counterpart).  By using the "g_direct_hash" function, the "pointer" given to GHashTable will be used as the input to the hash function, not the value that it "points" to (since it is not a pointer and doesn't point to anything).

Using a "g_direct_hash" has the following advantages over the other hash types:
  1. No memory is allocated for the key.  Memory allocation is expensive, whether or not you may realize it.  In a high-performance application, this can be a huge saving.
  2. Hashing and comparison are faster.  Since we don't need to dereference the pointer to the key, we save an operation (potentially expensive, since the value may well be outside of the CPU cache).
  3. There is no cleanup for the key.  Since we didn't allocate anything going in, we don't have to free anything going out.
However, a "g_direct_hash" has some disadvantages:
  1. It is architecture-dependent.  On a 64-bit system, it can store 8 bytes of data, whereas on a 32-bit system, it can only store 4 bytes of data.  It is up to you to make sure that you know what you are doing.
  2. It can only store structures that are no larger than a "gpointer."  Your key should essentially be a 64-bit value on a 64-bit system (or use 32-bit values if you want compatibility with 32-bit systems).
Since the "values" array is implemented exactly as the "keys" array, you can apply the same logic to storing the values.  This makes for very efficient mappings between integers and doubles, but not so much for strings over 7 characters (7 characters + "\0" = 8 bytes = 64-bits).

To make a GHashTable accept a non-pointer value, you have to coerce it into doing so using one of GLib's type-conversion macros: GINT_TO_POINTER.  To get your key (or value) back, you need to convert the other way: GPOINTER_TO_INT.  Both of these will be used below in our example.

For example, to insert our ( 7, "seven" ), we would do:
value = strdup( "seven" );
g_hash_table_insert( map, GINT_TO_POINTER(7), value );

Here is the example from above using "g_direct_hash".
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <glib.h>

/** This prints out a key/value combination, for use in "g_hash_foreach".
  **/
void printKeyValue( gpointer key, gpointer value, gpointer userData ) {
   int realKey = GPOINTER_TO_INT( key );
   char* realValue = (char*)value;

   printf( "%d => %s\n", realKey, realValue );
   return;
}

/** The main function.
  **/
int main( int argc, char** argv ) {
   //! This is our map.
   GHashTable* map = NULL;
   //! This is the pointer to our value data.
   char* value = NULL;

   // Create the GHashTable.
   map = g_hash_table_new_full(
      g_direct_hash, g_direct_equal, //< This is an integer hash.
      NULL, //< There is nothing to free for our key.
      free //< Call "free" on the value (made with "strdup").
   );

   // Insert some values.
   // 7 => seven
   value = strdup( "seven" );
   g_hash_table_insert( map, GINT_TO_POINTER( 7 ), value );

   // 10 => ten
   value = strdup( "ten" );
   g_hash_table_insert( map, GINT_TO_POINTER( 10 ), value );

   // 9001 => over nine-thousand!
   value = strdup( "over nine-thousand!" );
   g_hash_table_insert( map, GINT_TO_POINTER( 9001 ), value );

   // Print out the contents of the map.
   g_hash_table_foreach( map, printKeyValue, NULL );

   // Destroy the map.
   g_hash_table_destroy( map );
   map = NULL;

   return 0;
}

This concludes our discussion of GHashTable.  Remember, you have to allocate and free everything that goes into a GHashTable on your own; you may use "g_hash_table_new_full" to have GHashTable do the freeing for you, however.  You may take advantage of the knowledge of your architecture to use non-pointer data for your keys, using "g_direct_hash" (and your values, using your own code).