Tuesday, February 5, 2013

It's 2013; use "getaddrinfo" for sockets

If you're anything like me, you've come to accept this as true:

  • Connecting to anything via a scripting language is easy; if it's a domain name, then it'll be translated.  If there's a port number at the back, then it'll be extracted.  IPv4 or IPv6?  No problem.
  • Doing the same amount of work in Linux using C/C++ is a nightmare.

Well, it turns out that the second point isn't true.  However, the man pages for the networking functions are obscure enough that it has seemed that way to me for years.

Calling "getaddrinfo"

The function "getaddrinfo" is meant to be the starting place for network connections.  It takes two strings, optional hints, and a pointer to use for your result structure.  Seems simple, right?

Conceptually, you call it in this way:
struct addrinfo* targetAddress = NULL;
int result = getaddrinfo(
   "your-thing.example.com",
   "80",
   NULL,
   &targetAddress
);
if( result != 0 ) {
   cout << "Error resolving address: " << gai_strerror( result ) << endl;
   // ...
}
// ...
freeaddrinfo( targetAddress );

Ultimately, this takes "your-thing.example.com", resolves it to an IP address (it could be IPv4 or IPv6, or both); takes "80", and resolves it to a port number (this could have been "http" and it would still work); and puts the resulting information into "targetAddress", which it allocates for you.

When you're done, free "targetAddress" using "freeaddrinfo".

And yes, this thing has its own error function, "gai_strerror".
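
Since the free and error-reporting steps are easy to forget, here is a minimal sketch of one way to bundle them, assuming C++11 or later.  The names "AddrInfoDeleter", "AddrInfoPtr", and "resolveAddress" are hypothetical helpers of my own, not part of any library; only "getaddrinfo", "freeaddrinfo", and "gai_strerror" are the real calls.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>

#include <memory>
#include <string>

// Hypothetical helper type: calls freeaddrinfo when the owning pointer dies.
struct AddrInfoDeleter {
   void operator()( struct addrinfo* list ) const {
      freeaddrinfo( list );
   }
};
using AddrInfoPtr = std::unique_ptr<struct addrinfo, AddrInfoDeleter>;

// Hypothetical helper: resolves host/service with no hints.  On failure it
// returns an empty pointer and stores the gai_strerror text in errorMessage.
AddrInfoPtr resolveAddress(
   const std::string& host,
   const std::string& service,
   std::string& errorMessage
) {
   struct addrinfo* rawList = nullptr;
   int result = getaddrinfo( host.c_str(), service.c_str(), nullptr, &rawList );
   if( result != 0 ) {
      // getaddrinfo has its own error codes, so use gai_strerror, not strerror.
      errorMessage = gai_strerror( result );
      return AddrInfoPtr();
   }
   errorMessage.clear();
   return AddrInfoPtr( rawList );
}
```

With this, something like "resolveAddress( "your-thing.example.com", "80", message )" hands back a list that frees itself when it goes out of scope.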

Using a "struct addrinfo*"

The result of "getaddrinfo" can be immediately used by "socket":
int socketHandle = socket(
   targetAddress->ai_family,
   targetAddress->ai_socktype,
   targetAddress->ai_protocol
);
if( socketHandle < 0 ) {
   cout << "Could not create socket." << endl;
   // ...
}
int result = connect(
   socketHandle,
   targetAddress->ai_addr,
   targetAddress->ai_addrlen
);
if( result != 0 ) {
   cout << "Could not connect to target address." << endl;
   // ...
}

But don't you normally have to specify the kind of socket?  Is it stream- or datagram-based?  Is it IPv4 or IPv6?  What network protocol should it use?

That's where things get interesting.  A "struct addrinfo*" actually represents a linked list, so "getaddrinfo" returns a list of possible results, and each entry carries its own family, socket type, and protocol.  If we told it only the domain name and port, then it would return many different "struct addrinfo" instances, all linked together starting from the original one.  To access the next one, use "targetAddress->ai_next"; the last entry has a NULL "ai_next".
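
To see what is actually in that list, here is a rough sketch that walks it and builds one line of text per entry.  "describeResults" is a hypothetical helper name of my own, and the 256-byte buffer is an arbitrary assumption; "getnameinfo" with "NI_NUMERICHOST" is the standard way to turn an entry's address back into text.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>

#include <string>
#include <vector>

// Hypothetical helper: returns one human-readable line per list entry,
// or an empty vector if resolution fails.
std::vector<std::string> describeResults( const char* host, const char* service ) {
   std::vector<std::string> lines;
   struct addrinfo* firstEntry = nullptr;
   if( getaddrinfo( host, service, nullptr, &firstEntry ) != 0 ) {
      return lines;
   }
   // Follow ai_next until it is NULL.
   for( struct addrinfo* entry = firstEntry; entry != nullptr; entry = entry->ai_next ) {
      char numericHost[256]; // Assumed to be large enough for a numeric address.
      int nameResult = getnameinfo(
         entry->ai_addr, entry->ai_addrlen,
         numericHost, sizeof( numericHost ),
         nullptr, 0,
         NI_NUMERICHOST
      );
      std::string address = ( nameResult == 0 ) ? numericHost : "?";
      lines.push_back(
         address
         + " family=" + std::to_string( entry->ai_family )
         + " socktype=" + std::to_string( entry->ai_socktype )
         + " protocol=" + std::to_string( entry->ai_protocol )
      );
   }
   freeaddrinfo( firstEntry );
   return lines;
}
```

Without hints, the same address typically shows up several times, once per socket type, which is exactly the redundancy that hints remove.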

Now, that might be kind of cool in the sense that you can simply loop through the results and keep trying them until one works.  But you probably know in advance what you're looking for.  I generally know the socket type and protocol up front, and only want the name translation itself to be IP-version-agnostic.

This is where hints come in.

The third parameter to "getaddrinfo" is another "struct addrinfo*", but this time, you get to fill it out, and it'll use the contents of that structure as filter criteria.

For example, if you wanted to connect to "your-thing.example.com:80" over UDP, then you would just set up the hint accordingly:
struct addrinfo hint;
memset( &hint, 0, sizeof( hint ) );
// Allow either IPv4 or IPv6.
hint.ai_family = AF_UNSPEC;
// Use a datagram socket (since that's what UDP is).
hint.ai_socktype = SOCK_DGRAM;
// Don't worry about flags.
hint.ai_flags = 0;
// Use UDP, specifically.  IPPROTO_UDP (from <netinet/in.h>) is protocol number 17.
hint.ai_protocol = IPPROTO_UDP;

struct addrinfo* targetAddress = NULL;
int result = getaddrinfo(
   "your-thing.example.com",
   "80",
   &hint, //< Specify the hint here.
   &targetAddress
);

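Now, since your socket type and protocol are already filled out, every entry that comes back is one that you can use immediately; otherwise, "getaddrinfo" will fail.  Because the family was left as "AF_UNSPEC", the list may still hold both an IPv4 and an IPv6 entry, but the guesswork about socket type and protocol is gone.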
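
As a quick sanity check of that claim, here is a small sketch that resolves with a UDP-only hint and verifies that every returned entry really is a UDP datagram entry.  "countUdpResults" is a hypothetical helper name of my own.

```cpp
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

#include <cstring>

// Hypothetical helper: resolves with a UDP-only hint.  Returns the number of
// entries, or -1 if resolution fails or any entry is not a UDP datagram entry.
int countUdpResults( const char* host, const char* service ) {
   struct addrinfo hint;
   std::memset( &hint, 0, sizeof( hint ) );
   hint.ai_family = AF_UNSPEC;      // Allow either IPv4 or IPv6.
   hint.ai_socktype = SOCK_DGRAM;   // Datagram sockets only.
   hint.ai_protocol = IPPROTO_UDP;  // UDP, specifically.

   struct addrinfo* firstEntry = nullptr;
   if( getaddrinfo( host, service, &hint, &firstEntry ) != 0 ) {
      return -1;
   }
   int count = 0;
   bool allMatch = true;
   for( struct addrinfo* entry = firstEntry; entry != nullptr; entry = entry->ai_next ) {
      ++count;
      if( entry->ai_socktype != SOCK_DGRAM || entry->ai_protocol != IPPROTO_UDP ) {
         allMatch = false;
      }
   }
   freeaddrinfo( firstEntry );
   return allMatch ? count : -1;
}
```

A host that has both IPv4 and IPv6 addresses can still produce more than one entry here, so a caller that cares about the family should either loop over the results or set "ai_family" in the hint as well.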
