Wednesday, November 27, 2013

Chrome/Chromium, Roboto, and the horrible text nightmare

I began doing Android development a year or so ago, and the first thing that I noticed was that the official Google Android documentation looked absolutely horrible in Chromium (and Chrome too).  I tried updating fonts, disabling fonts, and installing new fonts, and none of that helped.  On other people's computers, the text looked fine.  In Firefox on my own computer, the text looked fine.  On my computer, in Chromium: total crap.

One of the directions that I researched was around the Roboto font (which apparently was released with Android Ice Cream Sandwich).  However, it turns out that the font itself has nothing to do with the problem.

Today, I finally figured it out (and fixed it!), and my Internet (and Android development) experience has been much, much better.

First, let me tell you that I've been running Kubuntu (the KDE Ubuntu variety) this whole time.  So, this covers Kubuntu 11.10, Kubuntu 12.04, Kubuntu 12.10, Kubuntu 13.04, and Kubuntu 13.10.  I had the problem with all of them.

Here's what it looks like:


To make a very long story short, I had to enable anti-aliasing for fonts for the entire system.  I typically set all of my graphics settings to the lowest possible levels in order to not have to see stupid animations or other things that slow down my experience just to be on par with Windows, and it looks like font anti-aliasing is one of the settings that got turned off.

So, a quick tweak to the drop down box here:

And everything now looks nice and pretty.  Here's the exact same page from before, this time with system-wide anti-aliasing enabled:


Problem solved.

Monday, October 28, 2013

Beware the void* trap

I'd like to take a little bit of time today to share a problem that took me numerous days to track down and solve because it was so obscure and stealthy.  At work, we use ZeroMQ to handle any inter-process communication, and our primary language is C++.  I had begun a fairly intense project to remove ZeroMQ from the intra-process communications of a particular daemon, leaving it only for when we need to cross process boundaries.  Basically, once the messages that we want get into our daemon, there is no reason to pass them off to threads and such by serializing them to byte arrays (since that's what a ZeroMQ message is) when they could be added (as objects) to lists, passed around to thread pools, etc.

As I was nearing the completion of the project, I found that all outbound communication from one of the sections of code seemed to be lost.  This was strange because the inbound communication was handled properly.  I had refactored both directions, so it was quite possible that I had messed something up.  However, no matter how many times I checked the logic, no matter how many log statements I made at each line, nothing looked amiss.  I struggled for days trying to understand why the "send()" calls seemed to be going nowhere, and the answer shocked me.

Here is a simplistic version of the code that I had.  Basically, this code receives (from somewhere else in the program) a list of messages to send to a ZeroMQ socket as a single atomic message.  This is accomplished by adding the "ZMQ_SNDMORE" ("send more") flag to the call.
void sendMessageToEndpoint1( std::list<zmq::message_t*>& messages ) {
   while( messages.size() > 0 ) {
      //! This is the ZeroMQ message that we're going to send.
      //! It has been prepared for us elsewhere.
      zmq::message_t* message = messages.front();
      // Remove the message from the list.
      // The size of this list is now the number of remaining messages to send.
      messages.pop_front();
      
      //! This contains the flags for the "send" operation.  The only flag that
      //! we're actually going to set is whether or not we have more messages
      //! coming, and that's for every message except the last one.
      int flags = messages.size() > 0 ? ZMQ_SNDMORE : 0;
      // Send the message to the endpoint.
      // Note that "endpoint1" is of type "zmq::socket_t*".
      endpoint1->send( message, flags );
      
      // We can now delete the message.
      delete message;
   }
}

I promise you that I had logged something before and after every statement, and everything was exactly as I expected it to be.  No exceptions were thrown.  There were no compiler warnings.  But no client on the other side of "endpoint1" ever got any of the messages that were being sent.  This drove me crazy.

The answer is that I was passing the wrong thing to "zmq::socket_t::send()".  Unlike the "recv()" ("receive") call, which takes a pointer to a "zmq::message_t", the "send()" call merely takes a reference to a "zmq::message_t".  Clearly this is a type error, so the compiler should have caught it.  But it didn't.

Here's the signature of the "send()" function.  Basically, it sends a message with some optional flags.
bool send( zmq::message_t& message, int flags = 0 );

I was sending it a "zmq::message_t*" and "int", so the compiler should have reported an error, since the type of the first argument was incorrect.  However, no error (or warning) was printed, and it compiled fine.  Even stranger, nothing bad happened when I called "send()".  Nothing good happened, either, but the code ran with the only strange symptom being that my "send()" calls seemed to do nothing.  The client on the other side of the ZeroMQ socket simply never received the message.

So, what's up with that?

It turns out that there is another "send()" function, one that takes three parameters.  It sends an arbitrary number of bytes with some optional flags.
size_t send( void* buffer, size_t length, int flags = 0 );

And there's the rub.

We've already established that the first "send()" function shouldn't work.  But here's a second "send()" function that does meet our signature.  As for the first parameter, a "zmq::message_t*" will be implicitly converted to "void*" in C++.  As for the second parameter, "int" will be implicitly converted to "size_t", which is just an unsigned integral type.  As for the third parameter, it is not specified, so it'll be set to zero.

This second "send()" is clearly not what I wanted to use, but the compiler doesn't know that I thought that the function required a pointer to a message, not a reference to it.  Since "ZMQ_SNDMORE" is defined to be the number 2, this call to "send()" only attempts to transmit two bytes (or zero bytes for the last message in the list, whose flags are 0).  And because a "zmq::message_t" is certainly larger than two bytes (it is actually at least 32 bytes), the data to copy, from the second "send()" function's perspective, is always present.  This means that in addition to not getting any warnings or errors, I am also guaranteed to have this code never crash, since it will always send the first two bytes of the "zmq::message_t" structure.

Naturally, the fix was to send the dereferenced version of the message, and everything worked fine after that.  The moral of the story here is to watch out for implicit "void*" conversion.  And if you are making a library that accepts a byte buffer for reading/writing purposes, please set the type of that buffer to some byte-oriented type, such as "char*" or "uint8_t*".  These would require explicit casts, thus preventing accidental use as in my case.
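
If you'd like to see the trap without pulling in ZeroMQ, here's a minimal sketch.  "FakeMessage" and "FakeSocket" are made-up stand-ins (they are not the real ZeroMQ classes); they only copy the shape of the two "send()" overloads so that we can record which one the compiler picks.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a message type; NOT the real zmq::message_t.
struct FakeMessage {
   char payload[ 32 ];
};

// Hypothetical stand-in for a socket; it only records which overload ran.
struct FakeSocket {
   enum class Overload { None, Message, Buffer };
   //! The overload that the most recent "send" call resolved to.
   Overload lastOverload = Overload::None;
   //! The byte count that the buffer overload was asked to send.
   std::size_t lastLength = 0;

   // Sends a whole message with some optional flags.
   bool send( FakeMessage& message, int flags = 0 ) {
      (void)message;
      (void)flags;
      lastOverload = Overload::Message;
      return true;
   }
   // Sends an arbitrary number of bytes with some optional flags.
   std::size_t send( void* buffer, std::size_t length, int flags = 0 ) {
      (void)buffer;
      (void)flags;
      lastOverload = Overload::Buffer;
      lastLength = length;
      return length;
   }
};

// Any object pointer converts to "void*" without a cast...
static_assert( std::is_convertible<FakeMessage*, void*>::value,
   "a message pointer silently becomes void*" );
// ...but not to a byte-oriented pointer type.
static_assert( !std::is_convertible<FakeMessage*, const std::uint8_t*>::value,
   "a byte-typed buffer parameter would have required an explicit cast" );

// Makes the mistaken call (pointer plus flags) and reports what was chosen.
FakeSocket::Overload overloadChosenForPointer( FakeSocket& fakeSocket, FakeMessage* message, int flags ) {
   fakeSocket.send( message, flags );
   return fakeSocket.lastOverload;
}
```

Passing the pointer quietly lands in the buffer overload, with the flags value reinterpreted as a length; passing the dereferenced message lands in the message overload, which is what I wanted all along.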

Sunday, April 21, 2013

Don't let std::stringstream.str().c_str() happen to you

If you're coming to C++ from C, then you will quickly learn to love std::stringstream.  These things let you quickly build out a (possibly huge) string by just tacking on string literals or any other variables to the end.  It's useful for building on-the-fly SQL queries or constructing configuration or connection strings that involve numbers (such as port numbers), since you don't have to pre-define a buffer of known length and snprintf onto the end of it and check for length issues and such.

And you'll love std::string, since that'll save you countless "strdup" calls and null checks.  std::string also has some extra powers that make him way more useful than character buffer manipulation, but still less amazing (and heavy) than std::stringstream.

Anyway, you'll also quickly find that most functions don't accept std::stringstream or std::string; rather, they accept "const char*", which is fine by me.  In fact, std::string has a "c_str" function that will return just such a pointer, and std::stringstream has a "str" function that will return a std::string, so that's great, right?

Yes, absolutely.

But watch out!

But watch out for this:
//! This is our string stream; we're just going to put something
//! in it for fun.  This example will use a made-up connection string.
std::stringstream myStringStream;
// Set up the "connection string"; note for example purposes that
// these could be variables of any type; much like the thing at the
// end is an integer.
myStringStream << "tcp://" << "localhost" << ":" << 9001;

// Create a character pointer so that another function can use it.
const char* myPointer = myStringStream.str().c_str();

// Use that in some function.
someCStyleFunction( myPointer );

Did you see the problem?

When "myPointer" was created, it called "c_str" on a temporary string that was only alive until the end of that statement.  After that line is over, the string that generated the character pointer has been destroyed; thus, the pointer to its data is invalid.

Valgrind will complain about this as accessing some memory that was deleted by the destructor of std::string, but you'll probably be too confused to realize what's going on.

In a single-threaded situation, you might be able to slide by without noticing this because nothing has used that memory just yet.  However, in a multi-threaded situation, that memory is essentially instantly whisked up by other threads for other uses.  And now your character pointer points to random other data.  Welcome to what might be hours of troubleshooting and debugging.

The proper solution

Since "c_str" returns a pointer to the internal buffer of a std::string, and since you don't have to free it, it means that the character pointer that it returns is only valid for the lifetime of the std::string that it came from.
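
Here's a small sketch that makes the lifetime visible without ever touching the dangling pointer.  "TrackedString" is a made-up wrapper (not part of the standard library) that counts how many instances are alive, and "makeString" plays the role of the "str" function by returning a temporary.

```cpp
#include <string>

//! The number of TrackedString objects that are currently alive.
static int liveCount = 0;

// Hypothetical stand-in for std::string that tracks its own lifetime.
class TrackedString {
public:
   explicit TrackedString( const std::string& text ) : text_( text ) { ++liveCount; }
   TrackedString( const TrackedString& other ) : text_( other.text_ ) { ++liveCount; }
   ~TrackedString() { --liveCount; }
   // Returns a pointer into the internal buffer, just like std::string does.
   const char* c_str() const { return text_.c_str(); }
private:
   std::string text_;
};

// Returns a temporary, much like std::stringstream::str() does.
TrackedString makeString() {
   return TrackedString( "tcp://localhost:9001" );
}

// Reports how many strings are alive while a C-style call is running.
int liveCountDuringCall( const char* ) {
   return liveCount;
}

// Reports how many strings are alive after the statement that grabbed
// the pointer has finished.
int liveCountAfterStatement() {
   const char* pointer = makeString().c_str();
   // The pointer is dangling from here on; it must never be dereferenced.
   (void)pointer;
   return liveCount;
}
```

Calling "liveCountDuringCall( makeString().c_str() )" sees one live string, because the temporary survives until the whole statement is done.  "liveCountAfterStatement()" sees none, because the temporary was destroyed as soon as the statement that created it ended.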

Our earlier example could be addressed in one of two ways.

The sneaky way

Don't let the std::string go out of scope by ending the line.  The "str" function's result, a std::string, won't be cleaned up until after "someCStyleFunction" completes, so this gets around the problem.  However, later expansion or debugging of the code might inadvertently re-introduce it.  Avoid this method.
//! This is our string stream; we're just going to put something
//! in it for fun.  This example will use a made-up connection string.
std::stringstream myStringStream;
// Set up the "connection string"; note for example purposes that
// these could be variables of any type; much like the thing at the
// end is an integer.
myStringStream << "tcp://" << "localhost" << ":" << 9001;

someCStyleFunction( myStringStream.str().c_str() );

The classy way

Actually store the std::string so that it goes out of scope when you want it to.  This makes it clear what the string is for and what its scope is.
//! This is our string stream; we're just going to put something
//! in it for fun.  This example will use a made-up connection string.
std::stringstream myStringStream;
// Set up the "connection string"; note for example purposes that
// these could be variables of any type; much like the thing at the
// end is an integer.
myStringStream << "tcp://" << "localhost" << ":" << 9001;

//! This is the string that we have created with our string stream.
std::string myString = myStringStream.str();

// Create a character pointer so that another function can use it.
const char* myPointer = myString.c_str();

// Use that in some function.
someCStyleFunction( myPointer );

Hopefully this might save you some time.  I spent hours researching the thread-safety of the STL for my current g++ version and was led down all kinds of wrong paths for a simple, simple scoping issue.

Sunday, February 24, 2013

Why C++11's std::chrono stuff is awesome

C++11 introduced the "std::chrono" namespace, and one of the coolest things about it is the set of standard "duration" objects.  These ones are already defined:

  • nanoseconds
  • microseconds
  • milliseconds
  • seconds
  • minutes
  • hours
(You can make your own, but that's not the point, here.)

Here are the two important points to take away from this, if nothing else:
  1. All of these are different template definitions of "std::chrono::duration".
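  2. All of these may be converted to and from each other.  Conversions that can't lose precision (such as seconds to milliseconds) happen implicitly; lossy ones (such as milliseconds to seconds) need an explicit "std::chrono::duration_cast".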
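
To make the second point concrete, here's a small sketch of both directions.  The function names "toMicroseconds" and "toWholeSeconds" are made up for this example.

```cpp
#include <chrono>

// Going to a finer unit is exact, so the conversion happens implicitly.
std::chrono::microseconds toMicroseconds( std::chrono::milliseconds value ) {
   return value;
}

// Going to a coarser unit can lose precision, so it has to be requested
// explicitly; duration_cast truncates toward zero.
std::chrono::seconds toWholeSeconds( std::chrono::milliseconds value ) {
   return std::chrono::duration_cast<std::chrono::seconds>( value );
}
```

With these, 500 milliseconds becomes 500,000 microseconds, and 2,500 milliseconds becomes 2 whole seconds.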

Real-world example

How often have you seen something like this?
void SomeClient::setTimeout( int timeout );

I've seen this over and over, and it's never easy to figure out what's going on.  Is the timeout in seconds?  Milliseconds?  Microseconds?  None of the above?  There's no easy way to tell; you have to rely on the documentation (if there even is any).

Then you end up with a call like this:
myClient.setTimeout( 500000 );

That doesn't really help.  So, whatever the function expects, this call is sending a lot of that unit of time.  Now it's (1) still dubious as to the units, and (2) hard to read, since there are just a bunch of numbers in a row.  We can't easily know what's going on.

Enter chrono.

Imagine that the function instead looked like this:
void SomeClient::setTimeout( std::chrono::microseconds timeout );

This immediately tells us that the preferred units of measurement are microseconds.  It probably also means that the function won't care about anything more precise than that.

So then my call ends up looking like this:
myClient.setTimeout( std::chrono::milliseconds( 500 ) );

Now it's clear that the timeout is being set to 500 milliseconds, and who cares what the function wants.  It doesn't matter.  The units that we pass in will be properly converted.  In this case, they'll be multiplied by 1,000.  Done deal.
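
Here's a minimal sketch of what such a class might look like.  Only the signature of "setTimeout" comes from the example above; the "timeout" getter and the member variable are assumptions that I've added so that the stored value can be inspected.

```cpp
#include <chrono>

// Hypothetical client; only the timeout handling is sketched here.
class SomeClient {
public:
   // Accepts any duration that converts to microseconds without loss.
   void setTimeout( std::chrono::microseconds timeout ) {
      timeout_ = timeout;
   }
   // Assumed getter, added only so that the example can be checked.
   std::chrono::microseconds timeout() const {
      return timeout_;
   }
private:
   //! The timeout, stored in the units that the class prefers.
   std::chrono::microseconds timeout_ = std::chrono::microseconds::zero();
};
```

After "myClient.setTimeout( std::chrono::milliseconds( 500 ) )", the stored count is 500,000 microseconds, and the caller never had to do the multiplication.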

What if...?

Finally, what happens if we tried to do it the old way?
myClient.setTimeout( 500000 );

This results in a compile-time error, since a bare integer won't implicitly convert to a duration.  Unreadability, you're doin' it wrong.
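
If you want the compiler to spell that out, here's a short sketch using type traits.  The helper "convertsImplicitlyToMicroseconds" is a made-up name for this example.

```cpp
#include <chrono>
#include <type_traits>

// A bare integer can be used to construct a duration explicitly...
static_assert( std::is_constructible<std::chrono::microseconds, int>::value,
   "std::chrono::microseconds( 500000 ) is fine" );
// ...but it will not convert on its own at a call site.
static_assert( !std::is_convertible<int, std::chrono::microseconds>::value,
   "setTimeout( 500000 ) does not compile" );
// A coarser duration, on the other hand, converts without any help.
static_assert( std::is_convertible<std::chrono::milliseconds, std::chrono::microseconds>::value,
   "setTimeout( std::chrono::milliseconds( 500 ) ) compiles" );

// Reports whether a value of type From could be passed directly to a
// function that takes std::chrono::microseconds.
template <typename From>
constexpr bool convertsImplicitlyToMicroseconds() {
   return std::is_convertible<From, std::chrono::microseconds>::value;
}
```

So the old-style call only works if you write the units out, which is exactly the point.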

Tuesday, February 5, 2013

It's 2013; use "getaddrinfo" for sockets

If you're anything like me, you've come to accept this as true:

  • Connecting to anything via a scripting language is easy; if it's a domain name, then it'll be translated.  If there's a port number at the back, then it'll be extracted.  IPv4 or IPv6?  No problem.
  • Doing the same amount of work in Linux using C/C++ is a nightmare.
Well, it turns out that that's not the case.  However, because of the usual man page obfuscation involving networking, it has seemed that way to me for years.

Calling "getaddrinfo"

The function "getaddrinfo" is meant to be the starting place for network connections.  It takes two strings, optional hints, and a pointer to use for your result structure.  Seems simple, right?

Conceptually, you call it in this way:
struct addrinfo* targetAddress = NULL;
int result = getaddrinfo(
   "your-thing.example.com",
   "80",
   NULL,
   &targetAddress
);
if( result != 0 ) {
   cout << "Error resolving address: " << gai_strerror( result ) << endl;
   // ...
}
// ...
freeaddrinfo( targetAddress );

Ultimately, this takes "your-thing.example.com", resolves it to an IP address (it could be IPv4 or IPv6, or both); takes "80", and resolves it to a port number (this could have been "http" and it would still work); and puts the resulting information into "targetAddress", which it allocates for you.

When you're done, free "targetAddress" using "freeaddrinfo".

And yes, this thing has its own error function, "gai_strerror".
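
Since the result has to be freed by hand, one option in C++11 is to let a "std::unique_ptr" do it for you.  This is just a sketch: "AddrInfoDeleter", "AddrInfoPtr", and "resolve" are names that I made up for this example, not part of any library.

```cpp
#include <memory>
#include <string>
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>

// Deleter that hands the result list back to "freeaddrinfo".
struct AddrInfoDeleter {
   void operator()( struct addrinfo* list ) const {
      freeaddrinfo( list );
   }
};

//! An owning pointer to a "getaddrinfo" result list.
typedef std::unique_ptr<struct addrinfo, AddrInfoDeleter> AddrInfoPtr;

// Hypothetical helper: resolves a host and service.  On failure, it
// returns an empty pointer and stores the "gai_strerror" text in
// "errorMessage".
AddrInfoPtr resolve(
   const std::string& host,
   const std::string& service,
   const struct addrinfo* hints,
   std::string& errorMessage
) {
   struct addrinfo* list = nullptr;
   int result = getaddrinfo( host.c_str(), service.c_str(), hints, &list );
   if( result != 0 ) {
      errorMessage = gai_strerror( result );
      return AddrInfoPtr();
   }
   // Ownership moves into the smart pointer; no manual free is needed.
   return AddrInfoPtr( list );
}
```

The list is then freed automatically when the returned pointer goes out of scope, including on early returns.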

Using a "struct addrinfo*"

The result of "getaddrinfo" can be immediately used by "socket":
int socketHandle = socket(
   targetAddress->ai_family,
   targetAddress->ai_socktype,
   targetAddress->ai_protocol
);
if( socketHandle < 0 ) {
   cout << "Could not create socket." << endl;
   // ...
}
int result = connect(
   socketHandle,
   targetAddress->ai_addr,
   targetAddress->ai_addrlen
);
if( result != 0 ) {
   cout << "Could not connect to target address." << endl;
   // ...
}

But don't you normally have to specify the kind of socket?  Is it a stream- or datagram-based?  Is it IPv4 or IPv6?  What network protocol should it use?

That's where things get interesting.  So, a "struct addrinfo*" actually represents a linked list.  "getaddrinfo" then returns a list of possible results.  If we told it only the domain name and port, then it would return many different "struct addrinfo*" instances, all linked together starting from the original one.  To access the next one, use "targetAddress->ai_next".

Now, that might be kind of cool in the sense that you can simply loop through the results and keep trying them until one works.  But you probably know in advance what you're looking for.  I know that I generally only want IP-version-agnostic name translation.
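
For reference, the "keep trying them until one works" loop might look something like this.  It's a sketch, and "connectToFirstWorking" is a made-up name; it assumes that you already have a result list from "getaddrinfo" and that you will free that list yourself afterward.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: walks the result list in order and returns the
// first socket that connects, or -1 if none of the entries worked.
int connectToFirstWorking( const struct addrinfo* list ) {
   for( const struct addrinfo* entry = list; entry != nullptr; entry = entry->ai_next ) {
      int socketHandle = socket(
         entry->ai_family,
         entry->ai_socktype,
         entry->ai_protocol
      );
      if( socketHandle < 0 ) {
         // This kind of socket isn't available here; try the next entry.
         continue;
      }
      if( connect( socketHandle, entry->ai_addr, entry->ai_addrlen ) == 0 ) {
         // The caller now owns this socket and must close it.
         return socketHandle;
      }
      // This entry didn't work out; clean up and move on.
      close( socketHandle );
   }
   return -1;
}
```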

This is where hints come in.

The third parameter to "getaddrinfo" is another "struct addrinfo*", but this time, you get to fill it out, and it'll use the contents of that structure as filter criteria.

For example, if you wanted to connect to "your-thing.example.com:80" over UDP, then you would just set up the hint accordingly:
struct addrinfo hint;
memset( &hint, 0, sizeof( hint ) );
// Allow either IPv4 or IPv6.
hint.ai_family = AF_UNSPEC;
// Use a datagram socket (since that's what UDP is).
hint.ai_socktype = SOCK_DGRAM;
// Don't worry about flags.
hint.ai_flags = 0;
// Use UDP, specifically.  "IPPROTO_UDP" (from <netinet/in.h>) is protocol number 17.
hint.ai_protocol = IPPROTO_UDP;

struct addrinfo* targetAddress = NULL;
int result = getaddrinfo(
   "your-thing.example.com",
   "80",
   &hint, //< Specify the hint here.
   &targetAddress
);

Now, since your protocol information is all filled out already, you're either going to get back something that you can immediately use, or "getaddrinfo" will fail.  Either way, the guesswork is gone.