brought to you by REVSYS
Christmas is weird.  What other time of year do you sit in front of a dead tree and eat candy out of your socks?

Unix Sockets

Most larger sites chasing performance can't use this tip, since their services are spread across several machines, but it's often forgotten and deserves a place in your bag of tricks.

If your apps are all contained on the same server or cloud instance, you can get some big wins by not using the usual TCP socket when connecting to other software. We don't think about it often, but TCP does a lot of work, and most of that work is only necessary for communicating with remote systems. By using a Unix domain socket instead, you skip that unnecessary work entirely.
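To make the difference concrete, here is a minimal sketch using only Python's standard library. It assumes a Unix-like system, and the socket path and the helper names (`echo_once`, `roundtrip_over_unix_socket`) are made up for illustration. The point is simply that the address is a file system path rather than a host and port:

```python
import os
import socket
import tempfile
import threading


def echo_once(server_sock):
    """Accept one connection and echo back whatever it sends."""
    conn, _ = server_sock.accept()
    with conn:
        conn.sendall(conn.recv(1024))


def roundtrip_over_unix_socket(message: bytes) -> bytes:
    """Send `message` to a local echo server over a Unix domain socket."""
    # The "address" is a path on disk (hypothetical name), not host:port.
    sock_dir = tempfile.mkdtemp()
    path = os.path.join(sock_dir, "demo.sock")

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(1)
    worker = threading.Thread(target=echo_once, args=(server,), daemon=True)
    worker.start()

    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
            client.connect(path)
            client.sendall(message)
            reply = client.recv(1024)
    finally:
        worker.join(timeout=5)
        server.close()
        os.unlink(path)      # the socket file is not removed automatically
        os.rmdir(sock_dir)
    return reply


if __name__ == "__main__":
    print(roundtrip_over_unix_socket(b"ping"))
```

Apart from `AF_UNIX` and the path, the code looks just like its TCP equivalent, which is why switching an existing service over is usually a configuration change rather than a code change.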

So where can we use this trick? Some of the best candidates are:

  • PostgreSQL / MySQL
  • Memcached
  • Redis


Configuration is pretty simple: instead of a host and port, you specify a file system path to the socket. For example, to set up Django and memcached you would use:

    CACHES = {
        'default': {
            'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
            'LOCATION': 'unix:/tmp/memcached.sock',
        },
    }

All we've had to do here is change the LOCATION to be our Unix socket's path instead of the usual host and port. Conversely, when we outgrow a single-box scenario, we just need to adjust this setting to point to the IP and port of our, likely dedicated, memcached system.

The configuration for individual server software will vary, but you get the idea.
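As one more illustration, here is roughly what the same change could look like for PostgreSQL in a Django settings file. This is a sketch, not a drop-in config: the database name, role, and socket directory are assumptions, and the directory your PostgreSQL install actually uses depends on your platform and `postgresql.conf`.

```python
# settings.py fragment (a sketch; names and paths below are assumptions)
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'myapp',   # hypothetical database name
        'USER': 'myapp',   # hypothetical role
        # A HOST that starts with "/" is treated as the directory holding
        # the PostgreSQL socket file rather than as a network hostname.
        'HOST': '/var/run/postgresql',
        'PORT': '',
    }
}
```

Moving back to TCP later means replacing `HOST` with a hostname or IP address and setting `PORT`.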

How much faster is it?

Quite a bit faster, actually. This Redis benchmark shows a roughly 40% boost for simple GET and SET operations.

Bruce Momjian, of the PostgreSQL project, ran some numbers and saw a 30% improvement using Unix Sockets over a TCP/IP loopback.

This makes sense, considering Unix domain sockets end up copying less data and incur fewer CPU context switches on most systems. With less work comes better performance.
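If you want a rough feel for the difference on your own machine, a small ping-pong timing like the sketch below can help. It is not how the benchmarks mentioned above were run: it uses only Python's standard library, the function names are invented for this example, and the numbers are dominated by interpreter overhead, so treat the result as a sanity check rather than a measurement of Redis or PostgreSQL.

```python
import os
import socket
import tempfile
import threading
import time


def echo_server(listener):
    """Echo bytes back on one connection until the client disconnects."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(64)
            if not data:
                break
            conn.sendall(data)


def time_roundtrips(family, address, rounds=2000):
    """Return the seconds taken for `rounds` small request/reply exchanges."""
    listener = socket.socket(family, socket.SOCK_STREAM)
    listener.bind(address)
    listener.listen(1)
    thread = threading.Thread(target=echo_server, args=(listener,), daemon=True)
    thread.start()

    with socket.socket(family, socket.SOCK_STREAM) as client:
        # getsockname() gives the path for AF_UNIX, (host, port) for AF_INET.
        client.connect(listener.getsockname())
        start = time.perf_counter()
        for _ in range(rounds):
            client.sendall(b"ping")
            client.recv(64)
        elapsed = time.perf_counter() - start

    thread.join(timeout=5)
    listener.close()
    return elapsed


def compare(rounds=2000):
    """Time the same exchange over a Unix socket and over TCP loopback."""
    sock_dir = tempfile.mkdtemp()
    path = os.path.join(sock_dir, "bench.sock")  # hypothetical socket path
    try:
        unix_time = time_roundtrips(socket.AF_UNIX, path, rounds)
        # Port 0 asks the OS to pick any free TCP port on loopback.
        tcp_time = time_roundtrips(socket.AF_INET, ("127.0.0.1", 0), rounds)
    finally:
        if os.path.exists(path):
            os.unlink(path)
        os.rmdir(sock_dir)
    return {"unix": unix_time, "tcp": tcp_time}


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds:.4f}s")
```

Results will vary from run to run and between systems, so run it a few times before drawing any conclusions.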

Other Scenarios

We've already talked about the usual single server instance scenario that many web apps can take advantage of, but here are some other ideas where you can make use of this:

  • With Jenkins or your CI system, to speed up your database-related testing
  • Worker-type scenarios, where a system is given tasks to perform and the local state information doesn't need to be shared. For example, a single box running a web spider that keeps a Bloom filter of visited links in Redis. Ultimately the results of the work are pushed to a central location, but the rest of the in-process state isn't shared.
  • Local data processing. Often you need to process a large amount of data locally and only once, before pushing the results into another system.

While this may not be the most useful tip in our 12 Days of Christmas Performance, we hope you learned something new this holiday season!