How programmers multiply

This afternoon, a discussion about base64 strings prompted the question “How much is 64^2?”. I didn’t have a calculator in hand, and I figured I should be able to calculate 64*64 in my head. The thought process went something like this:

64 is 256/4, and I know by heart that 256 = 2^8, and 4 = 2^2, thus 64 = 2^(8-2) = 2^6.

By the same method, 2^6 * 2^6 = 2^(6+6) = 2^12.

I have memorized a few more powers of two besides 256, and the easiest to remember is 2^10 = 1024.

2^12 = 2^10 * 2^2, so the answer I’m looking for is 1024 * 4 = 4096.

64^2 = 4096. Easy to do in your head when you know arithmetic (or you’ve worked too much on CPUs that don’t have MUL or DIV operators).

Code Kata: Prefix search (Part 3: more fun with tries)

In my last post, I demonstrated that crit-bit trees are really fast. This time, I want to talk a little bit about my specific implementation of these tries, and some interesting applications you can use them for.

I also want to acknowledge some of the prior art that I looked at before deciding to roll my own implementation. Professor Bernstein himself has written an implementation himself for his portable qhasm assembler. But since it’s part of a larger software product, I didn’t want to disentangle it. Adam Langley has an implementation on github with some outstanding documentation that was very useful, but it only stores zero-terminated strings, not arbitrary data. While I didn’t read his code too closely to keep my own implementation challenging, I did steal the clever bit-mask trick from it.

More Things To Do With Tries

While they are really good at string lookup, what crit-bit trees are best at is finding strings with a common prefix. I implemented two separate functions for this:

const void * cb_find(critbit_tree * cb, const void * key, size_t keylen);

int cb_foreach(critbit_tree * cb, const void * key, size_t keylen, int (*match_cb)(const void * match, const void * key, size_t keylen, void *), void *data);

The first function is straightforward, it looks for an exact match of the first keylen bytes of key in the given tree. The second function calls the match_cb callback on every match, which is a much easier pattern to implement than trying to return all matches. Where would you store them? Dynamic memory allocation is right out, of course. I compromised a bit by writing the b_find_prefix function, which takes a buffer supplied by the caller, but beare that paging through the results by calling it repeatedly is quadratic in complexity, so you shouldn’t do that. It’s really just a hack.

Use As A Fast Key-Value Store

One really nifty thing that I have been using this code for is as a fast key-value store. You could insert a couple of strings like “config_value=42” in the tree and look for the first match with the prefix “config_value=”. There are two wrapper macros in critbit.h to simplify this, and being able to store data other than zero-terminated strings means I am not limited to storing string values. Here’s some example code that shows their use:

int i = 42;
char buffer[20];
const char * key = "herpderp";
void * match;
size_t len;

len = cb_new_kv(key, strlen(key), &i, sizeof(int), buffer);
cb_insert(&cb, buffer, len);
if (cb_find_prefix(&cb, key, strlen(key)+1, &match, 1, 0)) {
cb_get_kv(match, &i, sizeof(int));
}

What these macros essentially do is this: cb_new_kv creates a key|0|value memory block (written into buffer). Once you insert it, you can search for it by prefix-search for the key plus its zero-terminator, and use cb_get_kv to extract the value from the first match.

Testing

I approached this project with a test-first attitude, writing tests before writing the actual implementation, and fixing bugs until the tests passed. I wrote almost as many lines of tests for this as I wrote actual code, and it still wasn’t enough – one or two small issues slipped through. Lots of pointers and memory-pokery is always tricky, and I think this is one of those cases where it should be undisputed that unit tests are just incredibly useful.

Code Kata: Prefix search (Part 2: crit-bit trees)

The problem with the prefix search problem I described in my previous post is that there is most likely no good data structure in your language to do this with. There certainly isn’t one in C. But the good news is that we get to build our own! How often do we still get to write data structures from scratch in this day and age? And when is it ever a good idea?

Of course, you should always try to go with the simplest thing that works: In my case, that’s building an array of your million strings, and to search for a prefix, just strncmp each entry. We’ll benchmark that, see how well it performs, and use it as our baseline for a good implementation. 

Here’s a link to the naive implementation that I wrote. Finding a needle in a haystack of a million strings takes almost not time at all, but building the array takes a while, so I’ll just do 1000 searches, each for a single needle, in a set of  a million strings, followed by 1000 searches for a prefix that occurs exactly 11 times:

$ bin/naive -x 100 1m 111111
init: 0.30s
find: 0.82s
$ bin/naive -x 1000 1m 111111
init: 0.26s
find: 7.51s

8 ms per lookup may not sound like much, but it’s actually not a very good time at all. Excellent! We like this in a naive solution that we want to beat 🙂

Introducing crit-bit trees

The data structure we want for this sort of thing is conventionally known as a radix-tree or trie. I think that’s either pronounced like try or like tree, but either way it’s really confusing. Also, it really helps to think of the data we are operating on as a string of bits, not characters. The specific type of trie that I decided to implement is from this paper by the always amazing djb.

The general idea of this approach is that we have a binary tree, and if some of our bit-strings have a common prefix x such that one string has a prefix of x0 and the other x1, then the tree shall contain a node where all the strings beginning with x0 are found to the left of the node, and all strings starting with x1 are found to the right.

What’s neat is taht we don’t need a node for every possible bit position either, but only for those where actual strings in or dataset diverge. To give an example, if we had only two strings 0001001 and 0100100, (these start to differ at the second bit), then we create an internal node in my tree that contains the bit-position, and two child pointers which point at external, or leaf, nodes that contain our two strings. To find a string in this, we follow a trail of internal nodes from the root, check the bit position in our needle string, and recurse down to the respective child node. Once we reach a leaf, we compare our needle to the string stored here, because it’s possible that our needle and the stored string differ at other bits that we did not check on our way down.

Apart from being a data structure for key lookups, an interesting property of tries is that all entries beginning with a particular substring are children of the same node. That makes this data structure the perfect candidate for our problem of prefix search. So, how fast is it? Actually, it’s way too fast to measure with just 100 searches, but we can do simple divisions, so let’s do a million lookups and prefix searches instead:

$ bin/benchmark -x 1m 1m 111111
init: 0.62s
find: 0.25s
$ bin/benchmark -x 1m 1m 11111
init: 0.66s
find: 1.09s

250 nanoseconds per lookup, and a microsecond per prefix-search. That’s a pretty neat improvement of four orders of magnitude! In my next post, I am going to talk a little bit about the actual implementation and interface. I have put the source code on github.

Some Thoughts

  1. My dataset is pretty small. A million keys like this can probably fit into a modern CPU’s L2 cache if they are small enough. But trust me, the numbers are only getting better when the dataset grows!
  2. Each internal node is really small: I need two pointers, a word to store the bit position (my implementation is excessive and uses size_t, plus a byte to store a bit mask. You could go a little smaller at the cost of performance and/or size limitations, but it’s on the order of one or two dozen bytes. All internal nodes are the same size, so you would want to pair this with a clever memory allocator.
  3. Initialization is somewhat slower, because creating the tree is more complicated than just doing a couple of allocations. However, you could conceivably store the final tree in a compact format on disk once it’s generated, and load it as a big chunk of memory, where you replace a pointer to an internal node with an integer index into your array of internal nodes.
  4. The crit-bit tree also allows fast deletion, which I haven’t benchmarked, mostly because I have nothing to compare it to.
  5. All times are taken with clock() on a JoyentCloud VM. I have no idea what kind of CPU that translates to.

Code Kata: Prefix search (Part 1: a challenge)

Algorithms and data structures are fun. The other week, I had the chance to implement something I needed for a project: Given a large set of strings, find all the ones that begin with a given prefix. For example, say you have a phone book, and you want to find all the people whose name starts with Joe. Or a large list of IP addresses, and you want to find all the ones that start with 192.168.

This is a fun problem, because I don’t think any popular programming languages have good data structures to do this. Let me know if I’m wrong, but I can’t think of a naive way to implement this in Javascript or with the STL in C++.

I wrote my own solution in C, based on a paper by D. J. Bernstein. It is super fast, and has some neat additional properties that I want to write about, but I’ll save that for one of my next posts (have to get this post count up). I’m putting the resulting code on github for the world, but I’ll give you a chance to think about the problem before I spoil it 😉

If you are tempted to write your own solution, here’s a clever little benchmark I came up with: Insert all the numbers from 0 to 999,999 into your data structure (as strings), then find all the ones where the first character is a ‘1’. You should get exactly 111,111 matches, and the entire process should probably take less than a second.

SDL on Android

Here’s a quick summary of how I got SDL to work on Android. First off, this is my setup:

  • Android NDK r6
  • Android SDK r12
  • SDL-2.0.0-6284

The first piece of bad news is that Android support is only in the upcoming SDL 2.0, not in the current stable 1.2. Although I’m sure it could be backported, SDL2 is pretty similar, so I decided to upgrade. The README.android file does a pretty good job of explaining the steps, but there were a few kinks:

  1. in jni/src/Android.mk, the line LOCAL_SHARED_LIBRARIES := SDL is wrong. That needs to be SDL2
  2. Java didn’t like that all the classes in src/org/libsdl/app/SDLActivity.java are in one file. I had to make separate SDLMain.java and SDLSurface.java files.
  3. In the onTouch method, I get a compile error (cannot find symbol) for event.getActionIndex() and event.getActionMasked(), so I commented out some of that code and haven’t used touch events yet. I still need to crack this nut.
  4. Because my SDK is installed in C:Googleandroid-sdk-windows, I had to edit local.properties to say sdk.dir=/Google/android-sdk-windows

Now I have one of the SDL test app running on my phone, drawing rectangles. Time to see if I can do something useful with that!

Windows SDK installation woes. Update: Solved, sort of.

My computer doesn’t have a dxsdkver.h file. That should be in the DirectX SDK. Except that no longer exists, and is now part of the Windows SDK (latest version: 7.1). That should have been installed with Visual Studio, but somehow my computer is very special, once again.

So today I tried installing the Windows SDK, and I get this feeling of deja vu:

A problem occurred while installing selected Windows SDK components.

Setup could not find the file WinSDKhelpWinSDKHelp_x86.msi at any of the specified source locations http://download.microsoft.com/download/A/6/A/A6AC035D-DA3F-4F0C-ADA4-37C8E5D34E3D/setup

I cannot install the Windows SDK, and I have been here before! In fact, my Firefox history still remembers half the links that a cursory Google search lists as visited. I hope it doesn’t mean I have to reinstall Visual Studio.

Update 1: I downloaded the Windows SDK 7.1 ISO, ran the setup from that, and now I get this message:

Installation of the “Microsoft Windows SDK for Windows 7” product has reported the following error: Please refer to SamplesSetupHTMLConfigDetails.htm document for further information.

What the Hell? That tells me nothing. Google points me to this page of regedit fuckery, which seems like something I tried last time I was here, but will be the next thing I try.

Update 2: Well, that didn’t work either. The next advice I followed was from this forum post, which suggested to uncheck the compilers (since I believe I have the SP1 compiler installed), but again, no dice. Time to try installing one component at a time, and fine-combing the log file each time one fails.

Update 3: Never mind what I just wrote. It turns out even when I deselect every single option in the installer, the installation still fails.

=== Logging started: 2/19/2012  18:05:16 ===

Action start 18:05:16: INSTALL.

Action start 18:05:16: DDSE_CA_Uninstall_InstallExecuteSequenceStarts_x86.

02/19/12 18:05:16 DDSet_Status: LANGID: 1033

02/19/12 18:05:16 DDSet_Entry: ImmediateDispatch: DDSE_CA_Uninstall_InstallExecuteSequenceStarts entry

02/19/12 18:05:16 DDSet_Error: Patch Hooks: Missing required property ‘ProductFamily’: Setup cannot continue.

02/19/12 18:05:16 DDSet_Warning: Setup failed while calling ‘getDLLName’. System error: Cannot create a file when that file already exists.

Every log file has entries like this, which seems to indicate that something that’s already installed is causing an issue with this installer. WTF.

Update 4: Progress!

Further on, the log file said that

vcredist_x86.exe installation failed with return code 5100

On a whim, I uninstalled the Visual C++ 2010 x86 Redistributable from the control panel, then tried the “empty” installation again, and Heureka! It worked.

Maybe now I can install the full product?

Installation of the “Microsoft Windows SDK for Windows 7 Compilers for x86” product has reported the following error: Fatal error during installation.

Let me just uninstall “Microsoft Visual C++ Compilers 2010 SP1 Standard – x86. I’m pretty sure I can install them again later. and oh wonder, the installation finishes!

Still doesn’t include dxsdkver.h, though. I think my initial research must have been faulty.