The pi searcher uses a combination of two search techniques:
*Linear search* and *indexed search*. We'll see why
in a few minutes. This has changed over time - see the
Pi Searcher Updates list if you'd
like to learn more.

Before Pi day 2006, this was the only way the pi searcher
operated. The pi searcher had a file with the first 200M
digits of pi. When a visitor entered a search, the program
would open the file and go through it digit by digit, comparing
them against the search string.
For example, if you
searched for `9265`, the program did:

Pi: 141592653589793238462643... Query: 9265 (nope) Pi: 141592653589793238462643... Query: 9265 (nope) Pi: 141592653589793238462643... Query: 9265 (nope) Pi: 141592653589793238462643... Query: 9265 (nope) Pi: 141592653589793238462643... Query: 9265 (yep, it matches!)

For a computer, having a lot of Pi is like a human having a lot of pie: It makes it fat and slow. If you store Pi in a normal, human-readable text file (like you can create in Notepad, Emacs, vi, or whatever your favorite editor is), then Pi requires one byte per digit. So 200,000,000 digits of pi takes up 200,000,000 bytes, or 200 megabytes. Even with fast computers, it can still take a few seconds to read 200 megabytes of data from a hard disk, and a significant fraction of a second even to read through it when it's already in memory!

Version 2 of the Pi searcher used only half this space.
Observe that Pi in decimal consists of only 10
characters (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), unlike general
files that can contain 256 different ASCII characters (0-9, a-z, A-Z, and
all of the punctuation characters, and then a big set of "unprintable"
characters). Computers store information as **bits** - ones and zeros.
Normally, one character is stored as a byte, or 8 bits. (2^{8} = 256, the
number of different characters). (Using the more modern Unicode
representation, they can be even larger.)
But we only have 10 possibilities, so
the Pi searcher can store those using 4 bits each (2^{4} = 16, just a little
more than we need).

So, the solution: The pi searcher stores the digits of Pi packed two digits per byte. As a result, the file of Pi is totally unreadable to a human, but it's half the size of the human-readable version!

If you were counting carefully, you'll have noticed that we use 4 bits
(which can hold 16 values) to store 10 different values. So there's
actually more room for savings: We technically need only log_{2} (10) =
3.322 bits to store each number. Unfortunately, saving that extra .6
bits per digit makes the code a lot more complicated, and slower.

For longer search strings,
strings, the Pi Searcher gets a little bit more clever
and packs the first four digits of the search string
the same way it packs the digits of Pi. It then goes through and
compares the *packed* digits.

Pi: 14 15 92 65 35 89 79 32 38 46 26 43... Query: 65 35 ^^ both checked in one step, because they're just one byte Pi: 14 15 92 65 35 89 79 32 38 46 26 43... Query: 6 53 5 ^^^ still has to do multiple comparisons because they bytes don't line up

With this optimization, every other step, the Pi searcher only has to do one comparison to check two digits, instead of one comparison per digit. We also throw in a few little optimizations (like comparing four digits in a row before checking to see if any of the comparisons worked) that make the linear search faster on a modern CPU. This happens because it makes the resulting code have fewer branches (if-then choices), which can be fairly slow.

When developing the pi searcher, I benchmarked a number of cool, fancy
linear search algorithms that are supposed to be faster than the basic
algorithm. In general, the linear search I described above - even the
Pi searcher's optimized one - takes time proportional to the number of
digits of Pi *and* the number of digits in the search string:
Pretend you're comparing two strings: 11110 and 11111. You compare
the first digits -- match! So then you compare the second digits, and
the third... and you don't realize that they're different until you
get to the last digit.

There are improved search algorithms, such as Boyer-Moore or Knuth-Morris-Pratt, that require fewer comparisons. With the Pi searcher, it turns out that the complexity of the "faster" algorithms actually makes them slower for searching Pi. They are, however, cool algorithms; Wikipedia's String searching algorithm entry has more information about them if you're curious.

All of the optimizations I mentioned above help, but in the worst case, it still takes a few hundred million comparisons to figure out whether a string was found, which took about a second or two on our old hardware. A second isn't bad, but during the busiest periods of Pi day, the pi searcher handles about 5 searches every second. Clearly, taking one second is way too long.

The pi searcher now has a pre-computed index of
the location of all of the substrings of Pi. The index is computed
using the
suffix array
code from an implementation by
Sean Quinlan and Sean Dorward.
(Cool trivia: Suffix arrays were invented in 1990 by Udi Manber,
who subsequently was chief scientist at Yahoo!, then the chief
algorithms officer at Amazon, and who is now at Google.)
The index takes up 8 times the space
of the compressed digits, but that's only about a gigabyte.
This pre-computed index lets us jump almost immediately to the
place where a particular search string is.
But how can we index every search string, when people could search
for *anything*?

The answer is the beauty of suffix arrays. Let's say we're indexing the first 10 digits of pi:

1415926535

The suffix array maintains a list in lexicographical order of where strings start in pi. If we look at those 10 digits, we see that the "smallest" string is the one that starts with "141", and the next smallest is "159", and then "265", and so on. So the suffix array stores the positions at which each of those substrings begins: 0, 2, 5, 8, 1, 9, 7, 3, 6, 4, which correspond to all possible substrings of the first ten digits of pi, in increasing order:

0: 1415926535 2: 15926535 5: 26535 8: 35 1: 415926535 9: 5 7: 535 3: 5926535 6: 6535 4: 926535

So now, if someone asks for a string starting with "5", we can
look in the index and find it quickly, because the index is
in sorted order. A technique known as
binary search
does the trick nicely: We look in the middle of the list. If the
value at the place we look is *larger* than what we're
looking for, we know that what we want is in the first half, otherwise
it's in the second half. And so on, and so on. It's much like how
you might search through a phone book to find someone's number.

If you were sharp-eyed, you noticed that there were possibly
*many* things in the index that match a particular query. For
instance, if you searched for "5", you would have three possibilities.
To make things worse, the entries in the index are stored in
lexicographical order; the order has nothing to do with *where*
in Pi the entries occur. Note how the first "5" in there is actually
the one at the very end! But the Pi searcher needs to return the
results in order.

Resolving this isn't too hard: Once we find the values in the
index, we do a linear search through all of them that match the
search query and take the smallest. Unfortunately, that could
be a *lot* of values. For example, 10% of the substrings
of Pi begin with the digit "5". So if someone wants to find the
first 5 in Pi, we might have to scan through twenty million
entries still. :( That's no good. So this is where we combine
linear search and our index search:

- If the search string is 5 digits or smaller, do linear search. With five digits, we know that this should take around 100,000 digits to find.
- If the search string is six digits or longer, do an index search. We know that only 1 in 1,000,000 of the substrings will be a false match to a six digit index search, so we only have to do a linear search on 200 entries.

The observant will note that I didn't split things up optimally. The best way to do this would be to perform linear search on 4 digit strings (10,000 comparisons), and then do an index search where you then have to scan the remaining 2,000 entries, making the maximum number of comparisons smallest overall. (roughly the square root of the number of digits). The reason it's unbalanced is because the 5 digit linear search is really, really optimized, and the linear search in the index is slower. So the time balances out better.

In 2013, we introduced the interactive version of the Pi searcher, which searches dynamically as you type. Doing this substantially increased the number of queries that the pi searcher server had to handle, so we introduced two new improvements:

**Client-side computation**: When you visit the Pi Searcher now, your browser downloads the first ten thousand digits of Pi. When you first start typing into the box, your browser will search in those local digits instead of contacting our server. Only once you start asking for information not found in the first 10,000 will it move on.**Server-side improvements**: We've written these up as a blog post. The basic idea is that the Pi searcher code now runs 100% of the time, instead of just when someone makes a query, so it doesn't have to start up and shut down each time.

Hopefully, with these improvements, we'll be able to handle Pi Day 2014!

In 2013, we moved the Pi Searcher to Amazon EC2. It ran on a "small" instance, which is the cheapest and smallest type of instance we could buy that still has enough memory to run the pi searcher.

In 2015, we moved from EC2 to Google Compute Engine. I'm trying out running on a bigger instance - 1 vCPU and 3.75GB of RAM - to improve performance. It's a bit more expensive, but it gives more headroom to handle traffic spikes and Pi day.

From 2010 to 2013, The pi searcher ran on a 1.6Ghz Intel Atom-based server with 2GB of RAM and a 2 terabyte hard drive, purchased in late 2010. This is a very modest server by normal standards - but it's extremely energy efficient. The server itself is inspired by Dave's research in his day job, in the Fast Array of Wimpy Nodes (FAWN) project, which is working to improve the energy efficiency of large scale Internet services.

It took a little bit of creativity to fit this small computer into the rackmount case that the old pi searcher server (a 1.7Ghz pentium 4 server with 512M of RAM and an 80GB hard drive) came in, but it worked, and you can see the results here:

The rackmounted Pi Searcher was hosted by Xmission in their very cool co-location facility in Salt Lake City, UT. (Their fingerprint scanner for door access rocks.) The server runs a version of FreeBSD and uses the Apache web server. The first version of the Pi searcher was written in C, and I re-coded it in 2011 in C++.

Prior to that, the Pi searcher ran on a server at an ISP I helped start, ArosNet, from about 1996 to 1999.

Back to the Pi Searcher or...