Ruby on Rails: Sphinx, thinking-sphinx and PostgreSQL on Mac OS X

Premise

On a project of mine, I needed a full text search feature and after a bit of digging, decided to go with Sphinx. It seems like a very proven search engine and with Rails, it’s easy to use through the thinking-sphinx plugin. Normally I just go with the standard SQLite database, if the application doesn’t require a high powered database backend. Unfortunately Sphinx does not yet work with SQLite and as far as I know, needs to run against either MySQL or PostgreSQL. The choice of either MySQL or PostgreSQL is a bit religious I feel. They are both battle hardened DBMS’s and I won’t make compelling argument towards either one. This time however, PostgreSQL is the favored candidate.

Requirements

Before starting, make sure you can compile custom software on your system. For this it is assumed you have the following installed:

PostgreSQL

First things first, we need to install our database server and enable access from rails. PostgreSQL is readily available through MacPorts, so open up a terminal and enter the following command:

sudo port install postgresql84 postgresql84-server

When that is done we need to add the bin folder of the PostgreSQL installation to our PATH:

nano ~/.bash_profile

And then, depending on where your MacPorts installation puts your ports, mine is under /opt/local, add the following line to the file:

export PATH=/opt/local/lib/postgresql84/bin:$PATH

Now we can start up the server using the following command:

sudo launchctl load -w /Library/LaunchDaemons/org.macports.postgresql84-server.plist

Or the shortcut version:

sudo port load postgresql84-server

This goes to the background and makes sure the server is started up again after reboot. Next we want to create a default database and make the server listen for connections:

sudo mkdir -p /opt/local/var/db/postgresql84/defaultdb
sudo chown -R postgres:postgres /opt/local/var/db/postgresql84
sudo su postgres
initdb -D /opt/local/var/db/postgresql84/defaultdb
pg_ctl -D /opt/local/var/db/postgresql84/defaultdb -l ~/postgresql.log start

The logfile postgresql.log is placed in /opt/local/var/db/postgresql84/.

Although I’m a fan of using the command line, I recommend using a tool such as pgAdmin for everyday administration tasks, such as adding new users etc.

Per default, you can login with postgres as user, with a blank password. This is the default superuser account so I suggest changing it sooner rather than later.

Sphinx

Now Sphinx is available through MacPorts as well, even with a postgresql84 variant. I’ve tried installing this version, but couldn’t seem to get through without error. Somehow it still maintained some library dependencies for mysql5 and thus wouldn’t compile. So, instead we opt for a manual installation. First we need to install a couple of dependencies for Sphinx. Download these two archives and extract their content:

You should now have two folders named expat-2.0.1 and libiconv-1.13. From a terminal, navigate to each folder and type the following commands:

./configure --prefix=/usr/local
sudo make && sudo make install

Now it’s time to install Sphinx. Download it and extract as before:

Navigate to the sphinx-0.9.9 folder and enter:

export LDFLAGS="-L/usr/lib"
./configure --prefix=/usr/local --with-pgsql --without-mysql
sudo make && sudo make install

And that is it for Sphinx.

thinking-sphinx

Assuming you already have a Rails project, install the thinking-sphinx plugin like so:

script/plugin install git://github.com/freelancing-god/thinking-sphinx.git

This will add a bunch of new features and a couple of rake tasks. Now, let’s say we have a model Employee, with first names and last names and we would like to be able to search employees by either one. In our model, we can define what we want to have indexed, using the define_index method:

class Employee < ActiveRecord::Base
 
    define_index do
        indexes first_name
        indexes last_name
        set_property :enable_star => true
        set_property :min_infix_len => 4
    end
 
end

The set_property calls, are to enable wildcard searching with asterisks, *, and to index substrings. Eg. If we have an employee named “McLovin”, then we would get matches on “McLo”, “cLov”, “Lovi” and so forth.

Before we can utilize these indices, we need to tell Sphinx to do an index run. In your projects directory, issue these two rake tasks:

rake thinking_sphinx:index
rake thinking_sphinx:start

The Sphinx server should now be running in the backround and in your controller, you can now search for Employee names like so:

class SomeController < ApplicationController
 
    def index
        @employee = Employee.search params[:first_name]
    end
 
end

For more detailed information, see the documentation for Sphinx and thinking-sphinx:

References