Sphinx Setup on Mac OS X 10.7

My previous post covered how to use Thinking Sphinx (a Ruby gem for Sphinx) with MySQL. This time I would like to share how to set up Sphinx itself, specifically on a Mac OS X 10.7 machine. I assume you already have an RDBMS (MySQL or PostgreSQL) installed on your machine.

First, go to the Sphinx download page and download the source tarball. Extract the file and change into the extracted folder. Then run these commands to install Sphinx on your Mac.

> ./configure
> make
> sudo make install
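A small side note for PostgreSQL users: as far as I know, the plain ./configure enables MySQL support by default, while PostgreSQL support has to be requested with the --with-pgsql flag. The sketch below only shows the configure step under that assumption; the guard around it is my own addition so that nothing runs outside the source folder.

```shell
# Flag that asks for PostgreSQL support; MySQL support is on by default.
CONFIGURE_FLAGS="--with-pgsql"

# Only run when we are inside the extracted Sphinx source folder.
if [ -x ./configure ]; then
  ./configure $CONFIGURE_FLAGS
else
  echo "run this from the extracted Sphinx source folder"
fi
```

After that, make and sudo make install are the same as above.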

Yep! Sphinx is now installed on your Mac. One thing remains: configuring it. Create a file named sphinx.conf in /usr/local/etc. In my case, that folder is the default location where Sphinx looks for its configuration file. You can use a custom name and location, but then you need to pass the exact path after the --config parameter whenever you call one of the Sphinx commands (indexer, search, searchd, etc.).
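To make the --config part concrete, here is a minimal sketch. The path in SPHINX_CONF is purely hypothetical, so replace it with wherever you keep your own file. The guard is my own addition: it only runs the commands when indexer is installed and the file exists.

```shell
# Hypothetical custom location of the configuration file.
SPHINX_CONF="$HOME/sphinx/sphinx.conf"

if command -v indexer >/dev/null 2>&1 && [ -f "$SPHINX_CONF" ]; then
  # Build every index defined in the custom file.
  indexer --config "$SPHINX_CONF" --all
  # Start the search daemon with the same file.
  searchd --config "$SPHINX_CONF"
else
  echo "indexer or $SPHINX_CONF not found, nothing was run"
fi
```

Keeping the path in one variable means every command is guaranteed to read the same file.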

This is the content of my sphinx.conf file:

source users {
	type			= mysql
	sql_host		= localhost
	sql_user		= root
	sql_pass		=
	sql_db			= researchsql_development
	sql_port		= 3306	# optional, default is 3306
	sql_sock		= /Applications/XAMPP/xamppfiles/var/mysql/mysql.sock

	sql_query		= SELECT id, email, name, bio, created_at FROM users

	sql_field_string	= email
	sql_field_string	= name
	sql_field_string	= bio

	sql_attr_timestamp	= created_at
	sql_ranged_throttle	= 0

	sql_query_info		= SELECT * FROM users WHERE id=$id
}

index usersidx {
	source			= users

	path			= /var/data/test1
	docinfo			= extern
	mlock			= 0
	morphology		= none

	min_word_len		= 1
	charset_type		= sbcs

	html_strip		= 0
}

indexer {
	mem_limit		= 32M
}

searchd {
	listen			= 9312
	listen			= 9306:mysql41

	log			= /var/log/searchd.log
	query_log		= /var/log/query.log
	read_timeout		= 5
	client_timeout		= 300
	max_children		= 30
	pid_file		= /var/log/searchd.pid
	max_matches		= 1000
	seamless_rotate		= 1
	preopen_indexes		= 1
	unlink_old		= 1
	mva_updates_pool	= 1M
	max_packet_size		= 8M
	max_filters		= 256
	max_filter_values	= 4096
	max_batch_queries	= 32
	workers			= threads
}

Some notes regarding the configuration above:

  1. Don’t forget to change the SQL settings in the source block. In my case I need to define the SQL socket (sql_sock), so you need to know yours.
  2. I use a users table as an example. FYI, it holds 1 million records, so I suggest you load a lot of records too, so that your test is meaningful.
  3. The rest are default settings, so read the Sphinx documentation for a more comprehensive understanding of Sphinx itself, its indexes, and so on.
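If you are not sure where your MySQL socket lives (note 1), one way to find out is to ask the running server itself. This is only a sketch: it assumes the mysql command-line client is on your PATH, that the server accepts TCP connections on 127.0.0.1, and that the root user has an empty password as in my configuration above.

```shell
# User name taken from sql_user in the configuration above.
MYSQL_USER="root"

if command -v mysql >/dev/null 2>&1; then
  # Ask the server which socket file it is using.
  mysql -u "$MYSQL_USER" -h 127.0.0.1 -e "SHOW VARIABLES LIKE 'socket'" \
    || echo "could not connect, check that MySQL is running"
else
  echo "mysql client not found on PATH"
fi
```

The value it prints is what goes into sql_sock.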

After you finish the configuration, index the data by running > indexer --all. Then start Sphinx by running > searchd. Build the indexes first, because searchd can only serve indexes that have already been built; if searchd is already running, run > indexer --all --rotate instead. If you hit a permission problem, run the commands with sudo. Next, test the search. In my case, I tested it with the Sphinx API for PHP by running > php test.php <query_string>. You can find the APIs in the api folder inside the Sphinx folder you extracted earlier. Even with a lot of records, you can search through massive amounts of data in only milliseconds. Wow!
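Since the searchd block above also listens on port 9306 with the mysql41 protocol, another quick way to test is the regular mysql command-line client, which can talk SphinxQL to searchd. This is just a sketch, assuming the mysql client is installed and searchd is running; the search word in QUERY is only a placeholder.

```shell
# Placeholder search term; use a word that exists in your own data.
QUERY="john"

if command -v mysql >/dev/null 2>&1; then
  # -h 127.0.0.1 forces TCP, so the client reaches searchd on 9306
  # instead of the local MySQL socket.
  mysql -h 127.0.0.1 -P 9306 \
    -e "SELECT * FROM usersidx WHERE MATCH('$QUERY') LIMIT 5" \
    || echo "could not reach searchd on port 9306"
else
  echo "mysql client not found on PATH"
fi
```

The index name usersidx comes from the index block in my configuration, so change it if yours is named differently.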

I guess that’s it for this simple setup and configuration. If you need an advanced configuration or test case, you should study Sphinx in more detail. So, happy reading and cheers!
