My previous research was about building a search engine, and now I want to share it with you guys.
If anyone asks you how to perform a simple search in MySQL, you may suggest adding an INDEX or FULLTEXT INDEX to the columns and using a LIKE or MATCH query. That's not wrong, but in some circumstances I suggest you reconsider using those queries. A couple of days ago, I ran a simple search with a MATCH query against about a million records. Do you know how long it took to retrieve the data? Well, about 2 minutes. My boss was mad at me and said that nobody would be willing to use the application if it had that kind of performance.
One solution you can use is Sphinx. It provides several ways of indexing your data, so the application never actually hits the database for the indexed queries. Put simply, you give Sphinx the queries that you want indexed, Sphinx creates one or more index files (depending on how many queries you defined), and then the application performs its searches against those index files. I'll tell you this: to retrieve about 80000 records from a table containing 1000000 records, Sphinx gave me the results in milliseconds. What a performance, right?
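To make the "search the index, not the table" idea more concrete, here is a tiny sketch of an inverted index in plain Ruby. It is only an illustration of the general principle, not how Sphinx is actually implemented, and the names build_index, lookup, RECORDS and INDEX are made up for this example.

```ruby
# Toy inverted index: maps each lowercase word to the ids of the records
# that contain it. Illustration only; Sphinx's real index format is different.
def build_index(records)
  index = Hash.new { |hash, word| hash[word] = [] }
  records.each do |id, text|
    # Index each distinct word of the record once.
    text.downcase.scan(/\w+/).uniq.each { |word| index[word] << id }
  end
  index
end

# Looking a word up is a single hash access instead of a scan over every row.
def lookup(index, word)
  index.fetch(word.downcase, [])
end

# Hypothetical sample data standing in for database rows.
RECORDS = {
  1 => "Ruby on Rails developer",
  2 => "MySQL database administrator",
  3 => "Rails and MySQL enthusiast"
}

INDEX = build_index(RECORDS)
p lookup(INDEX, "rails")
```

The expensive part (reading every record) happens once when the index is built; each search afterwards only touches the index.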
What I want to share right now is how to perform a Sphinx search from Ruby on Rails, using the Thinking Sphinx gem. I assume you have already installed Sphinx, so I won't explain the installation here. If you need it, I may add that topic to this post, or cover it in a separate post sometime.
Let's begin by installing the Thinking Sphinx gem with this command:
    gem install thinking-sphinx --version 2.0.10
Or you can add this line to your Gemfile and run bundle install:
    gem 'thinking-sphinx', '2.0.10'
Now add a file named sphinx.yml under the config folder, and put this code into it:
    # Don't forget to replace RAILS_ROOT with the actual location to your rails project folder
    development:
      address: 127.0.0.1
      port: 9312
      config_file: "RAILS_ROOT/config/sphinx/development.sphinx.conf"
      searchd_log_file: "RAILS_ROOT/log/sphinx/searchd.log"
      query_log_file: "RAILS_ROOT/log/sphinx/searchd.query.log"
      pid_file: "RAILS_ROOT/log/sphinx/searchd.development.pid"
    test:
      port: 9312
    production:
      port: 9312
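If you would rather not replace every RAILS_ROOT placeholder by hand, a small Ruby script can print the same settings with real paths filled in. This is just a sketch: sphinx_settings is a helper name I made up for illustration, and "/path/to/app" is a placeholder for your own project folder.

```ruby
require "yaml"

# Hypothetical helper (not part of Thinking Sphinx): builds the same settings
# as the sphinx.yml above, with RAILS_ROOT replaced by the given root folder.
def sphinx_settings(root)
  {
    "development" => {
      "address"          => "127.0.0.1",
      "port"             => 9312,
      "config_file"      => File.join(root, "config/sphinx/development.sphinx.conf"),
      "searchd_log_file" => File.join(root, "log/sphinx/searchd.log"),
      "query_log_file"   => File.join(root, "log/sphinx/searchd.query.log"),
      "pid_file"         => File.join(root, "log/sphinx/searchd.development.pid")
    },
    "test"       => { "port" => 9312 },
    "production" => { "port" => 9312 }
  }
end

# Print the YAML so it can be reviewed and pasted into config/sphinx.yml.
puts sphinx_settings("/path/to/app").to_yaml
```

Check the printed paths before saving the output as config/sphinx.yml.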
I assume you have already set up the config file for Sphinx, copied it to RAILS_ROOT/config/sphinx, and renamed it to development.sphinx.conf. The location can be changed; it depends on the config_file setting you wrote in sphinx.yml. Once that's all set, run this command:
    rake thinking_sphinx:reindex
This command indexes every query you provided in the configuration file without generating a new configuration file, which is what rake thinking_sphinx:index does.
The next thing you need to do is start the search daemon by running this command:
    rake thinking_sphinx:start
I assume you haven't run the searchd command yet. If you have, just run searchd --stop to stop it, then run the rake command once more.
Now, let's get into the code part, which is very easy. This is my model code; for your case, adjust it to the columns that you want Sphinx to index.
    class User < ActiveRecord::Base
      attr_accessible :bio, :email, :name

      define_index do
        indexes name, as: :name, sortable: true
        indexes bio
        indexes email
      end
    end
And this is the part of my controller that gets the number of results found for the query passed in.
    def search
      search_input = params[:search]
      @result = User.search_count search_input
    end
Then you just need to print the @result variable to get the number of results found. Easy, isn't it?
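One small thing you might want to add is a guard for blank input, since an empty search string may match far more than you expect (that is an assumption on my side, so check how your own Sphinx setup treats empty queries). Here is a rough sketch. count_results is a hypothetical helper, not part of Thinking Sphinx, and FakeUser is only a stand-in for the real model so the example runs without a Sphinx daemon.

```ruby
# Hypothetical helper, not part of Thinking Sphinx: returns 0 for blank input,
# otherwise delegates to the model's search_count.
def count_results(model, raw_query)
  query = raw_query.to_s.strip
  return 0 if query.empty?
  model.search_count(query)
end

# Stand-in for the User model so this sketch runs without Sphinx.
# In the real controller you would pass User instead.
class FakeUser
  NAMES = ["alice", "bob", "alicia"]

  # Counts names containing the query, imitating a result count.
  def self.search_count(query)
    NAMES.count { |name| name.include?(query.downcase) }
  end
end

puts count_results(FakeUser, "  ali ")
puts count_results(FakeUser, nil)
```

In the controller this would become @result = count_results(User, params[:search]).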
Well, actually I'm still exploring Thinking Sphinx and Sphinx itself to understand them better, so I hope there will be more posts on this topic. Umm.. I think I can start with the Sphinx setup, which I haven't explained to you yet.
See you and cheers!