If you have ever used Sphinx Search, you have probably tried one of the configurations listed below. Each of them suits particular types of projects. Let's take a deeper look at them.
Before we start, I'd like to point out that when deciding on a Sphinx Search configuration for your project you should ask the following questions:
- How large is the amount of data you want to make searchable?
- How fast is data growth in your system?
- What are your hardware capabilities (number of CPUs, memory, network)?
- How many search queries does your system need to serve?
Main + delta scheme
This configuration scheme uses two indexes. Its goal is to make fast index updates as easy as possible with Sphinx Search, even for larger amounts of data.
When you have many documents (say more than 100,000 pages or so) and your data grows continuously with frequent content updates (e.g. a large forum or a news website), it is best to implement the so-called "Main+Delta" scheme, which uses two indexes. Main is the core index holding most of the data; it is updated infrequently and grows in size over time. Delta is an incremental index that contains only the latest information not yet covered by the Main index.
To keep the latest documents searchable we need to rebuild indexes very often. With large amounts of data a full re-index is not practical, as it can take several hours or even days. With this configuration only the Delta index has to be rebuilt frequently, which takes seconds or minutes, so your search engine always has fresh data.
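A minimal sphinx.conf sketch of the two indexes described above could look like the following. It assumes a MySQL source, a hypothetical documents table with id, title and body columns, and a hypothetical helper table sph_counter(counter_id, max_doc_id) that remembers the highest id covered by Main. The connection settings and paths are placeholders, and the index names simply match the ones used in the query further down.

```
# Sketch only: table names, credentials and paths are assumptions.
source mainSource
{
    type     = mysql
    sql_host = localhost
    sql_user = sphinx
    sql_pass = secret
    sql_db   = mydb

    # Remember the highest document id that Main will cover.
    sql_query_pre = REPLACE INTO sph_counter SELECT 1, MAX(id) FROM documents
    sql_query     = SELECT id, title, body FROM documents \
        WHERE id <= (SELECT max_doc_id FROM sph_counter WHERE counter_id = 1)
}

source deltaSource : mainSource
{
    # Override the inherited pre-query so a Delta rebuild does not move the counter.
    sql_query_pre = SET NAMES utf8
    sql_query     = SELECT id, title, body FROM documents \
        WHERE id > (SELECT max_doc_id FROM sph_counter WHERE counter_id = 1)
}

index mainIndex
{
    source = mainSource
    path   = /var/lib/sphinx/mainIndex
}

index deltaMainIndex : mainIndex
{
    source = deltaSource
    path   = /var/lib/sphinx/deltaMainIndex
}
```

With a layout like this, rebuilding deltaMainIndex only reads rows added since the last Main build, which is what keeps the frequent rebuild cheap.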
To keep the Delta index small you have to append it to the Main index periodically and then reset it. Since Sphinx version 0.9.9 you can merge indexes, so it is not necessary to rebuild the Main index each time. You can just merge Delta into Main.
To use the search you should query both indexes:
$sphinxClient->Query("your search query", "mainIndex deltaMainIndex");
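To merge the Delta index into the Main index, run indexer with the --merge switch:
./indexer --merge mainIndex deltaMainIndex --merge-dst-range deleted 0 0
The --rotate switch is required if the destination index (mainIndex here) is already being served by searchd.
--merge-dst-range applies a filter to the destination index during the merge. With "deleted 0 0", only those documents in mainIndex whose deleted attribute is 0 are kept, so documents flagged as deleted are dropped from the merged result. (However, it did not work for me, i.e. the indexes did not really merge, so I am avoiding this switch.)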
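The rebuild and merge steps can be wrapped in a small shell helper so they are easy to schedule. This is only a sketch: the variables INDEXER and SPHINX_CONF and the function names are made up for illustration, the config path is a placeholder, and the --merge-dst-range switch is left out.

```shell
#!/bin/sh
# Sketch only: INDEXER and SPHINX_CONF are assumed names, not Sphinx settings.
INDEXER="${INDEXER:-indexer}"
SPHINX_CONF="${SPHINX_CONF:-/etc/sphinx/sphinx.conf}"

# Rebuild only the small Delta index; --rotate lets a running searchd pick it up.
rebuild_delta() {
    "$INDEXER" --config "$SPHINX_CONF" --rotate deltaMainIndex
}

# Fold Delta into Main without reindexing Main from scratch.
merge_delta_into_main() {
    "$INDEXER" --config "$SPHINX_CONF" --merge mainIndex deltaMainIndex --rotate
}
```

A typical setup would call rebuild_delta every few minutes and merge_delta_into_main once a night, for example from cron. After a merge, the Delta index still has to be reset. How to do that depends on how your Delta source decides which rows are new, so it is not shown here.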
Problems and Solutions
1. If the index is not rotated after indexer finishes, stop searchd and start it again.