[mysql] MySQL match() against() - order by relevance and column?

Okay, so I'm trying to make a full text search in multiple columns, something simple like this:

SELECT * FROM pages WHERE MATCH(head, body) AGAINST('some words' IN BOOLEAN MODE)

Now i want to order by relevance, (how many of the words are found?) which I have been able to do with something like this:

SELECT * , MATCH (head, body) AGAINST ('some words' IN BOOLEAN MODE) AS relevance 
FROM pages
WHERE MATCH (head, body) AGAINST ('some words' IN BOOLEAN MODE)
ORDER BY relevance

Now here comes the part where I get lost, I want to prioritize the relevance in the head column.

I guess I could make two relevance columns, one for head and one for body, but at that point I'd be doing somewhat the same search in the table three times, and for what i'm making this function, performance is important, since the query will both be joined and matched against other tables.

So, my main question is, is there a faster way to search for relevance and prioritize certain columns? (And as a bonus possibly even making relevance count number of times the words occur in the columns?)

Any suggestions or advice would be great.

Note: I will be running this on a LAMP-server. (WAMP in local testing)

This question is related to mysql full-text-search

The answer is


Just adding for who might need.. Don't forget to alter the table!

ALTER TABLE table_name ADD FULLTEXT(column_name);

I have never done so, but it seems like

MATCH (head, head, body) AGAINST ('some words' IN BOOLEAN MODE)

Should give a double weight to matches found in the head.


Just read this comment on the docs page, Thought it might be of value to you:

Posted by Patrick O'Lone on December 9 2002 6:51am

It should be noted in the documentation that IN BOOLEAN MODE will almost always return a relevance of 1.0. In order to get a relevance that is meaningful, you'll need to:

SELECT MATCH('Content') AGAINST ('keyword1 keyword2') as Relevance 
FROM table 
WHERE MATCH ('Content') AGAINST('+keyword1+keyword2' IN BOOLEAN MODE) 
HAVING Relevance > 0.2 
ORDER BY Relevance DESC 

Notice that you are doing a regular relevance query to obtain relevance factors combined with a WHERE clause that uses BOOLEAN MODE. The BOOLEAN MODE gives you the subset that fulfills the requirements of the BOOLEAN search, the relevance query fulfills the relevance factor, and the HAVING clause (in this case) ensures that the document is relevant to the search (i.e. documents that score less than 0.2 are considered irrelevant). This also allows you to order by relevance.

This may or may not be a bug in the way that IN BOOLEAN MODE operates, although the comments I've read on the mailing list suggest that IN BOOLEAN MODE's relevance ranking is not very complicated, thus lending itself poorly for actually providing relevant documents. BTW - I didn't notice a performance loss for doing this, since it appears MySQL only performs the FULLTEXT search once, even though the two MATCH clauses are different. Use EXPLAIN to prove this.

So it would seem you may not need to worry about calling the fulltext search twice, though you still should "use EXPLAIN to prove this"


I was just playing around with this, too. One way you can add extra weight is in the ORDER BY area of the code.

For example, if you were matching 3 different columns and wanted to more heavily weight certain columns:

SELECT search.*,
MATCH (name) AGAINST ('black' IN BOOLEAN MODE) AS name_match,
MATCH (keywords) AGAINST ('black' IN BOOLEAN MODE) AS keyword_match,
MATCH (description) AGAINST ('black' IN BOOLEAN MODE) AS description_match
FROM search
WHERE MATCH (name, keywords, description) AGAINST ('black' IN BOOLEAN MODE)
ORDER BY (name_match * 3  + keyword_match * 2  + description_match) DESC LIMIT 0,100;