I spent a long time looking for a way to put the latest documents first in Apache Lucene, to no avail. Finally, I’ve found a solution that works.
Most of the answers on the web suggested boosting documents based on their date. However, I was unconvinced that these solutions would hold up in the long term. The other day, I came across Apache Lucene Sort Tips, which describes how to use the TopFieldDocCollector. By chance, it mentioned the constant SortField.FIELD_SCORE, which can be used when constructing a multi-field Sort object.
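So, the answer is simple, but I thought I’d write a post specifically addressing this use-case so that an answer is easy for others to find. You need a field containing the modified date of all your documents. Storing this as an ISO 8601 string does the trick, because such strings sort alphabetically in chronological order (as long as every document uses the same format and time zone). Now you construct a Sort object, passing SortField.FIELD_SCORE as the first field and your date field (descending) as the second, and hey presto! Results are ordered by relevance first, and documents with equal scores come out newest first.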
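To see what a score-then-date sort does to a result list, here is a minimal sketch in plain C# that doesn’t touch Lucene at all. It uses LINQ to mimic the ordering: score descending, then the ISO 8601 string descending. The hit data and the names Id, Score and LastModified are made up for illustration, and I’m assuming string fields are compared by plain character order, which is what StringComparer.Ordinal does here.

```csharp
using System;
using System.Linq;

class SortOrderSketch
{
    static void Main()
    {
        // Hypothetical search hits: id, relevance score, and the
        // last-modified date stored as an ISO 8601 string (all in UTC).
        var hits = new[]
        {
            new { Id = "a", Score = 1.0f, LastModified = "2009-03-14T09:30:00Z" },
            new { Id = "b", Score = 2.5f, LastModified = "2008-11-02T17:05:00Z" },
            new { Id = "c", Score = 1.0f, LastModified = "2010-01-20T12:00:00Z" },
        };

        // Primary key: score, highest first.
        // Secondary key: date string, highest (newest) first.
        // Ordinal comparison is enough because ISO 8601 strings in a
        // single format and time zone sort in chronological order.
        var ordered = hits
            .OrderByDescending(h => h.Score)
            .ThenByDescending(h => h.LastModified, StringComparer.Ordinal);

        // Prints b first (best score), then c and a, which tie on score
        // and are therefore ordered newest first.
        foreach (var h in ordered)
        {
            Console.WriteLine(h.Id + " " + h.LastModified);
        }
    }
}
```

Note that "b" still wins despite being the oldest document: the date only breaks ties between documents with the same score.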
So, here’s how we create our sort:
var sort = new Sort(new[] { SortField.FIELD_SCORE, new SortField("last_modified", SortField.STRING, true)});
And use this with a TopFieldDocCollector in the usual way.
Massive thanks to the author of the original post. I just thought it was worth posting something specifically for this use-case.