I spent a long time looking for a way to put the latest documents first in Apache Lucene, to no avail. Finally, I’ve found a solution that works.
Most of the answers on the web suggest boosting documents based on their date. However, I was unconvinced that these solutions would hold up in the long term. The other day, I came across Apache Lucene Sort Tips, which describes how to use the TopFieldDocCollector. By chance, it mentioned the constant SortField.FIELD_SCORE, which can be used when constructing a multi-field Sort object.
So, the answer is simple, but I thought I’d write a post specifically addressing this use-case so that an answer is easy for others to find. You need a field containing the modified date of each of your documents. Storing this as an ISO 8601 string does the trick, because such strings sort chronologically when compared as plain text (as long as they all use the same format and time zone). Now you construct a Sort object, passing SortField.FIELD_SCORE as the first field and your date field (descending) as the second, and hey presto!
So, here’s how we create our sort:
var sort = new Sort(new[] {
    // Relevance first.
    SortField.FIELD_SCORE,
    // Then the date; true means reverse order, so the newest comes first.
    new SortField("last_modified", SortField.STRING, true)});
And use this with a TopFieldDocCollector in the usual way.
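To see what ordering this kind of sort is expected to give, here is a minimal, self-contained C# sketch that does not touch Lucene at all. The Hit class, the Order helper and the sample data are hypothetical names and values invented for illustration. It simply sorts in-memory objects by score (descending) and then by an ISO 8601 last-modified string (descending), mimicking the two sort fields above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ScoreThenDateDemo
{
    // Hypothetical stand-in for a search hit; this is not a Lucene type.
    public sealed class Hit
    {
        public string Id;
        public float Score;
        public string LastModified; // ISO 8601, e.g. "2010-03-01T09:30:00Z"
    }

    // Highest score first; among hits with equal scores, newest first.
    // Ordinal string comparison is enough because ISO 8601 strings in one
    // consistent format and time zone sort in chronological order.
    public static List<string> Order(IEnumerable<Hit> hits)
    {
        return hits
            .OrderByDescending(h => h.Score)
            .ThenByDescending(h => h.LastModified, StringComparer.Ordinal)
            .Select(h => h.Id)
            .ToList();
    }

    public static void Main()
    {
        var hits = new[]
        {
            new Hit { Id = "a", Score = 1.0f, LastModified = "2009-11-05T08:00:00Z" },
            new Hit { Id = "b", Score = 2.0f, LastModified = "2008-01-15T12:00:00Z" },
            new Hit { Id = "c", Score = 1.0f, LastModified = "2010-03-01T09:30:00Z" },
        };

        Console.WriteLine(string.Join(", ", Order(hits))); // prints "b, c, a"
    }
}
```

Here "b" comes first because it has the highest score, even though it is the oldest. "c" beats "a" only because their scores are exactly equal and "c" is newer. In other words, the date acts purely as a tie-breaker, so it only has a visible effect when several hits share the same score.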
Massive thanks to the author of the original post. I just thought it was worth posting something specifically for this use-case.