Recent Documents First in Apache Lucene

Phill Luby
22nd September 2012

After a long and fruitless search for a way to put the latest documents first in Apache Lucene, I’ve finally found a solution that works.

Most of the answers on the web suggested boosting documents based on their date. However, I was unconvinced that these solutions would hold up in the long term. The other day, I came across Apache Lucene Sort Tips, which describes how to use the TopFieldDocCollector. By chance, it mentioned the constant SortField.FIELD_SCORE, which can be used when constructing a multi-field Sort object.

So, the answer is simple, but I thought I’d write a post specifically addressing this use-case so that an answer is easy for others to find. You need a field containing the modified date of each of your documents. Storing this as an ISO 8601 string does the trick, because such strings sort alphabetically in date order as long as they all use the same format and time zone. Now you construct a Sort object, passing SortField.FIELD_SCORE as the first field and your date field (descending) as the second, and hey presto!
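To make the ordering concrete, here is a minimal sketch in plain C# with no Lucene involved. The `Hit` class, the `RecentFirst.Order` method and the sample values are all hypothetical names of mine, used only to mimic a score-then-date sort. It is not how Lucene implements sorting internally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a search hit; this is NOT a Lucene.Net type.
public sealed class Hit
{
    public string Id { get; set; }
    public float Score { get; set; }
    // ISO 8601 string in one fixed format and time zone (UTC assumed here).
    public string LastModified { get; set; }
}

public static class RecentFirst
{
    // Highest score first; hits with equal scores are ordered newest first.
    // Ordinal string comparison is enough because same-format ISO 8601
    // strings sort lexicographically in chronological order.
    public static List<string> Order(IEnumerable<Hit> hits)
    {
        return hits
            .OrderByDescending(h => h.Score)
            .ThenByDescending(h => h.LastModified, StringComparer.Ordinal)
            .Select(h => h.Id)
            .ToList();
    }
}
```

Given three hits, "a" (score 1.0, 2011-12-31T23:59:59Z), "b" (score 1.0, 2012-09-22T08:00:00Z) and "c" (score 2.5, 2010-01-01T00:00:00Z), `RecentFirst.Order` returns c, b, a. Note that the date only decides the order among hits whose scores are equal.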

So, here’s how we create our sort:

var sort = new Sort(new[] {
    // Relevance first
    SortField.FIELD_SCORE,
    // Then newest first (reverse = true)
    new SortField("last_modified", SortField.STRING, true)});

Then use this with a TopFieldDocCollector in the usual way.

Massive thanks to the author of the original post. I just thought it was worth posting something specifically for this use-case.
