Channel: Changing Bits
Browsing all 52 articles
Browse latest View live

Choosing a fast unique identifier (UUID) for Lucene

Most search applications using Apache Lucene assign a unique id, or primary key, to each indexed document. While Lucene itself does not require this (it couldn't care less!), the application usually...

View Article



A new proximity query for Lucene, using automatons

The simplest Apache Lucene query, TermQuery, matches any document that contains the specified term, regardless of where the term occurs inside each document. Using BooleanQuery you can combine multiple...

View Article



Scoring tennis using finite-state automata

For some reason having to do with the medieval French, the scoring system for tennis is very strange. In actuality, the game is easy to explain: to win, you must score at least 4 points and win by at...

View Article

Apache Lucene™ 5.0.0 is coming!

At long last, after a strong series of 4.x feature releases, most recently 4.10.2, we are finally working towards another major Apache Lucene release! There are no promises for the exact timing (it's...

View Article

Where are my new blog posts?

Some of you have noticed that I'm not writing much in this blog lately. But fear not: exciting changes are still happening in Lucene, and I am still writing about them! It's just that most of what I...

View Article


Jirasearch 2.0 dog food: using Lucene to find our Jira issues

A few years ago I first built and released Jirasearch as a fun dog-food test case for the thin-wrapper Lucene server, to expose a powerful search UI over our Jira issues. This is a great showcase of a...

View Article

Apache Lucene 7.0 Is Coming Soon!

The Apache Lucene project will likely release its next major release, 7.0, in a few months! Remember that Lucene developers generally try hard to backport new features for the next non-major (feature)...

View Article


Lucene gets concurrent deletes and updates!

Long ago, Lucene could only use a single thread to write new segments to disk. The actual indexing of documents, which is the costly process of inverting incoming documents into in-memory segment data...

View Article



Lucene's near-real-time segment index replication

[TL;DR: Apache Lucene 6.0 quietly introduced a powerful new feature called near-real-time (NRT) segment replication, for efficiently and reliably replicating indices from one server to another, and...

View Article



Concurrent query execution in Apache Lucene

Apache Lucene is a wonderfully concurrent pure Java search engine, easily able to saturate the available CPU or IO resources on your server, if you ask it to. The concurrency model for a "typical"...

View Article


Apache Lucene performance on 128-core AMD Ryzen Threadripper 3990X

Almost a decade ago, I started running Lucene's nightly benchmarks, and have been trying with mixed success to keep them running every night, through the numerous amazing changes relentlessly developed...

View Article


Open-source collaboration, or how we finally added merge-on-refresh to...

The open-source software movement is clearly a powerful phenomenon. A diverse (in time, geography, interests, gender (hmm not really, not yet, hrmph), race, skills, use-cases, age, corporate employer,...

View Article