Michael McCandless (@mikemccand) 's Twitter Profile
Michael McCandless

@mikemccand

Apache member, PMC member and committer for the Apache Lucene project; Senior Principal Engineer at Amazon Product Search. Opinions are all my own!

ID: 1051519483532185601

Link: http://blog.mikemccandless.com Joined: 14-10-2018 17:05:55

1.1K Tweets

1.1K Followers

376 Following

Benjamin Trent (@benwtrent) 's Twitter Profile Photo

#Lucene 9.12 is released! This is the final minor release in the 9 series. 🎉 Next stop Lucene 10.0.0! 🎉 Here are some of my highlights for 9.12 🧵

Benjamin Trent (@benwtrent) 's Twitter Profile Photo

Last but not least is a race condition bug that I have been fighting with for many weeks. github.com/apache/lucene/… This is the beauty of OSS. Spend weeks banging your head against a wall, and somebody like Ao Li aoli.al comes along out of nowhere and helps.

Adrien Grand (@jpountz) 's Twitter Profile Photo

Lucene 10 was released earlier today. The main release highlight is improved hardware efficiency, I wrote about it on the Elastic blog: elastic.co/search-labs/bl…

Robert Charles Muir (@rcmuir) 's Twitter Profile Photo

open source contributor's PR failing with crazy error from CI system, turns out test discovered a recently-introduced bug in LibreOffice's Mongolian spell-checking rules. this is normal... have seen worse in these hunspell dicts, once even found a binary executable!

Michael McCandless (@mikemccand) 's Twitter Profile Photo

The next Apache #Lucene major release (11.0) already has some compelling performance gains over 10.x, now encoding dense postings blocks as a bitset, and using vectorized (SIMD) CPU instructions for postings int[] decoding! Thanks to the pluggable Codec API in Lucene, such

Benjamin Trent (@benwtrent) 's Twitter Profile Photo

The number of improvements in Lucene here are crazy. Pretty much every count and boolean query gets a nice boost and some of the count improvements are hilarious 🚀🚀🚀

Michael McCandless (@mikemccand) 's Twitter Profile Photo

I must call out this unsung hero in #Apache #Lucene's benchmarking toolkit: Created by Robert Charles Muir (thank you!), it tests performance impact of any git branch (or PR) against many (10 currently?) different CPU architectures/models/revisions via AWS's diverse instances so we can

Adrien Grand (@jpountz) 's Twitter Profile Photo

Guo Feng contributed a 2.5x (!) speedup to #Lucene's numeric range queries by using vectorization. HZ sped up query evaluation, ID sped up decoding data from the index. Lots of great performance improvements coming in Lucene 10.2.

Michael McCandless (@mikemccand) 's Twitter Profile Photo

#Apache #Lucene will soon have a faster and smaller terms index! This is a complex part of Lucene, and a major hotspot for terms heavy use cases like (primary) key/value store (~34% speedup, but results are preliminary!). Lucene's pluggable Codec API makes experimentation like

Adrien Grand (@jpountz) 's Twitter Profile Photo

This is now live on nightly benchmarks, with a 32% speedup on primary-key lookups benchmarks.mikemccandless.com/PKLookup.html and a 5% speedup on fuzzy queries. The main benefit that Lucene users will notice is likely faster indexing with explicit document IDs

Adrien Grand (@jpountz) 's Twitter Profile Photo

Lucene's histogram collector is becoming more sophisticated: it can now take advantage of points indexes when the query fully matches a segment, which can give a multifold performance boost. github.com/apache/lucene/…

Andy Jassy (@ajassy) 's Twitter Profile Photo

Important moment for Project Kuiper as we just confirmed our first 27 production satellites are operating as expected in low Earth orbit. While this is the first step in a much longer journey to launch the rest of our low Earth orbit constellation, it represents an incredible

Cory Zue (@czue) 's Twitter Profile Photo

This xkcd was published 18 years ago, mostly irrelevant for like a decade as computers got faster, and is now very much back.

Adrien Grand (@jpountz) 's Twitter Profile Photo

The search library benchmark from the Tantivy folks was just updated with Lucene 10.2 tantivy-search.github.io/bench. Lucene now performs much better at the COUNT collection type, a bit better at TOP_K. Still somewhat slow at TOP_100_COUNT and phrase queries across all collection types.

Adrien Grand (@jpountz) 's Twitter Profile Photo

Yelp's nrtSearch was just upgraded to Lucene 10. Also switched from persistent storage to object storage as a source of truth, and plans on doing NRT replication via object storage instead of over the network. Very similar to Elasticsearch Serverless.

Adrien Grand (@jpountz) 's Twitter Profile Photo

I wanted to share what I learned from Tantivy's "Search Benchmark, the Game", so I set up GitHub pages and wrote two blogs, on general observations on the benchmark jpountz.github.io/2025/05/12/ana… and how it helped drive performance improvements in Lucene jpountz.github.io/2025/04/12/why…

Adrien Grand (@jpountz) 's Twitter Profile Photo

There has been a big regression in Lucene's nightly benchmarks recently after a kernel upgrade. Michael McCandless and Robert Charles Muir found that it was caused by a change in the Linux scheduler configuration. github.com/apache/lucene/…

Cory Zue (@czue) 's Twitter Profile Photo

I'm starting a new journey! Read more about it here: "The Solopreneur Sabbatical" coryzue.com/writing/the-so…