Elasticsearch · Performance · Observability

Elasticsearch Read Optimization — Tuning for Faster Search

A comprehensive guide to optimizing Elasticsearch for faster search performance — covering filesystem cache, document modeling, query design, and index-level tuning.

2026-04-06

Search performance in Elasticsearch depends on a combination of factors, including how expensive individual queries are, how many searches run in parallel, the number of indices and shards involved, and the overall sharding strategy and shard size. While hardware and system-level settings play an important role, the structure of your documents and the design of your queries often have the biggest impact.

Note

These variables influence how the system should be tuned. For example, optimizing for a small number of complex queries differs significantly from optimizing for many lightweight, concurrent searches. Make sure to also consider your cluster's shard count, index layout, and overall data distribution.

Give Memory to the Filesystem Cache

Elasticsearch heavily relies on the filesystem cache to make search fast. In general, you should make sure that at least half the available memory goes to the filesystem cache so that Elasticsearch can keep hot regions of the index in physical memory.

By default, Elasticsearch automatically sets its JVM heap size to follow this best practice. However, in self-managed or Elastic Cloud on Kubernetes deployments, you have the flexibility to allocate even more memory to the filesystem cache, which can lead to performance improvements depending on your workload.

Note

On Linux, the filesystem cache uses any memory not actively used by applications. To allocate memory to the cache, ensure that enough system memory remains available and is not consumed by Elasticsearch or other processes.

Avoid Page Cache Thrashing on Linux

Search can cause a lot of randomized read I/O. When the underlying block device has a high readahead value, there may be a lot of unnecessary read I/O done, especially when files are accessed using memory mapping.

Most Linux distributions use a sensible readahead value of 128KiB for a single plain device. However, when using software RAID, LVM, or dm-crypt, the resulting block device may end up with a very large readahead value (in the range of several MiB). This usually results in severe page cache thrashing that adversely affects search performance.

You can check the current value using:

lsblk -o NAME,RA,MOUNTPOINT,TYPE,SIZE

Warning

blockdev expects values in 512-byte sectors, whereas lsblk reports values in KiB. For example, to temporarily set readahead to 128KiB for /dev/nvme0n1:

blockdev --setra 256 /dev/nvme0n1

Use Faster Hardware

If your searches are I/O-bound, consider increasing the size of the filesystem cache or using faster storage. Each search involves a mix of sequential and random reads across multiple files, and there may be many searches running concurrently on each shard, so SSD drives tend to perform better than spinning disks. If your searches are CPU-bound, consider using a larger number of faster CPUs.

Directly-attached (local) storage generally performs better than remote storage because it is simpler to configure well and avoids communications overheads. With careful tuning, it is sometimes possible to achieve acceptable performance using remote storage too — but always benchmark with a realistic workload before committing to a particular storage architecture.

Document Modeling

Documents should be modeled so that search-time operations are as cheap as possible. In particular, join-like constructs should be avoided: nested fields can make queries several times slower, and parent-child relations can make them hundreds of times slower. If the same questions can be answered without joins by denormalizing documents, significant speedups can be expected.
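As a sketch (the blog/comment model here is a hypothetical example), instead of modeling comments as nested sub-documents of a post, each comment can be indexed as a standalone document that carries a denormalized copy of the post fields it is queried with:

PUT comments/_doc/1
{
  "post_title": "Tuning Elasticsearch",
  "post_author": "jane",
  "text": "Helpful overview of readahead tuning",
  "date": "2026-04-01"
}

Queries that previously required a join can now filter on post_author or post_title directly, at the cost of duplicating those fields across comment documents.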

Search as Few Fields as Possible

The more fields a query_string or multi_match query targets, the slower it is. A common technique to improve search speed over multiple fields is to copy their values into a single field at index time using the copy_to directive:

PUT movies
{
  "mappings": {
    "properties": {
      "name_and_plot": {
        "type": "text"
      },
      "name": {
        "type": "text",
        "copy_to": "name_and_plot"
      },
      "plot": {
        "type": "text",
        "copy_to": "name_and_plot"
      }
    }
  }
}
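With this mapping in place, a search that would otherwise be a multi_match over name and plot becomes a cheaper single-field query (the query text is an arbitrary example):

GET movies/_search
{
  "query": {
    "match": {
      "name_and_plot": "space opera"
    }
  }
}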

Pre-index Data

Leverage patterns in your queries to optimize how data is indexed. For instance, if most queries run range aggregations on a fixed list of ranges, you can pre-index those ranges and use a terms aggregation instead:

PUT index
{
  "mappings": {
    "properties": {
      "price_range": {
        "type": "keyword"
      }
    }
  }
}

PUT index/_doc/1
{
  "designation": "spoon",
  "price": 13,
  "price_range": "10-100"
}
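Queries can then bucket on the pre-computed field with a terms aggregation instead of recomputing ranges at search time:

GET index/_search
{
  "size": 0,
  "aggs": {
    "price_ranges": {
      "terms": {
        "field": "price_range"
      }
    }
  }
}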

Map Identifiers as Keyword

Not all numeric data should be mapped as a numeric field type. Elasticsearch optimizes numeric fields for range queries, but keyword fields are better for term-level queries. Identifiers such as ISBN or product IDs are rarely used in range queries but are often retrieved using term-level queries. Consider mapping them as keyword for faster retrieval.
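For example, assuming a hypothetical books index, an ISBN can be mapped as keyword and looked up with a term query:

PUT books
{
  "mappings": {
    "properties": {
      "isbn": {
        "type": "keyword"
      }
    }
  }
}

GET books/_search
{
  "query": {
    "term": {
      "isbn": "978-3-16-148410-0"
    }
  }
}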

Query-Level Optimizations

Avoid Scripts

If possible, avoid using script-based sorting, scripts in aggregations, and the script_score query. Scripts bypass many of Elasticsearch's built-in caching and optimization mechanisms.
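A common workaround is to move the computation to index time. For example, instead of sorting with a script that lowercases a title on every search, index a normalized keyword sub-field and sort on that (a sketch; the index, field, and normalizer names are illustrative):

PUT movies_sortable
{
  "settings": {
    "analysis": {
      "normalizer": {
        "lowercase_normalizer": {
          "type": "custom",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "fields": {
          "sort": {
            "type": "keyword",
            "normalizer": "lowercase_normalizer"
          }
        }
      }
    }
  }
}

Sorting on title.sort then uses ordinary doc values instead of executing a script per document.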

Search Rounded Dates

Queries on date fields that use now are typically not cacheable since the range changes constantly. Switching to rounded dates makes better use of the query cache:

GET index/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "my_date": {
            "gte": "now-1h/m",
            "lte": "now/m"
          }
        }
      }
    }
  }
}

The longer the rounding interval, the more the query cache can help — but too aggressive rounding might hurt user experience.

Force-Merge Read-Only Indices

Indices that are read-only may benefit from being merged down to a single segment. This is typical with time-based indices: only the current time frame receives new documents while older indices are read-only. Single-segment shards can use simpler and more efficient data structures.
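Once an index is confirmed read-only, the merge can be triggered explicitly with the force merge API (the index name is illustrative):

POST my-index-000001/_forcemerge?max_num_segments=1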

Warning

Do not force-merge indices to which you are still writing, or will write again in the future. Rely on the automatic background merge process instead. Continuing to write to a force-merged index can severely degrade performance.

Cache & Warm-up Strategies

Warm Up Global Ordinals

Global ordinals optimize aggregation performance and are calculated lazily by default. For fields heavily used in bucketing aggregations, you can tell Elasticsearch to construct and cache them before requests arrive:

PUT index
{
  "mappings": {
    "properties": {
      "foo": {
        "type": "keyword",
        "eager_global_ordinals": true
      }
    }
  }
}

Warm Up the Filesystem Cache

After a restart, the filesystem cache is empty. You can explicitly tell the OS which files to load eagerly using the index.store.preload setting. Use it with caution: loading too many files will hurt performance if the cache can't hold all the data.
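The setting takes a list of file extensions; for example, norms and doc values files can be preloaded at index creation time (a sketch; which extensions are worth preloading depends on your workload):

PUT my-index
{
  "settings": {
    "index.store.preload": ["nvd", "dvd"]
  }
}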

Use Preference for Cache Utilization

Elasticsearch maintains caches at the node level. With round-robin routing (the default), consecutive identical requests hit different shard copies, preventing cache reuse. Using a preference value that identifies the current user or session routes requests consistently and improves cache hit rates.
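For example, tying the preference to a session identifier (the value here is an arbitrary string) routes repeat searches from the same user to the same shard copies:

GET my-index/_search?preference=session_xyz
{
  "query": {
    "match": {
      "message": "timeout"
    }
  }
}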

Index-Level Tuning

Index Sorting for Faster Conjunctions

Index sorting can make conjunctions (AND queries) faster at the cost of slightly slower indexing. When documents are sorted within a segment, Elasticsearch can skip entire blocks of non-matching documents during query evaluation.
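Index sorting is configured when the index is created; for example, sorting segments by a low-cardinality field (the index and field names are illustrative):

PUT my-sorted-index
{
  "settings": {
    "index.sort.field": "status",
    "index.sort.order": "asc"
  },
  "mappings": {
    "properties": {
      "status": {
        "type": "keyword"
      }
    }
  }
}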

Use index_phrases and index_prefixes

The text field supports index_phrases (indexes 2-shingles for faster phrase queries) and index_prefixes (indexes term prefixes for faster prefix queries). If your use case involves many phrase or prefix queries, these options can provide significant speedups.
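Both options are enabled in the mapping; for example (the field name and prefix lengths are illustrative):

PUT articles
{
  "mappings": {
    "properties": {
      "body": {
        "type": "text",
        "index_phrases": true,
        "index_prefixes": {
          "min_chars": 2,
          "max_chars": 5
        }
      }
    }
  }
}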

Use constant_keyword for Filtering

If a filter matches most documents in an index, consider splitting data into dedicated indices and using constant_keyword to let Elasticsearch transparently skip the filter:

PUT bicycles
{
  "mappings": {
    "properties": {
      "cycle_type": {
        "type": "constant_keyword",
        "value": "bicycle"
      },
      "name": {
        "type": "text"
      }
    }
  }
}

On this index, Elasticsearch will automatically ignore any filter on cycle_type: bicycle, making the query cheaper without changing client-side logic.

Replicas & Throughput

Replicas improve resiliency and can help with read throughput, but not always. A setup with fewer shards per node in total usually performs better, because each shard gets a greater share of the filesystem cache. Adding replicas is therefore a trade-off between throughput and availability.

The formula: if you have num_nodes nodes, num_primaries primary shards, and want to cope with max_failures node failures, the optimal replica count is:

max(max_failures, ceil(num_nodes / num_primaries) - 1)
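For example, with 10 nodes, 5 primary shards, and a requirement to survive 1 node failure, the optimal count works out to max(1, ceil(10 / 5) - 1) = max(1, 1) = 1 replica per primary.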

Monitoring & Profiling

Use the Search Profiler in Kibana to navigate and analyze the Profile API results. It gives insight into how each component of your queries and aggregations impacts processing time, helping you identify bottlenecks and tune accordingly.

Keep an eye on open search contexts by polling the node stats API:

GET _nodes/stats/indices/search

High open_contexts values can indicate a backlog or overly long scroll timeouts. Clear scrolls as soon as they are no longer needed to release resources.
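All open scroll contexts can be released at once with the clear scroll API:

DELETE _search/scroll/_all

Individual scrolls can also be cleared by passing their scroll ID to the same endpoint.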

Need help tuning your Elasticsearch cluster?

We design and optimize Elastic Stack deployments at enterprise scale. Let’s talk about your performance challenges.
