Keywords to Context: Semantic Search and Retrieval-Augmented Generation with OpenSearch

by daniel.veza / 21 May 2026

Keyword search struggles with natural language and exploratory questions. Daniel walked the DrupalSouth 2026 audience through how OpenSearch and Skpr enable semantic search that understands intent and meaning, and how Retrieval-Augmented Generation (RAG) transforms results into clear, human-friendly answers grounded in your actual content.

Have you ever searched for something and gotten... nothing?

At DrupalSouth Wellington 2026, Daniel Veza (Senior Developer at PreviousNext and core subsystem maintainer for Layout Builder) presented on semantic search with OpenSearch, and how to take your search from just matching keywords to understanding context.

The Problem

If you've ever built a knowledge base, documentation site, or internal search tool and watched users get zero results, or worse, completely irrelevant ones, you've already felt the limits of traditional keyword search.

A user types "how do I cancel my subscription?" and gets nothing, because the answer lives in an article titled "managing your billing preferences." The content is there. The intent is clear. But keyword search can't make that connection.

Semantic search and retrieval-augmented generation (RAG) are designed to solve this. Together, they shift search from matching words to understanding meaning.

Why keyword search falls short

Traditional search ranks results by matching the words in a query against the words in your documents. If the user types "affordable options", but your content says "cost-effective plans," keyword search sees no overlap and returns nothing useful.
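A toy illustration of that mismatch, assuming a naive word tokeniser (real engines add analysers and stemming, but the vocabulary gap remains). The function names here are hypothetical:

```python
import re

def tokens(text: str) -> set[str]:
    """Lower-case the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keyword_overlap(query: str, document: str) -> set[str]:
    """Return the words that appear in both the query and the document."""
    return tokens(query) & tokens(document)

query = "affordable options"
document = "Our cost-effective plans suit small teams."

# The two texts mean roughly the same thing but share no words,
# so a purely lexical ranker has nothing to score.
print(keyword_overlap(query, document))
```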

The common workarounds, synonym lists and query boosting, help to a point, but they're brittle. They require constant maintenance and still can't handle the open-ended, conversational questions that users ask today.

How semantic search works

Instead of matching words, semantic search matches meaning. The mechanism is embedding: any piece of text can be represented as a list of numbers generated by a machine learning model. Texts with similar meaning produce similar numbers, and those numbers cluster together in what's called a vector space.

Think of it like a map. "Dog-friendly hotel" and "pet-friendly lodging" land in the same neighbourhood. "Reset password" is somewhere else entirely. When a user submits a query, it is converted to the same type of map coordinates, and OpenSearch retrieves the content nearest to them, regardless of whether a single word is shared.
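The "nearest on the map" idea can be sketched with cosine similarity, one common way to measure how closely two vectors point in the same direction. The three-dimensional vectors below are hand-picked for illustration only; real embedding models produce hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, not output from any real model.
embeddings = {
    "dog-friendly hotel": [0.9, 0.8, 0.1],
    "pet-friendly lodging": [0.85, 0.75, 0.2],
    "reset password": [0.05, 0.1, 0.95],
}

def nearest(query_vector: list[float], k: int = 2) -> list[str]:
    """Return the k stored texts whose vectors are most similar to the query."""
    ranked = sorted(
        embeddings,
        key=lambda text: cosine_similarity(query_vector, embeddings[text]),
        reverse=True,
    )
    return ranked[:k]

# A made-up vector standing in for the query "somewhere to stay with my pet".
print(nearest([0.8, 0.8, 0.15]))
```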

Before content gets stored, two things happen to it:

Chunking: Documents are broken into focused passages rather than stored as one large block.
Embedding: Each chunk is converted into its vector representation. This "meaning fingerprint" is stored in OpenSearch alongside the original text.
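The two steps above can be sketched as follows. This assumes a simple fixed-size word window with overlap, which is only one of several chunking strategies, and takes the embedding model as a caller-supplied function (`embed` is a hypothetical stand-in, as are the `text` and `embedding` field names).

```python
from typing import Callable

def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word windows that overlap so context isn't cut mid-thought."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks

def build_records(text: str, embed: Callable[[str], list[float]]) -> list[dict]:
    """Pair each chunk with its vector, ready to be indexed alongside the text."""
    return [{"text": chunk, "embedding": embed(chunk)} for chunk in chunk_text(text)]
```

In an OpenSearch setup the embedding call would typically happen inside an ingest pipeline rather than in application code; the function above only shows the shape of the resulting records.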

Hybrid search: combining both approaches

A hybrid approach combines keyword and semantic signals rather than replacing one with the other. OpenSearch can be configured with a normalisation processor that blends both scores, with a 70/30 weighting in favour of semantic search. Keyword boosting is preserved where it's useful, while the semantic layer handles intent and meaning. In practice, this produces noticeably better results across a wider range of queries.
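To see why scores need normalising before they are blended, here is a simplified sketch of the idea: min-max normalisation followed by a weighted sum. It illustrates the principle rather than reproducing OpenSearch's exact algorithm, and the document IDs and scores are invented.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to the 0..1 range so different scorers become comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def blend(keyword: dict[str, float], semantic: dict[str, float],
          keyword_weight: float = 0.3, semantic_weight: float = 0.7) -> dict[str, float]:
    """Weighted sum of normalised scores; a missing score counts as 0."""
    kw = min_max(keyword) if keyword else {}
    sem = min_max(semantic) if semantic else {}
    return {
        doc: keyword_weight * kw.get(doc, 0.0) + semantic_weight * sem.get(doc, 0.0)
        for doc in set(kw) | set(sem)
    }

# Invented raw scores: keyword scores are unbounded, similarity scores are 0..1.
keyword_scores = {"pricing-faq": 8.2, "billing-preferences": 1.1}
semantic_scores = {"billing-preferences": 0.91, "cancel-policy": 0.85, "pricing-faq": 0.40}

blended = blend(keyword_scores, semantic_scores)
ranking = sorted(blended, key=blended.get, reverse=True)
print(ranking)
```

Without the normalisation step, the raw keyword score of 8.2 would swamp every similarity score regardless of the weights.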

RAG: from results to answers

Semantic search returns better results. RAG turns those results into a direct answer.

RAG (retrieval-augmented generation) works in three steps:

Retrieve: the user's query is embedded, and the most relevant chunks are pulled from OpenSearch.
Augment: those chunks are fed into a prompt that defines the role, tone, format, and behaviour of the response.
Generate: the prompt and retrieved context go into an LLM, which produces a clear answer grounded in your actual content.
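The three steps above can be sketched in a few lines. The `retrieve` and `generate` callables are hypothetical stand-ins for the OpenSearch query and the LLM call, and the prompt wording is only an example of defining role and behaviour.

```python
from typing import Callable

def build_prompt(question: str, chunks: list[str], max_chunks: int = 5) -> str:
    """Augment: wrap the retrieved chunks and the question in instructions."""
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(chunks[:max_chunks], start=1)
    )
    return (
        "You are a helpful assistant for this website.\n"
        "Answer using only the numbered context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def answer(question: str,
           retrieve: Callable[[str], list[str]],
           generate: Callable[[str], str],
           max_chunks: int = 5) -> str:
    """Retrieve relevant chunks, augment the prompt, then generate an answer."""
    chunks = retrieve(question)
    prompt = build_prompt(question, chunks, max_chunks)
    return generate(prompt)
```

Telling the model to admit when the context lacks an answer is one common way to keep responses grounded in the indexed content.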

The user gets a direct answer, not hallucinations, not a list of links. It's the same approach behind Google AI Overviews, but applied to your own indexed content with your own prompt and guardrails.

RAG is also useful beyond site search. It can summarise all support tickets raised in a given month, surface recurring themes across a large document corpus, or generate structured responses across any text-based content library — returning one coherent answer with sources instead of 20 results to wade through.

Implementing it in OpenSearch

The implementation uses OpenSearch 3.x and AWS Bedrock. The key steps:

Machine learning settings: run models on dedicated nodes, set memory thresholds, and enable auto-redeploy so models survive restarts.
Two Bedrock connectors: one for the generative LLM (Amazon Nova Lite is a solid general-purpose choice), and one for generating embeddings (Amazon Titan Embed Text, producing 1,024-dimensional vectors). Both run at index time and query time.
Search pipeline: a two-part pipeline handles the hybrid normalization processor (blending keyword and semantic scores) and the RAG response processor (feeding the top results into the LLM with your prompt and returning the generated summary).
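As a rough sketch of what the settings and search pipeline might look like, the request bodies below are built as Python dictionaries and printed as JSON. The model ID, the memory threshold value, the `text` field name, and the prompt wording are all placeholders; check the setting and processor names against the OpenSearch documentation for your version before relying on them.

```python
import json

# Body for PUT /_cluster/settings (values are illustrative).
ML_SETTINGS = {
    "persistent": {
        "plugins.ml_commons.only_run_on_ml_node": True,
        "plugins.ml_commons.native_memory_threshold": 90,
        "plugins.ml_commons.model_auto_redeploy.enable": True,
    }
}

def build_search_pipeline(llm_model_id: str,
                          keyword_weight: float = 0.3,
                          semantic_weight: float = 0.7) -> dict:
    """Body for PUT /_search/pipeline/<name>: hybrid blending plus a RAG step."""
    return {
        "description": "Hybrid search with a generated summary",
        "phase_results_processors": [
            {
                "normalization-processor": {
                    "normalization": {"technique": "min_max"},
                    "combination": {
                        "technique": "arithmetic_mean",
                        # Weights follow the order of the sub-queries in the
                        # hybrid query (assumed here: keyword first, neural second).
                        "parameters": {"weights": [keyword_weight, semantic_weight]},
                    },
                }
            }
        ],
        "response_processors": [
            {
                "retrieval_augmented_generation": {
                    "description": "Summarise the top results with the LLM",
                    "model_id": llm_model_id,
                    "context_field_list": ["text"],
                    "system_prompt": "You are a helpful assistant for this website.",
                    "user_instructions": "Answer using only the supplied context.",
                }
            }
        ],
    }

print(json.dumps(ML_SETTINGS, indent=2))
print(json.dumps(build_search_pipeline("<llm-model-id>"), indent=2))
```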

For Drupal, the Search API Semantic and Search API OpenSearch Semantic modules handle indexing without needing a custom solution. Decoupled front ends can query OpenSearch directly with a hybrid neural + match query blended by the normalization processor. A context window of five chunks balances latency, cost, and accuracy well; 10–15 chunks improves accuracy for specific queries but increases both latency and cost.
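For a decoupled front end, the request body for such a hybrid query might look like the sketch below. The field names `text` and `embedding` and the model ID are assumptions, and Bedrock-hosted models may need additional `generative_qa_parameters` beyond the two shown, so treat this as a starting point rather than a complete request.

```python
import json

def build_hybrid_query(question: str, embedding_model_id: str,
                       context_size: int = 5) -> dict:
    """Search body combining a keyword match and a neural (vector) sub-query."""
    return {
        "size": context_size,
        "_source": {"excludes": ["embedding"]},  # no need to return raw vectors
        "query": {
            "hybrid": {
                "queries": [
                    # Keyword sub-query on the chunk text.
                    {"match": {"text": {"query": question}}},
                    # Semantic sub-query: the question is embedded at query time.
                    {
                        "neural": {
                            "embedding": {
                                "query_text": question,
                                "model_id": embedding_model_id,
                                "k": context_size,
                            }
                        }
                    },
                ]
            }
        },
        # Parameters read by the RAG response processor in the search pipeline.
        "ext": {
            "generative_qa_parameters": {
                "llm_question": question,
                "context_size": context_size,
            }
        },
    }

body = build_hybrid_query("how do I cancel my subscription?", "<embedding-model-id>")
print(json.dumps(body, indent=2))
```

The body would be sent to the index's `_search` endpoint with the `search_pipeline` query parameter naming the pipeline that contains the normalization and RAG processors.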

Cost and latency trade-offs

Semantic search itself is inexpensive. RAG passes content through two models and incurs per-request costs on Bedrock. For high-traffic sites, it's worth building a guardrail: for example, a feature flag that disables the RAG layer if costs hit a threshold, while still returning semantic results.
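One way such a guardrail might look, as a minimal sketch: a budget tracker that switches off the RAG step once estimated spend would exceed a threshold. The class, the cost figures, and the `semantic_search` and `rag_summary` callables are all hypothetical, and a production version would persist the running total somewhere shared.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class RagBudgetGuard:
    """Tracks estimated spend and reports whether RAG may still run."""
    monthly_budget: float
    cost_per_request: float
    spent: float = 0.0

    def rag_enabled(self) -> bool:
        return self.spent + self.cost_per_request <= self.monthly_budget

    def record_request(self) -> None:
        self.spent += self.cost_per_request

def search(query: str,
           guard: RagBudgetGuard,
           semantic_search: Callable[[str], list[str]],
           rag_summary: Callable[[str, list[str]], str]) -> dict:
    """Always return semantic results; add a summary only while budget remains."""
    results = semantic_search(query)
    summary: Optional[str] = None
    if guard.rag_enabled():
        summary = rag_summary(query, results)
        guard.record_request()
    return {"results": results, "summary": summary}
```

Because the flag only gates the summary, users still get semantic results when the budget is exhausted.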

On latency, the multi-model pipeline adds some processing time. A useful pattern is to return search results immediately and lazy-load the RAG summary separately, so users aren't waiting before they see anything.
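The lazy-loading pattern can be illustrated with a small asyncio sketch. Both fetch functions are stand-ins that simply sleep; in a real site they would be two separate requests, a fast one for results and a slower one for the summary, with the front end rendering each as it arrives.

```python
import asyncio
from typing import Callable

async def fetch_results(query: str) -> list[str]:
    """Stand-in for the fast hybrid search request."""
    await asyncio.sleep(0.01)
    return [f"result for {query!r}"]

async def fetch_summary(query: str, results: list[str]) -> str:
    """Stand-in for the slower request that calls the LLM."""
    await asyncio.sleep(0.05)
    return f"summary of {len(results)} result(s)"

async def search_page(query: str, render: Callable[[str, object], None]) -> None:
    results = await fetch_results(query)
    render("results", results)   # shown to the user straight away
    summary = await fetch_summary(query, results)
    render("summary", summary)   # filled in once the LLM responds

events: list[str] = []
asyncio.run(search_page("cancel my subscription",
                        lambda kind, payload: events.append(kind)))
print(events)
```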

The bottom line

Keyword search still works for exact matches, but user behaviour has shifted toward conversational queries, and users now expect direct answers. Semantic search closes the gap by understanding meaning. Hybrid search preserves keyword signal alongside it. RAG converts results into grounded answers. OpenSearch and Bedrock give you the tools to build all of this today, integrated with your existing Drupal content via Search API.

The code examples referenced in this talk can be found on Daniel's GitHub.

Photo credit: Karl Hepworth - https://www.flickr.com/people/200855369@N08/

License: ShareAlike 2.0 - https://creativecommons.org/licenses/by-sa/2.0/deed.en