
Building AI Search: What I Learned Along the Way

Written by: Noah Learner Tags: community, data-pipelines, critical thinking

Published: Dec 28, 2025

When we set out to build AI-powered search for The SEO Community archive, it sounded straightforward: hook up an LLM to our message database and let it answer questions. But we knew there would be much more to it.

What followed was three days of rapid iteration, testing, and learning. We tried six different approaches before landing on something that actually works well. Here's what we learned.


The problem: much gold, but scattered

The archive has over 18,500 messages across dozens of channels. Traditional keyword search works fine when you know exactly what you're looking for, like "Screaming Frog" or a specific URL. But what about questions like:

"What does the community think about AI overviews affecting CTR?"

Keywords alone can't capture the intent behind that question. We needed semantic understanding, not just string matching.

But here's the real challenge we didn't anticipate: consistency. Users asking essentially the same question with different words should get the same answer. "Should I use llms.txt?" and "Should I implement llms.txt?" are the same question. Our search needed to understand that.

The Journey: Six approaches in three days

[Timeline: six iterations over three days. Day 1: Algolia + Gemini (partial success), stop word filtering (failed), AI entity extraction (partial). Day 2: query normalization (failed), multi-query + RRF (partial), vector search (success)]

Our development journey: six approaches, three days, one solution

Day 1, Morning

Approach 1: Algolia + Gemini
Our first attempt was the obvious one: use Algolia (which we already had for keyword search) to find relevant messages, then send them to Gemini for summarization.

Day 1, Afternoon

Approach 2: Stop Word Filtering
When "What do we know about llms.txt?" returned poor results, we tried stripping question words before searching Algolia.

Day 1, Evening

Approach 3: AI Entity Extraction
Instead of a static stop word list, we used Gemini to extract the core topic from each query before searching.

Day 2, Morning

Approach 4: Query Normalization
We tried normalizing semantically equivalent queries to a canonical form before searching and caching.

Day 2, Afternoon

Approach 5: Multi-Query + RRF
Run three query variations in parallel, then merge results using Reciprocal Rank Fusion.

Day 2, Evening

Approach 6: Vector Search + Query Expansion
Replace keyword search entirely with vector embeddings. Add SEO-specific term expansion.

Approach 1: Algolia Keyword Search + Gemini

[Flowchart: User Query → Algolia search (keyword matching) → top 20 messages → build context string → Gemini 2.5 Flash → summary with citations]

Our initial architecture: simple but limited by keyword matching

The Idea: Worked, But Inconsistent

Use Algolia to find the top 20 messages matching the query keywords, then pass them to Gemini 2.5 Flash to generate a summarized answer with citations.
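The context-building step can be sketched as follows; the message fields and citation format here are illustrative assumptions, not the actual implementation:

```javascript
// Hypothetical sketch: number each retrieved message so Gemini can
// cite it as [1], [2], ... in its summary.
function buildContext(messages) {
  return messages
    .map((msg, i) => `[${i + 1}] ${msg.author} in #${msg.channel}: ${msg.text}`)
    .join('\n\n')
}

const context = buildContext([
  { author: 'alice', channel: 'seo-news', text: 'AI Overviews cut our CTR on informational queries.' },
  { author: 'bob', channel: 'lounge', text: 'We saw the same pattern after the rollout.' },
])
// context begins "[1] alice in #seo-news: ..."
```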

The Problem

Keyword search is literal. When a user asked about "GSC," Algolia looked for messages containing the letters "GSC," not messages discussing Google Search Console. Worse, semantically equivalent queries returned completely different results:

[Comparison: the same question with two different verbs returned different result sets]

Query A: "should I use llms.txt?" found 15 messages about llms.txt.
Query B: "should I implement llms.txt?" found 8 different messages.

The same question, phrased differently, got different answers. That's a terrible user experience.

Lesson Learned

Keyword search can't handle synonyms, abbreviations, or rephrased questions. For semantic queries, you need semantic search.

Approach 2: Stop Word Filtering

The Idea: Made Things Worse

Strip common question words (what, how, should, about, etc.) before sending queries to Algolia. Keep only the "meaningful" keywords.

The Implementation

// Common question words to strip before searching Algolia
const STOP_WORDS = [
  'what', 'how', 'should', 'about', 'do', 'does',
  'can', 'could', 'would', 'is', 'are', 'the', 'a', 'an',
  'we', 'i', 'you', 'they', 'know', 'think', 'use'
]

// Keep only the "meaningful" keywords from a query
function extractKeywords(query) {
  return query
    .toLowerCase()
    .split(/\s+/)
    .filter(word => !STOP_WORDS.includes(word))
    .join(' ')
}

The Problem

This was too aggressive. "What do we know about llms.txt?" became just "llms.txt," which sometimes helped but often lost important context. And it didn't solve the core problem: "use" and "implement" are different keywords that mean the same thing.

Worse, the stop word list kept growing. Every edge case needed a new word added. It was a game of whack-a-mole.

Lesson Learned

Rule-based text processing doesn't scale. You can't anticipate every way users will phrase questions. You need a smarter approach.

Approach 3: AI Entity Extraction

The Idea: Better, But Expensive

Instead of a static stop word list, use Gemini to intelligently extract the core topic from each query. Let AI understand what the user is really asking about.

The Implementation

We added a fast "extraction" call before searching:

// First LLM call: Extract topics (fast, low tokens)
const extractionPrompt = `
Extract the core topic/entity from this question.
Return only the key terms, no explanation.

Question: "should we implement llms.txt?"
Answer: llms.txt

Question: "${userQuery}"
Answer:`

const searchTerms = await geminiFlash.generate(extractionPrompt)

// Then search Algolia with extracted terms
const results = await algolia.search(searchTerms)

What Worked

This handled synonyms better. "CWV" correctly extracted to "Core Web Vitals." Questions about "GSC" found messages about "Google Search Console."

What Didn't Work

Two problems: cost and latency. Every search now required two LLM calls instead of one. The extraction call was fast (~200ms), but it added up. And we still had inconsistent results because Algolia was still doing keyword matching on the extracted terms.

Before (single LLM call): ~$0.003 per query
After (two LLM calls): ~$0.0033 per query (+10%)

Lesson Learned

AI can understand intent better than rules, but adding LLM calls adds latency and cost. Each call should provide significant value.

Approach 4: Query Normalization

The Idea: Overcomplicated

Normalize semantically equivalent queries to a canonical form. Cache results by normalized query so users asking the same thing get cache hits.

The Implementation

const normalizationPrompt = `
Normalize this question to its canonical form.
Remove personal pronouns, use present tense, standardize phrasing.

"should I use llms.txt?" → "using llms.txt"
"should we implement llms.txt?" → "using llms.txt"
"what is llms.txt?" → "llms.txt overview"

Question: "${userQuery}"
Normalized:`

The Problem

This added a third LLM call to each search. And the normalization itself was inconsistent: Gemini might normalize the same query differently on different calls. We were trying to use AI to create consistency, but AI is inherently probabilistic.

We were solving the wrong problem. The issue wasn't caching or normalization. The issue was that keyword search fundamentally can't do semantic matching.

Lesson Learned

Don't add complexity to work around a fundamental limitation. Fix the root cause instead. We were putting bandaids on keyword search when we needed to replace it entirely.

Approach 5: Multi-Query + Reciprocal Rank Fusion

[Flowchart: the user query splits into three parallel branches (the original plus two AI-generated variations), each searched in Algolia, merged via Reciprocal Rank Fusion into the top 20 messages, then sent to Gemini]

Multi-query with RRF: parallel searches merged with rank fusion scoring

The Idea: 57% Consistency

Run multiple query variations in parallel, then merge results using Reciprocal Rank Fusion (RRF), a technique from information retrieval research.

How RRF Works

RRF combines results from multiple search queries by scoring each document based on its rank across all queries:

// For each document, sum: 1 / (k + rank) across all queries
// k is typically 60

RRF_score = Σ (1 / (60 + rank_in_query_i))

// Document ranked #1 in one query and #5 in another:
// Score = 1/61 + 1/65 = 0.0164 + 0.0154 = 0.0318
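A minimal RRF merge along those lines might look like this; the function shape and variable names are our own sketch, not the post's actual code:

```javascript
// Merge several ranked lists of message IDs with Reciprocal Rank Fusion.
// k = 60 is the conventional constant.
function rrfMerge(resultLists, k = 60) {
  const scores = new Map()
  for (const list of resultLists) {
    list.forEach((id, index) => {
      const rank = index + 1 // ranks are 1-based
      scores.set(id, (scores.get(id) || 0) + 1 / (k + rank))
    })
  }
  // Highest fused score first
  return [...scores.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id)
}

// 'm2' ranks #2 in both lists (1/62 + 1/62) and beats 'm1',
// which ranks #1 in one list but only #5 in the other (1/61 + 1/65).
const merged = rrfMerge([
  ['m1', 'm2', 'm3'],
  ['m4', 'm2', 'm5', 'm6', 'm1'],
])
```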

The Results

We built a test suite with 34 pairs of semantically equivalent queries and measured result overlap:

  • 57% average overlap
  • 35% of query pairs good (≥70% overlap)
  • 62% of query pairs OK (≥50% overlap)

Better than before, but still not good enough. And we were now making 3 Algolia queries plus 1 LLM call for query generation per search. The complexity was getting out of hand.

Lesson Learned

You can improve keyword search with clever techniques, but you're still limited by the fundamental approach. Sometimes the answer is to change the approach entirely.

Approach 6: Vector Search + Query Expansion

The Final Solution: 72% Consistency, Grade A

Replace keyword search entirely with vector embeddings. Convert every message to a 768-dimensional vector. Convert queries to vectors. Find messages with similar vectors.

The Key Insight

Vector embeddings capture meaning, not just words. When you embed "should I use llms.txt?" and "should I implement llms.txt?", they produce nearly identical vectors because they mean the same thing.

[Diagram: "should I use llms.txt?" and "should I implement llms.txt?" map to nearly the same point in vector space]

The magic: semantic similarity becomes geometric similarity. Finding related content becomes finding nearby points in vector space.
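Here's the geometry in miniature, using cosine similarity (the measure behind the COSINE distance used later); the three-dimensional vectors are toy examples standing in for real 768-dimensional embeddings:

```javascript
// Cosine similarity: 1.0 = same direction (same meaning), near 0 = unrelated.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Toy stand-ins for the embeddings of the two llms.txt questions
const useQuery = [0.9, 0.1, 0.3]
const implementQuery = [0.88, 0.12, 0.31]
const unrelated = [0.1, 0.9, 0.05]

cosineSimilarity(useQuery, implementQuery) // very close to 1
cosineSimilarity(useQuery, unrelated)      // far lower
```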

The Implementation

One-Time Setup

  1. Take all 2,070 archive messages
  2. Generate a 768-dim vector for each with text-embedding-004
  3. Store the vector as a field on each Firestore document

Per Query

  1. User query: expand SEO terms, then generate an embedding
  2. Firestore vector search: findNearest(COSINE)
  3. Top 100 most semantically similar messages
  4. Gemini: summary with citations

Query Expansion for SEO Terms

Vector search handles synonyms well, but abbreviations are trickier. "GSC" and "Google Search Console" have different embeddings because they're different strings. So we added a simple expansion layer:

[Diagram: "What's happening in GSC?" passes through query expansion and becomes "What's happening in GSC? Google Search Console"; mappings include GSC → Google Search Console, GA4 → Google Analytics 4, SGE → Search Generative Experience]

Query expansion: preserving original intent while adding SEO terminology
const QUERY_EXPANSIONS = {
  'GSC': 'Google Search Console',
  'GA4': 'Google Analytics 4',
  'SF': 'Screaming Frog crawler',
  'SGE': 'Search Generative Experience AI Overview',
  'CWV': 'Core Web Vitals',
  'E-E-A-T': 'Experience Expertise Authoritativeness Trustworthiness',
  // ... 70+ mappings
}

// "What's happening in GSC?" becomes
// "What's happening in GSC? Google Search Console"

The original query stays intact (preserving intent), but we append the expanded terms so the embedding captures both the abbreviation and full name.
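A sketch of that expansion step; the function name and word-boundary matching are our assumptions, and only a few of the 70+ mappings are shown:

```javascript
const QUERY_EXPANSIONS = {
  'GSC': 'Google Search Console',
  'GA4': 'Google Analytics 4',
  'CWV': 'Core Web Vitals',
}

// Append the full term for any known abbreviation, keeping the
// original query text intact so its intent is preserved.
function expandQuery(query) {
  const additions = []
  for (const [abbr, full] of Object.entries(QUERY_EXPANSIONS)) {
    // Word-boundary match so 'GSC' doesn't fire inside another token
    if (new RegExp(`\\b${abbr}\\b`, 'i').test(query)) additions.push(full)
  }
  return additions.length ? `${query} ${additions.join(' ')}` : query
}

expandQuery("What's happening in GSC?")
// → "What's happening in GSC? Google Search Console"
```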

The Results

  • 72.2% average overlap
  • 53% of query pairs good (≥70% overlap)
  • 85% of query pairs OK (≥50% overlap)

Here's how specific test cases improved:

Query Pair                                                    Before   After
"GSC performance" vs "Google Search Console performance"        52%     82%
"AI search optimization" vs "SGE optimization"                  16%     80%
"should I use llms.txt" vs "should I implement llms.txt"        48%     91%
"how to build backlinks" vs "how to create backlinks"           55%     88%
[Bar chart: the same before-and-after improvements visualized]

Before and after: query expansion dramatically improved consistency for SEO terms

The Final Architecture

[Architecture diagram: User Query → query expansion (append SEO terms) → embedding generation (text-embedding-004, 768-dim vector) → Firestore vector search (findNearest with COSINE) → context building (numbered message citations) → Gemini 2.5 Flash → JSON response with summary, citations, sourceMessages, and meta]

The complete AI search architecture: from query to cited response

Cost Breakdown

One surprise: vector search ended up being cheaper than our earlier approaches.

[Comparison: the earlier complex approach used 3 LLM calls at ~$0.01/query with inconsistent results; the final approach uses 1 embedding plus 1 LLM call at ~$0.003/query with 72% consistency]

Simpler architecture, lower cost, better results
Component                      Cost       Notes
Initial embedding generation   ~$0.03     One-time for 2,070 messages
Query embedding                ~$0.0001   Per query
Firestore vector search        Free       Included in Firestore reads
Gemini 2.5 Flash response      ~$0.003    Per query
Total per query                ~$0.003

Compare this to Approach 4 (normalization + extraction + response) which cost ~$0.01 per query and still had worse results.

Cost projection: At 10,000 queries/month, the AI search costs ~$30/month. The initial embedding generation for the entire archive was $0.03 total.
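The projection is straightforward arithmetic on the per-query figure above:

```javascript
// Monthly cost at the figures quoted above
const costPerQuery = 0.003      // embedding + Gemini response, USD
const monthlyQueries = 10000
const monthlyCost = costPerQuery * monthlyQueries // ≈ $30/month
```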

Key Takeaways

1. Semantic search needs semantic technology

We spent two days trying to make keyword search behave semantically. It can't. Vector embeddings solve this at a fundamental level.

2. Measure what matters

Our test suite with 34 query pairs and overlap measurement was invaluable. Without it, we'd still be guessing whether changes helped.
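The overlap measurement could be as simple as the sketch below; the post doesn't give its exact formula, so normalizing by the smaller result set is an assumption:

```javascript
// Fraction of results shared between two semantically equivalent queries
function resultOverlap(idsA, idsB) {
  const setB = new Set(idsB)
  const shared = idsA.filter(id => setB.has(id)).length
  return shared / Math.min(idsA.length, idsB.length)
}

resultOverlap(['m1', 'm2', 'm3', 'm4'], ['m2', 'm3', 'm4', 'm5'])
// → 0.75
```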

3. Simpler is often better

Our final solution has fewer moving parts than approaches 3, 4, and 5. One embedding call, one vector search, one LLM call. That's it.

4. Domain-specific knowledge still matters

Vector search handles synonyms, but not abbreviations. The query expansion map for SEO terms boosted our score from 67% to 72%. Small additions, big impact.

5. Iterate fast, test everything

We built a test endpoint that let us curl queries and compare results in real time. This rapid iteration loop was essential to trying six approaches in three days, and it simply wasn't possible for me two or more years ago. I got to experiment with several approaches and learn quickly.

6. Keep learnings in a Markdown file as you work

After finishing this project, I landed on an idea I'll use from now on. When I'm testing several methods and want to carry learnings forward, I'll have Claude write the learnings from each iteration to a learnings.md file so that we build with shared context. This helps both Claude and, more importantly, me remember how we reached different outcomes.

What's next

The current system works well, but there's always room to improve:

  • Hybrid search: Combine vector search with keyword matching for queries that include specific names or URLs
  • Conversation context: Use previous questions to inform follow-up queries in the same session
  • Source diversity: Ensure answers cite messages from multiple channels and authors, not just the most semantically similar
  • Feedback loop: Track which citations users click to understand what's actually helpful

Building AI features is an iterative process. You start with assumptions, hit reality, adjust, and repeat. The key is having good metrics and being willing to throw away approaches that don't work, even if you spent time on them.

Try it yourself

  1. AI search is live in The SEO Community archive.
  2. Log in with your Slack credentials.
  3. Toggle to "Ask AI" mode and try some questions.
  4. See if it finds what you're looking for.
  5. And if it doesn't? Let us know in #feedback.
  6. That's how we'll make it better.

This feature was built over December 26-28, 2025, with a lot of trial and error. Questions about the implementation? Ask in #lounge or DM Noah.


