RAG with PostgreSQL and C#

Happy New Year 2026 — let the year begin

Happy New Year 2026 🎉

Let’s start the year with something honest.

This article exists because something broke.

I wasn’t trying to build a demo.
I was building an activity stream — the kind of thing every social or collaborative system eventually needs.

Posts.
Comments.
Reactions.
Short messages.
Long messages.
Noise.

At some point, the obvious question appeared:

“Can I do RAG over this?”

That question turned into this article.

The Original Problem: RAG over an Activity Stream

An activity stream looks simple until you actually use it as input.

In my case:

The UI language was English
The content language was… everything else

Users were writing:

Spanish
Russian
Italian
English
Sometimes all of them in the same message

Perfectly normal for humans.
Absolutely brutal for naïve RAG.

I tried the obvious approach:

embed everything
store vectors
retrieve similar content
augment the prompt

And very quickly, RAG went crazy.

Why It Failed (And Why This Matters)

The failure wasn’t dramatic.
No exceptions.
No errors.

Just… wrong answers.

Confident answers.
Fluent answers.
Wrong answers.

The problem was subtle:

Same concept, different languages
Mixed-language sentences
Short, informal activity messages
No guarantee of language consistency

In an activity stream:

You don’t control the language
You don’t control the structure
You don’t even control what a “document” is

And RAG assumes you do.

That’s when I stopped and realized:

RAG is not “plug-and-play” once your data becomes messy.

So… What Is RAG Really?

RAG stands for Retrieval-Augmented Generation.

The idea is simple:

Retrieve relevant data first, then let the model reason over it.

Instead of asking the model to remember everything, you let it look things up.

Search first.
Generate second.

Sounds obvious.
Still easy to get wrong.

The Real RAG Pipeline (No Marketing)

A real RAG system looks like this:

Your data lives in a database
Text is split into chunks
Each chunk becomes an embedding
Embeddings are stored as vectors
A user asks a question
The question is embedded
The closest vectors are retrieved
Retrieved content is injected into the prompt
The model answers

Every step can fail silently.

Tokenization & Chunking (The First Trap)

Models don’t read text.
They read tokens.

This matters because:

prompts have hard limits
activity streams are noisy
short messages lose context fast

You usually don’t tokenize manually, but you do choose:

chunk size
overlap
grouping strategy

In activity streams, chunking is already a compromise — and multilingual content makes it worse.

Embeddings in .NET (Microsoft.Extensions.AI)

In .NET, embeddings are generated using Microsoft.Extensions.AI.

The important abstraction is:

IEmbeddingGenerator<TInput, TEmbedding>

This keeps your architecture:

provider-agnostic
DI-friendly
survivable over time

Minimal Setup

dotnet add package Microsoft.Extensions.AI
dotnet add package Microsoft.Extensions.AI.OpenAI

Creating an Embedding Generator

using OpenAI;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.AI.OpenAI;

var client = new OpenAIClient("YOUR_API_KEY");

IEmbeddingGenerator<string, Embedding<float>> embeddings =
    client.AsEmbeddingGenerator("text-embedding-3-small");

Generating a Vector

var result = await embeddings.GenerateAsync(
    new[] { "Some activity text" });

float[] vector = result.First().Vector.ToArray();

That vector is what drives everything that follows.

⚠️ Embeddings Are Model-Locked (And Language Makes It Worse)

Embeddings are model-locked.

Meaning:

Vectors from different embedding models cannot be compared.

Even if:

the dimension matches
the text is identical
the provider is the same

Each model defines its own universe.

But here’s the kicker I learned the hard way:

Multilingual content amplifies this problem.

Even with multilingual-capable models:

language mixing shifts vector space
short messages lose semantic anchors
similarity becomes noisy

In an activity stream:

English UI
Spanish content
Russian replies
Emoji everywhere

Vector distance starts to mean “kind of related, maybe”.

That’s not good enough.

PostgreSQL + pgvector (Still the Right Choice)

Despite all that, PostgreSQL with pgvector is still the right foundation.

Enable pgvector

CREATE EXTENSION IF NOT EXISTS vector;

Chunk-Based Table

CREATE TABLE doc_chunks (
    id            bigserial PRIMARY KEY,
    document_id   bigint NOT NULL,
    chunk_index   int NOT NULL,
    content       text NOT NULL,
    embedding     vector(1536) NOT NULL,
    created_at    timestamptz NOT NULL DEFAULT now()
);

Technically correct.
Architecturally incomplete — as I later discovered.

Retrieval: Where Things Quietly Go Wrong

SELECT content
FROM doc_chunks
ORDER BY embedding <=> @query_embedding
LIMIT 5;

This query decides:

what the model sees
what it ignores
how wrong the answer will be

When language is mixed, retrieval looks correct — but isn’t.

Classic example: Moscow

Spanish: Moscú
Italian: Mosca
Meaning in Spanish: 🪰 a fly

So for a Spanish speaker, “Mosca” looks like it should mean insect (which it does), but it’s also the Italian name for Moscow.

Why RAG Failed in This Scenario

Let’s be honest:

Similar ≠ relevant
Multilingual ≠ multilingual-safe
Short activity messages ≠ documents
Noise ≠ knowledge

RAG didn’t fail because the model was bad.
It failed because the data had no structure.

Why This Article Exists

This article exists because:

I tried RAG on a real system
With real users
Writing in real languages
In real combinations

And the naïve RAG approach didn’t survive.

What Comes Next

The next article will not be about:

embeddings
models
APIs

It will be about structured RAG.

How I fixed this by:

introducing structure into the activity stream
separating concerns in the pipeline
controlling language before retrieval
reducing semantic noise
making RAG predictable again

In other words:
How to make RAG work after it breaks.

Final Thought

RAG is not magic.

It’s:

search + structure + discipline

If your data is chaotic, RAG will faithfully reflect that chaos — just with confidence.

Happy New Year 2026 🎆

If you’re reading this:
Happy New Year 2026.

Let’s make this the year we stop trusting demos
and start trusting systems that survived reality.

Let the year begin 🚀

RAG with PostgreSQL and C# (pros and cons)