by Joche Ojeda | Feb 16, 2026 | A.I, Apps, CLI, Github Copilot, SDK
A strange week
This week I went to the university every day to study Russian.
Learning a new language as an adult is a very humbling experience. One moment you are designing enterprise architectures, and the next moment you are struggling to say:
me siento bien (Spanish for “I feel good”)
which in Russian is: я чувствую себя хорошо
So like any developer, I started cheating immediately.
I began using AI for everything:
- ChatGPT to review my exercises
- GitHub Copilot inside VS Code correcting my grammar
- Sometimes both at the same time
It worked surprisingly well. Almost too well.
At some point during the week, while going back and forth between my Russian homework and my development work, I noticed something interesting.
I was using several AI tools, but the one I kept returning to the most — without even thinking about it — was GitHub Copilot inside Visual Studio Code.
Not in the browser. Not in a separate chat window. Right there in my editor.
That’s when something clicked.
Two favorite tools
XAF is my favorite application framework. I’ve built countless systems with it — ERPs, internal tools, experiments, prototypes.
GitHub Copilot has become my favorite AI agent.
I use it constantly:
- writing code
- reviewing ideas
- fixing small mistakes
- even correcting my Russian exercises
And while using Copilot so much inside Visual Studio Code, I started thinking:
What would it feel like to have Copilot inside my own applications?
Not next to them. Inside them.
That idea stayed in my head for a few days until curiosity won.
The innocent experiment
I discovered the GitHub Copilot SDK.
At first glance it looked simple: a .NET library that allows you to embed Copilot into your own applications.
My first thought:
“Nice. This should take 30 minutes.”
Developers should always be suspicious of that sentence.
Because it never takes 30 minutes.
First success (false confidence)
The initial integration was surprisingly easy.
I managed to get a basic response from Copilot inside a test environment. Seeing AI respond from inside my own application felt a bit surreal.
For a moment I thought:
Done. Easy win.
Then I tried to make it actually useful.
That’s when the adventure began.
The rabbit hole
I didn’t want just a chatbot.
I wanted an agent that could actually interact with the application.
Ask questions. Query data. Help create things.
That meant enabling tool calling and proper session handling.
And suddenly everything started failing.
Timeouts. Half responses. Random behavior depending on the model. Sessions hanging for no clear reason.
At first I blamed myself.
Then my integration. Then threading. Then configuration.
Three or four hours later, after trying everything I could think of, I finally discovered the real issue:
It wasn’t my code.
It was the model.
Some models were timing out during tool calls. Others worked perfectly.
The moment I switched models and everything suddenly worked was one of those small but deeply satisfying developer victories.
You know the moment.
You sit back. Look at the screen. And just smile.
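One generic way to surface that kind of failure faster is to give each model its own timeout and fall through to the next candidate. The sketch below is not the Copilot SDK API: `ModelFallback`, `callAgent`, and the model names are hypothetical stand-ins, and it only helps if the underlying call honors the cancellation token.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Usage with a fake agent call: "slow-model" never answers, "fast-model" does.
// Both names are made up for this sketch.
Func<string, CancellationToken, Task<string>> fakeCall = async (model, ct) =>
{
    if (model == "slow-model") await Task.Delay(Timeout.Infinite, ct);
    return $"answer from {model}";
};

var (usedModel, answer) = await ModelFallback.RunAsync(
    new[] { "slow-model", "fast-model" }, fakeCall, TimeSpan.FromMilliseconds(200));
Console.WriteLine($"{usedModel}: {answer}");

public static class ModelFallback
{
    // Tries each model in order; a model that exceeds perModelTimeout is skipped.
    public static async Task<(string Model, string Answer)> RunAsync(
        IEnumerable<string> models,
        Func<string, CancellationToken, Task<string>> callAgent,
        TimeSpan perModelTimeout)
    {
        foreach (var model in models)
        {
            // The token is cancelled automatically once the timeout elapses.
            using var cts = new CancellationTokenSource(perModelTimeout);
            try
            {
                var answer = await callAgent(model, cts.Token);
                return (model, answer);
            }
            catch (OperationCanceledException) when (cts.IsCancellationRequested)
            {
                // Log which model stalled so the cause is visible immediately.
                Console.Error.WriteLine(
                    $"{model} timed out after {perModelTimeout.TotalSeconds}s, trying next model.");
            }
        }
        throw new TimeoutException("Every candidate model timed out.");
    }
}
```

Logging the model name on every timeout is the point of the design: it turns "sessions hang randomly" into "this specific model hangs during tool calls".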
The moment it worked
Once everything was connected properly, something changed.
Copilot stopped feeling like a coding assistant and started feeling like an agent living inside the application.
Not in the IDE. Not in a browser tab. Inside the system itself.
That changes the perspective completely.
Instead of building forms and navigation flows, you start thinking:
What if the user could just ask?
Instead of:
- open this screen
- filter this grid
- generate this report
You imagine:
- “Show me what matters.”
- “Create what I need.”
- “Explain this data.”
The interface becomes conversational.
And once you see that working inside your own application, it’s very hard to unsee it.
Why this experiment mattered to me
This wasn’t about building a feature for a client. It wasn’t even about shipping production code.
Most of my work is research and development. Prototypes. Ideas. Experiments.
And this experiment changed the way I see enterprise applications.
For decades we optimized screens, menus, and workflows.
But AI introduces a completely different interaction model.
One where the application is no longer just something you navigate.
It’s something you talk to.
Also… Russian homework
Ironically, this whole experiment started because I was trying to survive my Russian classes.
Using Copilot to correct grammar. Using AI to review exercises. Switching constantly between tools.
Eventually that daily workflow made me curious:
What happens if Copilot is not next to my application, but inside it?
Sometimes innovation doesn’t start with a big strategy.
Sometimes it starts with curiosity and a small personal frustration.
What comes next
This is just the beginning.
Now that AI can live inside applications:
- conversations can become interfaces
- tools can be invoked by language
- workflows can become more flexible
We are moving from:
software you operate
to:
software you collaborate with
And honestly, that’s a very exciting direction.
Final thought
This entire journey started with a simple curiosity while studying Russian and writing code in the same week.
A few hours of experimentation later, Copilot was living inside my favorite framework.
And now I can’t imagine going back.
Note: The next article will go deep into the technical implementation — the architecture, the service layer, tool calling, and how I wired everything into XAF for both Blazor and WinForms.
by Joche Ojeda | Jan 5, 2026 | A.I, Postgres
RAG with PostgreSQL and C#
Happy New Year 2026 — let the year begin
Happy New Year 2026 🎉
Let’s start the year with something honest.
This article exists because something broke.
I wasn’t trying to build a demo.
I was building an activity stream — the kind of thing every social or collaborative system eventually needs.
Posts.
Comments.
Reactions.
Short messages.
Long messages.
Noise.
At some point, the obvious question appeared:
“Can I do RAG over this?”
That question turned into this article.
The Original Problem: RAG over an Activity Stream
An activity stream looks simple until you actually use it as input.
In my case:
- The UI language was English
- The content language was… everything else
Users were writing:
- Spanish
- Russian
- Italian
- English
- Sometimes all of them in the same message
Perfectly normal for humans.
Absolutely brutal for naïve RAG.
I tried the obvious approach:
- embed everything
- store vectors
- retrieve similar content
- augment the prompt
And very quickly, RAG went crazy.
Why It Failed (And Why This Matters)
The failure wasn’t dramatic.
No exceptions.
No errors.
Just… wrong answers.
Confident answers.
Fluent answers.
Wrong answers.
The problem was subtle:
- Same concept, different languages
- Mixed-language sentences
- Short, informal activity messages
- No guarantee of language consistency
In an activity stream:
- You don’t control the language
- You don’t control the structure
- You don’t even control what a “document” is
And RAG assumes you do.
That’s when I stopped and realized:
RAG is not “plug-and-play” once your data becomes messy.
So… What Is RAG Really?
RAG stands for Retrieval-Augmented Generation.
The idea is simple:
Retrieve relevant data first, then let the model reason over it.
Instead of asking the model to remember everything, you let it look things up.
Search first.
Generate second.
Sounds obvious.
Still easy to get wrong.
The Real RAG Pipeline (No Marketing)
A real RAG system looks like this:
- Your data lives in a database
- Text is split into chunks
- Each chunk becomes an embedding
- Embeddings are stored as vectors
- A user asks a question
- The question is embedded
- The closest vectors are retrieved
- Retrieved content is injected into the prompt
- The model answers
Every step can fail silently.
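The "injected into the prompt" step is the easiest one to picture in code. Here is a minimal sketch, assuming the chunks have already been retrieved. `PromptBuilder`, the instruction wording, and the character budget are illustrative choices, and the character count is only a crude stand-in for a real token budget.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Usage: two retrieved activity messages and a question (sample data).
var prompt = PromptBuilder.Build(
    "Who reacted to the release post?",
    new[] { "Ana: great release!", "Ivan: отличная работа" },
    maxContextChars: 200);
Console.WriteLine(prompt);

public static class PromptBuilder
{
    // Builds a prompt that places retrieved chunks ahead of the user question.
    public static string Build(
        string question, IEnumerable<string> retrievedChunks, int maxContextChars)
    {
        var sb = new StringBuilder();
        sb.AppendLine("Answer using only the context below. If the context is not enough, say so.");
        sb.AppendLine("Context:");

        int used = 0, index = 1;
        foreach (var chunk in retrievedChunks)
        {
            // Stop adding context once the budget would be exceeded.
            if (used + chunk.Length > maxContextChars) break;
            sb.AppendLine($"[{index++}] {chunk}");
            used += chunk.Length;
        }

        sb.AppendLine();
        sb.Append("Question: ").Append(question);
        return sb.ToString();
    }
}
```

Note the silent failure hiding even here: when the budget runs out, chunks are dropped without any error, and the model simply answers from less context.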
Tokenization & Chunking (The First Trap)
Models don’t read text.
They read tokens.
This matters because:
- prompts have hard limits
- activity streams are noisy
- short messages lose context fast
You usually don’t tokenize manually, but you do choose:
- chunk size
- overlap
- grouping strategy
In activity streams, chunking is already a compromise — and multilingual content makes it worse.
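To make chunk size and overlap concrete, here is a minimal sketch that splits text into overlapping word windows. It counts words, not model tokens, so it only approximates what a tokenizer-aware chunker would do. `Chunker` and its parameters are illustrative names.

```csharp
using System;
using System.Collections.Generic;

// Usage: windows of 4 words, each sharing 2 words with the previous one.
var chunks = Chunker.SplitWords(
    "one two three four five six seven", chunkSize: 4, overlap: 2);
foreach (var chunk in chunks)
    Console.WriteLine(chunk);

public static class Chunker
{
    // Splits text into chunks of at most chunkSize words. Each chunk repeats
    // the last `overlap` words of the previous chunk to preserve context.
    public static List<string> SplitWords(string text, int chunkSize, int overlap)
    {
        if (chunkSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(chunkSize));
        if (overlap < 0 || overlap >= chunkSize)
            throw new ArgumentOutOfRangeException(nameof(overlap));

        var words = text.Split(
            new[] { ' ', '\t', '\n', '\r' }, StringSplitOptions.RemoveEmptyEntries);
        var result = new List<string>();
        int step = chunkSize - overlap; // how far the window advances each time

        for (int start = 0; start < words.Length; start += step)
        {
            int length = Math.Min(chunkSize, words.Length - start);
            result.Add(string.Join(" ", words, start, length));
            if (start + length >= words.Length) break; // window reached the end
        }
        return result;
    }
}
```

A short activity message usually fits in a single chunk, which is where the grouping strategy starts to matter more than size or overlap.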
Embeddings in .NET (Microsoft.Extensions.AI)
In .NET, a provider-neutral way to generate embeddings is Microsoft.Extensions.AI.
The important abstraction is:
IEmbeddingGenerator<TInput, TEmbedding>
This keeps your architecture:
- provider-agnostic
- DI-friendly
- survivable over time
Minimal Setup
dotnet add package Microsoft.Extensions.AI
dotnet add package Microsoft.Extensions.AI.OpenAI
Creating an Embedding Generator
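using OpenAI;
using Microsoft.Extensions.AI;
// The extension methods from Microsoft.Extensions.AI.OpenAI live in the
// Microsoft.Extensions.AI namespace. Current releases adapt an EmbeddingClient;
// older previews exposed client.AsEmbeddingGenerator(modelId) instead.
var client = new OpenAIClient("YOUR_API_KEY");
IEmbeddingGenerator<string, Embedding<float>> embeddings =
    client.GetEmbeddingClient("text-embedding-3-small")
          .AsIEmbeddingGenerator();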
Generating a Vector
// GenerateAsync takes a batch and returns one embedding per input.
var result = await embeddings.GenerateAsync(
    new[] { "Some activity text" });
float[] vector = result[0].Vector.ToArray();
That vector is what drives everything that follows.
⚠️ Embeddings Are Model-Locked (And Language Makes It Worse)
Embeddings are model-locked.
Meaning:
Vectors from different embedding models cannot be compared.
Even if:
- the dimension matches
- the text is identical
- the provider is the same
Each model defines its own universe.
But here’s the kicker I learned the hard way:
Multilingual content amplifies this problem.
Even with multilingual-capable models:
- language mixing shifts vector space
- short messages lose semantic anchors
- similarity becomes noisy
In an activity stream:
- English UI
- Spanish content
- Russian replies
- Emoji everywhere
Vector distance starts to mean “kind of related, maybe”.
That’s not good enough.
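The model-locked rule can be enforced in code by storing the model name next to every vector and refusing to compare across models. This is a minimal sketch: `StoredEmbedding` and `EmbeddingMath` are hypothetical names, and the tiny two-dimensional vectors are sample data.

```csharp
using System;

// Usage: same model compares fine; a different model is rejected.
var a = new StoredEmbedding("text-embedding-3-small", new float[] { 1f, 0f });
var b = new StoredEmbedding("text-embedding-3-small", new float[] { 1f, 1f });
Console.WriteLine(EmbeddingMath.CosineSimilarity(a, b));

// Keeps the producing model's identifier together with the vector.
public sealed record StoredEmbedding(string ModelId, float[] Vector);

public static class EmbeddingMath
{
    public static double CosineSimilarity(StoredEmbedding x, StoredEmbedding y)
    {
        // Matching dimensions are not enough: the model itself must match.
        if (!string.Equals(x.ModelId, y.ModelId, StringComparison.Ordinal))
            throw new InvalidOperationException(
                $"Cannot compare vectors from '{x.ModelId}' and '{y.ModelId}'.");
        if (x.Vector.Length != y.Vector.Length)
            throw new ArgumentException("Vectors must have the same dimension.");

        double dot = 0, normX = 0, normY = 0;
        for (int i = 0; i < x.Vector.Length; i++)
        {
            dot += x.Vector[i] * y.Vector[i];
            normX += x.Vector[i] * x.Vector[i];
            normY += y.Vector[i] * y.Vector[i];
        }
        if (normX == 0 || normY == 0)
            throw new ArgumentException("A zero vector has no direction.");
        return dot / (Math.Sqrt(normX) * Math.Sqrt(normY));
    }
}
```

This guard catches mixed models, but it does nothing about the multilingual noise described above: two vectors from the same model can still be close for the wrong reasons.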
PostgreSQL + pgvector (Still the Right Choice)
Despite all that, PostgreSQL with pgvector is still the right foundation.
Enable pgvector
CREATE EXTENSION IF NOT EXISTS vector;
Chunk-Based Table
CREATE TABLE doc_chunks (
    id          bigserial PRIMARY KEY,
    document_id bigint NOT NULL,
    chunk_index int NOT NULL,
    content     text NOT NULL,
    embedding   vector(1536) NOT NULL,  -- must match the embedding model's dimension
    created_at  timestamptz NOT NULL DEFAULT now()
);
Technically correct.
Architecturally incomplete — as I later discovered.
Retrieval: Where Things Quietly Go Wrong
-- <=> is pgvector's cosine distance operator (smaller means closer)
SELECT content
FROM doc_chunks
ORDER BY embedding <=> @query_embedding
LIMIT 5;
This query decides:
- what the model sees
- what it ignores
- how wrong the answer will be
When language is mixed, retrieval looks correct — but isn’t.
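Classic example: Moscow
In Spanish, “mosca” means fly (the insect). In Italian, “mosca” also means fly, but “Mosca” is the Italian name for Moscow. So a Spanish message about an insect and an Italian message about a city can end up looking like the same topic.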
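One detail of a top-5 query is easy to miss: it always returns up to five rows, even when none of them is actually close. Here is a minimal in-memory sketch of the same ranking with an added distance cutoff. `Retrieval`, the sample vectors, and the 0.5 threshold are assumptions for illustration, not tuned values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Usage: three stored chunks with tiny made-up embeddings.
var stored = new List<(string Content, float[] Embedding)>
{
    ("close match", new float[] { 1f, 0f }),
    ("weak match",  new float[] { 0.5f, 0.5f }),
    ("unrelated",   new float[] { 0f, 1f }),
};
var hits = Retrieval.TopK(stored, new float[] { 1f, 0f }, k: 5, maxDistance: 0.5);
foreach (var (content, distance) in hits)
    Console.WriteLine($"{distance:F3}  {content}");

public static class Retrieval
{
    // Cosine distance = 1 - cosine similarity (0 = same direction).
    public static double CosineDistance(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimension.");
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0)
            throw new ArgumentException("A zero vector has no direction.");
        return 1.0 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Like ORDER BY distance LIMIT k, but drops anything beyond maxDistance.
    public static List<(string Content, double Distance)> TopK(
        IEnumerable<(string Content, float[] Embedding)> chunks,
        float[] query, int k, double maxDistance)
        => chunks
            .Select(c => (c.Content, Distance: CosineDistance(c.Embedding, query)))
            .Where(x => x.Distance <= maxDistance)
            .OrderBy(x => x.Distance)
            .Take(k)
            .ToList();
}
```

A cutoff lets retrieval return nothing instead of forcing weak matches into the prompt. It does not fix cross-language collisions, because those can produce small distances too.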
Why RAG Failed in This Scenario
Let’s be honest:
- Similar ≠ relevant
- Multilingual ≠ multilingual-safe
- Short activity messages ≠ documents
- Noise ≠ knowledge
RAG didn’t fail because the model was bad.
It failed because the data had no structure.
Why This Article Exists
This article exists because:
- I tried RAG on a real system
- With real users
- Writing in real languages
- In real combinations
And the naïve RAG approach didn’t survive.
What Comes Next
The next article will be about structured RAG.
How I fixed this by:
- introducing structure into the activity stream
- separating concerns in the pipeline
- controlling language before retrieval
- reducing semantic noise
- making RAG predictable again
In other words:
How to make RAG work after it breaks.
Final Thought
RAG is not magic.
It’s:
search + structure + discipline
If your data is chaotic, RAG will faithfully reflect that chaos — just with confidence.
Happy New Year 2026 🎆
If you’re reading this:
Happy New Year 2026.
Let’s make this the year we stop trusting demos
and start trusting systems that survived reality.
Let the year begin 🚀