Closing the Loop: Letting AI Finish the Work

Closing the Loop: Letting AI Finish the Work

Last week I was in Sochi on a ski trip. Instead of skiing, I got sick.

So I spent a few days locked in a hotel room, doing what I always do when I can’t move much: working. Or at least what looks like work. In reality, it’s my hobby.

YouTube wasn’t working well there, so I downloaded a few episodes in advance. Most of them were about OpenClaw and its creator, Peter Steinberger — also known for building PSPDFKit.

What started as passive watching turned into one of those rare moments of clarity you only get when you’re forced to slow down.

Shipping Code You Don’t Read (In the Right Context)

In one of the interviews, Peter said something that immediately caught my attention: he ships code he doesn’t review.

At first that sounds reckless. But then I realized… I sometimes do the same.

However, context matters.

Most of my daily work is research and development. I build experimental systems, prototypes, and proofs of concept — either for our internal office or for exploring ideas with clients. A lot of what I write is not production software yet. It’s exploratory. It’s about testing possibilities.

In that environment, I don’t always need to read every line of generated code.

If the use case works and the tests pass, that’s often enough.

I work mainly with C#, ASP.NET, Entity Framework, and XAF from DevExpress. I know these ecosystems extremely well. So if something breaks later, I can go in and fix it myself. But most of the time, the goal isn’t to perfect the implementation — it’s to validate the idea.

That’s a crucial distinction.

When writing production code for a customer, quality and review absolutely matter. You must inspect, verify, and ensure maintainability. But when working on experimental R&D, the priority is different: speed of validation and clarity of results.

In research mode, not every line needs to be perfect. It just needs to prove whether the idea works.

Working “Without Hands”

My real goal is to operate as much as possible without hands.

By that I mean minimizing direct human interaction with implementation. I want to express intent clearly enough so agents can execute it.

If I can describe a system precisely — especially in domains I know deeply — then the agent should be able to build, test, and refine it. My role becomes guiding and validating rather than manually constructing everything.

This is where modern development is heading.

The Problem With Vibe Coding

Peter talked about something that resonated deeply: when you’re vibe coding, you produce a lot of AI slop.

You prompt. The AI generates. You run it. It fails. You tweak. You run again. Still wrong. You tweak again.

Eventually, the human gets tired.

Even when you feel close to a solution, it’s not done until it’s actually done. And manually pushing that process forward becomes exhausting.

This is where many AI workflows break down. Not because the AI can’t generate solutions — but because the loop still depends too heavily on human intervention.

Closing the Loop

The key idea is simple and powerful: agentic development works when the agent can test and correct itself.

You must close the loop.

Instead of: human → prompt → AI → human checks → repeat

You want: AI → builds → tests → detects errors → fixes → tests again → repeat

The agent needs tools to evaluate its own output.

When AI can run tests, detect failures, and iterate automatically, something shifts. The process stops being experimental prompting and starts becoming real engineering.

Spec-Driven vs Self-Correcting Systems

Spec-driven development still matters. Some people dismiss it as too close to waterfall, but every methodology has flaws.

The real evolution is combining clear specifications with self-correcting loops.

The human defines:

  • The specification
  • The expected behavior
  • The acceptance criteria

Then the AI executes, tests, and refines until those criteria are satisfied.

The human doesn’t need to babysit every iteration. The human validates the result once the loop is closed.

Engineering vs Parasitic Ideas

There’s a concept from a book about parasitic ideas.

In social sciences, parasitic ideas can spread because they’re hard to disprove. In engineering, bad ideas fail quickly.

If you design a bridge incorrectly, it collapses. Reality provides immediate feedback.

Software — especially AI-generated software — needs the same grounding in reality. Without continuous testing and validation, generated code can drift into something that looks plausible but doesn’t work.

Closing the loop forces ideas to confront reality.

Tests are that reality.

Taking the Human Out of the Repetitive Loop

The goal isn’t removing humans entirely. It’s removing humans from repetitive validation.

The human should:

  • Define the specification
  • Define what “done” means
  • Approve the final result

The AI should:

  • Implement
  • Test
  • Detect issues
  • Fix itself
  • Repeat until success

When that happens, development becomes scalable in a new way. Not because AI writes code faster — but because AI can finish what it starts.

What I Realized in That Hotel Room

Getting sick in Sochi wasn’t part of the plan. But it forced me to slow down long enough to notice something important.

Most friction in modern development isn’t writing code. It’s closing loops.

We generate faster than we validate. We start more than we finish. We rely on humans to constantly re-check work that machines could verify themselves.

In research and experimental work, it’s fine not to inspect every line — as long as the system proves its behavior. In production work, deeper review is essential. Knowing when each approach applies is part of modern engineering maturity.

The future of agentic development isn’t just better models. It’s better loops.

Because in the end, nothing is finished until the loop is closed.

 

Structured RAG for Unknown and Mixed Languages

Structured RAG for Unknown and Mixed Languages

How I stopped my multilingual activity stream from turning RAG into chaos

In the previous article (RAG with PostgreSQL and C# (pros and cons) | Joche Ojeda) I explained how naïve RAG breaks when you run it over an activity stream.

Same UI language.
Totally unpredictable content language.
Spanish, Russian, Italian… sometimes all in the same message.

Humans handle that fine.
Vector retrieval… not so much.

This is the “silent failure” scenario: retrieval looks plausible, the LLM sounds confident, and you ship nonsense.

So I had to change the game.

The Idea: Structured RAG

Structured RAG means you don’t embed raw text and pray.

You add a step before retrieval:

  • Extract a structured representation from each activity record
  • Store it as metadata (JSON)
  • Use that metadata to filter, route, and rank
  • Then do vector similarity on a cleaner, more stable representation

Think of it like this:

Unstructured text is what users write.
Structured metadata is what your RAG system can trust.

Why This Fix Works for Mixed Languages

The core problem with activity streams is not “language”.

The core problem is: you have no stable shape.

When the shape is missing, everything becomes fuzzy:

  • Who is speaking?
  • What is this about?
  • Which entities are involved?
  • Is this a reply, a reaction, a mention, a task update?
  • What language(s) are in here?

Structured RAG forces you to answer those questions once, at write-time, and save the answers.

PostgreSQL: Add a JSONB Column (and Keep pgvector)

We keep the previous approach (pgvector) but we add a JSONB column for structured metadata.

ALTER TABLE activities
ADD COLUMN rag_meta jsonb NOT NULL DEFAULT '{}'::jsonb;

-- Optional: if you store embeddings per activity/chunk
-- you keep your existing embedding column(s) or chunk table.

Then index it.

CREATE INDEX activities_rag_meta_gin
ON activities
USING gin (rag_meta);

Now you can filter with JSON queries before you ever touch vector similarity.

A Proposed Schema (JSON Shape You Control)

The exact schema depends on your product, but for activity streams I want at least:

  • language: detected languages + confidence
  • actors: who did it
  • subjects: what object is involved (ticket, order, user, document)
  • topics: normalized tags
  • relationships: reply-to, mentions, references
  • summary: short canonical summary (ideally in one pivot language)
  • signals: sentiment/intent/type if you need it

Example JSON for one activity record:

{
  "schemaVersion": 1,
  "languages": [
    { "code": "es", "confidence": 0.92 },
    { "code": "ru", "confidence": 0.41 }
  ],
  "actor": {
    "id": "user:42",
    "displayName": "Joche"
  },
  "subjects": [
    { "type": "ticket", "id": "ticket:9831" }
  ],
  "topics": ["billing", "invoice", "error"],
  "relationships": {
    "replyTo": "activity:9912001",
    "mentions": ["user:7", "user:13"]
  },
  "intent": "support_request",
  "summary": {
    "pivotLanguage": "en",
    "text": "User reports an invoice calculation error and asks for help."
  }
}

Notice what happened here: the raw multilingual chaos got converted into a stable structure.

Write-Time Pipeline (The Part That Feels Expensive, But Saves You)

Structured RAG shifts work to ingestion time.

Yes, it costs tokens.
Yes, it adds steps.

But it gives you something you never had before: predictable retrieval.

Here’s the pipeline I recommend:

  1. Store raw activity (as-is, don’t lose the original)
  2. Detect language(s) (fast heuristic + LLM confirmation if needed)
  3. Extract structured metadata into your JSON schema
  4. Generate a canonical “summary” in a pivot language (often English)
  5. Embed the summary + key fields (not the raw messy text)
  6. Save JSON + embedding

The key decision: embed the stable representation, not the raw stream text.

C# Conceptual Implementation

I’m going to keep the code focused on the architecture. Provider details are swappable.

Entities

public sealed class Activity
{
    public long Id { get; set; }
    public string RawText { get; set; } = "";
    public string UiLanguage { get; set; } = "en";

    // JSONB column in Postgres
    public string RagMetaJson { get; set; } = "{}";

    // Vector (pgvector) - store via your pgvector mapping or raw SQL
    public float[] RagEmbedding { get; set; } = Array.Empty<float>();

    public DateTimeOffset CreatedAt { get; set; }
}

Metadata Contract (Strongly Typed in Code, Stored as JSONB)

public sealed class RagMeta
{
    public int SchemaVersion { get; set; } = 1;
    public List<DetectedLanguage> Languages { get; set; } = new();
    public ActorMeta Actor { get; set; } = new();
    public List<SubjectMeta> Subjects { get; set; } = new();
    public List<string> Topics { get; set; } = new();
    public RelationshipMeta Relationships { get; set; } = new();
    public string Intent { get; set; } = "unknown";
    public SummaryMeta Summary { get; set; } = new();
}

public sealed class DetectedLanguage
{
    public string Code { get; set; } = "und";
    public double Confidence { get; set; }
}

public sealed class ActorMeta
{
    public string Id { get; set; } = "";
    public string DisplayName { get; set; } = "";
}

public sealed class SubjectMeta
{
    public string Type { get; set; } = "";
    public string Id { get; set; } = "";
}

public sealed class RelationshipMeta
{
    public string? ReplyTo { get; set; }
    public List<string> Mentions { get; set; } = new();
}

public sealed class SummaryMeta
{
    public string PivotLanguage { get; set; } = "en";
    public string Text { get; set; } = "";
}

Extractor + Embeddings

You need two services:

  • Metadata extraction (LLM fills the schema)
  • Embeddings (Microsoft.Extensions.AI) for the stable text
public interface IRagMetaExtractor
{
    Task<RagMeta> ExtractAsync(Activity activity, CancellationToken ct);
}

Then the ingestion pipeline:

using System.Text.Json;
using Microsoft.Extensions.AI;

public sealed class StructuredRagIngestor
{
    private readonly IRagMetaExtractor _extractor;
    private readonly IEmbeddingGenerator<string, Embedding<float>> _embeddings;

    public StructuredRagIngestor(
        IRagMetaExtractor extractor,
        IEmbeddingGenerator<string, Embedding<float>> embeddings)
    {
        _extractor = extractor;
        _embeddings = embeddings;
    }

    public async Task ProcessAsync(Activity activity, CancellationToken ct)
    {
        // 1) Extract structured JSON
        RagMeta meta = await _extractor.ExtractAsync(activity, ct);

        // 2) Create stable text for embeddings (summary + keywords)
        string stableText =
            $"{meta.Summary.Text}\n" +
            $"Topics: {string.Join(", ", meta.Topics)}\n" +
            $"Intent: {meta.Intent}";

        // 3) Embed stable text
        var emb = await _embeddings.GenerateAsync(new[] { stableText }, ct);
        float[] vector = emb.First().Vector.ToArray();

        // 4) Save into activity record
        activity.RagMetaJson = JsonSerializer.Serialize(meta);
        activity.RagEmbedding = vector;

        // db.SaveChangesAsync(ct) happens outside (unit of work)
    }
}

This is the core move: you stop embedding chaos and start embedding structure.

Query Pipeline: JSON First, Vectors Second

When querying, you don’t jump into similarity search immediately.

You do:

  1. Parse the user question
  2. Decide filters (actor, subject type, topic)
  3. Filter with JSONB (fast narrowing)
  4. Then do vector similarity on the remaining set

Example: filter by topic + intent using JSONB:

SELECT id, raw_text
FROM activities
WHERE rag_meta @> '{"intent":"support_request"}'::jsonb
  AND rag_meta->'topics' ? 'invoice'
ORDER BY rag_embedding <=> @query_embedding
LIMIT 20;

That “JSON first” step is what keeps multilingual streams from poisoning your retrieval.

Tradeoffs (Because Nothing Is Free)

Structured RAG costs more at write-time:

  • more tokens
  • more latency
  • more moving parts

But it saves you at query-time:

  • less noise
  • better precision
  • more predictable answers
  • debuggable failures (because you can inspect metadata)

In real systems, I’ll take predictable and debuggable over “cheap but random” every day.

Final Thought

RAG over activity streams is hard because activity streams are messy by design.

If you want RAG to behave, you need structure.

Structured RAG is how you make retrieval boring again.
And boring retrieval is exactly what you want.

In the next article, I’ll go deeper into the exact pipeline details: language routing, mixed-language detection, pivot summaries, chunk policies, and how I made this production-friendly without turning it into a token-burning machine.

Let the year begin 🚀

“`

RAG with PostgreSQL and C# (pros and cons)

RAG with PostgreSQL and C# (pros and cons)

RAG with PostgreSQL and C#

Happy New Year 2026 — let the year begin

Happy New Year 2026 🎉

Let’s start the year with something honest.

This article exists because something broke.

I wasn’t trying to build a demo.
I was building an activity stream — the kind of thing every social or collaborative system eventually needs.

Posts.
Comments.
Reactions.
Short messages.
Long messages.
Noise.

At some point, the obvious question appeared:

“Can I do RAG over this?”

That question turned into this article.

The Original Problem: RAG over an Activity Stream

An activity stream looks simple until you actually use it as input.

In my case:

  • The UI language was English
  • The content language was… everything else

Users were writing:

  • Spanish
  • Russian
  • Italian
  • English
  • Sometimes all of them in the same message

Perfectly normal for humans.
Absolutely brutal for naïve RAG.

I tried the obvious approach:

  • embed everything
  • store vectors
  • retrieve similar content
  • augment the prompt

And very quickly, RAG went crazy.

Why It Failed (And Why This Matters)

The failure wasn’t dramatic.
No exceptions.
No errors.

Just… wrong answers.

Confident answers.
Fluent answers.
Wrong answers.

The problem was subtle:

  • Same concept, different languages
  • Mixed-language sentences
  • Short, informal activity messages
  • No guarantee of language consistency

In an activity stream:

  • You don’t control the language
  • You don’t control the structure
  • You don’t even control what a “document” is

And RAG assumes you do.

That’s when I stopped and realized:

RAG is not “plug-and-play” once your data becomes messy.

So… What Is RAG Really?

RAG stands for Retrieval-Augmented Generation.

The idea is simple:

Retrieve relevant data first, then let the model reason over it.

Instead of asking the model to remember everything, you let it look things up.

Search first.
Generate second.

Sounds obvious.
Still easy to get wrong.

The Real RAG Pipeline (No Marketing)

A real RAG system looks like this:

  1. Your data lives in a database
  2. Text is split into chunks
  3. Each chunk becomes an embedding
  4. Embeddings are stored as vectors
  5. A user asks a question
  6. The question is embedded
  7. The closest vectors are retrieved
  8. Retrieved content is injected into the prompt
  9. The model answers

Every step can fail silently.

Tokenization & Chunking (The First Trap)

Models don’t read text.
They read tokens.

This matters because:

  • prompts have hard limits
  • activity streams are noisy
  • short messages lose context fast

You usually don’t tokenize manually, but you do choose:

  • chunk size
  • overlap
  • grouping strategy

In activity streams, chunking is already a compromise — and multilingual content makes it worse.

Embeddings in .NET (Microsoft.Extensions.AI)

In .NET, embeddings are generated using Microsoft.Extensions.AI.

The important abstraction is:

IEmbeddingGenerator<TInput, TEmbedding>

This keeps your architecture:

  • provider-agnostic
  • DI-friendly
  • survivable over time

Minimal Setup

dotnet add package Microsoft.Extensions.AI
dotnet add package Microsoft.Extensions.AI.OpenAI

Creating an Embedding Generator

using OpenAI;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.AI.OpenAI;

var client = new OpenAIClient("YOUR_API_KEY");

IEmbeddingGenerator<string, Embedding<float>> embeddings =
    client.AsEmbeddingGenerator("text-embedding-3-small");

Generating a Vector

var result = await embeddings.GenerateAsync(
    new[] { "Some activity text" });

float[] vector = result.First().Vector.ToArray();

That vector is what drives everything that follows.

⚠️ Embeddings Are Model-Locked (And Language Makes It Worse)

Embeddings are model-locked.

Meaning:

Vectors from different embedding models cannot be compared.

Even if:

  • the dimension matches
  • the text is identical
  • the provider is the same

Each model defines its own universe.

But here’s the kicker I learned the hard way:

Multilingual content amplifies this problem.

Even with multilingual-capable models:

  • language mixing shifts vector space
  • short messages lose semantic anchors
  • similarity becomes noisy

In an activity stream:

  • English UI
  • Spanish content
  • Russian replies
  • Emoji everywhere

Vector distance starts to mean “kind of related, maybe”.

That’s not good enough.

PostgreSQL + pgvector (Still the Right Choice)

Despite all that, PostgreSQL with pgvector is still the right foundation.

Enable pgvector

CREATE EXTENSION IF NOT EXISTS vector;

Chunk-Based Table

CREATE TABLE doc_chunks (
    id            bigserial PRIMARY KEY,
    document_id   bigint NOT NULL,
    chunk_index   int NOT NULL,
    content       text NOT NULL,
    embedding     vector(1536) NOT NULL,
    created_at    timestamptz NOT NULL DEFAULT now()
);

Technically correct.
Architecturally incomplete — as I later discovered.

Retrieval: Where Things Quietly Go Wrong

SELECT content
FROM doc_chunks
ORDER BY embedding <=> @query_embedding
LIMIT 5;

This query decides:

  • what the model sees
  • what it ignores
  • how wrong the answer will be

When language is mixed, retrieval looks correct — but isn’t.

Classic example: Moscow

  • Spanish: Moscú

  • Italian: Mosca

  • Meaning in Spanish: 🪰 a fly

So for a Spanish speaker, “Mosca” looks like it should mean insect (which it does), but it’s also the Italian name for Moscow.

Why RAG Failed in This Scenario

Let’s be honest:

  • Similar ≠ relevant
  • Multilingual ≠ multilingual-safe
  • Short activity messages ≠ documents
  • Noise ≠ knowledge

RAG didn’t fail because the model was bad.
It failed because the data had no structure.

Why This Article Exists

This article exists because:

  • I tried RAG on a real system
  • With real users
  • Writing in real languages
  • In real combinations

And the naïve RAG approach didn’t survive.

What Comes Next

The next article will not be about:

  • embeddings
  • models
  • APIs

It will be about structured RAG.

How I fixed this by:

  • introducing structure into the activity stream
  • separating concerns in the pipeline
  • controlling language before retrieval
  • reducing semantic noise
  • making RAG predictable again

In other words:
How to make RAG work after it breaks.

Final Thought

RAG is not magic.

It’s:

search + structure + discipline

If your data is chaotic, RAG will faithfully reflect that chaos — just with confidence.

Happy New Year 2026 🎆

If you’re reading this:
Happy New Year 2026.

Let’s make this the year we stop trusting demos
and start trusting systems that survived reality.

Let the year begin 🚀

The Dangers (and Joys) of Vibe Coding

The Dangers (and Joys) of Vibe Coding

It’s Sunday — so maybe it’s time to write an article to break the flow I’ve been in lately. I’ve been deep into researching design patterns for Oqtane, the web application framework created by Shaun Walker.

Today I woke up really early, around 4:30 a.m. I went downstairs, made coffee, and decided to play around with some applications I had on my list. One of them was HotKey Typer by James Montemagno.

I ran it for the first time and instantly loved it. It’s super simple and useful — but I had a problem. I started using glasses a few years ago, and I generally have trouble with small UI elements on the computer. I usually work at 150% scaling. Unfortunately, James’s app has a fixed window size, so everything looked cut off.

Since I’ve been coding a lot lately, I figured it would be an easy fix. I tweaked it — and it worked! Everything looked better, but a bit too large, so I adjusted it again… and again… and again. Before I knew it, I had turned it into a totally different application.

I was vibe coding for four or five hours straight. In the end, I added a lot of new functionality because I genuinely loved the app and the idea behind it. I added sets (or collections) — basically groups of snippets you can assign to keys 1–9. Then I added autosave, a settings screen, and a reset option for the collections. Every time I finished one feature, I said, “Just one more thing.” Five minutes turned into five hours.

When I was done, I recorded a demo video. It was a lot of fun — and the result was genuinely useful. I even want to create an installer for myself so I can easily reinstall it if I ever reformat my computer. (I used to be that guy who formatted his PC every month. Not anymore… but you never know.)

Lessons From Vibe Coding

I learned a lot from this little experiment. I’ve been vibe coding nonstop for about three months now — I’ve even used up all my Copilot credits before the 25th of the month more than once! Vibe coding is a lot of fun, but it can easily spiral out of control and take you in the wrong direction.

Next week, I want to change my approach a bit — maybe follow a more structured pattern.

Another thing this reminded me of is how important it is to work in a team. My business partner, José Javier Columbie, has always helped me with that. We’ve been working together for about 10 years now. I’m the kind of developer who keeps rewriting, refactoring, optimizing, making things faster, reusable, turning them into plugins or frameworks — and sometimes the original task was actually quite small.

That’s where Javier comes in. He’s the one who says, “José, it’s done. This is what they asked for, and this is what we’re delivering.” He keeps me grounded. Every developer needs that — or at least needs to learn how to set that boundary for themselves.

Final Thoughts

So that’s my takeaway from today’s vibe coding session: have fun, but know when to stop.

I’ll include below the links to:

James Montemagno’s original HotKey Typer repository

My fork with the modifications

egarim/app-hotkeytyper

A video demo of what I built

Be careful when you’re vibe coding — it’s a great place to find flow, but it’s also easy to lose track of direction.

See you in the next article.

SyncFramework for XPO: Updated for .NET 8 & 9  and DevExpress 24.2.3!

SyncFramework for XPO: Updated for .NET 8 & 9 and DevExpress 24.2.3!

SyncFramework for XPO is a specialized implementation of our delta encoding synchronization library, designed specifically for DevExpress XPO users. It enables efficient data synchronization by tracking and transmitting only the changes between data versions, optimizing both bandwidth usage and processing time.

What’s New

  • Base target framework updated to .NET 8.0
  • Added compatibility with .NET 9.0
  • Updated DevExpress XPO dependencies to 24.2.3
  • Continued support for delta encoding synchronization
  • Various performance improvements and bug fixes

Framework Compatibility

  • Primary Target: .NET 8.0
  • Additional Support: .NET 9.0

Our XPO implementation continues to serve the DevExpress community.

Key Features

  • Seamless integration with DevExpress XPO
  • Efficient delta-based synchronization
  • Support for multiple database providers
  • Cross-platform compatibility
  • Easy integration with existing XPO and XAF applications

As always, if you own a license, you can compile the source code yourself from our GitHub repository. The framework maintains its commitment to providing reliable data synchronization for XPO applications.

Happy Delta Encoding! ?