Same UI language.
Totally unpredictable content language.
Spanish, Russian, Italian… sometimes all in the same message.
Humans handle that fine.
Vector retrieval… not so much.
This is the “silent failure” scenario: retrieval looks plausible, the LLM sounds confident, and you ship nonsense.
So I had to change the game.
The Idea: Structured RAG
Structured RAG means you don’t embed raw text and pray.
You add a step before retrieval:
Extract a structured representation from each activity record
Store it as metadata (JSON)
Use that metadata to filter, route, and rank
Then do vector similarity on a cleaner, more stable representation
Think of it like this:
Unstructured text is what users write.
Structured metadata is what your RAG system can trust.
Why This Fix Works for Mixed Languages
The core problem with activity streams is not “language”.
The core problem is: you have no stable shape.
When the shape is missing, everything becomes fuzzy:
Who is speaking?
What is this about?
Which entities are involved?
Is this a reply, a reaction, a mention, a task update?
What language(s) are in here?
Structured RAG forces you to answer those questions once, at write-time, and save the answers.
PostgreSQL: Add a JSONB Column (and Keep pgvector)
We keep the previous approach (pgvector) but we add a JSONB column for structured metadata.
ALTER TABLE activities
ADD COLUMN rag_meta jsonb NOT NULL DEFAULT '{}'::jsonb;
-- Optional: if you store embeddings per activity/chunk
-- you keep your existing embedding column(s) or chunk table.
Then index it.
CREATE INDEX activities_rag_meta_gin
ON activities
USING gin (rag_meta);
Now you can filter with JSON queries before you ever touch vector similarity.
A Proposed Schema (JSON Shape You Control)
The exact schema depends on your product, but for activity streams I want at least:
language: detected languages + confidence
actors: who did it
subjects: what object is involved (ticket, order, user, document)
topics: normalized tags
relationships: reply-to, mentions, references
summary: short canonical summary (ideally in one pivot language)
Notice what happened here: the raw multilingual chaos got converted into a stable structure.
Write-Time Pipeline (The Part That Feels Expensive, But Saves You)
Structured RAG shifts work to ingestion time.
Yes, it costs tokens.
Yes, it adds steps.
But it gives you something you never had before: predictable retrieval.
Here’s the pipeline I recommend:
Store raw activity (as-is, don’t lose the original)
Detect language(s) (fast heuristic + LLM confirmation if needed)
Extract structured metadata into your JSON schema
Generate a canonical “summary” in a pivot language (often English)
Embed the summary + key fields (not the raw messy text)
Save JSON + embedding
The key decision: embed the stable representation, not the raw stream text.
C# Conceptual Implementation
I’m going to keep the code focused on the architecture. Provider details are swappable.
Entities
public sealed class Activity
{
public long Id { get; set; }
public string RawText { get; set; } = "";
public string UiLanguage { get; set; } = "en";
// JSONB column in Postgres
public string RagMetaJson { get; set; } = "{}";
// Vector (pgvector) - store via your pgvector mapping or raw SQL
public float[] RagEmbedding { get; set; } = Array.Empty<float>();
public DateTimeOffset CreatedAt { get; set; }
}
Metadata Contract (Strongly Typed in Code, Stored as JSONB)
public sealed class RagMeta
{
public int SchemaVersion { get; set; } = 1;
public List<DetectedLanguage> Languages { get; set; } = new();
public ActorMeta Actor { get; set; } = new();
public List<SubjectMeta> Subjects { get; set; } = new();
public List<string> Topics { get; set; } = new();
public RelationshipMeta Relationships { get; set; } = new();
public string Intent { get; set; } = "unknown";
public SummaryMeta Summary { get; set; } = new();
}
public sealed class DetectedLanguage
{
public string Code { get; set; } = "und";
public double Confidence { get; set; }
}
public sealed class ActorMeta
{
public string Id { get; set; } = "";
public string DisplayName { get; set; } = "";
}
public sealed class SubjectMeta
{
public string Type { get; set; } = "";
public string Id { get; set; } = "";
}
public sealed class RelationshipMeta
{
public string? ReplyTo { get; set; }
public List<string> Mentions { get; set; } = new();
}
public sealed class SummaryMeta
{
public string PivotLanguage { get; set; } = "en";
public string Text { get; set; } = "";
}
Extractor + Embeddings
You need two services:
Metadata extraction (LLM fills the schema)
Embeddings (Microsoft.Extensions.AI) for the stable text
public interface IRagMetaExtractor
{
Task<RagMeta> ExtractAsync(Activity activity, CancellationToken ct);
}
Then the ingestion pipeline:
using System.Text.Json;
using Microsoft.Extensions.AI;
public sealed class StructuredRagIngestor
{
private readonly IRagMetaExtractor _extractor;
private readonly IEmbeddingGenerator<string, Embedding<float>> _embeddings;
public StructuredRagIngestor(
IRagMetaExtractor extractor,
IEmbeddingGenerator<string, Embedding<float>> embeddings)
{
_extractor = extractor;
_embeddings = embeddings;
}
public async Task ProcessAsync(Activity activity, CancellationToken ct)
{
// 1) Extract structured JSON
RagMeta meta = await _extractor.ExtractAsync(activity, ct);
// 2) Create stable text for embeddings (summary + keywords)
string stableText =
$"{meta.Summary.Text}\n" +
$"Topics: {string.Join(", ", meta.Topics)}\n" +
$"Intent: {meta.Intent}";
// 3) Embed stable text
var emb = await _embeddings.GenerateAsync(new[] { stableText }, ct);
float[] vector = emb.First().Vector.ToArray();
// 4) Save into activity record
activity.RagMetaJson = JsonSerializer.Serialize(meta);
activity.RagEmbedding = vector;
// db.SaveChangesAsync(ct) happens outside (unit of work)
}
}
This is the core move: you stop embedding chaos and start embedding structure.
Query Pipeline: JSON First, Vectors Second
When querying, you don’t jump into similarity search immediately.
You do:
Parse the user question
Decide filters (actor, subject type, topic)
Filter with JSONB (fast narrowing)
Then do vector similarity on the remaining set
Example: filter by topic + intent using JSONB:
SELECT id, raw_text
FROM activities
WHERE rag_meta @> '{"intent":"support_request"}'::jsonb
AND rag_meta->'topics' ? 'invoice'
ORDER BY rag_embedding <=> @query_embedding
LIMIT 20;
That “JSON first” step is what keeps multilingual streams from poisoning your retrieval.
Tradeoffs (Because Nothing Is Free)
Structured RAG costs more at write-time:
more tokens
more latency
more moving parts
But it saves you at query-time:
less noise
better precision
more predictable answers
debuggable failures (because you can inspect metadata)
In real systems, I’ll take predictable and debuggable over “cheap but random” every day.
Final Thought
RAG over activity streams is hard because activity streams are messy by design.
If you want RAG to behave, you need structure.
Structured RAG is how you make retrieval boring again.
And boring retrieval is exactly what you want.
In the next article, I’ll go deeper into the exact pipeline details: language routing, mixed-language detection, pivot summaries, chunk policies, and how I made this production-friendly without turning it into a token-burning machine.
In my previous post (or “mental note,” as I like to call them), I covered how to set up multi-tenancy in Oqtane. Today, I got a really nice surprise — Shaun Walker just posted an excellent video explaining how multi-tenancy works,
along with its advantages and possible drawbacks.
From my point of view, the advantages clearly outweigh the disadvantages,
although it depends on your specific scenario.
Extending the Previous Example
I wanted to improve my previous example a bit. So, I created a new GitHub repository using the same base code,
but this time I added hostnames for each tenant.
A hostname is basically the domain that points to one of your tenants in Oqtane.
In a typical setup, you use DNS records for this.
The simplest case is an A record that points to a specific IP address.
When a request arrives, the server reads the hostname from the request and routes it to the correct tenant.
This part isn’t specific to Oqtane — it’s how web servers work in general.
The concept exists in IIS, Apache, and NGINX,
and it’s part of basic networking theory. If you want to learn more,
there are countless articles about how DNS works.
A Small Story from the Past
This actually takes me back — one of the first things I learned as a teenager was how to configure DNS
and run my own Apache web server.
I even started offering web hosting from my home using an old 486 computer (yes, really).
Eventually, my internet provider noticed what I was doing, blocked my connection, and called my parents.
Let’s just say… that Christmas was canceled for me. 😅
Anyway, that’s a story for another time.
Setting Up Local Domains for Tenants
For today’s example, I’m using the same structure as before:
One host site
Two tenant sites: MyCompany1 and MyCompany2
I want to show you how to assign domain names to each of them.
If you’re running everything locally (for example, through Visual Studio or VS Code),
you can’t use real domain names — but you can simulate them using the Windows hosts file.
If you’ve ever wondered how your computer resolves localhost to 127.0.0.1,
the answer lies in that file. It’s located inside the Windows system folder,
and it maps domain names to IP addresses.
Here’s the cool part: you can add your own domains there, pointing them to any IP you like.
It’s a great trick for local testing.
Below, you’ll see a screenshot of my hosts file.
I’ve mapped my fake domains to my local IP address,
so when I open them in the browser, the requests go straight to my Kestrel server, which then routes them to the correct tenant.
How to Edit the Windows Hosts File
Editing the hosts file in Windows is simple, but you need administrative permissions.
Here’s how you can do it safely:
Press Start, type Notepad, then right-click it and select Run as administrator.
Once Notepad opens, go to File → Open and browse to:
C:\Windows\System32\drivers\etc\hosts
In the open dialog, change the filter from “Text Documents (*.txt)” to “All Files (*.*)”
so you can see the hosts file.
Add your entries at the bottom of the file. For example:
127.0.0.1 mycompany1.xyz
127.0.0.1 mycompany2.xyz
Each line maps an IP address to a domain name.
Save the file and close Notepad.
Open your browser and visit http://mycompany1.xyz:44398
(or the port your Oqtane app is running on).
You should see the tenant corresponding to that domain.
⚠️ Important: If you edit the file without admin rights,
you won’t be able to save it. Also, be careful — if you modify or delete system entries by accident,
your network resolution might stop working.
Here is how my host file actually looks at the moment
Set siteURL for :Company 1
Set siteURL for :Company 2
Testing with Real Domains
Of course, this same logic applies to real domains too — as long as your Oqtane instance is publicly accessible.
In one of the next parts (maybe part 3 or 4), I’ll show how to configure it using a web server like Apache. I know that NGINX is more popular these days,
but I’ve used Apache since my teenage years, so I’m more comfortable with it.
Still, I’ll probably demonstrate both.
Most developers today use cloud providers like AWS or Azure,
but honestly, I still prefer spinning up a simple Ubuntu server and doing everything manually.
The best tool is the one you know best — and for me, that’s Apache on Ubuntu.
Demo
As you can see there is a little bit of a different behavior if is a default site or not If it’s a default site it will redirect to that URL if not it’s going to redirect to the default site URL
Resources
🧩 GitHub Repository — This project is based on the previous example
but adds hostname configuration and uses SQLite for simplicity.
If you hang out around developers long enough, you’ll notice we don’t just use tools — we nickname them, mispronounce them, and sometimes turn them into full-blown mascots. Here are three favorites: WSL, SQL, and GitHub Copilot’s Spec Kit.
WSL → “Weasel”
English reality: WSL stands for Windows Subsystem for Linux.
Nickname: Said quickly as “double-u S L,” it echoes weasel, so the meme stuck.
Spanish (El Salvador / Latin America): In El Salvador and many Latin American countries, the letter W is read as “doble be” (not doble u). So WSL is pronounced “doble be, ese, ele.”
SQL → “Sequel”
English reality: SQL stands for Structured Query Language.
Pronunciation: Both “S-Q-L” and “sequel” are used in English.
Spanish (LatAm): Most developers say it letter by letter: “ese cu e ele.” Bilingual teams sometimes mix in “sequel.”
Spec Kit → “Speckified” (Spooky Spell)
English reality: GitHub Copilot’s Spec Kit helps scaffold code from specs.
Community fun: Projects get “speckified,” a word that mischievously echoes “spookified.” Our playful mascot idea is a wizard enchanting a codebase: You have been Speckified!
Spanish (LatAm): Phonetically, SPEC is “ese, pe, e, ce.” In casual talk many devs just say “espec” (es-pek) to keep the pun alive.
Quick Reference (Latin American / El Salvador Spanish)
Acronym
English Pronunciation
Spanish (LatAm / El Salvador) Phonetics
Nickname / Mascot
WSL
“double-u S L” (sounds like weasel)
“doble be, ese, ele”
Weasel
SQL
“S-Q-L” or “sequel”
“ese cu e ele”
Sequel Robot
SPEC
“spec” → “speckified”
“ese, pe, e, ce” (or “espec”)
Spec Wizard (spell)
Why This Matters
These playful twists — weasel, sequel robot, speckified wizard — show how dev culture works:
Acronyms turn into characters.
English vs. Spanish pronunciations add layers of humor.
Memes make otherwise dry tools easier to talk about.
Next time someone says their project is fully speckified on WSL with SQL, you might be hearing about a weasel, a robot, and a wizard casting spooky spec spells.
Integration testing is a critical phase in software development where individual modules are combined and tested as a group. In our accounting system, we’ve created a robust integration test that demonstrates how the Document module and Chart of Accounts module interact to form a functional accounting system. In this post, I’ll explain the components and workflow of our integration test.
The Architecture of Our Integration Test
Our integration test simulates a small retail business’s accounting operations. Let’s break down the key components:
Test Fixture Setup
The AccountingIntegrationTests class contains all our test methods and is decorated with the [TestFixture] attribute to identify it as a NUnit test fixture. The Setup method initializes our services and data structures:
[SetUp]
public async Task Setup()
{
// Initialize services
_auditService = new AuditService();
_documentService = new DocumentService(_auditService);
_transactionService = new TransactionService();
_accountValidator = new AccountValidator();
_accountBalanceCalculator = new AccountBalanceCalculator();
// Initialize storage
_accounts = new Dictionary<string, AccountDto>();
_documents = new Dictionary<string, IDocument>();
_transactions = new Dictionary<string, ITransaction>();
// Create Chart of Accounts
await SetupChartOfAccounts();
}
This method:
Creates instances of our services
Sets up in-memory storage for our entities
Calls SetupChartOfAccounts() to create our initial chart of accounts
Chart of Accounts Setup
The SetupChartOfAccounts method creates a basic chart of accounts for our retail business:
private async Task SetupChartOfAccounts()
{
// Clear accounts dictionary in case this method is called multiple times
_accounts.Clear();
// Assets (1xxxx)
await CreateAccount("Cash", "10100", AccountType.Asset, "Cash on hand and in banks");
await CreateAccount("Accounts Receivable", "11000", AccountType.Asset, "Amounts owed by customers");
// ... more accounts
// Verify all accounts are valid
foreach (var account in _accounts.Values)
{
bool isValid = _accountValidator.ValidateAccount(account);
Assert.That(isValid, Is.True, $"Account {account.AccountName} validation failed");
}
// Verify expected number of accounts
Assert.That(_accounts.Count, Is.EqualTo(17), "Expected 17 accounts in chart of accounts");
}
This method:
Creates accounts for each category (Assets, Liabilities, Equity, Revenue, and Expenses)
Validates each account using our AccountValidator
Ensures we have the expected number of accounts
Individual Transaction Tests
We have separate test methods for specific transaction types:
Purchase of Inventory
CanRecordPurchaseOfInventory demonstrates recording a supplier invoice:
[Test]
public async Task CanRecordPurchaseOfInventory()
{
// Arrange - Create document
var document = new DocumentDto { /* properties */ };
// Act - Create document, transaction, and entries
var createdDocument = await _documentService.CreateDocumentAsync(document, TEST_USER);
// ... create transaction and entries
// Validate transaction
var isValid = await _transactionService.ValidateTransactionAsync(
createdTransaction.Id, ledgerEntries);
// Assert
Assert.That(isValid, Is.True, "Transaction should be balanced");
}
Validates that the transaction is balanced (debits = credits)
Sale to Customer
CanRecordSaleToCustomer demonstrates recording a customer sale:
[Test]
public async Task CanRecordSaleToCustomer()
{
// Similar pattern to inventory purchase, but with sale-specific entries
// ...
// Create ledger entries - a more complex transaction with multiple entries
var ledgerEntries = new List<ILedgerEntry>
{
// Cash received
// Sales revenue
// Cost of goods sold
// Reduce inventory
};
// Validate transaction
// ...
}
This test is more complex, recording both the revenue side (debit Cash, credit Sales Revenue) and the cost side (debit Cost of Goods Sold, credit Inventory) of a sale.
Full Accounting Cycle Test
The CanExecuteFullAccountingCycle method ties everything together:
[Test]
public async Task CanExecuteFullAccountingCycle()
{
// Run these in a defined order, with clean account setup first
_accounts.Clear();
_documents.Clear();
_transactions.Clear();
await SetupChartOfAccounts();
// 1. Record inventory purchase
await RecordPurchaseOfInventory();
// 2. Record sale to customer
await RecordSaleToCustomer();
// 3. Record utility expense
await RecordBusinessExpense();
// 4. Create a payment to supplier
await RecordPaymentToSupplier();
// 5. Verify account balances
await VerifyAccountBalances();
}
This test:
Starts with a clean state
Records a sequence of business operations
Verifies the final account balances
Mock Account Balance Calculator
The MockAccountBalanceCalculator is a crucial part of our test that simulates how a real database would work:
public class MockAccountBalanceCalculator : AccountBalanceCalculator
{
private readonly Dictionary<string, AccountDto> _accounts;
private readonly Dictionary<Guid, List<LedgerEntryDto>> _ledgerEntriesByTransaction = new();
private readonly Dictionary<Guid, decimal> _accountBalances = new();
public MockAccountBalanceCalculator(
Dictionary<string, AccountDto> accounts,
Dictionary<string, ITransaction> transactions)
{
_accounts = accounts;
// Create mock ledger entries for each transaction
InitializeLedgerEntries(transactions);
// Calculate account balances based on ledger entries
CalculateAllBalances();
}
// Methods to initialize and calculate
// ...
}
This class:
Takes our accounts and transactions as inputs
Creates a collection of ledger entries for each transaction
Calculates account balances based on these entries
Provides methods to query account balances and ledger entries
The InitializeLedgerEntries method creates a collection of ledger entries for each transaction:
private void InitializeLedgerEntries(Dictionary<string, ITransaction> transactions)
{
// For inventory purchase
if (transactions.TryGetValue("InventoryPurchase", out var inventoryPurchase))
{
var entries = new List<LedgerEntryDto>
{
// Create entries for this transaction
// ...
};
_ledgerEntriesByTransaction[inventoryPurchase.Id] = entries;
}
// For other transactions
// ...
}
The CalculateAllBalances method processes these entries to calculate account balances:
private void CalculateAllBalances()
{
// Initialize all account balances to zero
foreach (var account in _accounts.Values)
{
_accountBalances[account.Id] = 0m;
}
// Process each transaction's ledger entries
foreach (var entries in _ledgerEntriesByTransaction.Values)
{
foreach (var entry in entries)
{
if (entry.EntryType == EntryType.Debit)
{
_accountBalances[entry.AccountId] += entry.Amount;
}
else // Credit
{
_accountBalances[entry.AccountId] -= entry.Amount;
}
}
}
}
This approach closely mirrors how a real accounting system would work with a database:
Ledger entries are stored in collections (similar to database tables)
Account balances are calculated by processing all relevant entries
The calculator provides methods to query this data (similar to a repository)
Balance Verification
The VerifyAccountBalances method uses our mock calculator to verify account balances:
private async Task VerifyAccountBalances()
{
// Create mock balance calculator
var mockBalanceCalculator = new MockAccountBalanceCalculator(_accounts, _transactions);
// Verify individual account balances
decimal cashBalance = mockBalanceCalculator.CalculateAccountBalance(
_accounts["Cash"].Id,
_testDate.AddDays(15)
);
Assert.That(cashBalance, Is.EqualTo(-2750m), "Cash balance is incorrect");
// ... verify other account balances
// Also verify the accounting equation
// ...
}
The Benefits of Our Collection-Based Approach
Our redesigned MockAccountBalanceCalculator offers several advantages:
Data-Driven: All calculations are based on collections of data, not hardcoded values.
Flexible: New transactions can be added easily without changing calculation logic.
Maintainable: If transaction amounts change, we only need to update them in one place.
Realistic: This approach closely mirrors how a real database-backed accounting system would work.
Extensible: We can add support for more complex queries like filtering by date range.
The Goals of Our Integration Test
Our integration test serves several important purposes:
Verify Module Integration: Ensures that the Document module and Chart of Accounts module work correctly together.
Validate Business Workflows: Confirms that standard accounting workflows (purchasing, sales, expenses, payments) function as expected.
Ensure Data Integrity: Verifies that all transactions maintain balance (debits = credits) and that account balances are accurate.
Test Double-Entry Accounting: Confirms that our system properly implements double-entry accounting principles where every transaction affects at least two accounts.
Validate Accounting Equation: Ensures that the fundamental accounting equation (Assets = Liabilities + Equity + (Revenues – Expenses)) remains balanced.
Conclusion
This integration test demonstrates the core functionality of our accounting system using a data-driven approach that closely mimics a real database. By simulating a retail business’s transactions and storing them in collections, we’ve created a realistic test environment for our double-entry accounting system.
The collection-based approach in our MockAccountBalanceCalculator allows us to test complex accounting logic without an actual database, while still ensuring that our calculations are accurate and our accounting principles are sound.
While this test uses in-memory collections rather than a database, it provides a strong foundation for testing the business logic of our accounting system in a way that would translate easily to a real-world implementation.
This call/zoom will give you the opportunity to define the roadblocks in your current XAF solution. We can talk about performance, deployment or custom implementations. Together we will review you pain points and leave you with recommendations to get your app back in track
The chart of accounts module is a critical component of any financial accounting system, serving as the organizational structure that categorizes financial transactions. As a software developer working on accounting applications, understanding how to properly implement a chart of accounts module is essential for creating robust and effective financial management solutions.
What is a Chart of Accounts?
Before diving into the implementation details, let’s clarify what a chart of accounts is. In accounting, the chart of accounts is a structured list of all accounts used by an organization to record financial transactions. These accounts are categorized by type (assets, liabilities, equity, revenue, and expenses) and typically follow a numbering system to facilitate organization and reporting.
Core Components of a Chart of Accounts Module
Based on best practices in financial software development, a well-designed chart of accounts module should include:
1. Account Entity
The fundamental entity in the module is the account itself. A properly designed account entity should include:
A unique identifier (typically a GUID in modern systems)
Account name
Account type (asset, liability, equity, revenue, expense)
Official account code (often used for regulatory reporting)
Reference to financial statement lines
Audit information (who created/modified the account and when)
Archiving capability (for soft deletion)
2. Account Type Enumeration
Account types are typically implemented as an enumeration:
This enumeration serves as more than just a label—it determines critical business logic, such as whether an account normally has a debit or credit balance.
3. Account Validation
A robust chart of accounts module includes validation logic for accounts:
Ensuring account codes follow the required format (typically numeric)
Verifying that account codes align with their account types (e.g., asset accounts starting with “1”)
Validating consistency between account types and financial statement lines
Checking that account names are not empty and are unique
4. Balance Calculation
One of the most important functions of the chart of accounts module is calculating account balances:
Point-in-time balance calculations (as of a specific date)
Period turnover calculations (debit and credit movement within a date range)
Determining if an account has any transactions
Implementation Best Practices
When implementing a chart of accounts module, consider these best practices:
1. Use Interface-Based Design
Implement interfaces like IAccount to define the contract for account entities:
When working with a chart of accounts module, you might encounter:
1. Account Code Standardization
Challenge: Different jurisdictions may have different account coding requirements.
Solution: Implement a flexible validation system that can be configured for different accounting standards.
2. Balance Calculation Performance
Challenge: Balance calculations for accounts with many transactions can be slow.
Solution: Implement caching strategies and consider storing period-end balances for faster reporting.
3. Account Hierarchies
Challenge: Supporting account hierarchies for reporting.
Solution: Implement a nested set model or closure table for efficient hierarchy querying.
Conclusion
A well-designed chart of accounts module is the foundation of a reliable accounting system. By following these implementation guidelines and understanding the core concepts, you can create a flexible, maintainable, and powerful chart of accounts that will serve as the backbone of your financial accounting application.
Remember that the chart of accounts is not just a technical construct—it should reflect the business needs and reporting requirements of the organization using the system. Taking time to properly design this module will pay dividends throughout the life of your application.
This call/zoom will give you the opportunity to define the roadblocks in your current XAF solution. We can talk about performance, deployment or custom implementations. Together we will review you pain points and leave you with recommendations to get your app back in track