Data localization in DB

Once we faced with the problem of data localization in our DB. Data was distributed on several tables. And we neede to choose a method accoring to requirements:

  1. Adding new language does not cause changings in code
  2. Must have default localization
  3. Transmit only needed localization from server to client (not all languages)
  4. Easy work with Entity Framework
  5. Localize only strings

We chose between next ways:

  1. Duplicate columns (value_en, value_es, etc...) in a table. This solution does not satisfy to the first requirement, and future support of this is not easy. But for tasks, where only 2 (or 3) languages (and no more) its sutable. Fast and easy.
  2. XML-column with localization. All data we need to localize are packed in special XML field and is stored in the same table. For the second requirement we store natural columns with string data. But in this solution, we always get full localization (all languages) from server and parse it on client side. If we have a lot of languages - it could be a problem, because we don't use the biggest part of data.
  3. Separated table for localization. It's the solution, I'll talk more detailed.

Separated table for data localization

For every table storing localizable data we introduce new additional table with localizable fields. Like this:

Data Diagram

In table Entity there is a description of some object, including default localization (our requirement number 2). CultureId is default language for the object.

Table EntityLocalized stores all localizable fields. Also, it can store a default localization - depending on implementation. As you see, this schema allows us to introduce new languages "on the fly", and we always have a default localization (if EntityLocalized is empty, we can get data from Entity table).

Now we need to work with our ORM - Entity Framework. What do we want?

  1. We want to get Entity object on specific language without manipulating EntityLocalized on client side.
  2. We want to put this logic as deep as we can, that developers not have to think about this.
  3. We want to use LINQ.

First of all, let's define a table-value function (TVF) on SQL server:

~~~sql CREATE FUNCTION [dbo].[GetLocalizedEntity] (
@CultureId bigint ) RETURNS TABLE AS RETURN ( SELECT E.Id as Id, E.Field1, E.Field2, CASE WHEN EL.CultureId IS NULL THEN E.CultureId ELSE EL.CultureId END AS CultureId, CASE WHEN EL.CultureId IS NULL THEN E.Text1 ELSE EL.Text1 END AS Text1, CASE WHEN EL.CultureId IS NULL THEN E.Text2 ELSE EL.Text2 END AS Text2 FROM dbo.Entity E LEFT OUTER JOIN dbo.EntityLocalized EL ON E.Id = EL.EntityId AND EL.CultureId = @CultureId ) ~~~

It receives an Id of language and returns all fields in specific localization. Wherein, if there is no localization with specific @CultureId, you'll get default one. Advantage of TVFs is that they are like parameterized views, and we can select it, use joins and so on:

~~~sql SELECT * FROM [dbo].[GetLocalizedEntity] (1) AS t INNER JOIN dbo.Cultures c ON t.CultureId = c.Id WHERE t.Text1 LIKE '%test%' ~~~

Now, implement this TVF in Entity Framework. Import your model (including our function) in EDMX (or use Code First approach - as you wish). For example, I use Database First:

EDMX

And the most important part: edit function import. By default, it generates new Complex Type, we need to change it to our object class:

Function Import

And call it from DbContext:

~~~c# using (var context = new Entities()) { var result = context.GetLocalizedEntity(1) .Select(o => new { o.Field1, o.Field2, o.Cultures.Name }) .OrderBy(o => o.Name) .ToArray(); } ~~~ As you can see, result of GetLocalizedEntity is IQueryable>, and we can do all we want :)

You can hide TVF-call inside a specific repository for Entity.