Translation database schema for software (exemplified)


I have a device with a GUI (menus etc.) that needs to be translated. The translations are managed in a database.
I have looked at several answers to the same problem:
Best practice for multi-language
Schema for a multilanguage database
As most of those answers mainly describe schemas, I have tried to build a small example using SQLite:
Database schemas:
-- language table to hold individual languages
--
CREATE TABLE "languages" (
    "languageID" INTEGER NOT NULL,
    "langNativeName" TEXT,
    "langEnglish" TEXT,
    "langISOCode" TEXT,
    PRIMARY KEY("languageID")
);
-- words table to hold the strings to be translated
-- each string has an identifier used in software, such as "mainMenu", "label123" etc
CREATE TABLE "words" (
    "wordID" INTEGER NOT NULL,
    "wordKey" TEXT,
    "wordDefault" TEXT,
    PRIMARY KEY("wordID")
);
-- translations
-- combining languages and words
CREATE TABLE "translations" (
    "keyID" INTEGER NOT NULL,
    "langID" INTEGER NOT NULL,
    "translation" TEXT,
    PRIMARY KEY("keyID","langID")
);
With some sample data:
Languages:
+------------+----------------+-------------+-------------+
| languageID | langNativeName | langEnglish | langISOCode |
+------------+----------------+-------------+-------------+
| 1          | English        | English     | en          |
| 2          | Français       | French      | fr          |
| 3          | Deutsch        | German      | de          |
+------------+----------------+-------------+-------------+
Words: (strings to translate). The wordKey will always be unique.
+--------+----------------+-------------+
| wordID | wordKey        | wordDefault |
+--------+----------------+-------------+
| 1      | tileProduction | Start       |
| 2      | tileJobs       | Job         |
| 3      | tileGoto       | Go To       |
+--------+----------------+-------------+
Translations:
word (1) is not translated for English
word (3) is not translated at all
+-------+--------+--------------------+
| keyID | langID | translation        |
+-------+--------+--------------------+
| 1     | 2      | Produccion         |
| 1     | 3      | Produktion         |
| 2     | 2      | Seleccion de tarea |
| 2     | 3      | Jobauswahl         |
+-------+--------+--------------------+
My proposed SQL to get a translation
As I have understood from examples, the following SQL should be used to get a translation. An extra field has been added to get a star on untranslated items.
select
    wordKey,
    coalesce(translation, wordDefault) displaytext,
    case coalesce(translation, wordDefault)
        when translation then ''
        else '*'
    end remark
from
    words
    left join translations
        on words.wordid = translations.keyID
        AND translations.langid = 3
    left join languages
        on languages.languageid = translations.langID
        AND translations.langID = 3
Which will give me:
+----------------+-------------+--------+
| wordKey        | displaytext | remark |
+----------------+-------------+--------+
| tileProduction | Produktion  |        |
| tileJobs       | Jobauswahl  |        |
| tileGoto       | Go To       | *      |
+----------------+-------------+--------+
This works, but before I start inserting larger amounts of real data I am unsure about two things:
Is this model correct, and if not, what have I missed? There will be some metadata for each language, which I have left out for clarity.
Is the query to get a translation correct, or can it be done better?
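For reference, the schema and lookup above can be reproduced end to end with Python's built-in sqlite3 module. This is only a sketch under a few assumptions that are not in the original post: languageID is stored as an INTEGER so it matches translations.langID, the UNIQUE and REFERENCES constraints are additions, and the helper name translate is made up for the example. The query drops the join to languages (no column from it is selected) and binds the language id as a parameter.

```python
import sqlite3

# Assumed variant of the schema: integer keys plus UNIQUE / REFERENCES constraints.
SCHEMA = """
CREATE TABLE languages (
    languageID     INTEGER PRIMARY KEY,
    langNativeName TEXT,
    langISOCode    TEXT UNIQUE
);
CREATE TABLE words (
    wordID      INTEGER PRIMARY KEY,
    wordKey     TEXT NOT NULL UNIQUE,
    wordDefault TEXT
);
CREATE TABLE translations (
    keyID       INTEGER NOT NULL REFERENCES words(wordID),
    langID      INTEGER NOT NULL REFERENCES languages(languageID),
    translation TEXT,
    PRIMARY KEY (keyID, langID)
);
"""

# One LEFT JOIN is enough: untranslated words keep a NULL translation,
# fall back to wordDefault, and are flagged with '*'.
LOOKUP = """
SELECT w.wordKey,
       COALESCE(t.translation, w.wordDefault) AS displaytext,
       CASE WHEN t.translation IS NULL THEN '*' ELSE '' END AS remark
FROM words AS w
LEFT JOIN translations AS t
       ON t.keyID = w.wordID AND t.langID = ?
ORDER BY w.wordID
"""


def translate(conn, lang_id):
    """Return (wordKey, displaytext, remark) rows for one language id."""
    return conn.execute(LOOKUP, (lang_id,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO languages VALUES (?, ?, ?)",
    [(1, "English", "en"), (2, "Français", "fr"), (3, "Deutsch", "de")],
)
conn.executemany(
    "INSERT INTO words VALUES (?, ?, ?)",
    [(1, "tileProduction", "Start"), (2, "tileJobs", "Job"), (3, "tileGoto", "Go To")],
)
conn.executemany(
    "INSERT INTO translations VALUES (?, ?, ?)",
    [(1, 2, "Produccion"), (1, 3, "Produktion"),
     (2, 2, "Seleccion de tarea"), (2, 3, "Jobauswahl")],
)

german_rows = translate(conn, 3)
for row in german_rows:
    print(row)
```

Calling translate(conn, 1) returns the default text for every word, each flagged with '*', because no English translations are stored.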

Related

How to use JOIN, CROSS JOIN to combine globalized stored values in SQL into a single table

We have various tables pertaining to different entities where we would like to globalize the stored values. We do not know how to proceed technically anymore and are open to any form of help, hints or tips.
Language
ID | Culture | Description |
---+---------+-------------+
1 | EN | English |
2 | FR | French |
3 | ES | Spanish |
Job
ID | Description |
---+-------------+
1 | Doctor |
2 | Firefighter |
JobGlobalization
ID | JobID | Description | Culture |
---+-------+-------------+---------+
1 | 1 | Docteur | FR |
2 | 1 | Doctora | ES |
We attempted to use CROSS JOIN to obtain something of the following:
ID | Description | Culture |
---+-------------+---------+
1 | Doctor | EN |
1 | Doctor | FR |
1 | Doctor | ES |
2 | Firefighter | EN |
2 | Firefighter | FR |
2 | Firefighter | ES |
Query used:
SELECT Job.ID, Job.Description, Language.Culture
FROM Job
CROSS JOIN Language
ORDER BY Job.ID
We experimented with different joins on the child globalization table in order to correlate the entities, but the result set kept multiplying in the wrong way.
We would like a row to be selected for every parent entity and every culture in the Language table, whether or not the parent has any related child entities. The description column should default to the parent entity's description when there is no associated record in the child table.
The resulting table should be as follows:
ID | Description | Culture |
---+-------------+---------+
1 | Doctor | EN |
1 | Docteur | FR |
1 | Doctora | ES |
2 | Firefighter | EN |
2 | Firefighter | FR |
2 | Firefighter | ES |
We had in mind a condition that would select the 'Description' column from the parent table 'Job' if there were no corresponding record for it in the child table.
e.g.
IIF(JobGlobalization.Description IS NOT NULL, JobGlobalization.Description, Job.Description)

We attempted to use CROSS JOIN to obtain something of the following:
This should produce the result set you describe:
SELECT j.ID, j.Description, l.Culture
FROM Job j CROSS JOIN
Language l
ORDER BY j.ID, l.Culture;
You can insert this into JobGlobalization (although you might want to truncate it first). Or you can use CREATE TABLE AS (or the equivalent for your database) to create JobGlobalization from scratch.
You would then need to update this table with the appropriate values for the culture.
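If the goal is to read the localized result directly rather than materialize it, the fallback described in the question can be sketched by adding a LEFT JOIN to the cross join and using COALESCE (the portable equivalent of the IIF idea). The sketch below uses Python's sqlite3 module purely as a stand-in for whatever database is actually in use; the table and column names come from the question, and the function name localized_jobs is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Language (ID INTEGER PRIMARY KEY, Culture TEXT, Description TEXT);
CREATE TABLE Job (ID INTEGER PRIMARY KEY, Description TEXT);
CREATE TABLE JobGlobalization (
    ID INTEGER PRIMARY KEY, JobID INTEGER, Description TEXT, Culture TEXT
);
INSERT INTO Language VALUES (1, 'EN', 'English'), (2, 'FR', 'French'), (3, 'ES', 'Spanish');
INSERT INTO Job VALUES (1, 'Doctor'), (2, 'Firefighter');
INSERT INTO JobGlobalization VALUES (1, 1, 'Docteur', 'FR'), (2, 1, 'Doctora', 'ES');
""")

# Every job paired with every culture; the LEFT JOIN supplies a localized
# description when one exists, and COALESCE falls back to Job.Description.
QUERY = """
SELECT j.ID,
       COALESCE(g.Description, j.Description) AS Description,
       l.Culture
FROM Job AS j
CROSS JOIN Language AS l
LEFT JOIN JobGlobalization AS g
       ON g.JobID = j.ID AND g.Culture = l.Culture
ORDER BY j.ID, l.ID
"""


def localized_jobs(conn):
    """Return one (ID, Description, Culture) row per job and culture."""
    return conn.execute(QUERY).fetchall()


for row in localized_jobs(conn):
    print(row)
```

The important detail is that the join condition matches on both the job and the culture; matching on the job alone is what makes the result set multiply.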

How can I make a selection that filters out the duplicates without grouping them? I want them all to display individually

I have some data in a database that is stored as a text column, a unique identifier for each text item, and a language for each text item.
SELECT Text, Language, COUNT(*)
FROM TableA
WHERE Language = 'English'
GROUP BY Text, Language
HAVING COUNT(*) > 1
This query finds the duplicated text I need, but the results are grouped, so they display as:
| Text | Language | Amount Counted |
|------------|----------|-----------------|
| Hello Text | English | 5 |
I can group on the text to get a count, but I cannot figure out how to include the unique identifier and list every duplicate as its own row. For example, the text 'Hello' could be in the table 5 times, and I would get it listed as above. However, each copy of 'Hello' has a different ID value: perhaps the first is ID 232 and the second is ID 546. How can I include the ID value, which is in the same table, and list all the duplicates with their ID values?
So I would get, as an example:
| Text | Language | ID |
|----------------|----------|------|
| Hello Text | English | 232 |
| Hello Text | English | 546 |
| Hello Text | English | 643 |
| Hello Text | English | 745 |
| Hello Text | English | 1353 |
| Other Text | English | 343 |
| Other Text | English | 433 |
| Different Text | English | 433 |
| Different Text | English | 437 |
| Different Text | English | 563 |
| Different Text | English | 898 |

Do you just want a window function?
SELECT text, language, id
FROM (SELECT a.*, COUNT(*) OVER (PARTITION BY Text) as cnt
FROM TableA a
WHERE Language = 'English'
) a
WHERE cnt > 1
ORDER BY id;
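To see the window-function approach run, here is a small sketch using Python's sqlite3 module. It assumes an SQLite build of 3.25 or newer (the first version with window functions); the sample rows and the helper name duplicates are invented for the example, and the ordering is changed to text then id so the output resembles the listing in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (ID INTEGER PRIMARY KEY, Text TEXT, Language TEXT)")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?)",
    [
        (232, "Hello Text", "English"),
        (546, "Hello Text", "English"),
        (343, "Other Text", "English"),
        (433, "Other Text", "English"),
        (900, "Unique Text", "English"),   # appears once, so it is filtered out
        (901, "Hello Text", "French"),     # other language, never counted
    ],
)

# COUNT(*) OVER (PARTITION BY Text) attaches the group size to every row
# without collapsing the rows, so each ID stays visible.
QUERY = """
SELECT Text, Language, ID
FROM (SELECT a.*, COUNT(*) OVER (PARTITION BY Text) AS cnt
      FROM TableA AS a
      WHERE Language = ?) AS counted
WHERE cnt > 1
ORDER BY Text, ID
"""


def duplicates(conn, language):
    """Return every row whose text occurs more than once in the given language."""
    return conn.execute(QUERY, (language,)).fetchall()


for row in duplicates(conn, "English"):
    print(row)
```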

Query M:N contains

I am trying to filter a set of tables that includes an M:N junction table in Android Room (SQLite).
An image can have many subjects. I'd like to allow filtering by a subject, so that I get a row with complete image information (including all subjects). So if an image had (National Park, Yosemite) filtering for either would result in one row with both keywords. Unless I messed something up, a typical join will result in multiple rows such that matching Yosemite would get the right image, but you'd be lacking National Park. I came up with this:
SELECT *,
(SELECT GROUP_CONCAT(name)
FROM meta_subject_junction
JOIN subject
ON subject.id = meta_subject_junction.subjectId
WHERE meta_subject_junction.metaId = meta.id) AS keywords,
(SELECT documentUri
FROM image_parent
WHERE meta.parentId = image_parent.id ) AS parentUri
FROM meta
Now this gets me the complete rows, but I think at this point I'd need to:
WHERE keywords LIKE '%YOSEMITE%'
and I think the LIKE is less than ideal, not to mention an imprecise match. Is there a better way to accomplish this? Thanks, this is bending my novice SQL brain.
Further details
meta
+----+----------+--+
| id | name | |
+----+----------+--+
| 1 | yosemite | |
| 2 | bryce | |
| 3 | flowers | |
+----+----------+--+
subject
+----+---------------+--+
| id | name | |
+----+---------------+--+
| 1 | National Park | |
| 2 | Yosemite | |
| 3 | Tulip | |
+----+---------------+--+
junction
+--------+-----------+
| metaId | subjectId |
+--------+-----------+
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
| 3 | 3 |
+--------+-----------+
Although I may have done something wrong, as far as I can tell Android Room doesn't like:
+----+-----------+---------------+
| id | name | subject |
+----+-----------+---------------+
| 1 | yosemite | National Park |
| 1 | yosemite | Yosemite |
+----+-----------+---------------+
so I'm trying to reduce the rows:
+----+-----------+-------------------------+
| id | name | subject |
+----+-----------+-------------------------+
| 1 | yosemite | National Park, Yosemite |
+----+-----------+-------------------------+
which the above query does. However, I also want to query for a subject. So that National Park filter will yield:
+----+-----------+-------------------------+
| id | name | subject |
+----+-----------+-------------------------+
| 1 | yosemite | National Park, Yosemite |
| 2 | bryce | National Park |
+----+-----------+-------------------------+
I'd like to be more precise/efficient than LIKE with the already 'concat' subject. Most of my attempts end up with no results in Room (multi-row) or reducing the subject to only the filter keyword.
Update
Here's a test I've been using to compare the actual SQL results from a query to what Android Room ends up with:
http://sqlfiddle.com/#!7/0ac11/10/0
That join query is interpreted as four objects in Android Room, so I'm trying to reduce the rows, but retain the full subject results while filtering for any image containing the subject keyword.

If you want multiple keywords, then where and group by and having can be used:
select image_id
from image_subject
where subject_id in ('a', 'b', 'c') -- whatever
group by image_id
having count(distinct subject_id) = 3; -- same count as in `where`

This gets the result I need, though I'd love to hear a better option if this is particularly inefficient.
SELECT meta.*,
(SELECT GROUP_CONCAT(name)
FROM junction
JOIN subject
ON subject.id = junction.subjectId
WHERE junction.metaId = meta.id) AS keywords,
junction.subjectId
FROM meta
LEFT JOIN junction ON junction.metaId = meta.id
WHERE subjectId IN (1,2)
GROUP BY meta.id
+----+----------+------------------------+-----------+
| id | name | keywords | subjectId |
+----+----------+------------------------+-----------+
| 1 | yosemite | National Park,Yosemite | 2 |
| 2 | bryce | National Park | 1 |
+----+----------+------------------------+-----------+
http://sqlfiddle.com/#!7/86a76/13
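One possible alternative, offered only as a sketch, keeps the GROUP_CONCAT subquery for the full keyword list and moves the filter into an EXISTS test, so there is no outer join or GROUP BY and the match on the subject is exact. The table names follow the self-answer (meta, subject, junction); the function name images_with_subject is invented, and plain Python sqlite3 is used here instead of Android Room, so whether Room accepts the same SQL has not been checked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE meta (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE subject (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE junction (metaId INTEGER, subjectId INTEGER, PRIMARY KEY (metaId, subjectId));
INSERT INTO meta VALUES (1, 'yosemite'), (2, 'bryce'), (3, 'flowers');
INSERT INTO subject VALUES (1, 'National Park'), (2, 'Yosemite'), (3, 'Tulip');
INSERT INTO junction VALUES (1, 1), (1, 2), (2, 1), (3, 3);
""")

# The correlated subquery builds the complete keyword list for each image;
# EXISTS keeps only images that have the requested subject.
QUERY = """
SELECT m.id,
       m.name,
       (SELECT GROUP_CONCAT(s.name)
        FROM junction AS j
        JOIN subject AS s ON s.id = j.subjectId
        WHERE j.metaId = m.id) AS keywords
FROM meta AS m
WHERE EXISTS (SELECT 1
              FROM junction AS j
              JOIN subject AS s ON s.id = j.subjectId
              WHERE j.metaId = m.id AND s.name = ?)
ORDER BY m.id
"""


def images_with_subject(conn, subject_name):
    """Return (id, name, keywords) for each image tagged with subject_name."""
    return conn.execute(QUERY, (subject_name,)).fetchall()


for row in images_with_subject(conn, "National Park"):
    print(row)
```

SQLite does not guarantee the order of names inside GROUP_CONCAT, so the keyword string should be treated as an unordered list.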

sqlite3: Create tables with many rows or one table with more columns

I'm asking for a best practice for creating tables to localize a web interface in sqlite3.
My first intention was to create one table with the different languages and another one for the message codes and entries.
tblLanguage
+------------+-------------+---------+
| idLangCode | txtLangName | txtCode |
+------------+-------------+---------+
| 1 | English | en |
| 2 | German | de |
| 3 | French | fr |
| 4 | Spanish | es |
| 5 | Chinese | zh |
+------------+-------------+---------+
tblMessageText
+----+-------+--------------------------+------------+
| Id | Code | Message | LanguageID |
+----+-------+--------------------------+------------+
| 1 | 20500 | Set Point changed | 1 |
| 2 | 20500 | Sollwert geändert | 2 |
| 3 | 20500 | Punto de ajuste cambiado | 5 |
+----+-------+--------------------------+------------+
So in the second table I would have several rows with the same message code but with a different language text.
The other possibility would be to have just one table with one row for each message code but a column for each language.
tblMessageTextMulti
+----+-------+-------------------+-------------------+--------------------------+
| id | Code | txtMessageText_EN | txtMessageText_DE | txtMessageText_ES |
+----+-------+-------------------+-------------------+--------------------------+
| 1 | 20500 | Set Point changed | Sollwert geändert | Punto de ajuste cambiado |
+----+-------+-------------------+-------------------+--------------------------+
My team prefers the second solution with just one table, because it has only one entry per message code and shows all language texts side by side.
What I like about the first solution is that I can query the language dynamically with just one line in PHP:
$query = 'SELECT * FROM qryInfoMessage WHERE idLangCode=' .$Language;
For the second solution with one table, I cannot store the query itself in the database, because I have to change the column name in the query dynamically. So I have to assemble it in PHP.
$query = 'SELECT Code, txtMessageText_EN FROM tblMessageTextMulti';
What I don't show here is that my real query is much more complex, with string substitution and date/time conversion.
Besides that, what are the advantages and disadvantages of these solutions? Which one should perform better, and what is the best practice?
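The difference in how the two layouts are queried can be sketched as follows. This uses Python's sqlite3 module rather than PHP, a trimmed version of the tables above, and two invented helper names (messages_by_row, messages_by_column). With one row per language, the language is ordinary data and can be bound as a parameter; with one column per language, the column name has to be spliced into the SQL text, so the sketch looks it up in a fixed whitelist first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblLanguage (idLangCode INTEGER PRIMARY KEY, txtLangName TEXT, txtCode TEXT);
CREATE TABLE tblMessageText (
    Id INTEGER PRIMARY KEY, Code INTEGER, Message TEXT,
    LanguageID INTEGER REFERENCES tblLanguage(idLangCode)
);
CREATE TABLE tblMessageTextMulti (
    id INTEGER PRIMARY KEY, Code INTEGER,
    txtMessageText_EN TEXT, txtMessageText_DE TEXT
);
INSERT INTO tblLanguage VALUES (1, 'English', 'en'), (2, 'German', 'de');
INSERT INTO tblMessageText VALUES
    (1, 20500, 'Set Point changed', 1),
    (2, 20500, 'Sollwert geändert', 2);
INSERT INTO tblMessageTextMulti VALUES
    (1, 20500, 'Set Point changed', 'Sollwert geändert');
""")


def messages_by_row(conn, lang_code):
    """Row-per-language layout: the language is a bound parameter."""
    return conn.execute(
        """SELECT m.Code, m.Message
           FROM tblMessageText AS m
           JOIN tblLanguage AS l ON l.idLangCode = m.LanguageID
           WHERE l.txtCode = ?
           ORDER BY m.Code""",
        (lang_code,),
    ).fetchall()


# Column-per-language layout: identifiers cannot be bound as parameters,
# so only names from this fixed mapping are ever placed into the SQL text.
LANGUAGE_COLUMNS = {"en": "txtMessageText_EN", "de": "txtMessageText_DE"}


def messages_by_column(conn, lang_code):
    column = LANGUAGE_COLUMNS[lang_code]  # raises KeyError for unknown languages
    return conn.execute(
        f"SELECT Code, {column} FROM tblMessageTextMulti ORDER BY Code"
    ).fetchall()


print(messages_by_row(conn, "de"))
print(messages_by_column(conn, "de"))
```

In this sketch, adding a language to the first layout is an INSERT, while the second layout needs an ALTER TABLE plus a new entry in the mapping.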

Create a summary result with one query

I have a table with the following format.
mysql> describe unit_characteristics;
+----------------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| uut_id | int(10) unsigned | NO | PRI | NULL | |
| uut_sn | varchar(45) | NO | | NULL | |
| characteristic_name | varchar(80) | NO | PRI | NULL | |
| characteristic_value | text | NO | | NULL | |
| creation_time | datetime | NO | | NULL | |
| last_modified_time | datetime | NO | | NULL | |
+----------------------+------------------+------+-----+---------+----------------+
Each uut_sn has multiple characteristic_name/value pairs. I want to use MySQL to generate a table like this:
+--------+-------------+-------------+-------------+-------------+-----+
| uut_sn | char_name_1 | char_name_2 | char_name_3 | char_name_4 | ... |
+--------+-------------+-------------+-------------+-------------+-----+
| 00000  | char_val_1  | char_val_2  | char_val_3  | char_val_4  | ... |
| 00001  | char_val_1  | char_val_2  | char_val_3  | char_val_4  | ... |
| 00002  | char_val_1  | char_val_2  | char_val_3  | char_val_4  | ... |
| .....  | char_val_1  | char_val_2  | char_val_3  | char_val_4  | ... |
+--------+-------------+-------------+-------------+-------------+-----+
Is this possible with just one query?
Thanks,
-peter

This is a standard pivot query:
SELECT uc.uut_sn,
       MAX(CASE
             WHEN uc.characteristic_name = 'char_name_1' THEN uc.characteristic_value
             ELSE NULL
           END) AS char_name_1,
       MAX(CASE
             WHEN uc.characteristic_name = 'char_name_2' THEN uc.characteristic_value
             ELSE NULL
           END) AS char_name_2,
       MAX(CASE
             WHEN uc.characteristic_name = 'char_name_3' THEN uc.characteristic_value
             ELSE NULL
           END) AS char_name_3
FROM unit_characteristics uc
GROUP BY uc.uut_sn
To make it dynamic, you need to use MySQL's dynamic SQL syntax called Prepared Statements. It requires two queries - the first gets a list of the characteristic_name values, so you can concatenate the appropriate string into the CASE expressions like you see in my example as the ultimate query.
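The two-step idea can be sketched outside MySQL as well. The example below is not the prepared-statement syntax mentioned above: it swaps in Python with sqlite3 and builds the query string in application code, which is an assumption made to keep the example self-contained. The table is reduced to the three columns that matter, and the sample data and the helper names quote_identifier and pivot are invented.

```python
import sqlite3


def quote_identifier(name):
    """Quote a value for use as a column alias (doubles embedded quotes)."""
    return '"' + name.replace('"', '""') + '"'


def pivot(conn):
    """Return (header, rows) with one column per distinct characteristic_name."""
    # Query 1: discover which characteristics exist.
    names = [row[0] for row in conn.execute(
        "SELECT DISTINCT characteristic_name FROM unit_characteristics "
        "ORDER BY characteristic_name")]

    # Build one MAX(CASE ...) expression per name. The name is bound as a
    # parameter for the comparison; only the alias is placed in the SQL text.
    select_list = ["uut_sn"] + [
        "MAX(CASE WHEN characteristic_name = ? THEN characteristic_value END) AS "
        + quote_identifier(name)
        for name in names
    ]
    sql = ("SELECT " + ", ".join(select_list) +
           " FROM unit_characteristics GROUP BY uut_sn ORDER BY uut_sn")

    # Query 2: run the generated pivot.
    cursor = conn.execute(sql, names)
    header = [column[0] for column in cursor.description]
    return header, cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE unit_characteristics (
    uut_sn TEXT, characteristic_name TEXT, characteristic_value TEXT)""")
conn.executemany(
    "INSERT INTO unit_characteristics VALUES (?, ?, ?)",
    [("00000", "color", "red"), ("00000", "weight", "10"), ("00001", "color", "blue")],
)

header, rows = pivot(conn)
print(header)
for row in rows:
    print(row)
```

A unit that lacks a characteristic gets NULL (None in Python) in that column, since MAX over no matching rows yields NULL.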

You're using the EAV antipattern. There's no way to automatically generate the pivot table you describe without hardcoding the characteristics you want to include. As @OMG Ponies mentions, you need to use dynamic SQL to generate the query in a custom fashion for the set of characteristics you want to include in the result.
Instead, I recommend you fetch the characteristics one per row, as they are stored in the database, and if you want an application object to represent a single UUT with all its characteristics, you write code to loop over the rows as you fetch them in your application, collecting them into objects.
For example in PHP:
$sql = "SELECT uut_sn, characteristic_name, characteristic_value
        FROM unit_characteristics";
$stmt = $pdo->query($sql);
$objects = array();
while ($row = $stmt->fetch()) {
    // Create one Uut object per serial number the first time it is seen.
    if (!isset($objects[ $row["uut_sn"] ])) {
        $objects[ $row["uut_sn"] ] = new Uut();
    }
    // Braces make the dynamic property name unambiguous in PHP 7+.
    $objects[ $row["uut_sn"] ]->{$row["characteristic_name"]}
        = $row["characteristic_value"];
}
This has a few advantages over the solution of hardcoding characteristic names in your query:
This solution takes only one SQL query instead of two.
No complex code is needed to build your dynamic SQL query.
If you forget one of the characteristics, this solution automatically finds it anyway.
GROUP BY in MySQL is often slow, and this avoids the GROUP BY.
