How do I correctly map letters in the database?

I have two tables: one with the letters of different languages, and a second that maps these letters to each other.
I need a query that returns the mapped letters for two languages. Can you tell me how to do this optimally?
The letter table
id letter language
1 A en
2 Ä de
3 A de
4 O en
5 O de
6 Ö de
The letter mapping table
id letter1(letterTable.id) letter2(letterTable.id)
1 1 2
2 1 3
3 4 5
4 4 6
Would it be better to create a separate table for each alphabet?
Maybe there is some other architectural approach for this kind of letter matching? I would really appreciate it!

This can be achieved with a join that is restricted to the two languages you want to check:
select en.id as id_en,
en.letter as letter_en,
de.id as id_de,
de.letter as letter_de
from letter en
join letter_mapping lm on lm.letter1 = en.id
join letter de on de.id = lm.letter2 and de.language = 'de'
where en.language = 'en';
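To try the join above without a database server, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database from Python. The table names letter and letter_mapping come from the answer's query; the helper mapped_letters and the column types are assumptions made only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE letter (
    id INTEGER PRIMARY KEY,
    letter TEXT NOT NULL,
    language TEXT NOT NULL
);
CREATE TABLE letter_mapping (
    id INTEGER PRIMARY KEY,
    letter1 INTEGER NOT NULL REFERENCES letter(id),
    letter2 INTEGER NOT NULL REFERENCES letter(id)
);
INSERT INTO letter VALUES
    (1, 'A', 'en'), (2, 'Ä', 'de'), (3, 'A', 'de'),
    (4, 'O', 'en'), (5, 'O', 'de'), (6, 'Ö', 'de');
INSERT INTO letter_mapping VALUES (1, 1, 2), (2, 1, 3), (3, 4, 5), (4, 4, 6);
""")


def mapped_letters(conn, lang_from, lang_to):
    """Hypothetical helper: same join as the answer, with both languages as parameters."""
    sql = """
        SELECT src.letter, dst.letter
        FROM letter src
        JOIN letter_mapping lm ON lm.letter1 = src.id
        JOIN letter dst ON dst.id = lm.letter2 AND dst.language = ?
        WHERE src.language = ?
        ORDER BY src.id, dst.id
    """
    return conn.execute(sql, (lang_to, lang_from)).fetchall()


pairs = mapped_letters(conn, "en", "de")
reverse_pairs = mapped_letters(conn, "de", "en")
print(pairs)
print(reverse_pairs)
```

With the sample data, the reverse lookup (de to en) returns nothing, because the mapping rows only store the English letter in letter1. If lookups are needed in both directions, the query would also have to match on letter2, or the mapping would need to be stored both ways.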

Related

How to assign data without repetition in SQL

I need to create automatic weekly assignments of items to sites for my employees.
The items table, items_bank, looks like this (of course there will be many more items, in a few more languages):
item_id item_name language
1 Jorge Garcia English
2 Chrissy Metz English
3 Nina Hagen German
4 Harald Glööckle German
5 Melissa Anderson French
6 Pauley Perrette French
My second table is the sites table:
site_id site_name
1 DR
2 LI
3 IG
Every week I need to assign items to the sites under the following constraints:
For each site, assign at least X English items, Y German items, and so on.
We want diversity, so we would like to avoid repeating the assignments of the previous 2 weeks.
I think we need another table that stores the history of the last 2 weeks' assignments.
So far I have managed to write an SQL query that assigns items, but I don't know how to take the constraints into account. This is what I have so far:
WITH numbered_tasks AS (
SELECT t.*, row_number() OVER (ORDER BY rand()) item_number, count(*) OVER () total_items
FROM item_bank t
),
numbered_employees AS (
SELECT e.*,row_number() OVER (ORDER BY rand()) site_number,
count(*) OVER () total_sites
FROM sites_bank e
)
SELECT nt.item_name,
ne.acronym
FROM numbered_tasks nt
INNER JOIN numbered_employees ne
ON ne.site_number-1 = mod(nt.item_number-1, ne.total_sites)
The expected results for the example, which says:
site_id=1 has to get 1 English item
site_id=2 has to get 1 German item
site_id=3 has to get 1 French item
item_id language Week_number site
1 English 1 1
4 German 1 2
5 French 1 3
Any help will be appreciated!
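One possible way to express these constraints, sketched here with SQLite from Python. The tables site_quota and assignment_history, the helper assign_week, and the reading of "no repetition" as "the same item is not given to the same site again within the following two weeks" are all assumptions for illustration; none of them come from the question. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items_bank (item_id INTEGER PRIMARY KEY, item_name TEXT, language TEXT);
CREATE TABLE sites (site_id INTEGER PRIMARY KEY, site_name TEXT);
-- Hypothetical helper tables (not in the question):
CREATE TABLE site_quota (site_id INTEGER, language TEXT, quota INTEGER);
CREATE TABLE assignment_history (item_id INTEGER, site_id INTEGER, week_number INTEGER);

INSERT INTO items_bank VALUES
    (1, 'Jorge Garcia', 'English'), (2, 'Chrissy Metz', 'English'),
    (3, 'Nina Hagen', 'German'), (4, 'Harald Glööckle', 'German'),
    (5, 'Melissa Anderson', 'French'), (6, 'Pauley Perrette', 'French');
INSERT INTO sites VALUES (1, 'DR'), (2, 'LI'), (3, 'IG');
INSERT INTO site_quota VALUES (1, 'English', 1), (2, 'German', 1), (3, 'French', 1);
""")

# Rank the eligible items randomly per (site, language) and keep `quota` of them.
# An item is eligible for a site if that site did not get it in the previous 2 weeks.
ASSIGN_SQL = """
INSERT INTO assignment_history (item_id, site_id, week_number)
SELECT item_id, site_id, :week
FROM (
    SELECT q.site_id, i.item_id, q.quota,
           row_number() OVER (PARTITION BY q.site_id, q.language
                              ORDER BY random()) AS rn
    FROM site_quota q
    JOIN items_bank i ON i.language = q.language
    WHERE NOT EXISTS (
        SELECT 1
        FROM assignment_history h
        WHERE h.item_id = i.item_id
          AND h.site_id = q.site_id
          AND h.week_number BETWEEN :week - 2 AND :week - 1
    )
) AS ranked
WHERE rn <= quota
"""


def assign_week(conn, week):
    """Store the assignments for one week and return them as (item_id, site_id) rows."""
    conn.execute(ASSIGN_SQL, {"week": week})
    return conn.execute(
        "SELECT item_id, site_id FROM assignment_history"
        " WHERE week_number = ? ORDER BY site_id",
        (week,),
    ).fetchall()


week1 = assign_week(conn, 1)
week2 = assign_week(conn, 2)
print(week1)
print(week2)
```

Note that with only two items per language, a third week would find no eligible item for any site, so the item pool has to be large enough for the quotas and the two-week window. This sketch also does not stop two sites from receiving the same item in the same week.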

SQL subquery as part of LIKE search

From this Question we learned to use a subquery to find information once-removed.
Subquery we learned :
SELECT * FROM papers WHERE writer_id IN ( SELECT id FROM writers WHERE boss_id = 4 );
Now I need to search a table both in that table's own column values and in column values of another table related by id.
Here are the same tables, but the column values contain more text for our "searching" reference...
writers :
id name boss_id
1 John Jonno 2
2 Bill Bosworth 2
3 Andy Seaside 4
4 Hank Little 4
5 Alex Crisp 4
The writers have papers they write...
papers :
id title writer_id
1 Boston 1
2 Chicago 4
3 Cisco 3
4 Seattle 2
5 North 5
I can use this to search only the names on writers...
Search only writers.name : (Not what I want to do)
SELECT * FROM writers WHERE LOWER(name) LIKE LOWER('%is%');
Output for above search : (Not what I want to do)
id name boss_id
5 Alex Crisp 4
I want to return cols from writers (not papers), but searching text both in writers.name and the writers.id-associated papers.title.
For example, if I searched "is", I would get both:
Alex Crisp (for 'is' in the name 'Crisp')
Andy Seaside (because Andy wrote a paper with 'is' in the title 'Cisco')
Output for "is" search :
id name boss_id
3 Andy Seaside 4
5 Alex Crisp 4
Here's what I have that doesn't work:
SELECT * FROM papers WHERE LOWER(title) LIKE LOWER('%is%') OR writer_id ( writers=writer_id WHERE LOWER(name) LIKE LOWER('%$is%') );

The best way to express this criterion is with a correlated subquery using exists:
select *
from writers w
where Lower(w.name) like '%is%'
or exists (
select * from papers p
where p.writer_id = w.id and Lower(p.title) like '%is%'
);
Note that you don't need to use lower on the string you are providing, and you should only use lower if your collation truly is case-sensitive, since wrapping the column in a function makes the search predicate non-sargable.
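The exists query can be checked against the question's sample data with a small in-memory SQLite database. This is only a sketch: the helper search_writers is a hypothetical name, and the search term is passed as a bound parameter instead of being pasted into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE writers (id INTEGER PRIMARY KEY, name TEXT, boss_id INTEGER);
CREATE TABLE papers (
    id INTEGER PRIMARY KEY,
    title TEXT,
    writer_id INTEGER REFERENCES writers(id)
);
INSERT INTO writers VALUES
    (1, 'John Jonno', 2), (2, 'Bill Bosworth', 2), (3, 'Andy Seaside', 4),
    (4, 'Hank Little', 4), (5, 'Alex Crisp', 4);
INSERT INTO papers VALUES
    (1, 'Boston', 1), (2, 'Chicago', 4), (3, 'Cisco', 3),
    (4, 'Seattle', 2), (5, 'North', 5);
""")


def search_writers(conn, term):
    """Writers whose name, or any of whose paper titles, contains `term`."""
    # % and _ inside `term` are not escaped here; they would act as wildcards.
    pattern = f"%{term.lower()}%"
    sql = """
        SELECT w.id, w.name
        FROM writers w
        WHERE lower(w.name) LIKE :p
           OR EXISTS (
               SELECT 1
               FROM papers p
               WHERE p.writer_id = w.id AND lower(p.title) LIKE :p
           )
        ORDER BY w.id
    """
    return conn.execute(sql, {"p": pattern}).fetchall()


is_hits = search_writers(conn, "is")
bos_hits = search_writers(conn, "bos")
print(is_hits)
print(bos_hits)
```

Searching for "is" returns Andy Seaside (through the paper "Cisco") and Alex Crisp (through the name), which is the result the question asks for. SQLite's LIKE is already case-insensitive for ASCII, so lower() is kept here only to mirror the answer.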

Since you want to return columns from writers (not papers), you should select from writers first and use papers in the criteria:
select *
from writers w
where
w.name like '%is%'
or
w.id in (select p.writer_id
from papers p
where p.title like '%is%'
)
You can add your LOWER functions (my SQL environment is not case-sensitive, so I didn't need them).

SQL CONTAINSTABLE - Unexpected results

I have a table Programs with some records, and I get different results depending on whether I use LIKE or CONTAINSTABLE.
CREATE TABLE Programs (
ID varchar(255) NOT NULL PRIMARY KEY,
Title varchar(255) NOT NULL
);
Insert INTO Programs VALUES
('1', '5 Horas em Islamabad'),
('2','Gus Melhoras" Melhora'),
('3', '13 Horas - Os Soldados Secretos de Benghazi'),
('4','72 Horas de Medo'),
('5','As Primeiras 48 Horas')
SELECT distinct Title FROM Programs WHERE Title LIKE '%Horas%'
SELECT ID, Title, KEY_TBL.RANK
FROM Programs AS DocTable
INNER JOIN CONTAINSTABLE(Programs, Title, 'Horas') AS KEY_TBL
ON DocTable.ID = KEY_TBL.[KEY]
WHERE KEY_TBL.RANK > 0
ORDER BY KEY_TBL.RANK DESC;
With LIKE I get 5 results:
ID Title
1 5 Horas em Islamabad
2 Gus Melhoras" Melhora
3 13 Horas - Os Soldados Secretos de Benghazi
4 72 Horas de Medo
5 As Primeiras 48 Horas
With CONTAINSTABLE I get 2 results:
ID Title RANK
4 72 Horas de Medo 32
5 As Primeiras 48 Horas 32
I understand why the record with the title "Gus Melhoras" Melhora" is not returned: it does not contain the word "Horas".
But the records "5 Horas em Islamabad" and "13 Horas - Os Soldados Secretos de Benghazi" do contain the word "Horas" and are not returned.
Can anybody explain why this happens and help me?
My DBMS is Microsoft SQL Server.
EDIT:
In my case I had set the "Language for Word Breaker" to "Brazilian". If I change it to "English", it correctly returns 4 items.
The word I search for, "Horas", is "Hours" in English. But if I add a new record with the title "13 hours in Islamabad" and search for the word "Hours", the record is returned.
Does anyone know why the Brazilian or Portuguese language behaves this way?
Moreover, in Spanish "Horas" is the same word, and if I change the "Language for Word Breaker" to Spanish, the 4 items are returned.
EDIT2:
Using the queries sent by @Randy in Marin, I ran the test with the Portuguese language.
SELECT s.stopword, l.name
FROM sys.fulltext_system_stopwords s
JOIN sys.fulltext_languages l ON l.lcid = s.language_id
WHERE l.lcid = 2070 -- portuguese
stopword name
0 Portuguese
1 Portuguese
2 Portuguese
3 Portuguese
4 Portuguese
5 Portuguese
6 Portuguese
7 Portuguese
8 Portuguese
9 Portuguese
a Portuguese
agora Portuguese
...
When I execute the query to find the exact matches:
SELECT occurrence, special_term, left(display_term, 20) as [display_term]
FROM sys.dm_fts_parser ('"5 Horas em Islamabad"', 2070, 0, 0); -- portuguese
occurrence special_term display_term
1 Exact Match tt24050000
1 Exact Match 5 horas
1 Exact Match tt24170000
2 Noise Word em
3 Exact Match islamabad
This is the same result as for the Brazilian language, even though the Portuguese stoplist does contain digits.

The DMV sys.dm_fts_parser shows how a phrase is parsed for different languages, with or without a stoplist and with or without accent sensitivity.
SET NOCOUNT ON
--select * from sys.syslanguages
SELECT occurrence, special_term, left(display_term, 20) as [display_term]
FROM sys.dm_fts_parser ('"5 Horas em Islamabad"', 1033, 0, 0); -- english
SELECT occurrence, special_term, left(display_term, 20) as [display_term]
FROM sys.dm_fts_parser ('"5 Horas em Islamabad"', 1046, 0, 0); -- brazilian
SELECT occurrence, special_term, left(display_term, 20) as [display_term]
FROM sys.dm_fts_parser ('"5 Horas em Islamabad"', 3082, 0, 0); -- spanish
occurrence special_term display_term
----------- ---------------- --------------------
1 Noise Word 5
1 Noise Word nn5
2 Exact Match horas
3 Exact Match em
4 Exact Match islamabad
occurrence special_term display_term
----------- ---------------- --------------------
1 Exact Match tt24050000
1 Exact Match 5 horas
1 Exact Match tt24170000
2 Noise Word em
3 Exact Match islamabad
occurrence special_term display_term
----------- ---------------- --------------------
1 Noise Word 5
1 Noise Word nn5
2 Exact Match horas
3 Exact Match em
4 Exact Match islamabad
The "5" is not a noise word in Brazilian. I tried NULL for the stoplist and both 0 and 1 for the accent sensitivity, and it did not help.
If you run the following two queries, it's clear that the Brazilian stoplist is very different. It does not have digits. Perhaps it should. Maybe a support call is required.
SELECT s.stopword, l.name
FROM sys.fulltext_system_stopwords s
JOIN sys.fulltext_languages l
ON l.lcid = s.language_id
WHERE l.lcid = 1033
stopword
----------------------------------------------------------------
$
0
1
2
3
4
5
6
7
8
9
A
B
C
D
E
...
SELECT s.stopword, l.name
FROM sys.fulltext_system_stopwords s
JOIN sys.fulltext_languages l
ON l.lcid = s.language_id
WHERE l.lcid = 1046
stopword
----------------------------------------------------------------
a
abaixo
acaso
aceleradamente
acerca
acima
acolá
ademais
adentro
adiantado
adiante
adrede
afora
agora
agorinha
ainda
alerta
algo
algum
alguma
algumas
...

LIKE and CONTAINSTABLE can be expected to have different results. LIKE uses simple and deterministic pattern matching rules and all characters are significant. CONTAINSTABLE uses a complex system that attempts to apply language specific algorithms to do fuzzy matches.
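The difference between substring matching and token matching can be illustrated outside SQL Server with a small Python sketch. This is a toy model, not the real word breaker. In particular, the rule in toy_time_tokens, which fuses a number below 24 followed by "horas" into a single token, is only a guess prompted by the parser output above, where "5 horas" shows up as one display term for LCID 1046.

```python
import re

titles = {
    "1": "5 Horas em Islamabad",
    "2": 'Gus Melhoras" Melhora',
    "3": "13 Horas - Os Soldados Secretos de Benghazi",
    "4": "72 Horas de Medo",
    "5": "As Primeiras 48 Horas",
}


def like_match(title, term):
    """Substring test, roughly what LIKE '%term%' does with a case-insensitive collation."""
    return term.lower() in title.lower()


def word_tokens(title):
    """Split a title into lowercase words, one token per word."""
    return re.findall(r"\w+", title.lower())


def toy_time_tokens(title):
    """TOY ASSUMPTION: '<number below 24> horas' becomes one token, like a time of day."""
    words = word_tokens(title)
    tokens, i = [], 0
    while i < len(words):
        is_hour = (
            i + 1 < len(words)
            and words[i].isdecimal()
            and int(words[i]) < 24
            and words[i + 1] == "horas"
        )
        if is_hour:
            tokens.append(words[i] + " " + words[i + 1])
            i += 2
        else:
            tokens.append(words[i])
            i += 1
    return tokens


like_ids = [k for k, t in titles.items() if like_match(t, "Horas")]
word_ids = [k for k, t in titles.items() if "horas" in word_tokens(t)]
toy_ids = [k for k, t in titles.items() if "horas" in toy_time_tokens(t)]
print(like_ids, word_ids, toy_ids)
```

Under these assumptions the substring test finds all five titles (including "Melhoras"), plain word tokens find four, and the toy time-aware tokens find only titles 4 and 5, which is the pattern reported in the question. Whether the Brazilian word breaker really works this way is not confirmed by anything in this thread.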
If the stored documents can be in different languages, specifying the language in CONTAINSTABLE can yield better results. The LCID might be stored in the record of the document and passed to CONTAINSTABLE in the join. If no language is specified, the language of the full-text indexed column is used, which might not be a good match.
SELECT ID, Title, KEY_TBL.RANK
FROM Programs AS DocTable
INNER JOIN CONTAINSTABLE(Programs, Title, 'Horas', LANGUAGE 1046) AS KEY_TBL
ON DocTable.ID = KEY_TBL.[KEY]
WHERE KEY_TBL.RANK > 0
ORDER BY KEY_TBL.RANK DESC;
UPDATE:
Here is a means to check in which languages a value is a stopword.
select * from sys.fulltext_system_stopwords
WHERE stopword IN ('5', 'em')
stopword language_id
---------------------------------------------------------------- -----------
5 0
5 1028
5 1030
5 1031
5 1033
5 1036
5 1040
5 1041
5 1043
5 1045
5 1049
5 1053
5 1054
5 1055
5 2052
5 2057
5 2070
5 3082
em 1046
em 2070

QMF Turning Sequence Number Results into their own columns without PIVOT

I'm sorry, I'm new and have almost no training. I've been searching for a few days on this and maybe I'm just not using the correct terms...
Using QMF for Windows.
I have 3 columns in my ADDRESSTABLE table - address identifier codes, address line sequence numbers and their corresponding address lines. ADR_CODE, SEQ_NO, ADRS_LINE.
Each address record has between 3 and 5 lines, and thus 3 to 5 sequence numbers. So, when I run a query for address identifier codes, I get 3-5 repetitions of each address identifier code. Like so:
SELECT DISTINCT A.ADR_CODE, A.SEQ_NO, A.ADRS_LINE
FROM ADDRESSTABLE A
WHERE (A.ADR_CODE LIKE 'A%')
And I get:
ADR_CODE SEQ_NO ADRS_LINE
AAAA 1 JOHN DOE
AAAA 2 123 HAPPY STREET
AAAA 3 ANYWHERE, NY
AAAA 4 12345
AABB 1 234 MAIN STREET
AABB 2 SOMEWHERE, MN
AABB 3 34567
ACDE 1 MR PINK
ACDE 2 21 RESERVOIR RD
ACDE 3 APT 4
ACDE 4 LOS ANGELES
ACDE 5 90210
And I figured out that if I do:
SELECT DISTINCT A.ADR_CODE, MIN(A.SEQ_NO CONCAT A.ADRS_LINE) AS
"FIRST ADDRESS LINE", MAX(A.SEQ_NO CONCAT A.ADRS_LINE) AS
"LAST ADDRESS LINE"
FROM ADDRESSTABLE A
WHERE (A.ADR_CODE LIKE 'A%')
GROUP BY A.ADR_CODE
ORDER BY A.ADR_CODE ASC
I get:
ADR_CODE FIRST ADDRESS LINE LAST ADDRESS LINE
AAAA 1JOHN DOE 412345
AABB 1234 MAIN STREET 334567
ACDE 1MR PINK 590210
My question is, how do I get the rest of those in-between lines? MIN+1 and MAX-1 are illegal, and so are MIN(A.SEQ_NO +1... and MAX(A.SEQ_NO-1.... I'm stuck, and I don't want to use PIVOT because I want this whole thing to be part of a larger query. In short, my query should end up with about 7000 rows of freight records, each with its own address on one line, instead of 7000 rows * (3 to 5 address lines per record). Thank you, James

Although it is not exactly a PIVOT, would joining the same address table multiple times work for you? For example:
SELECT
A.ADR_CODE,
A.ADRS_LINE as AdrLine1,
A2.ADRS_LINE as AdrLine2,
A3.ADRS_LINE as AdrLine3,
COALESCE( A4.ADRS_LINE, '' ) as AdrLine4,
COALESCE( A5.ADRS_LINE, '' ) as AdrLine5
FROM
ADDRESSTABLE A
JOIN ADDRESSTABLE A2
ON A.ADR_CODE = A2.ADR_CODE
AND A2.SEQ_NO = 2
JOIN ADDRESSTABLE A3
ON A.ADR_CODE = A3.ADR_CODE
AND A3.SEQ_NO = 3
LEFT JOIN ADDRESSTABLE A4
ON A.ADR_CODE = A4.ADR_CODE
AND A4.SEQ_NO = 4
LEFT JOIN ADDRESSTABLE A5
ON A.ADR_CODE = A5.ADR_CODE
AND A5.SEQ_NO = 5
WHERE
A.ADR_CODE LIKE 'A%'
AND A.SEQ_NO = 1
ORDER BY
A.ADR_CODE
Because the base query reads the address table only for sequence #1, you get one row per adr_code instead of one row for each of its address lines.
Then come the JOINs. I rejoin to the same address table on the same adr_code key, each time restricted to one particular sequence number. Lines 4 and 5 use LEFT JOIN since you stated that lines 1-3 always exist and 4-5 are only possible (COALESCE() is applied to prevent NULLs).
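The self-join can be tried on the question's sample rows with an in-memory SQLite database, as sketched below. SQLite stands in for the DB2 back end behind QMF, so the column types and details of syntax are assumptions. The second query, which uses MAX(CASE ...) with GROUP BY, is a different technique (conditional aggregation) that is not part of the answer above; it is shown only for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ADDRESSTABLE (ADR_CODE TEXT, SEQ_NO INTEGER, ADRS_LINE TEXT);
INSERT INTO ADDRESSTABLE VALUES
    ('AAAA', 1, 'JOHN DOE'), ('AAAA', 2, '123 HAPPY STREET'),
    ('AAAA', 3, 'ANYWHERE, NY'), ('AAAA', 4, '12345'),
    ('AABB', 1, '234 MAIN STREET'), ('AABB', 2, 'SOMEWHERE, MN'), ('AABB', 3, '34567'),
    ('ACDE', 1, 'MR PINK'), ('ACDE', 2, '21 RESERVOIR RD'), ('ACDE', 3, 'APT 4'),
    ('ACDE', 4, 'LOS ANGELES'), ('ACDE', 5, '90210');
""")

# Technique from the answer: one join per address line, LEFT JOIN for optional lines.
joined = conn.execute("""
    SELECT A.ADR_CODE, A.ADRS_LINE, A2.ADRS_LINE, A3.ADRS_LINE,
           COALESCE(A4.ADRS_LINE, ''), COALESCE(A5.ADRS_LINE, '')
    FROM ADDRESSTABLE A
    JOIN ADDRESSTABLE A2 ON A2.ADR_CODE = A.ADR_CODE AND A2.SEQ_NO = 2
    JOIN ADDRESSTABLE A3 ON A3.ADR_CODE = A.ADR_CODE AND A3.SEQ_NO = 3
    LEFT JOIN ADDRESSTABLE A4 ON A4.ADR_CODE = A.ADR_CODE AND A4.SEQ_NO = 4
    LEFT JOIN ADDRESSTABLE A5 ON A5.ADR_CODE = A.ADR_CODE AND A5.SEQ_NO = 5
    WHERE A.ADR_CODE LIKE 'A%' AND A.SEQ_NO = 1
    ORDER BY A.ADR_CODE
""").fetchall()

# Alternative, not from the answer: conditional aggregation in a single pass.
grouped = conn.execute("""
    SELECT ADR_CODE,
           MAX(CASE WHEN SEQ_NO = 1 THEN ADRS_LINE END),
           MAX(CASE WHEN SEQ_NO = 2 THEN ADRS_LINE END),
           MAX(CASE WHEN SEQ_NO = 3 THEN ADRS_LINE END),
           COALESCE(MAX(CASE WHEN SEQ_NO = 4 THEN ADRS_LINE END), ''),
           COALESCE(MAX(CASE WHEN SEQ_NO = 5 THEN ADRS_LINE END), '')
    FROM ADDRESSTABLE
    WHERE ADR_CODE LIKE 'A%'
    GROUP BY ADR_CODE
    ORDER BY ADR_CODE
""").fetchall()

for row in joined:
    print(row)
```

Both queries return one row per ADR_CODE with the address lines side by side, and either one can be used as a derived table inside a larger query.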

Query to JOIN / overwrite field

I'm not sure if I'm using the correct terminology.
SELECT movies.*, actors.`First Name`, actors.`Last Name`
From movies
Inner Join actors on movies.`actor1` Where movies.`actor1` = actors.`indexActors`;
#Inner Join actors on movies.`actor2` Where movies.`actor2` = actors.`indexActors`;
I have the second join line commented out; each one works individually, and I'm wondering how to combine them.
Secondly, when I execute the query, I get these results:
ID Title Runtime Rating Actor1 Actor2 First Name Last Name
1 Se7en 127 R 1 2 Morgan Freeman
2 Bruce Almighty 101 PG-13 1 3 Morgan Freeman
3 Mr. Popper's Penguins 94 PG 3 4 Jim Carrey
4 Superbad 113 R 4 5 Emma Stone
5 Crazy, Stupid, Love. 118 PG-13 4 Null Emma Stone
Is there a way to add the results from the 2nd join to the rightmost columns?
Also, is it possible to combine the strings/VARCHARs from First Name and Last Name, and then have that value show up under the corresponding Actor Field?
(i.e. the field under Actor1 for row 1 would be "Morgan Freeman" instead of "1")
Thanks.

Your SQL is not valid, but you can achieve your goal by joining to the same table twice, with different aliases. Something like this:
select blah blah blah
from table1 t1 join table2 t2 on t1.field1 = t2.field1
join table2 t2_again on t1.field1 = t2_again.field2
etc
As for combining first and last names into a single field, most databases have a way to concatenate strings, but the syntax is not the same everywhere. You'll have to specify your DB engine.
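Applied to the tables from the question, the two-alias join could look like the sketch below, run against an in-memory SQLite database. Several things here are assumptions: the column types, the placeholder names for actors 2 and 5 (the question does not show them), and the use of SQLite's || operator for concatenation. The backticks in the question suggest MySQL, where CONCAT() would be used instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actors (indexActors INTEGER PRIMARY KEY, "First Name" TEXT, "Last Name" TEXT);
CREATE TABLE movies (
    ID INTEGER PRIMARY KEY, Title TEXT, Runtime INTEGER, Rating TEXT,
    actor1 INTEGER, actor2 INTEGER
);
""")
conn.executemany("INSERT INTO actors VALUES (?, ?, ?)", [
    (1, "Morgan", "Freeman"),
    (2, "Actor", "Two"),    # placeholder: the real name is not given in the question
    (3, "Jim", "Carrey"),
    (4, "Emma", "Stone"),
    (5, "Actor", "Five"),   # placeholder: the real name is not given in the question
])
conn.executemany("INSERT INTO movies VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "Se7en", 127, "R", 1, 2),
    (2, "Bruce Almighty", 101, "PG-13", 1, 3),
    (3, "Mr. Popper's Penguins", 94, "PG", 3, 4),
    (4, "Superbad", 113, "R", 4, 5),
    (5, "Crazy, Stupid, Love.", 118, "PG-13", 4, None),
])

# Join actors twice under different aliases. LEFT JOIN keeps movies without a second actor.
rows = conn.execute("""
    SELECT m.Title,
           a1."First Name" || ' ' || a1."Last Name" AS Actor1,
           a2."First Name" || ' ' || a2."Last Name" AS Actor2
    FROM movies m
    JOIN actors a1 ON a1.indexActors = m.actor1
    LEFT JOIN actors a2 ON a2.indexActors = m.actor2
    ORDER BY m.ID
""").fetchall()

for row in rows:
    print(row)
```

The LEFT JOIN for the second actor matters for "Crazy, Stupid, Love.", whose actor2 is NULL: an inner join would drop that movie, while here its Actor2 column simply comes back as NULL.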

We Keep Coding
