How to write an SQL query that combines rows from one table with rows from another table under specific conditions in SQLite

I have a SQLite3 database with three tables. Sample data looks like this:
Original
id  aName   code
------------------
1   dog     DG
2   cat     CT
3   bat     BT
4   badger  BDGR
... ...     ...
Translated
id  orgID  isTranslated  langID  aName
----------------------------------------------
1   2      1             3       katze
2   1      1             3       hund
3   3      0             3       (NULL)
4   4      1             3       dachs
... ...    ...           ...     ...
Lang
id  Langcode
-----------
1   FR
2   CZ
3   DE
4   RU
... ...
I want to select all data from Original and Translated in such a way that the result consists of all the rows in the Original table, but with aName replaced by the aName from the Translated table for rows that have a translation. I could then apply an ORDER BY clause and sort the data in the desired way.
All data and table designs are examples just to show the problem. The schema does contain some elements, like the isTranslated column or the translated and original names living in separate tables, that are required by the design of the target application.
To be more specific, this is an example rowset I would like to produce. It is all the data from table Original, modified by data from Translated if a translation is available for that id from Original.
Desired Result
id  aName  code  isTranslated
---------------------------------
1   hund   DG    1
2   katze  CT    1
3   bat    BT    0
4   dachs  BDGR  1
... ...    ...   ...

This is a typical application for the CASE expression:
SELECT Original.id,
       CASE isTranslated
           WHEN 1 THEN Translated.aName
           ELSE Original.aName
       END AS aName,
       code,
       isTranslated
FROM Original
JOIN Translated ON Original.id = Translated.orgID
WHERE Translated.langID = (SELECT id FROM Lang WHERE Langcode = 'DE')
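If not all records in Original have a corresponding record in Translated, use LEFT JOIN instead, and move the langID condition from the WHERE clause into the ON clause; otherwise the WHERE filter discards the rows that have no match.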
If untranslated names are guaranteed to be NULL, you can just use IFNULL(Translated.aName, Original.aName) instead.
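As a minimal sketch of the LEFT JOIN variant, the following runs against an in-memory SQLite database from Python. The sample data is taken from the question; the row for 'owl' is a hypothetical addition, included only to show what happens when Original has no matching row in Translated at all. COALESCE is used here in place of IFNULL; both behave the same with two arguments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Original   (id INTEGER PRIMARY KEY, aName TEXT, code TEXT);
    CREATE TABLE Lang       (id INTEGER PRIMARY KEY, Langcode TEXT);
    CREATE TABLE Translated (id INTEGER PRIMARY KEY, orgID INTEGER,
                             isTranslated INTEGER, langID INTEGER, aName TEXT);

    INSERT INTO Original VALUES
        (1, 'dog', 'DG'), (2, 'cat', 'CT'), (3, 'bat', 'BT'),
        (4, 'badger', 'BDGR'),
        (5, 'owl', 'OWL');  -- hypothetical: no row in Translated
    INSERT INTO Lang VALUES (1, 'FR'), (2, 'CZ'), (3, 'DE'), (4, 'RU');
    INSERT INTO Translated VALUES
        (1, 2, 1, 3, 'katze'), (2, 1, 1, 3, 'hund'),
        (3, 3, 0, 3, NULL),    (4, 4, 1, 3, 'dachs');
""")

# The language filter lives in the ON clause, so rows of Original
# without a German translation are kept instead of being filtered out.
rows = conn.execute("""
    SELECT o.id,
           COALESCE(t.aName, o.aName)  AS aName,
           o.code,
           COALESCE(t.isTranslated, 0) AS isTranslated
    FROM Original AS o
    LEFT JOIN Translated AS t
           ON t.orgID = o.id
          AND t.langID = (SELECT id FROM Lang WHERE Langcode = 'DE')
    ORDER BY o.id
""").fetchall()

for row in rows:
    print(row)
```

Sorting by the translated name is then just a matter of changing the final clause to ORDER BY aName.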

You should probably list the actual results you want, which would help people help you in the future.
In the current case, I'm guessing you want something along these lines:
SELECT Original.id, Original.code, Translated.aName
FROM Original
JOIN Lang
  ON Lang.langCode = 'DE'
JOIN Translated
  ON Translated.orgId = Original.id
 AND Translated.langId = Lang.id
 AND Translated.aName IS NOT NULL;
(Check out my example to see if these are the results you want).
In any case, the table set you've got is heading towards a fairly standard 'translation table' setup. However, there are some basic changes I'd make.
Original
Name the table something specific, like Animal
Don't include a 'default' translation in the table (you can use a view, if necessary).
'code' is fine, although in the case of animals, genus/species probably ought to be used
Lang
'Language' is often a reserved word in RDBMSs, so the name is fine.
Specifically name which 'language code' you're using (and don't abbreviate column names). There are actually (up to) three different ISO codes possible - just grab them all.
(Also, remember that languages have language-specific names, so language also needs its own 'translation' table)
Translated
Name the table entity-specific, like AnimalNameTranslated, or somesuch.
isTranslated is unnecessary - you can derive it from the existence of the row - don't add a row if the term isn't translated yet.
Put all 'translations' into the table, including the 'default' one. This means all your terms are in one place, so you don't have to go looking elsewhere.

Related

Efficiently return words that match, or whose synonym(s), match a keyword

I have a database of industry-specific terms, each of which may have zero or more synonyms. Users of the system can search for terms by keyword and the results should include any term that contains the keyword or that has at least one synonym that contains the keyword. The result should then include the term and ONLY ONE of the matching synonyms.
Here's the setup... I have a term table with 2 fields: id and term. I also have a synonym table with 3 fields: id, termId, and synonym. So there would be data like:
term Table
id | term
-- | -----
1 | dog
2 | cat
3 | bird
synonym Table
id | termId | synonym
-- | ------ | --------
1 | 1 | canine
2 | 1 | man's best friend
3 | 2 | feline
A keyword search for (the letter) "i" should return the following as a result:
id | term | synonym
-- | ------ | --------
1 | dog | canine <- because of the "i" in "canine"
2 | cat | feline <- because of the "i" in "feline"
3 | bird | <- because of the "i" in "bird"
Notice how, even though both "dog" synonyms contain the letter "i", only one was returned in the result (doesn't matter which one).
Because I need to return all matches from the term table regardless of whether or not there's a synonym and I need no more than 1 matching synonym, I'm using an OUTER APPLY as follows:
SELECT
    term.id,
    term.term,
    synonyms.synonym
FROM
    term
    OUTER APPLY (
        SELECT
            TOP 1
            term.id,
            synonym.synonym
        FROM
            synonym
        WHERE
            term.id = synonym.termId
            AND synonym.synonym LIKE @keyword
    ) AS synonyms
WHERE
    term.term LIKE @keyword
    OR synonyms.synonym LIKE @keyword
There are indexes on term.term, synonym.termId and synonym.synonym. @keyword is always something like '%foo%'. The problem is that, with close to 50,000 terms (not that much for databases, I know, but...), the performance is horrible. Any thoughts on how this can be done more efficiently?
Just a note, one thing I had thought to try was flattening the synonyms into a comma-delimited list in the term table so that I could get around the OUTER APPLY. Unfortunately though, that list can easily exceed 900 characters which would then prevent SQL Server from adding an index to that column. So that's a no-go.
Thanks very much in advance.
You've got a lot of unnecessary logic in there. There's no telling what execution plan SQL Server will come up with for it. It's simpler and more efficient to split this up into two separate db calls and then merge them in your code:
Get matches based on synonyms:
SELECT
    term.id
    ,term.term
    ,synonym.synonym
FROM
    term
    INNER JOIN synonym ON term.id = synonym.termId
WHERE
    synonym.synonym LIKE @keyword
Get matches based on terms:
SELECT
    term.id
    ,term.term
FROM
    term
WHERE
    term.term LIKE @keyword
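The "merge them in your code" step could look roughly like the sketch below. It uses Python with an in-memory SQLite database purely for illustration (the question is about SQL Server); the function name search_terms and the choice to keep the matching synonym with the lowest id are assumptions, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE term    (id INTEGER PRIMARY KEY, term TEXT);
    CREATE TABLE synonym (id INTEGER PRIMARY KEY, termId INTEGER, synonym TEXT);
""")
conn.executemany("INSERT INTO term VALUES (?, ?)",
                 [(1, "dog"), (2, "cat"), (3, "bird")])
conn.executemany("INSERT INTO synonym VALUES (?, ?, ?)",
                 [(1, 1, "canine"), (2, 1, "man's best friend"), (3, 2, "feline")])


def search_terms(conn, keyword):
    """Run the two simple queries and merge them, keeping one synonym per term."""
    pattern = f"%{keyword}%"
    results = {}

    # Query 1: terms that match on their own text (no synonym attached yet).
    for term_id, term in conn.execute(
            "SELECT id, term FROM term WHERE term LIKE ?", (pattern,)):
        results[term_id] = (term_id, term, None)

    # Query 2: terms that match through a synonym.
    for term_id, term, synonym in conn.execute(
            """SELECT term.id, term.term, synonym.synonym
               FROM term
               JOIN synonym ON term.id = synonym.termId
               WHERE synonym.synonym LIKE ?
               ORDER BY synonym.id""", (pattern,)):
        current = results.get(term_id)
        # Only the first matching synonym per term is kept.
        if current is None or current[2] is None:
            results[term_id] = (term_id, term, synonym)

    return [results[key] for key in sorted(results)]


matches = search_terms(conn, "i")
for match in matches:
    print(match)
```

With the sample data from the question, a search for "i" yields one row per term, with None in place of the synonym for "bird".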
Regarding "flattening the synonyms into a comma-delimited list in the term table": have you considered using the Full-Text Search feature? It would be much faster, even as your data grows.
You can put all the synonyms (comma-delimited) in a "synonym" column and put a full-text index on it.
If you also want results that match synonyms of the words, I recommend using FREETEXT. This is an example:
SELECT Title, Text, * FROM [dbo].[Post] WHERE FREETEXT(Title, 'phone')
The previous query matches on the meaning of ‘phone’, not just the exact word. It also compares the inflectional forms of the words. In this case it will return any title that contains ‘mobile’, ‘telephone’, ‘smartphone’, etc.
Take a look at this article about SQL Server Full Text Search; hope it helps.

Select multiple results from sub query into single row (as array datatype)

I'm trying to solve a small problem with a SQL query in an Oracle database. Let's assume I have these tables:
One table that holds information about cars:
tblCars
ID Model Color
--------------------
1 Volvo Red
2 BMW Blue
3 BMW Green
And another one containing information about drivers:
tblDrivers
ID fID_tblCars Name
---------------------------
1 1 George
2 1 Mike
3 2 Jason
4 2 Paul
5 2 William
6 3 Steve
Now, let's pretend that to find out the popularity of the cars, I want to create reports that contain the data about the cars and the people that are driving them (which seems a very reasonable thing one would accomplish with a database).
This "ReportObject" would have a string for the model, a string for the color and an array (or a list) of strings for the drivers.
Currently, I do this with two queries. In the first, I select the cars
SELECT ID, Model, Color FROM tblCars
and create a report object for each result.
Then, I would take each result and get the drivers for each specific car
SELECT Name FROM tblDrivers WHERE fID_tblCars = ResultObject.ID
Basically, step one gives me a resulting data set that looks like this:
Result
------------------------------------------
ColumnID ColumnModel ColumnColor
Type Integer Type String Type String
and now, if I will have more cars in the future, I will have to make a lot of additional queries, one for each row in the resulting table.
When I try this:
SELECT Model, Color, (SELECT Name FROM tblDrivers WHERE tblDrivers.fID_tblCars = tblCars.ID) as Name FROM tblCars
I get some error message telling me that one result in the row contains multiple elements (which is what I want!).
I want the result to look like this:
Result
--------------------------------------------------------
ColumnID ColumnModel ColumnColor ColumnName
Type Integer Type String Type String Type Array
So when I build my report object, I could do something like this:
foreach (var Row in Results)
{
    ReportObject.Model = Row.Model;
    ReportObject.Color = Row.Color;
    foreach (string Driver in Row.Name)
    {
        ReportObject.Drivers.Add(Driver);
    }
}
Am I completely missing my basics here or do I have to split this up in multiple queries?
Thanks!
This works in Oracle. In the SQL Fiddle example I couldn't get the IDENTITY or the PRIMARY KEYS to work when creating the table (never used Oracle SQL before)
SELECT c.id,
       c.model,
       c.color,
       LISTAGG(d.name, ',') WITHIN GROUP (ORDER BY d.name) AS "Drivers"
FROM tblCars c
JOIN tblDrivers d
  ON c.id = d.fID_TblCars
GROUP BY c.id,
         c.model,
         c.color
ORDER BY c.Id
SQL Fiddle Example
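To show how the aggregated column turns back into a list on the application side, here is a small sketch in Python. It is not Oracle: it uses SQLite's GROUP_CONCAT as a stand-in for LISTAGG, and the dictionary-based report structure is an assumption standing in for the ReportObject from the question. Splitting on commas only works as long as no driver name contains a comma.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblCars    (ID INTEGER PRIMARY KEY, Model TEXT, Color TEXT);
    CREATE TABLE tblDrivers (ID INTEGER PRIMARY KEY, fID_tblCars INTEGER, Name TEXT);

    INSERT INTO tblCars VALUES (1, 'Volvo', 'Red'), (2, 'BMW', 'Blue'), (3, 'BMW', 'Green');
    INSERT INTO tblDrivers VALUES
        (1, 1, 'George'), (2, 1, 'Mike'),
        (3, 2, 'Jason'),  (4, 2, 'Paul'), (5, 2, 'William'),
        (6, 3, 'Steve');
""")

# One query returns one row per car, with all driver names in a single string.
rows = conn.execute("""
    SELECT c.ID, c.Model, c.Color, GROUP_CONCAT(d.Name, ',') AS Drivers
    FROM tblCars AS c
    LEFT JOIN tblDrivers AS d ON d.fID_tblCars = c.ID
    GROUP BY c.ID, c.Model, c.Color
    ORDER BY c.ID
""").fetchall()

# Split the string back into a list; sorting happens here because the
# concatenation order of GROUP_CONCAT is not guaranteed.
reports = [
    {
        "model": model,
        "color": color,
        "drivers": sorted(names.split(",")) if names else [],
    }
    for _car_id, model, color, names in rows
]

for report in reports:
    print(report)
```

The LEFT JOIN keeps cars that have no drivers yet; they end up with an empty list.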

Selecting rows using multiple LIKE conditions from a table field

I created a table out of a CSV file which is produced by an external software.
Amongst the other fields, this table contains one field called "CustomID".
Each row on this table must be linked to a customer using the content of that field.
Every customer may have one or more set of customIDs at their own discretion, as long as each sequence starts with the same prefix.
So for example:
Customer 1 may use "cust1_n" and "cstm01_n" (where n is a number)
Customer 2 may use "customer2_n"
ImportedRows
PKID CustomID Description
---- --------------- --------------------------
1 cust1_001 Something
2 cust1_002 ...
3 cstm01_000001 ...
4 customer2_00001 ...
5 cstm01_000232 ...
..
Now I have created 2 support tables as follows:
Customers
PKID Name
---- --------------------
1 Customer 1
2 Customer 2
and
CustomIDs
PKID FKCustomerID SearchPattern
---- ------------ -------------
1 1 cust1_*
2 1 cstm01_*
3 2 customer2_*
What I need to achieve is the retrieval of all rows for a given customer using all the LIKE conditions found on the CustomIDs tables for that customer.
I have failed miserably so far.
Any clues, please?
Thanks in advance.
Silver.
To use LIKE you must replace the * with % in the pattern. Different DBMSs use different functions for string manipulation. Let's assume there is a REPLACE function available:
SELECT ir.*
FROM ImportedRows ir
JOIN CustomIDs c ON ir.CustomID LIKE REPLACE(c.SearchPattern, '*', '%')
WHERE c.FKCustomerID = 1;
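One thing to watch with this approach: in LIKE, the underscore is itself a wildcard that matches any single character, so a pattern such as cust1_% also matches values like cust1X001. The sketch below, run against SQLite from Python, compares the plain REPLACE join with a variant that escapes the underscore first. The row with CustomID 'cust1X001' is a hypothetical addition to the sample data, and the choice of '!' as the escape character is an assumption that holds only if the stored patterns never contain '!'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ImportedRows (PKID INTEGER PRIMARY KEY, CustomID TEXT);
    CREATE TABLE CustomIDs    (PKID INTEGER PRIMARY KEY, FKCustomerID INTEGER,
                               SearchPattern TEXT);

    INSERT INTO ImportedRows VALUES
        (1, 'cust1_001'), (2, 'cust1_002'), (3, 'cstm01_000001'),
        (4, 'customer2_00001'), (5, 'cstm01_000232'),
        (6, 'cust1X001');  -- hypothetical row: no underscore after the prefix
    INSERT INTO CustomIDs VALUES
        (1, 1, 'cust1_*'), (2, 1, 'cstm01_*'), (3, 2, 'customer2_*');
""")

# Plain version: '*' becomes '%', but '_' still acts as a one-character wildcard.
plain = [pkid for (pkid,) in conn.execute("""
    SELECT ir.PKID
    FROM ImportedRows AS ir
    JOIN CustomIDs AS c
      ON ir.CustomID LIKE REPLACE(c.SearchPattern, '*', '%')
    WHERE c.FKCustomerID = 1
    ORDER BY ir.PKID
""")]

# Escaped version: '_' is turned into '!_' so it only matches a literal underscore.
escaped = [pkid for (pkid,) in conn.execute("""
    SELECT ir.PKID
    FROM ImportedRows AS ir
    JOIN CustomIDs AS c
      ON ir.CustomID LIKE REPLACE(REPLACE(c.SearchPattern, '_', '!_'), '*', '%')
         ESCAPE '!'
    WHERE c.FKCustomerID = 1
    ORDER BY ir.PKID
""")]

print(plain)
print(escaped)
```

The plain query picks up the hypothetical row 6 for customer 1, while the escaped one returns only the rows whose prefix really ends in an underscore.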

INNER JOIN on broken table (non-nullable field)

I have three tables:
Document                    DocumentExt         Tracking
-------------------------   -----------------   --------------------------------------------
ID | Name     | DocType     DocId | OtherId     ID | DocOtherId | DocType | AccessTime
-------------------------   -----------------   --------------------------------------------
1    SomeDoc    Z           1       Doc1        1    Doc2         X         [Date here]
2    SomeDoc2   X           2       Doc2        2                 A         [Date here]
3    SomeDoc3   Y           3       Doc3        3    Doc1         Z         [Date here]
...  ...        ...         ...     ...         ...  ...          ...       ...
Note the missing value in Tracking.DocOtherId. This is a non-nullable field and defaults to an empty string.
The problem is that I need to perform a join on these tables that includes one row for each record in Tracking, but also the associated information from Document. Something like:
SELECT
Tracking.ID, Document.Name
FROM
Tracking
INNER JOIN
DocumentExt ON DocumentExt.OtherId = Tracking.DocOtherId
INNER JOIN
Document ON Document.ID = DocumentExt.DocId
However, since Tracking.DocOtherId is non-nullable, the query is returning a row for each of the records in DocumentExt AND Tracking. I need it to treat Tracking.DocOtherId as nullable for the purposes of the query so that the JOIN will work properly. Is there a way to do this?
EDIT: I suppose I need to make it clear that I need ONE record returned for EACH record in Tracking, including the ones with the empty string in DocOtherId.
For example:
TrackingId | DocumentName
---------------------------
1 SomeDoc2
2 NULL
3 SomeDoc1
... ...
EDIT 2: Flagged for closure. I went about this all wrong and I've taken an entirely different approach.
If your data is not significantly large, you can try using a temp table to strip your data and then do a join on that refined data. For example:
EDIT (in response to the first comment below): this first section builds the temp table you need, with NULLs where the empty strings would be.
SELECT ID,
       CASE WHEN DocOtherId = '' THEN NULL ELSE DocOtherId END AS DocOtherId,
       AccessTime
INTO #tracking
FROM Tracking
Then use this temp table to join back to your data the same way you had originally. In this example you just substitute #tracking for Tracking (aliased here so the column references still resolve), though you can call your temp table anything you want.
SELECT
    Tracking.ID, Document.Name
FROM
    #tracking AS Tracking
INNER JOIN
    DocumentExt ON DocumentExt.OtherId = Tracking.DocOtherId
INNER JOIN
    Document ON Document.ID = DocumentExt.DocId
I saw this comment:
I need ALL of the records from Tracking included.
That means you want an OUTER join, not an INNER join. Use the coalesce() function to show a generic value for missing names.
SELECT t.ID, coalesce(d.Name, '') AS Name
FROM Tracking t
LEFT JOIN DocumentExt de ON de.OtherID = t.DocOtherId
LEFT JOIN Document d ON d.ID = de.DocID
This will return ONE record for EACH record in the tracking table, including the ones with the empty string in DocOtherId.
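A quick way to check this behavior is to run the chained LEFT JOINs on the sample rows. The sketch below does so in SQLite from Python rather than SQL Server, leaves out the AccessTime column, and selects d.Name without COALESCE so that the missing document shows up as None; those simplifications are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Document    (ID INTEGER PRIMARY KEY, Name TEXT, DocType TEXT);
    CREATE TABLE DocumentExt (DocId INTEGER, OtherId TEXT);
    CREATE TABLE Tracking    (ID INTEGER PRIMARY KEY,
                              DocOtherId TEXT NOT NULL DEFAULT '',
                              DocType TEXT);

    INSERT INTO Document    VALUES (1, 'SomeDoc', 'Z'), (2, 'SomeDoc2', 'X'), (3, 'SomeDoc3', 'Y');
    INSERT INTO DocumentExt VALUES (1, 'Doc1'), (2, 'Doc2'), (3, 'Doc3');
    INSERT INTO Tracking    VALUES (1, 'Doc2', 'X'), (2, '', 'A'), (3, 'Doc1', 'Z');
""")

# Both joins are LEFT JOINs, so a Tracking row survives even when its
# DocOtherId (here the empty string) matches nothing in DocumentExt.
rows = conn.execute("""
    SELECT t.ID, d.Name
    FROM Tracking AS t
    LEFT JOIN DocumentExt AS de ON de.OtherId = t.DocOtherId
    LEFT JOIN Document    AS d  ON d.ID = de.DocId
    ORDER BY t.ID
""").fetchall()

for row in rows:
    print(row)
```

The empty string simply fails to match any OtherId, so no special NULL handling of DocOtherId is needed as long as DocumentExt itself never contains an empty OtherId.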

How to specify row names in MS Access 2007

I have a cross tab query and it pulls only the row name if there is data associated with it in the database. For example, if I have four types of musical instruments:
Guitar
Piano
Drums
Other
My results will show up as:
Guitar 1
Drums 2
It doesn't list Piano because there is no ID associated with Piano in the DB. I know I can specify columns in the properties menu, i.e. "1, 2, 3, 4, 5" will put columns in the DB for each, regardless of whether or not there is data to populate them.
I am looking for a similar solution for rows. Any ideas?
Also, I need NULL values to show up as 0.
Here's the actual SQL (forget the instrument example above)
TRANSFORM Count(Research.Patient_ID) AS CountOfPatient_ID
SELECT
    Switch(
        [Age]<22,"21 and under",
        [Age]>=22 And [AGE]<=24,"Between 22 And 24",
        [Age]>=25 And [AGE]<=29,"Between 25 And 29",
        [Age]>=30 And [AGE]<=34,"30-34",
        [Age]>=35 And [AGE]<=39,"35-39",
        [Age]>=40 And [AGE]<=44,"40-44",
        [Age]>44,"Over 44"
    ) AS Age_Range
FROM (Research
    INNER JOIN (
        SELECT ID, DateDiff("yyyy",DOB,Date()) AS AGE FROM Demographics
    ) AS Demographics ON Research.Patient_ID=Demographics.ID)
    INNER JOIN [Letter Status] ON Research.Patient_ID=[Letter Status].Patient_ID
WHERE ((([Letter Status].Letter_Count)=1))
GROUP BY Demographics.AGE, [Letter Status].Letter_Count
PIVOT Research.Site In (1,2,3,4,5,6,7,8,9,10);
In short, I need all of the rows to show up regardless of whether or not there is a value (for some reason the LEFT JOIN isn't working, so if you can, please use my code to form your answer), and I also need to replace NULL values with 0.
Thanks
I believe this has to do with the way you are joining the instruments table to the IDs table. If you use a left outer join from instruments to IDs, Piano should be included. It would be helpful to see your actual tables and queries though, as your question is kind of vague.
What if you union the select with a hard-coded select that has one value for each age group?
select 1 as Guitar, 1 as Piano, 1 as Drums, 1 as Other
When you do the transform, each row will have a result that is +1 of the result you want.
foo      barTmpCount
-------- ------------
Guitar   2
Piano    1
Drums    3
Other    1
You can then do a
select foo, barTmpCount - 1 as barCount from <query>
and get something like this
foo      barCount
-------- ---------
Guitar   1
Piano    0
Drums    2
Other    0
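For comparison, the left outer join idea from the first answer avoids the add-one-then-subtract step. The sketch below illustrates it with the instrument example in SQLite from Python, not in Access; the table names Instruments and Players and their columns are hypothetical. COUNT over a column from the outer-joined table counts only non-NULL values, so categories without data come out as 0 rather than NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical lookup table listing every row label that must appear.
    CREATE TABLE Instruments (id INTEGER PRIMARY KEY, name TEXT);
    -- Hypothetical data table; Piano and Other have no rows here.
    CREATE TABLE Players (id INTEGER PRIMARY KEY, instrument_id INTEGER);

    INSERT INTO Instruments VALUES (1, 'Guitar'), (2, 'Piano'), (3, 'Drums'), (4, 'Other');
    INSERT INTO Players VALUES (1, 1), (2, 3), (3, 3);
""")

# Start from the full list of labels and outer-join the data onto it.
counts = conn.execute("""
    SELECT i.name, COUNT(p.id) AS player_count
    FROM Instruments AS i
    LEFT JOIN Players AS p ON p.instrument_id = i.id
    GROUP BY i.id, i.name
    ORDER BY i.id
""").fetchall()

for name, player_count in counts:
    print(name, player_count)
```

Applied to the age ranges in the question, the same idea would need a small table holding one row per Age_Range label to join from.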