SQL join return results if where clause does not match - sql

I’m trying to return rows from a join even when the WHERE clause has no matches in the joined table.
Here’s my database schema
strings
-------
id: INT
name: VARCHAR
value: VARCHAR
file_id: INT, FOREIGN KEY files(id)
languages
---------
id: INT
code: CHAR
name: VARCHAR
translations
------------
id: INT
string_id: INT, FOREIGN KEY strings(id)
language_id: INT, FOREIGN KEY languages(id)
translation: VARCHAR
I’m trying to select all the strings, and all the translations in a given language. The translations may or may not exist for a given language, but I want to return the strings anyway.
I’m using a query similar to:
SELECT s.id, s.name, s.value, t.translation
FROM strings s LEFT OUTER JOIN translations t ON s.id = t.string_id
WHERE s.file_id = $1 AND t.language_id = $2
I want to return strings regardless of whether matches are found in the translations table. If translations don’t exist for a given language, that field would of course be null. I think the problem is the t.language_id = ... condition in the WHERE clause, since language_id is NULL when no translation row matches. But I’m not sure of the best way to fix this.
Database: PostgreSQL

Conditions on the second table need to go in the ON clause:
SELECT s.id, s.name, s.value, t.translation
FROM strings s LEFT OUTER JOIN
translations t
ON s.id = t.string_id AND t.language_id = $2
WHERE s.file_id = $1;
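Otherwise, for a string with no matching translation, t.language_id is NULL, the comparison in the WHERE clause fails, and the row is filtered out, which effectively turns the outer join into an inner join.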
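To see the difference concretely, here is a minimal sketch that runs both forms of the query against an in-memory SQLite database from Python (SQLite stands in for PostgreSQL here, and the sample rows and the `?` placeholders are assumptions made up for the illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE strings (id INTEGER PRIMARY KEY, name TEXT, value TEXT, file_id INTEGER);
    CREATE TABLE translations (
        id INTEGER PRIMARY KEY,
        string_id INTEGER REFERENCES strings(id),
        language_id INTEGER,
        translation TEXT
    );
    INSERT INTO strings VALUES (1, 'greeting', 'Hello', 10), (2, 'farewell', 'Goodbye', 10);
    -- Only 'greeting' has a translation for language 2.
    INSERT INTO translations VALUES (1, 1, 2, 'Bonjour');
""")

# Language filter in WHERE: rows without a translation are dropped.
where_filter = conn.execute("""
    SELECT s.name, t.translation
    FROM strings s LEFT OUTER JOIN translations t ON s.id = t.string_id
    WHERE s.file_id = ? AND t.language_id = ?
    ORDER BY s.id
""", (10, 2)).fetchall()

# Language filter in ON: every string is kept, translation is NULL if missing.
on_filter = conn.execute("""
    SELECT s.name, t.translation
    FROM strings s LEFT OUTER JOIN translations t
      ON s.id = t.string_id AND t.language_id = ?
    WHERE s.file_id = ?
    ORDER BY s.id
""", (2, 10)).fetchall()

print(where_filter)  # [('greeting', 'Bonjour')]
print(on_filter)     # [('greeting', 'Bonjour'), ('farewell', None)]
```

The untranslated string only survives when the language condition is part of the join condition.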

Related

Select records that match several tags

I implemented a standard tagging system on SQLite with two tables.
Table annotation:
CREATE TABLE IF NOT EXISTS annotation (
id INTEGER PRIMARY KEY,
comment TEXT
)
Table label:
CREATE TABLE IF NOT EXISTS label (
id INTEGER PRIMARY KEY,
annot_id INTEGER NOT NULL REFERENCES annotation(id),
tag TEXT NOT NULL
)
I can easily find the annotations that match tags 'tag1' OR 'tag2' :
SELECT * FROM annotation
JOIN label ON label.annot_id = annotation.id
WHERE label.tag IN ('tag1', 'tag2') GROUP BY annotation.id
But how do I select the annotations that match tags 'tag1' AND 'tag2'?
How do I select the annotations that match tags 'tag1' AND 'tag2' but NOT 'tag3'?
Should I use INTERSECT? Is it efficient or is there a better way to express these?
I would definitely go with INTERSECT for question 1 and EXCEPT for question 2. After many years of experience with SQL, I find it best to go with whatever the platform offers when it directly addresses what you want to do.
The only exception would be if you had a really good reason not to. INTERSECT and EXCEPT are part of the SQL standard, but not every database engine supports them (Oracle, for example, has traditionally spelled EXCEPT as MINUS), so check portability if you might move off SQLite.
If you want to go old school and use ONLY straight up SQL it is possible using subqueries, one for tag A, one for tag B, and one for tag C. Using an outer join with an "is null" condition is a common idiom to perform the exclusion.
Here is an sqlite example:
create table annotation (id integer, comment varchar);
create table label (id integer, annot_id integer, tag varchar);
insert into annotation values (1,'annot 1'),(2,'annot 2');
insert into label values (1,1,'tag1'),(2,1,'tag2'),(3,1,'tag2');
insert into label values (1,2,'tag1'),(2,2,'tag2'),(3,2,'tag3');
select distinct x.id,x.comment from annotation x
join label a on a.annot_id=x.id and a.tag='tag1'
join label b on b.annot_id=x.id and b.tag='tag2'
left join label c on c.annot_id=x.id and c.tag='tag3'
where
c.id is null;
This is set up so that both annotation 1 and annotation 2 have tag1 and tag2, but annotation 2 also has tag3 and so should be excluded. The output is only annotation 1:
id | comment
---+---------
 1 | annot 1
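For comparison, the INTERSECT / EXCEPT approach recommended at the start of this answer might look like the following sketch, run from Python against an in-memory SQLite database (the sample rows mirror the example above; the variable names are made up for the illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE annotation (id INTEGER PRIMARY KEY, comment TEXT);
    CREATE TABLE label (
        id INTEGER PRIMARY KEY,
        annot_id INTEGER NOT NULL REFERENCES annotation(id),
        tag TEXT NOT NULL
    );
    INSERT INTO annotation VALUES (1, 'annot 1'), (2, 'annot 2');
    INSERT INTO label (annot_id, tag) VALUES
        (1, 'tag1'), (1, 'tag2'),
        (2, 'tag1'), (2, 'tag2'), (2, 'tag3');
""")

# Question 1: annotations tagged with both tag1 AND tag2.
both = conn.execute("""
    SELECT annot_id FROM label WHERE tag = 'tag1'
    INTERSECT
    SELECT annot_id FROM label WHERE tag = 'tag2'
    ORDER BY annot_id
""").fetchall()

# Question 2: tag1 AND tag2 but NOT tag3.
both_not_tag3 = conn.execute("""
    SELECT annot_id FROM label WHERE tag = 'tag1'
    INTERSECT
    SELECT annot_id FROM label WHERE tag = 'tag2'
    EXCEPT
    SELECT annot_id FROM label WHERE tag = 'tag3'
    ORDER BY annot_id
""").fetchall()

print(both)           # [(1,), (2,)]
print(both_not_tag3)  # [(1,)]
```

To get the full annotation rows, the compound query can be used as a subquery, for example `SELECT * FROM annotation WHERE id IN (...)`.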

SQL: combine two tables for a query

I want to query two tables at a time to find the key for an artist given their name. The issue is that my data is coming from disparate sources and there is no definitive standard for the presentation of their names (e.g. Forename Surname vs. Surname, Forename) and so to this end I have a table containing definitive names used throughout the rest of my system along with a separate table of aliases to match the varying styles up to each artist.
This is PostgreSQL but apart from the text type it's pretty standard. Substitute character varying if you prefer:
create table Artists (
id serial primary key,
name text,
-- other stuff not relevant
);
create table Aliases (
artist integer references Artists(id) not null,
name text not null
);
Now I'd like to be able to query both sets of names in a single query to obtain the appropriate id. Any way to do this? e.g.
select id from ??? where name = 'Bloggs, Joe';
I'm not interested in revising my schema's idea of what a "name" is to something more structured, e.g. separate forename and surname, since it's inappropriate for the application. Most of my sources don't structure the data, sometimes one or the other name isn't known, it may be a pseudonym, or sometimes the "artist" may be an entity such as a studio.
I think you want:
select a.id
from artists a
where a.name = 'Bloggs, Joe' or
exists (select 1
from aliases aa
where aa.artist = a.id and
aa.name = 'Bloggs, Joe'
);
Actually, if you just want the id (and not other columns), then you can use:
select a.id
from artists a
where a.name = 'Bloggs, Joe'
union all -- union if there could be duplicates
select aa.artist
from aliases aa
where aa.name = 'Bloggs, Joe';
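As a rough illustration of the second query, here is a small sketch using Python's sqlite3 module (SQLite stands in for PostgreSQL; the helper name `find_artist_ids` and the sample rows are hypothetical). It uses UNION rather than UNION ALL so that an artist whose canonical name and alias both match is returned only once:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE aliases (
        artist INTEGER NOT NULL REFERENCES artists(id),
        name TEXT NOT NULL
    );
    INSERT INTO artists VALUES (1, 'Joe Bloggs'), (2, 'Studio X');
    INSERT INTO aliases VALUES (1, 'Bloggs, Joe'), (1, 'J. Bloggs');
""")

def find_artist_ids(name):
    """Return the ids of artists whose canonical name or any alias equals `name`."""
    rows = conn.execute("""
        SELECT a.id FROM artists a WHERE a.name = :name
        UNION
        SELECT aa.artist FROM aliases aa WHERE aa.name = :name
    """, {"name": name}).fetchall()
    return [row[0] for row in rows]

print(find_artist_ids("Bloggs, Joe"))  # [1]
print(find_artist_ids("Studio X"))     # [2]
print(find_artist_ids("Nobody"))       # []
```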

Postgres — substitute referenced columns with corresponding data

I'm trying to approach translations (aka i18n) in Postgres. So far I've come up with the following pattern for storing strings in a master language along with their translations:
-- available translation languages
CREATE TYPE lang AS ENUM ('fr', 'de', 'cn');
-- strings in master language, e.g. en
CREATE TABLE strings (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
string text NOT NULL
);
-- strings translations in other languages, e.g. fr, de
CREATE TABLE translations (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
string_id uuid NOT NULL REFERENCES strings (id) ON UPDATE CASCADE ON DELETE CASCADE,
lang lang NOT NULL,
string text NOT NULL
);
-- a collection of things with a name and a description
CREATE TABLE things (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
name uuid REFERENCES strings (id) ON UPDATE CASCADE,
description uuid REFERENCES strings (id) ON UPDATE CASCADE
);
So basically a Thing has a name and a description and they both reference Strings by id. A String has a master language text (strings.string), and also there are Translations which reference Strings by id.
A small example:
db=# select id, name, description from things;
id | name | description
--------------------------------------+--------------------------------------+--------------------------------------
df2ac652-cae7-4c90-ad85-05793e67ba47 | ce5a6cb6-6f14-4775-bed8-62ed871fdefc | 635e144d-f64f-4e2b-90f8-1280b1b7d24e
(1 row)
db=# select strings.id, strings.string from strings inner join things on (strings.id = things.name or strings.id = things.description);
id | string
--------------------------------------+-----------------------------------
ce5a6cb6-6f14-4775-bed8-62ed871fdefc | Cool Thing
635e144d-f64f-4e2b-90f8-1280b1b7d24e | Some Cool Thing description here
(2 rows)
The only problem is that I can't figure out a proper efficient way to retrieve Things with substituted values for a particular language. Say, I want to retrieve it in master language, then I'd probably do a join:
SELECT
things.id AS id,
strings.string AS name
FROM things
INNER JOIN strings
ON (things.name = strings.id);
This would return:
id | name
-------------------------------------+------------
df2ac652-cae7-4c90-ad85-05793e67ba47 | Cool Thing
(1 row)
But I cannot add description, since I've already used strings.string AS name in the above query.
Maybe my approach to i18n is fundamentally wrong and I'm not seeing a simpler solution here. Any help is very much appreciated.
You can just chain the joins together:
SELECT t.id AS id, s.string AS name, trfr.string as name_fr
FROM things t INNER JOIN
strings s
ON t.name = s.id INNER JOIN
translations trfr
ON trfr.string_id = t.name AND trfr.lang = 'fr';
I do find your data model a bit confusing.
First, use serial for the primary key instead of uuids, unless you have a real business reason for using uuids. Numbers are much easier to work with.
Second, having a table with string and string_id is just confusing. Your names should be clearer. Maybe something like: string_id and string_in_language.
Third, I would not make lang an enumerated type. I would make it a reference table. You may well want to store additional information about the language -- say, the default first day of the week to use, or the full name, or the default currency symbol.
Gordon Linoff gave me a hint, so I'll post an answer in case anyone else has the same issue:
SELECT
t.id AS id,
tn.string AS name, -- name translation
td.string AS description -- description translation
FROM things t
INNER JOIN translations tn
ON t.name = tn.string_id
INNER JOIN translations td
ON t.description = td.string_id
WHERE tn.lang = 'fr' AND td.lang = 'fr';
To use a fallback master language:
SELECT
t.id AS id,
COALESCE(tn.string, sn.string) AS name, -- name translation
COALESCE(td.string, sd.string) AS description -- description translation
FROM things t
LEFT OUTER JOIN strings sn
ON t.name = sn.id
LEFT OUTER JOIN strings sd
ON t.description = sd.id
LEFT OUTER JOIN translations tn
ON t.name = tn.string_id
AND tn.lang = 'fr'
LEFT OUTER JOIN translations td
ON t.description = td.string_id
AND td.lang = 'fr';
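A quick way to check the fallback behaviour is to run the same query shape against a tiny in-memory SQLite database from Python. This is only a sketch: integer ids and a plain text `lang` column replace the uuids and the enum, and the French text and the helper name `localized_things` are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE strings (id INTEGER PRIMARY KEY, string TEXT NOT NULL);
    CREATE TABLE translations (
        id INTEGER PRIMARY KEY,
        string_id INTEGER NOT NULL REFERENCES strings(id),
        lang TEXT NOT NULL,
        string TEXT NOT NULL
    );
    CREATE TABLE things (
        id INTEGER PRIMARY KEY,
        name INTEGER REFERENCES strings(id),
        description INTEGER REFERENCES strings(id)
    );
    INSERT INTO strings VALUES (1, 'Cool Thing'), (2, 'Some Cool Thing description here');
    INSERT INTO things VALUES (1, 1, 2);
    -- Only the name has a French translation; the description does not.
    INSERT INTO translations (string_id, lang, string) VALUES (1, 'fr', 'Truc cool');
""")

def localized_things(lang):
    """Return things with name/description in `lang`, falling back to the master string."""
    return conn.execute("""
        SELECT t.id,
               COALESCE(tn.string, sn.string) AS name,
               COALESCE(td.string, sd.string) AS description
        FROM things t
        LEFT JOIN strings sn ON t.name = sn.id
        LEFT JOIN strings sd ON t.description = sd.id
        LEFT JOIN translations tn ON t.name = tn.string_id AND tn.lang = :lang
        LEFT JOIN translations td ON t.description = td.string_id AND td.lang = :lang
        ORDER BY t.id
    """, {"lang": lang}).fetchall()

print(localized_things("fr"))  # [(1, 'Truc cool', 'Some Cool Thing description here')]
print(localized_things("de"))  # [(1, 'Cool Thing', 'Some Cool Thing description here')]
```

Because the language test sits in each ON clause, a missing translation yields NULL and COALESCE picks the master-language string instead.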

Json query vs SQL query using JSON in Oracle 12c (Performance)

I am using Oracle 12c and SQL Developer with JSON.
For this example I have the following JSON:
{
"id": "12",
"name": "zhelon"
}
So I have created the following table for this:
create table persons (
id number primary key,
person clob,
constraint person check (person is json)
);
The idea is to persist the previous JSON in the person column and use the following query to get that data:
SELECT p.person FROM persons p WHERE json_value(p.person, '$.name') = 'zhelon';
Talking about performance, I am interested in extracting some JSON fields into a new column on the table to improve the response time (I don't know if that is possible):
create table persons (
id number primary key,
name varchar(2000),
person clob,
constraint person check (person is json)
);
To do this:
SELECT p.person FROM persons p WHERE p.name = 'zhelon';
My question is:
What's the best way to query this data? I want to reduce the response time.
Which query gets the data faster?
SELECT p.person FROM persons p WHERE json_value(p.person, '$.name') = 'zhelon';
or
SELECT p.person FROM persons p WHERE p.name = 'zhelon';
You can create a virtual column like this:
ALTER TABLE persons ADD (NAME VARCHAR2(100)
GENERATED ALWAYS AS (JSON_VALUE(person, '$.name' returning VARCHAR2)) VIRTUAL);
I don't know the correct syntax of JSON_VALUE but I think you get the idea.
If needed, you can also define an index on such a column, as on any other column.
However, when you run SELECT p.person FROM persons p WHERE p.name = 'zhelon';
I don't know which value takes precedence, the name inside the p.person JSON or the column.
Better to use a different name in order to be on the safe side:
ALTER TABLE persons ADD (NAME_VAL VARCHAR2(100)
GENERATED ALWAYS AS (JSON_VALUE(person, '$.name' returning VARCHAR2)) VIRTUAL);
SELECT p.person FROM persons p WHERE p.NAME_VAL= 'zhelon';
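The general idea of pulling one JSON field out into its own indexed column can be sketched outside Oracle as well. The example below is an assumption-laden stand-in, not Oracle syntax: it uses SQLite from Python, and the `name_val` column is an ordinary column filled by application code (the hypothetical `insert_person` helper) rather than a virtual column computed by the database.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (
        id INTEGER PRIMARY KEY,
        name_val TEXT,         -- copy of the JSON "name" field, kept for fast lookups
        person TEXT NOT NULL   -- the full JSON document
    );
    CREATE INDEX idx_persons_name_val ON persons (name_val);
""")

def insert_person(doc):
    """Store the JSON document and mirror its "name" field into name_val."""
    conn.execute(
        "INSERT INTO persons (id, name_val, person) VALUES (?, ?, ?)",
        (int(doc["id"]), doc.get("name"), json.dumps(doc)),
    )

insert_person({"id": "12", "name": "zhelon"})
insert_person({"id": "13", "name": "other"})

# The lookup filters on the indexed column instead of parsing JSON per row.
rows = conn.execute(
    "SELECT p.person FROM persons p WHERE p.name_val = ?", ("zhelon",)
).fetchall()
found = [json.loads(row[0]) for row in rows]
print(found)  # [{'id': '12', 'name': 'zhelon'}]
```

With a database-maintained virtual column, as in the answer above, the extra column cannot drift out of sync with the JSON, which is the main advantage over copying the value by hand.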

Conversion failed when converting the nvarchar value 'Bottle' to data type int

This is my question: Select all rows from the ProductVendor Table and only the corresponding data from the Unit of Measure table. Display columns BusinessEntityID, ProductID, StandardPrice from the Product Vendor table and the Name from the Unit of Measure table. Write this 2 different ways – 1 using a join and the other using a WHERE clause. (10 Pts.)
This is my code:
USE AdventureWorks2008R2;
SELECT CAST(Name as INT),BusinessEntityID, ProductID, StandardPrice, Name
FROM Purchasing.ProductVendor Right Outer JOIN Production.UnitMeasure
ON Purchasing.ProductVendor.BusinessEntityID = Production.UnitMeasure.Name
And I keep getting this error: Conversion failed when converting the nvarchar value 'Bottle' to data type int.
The error comes from your JOIN: BusinessEntityID is an INT, while Name is an NVARCHAR. When you join on those two fields, SQL Server tries to convert the Name values to INT, which fails because values such as 'Bottle' are not integers.
You need to change the ON criteria to the fields that actually relate the two tables (and use table aliases to keep it simple):
SELECT pv.BusinessEntityID, pv.ProductID, pv.StandardPrice, m.Name
FROM Purchasing.ProductVendor pv
JOIN Production.UnitMeasure m
ON pv.UnitMeasureCode = m.UnitMeasureCode -- the column both tables share in AdventureWorks