Trying to understand a query (LEFT JOIN and subquery)

Trying to understand a query (LEFT JOIN and subquery) - sql

I've been trying to understand the behavior of a query but i dont fully understand what is going on.
Take a look:
SELECT main.entity_id,main.sku,name.value AS name
FROM product_entity AS main
LEFT JOIN product_entity_varchar AS name ON main.entity_id = name.entity_id
WHERE name.attribute_id = (
SELECT attribute_id
FROM ger_attribute
WHERE attribute_code LIKE "name"
AND 'entity_type_id' = (
SELECT entity_type_id
FROM ger_entity_type
WHERE entity_type_code = 'catalog_product_info'
)
)
Can you please explain why it is using a subquery ,why the LEFT JOIN is important in these cases and the condition entity_type_code = 'catalog_product_info'?
Thanks

SELECT main.entity_id,main.sku,name.value AS name
FROM product_entity AS main
LEFT JOIN product_entity_varchar AS name ON main.entity_id = name.entity_id
...
The query starts by pulling out table product_entity. The LEFT JOIN allows the query to access the record(s) in table product_entity_varchar whose entity_id is equal to the value of the column that has the same name in product_entity.
In the resultset, the value of column value from table product_entity_varchar is displayed, under alias name.
The keyword LEFT makes the relation optional ; it there is no matching record in product_entity_varchar, the name will simply appear as NULL in the output. If it were an [INNER] JOIN, then the relation would be mandatory : unmatched records would be filtered out, and would not appear in the output.

OK. Let's clarify what is LEFT OUTER JOIN in SQL.
LEFT JOIN queries could be understood as just syntax abbreviation of UNION ALL of INNER JOIN-ed and NOT EXIST-ed queries.
Let's decompose LEFT JOIN-ed query:
SELECT main.entity_id,main.sku,name.value AS name
FROM
product_entity AS main
LEFT JOIN product_entity_varchar AS name ON main.entity_id = name.entity_id
WHERE
name.attribute_id = (
SELECT attribute_id
FROM ger_attribute
WHERE attribute_code LIKE "name"
AND 'entity_type_id' = (
SELECT entity_type_id
FROM ger_entity_type
WHERE entity_type_code = 'catalog_product_info'
)
)
Is equivalent to:
SELECT main.entity_id,main.sku,name.value AS name
FROM
product_entity AS main
JOIN product_entity_varchar AS name
ON main.entity_id = name.entity_id
WHERE
name.attribute_id = (
SELECT attribute_id
FROM ger_attribute
WHERE attribute_code LIKE "name"
AND 'entity_type_id' = (
SELECT entity_type_id
FROM ger_entity_type
WHERE entity_type_code = 'catalog_product_info'
)
)
UNION ALL
SELECT main.entity_id,main.sku, NULL AS name -- <-- Attention!
FROM
product_entity AS main
NOT EXISTS(
SELECT * FROM product_entity_varchar AS name
WHERE
main.entity_id = name.entity_id
AND name.attribute_id = (
SELECT attribute_id
FROM ger_attribute
WHERE attribute_code LIKE "name"
AND 'entity_type_id' = (
SELECT entity_type_id
FROM ger_entity_type
WHERE entity_type_code = 'catalog_product_info'
)
)
)
I hope this will help to understand query, although it's impossible without data to answer "why it is using a subquery" etc.

Related

Return matching rows using sub-query to create list in another table?

I'm brand new to learning SQL. I have a very basic understanding of the language.
I have two tables. Storage Data is the table I ultimately want to return by [ScenarioID].
My second table, Storage Static, also has [ScenarioID], but I also need to first filter this table by two other fields [Platform] and [AccountID]. I have the following code that successfully filters this Storage Static table.
select
[ScenarioID]
from [dbo].[Storage Static]
WHERE [Platform] = 'ABC' AND [AccountID] in (select MAX([AccountID])
from [dbo].[Storage Static])
What I'm trying to do is basically embed the above code as a sub-query into my query for the original table Storage Data.
select
[ScenarioID],
[CountryID]
from [dbo].[Storage Data] t1
INNER JOIN
(
select
[ScenarioID]
from [dbo].[Storage Static]
WHERE [Platform] = 'ABC' AND [AccountID] in (select MAX([AccountID]) MaxPop
from [dbo].[Storage Static])
) t2
on t1.[ScenarioID] = t2.MaxPop
I know the MaxPop part doesn't really work but that was my attempt at assigning a name or variable to that sub-query.
Ultimately, I want to filter my original table by the list of [ScenarioID]'s that I created in the sub-query.

If I understand correctly, you just need to fix the JOIN conditions.
select sd.[ScenarioID], sd.[CountryID]
from [dbo].[Storage Data] sd join
(select [ScenarioID]
from [dbo].[Storage Static]
where [Platform] = 'ABC' and
[AccountID] in (select max([AccountID]) from [dbo].[Storage Static])
) ss
on sd.[ScenarioID] = ss.[ScenarioID]

Since you need MaxPop to be taken as a field needed for join'
instead ofinclause, I use aninner join`, so that the filed can be considered for the join.
select t1.[ScenarioID], t1.[CountryID]
from [dbo].[Storage Data] t1
INNER JOIN
(
select [ScenarioID], MaxPop
from [dbo].[Storage Static] s
Inner Join
(select MAX([AccountID]) MaxPop
from [dbo].[Storage Static]) m
on m.MaxPop = s.[AccountID]
WHERE [Platform] = 'ABC'
) t2
on t1.[ScenarioID] = t2.MaxPop

How to use outer field name or alias in a nested subquery

I would use the selected field Referencia in the subquery.
I have tried including the field name the alias, and table name but not works.
How I can achieve this ?
Thanks
SELECT * FROM
(
SELECT
articulos.Codigo AS Referencia,
articulos.Nombre AS Descripcion,
barras.Codigo AS [Codigo de Barras],
ROW_NUMBER() OVER (PARTITION BY articulos.Codigo ORDER BY
articulos.Codigo ASC) as cantidad,
articulos.Familia,
articulos.Marca,
categorias.Codigo as Categoria,
articulos.ImpuestoEspecial AS Ecotasa,
articulos.Fase,
articulos.Iva,
--
-- Tarifa1
( SELECT [Codigo],[EuroPrecio]
FROM [GES16100].[dbo].[Tarifas]
WHERE [Codigo] = 1 AND [Articulo] = <------- Here, Referencia
)AS T1,
articulos.Proveedor,
articulos.GUID_Registro
FROM [GES16100].[dbo].[Articulos] as articulos
FULL JOIN [GES16100].[dbo].[Barras] as barras
ON articulos.Codigo = barras.Articulo
FULL JOIN [GES16100].[dbo].[Categorias_Asignaciones] catasignaciones
ON catasignaciones.GUID_RegistroFichero =articulos.GUID_Registro
FULL JOIN [GES16100].[dbo].[CategoriasFicheros] categorias
ON categorias.GUID_Registro = catasignaciones.GUID_Categoria
)AS supersub
WHERE supersub.cantidad = 1

Use table aliases and qualified column names whenever you have more than one table in a query.
Second, your subquery will not work because it returns two columns where one is expected.
For your example, I am guessing:
( SELECT t.EuroPrecio
FROM [GES16100].[dbo].[Tarifas] t
WHERE t.Codigo = 1 AND t.Articulo = a.Codigo
) AS T1,
You cannot use the column alias Referencias because it is defined in the same SELECT. Just use the column it is refering to.

Providing Language FallBack In A SQL Select Statement

I have a table that represents an Object. It has many columns but also fields that require language support.
For simplicity let's say I have 3 tables:
MainObjectTable
LanguageDependantField1
LanguageDependantField2.
MainObjectTable has a PK int called ID, and both LanguageDependantTables have a foreign key link back to the MainObjectTable along with a language code and the date they were added.
I've created a stored procedure that accepts the MainObjectTable ID and a Language. It will return a single row containing the most recent items from the language tables. The select statement looks like
SELECT
MainObjectTable.VariousColumns,
LanguageDependantField1.Description,
LanguageDependantField2.SomeOtherText
FROM
MainObjectTable
OUTER APPLY
(SELECT TOP 1 LanguageDependantField1.Description
FROM LanguageDependantField1
WHERE LanguageDependantField1.MainObjectTable_ID = MainObjectTable.ID
AND LanguageDependantField1.Language_ID = #language
ORDER BY
LanguageDependantField1.[Default], LanguageDependantField1.CreatedDate DESC) LanguageDependantField1
OUTER APPLY
(SELECT TOP 1 LanguageDependantField2.SomeOtherText
FROM LanguageDependantField2
WHERE LanguageDependantField2.MainObjectTable_ID = MainObjectTable.ID
AND LanguageDependantField2.Language_ID = #language
ORDER BY
LanguageDependantField2.[Default] DESC, LanguageDependantField2.CreatedDate DESC) LanguageDependantField2
WHERE
MainObjectTable.ID = #MainObjectTableID
What I want to add is the ability to fallback to a default language if a row isn't found in the specified language. Let's say we use "German" as the selected language. Is it possible to return an English row from LanguageDependantField1 if the German does not exist presuming we have #fallbackLanguageID
Also am I right to use OUTER APPLY in this scenario or should I be using JOIN?
Many thanks for your help.

Try this:
SELECT MainObjectTable.VariousColumns,
COALESCE(PrefLang.Description,Fallback.Description,'Not Found Desc')
as Description,
COALESCE(PrefLang.SomeOtherText,FallBack.SomeOtherText,'Not found')
as SomeOtherText
FROM MainObjectTable
LEFT JOIN
(SELECT TOP 1 pl.Description,pl.SomeOtherText
FROM LanguageDependantField1 pl
WHERE pl.MainObjectTable_ID = MainObjectTable.ID
AND pl.Language_ID = #language
ORDER BY
pl.[Default], pl.CreatedDate DESC)
PrefLang ON 1=1
LEFT JOIN
(SELECT TOP 1 fb.Description,fb.SomeOtherText
FROM LanguageDependantField1 fb
WHERE fb.MainObjectTable_ID = MainObjectTable.ID
AND fb.Language_ID = #fallbackLanguageID
ORDER BY
fb.[Default], fb.CreatedDate DESC)
Fallback ON 1=1
WHERE
MainObjectTable.ID = #MainObjectTableID
Basically, make two queries, one to the preferred language and one to English (Default). Use the LEFT JOIN, so if the first one isn't found, the second query is used...
I don't have your actual tables, so there might be a syntax error in above, but hope it gives you the concept you want to try...

Yes, the use of Outer Apply is correct if you want to correlate the MainObjectTable table rows to the inner queries. You cannot use Joins with references in the derived table to the outer table. If you wanted to use Joins, you would need to include the joining column(s) and in this case pre-filter the results. Here is what that might look like:
With RankedLanguages As
(
Select LDF1.MainObjectTable_ID, LDF1.Language_ID, LDF1.Description, LDF1.SomeOtherText, ...
, Row_Number() Over ( Partition By LDF1.MainObjectTable_ID, LDF1.Language_ID
Order By LDF1.[Default] Desc, LDF1.CreatedDate Desc ) As Rnk
From LanguageDependantField1 As LDF1
Where LDF1.Language_ID In( #languageId, #defaultLanguageId )
)
Select M.VariousColumns
, Coalesce( SpecificLDF.Description, DefaultLDF.Description ) As Description
, Coalesce( SpecificLDF.SomeOtherText, DefaultLDF.SomeOtherText ) As SomeOtherText
, ...
From MainObjectTable As M
Left Join RankedLanguages As SpecificLDF
On SpecificLDF.MainObjectTable_ID = M.ID
And SpecifcLDF.Language_ID = #languageId
And SpecifcLDF.Rnk = 1
Left Join RankedLanguages As DefaultLDF
On DefaultLDF.MainObjectTable_ID = M.ID
And DefaultLDF.Language_ID = #defaultLanguageId
And DefaultLDF.Rnk = 1
Where M.ID = #MainObjectTableID

Limit join to one row

I have the following query:
SELECT sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount, 'rma' as
"creditType", "Clients"."company" as "client", "Clients".id as "ClientId", "Rmas".*
FROM "Rmas" JOIN "EsnsRmas" on("EsnsRmas"."RmaId" = "Rmas"."id")
JOIN "Esns" on ("Esns".id = "EsnsRmas"."EsnId")
JOIN "EsnsSalesOrderItems" on("EsnsSalesOrderItems"."EsnId" = "Esns"."id" )
JOIN "SalesOrderItems" on("SalesOrderItems"."id" = "EsnsSalesOrderItems"."SalesOrderItemId")
JOIN "Clients" on("Clients"."id" = "Rmas"."ClientId" )
WHERE "Rmas"."credited"=false AND "Rmas"."verifyStatus" IS NOT null
GROUP BY "Clients".id, "Rmas".id;
The problem is that the table "EsnsSalesOrderItems" can have the same EsnId in different entries. I want to restrict the query to only pull the last entry in "EsnsSalesOrderItems" that has the same "EsnId".
By "last" entry I mean the following:
The one that appears last in the table "EsnsSalesOrderItems". So for example if "EsnsSalesOrderItems" has two entries with "EsnId" = 6 and "createdAt" = '2012-06-19' and '2012-07-19' respectively it should only give me the entry from '2012-07-19'.

SELECT (count(*) * sum(s."price")) AS amount
, 'rma' AS "creditType"
, c."company" AS "client"
, c.id AS "ClientId"
, r.*
FROM "Rmas" r
JOIN "EsnsRmas" er ON er."RmaId" = r."id"
JOIN "Esns" e ON e.id = er."EsnId"
JOIN (
SELECT DISTINCT ON ("EsnId") *
FROM "EsnsSalesOrderItems"
ORDER BY "EsnId", "createdAt" DESC
) es ON es."EsnId" = e."id"
JOIN "SalesOrderItems" s ON s."id" = es."SalesOrderItemId"
JOIN "Clients" c ON c."id" = r."ClientId"
WHERE r."credited" = FALSE
AND r."verifyStatus" IS NOT NULL
GROUP BY c.id, r.id;
Your query in the question has an illegal aggregate over another aggregate:
sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount
Simplified and converted to legal syntax:
(count(*) * sum(s."price")) AS amount
But do you really want to multiply with the count per group?
I retrieve the the single row per group in "EsnsSalesOrderItems" with DISTINCT ON. Detailed explanation:
Select first row in each GROUP BY group?
I also added table aliases and formatting to make the query easier to parse for human eyes. If you could avoid camel case you could get rid of all the double quotes clouding the view.

Something like:
join (
select "EsnId",
row_number() over (partition by "EsnId" order by "createdAt" desc) as rn
from "EsnsSalesOrderItems"
) t ON t."EsnId" = "Esns"."id" and rn = 1
this will select the latest "EsnId" from "EsnsSalesOrderItems" based on the column creation_date. As you didn't post the structure of your tables, I had to "invent" a column name. You can use any column that allows you to define an order on the rows that suits you.
But remember the concept of the "last row" is only valid if you specifiy an order or the rows. A table as such is not ordered, nor is the result of a query unless you specify an order by

Necromancing because the answers are outdated.
Take advantage of the LATERAL keyword introduced in PG 9.3
left | right | inner JOIN LATERAL
I'll explain with an example:
Assuming you have a table "Contacts".
Now contacts have organisational units.
They can have one OU at a point in time, but N OUs at N points in time.
Now, if you have to query contacts and OU in a time period (not a reporting date, but a date range), you could N-fold increase the record count if you just did a left join.
So, to display the OU, you need to just join the first OU for each contact (where what shall be first is an arbitrary criterion - when taking the last value, for example, that is just another way of saying the first value when sorted by descending date order).
In SQL-server, you would use cross-apply (or rather OUTER APPLY since we need a left join), which will invoke a table-valued function on each row it has to join.
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
-- CROSS APPLY -- = INNER JOIN
OUTER APPLY -- = LEFT JOIN
(
SELECT TOP 1
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(#in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(#in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
) AS FirstOE
In PostgreSQL, starting from version 9.3, you can do that, too - just use the LATERAL keyword to achieve the same:
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
LEFT JOIN LATERAL
(
SELECT
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(__in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(__in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
LIMIT 1
) AS FirstOE

Try using a subquery in your ON clause. An abstract example:
SELECT
*
FROM table1
JOIN table2 ON table2.id = (
SELECT id FROM table2 WHERE table2.table1_id = table1.id LIMIT 1
)
WHERE
...

SQL WHERE In a many-to-many or many-to-many empty

Does anyone know a way to simplify this WHERE expression?
WHERE (
(#UserSpecialtyID in
(
SELECT CharacteristicSpecialties_Id
FROM ModalityVariantSpecialty
WHERE ModalityVariants_Id = ModalityVariants.Id
)
)
OR
NOT EXISTS
(
SELECT CharacteristicSpecialties_Id
FROM ModalityVariantSpecialty
WHERE ModalityVariants_Id = ModalityVariants.Id
)
)

Something like this should probably work but Im not exactly clear on the relationships for your tables. I could probably give a better example if you could explain the relationships.
SELECT
*
FROM MadalityVariants mv
LEFT JOIN ModalityVariantSpecialty mvs on mvs.ModalityVariants_ID = mv.ID
WHERE
#UserSpecialtyID = mvs.CharacteristicSpecialties_ID
OR
mvs.CharacteristicSpecialties_ID is null

WHERE (
#UserSpecialtyID in
(
SELECT COALESCE(CharacteristicSpecialties_Id, A.A)
FROM (SELECT #UserSpecialtyID A) A LEFT JOIN ModalityVariantSpecialty
ON ModalityVariants_Id = ModalityVariants.Id
)
)
this works well if CharacteristicSpecialties_Id is a NON NULLABLE field.

I am assuming that this is a WHERE clause of a SELECT on the table ModalityVariants
Would this work (The SQL is not tested)?
SELECT *
FROM ModalityVariants
LEFT OUTER JOIN ModalityVariantSpeciality
ON ModalityVariants.Id = ModalityVariants_ID
WHERE CharacteristicSpecialities_Id = #UserSpecialityID or
CharacteristicSpecialities_Id is NULL

Here's my attempt:
WHERE #UserSpecialtyID = COALESCE
(
SELECT TOP 1 CharacteristicSpecialties_Id
FROM ModalityVariantSpecialty
WHERE ModalityVariants_Id = ModalityVariants.Id
ORDER BY
CASE WHEN CharacteristicSpecialties_Id = UserSpecialtyID THEN 1
ELSE 2 END ASC
), #UserSpecialtyID)
If both ModalityVariants_Id and UserSpecialtyID match, the subquery returns CharacteristicSpecialties_Id, and the where succeeds
If only ModalityVariants_Id matches, the subquery returns a different ID, and the where fails
If neither matches, the subquery returns NULL, the COALESCE returns #UserSpecialtyID, and the where succeeds
Probably clearest is a variety of John Hartsock's answer, with a subquery to ensure the left join doesn't add any rows.
select *
from ModalityVariants mv
left join
(
select distinct ModalityVariants_ID
, CharacteristicSpecialties_ID
from ModalityVariantSpecialty
) as mvs
on mvs.ModalityVariants_ID = mv.ID
where #UserSpecialtyID = mvs.CharacteristicSpecialties_ID
OR
mvs.CharacteristicSpecialties_ID is null
I'll vote for John's answer :)

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Trying to understand a query (LEFT JOIN and subquery) - sql

Related

Return matching rows using sub-query to create list in another table?

How to use outer field name or alias in a nested subquery

Providing Language FallBack In A SQL Select Statement

Limit join to one row

SQL WHERE In a many-to-many or many-to-many empty

Categories

Resources