SQL JOIN: add custom constraint in JOIN clause

I want to select the preferred language if it exists and the default language otherwise.
SELECT a.code,
case
when vpi.program_items is not null then vpi.program_items else vpi2.program_items
end
FROM activity a
LEFT OUTER JOIN v_program_items vpi ON vpi.activity_id = a.id AND vpi.language = 'fr_BE'
LEFT OUTER JOIN v_program_items vpi2 ON vpi2.activity_id = a.id AND vpi2.language = 'fr'
WHERE a.id = 62170
The v_program_items table looks like this:
- ID | language| program_items
- 62170 | fr | Présentation du club et des machines¤Briefing avant le vol¤45 minutes de vol en ULM
- 62170 | fr_BE | Un vol en ULM (45 min)
I use two JOINs (on the same table) and one CASE/WHEN.
Is it possible to use only one JOIN?

The joins you have are fine and perform very well with an index, which should be a UNIQUE index (or PK):
CREATE UNIQUE INDEX ON v_program_items (activity_id, language);
Use COALESCE in the SELECT list, like "PM 77-1" suggested in a comment:
SELECT a.code, COALESCE(v1.program_items, v2.program_items) AS program_items
FROM activity a
LEFT JOIN v_program_items v1 ON v1.activity_id = a.id AND v1.language = 'fr_BE'
LEFT JOIN v_program_items v2 ON v2.activity_id = a.id AND v2.language = 'fr'
WHERE a.id = 62170;
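To see the fallback behaviour in isolation, here is a minimal sketch that runs the COALESCE query through Python's built-in sqlite3 module. SQLite stands in for Postgres here, and the sample rows and activity codes ('A1', 'A2', 'A3') are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activity (id INTEGER PRIMARY KEY, code TEXT);
CREATE TABLE v_program_items (
    activity_id INTEGER,
    language TEXT,
    program_items TEXT,
    UNIQUE (activity_id, language)
);
-- Hypothetical sample data: activity 1 has both languages,
-- activity 2 only the default, activity 3 has none.
INSERT INTO activity VALUES (1, 'A1'), (2, 'A2'), (3, 'A3');
INSERT INTO v_program_items VALUES
    (1, 'fr', 'default text'),
    (1, 'fr_BE', 'preferred text'),
    (2, 'fr', 'default only');
""")

QUERY = """
SELECT a.code, COALESCE(v1.program_items, v2.program_items) AS program_items
FROM activity a
LEFT JOIN v_program_items v1 ON v1.activity_id = a.id AND v1.language = 'fr_BE'
LEFT JOIN v_program_items v2 ON v2.activity_id = a.id AND v2.language = 'fr'
ORDER BY a.id
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
# [('A1', 'preferred text'), ('A2', 'default only'), ('A3', None)]
```

The preferred language wins when present, the default fills in otherwise, and an activity with no translation at all still appears with NULL.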
In Postgres 11 or later, and only if your table v_program_items is big, consider a covering index:
CREATE UNIQUE INDEX ON v_program_items (activity_id, language) INCLUDE (program_items);
Related:
Do covering indexes in PostgreSQL help JOIN columns?
Either way, when selecting only a single row (or a few), plain correlated subqueries should be even faster. Simple, too:
SELECT a.code
, COALESCE((SELECT program_items FROM v_program_items WHERE activity_id = a.id AND language = 'fr_BE')
, (SELECT program_items FROM v_program_items WHERE activity_id = a.id AND language = 'fr')) AS program_items
FROM activity a
WHERE a.id = 62170

Your method is fine. If you wanted just one join, you could do it as:
SELECT a.code, vpi.program_items
FROM activity a LEFT JOIN
v_program_items vpi
ON vpi.activity_id = a.id AND vpi.language IN ('fr_BE', 'fr')
WHERE a.id = 62170
ORDER BY (vpi.language = 'fr_BE') DESC
FETCH FIRST 1 ROW ONLY;
It is not clear that this would have better performance, though.
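As a rough check of the single-join idea, the sketch below runs it against SQLite via Python's sqlite3 module. Two things are assumptions on my part: the language filter sits in the ON clause so the LEFT JOIN still returns activities without any translation, and SQLite's LIMIT 1 replaces the FETCH FIRST clause, which SQLite does not support. The sample rows and the helper name pick_program_items are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activity (id INTEGER PRIMARY KEY, code TEXT);
CREATE TABLE v_program_items (activity_id INTEGER, language TEXT, program_items TEXT);
-- Hypothetical sample data.
INSERT INTO activity VALUES (1, 'A1'), (2, 'A2'), (3, 'A3');
INSERT INTO v_program_items VALUES
    (1, 'fr', 'default text'),
    (1, 'fr_BE', 'preferred text'),
    (2, 'fr', 'default only');
""")

def pick_program_items(activity_id):
    """Return (code, program_items) for one activity, preferring fr_BE over fr."""
    return conn.execute(
        """
        SELECT a.code, vpi.program_items
        FROM activity a
        LEFT JOIN v_program_items vpi
               ON vpi.activity_id = a.id AND vpi.language IN ('fr_BE', 'fr')
        WHERE a.id = ?
        -- The comparison yields 1 for fr_BE and 0 for fr, so fr_BE sorts first.
        ORDER BY (vpi.language = 'fr_BE') DESC
        LIMIT 1
        """,
        (activity_id,),
    ).fetchone()

both = pick_program_items(1)
default_only = pick_program_items(2)
no_translation = pick_program_items(3)
```

Note that this shape only works when exactly one activity is selected, because the row limit applies to the whole result rather than per activity.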

Related

SQL: Joining 3 table in SQL, Return the earliest date and the date is not null

I'm new to SQL and would appreciate some guidance.
I have joined two tables to get the container information and would like to join another table to get the date. Here's the code for the first join.
Select a.ConsolNumber, a.ConsolType,a.ConsolTransport,b.Container_20F,b.Container_20R,b.Container_20H, b.Container_40F,b.DeliveryMode
FROM ConsolHeader a
LEFT Join Containers b on a.Consolnumber = b.Consolnumber
The second join is the tricky part: some ConsolNumber values have several transit legs.
For example
|ConsolNumber| ETD |
|------------|---------|
|C00713392 | null |
|C00713392 | 1/1/2021|
|C00713392 | 2/1/2021|
I would like to get the earliest date (1/1/2021), ignoring NULL. Here is the code I tried. The result contains no NULL ETD, but some ConsolNumber values come back with the latest date (2/1/2021) instead.
Select a.ConsolNumber, a.ConsolType,a.ConsolTransport,b.Container_20F,b.Container_20R,b.Container_20H, b.Container_40F,b.DeliveryMode,MIN(c.ETD)
FROM ConsolHeader a
LEFT Join Containers b on a.Consolnumber = b.Consolnumber
INNER Join ConsolLegs c on a.Consolnumber = c.ConsolNumber
WHERE c.ETD is not null
GROUP BY a.ConsolNumber, a.ConsolType,a.ConsolTransport,b.Container_20F,b.Container_20R,b.Container_20H, b.Container_40F,b.DeliveryMode
Also, I have more than 100k rows, so please suggest a query that runs efficiently.
Thanks for any help!
A correlated subquery is a simple method:
SELECT ch.ConsolNumber, ch.ConsolType, ch.ConsolTransport, c.Container_20F,
c.Container_20R, c.Container_20H, c.Container_40F, c.DeliveryMode,
(SELECT MIN(cl.ETD)
FROM ConsolLegs cl
WHERE cl.Consolnumber = ch.Consolnumber
) as min_ETD
FROM ConsolHeader ch LEFT JOIN
Containers c
ON c.Consolnumber = ch.Consolnumber;
Notes:
MIN() automatically ignores NULLs.
Meaningful table aliases make the query easier to write and to read.
This avoids the outer GROUP BY, which is usually a performance win.
In most databases you want an index on ConsolLegs(Consolnumber, ETD) for performance.
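The notes above can be tried out with a small sketch using Python's sqlite3 module. SQLite is only a stand-in for the asker's unnamed database, the Containers join is left out for brevity, and the ETD values are stored as ISO date strings ('2021-01-01', '2021-01-02'), which is my assumption about the sample dates. The second header row and the ConsolType values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ConsolHeader (ConsolNumber TEXT PRIMARY KEY, ConsolType TEXT);
CREATE TABLE ConsolLegs (ConsolNumber TEXT, ETD TEXT);
-- Index suggested in the notes: lets MIN(ETD) per ConsolNumber be an index lookup.
CREATE INDEX idx_consollegs_number_etd ON ConsolLegs (ConsolNumber, ETD);

-- Hypothetical sample data; the second consol has only a NULL ETD.
INSERT INTO ConsolHeader VALUES ('C00713392', 'FCL'), ('C00000001', 'LCL');
INSERT INTO ConsolLegs VALUES
    ('C00713392', NULL),
    ('C00713392', '2021-01-01'),
    ('C00713392', '2021-01-02'),
    ('C00000001', NULL);
""")

rows = conn.execute("""
SELECT ch.ConsolNumber,
       (SELECT MIN(cl.ETD)
        FROM ConsolLegs cl
        WHERE cl.ConsolNumber = ch.ConsolNumber) AS min_ETD
FROM ConsolHeader ch
ORDER BY ch.ConsolNumber
""").fetchall()
print(rows)
# [('C00000001', None), ('C00713392', '2021-01-01')]
```

MIN() skips the NULL leg and returns the earliest real date. A consol whose legs are all NULL still shows up, with a NULL min_ETD, because no inner join or WHERE filter removes it.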
You can use the NOT EXISTS as follows:
Select a.ConsolNumber, a.ConsolType,
a.ConsolTransport, b.Container_20F,
b.Container_20R, b.Container_20H,
b.Container_40F, b.DeliveryMode,
c.ETD
FROM ConsolHeader a
LEFT Join Containers b on a.Consolnumber = b.Consolnumber
INNER Join ConsolLegs c on a.Consolnumber = c.ConsolNumber
WHERE c.ETD is not null
AND not exists
(select 1 from ConsolLegs cc where c.Consolnumber = cc.Consolnumber
and cc.etd < c.etd)
You can get the minimum ETD for each ConsolNumber first:
SELECT CL.ConsolNumber, MIN(CL.ETD) FROM ConsolLegs CL GROUP BY CL.ConsolNumber
Then get the result:
Select a.ConsolNumber, a.ConsolType,
a.ConsolTransport, b.Container_20F,
b.Container_20R, b.Container_20H,
b.Container_40F, b.DeliveryMode,
c.ETD
FROM ConsolHeader a
LEFT Join Containers b on a.Consolnumber = b.Consolnumber
INNER Join ConsolLegs c on a.Consolnumber = c.ConsolNumber
AND c.ETD = (SELECT MIN(CL.ETD) FROM ConsolLegs CL WHERE CL.ConsolNumber = a.ConsolNumber)
If the query is slow, try adding an index on ConsolLegs (ConsolNumber, ETD).

How to optimize query postgres

I am running the following query:
SELECT fat.*
FROM Table1 fat
LEFT JOIN modo_captura mc ON mc.id = fat.modo_captura_id
INNER JOIN loja lj ON lj.id = fat.loja_id
INNER JOIN rede rd ON rd.id = fat.rede_id
INNER JOIN bandeira bd ON bd.id = fat.bandeira_id
INNER JOIN produto pd ON pd.id = fat.produto_id
INNER JOIN loja_extensao le ON le.id = fat.loja_extensao_id
INNER JOIN conta ct ON ct.id = fat.conta_id
INNER JOIN banco bc ON bc.id = ct.banco_id
LEFT JOIN conciliacao_vendas cv ON fat.empresa_id = cv.empresa_id AND cv.chavefato = fat.chavefato AND fat.rede_id = cv.rede_id
WHERE 1 = 1
AND cv.controle_upload_arquivo_id = 6906
AND fat.parcela = 1
ORDER BY fat.data_venda, fat.data_credito limit 20
But it runs very slowly. Here is the EXPLAIN plan: http://explain.depesz.com/s/DnXH
Try this rewritten version:
SELECT fat.*
FROM Table1 fat
JOIN conciliacao_vendas cv USING (empresa_id, chavefato, rede_id)
JOIN loja lj ON lj.id = fat.loja_id
JOIN rede rd ON rd.id = fat.rede_id
JOIN bandeira bd ON bd.id = fat.bandeira_id
JOIN produto pd ON pd.id = fat.produto_id
JOIN loja_extensao le ON le.id = fat.loja_extensao_id
JOIN conta ct ON ct.id = fat.conta_id
JOIN banco bc ON bc.id = ct.banco_id
LEFT JOIN modo_captura mc ON mc.id = fat.modo_captura_id
WHERE cv.controle_upload_arquivo_id = 6906
AND fat.parcela = 1
ORDER BY fat.data_venda, fat.data_credito
LIMIT 20;
JOIN syntax and sequence of joins
In particular, I fixed the misleading LEFT JOIN to conciliacao_vendas, which is forced to act as a plain [INNER] JOIN by the later WHERE condition anyway. This should simplify query planning and allow rows to be eliminated earlier in the process, which should make everything a lot cheaper. Related answer with detailed explanation:
Explain JOIN vs. LEFT JOIN and WHERE condition performance suggestion in more detail
USING is just a syntactical shorthand.
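The point about the LEFT JOIN being forced to act as an inner join can be checked with a small sketch. It uses Python's sqlite3 module and two cut-down tables, fat and cv, as hypothetical stand-ins for Table1 and conciliacao_vendas; the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, hypothetical stand-ins for Table1 (fat) and conciliacao_vendas (cv).
CREATE TABLE fat (id INTEGER PRIMARY KEY, chavefato TEXT);
CREATE TABLE cv (chavefato TEXT, controle_upload_arquivo_id INTEGER);
INSERT INTO fat VALUES (1, 'k1'), (2, 'k2'), (3, 'k3');
INSERT INTO cv VALUES ('k1', 6906), ('k2', 1234);
""")

# LEFT JOIN with a WHERE filter on the right-hand table.
left_with_where = conn.execute("""
SELECT fat.id FROM fat
LEFT JOIN cv ON cv.chavefato = fat.chavefato
WHERE cv.controle_upload_arquivo_id = 6906
ORDER BY fat.id
""").fetchall()

# Plain inner join with the same filter.
inner_with_where = conn.execute("""
SELECT fat.id FROM fat
JOIN cv ON cv.chavefato = fat.chavefato
WHERE cv.controle_upload_arquivo_id = 6906
ORDER BY fat.id
""").fetchall()

# LEFT JOIN without the filter keeps the unmatched row (id 3).
left_without_where = conn.execute("""
SELECT fat.id FROM fat
LEFT JOIN cv ON cv.chavefato = fat.chavefato
ORDER BY fat.id
""").fetchall()
```

The first two queries return the same single row: the NULL-extended rows produced by the LEFT JOIN can never satisfy the equality test in WHERE, so they are discarded.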
Since there are many tables involved in the query and the order in which the rewritten query joins tables is now optimal, you can fine-tune this with SET LOCAL join_collapse_limit = 1 to save planning overhead and avoid inferior query plans. Run in a single transaction:
BEGIN;
SET LOCAL join_collapse_limit = 1;
SELECT ...; -- read data here
COMMIT; -- or ROLLBACK;
More about that:
Sample Query to show Cardinality estimation error in PostgreSQL
The fine manual on Controlling the Planner with Explicit JOIN Clauses
Index
Add some indexes on lookup tables with lots of rows (not necessary for just a couple of dozen), in particular (taken from your query plan):
Seq Scan on public.conta ct ... rows=6771
Seq Scan on public.loja lj ... rows=1568
Seq Scan on public.loja_extensao le ... rows=16394
That's particularly odd, because those columns look like primary key columns and should already have an index ...
So:
CREATE INDEX conta_pkey_idx ON public.conta (id);
CREATE INDEX loja_pkey_idx ON public.loja (id);
CREATE INDEX loja_extensao_pkey_idx ON public.loja_extensao (id);
To make this really fast, a multicolumn index would be of great service:
CREATE INDEX foo ON Table1 (parcela, data_venda, data_credito);

Avoiding NULL records on this LEFT JOIN

The query below works - EXCEPT - it is returning NULL values for vehicle_id. I do not want any records that have NULL for vehicle_id.
Since vehicle_id is tied to fund_series, this is complicated to me.
When I had the vehicle_id conditions underneath the WHERE, the query was not working. Any SQL geniuses that can help?
I put the MIN() aggregate functions in there just so I could get the GROUP BY to work.
SELECT DISTINCT
MIN(ml.pretty_file_name),
ml.filename,
MIN(ml.issued_date),
MIN(mr.rule_name),
MIN(mlob.line_of_business_name),
MIN(mt.media_type_name),
MAX(v.vehicle_name)
FROM Media_Live ml
JOIN Media_Type mt
ON mt.media_type_id = ml.media_type_id
JOIN Media_Rule mr
ON mr.rule_id = ml.rule_id
JOIN Media_Line_Of_Business mlob
ON mlob.line_of_business_id = ml.line_of_business_id
LEFT JOIN Fund_Class_Media fcm
ON fcm.media_id=ml.media_id
LEFT JOIN Fund_Class_Live fc
ON fc.fund_class_id = fcm.fund_class_id
LEFT JOIN Fund_Series fs
ON fs.fund_series_id = fc.fund_series_id
LEFT JOIN Vehicle AS v
ON v.vehicle_id=fs.vehicle_id AND /*THIS IS WHERE IM GETTING NULLS*/
(
v.vehicle_id = 1
OR v.vehicle_id = 2
OR v.vehicle_id = 5
)
LEFT JOIN Media_Media_Tag AS mmt ON mmt.media_id=ml.media_id
LEFT JOIN Media_Tag AS mtag ON mtag.tag_id=mmt.tag_id
WHERE
(/*people can search with terms for fc*/
--fc.fund_class_id LIKE '%'+replace(?,' ','%')+'%'
)
(
mt.media_type_id = 33
OR mt.media_type_id = 1
OR mt.media_type_id = 12
)
AND
(
mr.rule_id = 3
OR mr.rule_id = 9
)
AND
(
mtag.tag_name != 'exclude_web_lit_center'
)
GROUP BY ml.filename
This is what a LEFT JOIN does: it allows NULLs. Just take out the LEFT keyword to make it an inner join.
JOIN Vehicle AS v ON v.vehicle_id=fs.vehicle_id AND v.vehicle_id IN (1,2,5)
You could also do this, but I don't see why you would:
LEFT JOIN Vehicle AS v ON v.vehicle_id=fs.vehicle_id AND ISNULL(v.vehicle_id,0) IN (1,2,5)
In the WHERE clause, add:
AND v.vehicle_id IS NOT NULL
That should do it.
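The difference between the suggestions in this thread can be seen in a reduced sketch with Python's sqlite3 module. Only Fund_Series and Vehicle are modelled, and the rows and vehicle names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Reduced, hypothetical versions of the two tables from the question.
CREATE TABLE Fund_Series (fund_series_id INTEGER PRIMARY KEY, vehicle_id INTEGER);
CREATE TABLE Vehicle (vehicle_id INTEGER PRIMARY KEY, vehicle_name TEXT);
INSERT INTO Vehicle VALUES (1, 'Mutual Fund'), (3, 'ETF');
INSERT INTO Fund_Series VALUES (10, 1), (11, 3), (12, NULL);
""")

# Filter inside the ON clause of a LEFT JOIN: non-matching rows stay, padded with NULL.
left_join = conn.execute("""
SELECT fs.fund_series_id, v.vehicle_name
FROM Fund_Series fs
LEFT JOIN Vehicle v ON v.vehicle_id = fs.vehicle_id AND v.vehicle_id IN (1, 2, 5)
ORDER BY fs.fund_series_id
""").fetchall()

# Same condition on an inner join: non-matching rows are dropped.
inner_join = conn.execute("""
SELECT fs.fund_series_id, v.vehicle_name
FROM Fund_Series fs
JOIN Vehicle v ON v.vehicle_id = fs.vehicle_id AND v.vehicle_id IN (1, 2, 5)
ORDER BY fs.fund_series_id
""").fetchall()

# LEFT JOIN plus an IS NOT NULL test in WHERE gives the same rows as the inner join.
left_join_filtered = conn.execute("""
SELECT fs.fund_series_id, v.vehicle_name
FROM Fund_Series fs
LEFT JOIN Vehicle v ON v.vehicle_id = fs.vehicle_id AND v.vehicle_id IN (1, 2, 5)
WHERE v.vehicle_id IS NOT NULL
ORDER BY fs.fund_series_id
""").fetchall()
```

Here left_join keeps all three fund series (two of them with a NULL vehicle name), while the other two queries return only the series attached to vehicle 1.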

Joining two tables on a key and then left outer joining a table on a number of criteria

I'm attempting to join 3 tables together in a single query. The first two have a key so each entry has a matching entry. This joined table will then be joined by a third table that could produce multiple entries for each entry from the first table (the joined ones).
select * from
(select a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession
from trade_monthly a, trade_monthly_second b
where
a.bidentifier = b.jidentifier AND
a.bsession = b.JSession)
left outer join
trade c
on c.symbol = a.symbol
order by a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession, c.symbol
There will be more criteria (not just c.symbol = a.symbol) on the left outer join but for now this should be useful. How can I nest the queries this way? I'm getting an "SQL command not properly ended" error.
Any help is appreciated.
Thanks
As far as I know, every derived table must be given a name, so try something like this:
SELECT * FROM
(SELECT a.bidentifier, ....
...
a.bsession = b.JSession) t
LEFT JOIN trade c
ON c.symbol = t.symbol
ORDER BY t.bidentifier, ...
Anyway I think you could use a simpler query:
SELECT a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession, c.*
FROM trade_monthly a
INNER JOIN trade_monthly_second b
ON a.bidentifier = b.jidentifier
AND a.bsession = b.JSession
LEFT JOIN trade c
ON c.symbol = a.symbol
ORDER BY a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession, c.symbol
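To compare the two shapes from this answer, here is a small sketch using Python's sqlite3 module. SQLite is a stand-in for the asker's database, the rows are invented, and the price column on trade is a hypothetical addition so the result has something to show. The outer query refers to the derived table only through its alias t, since the inner aliases a and b are not visible outside the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trade_monthly (bidentifier INTEGER, bsession TEXT, symbol TEXT);
CREATE TABLE trade_monthly_second (jidentifier INTEGER, JSession TEXT);
CREATE TABLE trade (symbol TEXT, price REAL);  -- price is a hypothetical column
INSERT INTO trade_monthly VALUES (1, 'S1', 'AAA'), (2, 'S1', 'BBB');
INSERT INTO trade_monthly_second VALUES (1, 'S1'), (2, 'S1');
INSERT INTO trade VALUES ('AAA', 10.0), ('AAA', 11.0);
""")

# Nested form: the derived table gets the alias t, and the outer query uses only t.
nested = conn.execute("""
SELECT t.bidentifier, t.symbol, c.price
FROM (SELECT a.bidentifier, a.bsession, a.symbol
      FROM trade_monthly a
      JOIN trade_monthly_second b
        ON a.bidentifier = b.jidentifier AND a.bsession = b.JSession) t
LEFT JOIN trade c ON c.symbol = t.symbol
ORDER BY t.bidentifier, c.price
""").fetchall()

# Flat form: the same joins without a derived table.
flat = conn.execute("""
SELECT a.bidentifier, a.symbol, c.price
FROM trade_monthly a
JOIN trade_monthly_second b
  ON a.bidentifier = b.jidentifier AND a.bsession = b.JSession
LEFT JOIN trade c ON c.symbol = a.symbol
ORDER BY a.bidentifier, c.price
""").fetchall()
```

Both forms return the same rows: symbol AAA appears once per matching trade, and BBB appears once with a NULL price because the LEFT JOIN finds no trade for it.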
Try this:
SELECT
`trade_monthly`.`bidentifier` AS `bidentifier`,
`trade_monthly`.`bsession` AS `bsession`,
`trade_monthly`.`symbol` AS `symbol`,
`trade_monthly_second`.`jidentifier` AS `jidentifier`,
`trade_monthly_second`.`jsession` AS `jsession`
FROM
(
(
`trade_monthly`
JOIN `trade_monthly_second` ON(
(
(
`trade_monthly`.`bidentifier` = `trade_monthly_second`.`jidentifier`
)
AND(
`trade_monthly`.`bsession` = `trade_monthly_second`.`jsession`
)
)
)
)
LEFT JOIN `trade` ON(
(
`trade`.`symbol` = `trade_monthly`.`symbol`
)
)
)
ORDER BY
`trade_monthly`.`bidentifier`,
`trade_monthly`.`bsession`,
`trade_monthly`.`symbol`,
`trade_monthly_second`.`jidentifier`,
`trade_monthly_second`.`jsession`,
`trade`.`symbol`
Why don't you just create a view of the two inner joined tables? Then you can build a query that joins this view to the trade table using the left outer join matching criteria.
In my opinion, views are one of the most overlooked solutions to a lot of complex queries.
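A minimal sketch of the view approach, again through Python's sqlite3 module. The view name trade_monthly_joined, the price column and the sample rows are all hypothetical; the table and key column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trade_monthly (bidentifier INTEGER, bsession TEXT, symbol TEXT);
CREATE TABLE trade_monthly_second (jidentifier INTEGER, JSession TEXT);
CREATE TABLE trade (symbol TEXT, price REAL);  -- price is a hypothetical column
INSERT INTO trade_monthly VALUES (1, 'S1', 'AAA'), (2, 'S1', 'BBB');
INSERT INTO trade_monthly_second VALUES (1, 'S1'), (2, 'S1');
INSERT INTO trade VALUES ('AAA', 10.0);

-- The view hides the inner join of the two monthly tables (hypothetical name).
CREATE VIEW trade_monthly_joined AS
SELECT a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession
FROM trade_monthly a
JOIN trade_monthly_second b
  ON a.bidentifier = b.jidentifier AND a.bsession = b.JSession;
""")

# The outer query only has to express the LEFT JOIN criteria against the view.
view_rows = conn.execute("""
SELECT v.bidentifier, v.symbol, c.price
FROM trade_monthly_joined v
LEFT JOIN trade c ON c.symbol = v.symbol
ORDER BY v.bidentifier
""").fetchall()
```

Further matching criteria for the left outer join can then be added to the ON clause of the outer query without touching the view.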

SQL joins "going up" two tables

I'm trying to create a moderately complex query with joins:
SELECT `history`.`id`,
`parts`.`type_id`,
`serialized_parts`.`serial`,
`history_actions`.`action`,
`history`.`date_added`
FROM `history_actions`, `history`
LEFT OUTER JOIN `parts` ON `parts`.`id` = `history`.`part_id`
LEFT OUTER JOIN `serialized_parts` ON `serialized_parts`.`parts_id` = `history`.`part_id`
WHERE `history_actions`.`id` = `history`.`action_id`
AND `history`.`unit_id` = '1'
ORDER BY `history`.`id` DESC
I'd like to replace `parts`.`type_id` in the SELECT statement with `part_list`.`name` where the relationship I need to enforce between the two tables is `part_list`.`id` = `parts`.`type_id`. Also I have to use joins because in some cases `history`.`part_id` may be NULL which obviously isn't a valid part id. How would I modify the query to do this?
Here is some sample data as requested:
history table:
(source: ianburris.com)
serialized_parts table:
(source: ianburris.com)
parts table:
(source: ianburris.com)
part_list table:
(source: ianburris.com)
And what I want to see is:
id name serial action date_added
4 Battery 567 added 2010-05-19 10:42:51
3 Antenna Board 345 added 2010-05-19 10:42:51
2 Main Board 123 added 2010-05-19 10:42:51
1 NULL NULL created 2010-05-19 10:42:51
This would at least be on the right track...
If you're looking to NOT show any parts with an invalid ID, simply change the LEFT JOINs to INNER JOINs (they will restrict NULL values)
SELECT `history`.`id`
, `parts`.`type_id`
, `part_list`.`name`
, `serialized_parts`.`serial`
, `history_actions`.`action`
, `history`.`date_added`
FROM `history_actions`
INNER JOIN `history` ON `history`.`action_id` = `history_actions`.`id`
LEFT JOIN `parts` ON `parts`.`id` = `history`.`part_id`
LEFT JOIN `serialized_parts` ON `serialized_parts`.`parts_id` = `history`.`part_id`
LEFT JOIN `part_list` ON `part_list`.`id` = `parts`.`type_id`
WHERE `history`.`unit_id` = '1'
ORDER BY `history`.`id` DESC
Boy, these backticks make my eyes hurt.
SELECT
h.id,
p.type_id,
pl.name,
sp.serial,
ha.action,
h.date_added
FROM
history h
INNER JOIN history_actions ha ON ha.id = h.action_id
LEFT JOIN parts p ON p.id = h.part_id
LEFT JOIN serialized_parts sp ON sp.parts_id = h.part_id
LEFT JOIN part_list pl ON pl.id = p.type_id
WHERE
h.unit_id = '1'
ORDER BY
h.id DESC
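Here is a sketch of the aliased query run against SQLite through Python's sqlite3 module. The original sample tables were posted as images that are not available, so the rows below are guesses reconstructed from the expected output in the question, and the column types are assumptions. The ORDER BY uses the alias h, because once a table is aliased the query has to refer to it by that alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE history_actions (id INTEGER PRIMARY KEY, action TEXT);
CREATE TABLE part_list (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE parts (id INTEGER PRIMARY KEY, type_id INTEGER);
CREATE TABLE serialized_parts (parts_id INTEGER, serial TEXT);
CREATE TABLE history (id INTEGER PRIMARY KEY, unit_id INTEGER, part_id INTEGER,
                      action_id INTEGER, date_added TEXT);

-- Rows reconstructed from the expected output (assumption).
INSERT INTO history_actions VALUES (1, 'created'), (2, 'added');
INSERT INTO part_list VALUES (1, 'Main Board'), (2, 'Antenna Board'), (3, 'Battery');
INSERT INTO parts VALUES (1, 1), (2, 2), (3, 3);
INSERT INTO serialized_parts VALUES (1, '123'), (2, '345'), (3, '567');
INSERT INTO history VALUES
    (1, 1, NULL, 1, '2010-05-19 10:42:51'),
    (2, 1, 1, 2, '2010-05-19 10:42:51'),
    (3, 1, 2, 2, '2010-05-19 10:42:51'),
    (4, 1, 3, 2, '2010-05-19 10:42:51');
""")

rows = conn.execute("""
SELECT h.id, pl.name, sp.serial, ha.action, h.date_added
FROM history h
INNER JOIN history_actions ha ON ha.id = h.action_id
LEFT JOIN parts p ON p.id = h.part_id
LEFT JOIN serialized_parts sp ON sp.parts_id = h.part_id
LEFT JOIN part_list pl ON pl.id = p.type_id  -- goes "up" from parts to part_list
WHERE h.unit_id = ?
ORDER BY h.id DESC
""", (1,)).fetchall()

for row in rows:
    print(row)
```

The history row with a NULL part_id survives every LEFT JOIN and comes back with NULL for both name and serial, matching the last line of the expected output.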