How to convert SUBSELECT with TOP and ORDER BY to JOIN

How to convert SUBSELECT with TOP and ORDER BY to JOIN - sql

I have a working sql select, which looks like this
[Edited: Im sorry i did one mistake in the question, i edited alias of Table1 but im trying the answers]
SELECT
m.Column1
,t2.Column2
,COALESCE
(
(
SELECT TOP 1 Vat
FROM LinkedDBServer.DatabaseName.dbo.TableName t3
WHERE
m.MaterialNumber = t3.MaterialNumber COLLATE Czech_CI_AS
and t3.Currency = …
and ...
ORDER BY [Date] DESC
), m.Vat
) as Vat
FROM Table1 m
JOIN Table2 t2 on (m.Column1 = t2.Column1)
It works but the problem is that it takes too long and LinkedServer cut my connection because it takes more than 10 minutes. The purpose of the query is to get newer data from a different database if it exists (i get newest data by top and ordering it by date and precondition is that every data in that database is newer than in mine, thats why im using COALESCE).
But my though is if I was able to rewrite it to JOIN it could be faster. But another problem could be I dont have an primary key (and cant change that).
How can I speed that query up ? (Im using SQL Server 2008 R2)
Thank you
Here i attached Estimated Query Plan: (Its readable in browser ZOOM :) Estimation is for 2 Coalesce columns.

Try rewriting query using outer apply
SELECT
t1.Column1
,t2.Column2
,COALESCE(ou.vat, m.Vat) as Vat
FROM Table1 t1
JOIN Table2 m on (m.Column1 = t1.Column1)
outer apply
(
SELECT TOP 1 Vat
FROM LinkedDBServer.DatabaseName.dbo.TableName t3
WHERE
m.MaterialNumber = t3.MaterialNumber COLLATE Czech_CI_AS
and t3.Currency = …
and ...
ORDER BY [Date] DESC
) ou

Another option:
; WITH vat AS (
SELECT MaterialNumber COLLATE Czech_CI_AS As MaterialNumber
, Vat
, Row_Number() OVER (PARTITION BY MaterialNumber ORDER BY "Date" DESC) As sequence
FROM LinkedDBServer.DatabaseName.dbo.TableName
WHERE Currency = ...
AND ...
)
SELECT t1.Column1
, m.Column2
, Coalesce(vat.Vat, m.Vat) As Vat
FROM Table1 As t1
INNER
JOIN Table2 As m
ON m.Column1 = t1.Column1
LEFT
JOIN vat
ON vat.MaterialNumber = m.MaterialNumber
AND vat.sequence = 1
;

Related

Impala Error : AnalysisException: Multiple subqueries are not supported in expression: (CASE *** WHEN THEN)

I have a where Clause that I need to check if values exists in a table, and I'm doing that in a (subquery). The problem is, that should be made based on
values - 'FIX' and 'VAR'. Depending on each, we need to check on a different table (subquery). To achieve that goal I'm using a Case When statement in the where clause, as shown below:
select *
FROM T1
where
(upper(trim(ITAXAVAR)) = 'S'
and
(
upper(trim(CTIPAMOR)) not in ('A','U','F')
)
)
and
--problem starts here.....
(case ucase(trim(CTIPTXFX)) --Values 'FIX';'VAR';'PUR'
WHEN 'FIX'
THEN
(concat(trim(CPRZTXFX),trim(CTAXAREF)) not in
(select trim(A.tayd91c0_celemtab)
from cd_estruturais.tat91_tabelas A
where A.tayd91c0_ctabela = 'W03' and
--data_date_part = '${Data_ref}' and --por vezes não temos actualização TAT91 para mesma data_ref das tabelas
A.data_date_part = (select max(B.data_date_part)
from cd_estruturais.tat91_tabelas B
where A.tayd91c0_ctabela = B.tayd91c0_ctabela and
B.data_date_part > date_add(TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP())),-5)
)
and length(nvl(trim(A.tayd91c0_celemtab),'')) <> 0
)
)
WHEN 'VAR'
THEN
(concat(trim(CTAXAREF),trim(CPERRVTX)) not in
(select concat(trim(A.CTXREF),trim(A.CPERRVTX))
from land_estruturais.cat01_taxref A
where A.data_date_part > date_add(TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP())),-5)
and length(nvl(concat(trim(A.CTXREF),trim(A.CPERRVTX)),'')) <> 0
)
)
END
)
;
Below is a simplified view of the same query:
select *
FROM T1
where
(--first criteria
)
and
--problem starts here.....
(case ucase(trim(CTIPTXFX)) --Values 'FIX';'VAR';'PUR'
WHEN 'FIX'
THEN
(field1 not in
(subquery 1)
)
WHEN 'VAR'
THEN
(field1 not in
(subquery 2)
END
)
;
Can anyone tell me what I'm doing wrong, please?
I seems to me that Impala does not support the subqueries inside a Case When Statement.
Thank you.

Impala doesnt support Subqueries in the select list.
So, you need to rewrite the SQL like below -
Use LEFT ANTI JOIN in place of NOT IN() to link subqueries to T1.
To handle case when, use UNION ALL for different conditions.
SELECT * FROM T1
LEFT ANTI JOIN subqry1 y ON T1.id = y.id
WHERE col='FIX'
UNION ALL
SELECT * FROM T1
LEFT ANTI JOIN subqry2 y ON T1.id = y.id
WHERE col='VAR'
I tried to change the simple SQL you posted above. The main SQL is too complex and need table setup and data to prove the logic.
Here is my version of your simple SQL -
select * FROM T1
LEFT ANTI JOIN subquery1 ON subquery1.column = T1.field1
where (--first criteria )
and ucase(trim(CTIPTXFX))='FIX'
UNION ALL
select * FROM T1
LEFT ANTI JOIN subquery2 ON subquery2.column = T1.field1
where (--first criteria )
and ucase(trim(CTIPTXFX))='VAR'
Pls note, Anti join and union all can be expensive so if your table size if huge, please tune them accordingly.

Oracle SQL Developer - error when referencing fields within nested from statements

For the below query I am getting an error with line 4 when referencing variables within "y". The query runs successfully when I use just " y.* " (line 5), however it generates an error when I try to also pull from the specified fields in line 4 (y.field1 as PRODUCT, y.field2 as PRODUCT_TYPE, y.entity, y.TYPE1). For the output, I want these fields listed first for visual reference.
I have this approach/ logic working for other queries (as i'm re using this logic for multiple variations of queries and various tables). However, I think that the issue with this one lies in my attempt to reference fields from tables that are in my join statements.
(
select
-- categorization fields:
-- table2.field1 as PRODUCT, table2.field2 as PRODUCT_TYPE, table3.entity, table3.TYPE1
y.field1 as PRODUCT,
y.field2 as PRODUCT_TYPE,
y.entity,
y.TYPE1
,y.*
from (
select *
from (
-- table references:
select table1.*,
row_number() over (
partition by
-- categorization fields:
table2.field1,
table2.field2,
table3.entity,
table3.TYPE1
order by table3.entity
) as rn
-- table references
from table1
-- joins, links, and filtering:
inner join table6 on table1.field_1 = table6.code1
inner join table5 on (table6.code = table5.code1)
AND (table6.code = table5.code)
left join table3 on table6.ent1 = table3.ent_code
left join table2 on table1.extid = table2.extID
where table1.tdate between '01-APR-19' and '01-APR-21'
AND table1.refe NOT IN ('OFF')
) x
-- sample rows:
where rn <= 2
) y
);
Let me know if anyone has a way that I can maybe better specify which tables those fields come from. I wish I could just do something like this:
y.table2.field1 as PRODUCT,
y.table2.field2 as PRODUCT_TYPE,
y.table3.entity,
y.table3.TYPE1
Sorry that I don't have a fiddle available!

Let me know if anyone has a way that I can maybe better specify which tables those fields come from.
Don't use select *. Instead, use the column names and give them appropriate aliases so you know where they came from:
As an example:
SELECT small_value,
medium_value,
big_value
FROM (
SELECT small.value AS small_value,
medium.value AS medium_value,
big.value AS big_value
FROM big
CROSS JOIN medium
CROSS JOIN small
)
WHERE 1 = 1
In your query, instead of using SELECT * in y or using SELECT table1.* in x you can name the columns and give them descriptive aliases.
I am getting an error with line 4 when referencing variables within "y".
(
select
-- categorization fields:
-- table2.field1 as PRODUCT, table2.field2 as PRODUCT_TYPE, table3.entity, table3.TYPE1
That is because you cannot see TABLE2 or TABLE3 because the only "view" you are looking at is of the sub-query with the alias y.
If you want to see those columns then you need to SELECT them inside the x subquery and pass them to each subsequent outer-query.
(
select *
from (
-- table references:
select table1.field1 AS t1_product,
table1.field2 AS t1_product_type,
table1.entity AS t1_entity,
table1.type1 AS t1_type1,
table2.field1 AS t2_product,
table2.field2 AS t2_product_type,
table2.entity AS t2_entity,
table2.type1 AS t2_type1,
table3.field1 AS t3_product,
table3.field2 AS t3_product_type,
table3.entity AS t3_entity,
table3.type1 AS t3_type1,
row_number() over (
partition by
-- categorization fields:
table2.field1,
table2.field2,
table3.entity,
table3.TYPE1
order by table3.entity
) as rn
-- table references
from table1
-- joins, links, and filtering:
inner join table6 on table1.field_1 = table6.code1
inner join table5 on (table6.code = table5.code1)
AND (table6.code = table5.code)
left join table3 on table6.ent1 = table3.ent_code
left join table2 on table1.extid = table2.extID
where table1.tdate between '01-APR-19' and '01-APR-21'
AND table1.refe NOT IN ('OFF')
) x
-- sample rows:
where rn <= 2
);

Check if a combination of fields already exists in the table

My weakest area of SQL are self JOINS, currently struggling with an issue.
I need to find the latest entry in a table, I'm using a WHERE DATEFIELD IN (SELECT MAX(DATEFIELD) FROM TABLE) to do this. I then need to establish if 3 columns from that already exist in the same TABLE.
My latest attempt looks like this -
SELECT * FROM PART_TABLE
WHERE NOT EXISTS
(
SELECT
t1.DATEFIELD
t1.CODE1
t1.CODE2
t1.CODE3
FROM PART_TABLE t1
INNER JOIN PART_TABLE t2 ON t1.UNIQUE = t2.UNQIUE
)
WHERE t1.DATEFIELD IN
(
SELECT MAX(DATEFIELD)
FROM PARTTABLE
)
)
I think part of the issue is that I can't exclude the unique row from t1 when checking in t2 using this method.
Using MSSQL 2014.

The following query will return the latest record from your table and a bit flag whether a duplicate tuple {Code1, Code2, Code3} exists in it under a different identifier:
select top (1) p.*,
case when exists (
select 0 from dbo.Part_Table t where t.Unique != p.Unique
and t.Code1 = p.Code1 and t.Code2 = p.Code2 and t.Code3 = p.Code3
) then 1
else 0 end as [IsDuplicateExists]
from dbo.Part_Table p
order by p.DateField desc;
You can use this example as a template to address your specific needs, which unfortunately aren't immediately apparent from your explanation.

Why do I get ORA-00907 in my SQL query?

I have this SQL query which a partner has done for a little project at university (this is the first time we use SQL), but we get the ora-00907 error and both of us don't know why.
I have checked the parenthesis and they seem to be ok, so the problem must be another.
select
persona.nombre,
anyo,
t2.total
from persona join
(
select
t1.idPersona,
count(produccion.anyo) as total,
anyo
from
(
select *
from produccion
join pelicula
on produccion.id = pelicula.id
) as pel
join
(
select *
from participa
where idPapel = 8
) as t1
on t1.idProduccion = pel.id
)
group by t1.idPersona
) as t2
on persona.id = t2.idPersona
where t2.total > 2
order by t2.total desc;

You are selecting * and doing group by on one column which is creating problem. Either you select only respective column under group by condition OR you remove group by.
select *
from (produccion join pelicula on produccion.id=pelicula.id) as pel
join
(select *
from participa
where idPapel=8) as t1
on t1.idProduccion=pel.id)
group by t1.idPersona
Above code section is unallowed use of group by.
If group by is so much needed, i would suggest you to use it later on in the end. Another option is to use analytical function and filter out rest un-wanted records in upper nesting of query which you already have.

You have lots of nested views, which makes your query rather hard to debug. You have lots of brackets, which need to match.
Anyway this is wrong: select t1.idPersona, count(produccion.anyo) as total, anyo. You'll need to include anyo in the GROUP BY clause, which will probably change the result set you want.
select persona.nombre,
t2.anyo,
t2.total
from persona join
(select t1.idPersona,
count(produccion.anyo) as total,
anyo
from (select *
from produccion
join pelicula
on produccion.id=pelicula.id) pel
join
(select *
from participa
where idPapel=8) t1
on t1.idProduccion=pel.id
group by t1.idPersona, t1.anyo) t2
on persona.id=t2.idPersona
where t2.total>2
order by t2.total desc;

I think your query can be simplified/corrected like this:
select persona.nombre,
anyo,
t2.total
from persona
join (
select par.idPersona,
count(produccion.anyo) as total,
anyo
from produccion
join pelicula
on produccion.id = pelicula.id
left join participa par
on par.idProduccion = pelicula.id -- or produccion.id,
-- this was also an error in the original query,
-- since the subquery selected both
and par.idPapel = 8
group by t1.idPersona
, anyo -- Was missing, but it also doesn't make sense, as this is what you count, so you'll just get 1's here. What do you want with this?
) as t2
on persona.id = t2.idPersona
where t2.total > 2
order by t2.total desc;

SQL JOIN Statement

Lets say I have a table e.g
Request No. Type Status
---------------------------
1 New Renewed
and then another table
Action ID Request No LastUpdated
------------------------------------
1 1 06-10-2010
2 1 07-14-2010
3 1 09-30-2010
How can I join the second table with the first table but only get the latest record from the second table(e.g Last Updated DESC)

SELECT T1.RequestNo ,
T1.Type ,
T1.Status,
T2.ActionId ,
T2.LastUpdated
FROM TABLE1 T1
JOIN TABLE2 T2
ON T1.RequestNo = T2.RequestNo
WHERE NOT EXISTS
(SELECT *
FROM TABLE2 T2B
WHERE T2B.RequestNo = T2.RequestNo
AND T2B.LastUpdated > T2.LastUpdated
)

Using aggregates:
SELECT r.*, re.*
FROM REQUESTS r
JOIN REQUEST_EVENTS re ON re.request_no = r.request_no
JOIN (SELECT t.request_no,
MAX(t.lastupdated) AS latest
FROM REQUEST_EVENTS t
GROUP BY t.request_no) x ON x.request_no = re.request_no
AND x.latest = re.lastupdated
Using LEFT JOIN & NOT EXISTS:
SELECT r.*, re.*
FROM REQUESTS r
JOIN REQUEST_EVENTS re ON re.request_no = r.request_no
WHERE NOT EXISTS(SELECT NULL
FROM REQUEST_EVENTS re2
WHERE re2.request_no = r2.request_no
AND re2.LastUpdated > re.LastUpdated)

SELECT *
FROM REQUEST, ACTION
WHERE REQUEST.REQUESTNO = ACTION.REQUESTNO --Joining here
AND ACTION.LastUpdated = (SELECT MAX(LastUpdated) FROM ACTION WHERE REQUEST.REQUESTNO = ACTION.REQUESTNO);
A sub-query is used to get the last updated record's date and matches against itself to prevent the other records being joined.
Granted, depending on how precise the LastUpdated field is, it can have problems with two records being updated on the same date, but that is a problem encountered in any other implementation, so the precision would have to be increased or some other logic would have to be in place or another distinguishing characteristic to prevent multiple rows being returned.

SELECT r.RequestNo, r.Type, r.Status, a.ActionID, MAX(a.LastUpdated)
FROM Request r
INNER JOIN Action a ON r.RequestNo = a.RequestNo
GROUP BY r.RequestNo, r.Type, r.Status, a.ActionID

We can use the operation Top 1 with ORDER BY clause. For instance, if your tables are RequestTable(ID,Type,Status) and ActionTable(ActionID,RequestID,LastUpdated), the query will be like this:
Select Top 1 rq.ID, rq.Status, at.ActionID
From RequestTable as rq
JOIN ActionTable as at ON rq.ID = at.RequestID
Order by at.LastUpdated DESC

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

How to convert SUBSELECT with TOP and ORDER BY to JOIN - sql

Related

Impala Error : AnalysisException: Multiple subqueries are not supported in expression: (CASE *** WHEN THEN)

Oracle SQL Developer - error when referencing fields within nested from statements

Check if a combination of fields already exists in the table

Why do I get ORA-00907 in my SQL query?

SQL JOIN Statement

Categories

Resources