Turning an outer apply into a left join when you reference parent aliases - sql

I'm currently trying to turn an outer apply into a left join to save some complexity.
SELECT *
FROM fact_table h
OUTER APPLY (SELECT TOP 1
*
FROM dimension mcc WITH (NOLOCK)
WHERE h.product = mcc.product
AND h.country = mcc.country
AND mcc.date IN (SELECT MAX(date)
FROM dimension dd WITH (NOLOCK)
WHERE FORMAT(DATEADD(MONTH, -3, dd.date), 'yyyyMM') <= h.month_in_the_year
AND dd.product = h.product
AND dd.country = h.country)) a;
I basically use it to get the related data from Dimension linked with the latest data point that's earlier than 3 months ago.
I'm trying to turn it into a left join, but it takes a lot more time since I don't filter the dimension before the join:
SELECT TOP 10
*
FROM fact_table h
LEFT JOIN dimension a ON h.product = a.product
AND h.country = a.country
AND a.pkid = (SELECT TOP 1
pkid
FROM dimension dd
WHERE FORMAT(DATEADD(MONTH, -3, dd.date), 'yyyyMM') <= h.month_in_the_year
ORDER BY date DESC);
Do you have an idea how to turn it into an efficient left join?

It looks like you can significantly simplify this query by adding an ORDER BY. I've also rewritten the date filter so that it compares mcc.date directly instead of wrapping it in FORMAT, which lets the query use an index on that column.
SELECT *
FROM fact_table h
OUTER APPLY (
SELECT TOP 1 *
FROM dimension mcc
WHERE h.product = mcc.product
AND h.country = mcc.country
AND mcc.date < DATEADD(MONTH, 2, DATEFROMPARTS(LEFT(h.month_in_the_year, 4), RIGHT(h.month_in_the_year, 2), 1))
ORDER BY mcc.date DESC
) a;
To transform this into a LEFT JOIN, you need to use row-numbering, ordered the same way as the TOP 1 above (latest date first):
SELECT *
FROM (
SELECT *,
rn = ROW_NUMBER() OVER (PARTITION BY h.PrimaryKeyColumn ORDER BY mcc.date DESC)
FROM fact_table h
LEFT JOIN dimension mcc
ON h.product = mcc.product
AND h.country = mcc.country
AND mcc.date < DATEADD(MONTH, 2, DATEFROMPARTS(LEFT(h.month_in_the_year, 4), RIGHT(h.month_in_the_year, 2), 1))
) a
WHERE rn = 1;
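As a rough way to sanity-check the row-numbering pattern, the sketch below runs it against an in-memory SQLite database from Python. SQLite only stands in for SQL Server here (it has no OUTER APPLY or DATEFROMPARTS, and needs version 3.25 or later for window functions). The fact_id key, the precomputed cutoff column and the sample rows are all made up for illustration.

```python
import sqlite3

def latest_dimension_per_fact(conn):
    """Return (fact_id, dim_date): the latest dimension row before each fact's cutoff."""
    sql = """
        SELECT fact_id, dim_date
        FROM (
            SELECT h.fact_id,
                   d.date AS dim_date,
                   -- number the matches per fact row, newest first
                   ROW_NUMBER() OVER (
                       PARTITION BY h.fact_id
                       ORDER BY d.date DESC
                   ) AS rn
            FROM fact_table AS h
            LEFT JOIN dimension AS d
                   ON d.product = h.product
                  AND d.country = h.country
                  AND d.date < h.cutoff
        ) AS numbered
        WHERE rn = 1
        ORDER BY fact_id
    """
    return conn.execute(sql).fetchall()

def build_demo():
    """Hypothetical sample data; 'cutoff' replaces the DATEADD/DATEFROMPARTS expression."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE fact_table (fact_id INTEGER PRIMARY KEY, product TEXT, country TEXT, cutoff TEXT);
        CREATE TABLE dimension (product TEXT, country TEXT, date TEXT);
        INSERT INTO fact_table VALUES
            (1, 'A', 'FR', '2021-05-01'),
            (2, 'B', 'FR', '2021-05-01');
        INSERT INTO dimension VALUES
            ('A', 'FR', '2021-01-10'),
            ('A', 'FR', '2021-04-20'),
            ('A', 'FR', '2021-06-01');
    """)
    return conn

if __name__ == "__main__":
    print(latest_dimension_per_fact(build_demo()))
```

With this sample data, fact 1 gets the 2021-04-20 row (the newest one before its cutoff) and fact 2, which has no matching dimension row, is still returned with a NULL date, as a LEFT JOIN should.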

Related

Get value from a joined table with no value in primary table

The query shown below is just about right, but I need to have a row for each fiscal Id, i.e. in the output shown below, there needs to be a new row after row 4 with data (screen shot below)
The query I'm using is:
SELECT a.companyId,a.profitCenterID,a.coaID,a.fiscalId,
COALESCE(SUM(a.amount * -1),0) amount,
twelveMo = (
SELECT COALESCE(SUM(amount * -1), 0)
FROM gl a1
LEFT OUTER JOIN fiscal f ON a1.fiscalId=f.Id
WHERE
a1.companyId = a.companyId AND
a1.profitCenterId = a.profitCenterId AND
a1.coaId = a.coaId AND
f.Id > a.fiscalId - 12 AND
f.Id <= a.fiscalId
)
FROM gl a
INNER JOIN coa c ON c.Id=a.coaId AND c.statementType=4
GROUP BY companyId,profitCenterId,coaId,a.fiscalId
ORDER BY companyId,profitCenterId,coaId,a.fiscalId
I don't know your sample data or your schema, so I've just layered my query on top of yours.
;WITH CTE_NUM_TEMP AS
(
SELECT 1 AS Fiscal
UNION ALL
SELECT Fiscal+1 FROM CTE_NUM_TEMP
WHERE Fiscal+1<=100
)
SELECT ISNULL(Der.companyId,1) AS companyId,ISNULL(Der.profitCenterID,1) AS profitCenterID,
ISNULL(Der.coaID,40000) AS coaID,IIF(twelveMo IS NULL,LAG(twelveMo,1) OVER(ORDER BY Fiscal),twelveMo) AS twelveMo
FROM CTE_NUM_TEMP AS Num
LEFT JOIN
(
SELECT a.companyId,a.profitCenterID,a.coaID,a.fiscalId,
COALESCE(SUM(a.amount * -1),0) amount,
twelveMo = (
SELECT COALESCE(SUM(amount * -1), 0)
FROM gl a1
LEFT OUTER JOIN fiscal f ON a1.fiscalId=f.Id
WHERE
a1.companyId = a.companyId AND
a1.profitCenterId = a.profitCenterId AND
a1.coaId = a.coaId AND
f.Id > a.fiscalId - 12 AND
f.Id <= a.fiscalId
)
FROM gl a
INNER JOIN coa c ON c.Id=a.coaId AND c.statementType=4
GROUP BY companyId,profitCenterId,coaId,a.fiscalId
)AS Der
ON Num.Fiscal=Der.fiscalId
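The core idea above (generate every fiscal id with a recursive CTE, then LEFT JOIN the real data onto it) can be tried out in isolation. This is a minimal sketch using Python's sqlite3 module, not the poster's exact query: the gl table is reduced to two hypothetical columns, and SQLite requires the RECURSIVE keyword, which T-SQL does not use.

```python
import sqlite3

def rows_for_every_fiscal_id(conn, last_id):
    """Left join a generated 1..last_id sequence to gl so missing ids still get a row."""
    sql = """
        WITH RECURSIVE nums(fiscal) AS (
            SELECT 1
            UNION ALL
            SELECT fiscal + 1 FROM nums WHERE fiscal + 1 <= ?
        )
        SELECT n.fiscal, COALESCE(SUM(g.amount), 0) AS amount
        FROM nums AS n
        LEFT JOIN gl AS g ON g.fiscalId = n.fiscal
        GROUP BY n.fiscal
        ORDER BY n.fiscal
    """
    return conn.execute(sql, (last_id,)).fetchall()

def build_demo():
    """Hypothetical data with no rows for fiscal ids 3 and 5."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE gl (fiscalId INTEGER, amount INTEGER);
        INSERT INTO gl VALUES (1, 10), (2, 5), (2, 7), (4, 3);
    """)
    return conn

if __name__ == "__main__":
    for row in rows_for_every_fiscal_id(build_demo(), 5):
        print(row)
```

Fiscal ids 3 and 5 have no gl rows in the sample, yet each still appears in the output with an amount of 0, which is the "row for each fiscal Id" behaviour the question asks for.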

Avoid SQL Pivot returning duplicate rows

I have the following SQL script which returns duplicate values in PIVOT. How do I combine those duplicate records into one row?
Please check the below image for the results set.
SELECT *
FROM (SELECT X.stockcode,
X.description,
X.pack,
X.location,
X.lname,
X.qty,
Y.stockcode AS StockCode2,
y.periodname,
Y.months,
Y.saleqty
FROM (SELECT dbo.stock_items.stockcode,
dbo.stock_items.description,
dbo.stock_items.pack,
dbo.stock_loc_info.location,
dbo.stock_locations.lname,
dbo.stock_loc_info.qty
FROM dbo.stock_locations
INNER JOIN dbo.stock_loc_info
ON dbo.stock_locations.locno = dbo.stock_loc_info.location
LEFT OUTER JOIN dbo.stock_items
ON dbo.stock_loc_info.stockcode = dbo.stock_items.stockcode
WHERE ( dbo.stock_items.status = 's' )) AS X
LEFT OUTER JOIN (SELECT dbo.dr_invlines.stockcode,
( 12 + Datepart(month, Getdate()) - Datepart(month, dbo.dr_trans.transdate) ) % 12 + 1 AS Months,
Sum(dbo.dr_invlines.quantity) AS SaleQty,
dbo.period_status.periodname
FROM dbo.dr_trans
INNER JOIN dbo.period_status
ON dbo.dr_trans.period_seqno = dbo.period_status.seqno
LEFT OUTER JOIN dbo.stock_items AS STOCK_ITEMS_1
RIGHT OUTER JOIN dbo.dr_invlines
ON STOCK_ITEMS_1.stockcode = dbo.dr_invlines.stockcode
ON dbo.dr_trans.seqno = dbo.dr_invlines.hdr_seqno
WHERE ( STOCK_ITEMS_1.status = 'S' )
AND ( dbo.dr_trans.transtype IN ( 1, 2 ) )
AND ( dbo.dr_trans.transdate >= Dateadd(m, -6, Getdate()) )
GROUP BY dbo.dr_invlines.stockcode,
Datepart(month, dbo.dr_trans.transdate),
dbo.period_status.periodname) AS Y
ON X.stockcode = Y.stockcode) z
PIVOT (Sum(saleqty) FOR [months] IN ([1],[2],[3],[4],[5],[6])) AS pivoted
EDIT: I missed the root cause of your issue: including the periodname column is what causes the perceived duplication. I am leaving this in place as a general solution showing CTE usage, because it could still be useful if you then want to do extra filtering or transformation of your pivot results.
One way is to take the results of the pivot query and run it through a SELECT DISTINCT query.
An example of wrapping your pivot query as a CTE and using it to feed a SELECT DISTINCT below (please note: untested, but parses as valid in my SSMS)
WITH PivotResults_CTE (
stockcode,
description,
pack,
location,
lname,
qty,
StockCode2,
periodname,
[1],
[2],
[3],
[4],
[5],
[6]
)
AS (
SELECT *
FROM (
SELECT X.stockcode
,X.description
,X.pack
,X.location
,X.lname
,X.qty
,Y.stockcode AS StockCode2
,y.periodname
,Y.months
,Y.saleqty
FROM (
SELECT dbo.stock_items.stockcode
,dbo.stock_items.description
,dbo.stock_items.pack
,dbo.stock_loc_info.location
,dbo.stock_locations.lname
,dbo.stock_loc_info.qty
FROM dbo.stock_locations
INNER JOIN dbo.stock_loc_info ON dbo.stock_locations.locno = dbo.stock_loc_info.location
LEFT OUTER JOIN dbo.stock_items ON dbo.stock_loc_info.stockcode = dbo.stock_items.stockcode
WHERE (dbo.stock_items.STATUS = 's')
) AS X
LEFT OUTER JOIN (
SELECT dbo.dr_invlines.stockcode
,(12 + Datepart(month, Getdate()) - Datepart(month, dbo.dr_trans.transdate)) % 12 + 1 AS Months
,Sum(dbo.dr_invlines.quantity) AS SaleQty
,dbo.period_status.periodname
FROM dbo.dr_trans
INNER JOIN dbo.period_status ON dbo.dr_trans.period_seqno = dbo.period_status.seqno
LEFT OUTER JOIN dbo.stock_items AS STOCK_ITEMS_1
RIGHT OUTER JOIN dbo.dr_invlines ON STOCK_ITEMS_1.stockcode = dbo.dr_invlines.stockcode ON dbo.dr_trans.seqno = dbo.dr_invlines.hdr_seqno WHERE (STOCK_ITEMS_1.STATUS = 'S')
AND (
dbo.dr_trans.transtype IN (
1
,2
)
)
AND (dbo.dr_trans.transdate >= Dateadd(m, - 6, Getdate()))
GROUP BY dbo.dr_invlines.stockcode
,Datepart(month, dbo.dr_trans.transdate)
,dbo.period_status.periodname
) AS Y ON X.stockcode = Y.stockcode
) z
PIVOT(Sum(saleqty) FOR [months] IN (
[1]
,[2]
,[3]
,[4]
,[5]
,[6]
)) AS pivoted
)
SELECT DISTINCT *
FROM
PivotResults_CTE
;
Also note that your SQL as included above may look slightly different from your original, but that is only because I ran it through a reformatter to make sure I understood its structure.
In other words, the basic CTE wrapper for your pivot query is:
WITH PivotResults_CTE (
Field1,
Field2,
...
)
AS (
YOUR_PIVOT_QUERY_HERE
)
SELECT DISTINCT *
FROM
PivotResults_CTE
;
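To illustrate the periodname point from the EDIT note, here is a small sketch in Python with sqlite3. SQLite has no PIVOT operator, so conditional aggregation is swapped in for it; the grouping behaviour is the same, because PIVOT implicitly groups by every column that is neither aggregated nor pivoted. The sales table and its rows are hypothetical.

```python
import sqlite3

# Conditional aggregation used as a stand-in for T-SQL PIVOT.
PIVOT_SQL = """
    SELECT {cols},
           SUM(CASE WHEN months = 1 THEN saleqty END) AS m1,
           SUM(CASE WHEN months = 2 THEN saleqty END) AS m2
    FROM sales
    GROUP BY {cols}
    ORDER BY {cols}
"""

def pivot(conn, group_cols):
    """Pivot saleqty by month, grouping on the given (trusted, hard-coded) column names."""
    cols = ", ".join(group_cols)
    return conn.execute(PIVOT_SQL.format(cols=cols)).fetchall()

def build_demo():
    """One stock code sold in two periods."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sales (stockcode TEXT, periodname TEXT, months INTEGER, saleqty INTEGER);
        INSERT INTO sales VALUES ('X', 'Jan', 2, 5), ('X', 'Feb', 1, 7);
    """)
    return conn

if __name__ == "__main__":
    conn = build_demo()
    print(pivot(conn, ["stockcode", "periodname"]))  # one row per period
    print(pivot(conn, ["stockcode"]))                # one row per stock code
```

Keeping periodname in the grouping yields two half-empty rows for stock code X, while dropping it collapses them into one. SELECT DISTINCT cannot merge the first pair, because the two rows are not identical.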

Count with row_number function SQL CTE

I have the below CTEs that work perfectly, but I want to count the "cl.memb_dim_id" by "cl.post_date" and I am not sure how to do that. When I add the count function, I get an error that highlights the row_number, so I am assuming I can't have both order and group together?
WITH
DATES AS
(
select to_date('01-jan-2017') as startdate,to_date('02-jan-2017') as enddate
from dual
),
Claims as (select distinct
cl.memb_dim_id,
row_number () over (partition by cl.Claim_number order by cl.post_date desc) as uniquerow,
cl.Claim_number,
cl.post_date,
ct.claim_type,
ap.claim_status_desc,
dc.company_desc,
dff.io_flag_desc,
pr.product_desc,
cl.prov_dim_id,
cl.prov_type_dim_id
from dw.fact_claim cl
inner join dates d
on 1=1
and cl.post_date >= d.startdate
and cl.post_date <= d.enddate
and cl.provider_par_dim_id in ('2')
and cl.processing_status_dim_id = '1'
and cl.company_dim_id in ('581','585','586','589','590','591','588','592','594','601','602','603','606','596','598','597','579','599','578','577','573','574','576','575')
left join dw.DIM_CLAIM_STATUS ap
on cl.claim_status_dim_id = ap.claim_status_dim_id
left join dw.dim_claim_type ct
on cl.claim_type_dim_id = ct.claim_type_dim_id
and cl.claim_type_dim_id in ('1','2','6','7')
left join dw.DIM_COMPANY dc
on cl.company_dim_id = dc.company_dim_id
left join dw.DIM_IO_FLAG dff
on cl.io_flag_dim_id = dff.io_flag_dim_id
left join dw.dim_product pr
on cl.product_dim_id = pr.product_dim_id
)
Select * from claims where uniquerow ='1'
First, does this work?
count(cl.memb_dim_id) over (partition by cl.Claim_number, cl.post_date) as cnt,
Second, it is strange to be using analytic functions with select distinct.
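The sketch below shows that an analytic COUNT can sit next to ROW_NUMBER without any GROUP BY. It uses Python's sqlite3 rather than Oracle, a cut-down hypothetical fact_claim table, and it partitions the count by post_date alone, which is an assumption about what "count by post_date" is meant to produce.

```python
import sqlite3

def claims_with_counts(conn):
    """Return each claim row with a per-date count and a per-claim row number."""
    sql = """
        SELECT claim_number,
               post_date,
               memb_dim_id,
               COUNT(memb_dim_id) OVER (PARTITION BY post_date) AS members_on_date,
               ROW_NUMBER() OVER (
                   PARTITION BY claim_number ORDER BY post_date DESC
               ) AS uniquerow
        FROM fact_claim
        ORDER BY claim_number, post_date DESC
    """
    return conn.execute(sql).fetchall()

def build_demo():
    """Hypothetical claims: C1 was posted on two dates, C2 and C3 on one."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE fact_claim (claim_number TEXT, post_date TEXT, memb_dim_id INTEGER);
        INSERT INTO fact_claim VALUES
            ('C1', '2017-01-01', 10),
            ('C1', '2017-01-02', 10),
            ('C2', '2017-01-02', 11),
            ('C3', '2017-01-02', 12);
    """)
    return conn

if __name__ == "__main__":
    for row in claims_with_counts(build_demo()):
        print(row)
```

Because both functions are window functions, every input row is kept, and the outer filter on uniquerow can still be applied afterwards.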

inner join with two selects sql

I am trying to implement an inner join to compare the values of two tables, but it is failing for some reason and the query returns zero rows.
I have two tables security and security_his and trying to join them on columns SECURITY_ID and INVESTMENT_OBJECTIVE. Query is as follows
SELECT *
FROM SECURITY origin
INNER JOIN (
SELECT *
FROM SECURITY_HIS t2
WHERE DATED = (
SELECT MAX(DATED)
FROM SECURITY_HIS t1
WHERE t1.SECURITY_ID = t2.SECURITY_ID
)
) history ON origin.SECURITY_ID = history.SECURITY_ID
AND origin.INVESTMENT_OBJECTIVE = history.INVESTMENT_OBJECTIVE;
WITH cte as (
SELECT S.*,
row_number() over
(partition by S.SECURITY_ID ORDER BY SH.DATED DESC) AS rn
FROM SECURITY S
JOIN SECURITY_HIS SH
ON S.SECURITY_ID = SH.SECURITY_ID
AND S.INVESTMENT_OBJECTIVE = SH.INVESTMENT_OBJECTIVE
)
SELECT *
FROM cte
WHERE rn = 1
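Your innermost query is correlated on SECURITY_ID, so it returns the latest DATED per security rather than one value for the whole table. The join to SECURITY can then come back empty when that latest history row has a different INVESTMENT_OBJECTIVE. If what you want is the latest date per security and objective, the query can be simplified with a GROUP BY: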
SELECT origin.*, history.MaxDated
FROM SECURITY origin
INNER JOIN (
SELECT
SECURITY_ID,
INVESTMENT_OBJECTIVE,
MaxDated = MAX(DATED)
FROM SECURITY_HIS t2
GROUP BY
SECURITY_ID,
INVESTMENT_OBJECTIVE
) history ON origin.SECURITY_ID = history.SECURITY_ID
AND origin.INVESTMENT_OBJECTIVE = history.INVESTMENT_OBJECTIVE
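For a quick check of what the grouped join returns, here is a minimal sketch in Python with sqlite3 (standing in for the real database). The two tables are reduced to the columns used in the join, and the sample rows are invented.

```python
import sqlite3

def latest_history_per_objective(conn):
    """Join SECURITY to the newest SECURITY_HIS date per (security, objective)."""
    sql = """
        SELECT origin.SECURITY_ID, origin.INVESTMENT_OBJECTIVE, history.MaxDated
        FROM SECURITY AS origin
        INNER JOIN (
            SELECT SECURITY_ID,
                   INVESTMENT_OBJECTIVE,
                   MAX(DATED) AS MaxDated
            FROM SECURITY_HIS
            GROUP BY SECURITY_ID, INVESTMENT_OBJECTIVE
        ) AS history
            ON origin.SECURITY_ID = history.SECURITY_ID
           AND origin.INVESTMENT_OBJECTIVE = history.INVESTMENT_OBJECTIVE
        ORDER BY origin.SECURITY_ID
    """
    return conn.execute(sql).fetchall()

def build_demo():
    """Hypothetical data: security 2 has no history row with its current objective."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE SECURITY (SECURITY_ID INTEGER, INVESTMENT_OBJECTIVE TEXT);
        CREATE TABLE SECURITY_HIS (SECURITY_ID INTEGER, INVESTMENT_OBJECTIVE TEXT, DATED TEXT);
        INSERT INTO SECURITY VALUES (1, 'GROWTH'), (2, 'INCOME');
        INSERT INTO SECURITY_HIS VALUES
            (1, 'GROWTH', '2020-01-01'),
            (1, 'GROWTH', '2021-01-01'),
            (1, 'INCOME', '2022-01-01'),
            (2, 'GROWTH', '2021-06-01');
    """)
    return conn

if __name__ == "__main__":
    print(latest_history_per_objective(build_demo()))
```

Security 1 matches on its newest GROWTH row, while security 2 is dropped because none of its history rows share its current objective. An INNER JOIN removes such rows silently, so switch to a LEFT JOIN if they should be kept.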

Obtaining only first result from a LEFT JOIN

I'm trying to get the first result of a LEFT JOIN for each row of a SELECT statement.
Right now, if I have 100 rows in the joined table, I get the same row from the SELECT 100 times. I just need the first joined row, so that I don't get any duplicates.
I can't use GROUP BY because I have to get more than only one row from the table.
Here's a basic version of my query:
SELECT bg.PatientID, DATEDIFF(hour, bg.CreateDate, GETDATE()) TimeToTarget
FROM BloodGlucose bg
LEFT JOIN IVProtocol i ON i.PatientID = bg.PatientID
WHERE bg.BGValue >= i.TargetLow AND bg.BGValue <= i.TargetHigh
ORDER BY bg.PatientID ASC
I tried using DISTINCT but since the data from bg.CreateDate isn't always the same it returns duplicates.
I just need the FIRST row of that left joined table.
Any ideas/suggestions?
Thanks!
;WITH x AS
(
SELECT
bg.PatientID,
TimeToTarget = DATEDIFF(hour, bg.CreateDate, GETDATE()),
rn = ROW_NUMBER() OVER (PARTITION BY bg.PatientID ORDER BY bg.CreateDate DESC)
FROM dbo.BloodGlucose AS bg
LEFT JOIN dbo.IVProtocol AS i
ON i.PatientID = bg.PatientID
WHERE bg.BGValue >= i.TargetLow
AND bg.BGValue <= i.TargetHigh
)
SELECT PatientID, TimeToTarget
FROM x
WHERE rn = 1
ORDER BY PatientID;
To join to other results:
;WITH x AS
(
... same as above ...
)
SELECT x.PatientID, x.TimeToTarget, y.Something
FROM x INNER JOIN dbo.SomethingElse AS y
ON x.PatientID = y.PatientID
WHERE x.rn = 1
ORDER BY x.PatientID;
SELECT bg.PatientID, DATEDIFF(hour, bg.CreateDate, GETDATE()) TimeToTarget
FROM BloodGlucose bg
cross apply (
select top 1 *
from IVProtocol i
where i.PatientID = bg.PatientID
order by SOME_CRITERIA
) i
WHERE bg.BGValue >= i.TargetLow AND bg.BGValue <= i.TargetHigh
ORDER BY bg.PatientID ASC
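CROSS APPLY is a handy tool for such situations. It works like a join, but the subquery can reference columns from the outer row (here bg.PatientID). Note that CROSS APPLY drops outer rows for which the subquery returns nothing; use OUTER APPLY if you need LEFT JOIN behaviour.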